              _                                                _
            _/B\_                                            _/W\_
            (* *)            Phrack #64 file 14              (* *)
            | - |                                            | - |
            |   |         The art of Exploitation:           |   |
            |   |                                            |   |
            |   |          Come back on a exploit            |   |
            |   |                                            |   |
            |   |        by vl4d1m1r of Ac1dB1tch3z          |   |
            (____________________________________________________)


Dear Underground, starting from this release, the Circle of Lost Hackers
has decided to publish in each release a look back at a public exploit
that has been known for a long time. This section could be called
'autopsy of an exploit'. The idea is to explain the technical part of a
famous exploit as well as its story, post-mortem. Here we start with the
CVS "Is-modified" exploit, which leaked in 2004.

-------------------

PRELUDE

Exploitation is an art. Coding an exploit can be an art form in itself.
To code a true exploit, you need total control over the system. To achieve
this feat, we usually need to understand, analyze and master every piece
of the puzzle. Nothing is left to chance. The art of exploitation is to
make the exploit targetless, ultimately oneshot. To go beyond simple
pragmatic exploitation. To make it more than a simple proof of concept
shit. Put all your guts in it. Try to bypass existing protection
techniques.

A nice exploit is a great artwork, but one confined to stay in the
shadows. Its inner workings are only known by its authors and the rare
code readers seeking to pierce its mysteries. It is for the latter that
this section was created. For the ones who are hungry for the information
that hides behind the source code. This is the only reason behind the
"r34d 7h3 c0d3 d00d" of the usage() function in this exploit: to force
people to read the code, to appreciate what they have in hand. Not to
provide them with new tools or new weapons, but to make them understand
its various technical aspects.

Each exploit is built following a particular methodology.
We need to deeply analyze all the possibilities of the memory allocations
until we master all of their parameters, often to a point where even the
original programmers were unaware of these technical aspects. It is about
venturing into the twists and turns, the complexity of the situation, and
finally discovering all the various opportunities that are available to
you. To see what fate has to offer us, the various potentials at our
disposal. To make something out of it. To get the best out of the
situation.

When you get across this invisible line, the line that separates the
simple proof of concept code from the best exploit possible, the one that
guarantees you a shell every time, you can then say that the creation of
an art form has just begun. The joy of gazing at your own piece of work,
leveraging a simple memory overwrite into a fully working exploit. It is a
technical jewel of creativity and determination to bring a small computer
bug to its full potential.

Who has never rooted a server with the exploit 'x2'? Who has never waited
in front of his screen, watching the different steps, waiting for it to
accomplish the great work it was made for? But how many people really
understood the intricacies of 'x2' and how it worked? What was really
happening behind what was printed on the screen, in this unfinished
version of the exploit that got leaked and abused?

Beyond the pragmatic kiddie who wants to get access, this section aims at
being the home of those who are motivated by curiosity, by an artistic
sensibility towards such exploits. This section is not meant to teach
others how to own a server, but instead to teach them how the exploit
works. It is a demystification of the few exploits that leaked from the
underground in the past and ended up in the public domain. It is about
exploits that have been over-exploited by a mass of incompetent people.
This section is for people who can see, not for people who are only good
at fetching what really has value. In fact, this section is about doing
justice to the original exploit. It is a return to what really deserves
attention.

At a certain point in time, the level of comprehension required to achieve
a successful exploitation reaches the edge of insanity. The spirit melts
with madness, we temporarily lose all kind of rationality and we enter a
state of illumination. It is the fanaticism of the passionate that brings
this to its full extent, to its extreme, that demonstrates that it is
possible to transcend the well known, to prove that we can always achieve
more. It is about pushing the limits. And then we enter artistic creation.
No, we are not moving away; we are instead getting closer to the reality
that hides behind an exploit.

Only a couple of real exploits have been made public. Their authors are
generally smart enough to keep them private. Despite this, leaks happen
for various reasons, and generally it is a beginner's error. The real
exploit is not the one that has 34 targets, but the one that has a single
target, namely all of them at the same time. An exploit that takes a
simple heap overflow and makes it work against GRsec, remotely and with
ET_DYN on the binary. You will probably use this exploit only once in your
whole life, but the most important part is the work accomplished by the
authors to create it. The important part is the love they put into
creating it.

Maybe you'll learn nothing new from this exploit. In fact, the real goal
is not to give you new exploitation techniques. You are grown up enough to
read manuals, find your own techniques and make something out of the
possibilities offered to you. The goal is simply to give back some praise
to this arcana of obscure code forsaken by most people, these pieces of
code which have been disclosed but still remain misunderstood.

A column with the underground spirit, the real one, for the expert and the
lover of art.
For the one who can see.

-----------------------------------


                    The CVS "Is_Modified" exploit

                      vl4d1m1r of ac1db1tch3z
                          vd@phrack.org


1 - Overview
2 - The story of the exploit
3 - The Linux exploitation: Using malloc voodoo
4 - A couple of words on the BSD exploitation
5 - Conclusion


--[ 1 - Overview

Through this article, we will show you how the exploitation under the
Linux operating system was made possible, and then study the BSD case. The
two exploitation techniques are different, and both lead to a targetless
and "oneshot" scenario.

Remember that the code is three years old. I know that since then, the
glibc library has seen a lot of changes in its malloc code. Foremost, with
glibc 2.3.2, the NON_MAIN_ARENA flag appeared, the FRONTLINK macro was
removed and a new linked list, the "fast_chunks", was added. Then, since
version 2.3.5, the UNLINK() macro has been patched in a way that prevents
a "write 4 bytes to anywhere" primitive. Last but not least, on the
majority of systems, the heap is now randomized by default along with the
stack. But this was not the case at the time of this exploit.

The goal of this article, as explained earlier, is not to teach you new
techniques but instead to explain which techniques were used at that time
to exploit the bug.


--[ 2 - The story of the exploit

This bug was originally found by [CENSORED]. A first proof of concept code
was written by kujikiri of ac1db1tch3z in 2003. The exploit was working,
but only for a particular target. It was not reliable because not all the
parameters of the exploitable context were taken into account. The main
advantage of the code was that it could authenticate itself to the CVS
server and trigger the bug, which represents an important part of the
development of an exploit.

The bug was then shown to another member of the ac1db1tch3z team. It was
at that moment that we finally decided to code a really reliable exploit
to be used in the wild.
A first version of the exploit was coded for Linux. It was targetless but
it needed about thirty connections to succeed. This first version of the
exploit submitted some addresses to the CVS server in order to determine
whether they were valid, by looking at whether the server crashed or not.

Then another member ported the exploit to the *BSD platform. As a result,
a targetless and "oneshot" exploit was born. As a challenge, I tried to
come up with the same result for the Linux version, and my perseverance
finally paid off.

Meanwhile, a third member found an interesting functionality in CVS, which
won't be presented here, that gives the possibility to bruteforce the
three mandatory parameters necessary for a successful exploitation: the
cvsroot, the login and the password.

It took me one night of passion (nothing sexual) to gather all three
pieces of code into one, and the result was cvs_freebsd_linux.c, which was
later leaked.

Another member of the underground later coded a Solaris version, but
without the targetless and "oneshot" functionality. This exploit won't be
presented here.

This bug, as a matter of fact, was later "discovered" by Stefan Esser and
disclosed by e-matters. We doubted that Stefan Esser had found by himself
that exact same bug, which was already known in the underground. Even if
he had not, he later redeemed himself by auditing the CVS source code with
a fellow of his and finding a certain number of other bugs. This proves he
is able to find bugs, whatever.

The code was finally made public by [CENSORED], who signed it with "The
Axis of Eliteness" and bragged about the fact that he had already rooted
every interesting target available at the time. It was not a great loss,
even though it was a pinch to the heart to see publicly that open source
CVS servers got compromised.


--[ 3 - The Linux exploitation: Using malloc voodoo

The original flaw was a basic heap overflow.
Indeed, it was possible to overwrite the heap with data under our control,
and even to insert non alphanumeric characters, without buffer length
restrictions. It was a typical scenario.

Moreover, and that's what is wonderful with the CVS server, by analyzing
the different possibilities we figured out that it was quite easy to force
some calls to malloc() of an arbitrary size and to choose the ones that we
want to free(), with few restrictions.

The funny thing is, when I originally coded the Linux version of the
exploit, I did not know that it was possible to overwrite the memory space
with completely arbitrary data. I thought that the only character that you
could overwrite memory with was 'M' (0x4d). I had not analyzed the bug
enough because I was quickly trying to find an interesting exploitation
vector with the information I already had in my hands. Consequently, the
Linux version exploits the bug like a simple overflow with the 0x4d
character.

The first difficulty that you meet with the heap is that it is pretty
unstable, for various reasons. A lot of parameters change the memory
layout, such as the number of memory allocations that were already
performed, the IP address of the server and other internal parameters of
the CVS server. Consequently, the first step of the process is to try to
normalize the heap and to put it in a state where we have complete control
over it. We need to know exactly what is happening on the remote machine:
to be sure about the state of the heap.

A small analysis of the possibilities that the heap offers us reveals
this: I had to analyze the various possibilities of memory allocation
offered by the CVS server. Fortunately, the code was quite simple. I
quickly found, by analyzing all the malloc() and free() calls, that I
could allocate memory buffers with the "Entry" command.
The function that accomplishes this is serve_entry(). The code is quite
straightforward:

    static void
    serve_entry (arg)
         char *arg;
    {
        struct an_entry *p;
        char *cp;

        [...]
        cp = arg;
        [...]
        p = xmalloc (sizeof (struct an_entry));
        cp = xmalloc (strlen (arg) + 2);
        strcpy (cp, arg);
        p->next = entries;
[1]     p->entry = cp;
        entries = p;
    }

Inside this function, which takes as an argument a pointer to a string
that we control, there is a memory allocation of the following structure:

    struct an_entry {
        struct an_entry *next;
        char *entry;
    } ;

Then, memory for the parameter is allocated and assigned to the "entry"
field of the "an_entry" structure that was just allocated, as you can see
in [1]. This structure is then added to the linked list of entries tracked
by the global variable "struct an_entry * entries".

Therefore, if we are OK with the fact that small "an_entry" structures get
allocated in between our controlled buffers, we can use this vector to
allocate memory whenever we want.

Now, if we want to call free(), we can use the CVS "noop" command, which
calls the server_write_entries() function. Here is a code snippet from
this function:

    static void
    server_write_entries ()
    {
        struct an_entry *p;
        struct an_entry *q;

        [...]
        for (p = entries; p != NULL;)
        {
            [...]
            free (p->entry);
            q = p->next;
            free (p);
            p = q;
        }
        entries = NULL;
    }

As you can see, all the previously allocated entries are now freed. Note
that when we talk about an 'entry' here, we refer to the pair made of a
struct an_entry and its ->entry field, which we control. Considering the
fact that all the buffers that we allocated will be freed, this technique
suits us well. Note that there were other, less restrictive possibilities,
but this one is convenient enough.

So, we now know how to allocate memory buffers with arbitrary data in
them, even with non alphanumeric characters, and how to free them too.

Let's come back to the original flaw, which we have not described yet.
The vulnerable command was "Is_Modified" and the function looked like
this:

    static void
    serve_is_modified (arg)
         char *arg;
    {
        struct an_entry *p;
        char *name;
        char *cp;
        char *timefield;

        for (p = entries; p != NULL; p = p->next)
        {
[1]         name = p->entry + 1;
            cp = strchr (name, '/');
            if (cp != NULL
                && strlen (arg) == cp - name
                && strncmp (arg, name, cp - name) == 0)
            {
                if (*timefield == '/')
                {
                    [...]
                    cp = timefield + strlen (timefield);
                    cp[1] = '\0';
                    while (cp > timefield)
                    {
[2]                     *cp = cp[-1];
                        --cp;
                    }
                }
                *timefield = 'M';
                break;
            }
        }
    }

As you can see in [2], after adding an entry with the "Entry" command, it
was possible to add some 'M' characters at the end of the entries
previously inserted in the "entries" linked list. This was possible for
the entries of our choice. The code is explicit enough, so I won't detail
it more.

We now have all the necessary information to code a working exploit.

Immediately after we have established a connection, the method used to
normalize the heap and put it in a known state is to use the "Entry"
command. With this particular command, we can add buffers of an arbitrary
size.

The fill_heap() function does this. The macro MAX_FILL_HEAP gives the
maximum number of holes that we could find in the heap. It is set to a
high value, to anticipate any surprise. We start by allocating many big
buffers to fill the majority of the holes. Then, we continue by allocating
a lot of small buffers to fill all the remaining smaller holes.

At this stage, we have no holes in our heap.

Now, if we sit back and think a little bit, we know that the heap layout
will look something like this:

    [...][an_entry][buf1][an_entry][buf2][an_entry][bufn][top_chunk]

Note: During the development of the exploit, I modified the malloc code to
add functions of my own, which I preloaded with LD_PRELOAD. This modified
version would then generate various heap diagrams to help me debug the
heap.
Note that some hackers use heap simulators to know the heap state during
the development process. These heap simulators can be as simple as a gdb
script or something using libncurses. Any tool which can represent the
heap state is useful.

Once the connection was established and the fill_heap() function was
called, we knew the exact layout of the heap. The challenge was now to
corrupt a malloc chunk, insert a fake chunk and make a call to free() to
trigger the UNLINK() macro with 'fd' and 'bk' under our control. This
would let us overwrite 4 arbitrary bytes anywhere in memory. This is quite
easy to do when you have the heap in a predictable state.

We know that we can overflow the "an_entry->entry" buffers of our choice.
We will also inevitably overwrite what is located after this buffer:
either the top chunk or the next "an_entry" structure, if we have
previously allocated one with another "Entry". We will use the latter
technique because we don't want to corrupt the top chunk.

Notice: Nowadays, since the UNLINK macro contains security checks, we
could instead use an overflow of the top chunk and trigger a call to
set_head() to exploit the program, as explained in another article of this
issue.

Practically, we know that chunk headers are found right before the
allocated memory space. Let's focus on the interesting part of the memory
layout at the time of the overflow:

    [struct malloc_chunk][an_entry][struct malloc_chunk][buf][...][top_chunk]

By calling the function "Is_modified" with the name of the entry that we
want to corrupt, we will overwrite the "an_entry" structure located after
the current buffer. So, the idea is to overwrite the "size" field of the
malloc chunk of a struct an_entry, so that it becomes bigger than before.
When free() computes the offset to the next chunk, it will then fall
directly inside the controlled part of the ->entry field of this struct
an_entry. So, we only need to add an "Entry" with a fake malloc chunk at
the right offset.
See:

    #define NUM_OFF7 (sizeof("Entry "))
    #define MSIZE 0x4c
    #define MALLOC_CHUNKSZ 8
    #define AN_ENTRYSZ 8
    #define MAGICSZ ((MALLOC_CHUNKSZ * 2) + AN_ENTRYSZ)
    #define FAKECHUNK MSIZE - MAGICSZ + (NUM_OFF7 - 1)

The offset is FAKECHUNK.

Let's sum up the whole process at this point:

  1. The function fill_heap() fills all the holes in the heap by sending
     a lot of entries thanks to the "Entry" command.

  2. We add 2 entries: the first one named "ABC", and another one named
     "dummy". The ->entry field of the "ABC" entry will be overflowed, and
     so the malloc_chunk of the struct an_entry of "dummy" will be
     modified.

  3. We call the function "Is_modified" with "ABC" as a parameter,
     numerous times in a row, until we hit the size field of the
     malloc_chunk. The effect is to add 'M' characters at the end of the
     buffer, outside its bounds. Inside the ->entry field of the "dummy"
     entry we have a fake malloc_chunk at the FAKECHUNK offset.

  4. If we now call the function "noop", its effect is to free() the
     linked list "entries". Starting from the end, the entry "dummy" and
     its associated "an_entry" structure, then the entry "ABC" and its
     associated "an_entry" structure will be freed. Finally, all the
     "an_entry" structures that we used to fill the holes in the heap
     will also be freed. So, the magic occurs during the free() of the
     an_entry of "dummy".

The exact malloc voodoo is like this:

We have overwritten with 'M' characters the "size" field of the malloc
chunk of the "an_entry" structure next to our "ABC" buffer. From there, if
we free() the "an_entry" structure that had its "size" field corrupted,
free() will try to get to the next memory chunk at the address of the
chunk + 'M'. This brings us exactly inside a buffer that we control, which
is the buffer "dummy". Consequently, if we can insert a fake chunk at the
right offset, we are able to write 4 bytes anywhere in memory. From this
point, 90% of the job is already done!
Notice: Practically, it is not enough to only create a fake next chunk.
You need to make sure a second next chunk is also available. Indeed,
DLmalloc is going to check the PREV_INUSE bit of the second next chunk to
know whether the next chunk is free or in use. The problem is that we
cannot put '\0' characters inside the fake chunk, so we need to use a
negative size field, to make sure that the next chunk of the next chunk is
located before the first chunk. Practically, it works, and I have used
this technique many times to code heap overflows. Check the macro
SIZE_VALUE inside the exploit code for more information. Its value is -8.

Now, we will dig a little bit deeper inside the exploit. Let's take a look
at the function detect_remote_os(). Here is the code:

    int detect_remote_os(void)
    {
        info("Guessing if remote is a cvs on a linux/x86...\t");
        if(range_crashed(0xbfffffd0, 0xbfffffd0 + 4) ||
           !range_crashed(0x42424242, 0x42424242 + 4))
        {
            printf(VERT"NO"NORM", assuming it's *BSD\n");
            isbsd = 1;
            return (0);
        }
        printf(VERT"Yes"NORM" !\n");
        return (1);
    }

With this technique, we first trigger an overwrite operation at an address
that is always valid. This location is a high address inside the stack,
for example 0xbfffffd0. If the server answers properly, it means that it
did not crash. If it did not crash despite the overflow, it means either
that the UNLINK call worked (i.e. we are under Linux with a stack mapped
below 0xc0000000) or that the UNLINK call did not get triggered (= not
Linux).

To tell the two cases apart, we then try to write to an invalid, non
mapped address, such as 0x42424242. If the server crashes, then we know
for sure that the exploit works correctly and that we are on a Linux
system.

If it is not the case, we switch to the FreeBSD exploitation.

Right now, the only thing that we are able to do is to trigger a call to
UNLINK in a reliable way and to make sure that everything is working
properly.
We now need to get more serious about this, and get to the exploitation
process.

Generally, to successfully exploit such a vulnerability, we need to know
the address of the shellcode and the address of a function pointer in
memory to overwrite. By digging more into the problem, it is always
possible to make the exploit work with only one address instead of two. It
may even be possible to make it work without providing any memory address!

Here is the technique used to accomplish such a feat.

Indeed, we are able to allocate an arbitrary number of buffers next to
each other, to corrupt their chunk headers and to free() them afterwards
with server_write_entries(). Being able to do this means that we can
trigger more than one call to UNLINK, and this is what is going to make
the difference. Being able to overwrite more than one memory address is a
technique frequently used inside heap overflow exploits, and it usually
makes the exploit targetless.

In the following lines, I will explain how this behavior leads us to the
creation of the memcpy_remote() function, which takes the same arguments
as the famous memcpy() function, with the exception that it writes in the
memory space of the exploited process.

Once we are able to trigger as many UNLINK calls as we want, we will see
that it is possible to turn the exploitation scenario into a "write
anything anywhere" primitive.

What are the benefits of being able to do this? If we can write what we
want at the address that we want, without any size constraints, we can
copy the shellcode into memory. We will write it at a really low address
of the stack, and I will explain why later.

To know what address to overwrite, we will overwrite the majority of the
stack with addresses that point to the beginning of the shellcode. That
way, we will overwrite the saved instruction pointer of a call to free()
and we will obtain control of %eip.

All the art of this exploitation resides in the advanced use of the UNLINK
macro.
We will go into the details, but first, let's recall the purpose of the
UNLINK macro.

The UNLINK macro takes an entry off the doubly linked list. Indeed, the
"prev" pointer of the chunk following the one we want to unlink is
replaced with the "prev" pointer of the chunk we are currently unlinking.
Also, the "next" pointer of the chunk preceding the one we want to unlink
is replaced with the "next" pointer of the chunk we are currently
unlinking.

Remember that only free malloc chunks are in the doubly linked lists,
which are themselves grouped inside bins. The "prev" field is named BK and
is located at offset 12 of a malloc chunk. The "next" field is named FD
and is at offset 8 of a malloc chunk.

We can then obtain the following macros:

    #define CHUNK_FD 8
    #define CHUNK_BK 12
    #define SET_BK(x) (x - CHUNK_FD)
    #define SET_FD(x) (x - CHUNK_BK)

If we want to write 0x41424344 at 0x42424242, we need to call the UNLINK
macro the following way:

    UNLINK (SET_FD(0x41424344), SET_BK(0x42424242)).

The thing is that we want to write "ABCD" at 0x42424242, but UNLINK will
write both at 0x42424242 and at 0x41424344. "ABCD" is not a valid address.
The solution to this problem is to write one character at a time. We will
thus write "A", then "B", then "C" and after this "D", until there is
nothing left to write.

To achieve this, we need a range of 0xFF bytes that we are willing to
trash. It is easy to obtain. Indeed, if we take a really high address in
the stack, we find ourselves overwriting environment variables, which are
stored at the top of the stack. At the time, we were writing this exploit
for stacks that were mapped right below the kernel space / user space
boundary, which was 0xc0000000. The exact address that I chose was
0xc0000000 - 0xFF.
Basically, if we want to write "ABCD" at 0xbfffd000, we will need to
execute the following calls to UNLINK:

    UNLINK (UNSET_FD(0xbfffd000), UNSET_BK(0xbfffff41))
        (0x41 being the hexadecimal equivalent of 'A').
    UNLINK (UNSET_FD(0xbfffd001), UNSET_BK(0xbfffff42))
        (0x42 being the hexadecimal equivalent of 'B').

And so on ...

So, if we are able to execute as many UNLINK calls as we want, and if we
have a range of 0xFF addresses that can be modified without consequences
on the program execution, then we are able to make 'memcpy' calls
remotely.

To sum up:

  1. We normalize the heap to put it in a predictable state.

  2. We overwrite the size field of the chunk of a previously allocated
     "an_entry" struct. When this an_entry is freed, the memory allocator
     will think that the next chunk is located inside data under our
     control. This fake next chunk will then be seen as free, and the two
     memory blocks will be consolidated into one. Malloc will then take
     the next chunk off its doubly linked list of free chunks, and it
     will thus trigger an UNLINK with FD and BK under our control.

  3. Since we can allocate as many "an_entry" entries as we want and free
     them all at the same time thanks to server_write_entries(), we can
     trigger as many UNLINK calls as we want. This leads us, as we just
     saw, to the creation of the memcpy_remote() function, which lets us
     write what we want, where we want.

  4. We use the function memcpy_remote() to write the shellcode at a
     really low address of the stack.

  5. We then overwrite each address in the stack, starting from the top,
     until we hit a saved instruction pointer.

  6. When the internal function that frees the chunk returns, our
     shellcode is executed.

Here it is!
Notice: We have chosen a really low address in the stack because, even if
we hit an address that is not currently mapped, this will trigger a
pagefault(), and instead of aborting the program with a signal 11, the
kernel will stretch the stack with its expand_stack() function. This
method is OS generic. Thanks bbp.


--[ 4 - A couple of words on the BSD exploitation

As promised, here is the explanation of the technique used to exploit the
FreeBSD version. Consider the fact that, with only minor changes, this
exploit was working on other operating systems. In fact, by switching the
shellcode and modifying the hardcoded high addresses of the heap, the
exploit was fully functional on every system using PHK malloc. This
exploit was not restricted to FreeBSD, a thing that the script kiddies
didn't know.

I like to see that kind of trick inside exploits. It makes them powerful
for the expert, and almost useless to the kiddie.

The technique explained here is an excellent way to take control of the
target process, and it could easily have been used in the Linux version of
the exploit. The main advantage is that this method does not use the magic
of malloc voodoo, so it can help you bypass the security checks done by
the malloc code.

First, the heap needs to be filled to put it in a predictable state, as
for all heap overflow exploits. Secondly, what we basically want to do is
to put a structure containing function pointers right behind the buffer
that we can overflow, in order to rewrite the function pointers. In this
case, we overwrote the function pointers entirely and not partially.

Once this is done, the only thing that remains to do is to repeatedly send
big buffers containing the shellcode, to make sure it will be available at
a high address in the heap. After that, we need to overwrite the function
pointer and to trigger the use of this same function. As a result, the
shellcode will be run.
Practically, we used the CVS command "Gzip-stream", which allocates an
array of function pointers inside a subroutine of the serve_gzip_stream()
function.

Let's recap:

  1. We fill_holes() the PHK malloc allocator so that the buffer that we
     are going to overflow is located before a hole in the heap.

  2. We allocate the buffer containing 4 pointers to the shellcode at the
     right place.

  3. We call the function "Gzip-stream", which will allocate an array of
     function pointers right inside our memory hole. This array will be
     located right after the buffer that we are going to overflow.

  4. We trigger the overflow and we overwrite a function pointer with the
     address of our shellcode (the macro HEAPBASE in the exploit). See
     the OFFSET variable to know how many bytes we need to overflow.

  5. With the "Entry" command, we add numerous entries that contain NOPs
     and shellcode, to fill the higher addresses of the heap with our
     shellcode.

  6. We call the zflush(1) function, which ends the gzipped stream and
     triggers the overwritten function pointer (the zfree one of the
     struct z_stream).

And so, we get a shell. If we are not yet root, we look for a cvs passwd
file that is writable somewhere in the whole cvs tree, which was the case
at the time on most servers, and we modify it to obtain a root account. We
re-exploit the cvs server with this account and, yes it is, we have rO0t
on the remote. :-)


--[ 5 - Conclusion

We thought that it was worth presenting the exploit the way it was done
here, to let the reader learn by himself the details of the exploitation
code, which is from now on available in the public domain, even though the
authors did not want it to be.

From now on, this section will be included in the upcoming releases of
phrack. In each issue, we will present the details of an interesting
exploit. The exploit will be chosen because its development was
interesting and the author(s) had a strong determination to succeed in
building it.
Such exploits can be counted on the fingers of your hands (I am talking
about the leaked ones).

With the hope that you had fun reading this ...


--[ 6 - Greeting

To MaXX for his great papers on DL malloc.