UNRELATED TO EMU:: HOWTO: Reverse Engineer .exe's (Long)
This is a brief walkthrough of a recent session I did - it's simply too much to try to explain every detail. Sorry if I accidentally skip over something important!
I've spent a good deal of time lurking in the reverse engineering community, but I've never really seen anyone disclose how it's actually done. So if you are really familiar with low level programming and just need a nudge in this area, this may interest you.

The problem: A packet being sent across the network had some sort of CRC algorithm applied to it, but I didn't know what the algorithm was. I only had the executable that performs the algorithm.

The solution: The first thing I did was make a list of known facts:
* Packets are being sent via UDP/IP
* Each packet has a unique identifier, and the identifier I want is 0x0254
* The size of the packet I want is known: 0x18C with CRC, 0x188 without CRC
* The CRC is the first 4 bytes of the payload
* The remaining payload is the data the CRC is applied to
* No symbols exist for the executable
* I know the packet contents both from packet sniffing and from being able to generate fake packets
* The CRC value of the packet is observed to be 0x12345678 (let's say)

Right away, I can see an easy way to attack this problem: attach a debugger to the executable, and create a breakpoint for each time a packet is received. Once we find our packet (via the identifier), we can start stepping through the assembly language to figure out what they are doing with the packet.

The debugger of choice for me is windbg for this type of problem. This debugger is incredibly powerful, and is COMPLETELY FREE. Go to Microsoft Debugging Tools to download the debugger. It is also necessary to download the Windows symbols, which is a hefty 176 MB download.

Goal #1: Determine when packet arrives
Once windbg was installed and had its symbols path set, I cranked up the executable. Right away, the debugger breaks to allow me to enter any commands or breakpoints prior to program entry. I know that we need to break each time a packet is received, so I had to figure out how the executable is receiving packets. With UDP/IP, winsock is the most likely candidate.
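To make the list of known facts concrete, here is a minimal Python sketch of the payload layout it describes: a 4-byte CRC followed by 0x188 bytes of data. The function name `split_payload`, the sample bytes, and the `byte_order` parameter are illustrative assumptions, not anything taken from the executable. In particular, the post does not say in which byte order the observed value 0x12345678 was read, so the sketch leaves that as a parameter.

```python
import struct

# Values taken from the list of known facts.
PACKET_ID = 0x0254          # identifier of the packet of interest
SIZE_WITH_CRC = 0x18C       # payload size including the CRC
SIZE_WITHOUT_CRC = 0x188    # size of the data the CRC is applied to
CRC_SIZE = SIZE_WITH_CRC - SIZE_WITHOUT_CRC  # works out to 4 bytes


def split_payload(payload: bytes, byte_order: str = ">"):
    """Split a payload into (crc, data).

    byte_order is an assumption: ">" reads the CRC big-endian,
    "<" reads it little-endian (as an x86 dword load would).
    """
    if len(payload) != SIZE_WITH_CRC:
        raise ValueError("unexpected payload size: %#x" % len(payload))
    (crc,) = struct.unpack(byte_order + "I", payload[:CRC_SIZE])
    return crc, payload[CRC_SIZE:]


# Hypothetical payload: CRC bytes 12 34 56 78 followed by zeroed data.
sample = bytes([0x12, 0x34, 0x56, 0x78]) + bytes(SIZE_WITHOUT_CRC)
crc, data = split_payload(sample)
print(hex(crc), hex(len(data)))
```

Read big-endian, the four sample bytes give 0x12345678. Read little-endian, the same bytes give 0x78563412, which is worth keeping in mind whenever a value seen in a packet dump is compared against a value loaded from memory on x86.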
Using VS.NET's dumpbin.exe, I determined the following functions to be candidates for receiving packets:
WSOCK32!recv
WSOCK32!recvfrom
WS2_32!WSARecv
WS2_32!WSARecvFrom
WS2_32!recv
WS2_32!recvfrom
So, with winsock symbols loaded, I was able to put a breakpoint on each of these functions. Here are the breakpoints I set:
Code:
bp wsock32!recv "g@$ra; r eax; g"
Why display EAX after stepping out? Winsock functions are standard DLL exports, so I know they use the standard calling convention. That means the return value of the function is stored in the EAX register upon completion. Since the command g@$ra steps out of the function, r eax will display the return value each time the function executes. For these winsock functions, the return value shows how much data was received. Since we know the packet size, we can just watch program execution to verify which function is the one used to process our data.

After running the program, I determined that the wsock32!recvfrom function was the culprit. On to the next breakpoint!

In a game like this, there are thousands of packets that are sent. We cannot possibly break and analyze each of them until we find the packet we want - there's just too much manual work. The trick is to set a breakpoint that triggers only when our packet arrives. There are many ways to do this, and I see a better way to figure it out in hindsight... Nonetheless, here's how I did it: I put a breakpoint on recvfrom() that displays the first 48 bytes of each packet received, and looked for the one that identifies our packet:
Code:
bp wsock32!recvfrom "g@$ra; db poi(esp-4) L30; g"
ESP is the stack pointer. I won't explain what a stack pointer is here, so assuming you know: After stepping out, the char* param to recvfrom() is found at (esp-4). This is not something I knew beforehand; I played around with the memory window, looking at all values around ESP until I figured this out.

When I saw the contents of our packet flow across the debug window, I determined that the CRC was at a 22 byte offset from the beginning of this buffer, and the packet was at a 26 byte offset. Knowing this CRC value and its position, I was able to set a breakpoint that *only* triggered when our packet arrives. Here's the conditional breakpoint:
Code:
bp wsock32!recvfrom "g@$ra; j (dwo(poi(esp-4) + 22) == 0x78563412) 'db poi(esp-4)' ; 'g'"
I've now accomplished goal #1: Stop execution at exactly the point when my packet arrives. Goal #2: Determine what they are doing with the packet!

Goal #2: What are they doing with packet?
Using the memory window, I saw that our packet was being stored at address 0x01d0fc14. I found that windbg allows you to break when a section of memory is accessed. This is perfect for what I want. So I put a memory access breakpoint onto our packet:
Code:
ba r4 0x01d0fc14
Code:
#1 0050c3d3 8a19 mov bl,[ecx]
The culprit was the rep movsd instruction. This makes a copy of our packet to another memory location. The destination is specified by the edi register, according to windbg documentation. So I put a breakpoint on that memory location, and created another list:
Code:
ba r4 003fabe4
Code:
#1 004f7752 668b03 mov ax,[ebx]
Code:
ba r4 006b0f5c
Code:
#1 004d6691 0fb703 movzx eax,word ptr [ebx]
Finally! I had found the function that processes the packet. Somewhere in here was code to calculate the CRC. But where do I go from here? There was waaaaaay too much assembly inside of this function for me to make any sense of it. My breakpoint was being hit too many times. I was stuck. Until something dawned on me ...

... If you generate a packet with a CRC that is deliberately invalid, this function fails and everything gets rejected. A common tactic by programmers for signalling failure is to have a function return a bool. If they are using the standard calling convention, a function that returns a bool would put the bool into the EAX register upon function completion...

So I generated a fake packet with an intentionally invalid CRC and stepped out of the function, and I was welcomed with an EAX=0! My guess was correct - they are returning a bool from this function. Somewhere in there, a failure was being encountered.

I figured this must be a generic function of some sort that handles all packet functions - and one of the functions it calls calculates the CRC. And, assuming that one also uses the bool convention, I just needed to figure out which function within this function was causing it to return false! It was a guess, but it turned out to be the winner. Single stepping through this function and ONLY watching for the assembly instruction call, I determined which function call was causing the main function to abort:
Code:
004935f1 e8cc120600 call image00400000+0xf48c2 (004f48c2)
This function was doing something with our packet, and that something was failing. Stepping into this function, I put a memory access breakpoint on our packet so I could stop at the exact point that it is messing around with our data:
Code:
ba r4 006b0f62
Code:
004f48cd 8b4c2404 mov ecx,[esp+0x4]
Code:
(1) ecx = esp+4
Code:
eax = current crc

Quod Erat Demonstrandum
I was happy, and there was much rejoicing. (yay!)

Conclusion
Reverse engineering is not necessarily about stepping through millions of lines of assembly code and figuring out what each line is doing... It can be as simple as setting up clever breakpoints! |
Strokes of genius, Merth. Strokes of genius. It's good to see your lithe mind cutting code up and reorganizing it in your head.
I'm glad you have the skills to track products of functions like you do. |
Very interesting stuff, I'd be interested in more of this type of information if anyone is willing to give it up :)
|
That was a very interesting read, thanks for sharing this with us, merth. It's not often that you find someone willing to freely share knowledge like this that has taken them a lot of time, effort and research to come by, let alone someone who will go out of their way to post it in a very readable form somewhere anyone interested can have a read, like you did.
Although I have no intention of trying to reverse engineer any programs any time soon, that was still well worth the read and very informative. It's appreciated. |
More in-depth introduction to reverse engineering if anyone is interested.
http://www.acm.uiuc.edu/sigmil/RevEng/ |