JANUARY 23, 2009

Manual Protocol Reverse Engineering

UPDATE: In Spring of 2010 BreakingPoint unveiled the pioneering Cyber Tomography Machine to help you with problems such as the ones described in this post. Read more.

Last week, Dustin wrote about some tools that aid in reversing network protocols, outlining three in particular: Protocol Informatics, PDB, and Discoverer. As Dustin lamented, nothing out there offers a simple push-button analysis, and I'm quick to agree. Accurate, rapid analysis of proprietary binary protocols is pretty hard, requiring familiarity with usual socket programming practices, a determined patience, and a whole lot of note-taking. Closed binary protocols are rarely as simple as ICMP ping; they often have no real documentation, and the people who do know how they work usually keep their mouths shut.

With that said, I'd like to share some screenshots and discuss some of the techniques I've developed to deal with this arcane art.

A Simple Select Statement

screenshot: basic_select

The above is a simple SQL statement (starting at offset 0x6a), repeated ten times in fairly quick succession, then churned through a very simplistic packet payload difference engine "payload_diffgen.rb." It's written in Ruby, using PacketFu to pick out packets from a recorded session, and some simple heuristics to flag the bytes that change from action to action; those bytes are red and masked with an arbitrary 0xbd byte ("bd" because it kind of looks like a pair of upside-down glasses, and because the lines are on the outside so the kerning is nice). As you can see, the whole payload is 188 bytes, 31 of which are easily recognizable as text, giving a total binary "bloat" factor of about 6x.
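For anyone who wants to play along at home, the gist of the diff engine is simple enough to sketch. The code below is not payload_diffgen.rb itself (which, as noted later, isn't in releasable shape); it's a minimal approximation of the same idea, with a capture file name, helper names, and the 188-byte filter chosen by me for illustration, and PacketFu calls written from memory of its API.

    # diff_payloads.rb -- a minimal sketch of the diffing idea above; not the
    # real payload_diffgen.rb. File name, helper names, and the 188-byte
    # filter are assumptions.
    require 'rubygems'
    require 'packetfu'

    MASK = 0xbd   # bytes that vary between samples get masked with 0xbd

    # Pull the TCP payloads out of a recorded session with PacketFu.
    def payloads_from_pcap(fname)
      raw = PacketFu::PcapFile.new.file_to_array(:filename => fname)
      raw.map    { |r| PacketFu::Packet.parse(r) }.
          select { |pkt| pkt.is_tcp? && !pkt.payload.empty? }.
          map    { |pkt| pkt.payload }
    end

    # Given several same-length payloads, keep the constant bytes and mask
    # every position that ever changed.
    def diff_template(payloads)
      rows = payloads.map { |p| p.unpack('C*') }
      rows.first.each_with_index.map { |b, i|
        rows.all? { |r| r[i] == b } ? b : MASK
      }.pack('C*')
    end

    samples  = payloads_from_pcap('session.pcap').select { |p| p.length == 188 }
    template = diff_template(samples)
    puts template.unpack('C*').map { |b| '%02x' % b }.join(' ')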

The first ten bytes, colored black, are part of the known packet header for this particular protocol, so I'm not too interested in those (other people have already documented that part). Across these ten queries, the byte at 0x0c is the only one that changes. The mutant_diffs() function pulls that byte out of the session history, and we can see at a glance that it's steadily incrementing by four on each successive run of the command. Thus, we can postulate that this byte is some kind of counter. Of course, we can't tell whether the surrounding bytes have anything to do with it from such a small sample -- what happens when this value rolls over after it hits 0xfe? Byte 0x0b might increment, or 0x0d, or maybe nothing at all. It all depends on the endianness of the value, or whether it's just a single char.
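mutant_diffs() isn't public either, but the step it performs is easy to approximate: collect one offset's value from every sample in the session history and look at the deltas. A sketch, reusing the samples array and assuming the helper names from the snippet above:

    # Extract the byte at +offset+ from every payload in the history.
    def byte_history(payloads, offset)
      payloads.map { |p| p.unpack('C*')[offset] }
    end

    # Per-sample deltas, mod 256, so a rollover past 0xfe still reads as +4.
    def deltas(values)
      values.each_cons(2).map { |a, b| (b - a) % 256 }
    end

    history = byte_history(samples, 0x0c)
    p history          # e.g. [0xc2, 0xc6, 0xca, ...]
    p deltas(history)  # [4, 4, 4, ...] suggests a counter stepping by four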

The Response

screenshot: basic_select_response

Here, we have the response packet to my initial select. Again, there's some text -- 17 bytes' worth -- and even more binary overhead (288 bytes), giving an almost 17:1 binary:text ratio. Clearly, a lot is going on in this packet, and there's a lot more than one byte of difference between these two ostensibly identical responses. For brevity, I'll just concentrate on a couple of the more unusual features. Byte 0xba, as revealed by mutant_diffs(), is not a serial counter, but is more of a cyclical value -- and more importantly, seems to have no correlation to the initial request. This value is keeping track of something internal to the server, and we will only be able to tell if the client does anything with this value later on in the session. So, it's something to keep an eye on for later (this is part of that note-taking aspect required for protocol analysis).
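A quick way to separate a serial counter from a cyclical value like this one is to take the same per-offset history and ask whether the steps are constant or whether the readings repeat. The classifier below is a sketch along those lines; byte_history() is carried over from the snippet above, and response_payloads is an assumed variable name for the captured responses.

    # Classify a tracked byte: constant step => counter, repeated readings
    # without a fixed step => cyclical, otherwise call it irregular.
    def classify(values)
      steps = values.each_cons(2).map { |a, b| (b - a) % 256 }.uniq
      if steps.length == 1
        "counter, step #{steps.first}"
      elsif values.uniq.length < values.length
        "cyclical over #{values.uniq.sort.inspect}"
      else
        "irregular"
      end
    end

    puts classify(byte_history(response_payloads, 0xba))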

Elsewhere in the payload, byte 0xd7 is incrementing at a rate of four per response, so it's surely related to byte 0x0c in the initial request -- even though each iteration is two less than the corresponding request value (0xc2 produces 0xc0, 0xc6 produces 0xc4, etc.). Unlike the request's 0x0c byte, though, this byte is surrounded by nulls. This implies that those nulls are just padding characters, and the original sequence number is just a char. By the way, it's this kind of contextual awareness that today's automated analysis tools are notoriously bad at.
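Testing that kind of request/response relationship is just a matter of pairing the two histories up. A sketch, again with assumed variable names for the aligned request and response captures:

    # Pair each request's 0x0c byte with its response's 0xd7 byte and check
    # whether the difference is constant (request_payloads and
    # response_payloads are assumed to be aligned, one entry per exchange).
    req = byte_history(request_payloads,  0x0c)
    res = byte_history(response_payloads, 0xd7)
    p req.zip(res).map { |a, b| (a - b) % 256 }.uniq   # [2] => response = request - 2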

Sessions Separated by Time

screenshot: select_two_days

This screenshot shows the differences between the same select statement being sent on different days. The requests are identical -- both ask for the same data, are 188 bytes long, etc. -- but we can see that there are a lot of differences between the first ten (issued on day 1) and the next ten (issued on day 2). We can hypothesize that there are variables in the request that are responsible for time-stamping. The byte at 0x18 increments by one between yesterday and today, so we can expect it to increment by one tomorrow -- unless this is a coincidence.

Coincidental correlation is a common error in protocol analysis (both automated and manual). However, in this case, we have a little more support for the time theory. Byte 0x54 is identical to byte 0x18, so it's more likely that they're both tracking the same event. It may still be unrelated to time -- it may be a process ID marker, say -- but at least we've ruled out the idea that it's a constant magic value, or a length, or an offset, or anything else related to the structure of the request itself.
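That "tracking the same event" check generalizes nicely: scan every pair of offsets and keep the ones whose bytes agree in every captured sample. It's one of the few parts of this work that automates well, and it's what makes a bigger history (more samples, more days) so valuable. A sketch, using the same unpacked-payload representation as before (samples is again an assumed variable name):

    # Find offset pairs whose bytes are identical in every sample. Constant
    # padding bytes will pair up too; those are easy to spot and ignore.
    def twin_offsets(payloads)
      rows = payloads.map { |p| p.unpack('C*') }
      len  = rows.map { |r| r.length }.min
      twins = []
      (0...len).to_a.combination(2) do |i, j|
        twins << [i, j] if rows.all? { |r| r[i] == r[j] }
      end
      twins
    end

    twin_offsets(samples).each { |i, j| printf("0x%02x tracks 0x%02x\n", i, j) }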

Time-Sensitive Responses

screenshot: select_two_days_response

The responses from yesterday and today similarly show more differences. Again, the data is identical, but we can see a series of bytes, 0x1f through 0x22, that change between sessions. In socket programming, four-byte groups are very common, since they indicate a value at least 32 bits wide -- a typical long. In this case, the value at 0x1f[0,4] yesterday was 0x0d0e232d, while today's is 0x0f0c2102. Depending on endianness, they might also be 0x2d230e0d vs 0x02210c0f. However, it seems unlikely that this is a millisecond or even a second counter -- remember, we're looking at a set of ten responses, each separated by about two seconds. If it were a timestamp with at least second-level fidelity, we would have seen it changing within yesterday's sample alone. Besides, bytes 0x1f[0,3] all seem awfully close together -- byte 0x1f is up two, and bytes 0x20 and 0x21 are both down two. So they do change over time, but they're not measuring time. Maybe things will get clearer when we look at other aspects of the session.
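Reading a four-byte group both ways is a one-liner with Ruby's String#unpack ('N' is big-endian, 'V' is little-endian); the payload variable names below are placeholders for one response from each day's capture:

    # Interpret the four bytes at 0x1f under both byte orders.
    [yesterdays_response, todays_response].each do |payload|
      field = payload[0x1f, 4]
      be = field.unpack('N').first   # big-endian 32-bit
      le = field.unpack('V').first   # little-endian 32-bit
      printf("big-endian: 0x%08x   little-endian: 0x%08x\n", be, le)
    end
    # yesterday: 0x0d0e232d / 0x2d230e0d;  today: 0x0f0c2102 / 0x02210c0f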

Two Different Selects

screenshot: two_selects

In my final example, I have two different select statements. One is "select region_name from regions;" and the other is "select region_name,abbr from regions;". As you'd expect, the bytes where the data starts to diverge are blanked out. However, byte 0x1f, immediately preceding the select statement, pops out immediately as a length variable. The first statement is 31 bytes long (if you drop the terminating semicolon), which is 0x1f, while the second statement is 36 bytes, aka hex 0x24. And because of the differing lengths, and the lack of padding characters in the protocol, any analysis after this divergence point is going to be off; my first-pass analyzer here is not smart enough to adjust for the differences. If we want to experiment with select statements that return different data, we should probably stick to a single length common to both, so as to eliminate that sort of self-induced interference.
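The length-variable check is also easy to mechanize once you suspect it: locate the ASCII statement in the payload and compare the byte just before it to the statement's length (semicolon dropped). A sketch with assumed payload variable names:

    # Does the byte immediately preceding the statement equal its length?
    def length_prefixed?(payload, stmt)
      off = payload.index(stmt)
      return false unless off && off > 0
      payload.unpack('C*')[off - 1] == stmt.length
    end

    p length_prefixed?(select_one_payload, "select region_name from regions")       # 31 == 0x1f
    p length_prefixed?(select_two_payload, "select region_name,abbr from regions")  # 36 == 0x24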

Two Different Responses

screenshot: two_selects_response

Given that I'm eliciting a very different response, it's unsurprising that payload_diffgen splashes red all over my data. We've run into a problem similar to the last packet's: not only are the result sets different, but more importantly, they're of different lengths. However, up until the divergence point, we can still make some use of the payload.

Notice bytes 0x0c through 0x1b. Like four, sixteen is a magical length. Looking at these bytes side by side, we can see that they appear to change randomly -- and here's where some background in other protocols comes in handy. Whenever I see 16 bytes that a) sit near the top of a payload and b) stay static when the payload is static but change wildly when the content differs, I immediately think, "MD5 hash." Unfortunately, simply hashing the payload after the suspected hash doesn't produce a match, nor do permutations on hashing the data -- but I'm still quite confident that this is a hash of something.
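For the curious, "permutations on hashing the data" boils down to a brute check along these lines: hash candidate slices of the traffic and compare each digest against the suspected 16-byte field. As noted, none of the straight hashes matched for this protocol; the payload variable name and offset below are assumptions.

    require 'digest/md5'

    # MD5 every contiguous slice of a payload and report the first slice whose
    # digest equals the suspected hash field; returns nil when nothing matches.
    def find_md5_source(payload, hash_off)
      suspect = payload[hash_off, 16]
      len = payload.length
      (0...len).each do |start|
        (start + 1..len).each do |stop|
          return [start, stop] if Digest::MD5.digest(payload[start, stop - start]) == suspect
        end
      end
      nil
    end

    p find_md5_source(select_one_response, 0x0c)   # => nil here; no straight hash matched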

Eventually, experimentation bore this theory out. It turns out that if you change the select statement in a way that's syntactically meaningless (in this case, adding an extra space), this MD5 value changes as you'd expect, which is a solid indicator that the theory is correct. Sadly, it's not a straight hash of the select statement. That's okay, though. Elsewhere in this protocol, I've noticed a tendency to xor values together and then hash the result, so I suspect something similar is going on here.
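Since the actual inputs to that xor step are still unknown, the best I can offer is the shape of the test: xor two candidate fields together, hash the result, and compare against the suspect bytes. Everything here beyond the xor-then-MD5 shape (field names, offsets) is placeholder.

    require 'digest/md5'

    # Xor two byte strings byte-by-byte (b is zero-padded or truncated to a's
    # length) and MD5 the result.
    def xor_then_md5(field_a, field_b)
      xored = field_a.unpack('C*').zip(field_b.unpack('C*')).
                      map { |x, y| x ^ (y || 0) }.pack('C*')
      Digest::MD5.digest(xored)
    end

    suspect = select_one_response[0x0c, 16]
    p xor_then_md5(candidate_field_a, candidate_field_b) == suspect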

And repeat, and repeat, and repeat...

We've looked at a single query-response pair, and really have only scratched the surface of the protocol. Altogether, I'd say there are a hundred-ish unknowns across the whole session that were investigated using these highly manual techniques. I wish there were more automation. Flagging bytes as length fields, and padding accordingly, is the next big trick. Today, there really isn't a lot out there that replaces this classical scientific method. Toolchains like PDB are helpful (for fuzzing), as are stats engines like PI (for simple protocols). I'm sure Microsoft's Discoverer is fantastic, but alas, its implementation is also quite secret.

At any rate, if there's enough public interest in this sort of thing, I'm hopeful I'll have something beyond mere screenshots to share. Believe me, I'm not keeping my code under wraps because it's too awesome for mere mortals -- it's really just incredibly messy and hackish, and certainly not ready for public scrutiny. If and when I do release the next automated analysis whatsamajigger, you'll certainly read about it here.
