I've been exploring the OpenJK (open-source maintained version of Jedi Academy) codebase.
The game engine is a Quake derivative, so even though it has a single-player mode, there's a conceptual separation between the "server" and the "client". The server processes game logic, while the client handles display and input. The line between them is blurred somewhat, because the two components are actually dynamic libraries loaded into the same process.
I wanted to get a better understanding of the communication between these two halves, especially at game startup. Most of the communication takes place over a virtual network device (an in-memory ring buffer; packet sends and receives operate on sections of the ring buffer).
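To make the "virtual network device" idea concrete, here is a minimal sketch of an in-memory loopback queue in C. It is not taken from the OpenJK source: the names (loop_queue, loop_send, loop_recv), the slot count, and the packet size limit are all hypothetical, chosen only to show how sends and receives can operate on slots of a ring buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sizes, picked for illustration only. */
#define LOOP_SLOTS 16          /* must divide 2^32 so the counters can wrap safely */
#define LOOP_MAX_PACKET 1400

typedef struct {
    unsigned char data[LOOP_MAX_PACKET];
    size_t len;
} loop_msg;

typedef struct {
    loop_msg msgs[LOOP_SLOTS];
    unsigned send;             /* total packets ever queued */
    unsigned get;              /* total packets ever dequeued */
} loop_queue;

/* Copy a packet into the next free slot. Returns 0, or -1 if the packet
 * is too large or the ring is full. */
int loop_send(loop_queue *q, const void *data, size_t len) {
    if (len > LOOP_MAX_PACKET) return -1;
    if (q->send - q->get >= LOOP_SLOTS) return -1;
    loop_msg *m = &q->msgs[q->send % LOOP_SLOTS];
    memcpy(m->data, data, len);
    m->len = len;
    q->send++;
    return 0;
}

/* Copy the oldest pending packet into out. Returns its length, or -1 if
 * the ring is empty or out is too small to hold the packet. */
long loop_recv(loop_queue *q, void *out, size_t cap) {
    if (q->get == q->send) return -1;
    loop_msg *m = &q->msgs[q->get % LOOP_SLOTS];
    if (m->len > cap) return -1;
    memcpy(out, m->data, m->len);
    q->get++;
    return (long)m->len;
}
```

A zero-initialized loop_queue starts out empty. In a design like this, the send and receive functions are the natural places to hook in logging, since every message between the two halves passes through them.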
Logging these packets to a file seemed simple enough, but exploring the resulting log seemed like it would be a pain, since most of the packets are serialized binary structures. Thinking about it some more, I realized that the problem was actually very similar to analyzing a network capture using Wireshark.
The basic strategy is to dump the internal packets I'm interested in into a pcap file, then write a Wireshark custom dissector in Lua to analyze the packets. I'm quite happy with how it turned out — I get the full power of Wireshark's filtering and searching, as well as a nice visual dissection for all of the packet types I support in my custom dissector.
I was originally nudged toward this technique because of the packet-like nature of the data I was analyzing, but it works so well that I would consider using it for analyzing more general program traces as well, especially for a large program that I don't understand well.
Wireshark's custom dissector framework seems well-designed and capable. I've been looking for a tool like it for reverse-engineering binary formats, because I often find myself wanting a way to prototype parsing logic with visual feedback: for example, seeing both the extracted value of a field and a highlight on the bytes it occupies in the file. I'm considering creating a fork of Wireshark that keeps just the dissection framework and the hex editor view (or, more likely, just wrapping the file I'm analyzing in a dummy pcap file).
Read on for a walkthrough of the process on a simple example program.
Step-By-Step
We'll start with a small example program. Pretend that it's a big, complicated program that we're having trouble understanding. Pretend it makes some calls to a plugin system, and we want to log them for later analysis using Wireshark.
The first step is to instrument our program so that it saves a pcap file containing our "packets". The pcap format is very simple, so we don't need a library. It's a binary format, with a short header at the beginning of the file, followed by a series of length-delimited packet records. We just need to add two functions: one to write the overall pcap header, and another to write out each packet.
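As a rough sketch of what those two functions could look like in C (the names trace_write_header and trace_write_packet are placeholders I made up, not part of any library): the file header is 24 bytes, each packet record starts with a 16-byte header, and the link-layer type is set to 147, which is the value assigned to USER0. Fields are written in host byte order, and readers detect the byte order from the magic number.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define TRACE_LINKTYPE_USER0 147u  /* reserved for private use */
#define TRACE_SNAPLEN 65535u       /* packets are assumed to be no longer than this */

static int put_u16(FILE *f, uint16_t v) {
    return fwrite(&v, sizeof v, 1, f) == 1 ? 0 : -1;
}

static int put_u32(FILE *f, uint32_t v) {
    return fwrite(&v, sizeof v, 1, f) == 1 ? 0 : -1;
}

/* Write the 24-byte pcap file header. Returns 0 on success, -1 on error. */
int trace_write_header(FILE *f) {
    if (put_u32(f, 0xa1b2c3d4u)) return -1;  /* magic: microsecond timestamps */
    if (put_u16(f, 2)) return -1;            /* format version, major */
    if (put_u16(f, 4)) return -1;            /* format version, minor */
    if (put_u32(f, 0)) return -1;            /* timezone offset, normally 0 */
    if (put_u32(f, 0)) return -1;            /* timestamp accuracy, normally 0 */
    if (put_u32(f, TRACE_SNAPLEN)) return -1;
    if (put_u32(f, TRACE_LINKTYPE_USER0)) return -1;
    return 0;
}

/* Append one packet record: a 16-byte record header followed by the raw
 * bytes. Returns 0 on success, -1 on error. */
int trace_write_packet(FILE *f, const void *data, uint32_t len) {
    struct timespec ts;
    if (timespec_get(&ts, TIME_UTC) != TIME_UTC) return -1;
    if (put_u32(f, (uint32_t)ts.tv_sec)) return -1;            /* seconds */
    if (put_u32(f, (uint32_t)(ts.tv_nsec / 1000))) return -1;  /* microseconds */
    if (put_u32(f, len)) return -1;  /* bytes stored in the file */
    if (put_u32(f, len)) return -1;  /* original packet length */
    if (len > 0 && fwrite(data, 1, len, f) != len) return -1;
    return 0;
}
```

The instrumented program would open the output file in binary mode, call trace_write_header once, and then call trace_write_packet at each point it wants to log, passing the serialized bytes of that "packet".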
Now, running the program will save a file named "packet_dump.pcap" in the current directory. If we open it up in Wireshark, we can see the sequence of "packets" that we logged, as well as their contents in the hex editor. But Wireshark doesn't understand the format of what's inside the packets, so it's not too useful.
We can fix this by giving Wireshark a custom dissector, registered for the link-layer packet type we hardcoded into our pcap file (USER0). Wireshark supports custom dissectors written in Lua.
To compile and run the program, and run Wireshark on the resulting pcap file:
Now, Wireshark will show you the parsed data in each packet. Additionally, things like display filters now work — try entering "example_proto.request_tag == 5" in the display filter box.