Once we’ve used miasm to load data into a container, then its time to disassemble instructions. I’ve found that there are two useful modes of disassembling:
- parse a few bytes into an instruction instance, or
- parse many bytes, analyze the instructions, and recover a Control Flow Graph (CFG).
The first approach is useful when dealing with very small snippets, while the second is better when reasoning about sequences of instructions. If you’re picturing your algorithm in terms of IDA’s graph view, then you’ll want to reach for the CFG technique.
Disassembling single instructions
You can use miasm’s disassembler object to parse bytes into an instruction. Note that only the first instruction is parsed.
buf = b'\xB8\x01\x00\x00\x00\x90\x90\x90\x90'
container = miasm.analysis.binary.Container.from_string(buf)
machine = miasm.analysis.machine.Machine('x86_32') # use container.arch when inspecting PE/ELF/etc.
disassembler = machine.dis_engine(container.bin_stream, follow_call=True)
insn = disassembler.dis_instr(offset=0x0)
print(insn)
>>> <miasm.arch.x86.arch.instruction_x86 at 0x7fbaa250f2e8>
print(str(insn))
>>> 'MOV EAX, 0x1'
The mnemonic is accessible as name
and the operands are called args
.
The length of the instruction is available as a property called l
.
assert insn.l == 5
assert insn.name == 'MOV'
print(insn.args)
>>> [ExprId('EAX', 32), ExprInt(0x1, 32)]
An operand instance has some helpful accessors to help you determine the type and value:
arg0 = insn.args[0]
arg1 = insn.args[1]
print(arg0)
>>> ExprId('EAX', 32)
assert arg0.is_id() == True
assert arg0.is_int() == False
assert arg0.is_mem() == False
assert arg0.is_loc() == False
assert arg0.name == 'EAX'
print(arg1)
>>> ExprInt(0x1, 32)
assert arg1.is_id() == False
assert arg1.is_int() == True
assert arg1.is_mem() == False
assert arg1.is_loc() == False
assert arg1.arg == miasm.expression.modint.uint32(0x1)
Disassembling multiple instructions
Most of the time, I deal with long sequences of instructions.
For these, I’ve found miasm’s CFG analysis more useful because it helps me understand the dependencies among instructions.
In this mode, miasm parses many instructions that it groups by basic block.
Miasm follows control flow instructions, such as jmp
or call
, and records links among the basic blocks.
The result of this operation is an AsmCFG
instance, which stands for “AsSeMbly Control Flow Graph”
(miasm also has IrCFG
and IraCFG
objects for reasoning about intermediate representations, too).
This object provides accessors for the recovered basic blocks and their control flow edges.
Within a basic block, you can enumerate the instruction instances and inspect them as shown in the first section.
Sidebar: rendering CFGs
While unlikely to be useful to large scale analysis, miasm can render AsmCFG
instances to Graphviz Dot format.
This is how I’ll demonstrate examples. An empty CFG has an empty image, of course:
import graphviz
cfg = miasm.core.asmblock.AsmCFG()
graphviz.Source(cfg.dot())
The output (empty):
back to disassembling…
To demonstrate features of miasm’s CFG analysis, let’s use the following disassembly snippet:
00000000 <A>:
0: b8 01 00 00 00 mov eax,0x1
5: eb 00 jmp 7 <B>
00000007 <B>:
7: b8 02 00 00 00 mov eax,0x2
c: eb 00 jmp e <C>
0000000e <C>:
e: b8 03 00 00 00 mov eax,0x3
13: eb f9 jmp e <C>
00000015 <D>:
15: b8 04 00 00 00 mov eax,0x4
1a: eb 00 jmp 1c <E>
0000001c <E>:
1c: b8 05 00 00 00 mov eax,0x5
21: eb eb jmp e <C>
These instructions can be broken down into five basic blocks that have labels A
- E
.
Basic block A
flows to B
flows to C
, and D
flows to E
flows to C
.
The two entry points A
and D
both eventually flow to C
.
Basic block C
flows to itself.
If you were to lay them out in IDA, you might see a Y
shape:
We’ll load the raw shellcode into a container, select the appropriate architecture, and create a disassembler.
Then, dis_multiblock
disassembles many instructions, grouping them into basic blocks and control flow edges.
buf = b'\xB8\x01\x00\x00\x00\xEB\x00\xB8\x02\x00\x00\x00\xEB\x00\xB8\x03\x00\x00\x00\xEB\xF9\xB8\x04\x00\x00\x00\xEB\x00\xB8\x05\x00\x00\x00\xEB\xEB'
container = miasm.analysis.binary.Container.from_string(buf)
machine = miasm.analysis.machine.Machine('x86_32') # use container.arch when inspecting PE/ELF/etc.
disassembler = machine.dis_engine(container.bin_stream, follow_call=True)
cfg = disassembler.dis_multiblock(offset=0x0)
This results in the following CFG:
Some things to note:
- miasm started disassembling from address 0x0, which we specified as
offset=0x0
above. - miasm did not find basic blocks
D
orE
, because they are not referenced from the entry point. - the disassembly looks consistent with our initial listing, that’s good!
- miasm created its own names for these basic blocks using a reasonable default schema (
loc_
+ address).
Naming locations with LocationDB
Let’s start by addressing that final point and teach miasm about our names for these basic blocks.
To do this, we’ll use a LocationDB
to associate names with locations within a container.
I’ve found that miasm uses the LocationDB heavily to refer to locations in a Container, so you’ll quickly get familiar with this abstraction.
container = miasm.analysis.binary.Container.from_string(buf)
loc_db = container.loc_db
loc_db.add_location(offset=0x0, name="A")
loc_db.add_location(offset=0x7, name="B")
loc_db.add_location(offset=0xE, name="C")
loc_db.add_location(offset=0x15, name="D")
loc_db.add_location(offset=0x1C, name="E")
machine = miasm.analysis.machine.Machine('x86_32') # use container.arch when inspecting PE/ELF/etc.
disassembler = machine.dis_engine(container.bin_stream, loc_db=loc_db, follow_call=True) # note: added reference to `loc_db`
cfg = disassembler.dis_multiblock(offset=0x0)
Good! Miasm now renders our basic blocks titles and labels using the names we expect. Since miasm never does this automatically, we should always apply relevant names when setting up a workspace. For example, we can apply the names of imported and exported symbols to a PE container:
pe.loc_db.add_location(offset=pe.entry_point, name='entry', strict=False)
for symbol, va in miasm.jitter.loader.pe.get_export_name_addr_list(pe.executable):
if isinstance(symbol, int):
# ordinal
pe.loc_db.add_location(offset=va, name=f'#{symbol}', strict=False)
else:
pe.loc_db.add_location(offset=va, name=symbol, strict=False)
for ((dll, symbol), addresses) in miasm.jitter.loader.pe.get_import_address_pe(pe.executable).items():
for va in addresses:
pe.loc_db.add_location(offset=va, name=f'{dll}!{symbol}', strict=False)
Returning to our shellcode example, we can query the LocationDB by offset and name:
loc_db.get_offset_location(0x0)
>>> <LocKey 0>
loc_db.get_name_location('A')
>>> <LocKey 0>
loc_db.get_location_names(loc_db.get_name_location('A'))
>>> frozenset({b'A'})
That final line is neat to me: miasm supports assigning multiple names to the same location. This is helpful when a function is exported many times with different names.
Disassembling from many entry points
We know, from prior knowledge, that our shellcode snippet has multiple entry points: A
and D
.
How can we get miasm to recognize both of these?
I found that the best way is to generate a CFG for each entry point and then merge the graphs together into a single, master CFG:
job_done = set()
a_cfg = disassembler.dis_multiblock(loc_db.get_name_offset("A"), job_done=job_done)
d_cfg = disassembler.dis_multiblock(loc_db.get_name_offset("D"), job_done=job_done)
cfg = miasm.core.asmblock.AsmCFG(loc_db)
cfg.merge(a_cfg)
cfg.merge(d_cfg)
cfg.apply_splitting(disassembler.loc_db,
dis_block_callback=disassembler.dis_block_callback,
mn=disassembler.arch,
attrib=disassembler.attrib,
pool_bin=disassembler.bin_stream)
(In my testing, I discovered a couple ways to build a single, master CFG and that the above strategy shows the best performance. Please review the Appendix for further details.)
Conclusion
This is the second post in a series. Next, I’ll collect some thoughts on how to use miasm to identify functions and other constructs. Maybe we’ll get to emulation (still haven’t attempted this yet).
When it comes to disassembly, I like how miasm is very explicit about its analysis and results. It doesn’t hide side effects within stateful objects, rather it returns the results and lets you decide how to merge, diff, and/or serialize them. I can imagine building an “undo” feature when using miasm, while this would be nearly impossible in IDAPython or vivisect.
The downside, of course, is that you have to stitch together analyzers as you see fit. I wonder if somewhere there is a repository of reusable miasm analyzers. I haven’t found one yet.
Appendix I: Miasm disassembly internals
I’ve shared a technique for disassembling multiple entry points and merging their control flow graphs. For me, this construction worked the best, but it took multiple tries to get here. Let’s learn a bit more about miasm’s algorithm so we can maintain performance and avoid a subtle bug.
Here
is the implementation of dis_multiblock
:
def dis_multiblock(self, offset, blocks=None, job_done=None):
if job_done is None:
job_done = set()
if blocks is None:
blocks = AsmCFG(self.loc_db)
todo = [offset]
while len(todo):
...
target_offset = int(todo.pop(0))
if (target_offset is None or
target_offset in job_done):
continue
cur_block, nexts = self._dis_block(target_offset, job_done)
todo += nexts
blocks.add_block(cur_block)
...
This is a straightforward recursive descent disassembler that uses a job queue of virtual addresses called todo
.
Starting from the given virtual address, miasm disassembles a basic block, saves it off, and adds the successors to the queue.
It then pops a new virtual address from the queue and repeats, until the queue is empty.
To avoid looping endlessly over blocks that branch to themselves, miasm uses an index called job_done
to track the virtual addresses of blocks it has already considered.
It skips blocks that it’s already seen.
You can provide an initialized set for job_done
to restrict the paths that miasm will disassemble.
Miasm places the collected basic blocks into the AsmCFG
instance named blocks.
Again, you can provide an initialized instances for blocks
, in which case miasm will merge the results together.
With this knowledge, we can create a naive solution in which we disassemble from multiple points, reusing the same AsmCFG
instance:
cfg = miasm.core.asmblock.AsmCFG(loc_db=loc_db)
disassembler.dis_multiblock(offset=loc_db.get_name_offset("A"), blocks=cfg)
disassembler.dis_multiblock(offset=loc_db.get_name_offset("D"), blocks=cfg)
That graph looks great! It has all the basic blocks that we expect.
It works by disassembling from each entry point, and having miasm place the results into the same mutable AsmCFG
instance.
However, there are two issues with this naive solution:
- the performance is sub-optimal, because
job_done
is not maintained across calls. - there’s unexpected O(n2) runtime thanks to internal basic block bookkeeping.
job_done
First, miasm does too much work because we’re not providing a shared job_done
index to dis_multiblock
.
Instead, the local variable get reinitialized on each call, so miasm forgets which blocks it has already processed and repeats work - wasting our time.
In our example, we call dis_multiblock
twice, for entry points A
and D
.
Miasm will disassemble blocks A
, B
, and then C
during the first invocation, then D
, E
, and then C
(again!).
The work for C
(and any blocks that flow from it) is done twice.
While our example is simple, and C
is trivial to disassemble, C
may be extremely complex in the real world.
We might be left wondering why miasm seems slow at analysis.
Therefore, we should tweak our solution to share an instance of job_done
across calls to dis_multiblock
:
cfg = miasm.core.asmblock.AsmCFG(loc_db=loc_db)
job_done = set()
disassembler.dis_multiblock(offset=loc_db.get_name_offset("A"), job_done=job_done, blocks=cfg)
disassembler.dis_multiblock(offset=loc_db.get_name_offset("D"), job_done=job_done, blocks=cfg)
And the results are exactly the same:
Block splitting
However, I found that this solution still showed very poor performance when analyzing real programs.
The culprit is that miasm invokes a routine
called apply_splitting
at the tail of dis_multiblock
, whose purpose is to split basic blocks that have a jump target in the middle of them.
For example, consider the assembly snippet:
A:
mov eax, 1
B:
mov eax, 2
jmp B
Before splitting, miasm incorrectly computes the following CFG, which groups all instructions into a single basic block:
After splitting, miasm correctly recovers the CFG, splitting A
to account for the jump target B
:
This behavior is good; however, there is trouble in the way dis_multiblock
interacts with apply_splitting
.
Every time apply_splitting
is invoked, miasm
checks
all basic blocks in the CFG:
def apply_splitting(self, loc_db, dis_block_callback=None, **kwargs):
...
todo = set(self.blocks)
while todo:
cur_block = todo.pop()
If we use the same AsmCFG
in across multiple calls to dis_multiblock
, then we repeatedly process the same blocks many times.
While it may look innocuous, when disassembling kernel32.dll
with its 2000 exports, miasm spends most of its time re-re-re-checking blocks for splits.
The runtime performance is O(n2).
The trick is to create a separate CFG for each entry point and merge them at the end (also using the job_done
trick, too).
This way, each basic block is checked for splitting just once, and runtime performance is back to O(n).
job_done = set()
a_cfg = disassembler.dis_multiblock(loc_db.get_name_offset("A"), job_done=job_done)
d_cfg = disassembler.dis_multiblock(loc_db.get_name_offset("D"), job_done=job_done)
cfg = miasm.core.asmblock.AsmCFG(loc_db)
cfg.merge(a_cfg)
cfg.merge(d_cfg)
cfg.apply_splitting(disassembler.loc_db,
dis_block_callback=disassembler.dis_block_callback,
mn=disassembler.arch,
attrib=disassembler.attrib,
pool_bin=disassembler.bin_stream)
The code is a little ugly, but its easily wrapped in a function.
For me, this brought the runtime of disassembling 100 exports from kernel32.dll
down from 90 seconds to 3 seconds!
Wrapping everything together, here’s the utility function I’ve been using recently:
def disassemble_all(container, entries):
machine = miasm.analysis.machine.Machine(container.arch)
mdis = machine.dis_engine(container.bin_stream, loc_db=container.loc_db, follow_call=True)
job_done = set()
cfgs = {}
for va in entries:
cfgs[va] = mdis.dis_multiblock(va, job_done=job_done)
blocks = miasm.core.asmblock.AsmCFG(container.loc_db)
for cfg in cfgs.values():
blocks.merge(cfg)
blocks.apply_splitting(container.loc_db,
dis_block_callback=mdis.dis_block_callback,
mn=mdis.arch,
attrib=mdis.attrib,
pool_bin=mdis.bin_stream)
return blocks
entries = [pe.entry_point]
entries.extend([va for symbol, va in miasm.jitter.loader.pe.get_export_name_addr_list(pe.executable)])
cfg = disassemble_all(pe, entries)