IDA Pro being an old and time-proven platform for binary analysis,
many plugins have grown up around it. There are custom-made plugins for new processors
and file formats. There are deobfuscators, exporters, data visualizers,
object reconstructors and other stuff.
No one can foresee and implement everything. Some “innovations” are
the result of software analysis improvements: malware authors
come up with something new if the old obfuscation methods do not work anymore.
New platforms and compilers require different analysis: for example,
the latest GNU compilers generate quite complex code which requires a much
deeper approach.
An open architecture gives users the opportunity to extend the core engine and build on it.
Be it a small one-day script or plugin, or something fundamental and serious,
it is for the benefit of everyone.
That’s why the decompiler will have an API. While it is itself built on top of IDA,
you will be able to build on top of the decompiler. This is a pretty natural growth pattern.
Below are the descriptions in no particular order:
This plugin reconstructs object types used in the program. The object boundaries
can be approximately determined as a side effect.
This plugin uses data flow analysis to find out possible value ranges of local
variables and global data.
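To make the idea of value ranges concrete, here is a minimal sketch of interval arithmetic, one common way to approximate possible values. The `Interval` class and the analyzed fragment are made up for illustration; nothing here is decompiler API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed integer range [lo, hi]; a hypothetical helper for this sketch."""
    lo: int
    hi: int

    def join(self, other):
        """Smallest interval covering both (used where control flow merges)."""
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))

    def add(self, other):
        """Range of the sum of two values drawn from the two intervals."""
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def clamp_below(self, bound):
        """Refine the range on the branch where `value < bound` holds."""
        return Interval(self.lo, min(self.hi, bound - 1))


# Assumed fragment: x is one input byte;  if (x < 16) y = x + 1; else y = 0;
x = Interval(0, 255)
y_then = x.clamp_below(16).add(Interval(1, 1))
y_else = Interval(0, 0)
y = y_then.join(y_else)
print(y)  # Interval(lo=0, hi=16)
```

A real analysis would also have to handle loops, overflow and memory aliasing, which this sketch ignores.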
The output of the Typist is leveraged into class (object) definitions.
Class hierarchy emerges as a result. The notion of virtual functions
comes into existence.
Finds code sequences which can be converted into inline functions.
The output becomes more readable.
This plugin optimizes functions by ‘slicing’ them down to the input
argument values that can actually occur. For example, if a function with two arguments
is known to be always called with the second argument equal to zero, the plugin can
remove all code which handles non-zero cases. A more generic form of this plugin
performs slicing on other data values, not only on function arguments.
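The zero-argument example can be sketched on a toy expression tree. The tuple-based node format and the `specialize` function are assumptions made up for this illustration; the real decompiler would work on its own internal representation.

```python
def specialize(node, known):
    """Fold a tiny statement tree given known variable values (toy model)."""
    kind = node[0]
    if kind == "const":
        return node
    if kind == "var":
        name = node[1]
        return ("const", known[name]) if name in known else node
    if kind == "eq":
        left = specialize(node[1], known)
        right = specialize(node[2], known)
        if left[0] == "const" and right[0] == "const":
            return ("const", int(left[1] == right[1]))
        return ("eq", left, right)
    if kind == "if":
        cond = specialize(node[1], known)
        if cond[0] == "const":
            # The condition is decided, so only one branch can ever run.
            return specialize(node[2] if cond[1] else node[3], known)
        return ("if", cond,
                specialize(node[2], known),
                specialize(node[3], known))
    if kind == "return":
        return ("return", specialize(node[1], known))
    raise ValueError(f"unknown node kind: {kind}")


# int f(a, b) { if (b == 0) return a; else return b; }
body = ("if", ("eq", ("var", "b"), ("const", 0)),
        ("return", ("var", "a")),
        ("return", ("var", "b")))

sliced = specialize(body, {"b": 0})
print(sliced)  # ('return', ('var', 'a'))
```

With nothing known about `b`, the same call returns the tree unchanged, which is the safe default.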
Data flow visualizer. It uses information provided by the decompiler
engine and other plugins. May have several different display methods.
The least intrusive display is in the form of mouse hints (locations where the current
variable is used/defined, its possible values, whether it is tainted). It can also display
graphs and plain text. Other plugins will have their own visualization methods
but this plugin will provide services for other plugins to use.
Performs taint analysis and displays potential uses of untrusted data.
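As a rough sketch of what such a plugin computes, the snippet below propagates taint forward through straight-line code. The statement format and the choice of `recv` as a source and `memcpy` as a sink are assumptions for the example only.

```python
def find_tainted_sinks(statements, sources, sinks):
    """Forward-propagate taint through straight-line code (toy model).

    Each statement is (dest, func, args); dest is None for plain calls.
    """
    tainted = set()
    findings = []
    for index, (dest, func, args) in enumerate(statements):
        arg_tainted = any(arg in tainted for arg in args)
        if func in sinks and arg_tainted:
            findings.append((index, func))
        if dest is not None:
            if func in sources or arg_tainted:
                tainted.add(dest)
            else:
                tainted.discard(dest)  # overwritten with clean data
    return findings


program = [
    ("buf", "recv", []),                     # untrusted network data
    ("n", "strlen", ["buf"]),                # derived from tainted data
    ("k", "const", []),                      # clean constant
    (None, "memcpy", ["dst", "buf", "n"]),   # tainted source and length
    (None, "memcpy", ["dst", "src", "k"]),   # clean
]
print(find_tainted_sinks(program, sources={"recv"}, sinks={"memcpy"}))
# [(3, 'memcpy')]
```

Branches, loops and pointer aliasing are deliberately left out; a real implementation would sit on top of the data flow information described above.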
Memory allocation verifier. It can detect typical problems such as
failure to verify the result of a memory allocation, double frees, and
frees of non-allocated memory.
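The three problems named above can be illustrated on a linear trace of heap events. The event names (`alloc`, `check`, `use`, `free`) are invented for this sketch, and a real verifier would have to follow every path through the function rather than a single trace.

```python
def verify_heap(events):
    """Check a linear trace of (op, pointer) events for heap misuse."""
    live, freed, unchecked = set(), set(), set()
    problems = []
    for op, ptr in events:
        if op == "alloc":
            live.add(ptr)
            freed.discard(ptr)
            unchecked.add(ptr)      # result not compared against NULL yet
        elif op == "check":
            unchecked.discard(ptr)
        elif op == "use":
            if ptr in unchecked:
                problems.append(("unchecked allocation", ptr))
        elif op == "free":
            if ptr in live:
                live.remove(ptr)
                freed.add(ptr)
                unchecked.discard(ptr)
            elif ptr in freed:
                problems.append(("double free", ptr))
            else:
                problems.append(("free of non-allocated memory", ptr))
    return problems


trace = [("alloc", "p"), ("use", "p"), ("free", "p"),
         ("free", "p"), ("free", "q")]
for problem in verify_heap(trace):
    print(problem)
```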
Verifies that object boundaries are respected and that there are no overflows.
Detects unreachable functions and removes them from further analysis.
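At its simplest this is a reachability search over the call graph. The sketch below assumes the graph is already available as a plain dictionary and ignores indirect calls, which in practice are the hard part.

```python
from collections import deque


def unreachable_functions(call_graph, entry_points):
    """Return functions in `call_graph` not reachable from any entry point."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        caller = queue.popleft()
        for callee in call_graph.get(caller, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return sorted(set(call_graph) - seen)


# Hypothetical call graph: caller -> list of callees.
graph = {
    "main": ["parse", "run"],
    "parse": [],
    "run": ["parse"],
    "debug_dump": ["helper"],
    "helper": [],
}
print(unreachable_functions(graph, ["main"]))  # ['debug_dump', 'helper']
```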
This is a generic name for plugins which verify consistent use of programming
idioms. For example, if before modifying a variable we acquire a lock
in all program locations but one, we have an idiom violation. There are many
programming idioms and there can be many different idiom verifiers.
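The lock example might be checked along these lines. Here the idiom ("writes to `counter` happen under `counter_lock`") is given up front, whereas a real verifier would first have to infer it from the majority of locations; the operation format is also an assumption of this sketch.

```python
def unlocked_writes(functions, variable, lock):
    """Return names of functions that write `variable` without holding `lock`."""
    violations = []
    for name, ops in functions.items():
        held = False
        for op, target in ops:
            if op == "acquire" and target == lock:
                held = True
            elif op == "release" and target == lock:
                held = False
            elif op == "write" and target == variable and not held:
                violations.append(name)
                break
    return violations


# Hypothetical per-function operation lists.
functions = {
    "increment": [("acquire", "counter_lock"), ("write", "counter"),
                  ("release", "counter_lock")],
    "decrement": [("acquire", "counter_lock"), ("write", "counter"),
                  ("release", "counter_lock")],
    "reset": [("write", "counter")],
}
print(unlocked_writes(functions, "counter", "counter_lock"))  # ['reset']
```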
Generic name for plugins which export information into other systems. The output
can be the ubiquitous XML or good old SQL databases.
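For the SQL case, a minimal sketch using Python's standard `sqlite3` module could look like this. The `functions` table layout and the sample rows are invented for the example.

```python
import sqlite3


def export_functions(conn, functions):
    """Store (address, name, size) rows in a hypothetical `functions` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS functions ("
        "address INTEGER PRIMARY KEY, name TEXT NOT NULL, size INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO functions (address, name, size) VALUES (?, ?, ?)",
        functions,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
export_functions(conn, [(0x401000, "main", 96), (0x401060, "parse", 212)])
rows = conn.execute("SELECT name FROM functions WHERE size > 100").fetchall()
print(rows)  # [('parse',)]
```

Once the data is in a database, other tools can query it without knowing anything about the decompiler.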
Generic name for plugins which modify the decompiler output. The goal can vary
tremendously from making the output more human readable to optimizing or instrumenting it.
CodeSlicer and Inliner are examples of such plugins.
Generic name for plugins which translate assembly text into microcode.
Microgens are also responsible for mapping CPU registers into microcode
registers and resolving memory references. Microgens ‘port’ the decompiler
to new processors and platforms. Ideally, we need to divide them into two parts:
a processor-specific part and an operating system (environment) specific part.
Generic name for plugins which modify the assembly text to conform to the
decompiler's assumptions. An example: low level assembly instructions which
are not used by compilers and therefore cannot be decompiled are replaced
by equivalent function calls. These plugins are add-ons to microgens.
A plugin which modifies the core decompiler engine by adding a new transformation rule.
For example, if some data is known to be read-only but the decompiler
has no means of knowing it, a plugin could replace “load memory”
instructions by “load constant” instructions for this data.
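The read-only example can be sketched as a rewrite over a list of instructions. The tuple encoding of microcode and the address table are purely illustrative, since the real microcode format is not described here.

```python
def fold_readonly_loads(microcode, readonly):
    """Replace loads from known read-only addresses with constants.

    `microcode` is a list of tuples such as ("load", dst, address);
    `readonly` maps addresses to their fixed contents (toy model).
    """
    result = []
    for ins in microcode:
        if ins[0] == "load" and ins[2] in readonly:
            result.append(("const", ins[1], readonly[ins[2]]))
        else:
            result.append(ins)
    return result


code = [
    ("load", "r1", 0x5000),        # read-only table entry
    ("load", "r2", 0x7000),        # ordinary writable memory
    ("add", "r3", "r1", "r2"),
]
print(fold_readonly_loads(code, {0x5000: 42}))
# [('const', 'r1', 42), ('load', 'r2', 28672), ('add', 'r3', 'r1', 'r2')]
```

After such a rule fires, ordinary constant propagation can simplify the code further.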
Plugins like CleanBounds, VeriHeap, and Idiomizer can be used to solve today’s practical problems.
Other plugins can be used to facilitate binary analysis and make it less time-consuming.
I tried to come up with a list of plugins I’d personally like to have.
The list is far from being exhaustive. Feel free to add to it 😉
Plugin names and descriptions are completely fictional.