Wednesday, May 14, 2014

Java Code Analysis and Optimization with Soot

This is a quick shout out about the project Soot. If you're doing anything even remotely related to static analysis in Java, Soot is the way to go. It's simple, open source, well documented and extremely powerful. Soot can analyze any Java program (source or bytecode), and provide you with the control flow graph (CFG). Here's an example that shows how to construct the CFG for the main method of a class named MyClass.
SootClass c = Scene.v().loadClassAndSupport("MyClass");
c.setApplicationClass();
SootMethod m = c.getMethodByName("main");
Body b = m.retrieveActiveBody();
UnitGraph g = new BriefUnitGraph(b);
Once you get your hands on the CFG, you can walk it, search it and do anything else you would normally do with a graph data structure. 
Soot converts Java code into one of four intermediate representations (Jimple, Baf, Shimple and Grimp). These representations are designed to make it easier to analyze programs written in Java. For example, Jimple maps Java code from its typical stack-based model to a three-registers-based model. You can also make modifications/optimizations to the code and try out new ideas for compiler and runtime optimizations. Alternatively you can "tag" instructions with metadata which can be helpful in building new development tools with powerful code visualization capabilities.
Soot also provides a set of APIs for performing data flow analysis. These APIs can help you to code anything from live variable analysis to very busy expression analysis and more. And finally, Soot can also be invoked from the command-line without having to write any extension code.
So if you have any cool new ideas related to program analysis or optimization, grab the latest version of Soot. Whatever it is that you're trying to do, I'm sure Soot can help you implement it.