Packs and phases in Soot
Eric | November 26, 2008
This is the fourth post in a series of blog posts about frequently asked questions on using Soot. Today’s topic is packs and phases in Soot.
One frequent question that comes up on the Soot mailing list is when to run a particular analysis in Soot. Soot’s execution is divided into a set of packs, and each pack contains different phases. The question can therefore be rephrased as “In which pack do I have to run my analysis or transformation?”. This tutorial tries to help you answer this question.
jb: Jimple Body Creation
The diagram to the right shows you the different packs that exist in Soot. First, Soot applies the jb pack to every single method body, or in other terms to every method that has a body. Native methods such as System.currentTimeMillis() have no body. The jb pack is fixed and it is concerned with the creation of the Jimple representation. It cannot be changed!
Whole-program packs
Soot next applies four whole-program packs:
- cg, the call graph pack,
- wjtp, the whole-jimple transformation pack,
- wjop, the whole-jimple optimization pack, and
- wjap, the whole-jimple annotation pack.
All of these packs can be changed, and in particular one can add SceneTransformers to these packs that conduct a whole-program analysis. A SceneTransformer accesses the program through the Scene in order to analyze and transform the program. This code snippet here adds a dummy transformer to the wjtp pack:
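    import java.util.Map;
    import soot.PackManager;
    import soot.Scene;
    import soot.SceneTransformer;
    import soot.Transform;

    public static void main(String[] args) {
        // Register a whole-program transformer in the wjtp pack.
        PackManager.v().getPack("wjtp").add(
            new Transform("wjtp.myTransform", new SceneTransformer() {
                protected void internalTransform(String phaseName,
                        Map options) {
                    // Print all application classes known to the Scene.
                    System.err.println(Scene.v().getApplicationClasses());
                }
            }));
        soot.Main.main(args);
    }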
Note though, that whole-program packs are not enabled by default. You have to state the -w option on Soot’s command line to enable them.
Jimple packs jtp, jop, jap
Similar to Soot’s whole-program packs, Soot then applies, again to each body, a sequence of three packs:
- jtp, the jimple transformation pack,
- jop, the jimple optimization pack, and
- jap, the jimple annotation pack.
jtp is empty and enabled by default. This is usually where you want to place your intra-procedural analyses.
jop comes pre-equipped with a set of Jimple optimizations. It is disabled by default and can be enabled by using Soot’s -O command line option, or by using the switch -p jop enabled.
jap is the annotation pack for Jimple. Here, annotations are added to each Jimple body that let you or others or a JVM assess the results of the optimizations. By default, this pack is enabled but all default phases in the pack are disabled. Hence, if you add your own analysis to this pack it will automatically be enabled by default.
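The following code snippet enables the null pointer tagger and registers a new BodyTransformer which prints out the tags for each statement in every method:
    import java.util.Map;
    import soot.Body;
    import soot.BodyTransformer;
    import soot.PackManager;
    import soot.PhaseOptions;
    import soot.Transform;
    import soot.Unit;
    import soot.options.Options;

    public static void main(String[] args) {
        // Register a per-method transformer in the jap pack.
        PackManager.v().getPack("jap").add(
            new Transform("jap.myTransform", new BodyTransformer() {
                protected void internalTransform(Body body, String phase, Map options) {
                    // Print the tags attached to every statement of this body.
                    for (Unit u : body.getUnits()) {
                        System.out.println(u.getTags());
                    }
                }
            }));
        Options.v().set_verbose(true);
        // Switch on the null pointer tagger, which is disabled by default.
        PhaseOptions.v().setPhaseOption("jap.npc", "on");
        soot.Main.main(args);
    }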
Note that every Transform added to a (non-whole) Jimple pack must be a BodyTransformer.
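The enabling rule for jap described above (the pack is on, its built-in phases are off, and anything you add is on) can be pictured with a small toy model. This is only a sketch in plain Java: the class ToyPack and its methods are made up for illustration and are not part of Soot's API. Only the phase names jap.npc and jap.myTransform come from this post.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a pack whose phases can be switched on and off; NOT Soot's API. */
public class ToyPack {
    // Phase name -> enabled flag, kept in registration order.
    private final Map<String, Boolean> phases = new LinkedHashMap<>();

    /** Registers a built-in phase with an explicit default. */
    public void addDefaultPhase(String name, boolean enabledByDefault) {
        phases.put(name, enabledByDefault);
    }

    /** Registers a user phase; as described in the post, it starts out enabled. */
    public void addUserPhase(String name) {
        phases.put(name, true);
    }

    /** Plays the role of a phase option such as "jap.npc on". */
    public void enable(String name) {
        if (!phases.containsKey(name)) {
            throw new IllegalArgumentException("unknown phase: " + name);
        }
        phases.put(name, true);
    }

    /** Returns the phases that would run, in registration order. */
    public List<String> phasesToRun() {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : phases.entrySet()) {
            if (e.getValue()) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ToyPack jap = new ToyPack();
        jap.addDefaultPhase("jap.npc", false);   // built-in, off by default
        jap.addUserPhase("jap.myTransform");     // user phase, on by default
        System.out.println(jap.phasesToRun());
        jap.enable("jap.npc");
        System.out.println(jap.phasesToRun());
    }
}
```

In this model, only the user phase runs until jap.npc is switched on explicitly, which is the same effect the setPhaseOption call has in the real snippet.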
Packs bb and tag
As the diagram at the top shows, Soot next applies the packs bb and tag to each body. The bb pack converts the (optimized and tagged) Jimple bodies into Baf bodies. Baf is Soot’s stack-based intermediate representation from which Soot creates bytecode. Last but not least, the tag pack aggregates certain tags. For instance, if multiple Jimple (or Baf) statements share the same line number tag, then Soot will retain this tag only on the first instruction that carries it, to gain uniqueness.
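To make the line-number example concrete, here is a tiny sketch of one literal reading of that sentence: each line number is kept on the first instruction that carries it and dropped from all later ones. This is a plain-Java illustration, not Soot's actual aggregator code. The class LineTagDemo and its method are hypothetical, and instructions are reduced to a list of (possibly missing) line numbers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy illustration of "keep a line number tag only on its first instruction"; NOT Soot code. */
public class LineTagDemo {

    /**
     * Input: one line number per instruction (null = no tag).
     * Output: the same list, with every repeated line number replaced by null.
     */
    public static List<Integer> keepFirstOccurrence(List<Integer> lineTags) {
        Set<Integer> seen = new HashSet<>();
        List<Integer> result = new ArrayList<>();
        for (Integer line : lineTags) {
            // Set.add returns true only the first time a given line number is seen.
            boolean first = line != null && seen.add(line);
            result.add(first ? line : null);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(keepFirstOccurrence(Arrays.asList(10, 10, 11, 11, 12)));
    }
}
```

Whether Soot's own aggregator treats only consecutive instructions or the whole body this way is not something this sketch settles. It only shows the idea of making each tag unique.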
The Dava body pack db
For a little more than a year now, Soot has had an additional pack, db, not shown on the slide at the top. Its sole use is to enable or disable certain transformations when decompiling code into Java using the -f dava command line option. It contains no actual transforms, and nothing should be added to it.
Which packs are enabled when?
Another big point of confusion is which packs are enabled when. The following two documents explain all the settings in question and their defaults:
- Soot command line (watch out for the -W and -O options)
  -O enables the packs bop, gop, jop and sop (for example, it sets -p jop enabled:true); -W enables wjop and wsop
- Soot phase options (explains every single phase in every pack and its settings)
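As a quick summary of the rules in this post, the following toy model lists which of the packs discussed here would run for a given combination of -w, -O and -W. It is a sketch in plain Java and does not call Soot. The class PackSchedule and its method are invented for illustration, and it covers only the packs named in this post (it leaves out bop, gop, sop, wsop and db). It also assumes that wjop only runs when whole-program mode is active.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the pack order described in this post; NOT Soot's API. */
public class PackSchedule {

    /**
     * Returns the packs that would run, in order.
     *
     * @param wholeProgram  models -w
     * @param optimize      models -O
     * @param wholeOptimize models -W (assumed here to matter only together with -w)
     */
    public static List<String> activePacks(boolean wholeProgram,
                                           boolean optimize,
                                           boolean wholeOptimize) {
        List<String> packs = new ArrayList<>();
        packs.add("jb");                 // always runs, for every method body
        if (wholeProgram) {
            packs.add("cg");
            packs.add("wjtp");
            if (wholeOptimize) {
                packs.add("wjop");       // disabled unless -W is given
            }
            packs.add("wjap");
        }
        packs.add("jtp");                // enabled by default, initially empty
        if (optimize) {
            packs.add("jop");            // disabled unless -O is given
        }
        packs.add("jap");                // enabled, but its built-in phases are off
        packs.add("bb");
        packs.add("tag");
        return packs;
    }

    public static void main(String[] args) {
        System.out.println(activePacks(false, false, false));
        System.out.println(activePacks(true, true, true));
    }
}
```

If an analysis does not seem to run, checking it against a table like this is often the fastest way to spot a missing -w.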

Hi,
I would like to know how I can export the call graph of the whole program to a file, for example as XML.
Thank you.
Hi Amir.
I don’t think that there’s a ready-made class for this but you can easily do it yourself. Ondrej once posted some code that dumps it to a text file. See here…
Hi Eric,
Thanks for your reply.
I would like to know how I can get the call graph for a single method rather than for the whole scene.
I wrote a class that extends BodyTransformer, and I would like to get the call graph of a method in internalTransform(Body body,….).
Thanks.
That’s quite easy. Either (1) you just get the whole-program call graph from the scene and then query the outgoing edges of this graph for the method that you are interested in, or (2) you simply inspect all invoke statements in the method manually to see what may be called.
Hello,
I see that recent 1.6 HotSpot JVMs do quite a lot of optimizations by themselves, so Soot doesn’t help as much now as it could a few years ago.
But I noticed that the biggest performance problem with recent JVMs for me is the lack of object inlining (for boxed primitive types, for example); creating millions of Doubles really costs a lot.
Does Soot, or some other generally available optimizer for Java, support object inlining?
Hello.
Your observation about modern JITs is certainly correct. About object inlining: no, I am not aware of such an optimization. The problem with such an approach may be that most of the time programmers are aware of the performance impact of using objects to represent primitive values, and therefore they only do so if they have to, e.g. when storing primitive values in hash maps. In this case, the problem lies in the data structures and the type system rather than in the program itself, and therefore there is little or no optimization potential. But this is just my theory, and it may still be worth investigating this on actual programs.
Eric