Using Soot and TamiFlex to analyze DaCapo

Eric | March 29, 2010

In this tutorial, I describe how to use TamiFlex to facilitate the static analysis of the DaCapo benchmarks with Soot. You can also find this tutorial on the TamiFlex website.

Feel free to use our scripts for this purpose. Our Technical Report contains many further details.

Step 0: Downloading the necessary components

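To analyze DaCapo benchmarks with Soot, first download the following:

  • DaCapo 9.12-bach (dacapo-9.12-bach.jar)
  • the TamiFlex Play-Out Agent (poa.jar)
  • the TamiFlex Play-In Agent (pia.jar)
  • Soot 2.4.0 (soot-2.4.0.jar)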

After downloading the JAR files, your setup should look roughly as follows:

$ ls -1
dacapo-9.12-bach.jar
pia.jar
poa.jar
soot-2.4.0.jar

Step 1: Dumping classes and creating log files

Next, we use the Play-Out Agent. For each DaCapo benchmark configuration, it dumps all class files that the JVM loads when executing that configuration, along with a reflection trace file that records the reflective calls made during the run.

Let us first consider a single run of DaCapo, on avrora-small. Normally, we would run avrora simply by invoking java -jar dacapo-9.12-bach.jar avrora -s small. To activate the Play-Out Agent, we instead use the following command:

$ java -javaagent:poa.jar=out/avrora-small -jar dacapo-9.12-bach.jar avrora -s small
===== DaCapo 9.12 avrora starting =====
===== DaCapo 9.12 avrora PASSED in 5065 msec =====
=============================================
TamiFlex Play-Out Agent Version 1.0
Found 36 new log entries.

The part -javaagent:poa.jar instructs the VM to use the Play-Out Agent. The suffix out/avrora-small tells the agent where to dump the files. Note the additional output “Found 36 new log entries.”: the agent reports that it found 36 new entries for the reflection log file.

You can inspect the log file (and the dumped class files) if you want:

$ ls out/avrora-small/
avrora  cck  Harness.class  java  org  refl.log  sun
$ head out/avrora-small/refl.log
Class.forName;avrora.Main;org.dacapo.harness.Avrora.<init>;26;
Class.forName;java.security.MessageDigestSpi;java.security.Security.getSpiClass;640;
Class.forName;java.util.CurrencyData;java.util.Currency$1.run;128
...
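
Because every log line is a semicolon-separated record, standard text tools are enough to get an overview. The following is a minimal sketch, assuming the field order visible in the excerpt above (kind of call; target; calling method; line number). The function names count_by_kind and forname_targets are our own and not part of TamiFlex.

```shell
# Summarize a TamiFlex reflection log (assumed format: kind;target;source;line).

# Print "<count> <kind>" for every kind of reflective call, most frequent first.
count_by_kind() {
    awk -F';' '{ n[$1]++ } END { for (k in n) print n[k], k }' "$1" | sort -rn
}

# Print the distinct classes that were loaded via Class.forName.
forname_targets() {
    awk -F';' '$1 == "Class.forName" { print $2 }' "$1" | sort -u
}

# Usage, after Step 1 has produced the log:
#   count_by_kind out/avrora-small/refl.log
#   forname_targets out/avrora-small/refl.log
```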

To see what additional arguments the Play-Out Agent supports, use this command:

$ java -javaagent:poa.jar
This agent accepts the following options:
[verbose,][count,]<path>

...

NOTE: The Play-Out Agent requires additional heap space. We therefore advise you to provide the JVM with additional space using the -Xmx flag.

For your convenience, we provide a script that allows you to dump class files and reflection traces for all DaCapo benchmarks and all input sizes.
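
To illustrate the idea, a loop over all benchmarks and sizes could look roughly like the sketch below. It is not the provided script, only a minimal illustration. The benchmark list, the size names, and the -Xmx2G heap setting are assumptions that you should adapt (not every benchmark necessarily supports every size). By default the sketch only prints the commands; set DRY_RUN=0 to execute them.

```shell
# Assumed benchmark and size names for DaCapo 9.12; override via the environment.
BENCHMARKS=${BENCHMARKS:-"avrora batik eclipse fop h2 jython luindex lusearch pmd sunflow tomcat tradebeans tradesoap xalan"}
SIZES=${SIZES:-"small default large"}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi
}

# Run the Play-Out Agent once per benchmark/size pair, dumping into out/<bench>-<size>.
dump_all() {
    for bench in $BENCHMARKS; do
        for size in $SIZES; do
            # -Xmx2G is an assumed value; the agent needs extra heap space.
            run java -Xmx2G -javaagent:poa.jar="out/$bench-$size" \
                -jar dacapo-9.12-bach.jar "$bench" -s "$size"
        done
    done
}

# Usage:
#   dump_all                 # print all commands
#   DRY_RUN=0; dump_all      # actually run them
```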

Step 2: Running Soot

We next want to use Soot to analyze (and potentially transform) the dumped class files, constructing a call graph based on the information from the reflection trace file.

To apply Soot to avrora-small, we can use the following command:

# JRE is assumed to point to the directory containing rt.jar and jce.jar.
# -Xmx10G                          use 10GB heap space
# -cp soot-2.4.0.jar soot.Main     run Soot
# -w -app -p cg.spark enabled      enable Spark
# -p cg reflection-log:...         use the given reflection log
# -cp ...                          classes that Soot should analyze
# -include ...                     include given packages (see below)
# -main-class Harness              use Harness as entry point for call graph
# -d sootified/avrora-small        place transformed classes here
# Harness                          analyze program starting at Harness
$ java \
    -Xmx10G \
    -cp soot-2.4.0.jar soot.Main \
    -w -app -p cg.spark enabled \
    -p cg reflection-log:out/avrora-small/refl.log \
    -cp ${JRE}/jce.jar:${JRE}/rt.jar:out/avrora-small \
    -include org.apache. -include org.w3c. \
    -main-class Harness \
    -d sootified/avrora-small \
    Harness

Soot started on Thu Mar 11 13:33:37 CET 2010
Warning: avrora.sim.mcu.ATMega128$Factory is a phantom class!
Warning: avrora.sim.platform.DefaultPlatform is a phantom class!
Warning: avrora.sim.mcu.MicrocontrollerFactory is a phantom class!
Warning: cck.util.Util is a phantom class!
...
[Call Graph] For information on where the call graph may be incomplete, use the verbose option to the cg phase.
[Spark] Pointer Assignment Graph in 2.5 seconds.
[Spark] Type masks in 1.0 seconds.
[Spark] Pointer Graph simplified in 0.0 seconds.
[Spark] Propagation in 77.9 seconds.
[Spark] Solution found in 77.9 seconds.
Transforming org.w3c.dom.Document...
Transforming org.w3c.dom.Element...
Transforming org.w3c.dom.Node...
Transforming org.w3c.dom.NodeList...
Transforming org.w3c.dom.Text...
...
Writing to sootified/avrora-small/org/w3c/dom/Document.class
Writing to sootified/avrora-small/org/w3c/dom/Element.class
Writing to sootified/avrora-small/org/w3c/dom/Node.class
Writing to sootified/avrora-small/org/w3c/dom/NodeList.class
Writing to sootified/avrora-small/org/w3c/dom/Text.class
...
Soot finished on Thu Mar 11 13:35:21 CET 2010
Soot has run for 1 min. 43 sec.

The parameters -include org.apache. -include org.w3c. are not strictly necessary for avrora, but we recommend using them for DaCapo in general. The reason is that, by default, Soot does not analyze any classes residing in the following packages:

  • java.
  • sun.
  • javax.
  • com.sun.
  • com.ibm.
  • org.xml.
  • org.w3c.
  • org.apache.
  • apple.awt.
  • com.apple.

But some of the DaCapo benchmarks, e.g. batik, consist mostly of classes in org.apache. Therefore, we must explicitly instruct Soot to include these packages.
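
To see which of these default-excluded prefixes actually occur in a dump directory, one can scan the dumped class files. The following is a minimal sketch under the assumption that the directory layout mirrors the package structure (as in out/avrora-small above). The helper names are our own. Which of the reported prefixes you then pass to -include remains a judgment call, since the dump also contains JDK classes.

```shell
# Package prefixes that Soot skips by default (copied from the list above).
EXCLUDED_PREFIXES="java. sun. javax. com.sun. com.ibm. org.xml. org.w3c. org.apache. apple.awt. com.apple."

# Print the default-excluded prefix that a class name falls under, if any.
excluded_prefix_of() {
    for prefix in $EXCLUDED_PREFIXES; do
        case "$1" in
            "$prefix"*) echo "$prefix"; return 0 ;;
        esac
    done
    return 1
}

# List the distinct default-excluded prefixes that occur in a dump directory.
default_excluded_in() {
    (cd "$1" && find . -name '*.class') |
        sed 's|^\./||; s|/|.|g; s|\.class$||' |
        while read -r cls; do excluded_prefix_of "$cls"; done |
        sort -u
}

# Usage:
#   default_excluded_in out/avrora-small
```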

After running Soot you will find the transformed class files on disk:

$ ls sootified/avrora-small/
avrora  cck  Harness.class  org

(In default mode, Soot applies virtually no transformations to the given classes, but we could, of course, have enabled some whole-program optimizations at this point.)

Again, for your convenience, we provide a script to process all DaCapo benchmarks with Soot.

Step 3: Running DaCapo with the transformed class files

Next we use the Play-In Agent to run DaCapo with the transformed class files:

$ java -javaagent:pia.jar=sootified/avrora-small -jar dacapo-9.12-bach.jar avrora -s small
===== DaCapo 9.12 avrora starting =====
===== DaCapo 9.12 avrora PASSED in 5534 msec =====
=============================================
TamiFlex Play-In Agent Version 1.0
Replaced 1060 out of 1066 classes.

The agent reports how many of the loaded classes it replaced with those taken from the given directory (here, 1060 out of 1066). (Note that there are some classes that it may not be able to replace because, by default, Soot neither transforms nor outputs classes in java.lang.* etc. See our comment on the -include flag above for details.) The agent caused the VM to load avrora’s class files from sootified/avrora-small instead of dacapo-9.12-bach.jar. In case you don’t believe us, try the following:

$ echo "MALICIOUS" > sootified/avrora-small/avrora/Main.class
$ java -javaagent:pia.jar=sootified/avrora-small -jar dacapo-9.12-bach.jar avrora -s small
java.lang.reflect.InvocationTargetException
java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
	at org.dacapo.harness.TestHarness.runBenchmark(TestHarness.java)
	at org.dacapo.harness.TestHarness.main(TestHarness.java)
	at Harness.main(Harness.java)
Caused by: java.lang.ClassFormatError: Incompatible magic value 1296125001 in class file avrora/Main
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:621)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
	at java.net.URLClassLoader.access$000(URLClassLoader.java:56)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
	at org.dacapo.harness.DacapoClassLoader.loadClass(DacapoClassLoader.java)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:247)
	at org.dacapo.harness.Avrora.<init>(Avrora.java)
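
The magic value in the ClassFormatError is not arbitrary. A valid class file starts with the four bytes 0xCAFEBABE, whereas our overwritten file starts with the characters “MALI”. Read as a big-endian 32-bit integer, those four bytes give exactly the reported number, which confirms that the VM really read our file. The small sketch below reproduces the calculation; the function name magic_of is our own.

```shell
# Interpret the first four bytes of a string as a big-endian integer,
# the way the JVM reads the magic number of a class file.
magic_of() {
    hex=$(printf '%.4s' "$1" | od -An -tx1 | tr -d ' \n')
    echo "$((0x$hex))"
}

magic_of "MALICIOUS"   # prints 1296125001
```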

NOTE: The Play-In Agent spends time replacing class files as classes are loaded into the VM. This may cause additional overhead in the first iteration of any DaCapo benchmark. We therefore recommend using the -n command-line option of DaCapo to increase the number of iterations, and not measuring the first one.

For your convenience, we provide a script to run all “sootified” benchmarks, using the Play-In Agent.
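
As a minimal sketch of such a run, the helper below builds the Play-In command with an iteration count. It assumes that -n N makes DaCapo run N iterations and report the last one as the timed run. The function names and the DRY_RUN switch are our own. By default the command is only printed.

```shell
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi
}

# run_sootified <benchmark> <size> <iterations>
# Runs the benchmark with classes taken from sootified/<benchmark>-<size>.
run_sootified() {
    run java -javaagent:pia.jar="sootified/$1-$2" \
        -jar dacapo-9.12-bach.jar "$1" -s "$2" -n "$3"
}

# Usage: five iterations, so class replacement overhead falls into the early ones.
#   run_sootified avrora small 5
```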