Efficient Runtime Verification
Early phases: The J-LO "Java Logical Observer"
As my Diploma Thesis (under supervision of Klaus Indermark and Mira Mezini), back at the RWTH Aachen University, I developed the Java Logical Observer, J-LO for short, a tool for runtime-checking temporal assertions. J-LO was the first tool to combine a temporal logic from the world of runtime verification with predicates in the form of pointcuts in the aspect-oriented programming language AspectJ. For the first time, this approach made it possible to monitor a program’s execution using AspectJ technology but without having to manually keep track of internal state. This "keeping track of state" was done by the J-LO tool by the means of a special alternating automaton.
During this time I submitted a student paper to the OOPSLA 2004 Student Research Competition. There, the paper was selected for the final round. An extended version was then awarded the first price in the undergraduate category of the ACM Grand Finals of the Student Research Competition! The price was awarded at the same banquet at which the annual Turing award was given to Vint Cerf and Robert E. Kahn that year, which was a great honour to me. Want to find more about the Student Research Competition? Click here.
Transition phase: From J-LO to tracematches
The J-LO tool had two problems that could easily be identified. First it was really slow. A program instrumented with and monitored by J-LO ran several orders of magnitude slower than without monitoring enabled. Secondly, the LTL based specification language used by J-LO was really hard to comprehend.
Hence, in 2005 I was looking out for opportunities to continue my research on the topic and "make it scale", both in terms of runtime performance and size of comprehensible specifications. Lucky for me that I was awarded an AOSD-Europe sponsorship that year to attend the AOSD 2005 conference where I met Laurie Hendren, my current supervisor. Laurie had just started up a new research group dedicated to the development of the AspectBench Compiler. During AOSD 05, this group was just about to submit a paper to OOPSLA 2005, about tracematches. As it turned out, tracematches have a lot of concepts in common with J-LO, although they use a different syntax (regular expressions, not LTL) and an entirely different implementation strategy (constraints in a finite state machine instead of alternating automata). Furthermore, it turned out that grants were available to get me a Ph.D. position at McGill and work on the static optimizations of tracematches. After all, I already knew the AspectBench Compiler quite a bit, because J-LO had been implemented using that compiler as well.
Current research: static optimization of tracematches
Tracematches can be optimized in at least two different ways. Firstly, the runtime monitor that is generated for a tracematch needs to be optimized. Pavel Avgustinov and Julian Tibble showed how trace monitoring tracking single groups of objects can be made feasible using two techniques called leak eliminitation and indexing. Leak elimination takes care that monitor state associated with objects can be garbage collected whenever the associated objects are garbage collected themselves. Objects that are still required for matching however, are strongly referenced, to keep the semantics intact. All this is done through a static analysis of the finite state machine only. Through a similar analysis, a clever indexing scheme is devised that looks up monitor state through the fastest possible indexing structure, again depending on the current state.
Even after such optimizations are applied, significant overhead can still remain. This is particularly the case for tracematches that (1) apply in many places of a program, (2) apply at places that are executed a lot and (3) at those places switch states back and forth. On such pattern notorious is the HasNext pattern:
Do not call next() twice on an Iterator without calling hasNext() in between!
Iterators are used a lot in many Java programs. Moreover, they naturally occur in (potentially hot loops). Within those loops, usually there is a sequence of interleaved calls to next(), then hasNext(), then next()… This leads to the automaton switching states at ever step (between q0 and q1, see figure), which is expensive.
In my ongoing research we published two different static whole-program analyses that identify parts of a given program where the tracematch’s internal state does not need to be updated. The first analysis, published in ECOOP 2007, consists of three different stages: Quick-check, flow-insensitive "consistent shadows" analysis and flow-sensitive "active shadows" analysis. The quick check rules out patterns that only match in part, e.g. for programs that never call next() in the above example. The flow-insensitive checker uses points-to information to find groups of instrumentation points ("shadows" in AspectJ lingo) that are incomplete. For instance, if there is an iterator on which hasNext() is called but never next(), references to that iterator need not to be instrumented. The last checker, is flow-sensitive and tries to take into account he order in which instrumentation points might execute. Unfortunately, that last checker turned out to be too imprecise and expensive. While the first two stages gave huge improvements, the last stage gave no improvements at all.
This is why we reconsidered the flow-sensitive approach and then in this technical report published a set of intra-procedural optimizations for tracematches. Those use extremely precise flow-sensitive and intra-procedural alias information to disambiguate references. References from other methods are taken into account via a relatively cheap global points-to analysis that yields summary information for those methods. As it turns out, this mechanisms works very well for exactly those patterns that we care about: Patterns in hot loops, often confined to a single method. In some cases, where patterns are not confined to a single method, we found out that preconditions were missing on method declarations – a very interesting side result!
The reason for why this analysis works so well is very interesting. Usually one would think that intra-procedural analysis of tracematches is impossible, because when entering a method it is impossible to know which state the tracematch automaton may be in, without taking global information into account. It turns out that this is correct, however after the first execution of a single instrumentation point the state for a particular object can usually be inferred. If then in the rest of the method the state for that same object is changed again, its state is already known! That way, we can easily optimize executions in hot loops etc..
RacerAJ: Effective race detection using AspectJ
In 2008, Klaus Havelund and I implemented a language extension to AspectJ that supports new pointcuts to support the detection of programming errors related to multithreading. Using this language extension we implemented RacerAJ, an AspectJ based version of the Racer algorithm (a variant of Eraser) that uses our language extension. Using RacerAJ we found 70 data races in heavily audited NASA code of the K9 Rover Executive. Our AspectJ language extension moreover enables other researchers to express algorithms like Racer in a very concise manner.





