Visualizing Dataflow
Posted by Mike Haller
on Monday, June 21. 2010
at 22:49
in Java
Not being an AOP guru, I wondered whether there was something else I could use: something unobtrusive that can be applied to existing systems. The first thing I'm trying is Java's debugging APIs, namely JDI, to automatically step through a program and record method entries.
Method parameters represent data flow into methods: all parameter values (ObjectReferences actually, no primitives currently) are recorded as a dataflow relationship between the caller and the receiver based on the call stack.
Assume the following little example program:
public class Application {
    public void doSomething() {
        Person person = new Person("Chuck Norris");
        Order order = new Order();
        order.setBuyer(person.getName());
    }
}

The program is started in debug mode, so the VM suspends and waits for a debugger to connect. Next, I start my tool, which connects to the JDI port and registers debugging hooks. Once these hooks are configured, the VM is resumed and the threads begin to spawn. After bootstrapping the main method (it contains only new Application();), the tool receives the low-level debug events, stops all threads, records parameter values, resolves object references, sends out high-level events and visualizes them using a graph viewer.
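To make the attach-and-record cycle more concrete, here is a minimal sketch of what such a tool could look like with plain JDI. It is not the actual tool. The class name DataflowTracer, the helper names, the port 8000 and the printed format are all assumptions for illustration. It also assumes the target VM was started with something like -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000.

```java
import com.sun.jdi.Bootstrap;
import com.sun.jdi.IncompatibleThreadStateException;
import com.sun.jdi.ObjectReference;
import com.sun.jdi.StackFrame;
import com.sun.jdi.ThreadReference;
import com.sun.jdi.VMDisconnectedException;
import com.sun.jdi.Value;
import com.sun.jdi.VirtualMachine;
import com.sun.jdi.connect.AttachingConnector;
import com.sun.jdi.connect.Connector;
import com.sun.jdi.event.Event;
import com.sun.jdi.event.EventSet;
import com.sun.jdi.event.MethodEntryEvent;
import com.sun.jdi.event.VMDeathEvent;
import com.sun.jdi.event.VMDisconnectEvent;
import com.sun.jdi.request.EventRequest;
import com.sun.jdi.request.MethodEntryRequest;
import java.util.Map;

public class DataflowTracer {

    public static void main(String[] args) throws Exception {
        // Assumption: the suspended target VM listens on localhost:8000.
        AttachingConnector connector = findSocketConnector();
        Map<String, Connector.Argument> params = connector.defaultArguments();
        params.get("hostname").setValue("localhost");
        params.get("port").setValue("8000");
        VirtualMachine vm = connector.attach(params);

        // Register the debug hook: one event per method entry, all threads stopped.
        MethodEntryRequest request = vm.eventRequestManager().createMethodEntryRequest();
        request.setSuspendPolicy(EventRequest.SUSPEND_ALL);
        request.enable();
        vm.resume();

        try {
            while (true) {
                EventSet events = vm.eventQueue().remove(); // blocks until events arrive
                for (Event event : events) {
                    if (event instanceof MethodEntryEvent) {
                        record((MethodEntryEvent) event);
                    } else if (event instanceof VMDeathEvent || event instanceof VMDisconnectEvent) {
                        return;
                    }
                }
                events.resume(); // let the suspended threads continue
            }
        } catch (VMDisconnectedException e) {
            // The target VM went away (or crashed); nothing left to trace.
        }
    }

    // Looks up the standard socket-based attaching connector.
    static AttachingConnector findSocketConnector() {
        for (AttachingConnector c : Bootstrap.virtualMachineManager().attachingConnectors()) {
            if ("com.sun.jdi.SocketAttach".equals(c.name())) {
                return c;
            }
        }
        throw new IllegalStateException("no socket attaching connector available");
    }

    // Prints one line per object-typed argument: caller -> receiver : value.
    static void record(MethodEntryEvent event) throws IncompatibleThreadStateException {
        if (event.method().isNative()) {
            return; // assumption: skip native frames, their arguments may not be readable
        }
        ThreadReference thread = event.thread();
        StackFrame receiverFrame = thread.frame(0);
        ObjectReference receiver = receiverFrame.thisObject(); // null for static methods
        ObjectReference caller = thread.frameCount() > 1 ? thread.frame(1).thisObject() : null;
        for (Value argument : receiverFrame.getArgumentValues()) {
            if (argument instanceof ObjectReference) { // primitives are ignored
                System.out.println(describe(caller) + " -> " + describe(receiver)
                        + " : " + describe((ObjectReference) argument));
            }
        }
    }

    // Hypothetical label format: type name plus the debugger's unique object id.
    static String describe(ObjectReference ref) {
        return ref == null ? "<no instance>" : ref.referenceType().name() + "@" + ref.uniqueID();
    }
}
```

A real tool would hand these records to the graph viewer instead of printing them, and would probably narrow the request with addClassExclusionFilter to cut down on the number of events.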
The output (made with the JUNG graph visualization library) is then displayed on screen. You can watch the objects being created and the data flow evolving in "realtime" (debugging is very slow, hence the quotes).

For the simple example, it looks nice. The green dots are application code, orange dots are java.* classes. The little orange dot in the bottom right corner is probably the Class's string constant pool, just ignore it for now.
You can see that the example data (the constant string Chuck Norris) flows into a Person object and also, somehow, into the Order object. Note that the diagram does not show that the Application was involved in any way, e.g. that the application was in control when the data was copied from the Person object into the Order object. This may seem a little odd at first, since as humans we tend to see the Person object as the sole owner of the String Chuck Norris. But in the virtual reality, the String owns itself of course, since it's a normal Object and lives in the constant pool space.
I'm not sure whether it would be better to show data as edges instead of vertices. But if data objects like the string are represented as standalone vertices, they can be referenced multiple times and their value flows into other objects. The more often the value flows from one object to another, the more weight can be put into the edge's stroke. You can see this in the example above: the edge between "Chuck Norris" and the Person object is thicker because more data flows along it.
The Person accesses the String object in its constructor (1) the first time:
public class Person {
    private final String name;

    public Person(String name) {
        this.name = name; // (1)
    }

    public String getName() {
        return name; // (2)
    }
}
and then a second time when its getter method accesses the value (2), to return it to the application which forwards it to the setter method of the Order. Since the setter method in the Order is only called once, it is the sole and last access (3) from the Order to "Chuck Norris" and therefore is only a thin line.
public class Order {
    private String buyer;

    public void setBuyer(String buyer) {
        this.buyer = buyer; // (3)
    }
}
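The edge weighting can be illustrated by replaying the three accesses marked (1) to (3) against a tiny weighted graph. This is only a sketch of the idea, not the JUNG-based code behind the screenshot. The class DataflowGraph and its methods are made up for this example, and it uses newer Java syntax than the Java 6 VM shown in this post.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DataflowGraph {

    // Maps an edge "data -> accessor" to the number of times the value flowed along it.
    private final Map<String, Integer> weights = new LinkedHashMap<>();

    // Records one flow of a data vertex into an accessing object, adding 1 to the edge weight.
    public void recordFlow(String data, String accessor) {
        weights.merge(data + " -> " + accessor, 1, Integer::sum);
    }

    // Returns the current weight of an edge, or 0 if the value never flowed there.
    public int weight(String data, String accessor) {
        return weights.getOrDefault(data + " -> " + accessor, 0);
    }

    // Replays the accesses from the example program in the order they happen.
    public static DataflowGraph replayExample() {
        DataflowGraph graph = new DataflowGraph();
        graph.recordFlow("Chuck Norris", "Person"); // (1) constructor
        graph.recordFlow("Chuck Norris", "Person"); // (2) getter
        graph.recordFlow("Chuck Norris", "Order");  // (3) setter
        return graph;
    }

    public static void main(String[] args) {
        DataflowGraph graph = replayExample();
        graph.weights.forEach((edge, weight) -> System.out.println(edge + " weight=" + weight));
    }
}
```

With these three accesses, the edge to Person ends up with weight 2 and the edge to Order with weight 1, which matches the thick and thin lines in the diagram.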
The main problem with using JDI, besides the fact that it was not meant for this kind of processing, is that it's very unstable. The VM crashes regularly:
# A fatal error has been detected by the Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x000000006dd098a0, pid=10416, tid=5180
#
# JRE version: 6.0_16-b01
# Java VM: Java HotSpot(TM) 64-Bit Server VM (14.2-b01 mixed mode windows-amd64 )
# Problematic frame:
# V  [jvm.dll+0x4798a0]
#
# An error report file with more information is saved as:
# E:\workspaces\workspace\sandbox-debugging\hs_err_pid10416.log
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
#
Program exited:1
How sad. Perhaps I should check out AspectJ for my toy project instead of the debugger API.

Have you checked whether it works better with a more up-to-date HotSpot version? (Or with a 32-bit client VM?)