Monitoring code quality
I've been looking for tools to measure and visualize code quality for some time now. Every once in a while, a tool comes along which aggregates results from quality measurement and monitoring tools like PMD, Findbugs, Checkstyle etc., but I hadn't yet found one which attracted my interest. Read on to find out about Codehaus' Sonar.
My requirements for such a tool are these:
- It must run as a server and persist the data. The Maven project site is nice, but it's a snapshot view in time. I want to be able to compare the current situation of the project to its historic evolution. I want to be able to tell my boss whether our efforts to improve code quality really made a difference or not, and how much improvement we gained.
- It must be able to import historic data. A project lasts for years, and the lifecycle of software tools is sometimes shorter than the enterprise projects we're working on. Hence, I'd like to be able to import build information of a project from its tagged releases in version control.
- It must provide the information about code quality in a meaningful and easy to grasp way. Tons of XML lines are not easy to read, so I want to have few numbers and few charts, with the ability to dig into the details when I want to.
- It must support the Maven build process and must be integrated into the automated build and release process. If manual steps are involved, they will be forgotten when team members leave and rejoin.
- It should be customizable in regard to what and how it measures, like adding new coding rules or disabling rules which are not appropriate. That includes filtering out specific packages or components.
Yesterday, I stumbled across a relatively new project at Codehaus named Sonar. Peter Chilcott replied to a question I tweeted regarding persisting JUnit results.
Sonar is a web application and can be run standalone. Sonar's Maven plugin integrates nicely into the automated build and release process. It provides meaningful and beautifully aggregated information about the health of your software. It's free and easy to administer. Sonar is exactly what I was looking for. It's extensible using plugins, although the currently available plugins lack variety.
Sonar has a live installation at Codehaus which monitors many of the Codehaus projects. Have a look at the Nemo page for a live online demo.
In the upper left corner, you will find the first metric which introduces the project you're looking at. It shows the amount of code we're dealing with. This metric will give you a clue about the size of the project. In this case, 1600 lines of code is pretty small. Therefore, we expect the overall complexity to be rather small and I don't expect any critical quality issues.
Online tools like Ohloh show the size of open source projects. If you ever wanted to know how many lines have been written for Linux, Eclipse or your favourite media player, go ahead and take a look. It's pretty amazing what we get for free and take for granted today.
The next block shows the amount of documentation within the code. Only 10%, that seems bad. However, since the project is very small (see lines of code), perhaps the code itself is readable and understandable enough so it does not need any more documentation.
The third block shows the amount of duplication within the code. That usually means Copy'n'Paste code. Copy+paste itself is not necessarily bad, but it often leads to duplication of code and that is bad. If you've got duplicate code, have a look at Jeff Atwood's Real Ultimate Programming Power.
Whew, that's the meat we've been looking for: the complexity obviously shows how complex the project is. What does it mean? A high complexity number per method means that your methods are too large, for example with multiple nested if/for blocks or just overly long. Remember, a method should not be larger than 10-20 lines of code. If it's longer, this might indicate that it's doing too much and you should refactor, e.g. extract a new method. The complexity per class shows what the classes are responsible for. Again, if the number is high, it's an indication that your classes are doing too many different things. A unit should only do what it was meant for: one thing.
Sonar even displays the complexity as a histogram, so you can see how many classes fall into which complexity category. You can hover over the diagram to see a bigger view of the three classes Simple, Average and Complex. It's a good sign when the bars on the left side are high and the bars on the right side are low.
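To make the per-method complexity number more concrete, here is a minimal sketch in Java. The class `ShippingCost` and its pricing rules are invented for illustration and are not taken from any project Sonar monitors. The first method nests all its decisions in one place; the second returns the same results using a guard clause and two extracted helpers, which is the kind of refactoring a high complexity number should nudge you towards.

```java
/** Hypothetical example: the same rule written with deep nesting and after refactoring. */
final class ShippingCost {

    // Before: nested decisions in a single method drive its complexity up.
    static int nestedCents(int weightGrams, boolean express, boolean international) {
        int cost = 0;
        if (weightGrams > 0) {
            if (international) {
                if (express) {
                    cost = 2500;
                } else {
                    cost = 1500;
                }
            } else {
                if (express) {
                    cost = 900;
                } else {
                    cost = 400;
                }
            }
            if (weightGrams > 1000) {
                cost += 300;
            }
        }
        return cost;
    }

    // After: a guard clause plus two small helpers, each doing one thing.
    static int refactoredCents(int weightGrams, boolean express, boolean international) {
        if (weightGrams <= 0) {
            return 0;
        }
        return baseCents(express, international) + heavySurchargeCents(weightGrams);
    }

    private static int baseCents(boolean express, boolean international) {
        if (international) {
            return express ? 2500 : 1500;
        }
        return express ? 900 : 400;
    }

    private static int heavySurchargeCents(int weightGrams) {
        return weightGrams > 1000 ? 300 : 0;
    }

    public static void main(String[] args) {
        // Both variants must agree; only their structure differs.
        System.out.println(nestedCents(1200, true, false));
        System.out.println(refactoredCents(1200, true, false));
    }
}
```

The behaviour is unchanged, but each method in the refactored version contains fewer decision points, so the complexity per method goes down even though the class as a whole still encodes the same rules.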
The compliance shows how often coding rules have been violated.
These are often things which are "in the way" while programming, stuff which you don't think about while you're concerned with implementing the business requirements from the specification. Stuff which is purely technical, often not fun to write and which no project manager ever thinks about or wants to know. And that's what a software developer's job is also about: remembering that these rules have to be thought of and obeyed, too. This is the make it right part of the three steps make it work, make it right, make it fast. In this case, 85% of the code is fine, as it obeys all rules. The other 15% of the code needs inspection and perhaps correction.
Compliance and number of violations have a direct relationship. Think of the compliance as an indicator for "How good is our code, if technically perfectly written code would be 100%". The number of violations shows the absolute number of code lines we have to fix. 83 violations is not much. We can probably fix them within an hour or so. From the bars I can see that the first thing I should look at is the violations regarding Usability. This is not the usability the end-user of the software experiences, but the usability experienced by some other programmer working with my code. It often boils down to coding style glitches, confusing things, duplications, unnecessary code etc.
Violations are categorized into five main categories:
Efficiency violations may have a negative impact on performance or throughput of your software. In an enterprise project, tiny code improvements suggested by these alerts seem to be useless compared to rough performance impacts like long database access. However, a penny saved is a penny got.
Maintainability violations make it hard for your fellow colleagues (and of course for yourself) to read the code - especially after some years.
Portability violations are indicators for sloppy code which may run on the developer's machine, but misbehave or even crash on the customer's computer.
Reliability violations are really, really bad. This is code which behaves fine, or so you think. But then, some day under heavy load, the code turns out to have subtle bugs and just doesn't do what it is supposed to do. Reliability violations are the worst kind of coding failures, as they are the cause of many untraceable bugs. ("Somewhere deep down in some library, setRollbackOnly() is set, but nobody knows when or where. Took days to find out and fix it.")
Usability violations show how awkward it is to use the code. If you're writing a library, framework or any other kind of public API, this is the part where you'll probably try to be very clear and clean.
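As an illustration of what such a violation can look like, here is a small Java sketch of an empty catch block, a pattern that PMD flags with its EmptyCatchBlock rule. I'm assuming it would show up under Reliability; which category a rule lands in depends on the rule configuration. The class `ConfigLoader` and the default port are hypothetical.

```java
/** Hypothetical example of a swallowed exception and a stricter alternative. */
final class ConfigLoader {

    // Violation: the exception is swallowed, so a typo in the configuration
    // silently falls back to the default and nobody notices.
    static int parsePortSwallowing(String raw) {
        int port = 8080; // assumed default, for illustration only
        try {
            port = Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            // empty catch block: the problem disappears here
        }
        return port;
    }

    // Fixed: the failure is reported together with the offending value.
    static int parsePortStrict(String raw) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid port: " + raw, e);
        }
    }

    public static void main(String[] args) {
        // The broken value is accepted without any hint that something went wrong.
        System.out.println(parsePortSwallowing("80a0"));
        try {
            parsePortStrict("80a0");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The first variant behaves fine as long as the input is valid, which is exactly why this kind of bug is so hard to trace once it shows up.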
Test-infected people tend to look at the code coverage metric first. It is one of the easier quality metrics to grasp. Naturally, we know that the goal is to reach 100% code coverage. However, we also know from our profession and experience that it's impossible to reach perfection. (It's also undesirable from an economic point of view. Which employer would like to pay employees for wasting time trying to write the perfect software? It's simply not possible.) On the other hand, even 100% code coverage still does not take every possibility into account.
As with the other quality metrics, the percentage of successfully executed unit tests should be 100%. While in the development phase, it's okay that a test sometimes fails. But when working in a team, it is crucial that code being committed to the central code repository is always, at any time, compilable and buildable. For us, a test failure is a build failure. Whenever a test case fails, the build is broken and the software is incorrect and cannot be released by definition. Fixing the test case has the highest priority as it ensures that the software will behave as expected, at least in the areas covered by tests.
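A minimal Java sketch of that last point, using a made-up `Stats` helper: a single call executes every line of `average()`, so a coverage tool would report the method as fully covered, yet an input the test never tried still breaks it.

```java
/** Hypothetical helper used to show the limits of a coverage number. */
final class Stats {

    static int average(int[] values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum / values.length; // integer division by zero for an empty array
    }

    public static void main(String[] args) {
        // This single call runs every line of average().
        System.out.println(average(new int[] {2, 4, 6}));

        // The fully covered method still fails on input the test never exercised.
        try {
            average(new int[0]);
        } catch (ArithmeticException e) {
            System.out.println("empty input: " + e.getMessage());
        }
    }
}
```

Coverage tells you which code was executed by the tests, not which inputs and situations were considered.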

The Compliance shown as a package map helps me to find out in which module I have to dig to improve the code quality. The graph can be configured to show rules compliance, amount of duplication and test coverage. Hovering over a package shows some quick info about it, and clicking on the package drills down and shows the whole detailed information.

Finally, the dashboard shows the project information including the history of the project. It uses all the data provided by Maven to show links to the homepage, the issue tracker and build server of the project. This is not only a good fit for Open Source projects, but is also very useful for in-house development teams collaborating with other in-house teams. It helps the communication and visibility of the project.
Conclusion
I'm pretty happy with what I've seen so far from Sonar. I've imported about 10 projects for testing it and it works fine. I'm going to use it at work for our enterprise projects to support and improve code quality and make it more visible to our stakeholders that we actually care about software quality.



Looks like something we have been searching for for a long time.
It supports non-Maven projects as well (with restrictions of course).
So I'm going to try it with our Maven builds on Hudson.
But the ultimate goal would be integrating it with CruiseControl and PDE Build.