Automated code reviews with Checkstyle, Part 1 | JavaWorld

 bananarlily 2016-05-24

Code reviews are essential to code quality, but no team wants to review tens of thousands of lines of code, or should have to. In this two-part article, ShriKant Vashishtha and Abhishek Gupta show you how to overcome the challenges associated with code reviews by automating them. Find out why Checkstyle is one of the most popular tools used for code review automation, then learn how to quickly enhance its built-in rules with custom ones just for your project. Level: Intermediate



If you've worked on a large-scale project you know first-hand the value of an automated code review. Really big projects require the input of hundreds of programmers, often geographically dispersed and with great differences in skill level. Code written by engineers interacts with code written by novices in such projects -- haiku interspersed with high-school poetry.

In many cases, a QA team is assigned to review this code manually, based on coding standards used as guidelines for development. Manually reviewing millions of lines of code is a tedious job, though; ultimately as exhausting as it must be exhaustive.

Smart teams don't do code reviews manually: instead they rely on source code analyzers like Checkstyle, PMD, and JTest. Such tools come with readymade rules that help in maintaining code standards. These rules are a good starting point, but they don't account for project-specific requirements. The trick to a successful automated code review is to combine the built-in rules with custom ones. The more refined your rules, the more truly automated your code review becomes.

Given the benefits of automated code review, you might expect more people to do it. In fact, many developers want to implement project-specific custom rules like those available with Checkstyle. To the uninitiated, custom rule creation seems difficult and time-consuming. There's very little documentation on the internals of code review tools, and very few tutorials show how to create custom checks.

In this first article in JavaWorld's two-part introduction to Checkstyle, we'll remedy that situation, making the task of writing custom Checkstyle rules so simple that any Java developer can potentially do it in a day. After reading this article, you will be able to write your own custom Checkstyle rules without the help of specialized skills. In Part 2, we'll show you how to be more proactive about code quality, by stopping faulty code before it enters your code base.

Checkstyle and Java grammar

Checkstyle is a free and open source development tool that helps ensure that your Java code conforms to the coding conventions you've established. It automates the boring but crucial task of checking Java code. Checkstyle is often used as an Eclipse plugin, and also as part of a project build to create a report of coding-standard violations. It can be used in conjunction with build tools such as Ant or Maven. Checkstyle provides many readymade standard coding rules, which are very useful. However, this article focuses on creating custom rules that are more useful in enterprise development.

Before you write any custom rules for Java files, you need to consider the grammar used to write those Java files. Whenever you think about Java classes, a certain structure comes to mind. A Java class begins with a package definition, followed by import statements. In the object block (for a class or interface) you will find instance variables, a constructor, and methods. You could compare this to an XML tree structure. When you want to read an XML file, you use a parser. You do the same thing with Checkstyle, but for Java files. Checkstyle uses the ANTLR parser. Figure 1 illustrates the tree structure you get when the ANTLR parser takes on a Java file.

Figure 1. Tree structure of a Java file

You'll find no surprises in the structure shown in Figure 1. Continuing with the metaphor of an XML structure, the Type column in Figure 1 corresponds to XML tags, the Text column corresponds to the value of a tag, and the Line and Column columns correspond to tag attributes.
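To make the XML analogy concrete, here is a minimal sketch of a tree node that carries the same four pieces of information as the columns in Figure 1. The `AstNode` and `AstNodeDemo` classes are hypothetical stand-ins written for illustration and are not part of Checkstyle; the node texts are also simplified, so the output will not match the GUI tool exactly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parse-tree node (not a Checkstyle class).
final class AstNode {
    final String type;   // token type, comparable to an XML tag name
    final String text;   // source text, comparable to a tag's value
    final int line;      // position info, comparable to tag attributes
    final int column;
    final List<AstNode> children = new ArrayList<>();

    AstNode(String type, String text, int line, int column) {
        this.type = type;
        this.text = text;
        this.line = line;
        this.column = column;
    }

    // Adds a child and returns this node so calls can be chained.
    AstNode add(AstNode child) {
        children.add(child);
        return this;
    }

    // Renders the subtree one node per line, indented two spaces per level.
    String dump() {
        StringBuilder sb = new StringBuilder();
        dump(sb, 0);
        return sb.toString();
    }

    private void dump(StringBuilder sb, int depth) {
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(type).append(" \"").append(text).append("\" [")
          .append(line).append(':').append(column).append("]\n");
        for (AstNode child : children) {
            child.dump(sb, depth + 1);
        }
    }
}

final class AstNodeDemo {
    // Builds a tiny, simplified tree for a class with one field and one method.
    static AstNode sample() {
        return new AstNode("CLASS_DEF", "Foo", 3, 0)
            .add(new AstNode("VARIABLE_DEF", "count", 4, 4))
            .add(new AstNode("METHOD_DEF", "size", 6, 4)
                .add(new AstNode("LITERAL_THROWS", "throws", 6, 22)));
    }

    public static void main(String[] args) {
        System.out.print(sample().dump());
    }
}
```

Each printed line shows the type, text, line, and column of one node, with indentation standing in for the nesting you would see in the GUI tool.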

Checkstyle provides a Java Swing GUI tool that will let you view the tree structure for your Java files. You can invoke this tool with the following command:

java -classpath checkstyle-all-<version>.jar com.puppycrawl.tools.checkstyle.gui.Main <JavaFileToParse>

To produce Figure 1, we used SessionAwareCacheStore.java on the command line in place of <JavaFileToParse>. This class, among the many others discussed in this article, is included in the article source, checkstyle-src.zip. This package provides the code for all the Checkstyle rules, or checks used in this article. It also contains some very useful utility classes that simplify the task of writing Checkstyle checks, along with a readme.txt file that provides details on how to build, configure, and use these custom checks.

The Java tree structure shown by the GUI tool forms the basis for the creation of Checkstyle checks. The tree structure helps define the test cases for creating them.

How does Checkstyle work?

When you say that you want to write a custom Checkstyle rule or check, you're essentially saying that you want to write a class that extends the Check class. Checkstyle is implemented in terms of modules of checks. Modules can contain other modules and hence form a tree structure, as you can see in Listing 1.

Listing 1. File containing the list of modules in a custom Check configuration file

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE module PUBLIC "-//Puppy Crawl//DTD Check Configuration 1.2//EN" "http://www./dtds/configuration_1_2.dtd">
<module name="Checker">
  <property name="severity" value="error" />
  <module name="TreeWalker">
    <property name="severity" value="error" />
    <module
      name="com.abc.checkstyle.check.IllegalMethodCallInForCheck">
      <property name="severity" value="error" />
      <property name="methodNames" value="length,size" />
    </module>
  </module>
</module>
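The methodNames property in Listing 1 carries a comma-separated value. From the check's point of view, a property like this typically arrives through a bean-style setter whose name matches the property. The sketch below imitates that idea with plain Java. The class `MethodNamesPropertySketch` and its `split` helper are hypothetical and only illustrate the shape of such a setter; they are not the code of IllegalMethodCallInForCheck.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration of a check holding a "methodNames" property.
final class MethodNamesPropertySketch {
    private final Set<String> methodNames = new LinkedHashSet<>();

    // Bean-style setter whose name matches the property in the configuration.
    void setMethodNames(String... names) {
        methodNames.clear();
        for (String name : names) {
            methodNames.add(name.trim());
        }
    }

    // Assumed helper: turns the raw attribute value into individual names.
    static String[] split(String value) {
        return value.split("\\s*,\\s*");
    }

    // True if the given method name was listed in the property.
    boolean isConfigured(String name) {
        return methodNames.contains(name);
    }

    public static void main(String[] args) {
        MethodNamesPropertySketch check = new MethodNamesPropertySketch();
        check.setMethodNames(split("length,size"));
        System.out.println(check.isConfigured("size"));
        System.out.println(check.isConfigured("get"));
    }
}
```

Keeping the configured names in a set makes the later lookup a single `contains` call, whichever method name the check encounters.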

Using custom rules for code analysis

Custom rules can be used for code analysis as well as review automation. For instance, if you suspect a memory leak in an application, one of the first steps you should take is to look at the Java collection classes, which often hold on to objects that are no longer in use. As a rule, you may want to look at static collections. The problem becomes a bit more complex, however, when a collection variable is not static, but the class that contains it is a static data member of another class. In those cases, simple grep-like tools may not do the trick. Rather than analyze thousands of classes by hand, try creating a custom Checkstyle rule that will pick up all the instances of collection instance variables in different classes. This rule could also pick up all the data members going in and out of the collection. Once you know how to write custom rules, you'll find that they are useful for more than just code reviews.

The Checkstyle kernel interacts with modules that implement the FileSetCheck interface. Checkstyle provides some FileSetCheck implementations by default. One of them is TreeWalker. TreeWalker walks through all modules (classes) that derive from the Check class. To write a custom rule, you need to extend the Check class and plug it into the Checkstyle configuration file.

In Listing 1, the class com.abc.checkstyle.check.IllegalMethodCallInForCheck is a custom Checkstyle rule that extends the Check class.

TreeWalker is based on the fundamentals of the Visitor pattern. It walks through all classes that extend from the Check class. However, as a custom rule developer, you can specify the event that should prompt TreeWalker to visit a particular extension of the Check class.
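The dispatch idea described above can be sketched without Checkstyle at all. In the following minimal model, every name (`Node`, `SketchCheck`, `SketchWalker`, `CountingCheck`, `WalkerDemo`) is hypothetical: a walker indexes the registered checks by the token types they declare, then walks the tree depth-first and calls only the checks that asked for the current node's token. It is an illustration of the pattern, not TreeWalker's actual implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified tree node: a token type plus child nodes.
final class Node {
    final String tokenType;
    final List<Node> children = new ArrayList<>();

    Node(String tokenType, Node... kids) {
        this.tokenType = tokenType;
        Collections.addAll(children, kids);
    }
}

// Hypothetical counterpart of a check: declares tokens and receives callbacks.
interface SketchCheck {
    String[] tokens();
    void visit(Node node);
}

// Hypothetical walker: maps token types to interested checks, then walks the tree.
final class SketchWalker {
    private final Map<String, List<SketchCheck>> byToken = new HashMap<>();

    void register(SketchCheck check) {
        for (String token : check.tokens()) {
            byToken.computeIfAbsent(token, k -> new ArrayList<>()).add(check);
        }
    }

    void walk(Node node) {
        // Only checks registered for this node's token type are called.
        List<SketchCheck> interested =
            byToken.getOrDefault(node.tokenType, Collections.<SketchCheck>emptyList());
        for (SketchCheck check : interested) {
            check.visit(node);
        }
        for (Node child : node.children) {
            walk(child);
        }
    }
}

// Counts how many throws clauses the walker hands over.
final class CountingCheck implements SketchCheck {
    int visits;

    public String[] tokens() {
        return new String[] { "LITERAL_THROWS" };
    }

    public void visit(Node node) {
        visits++;
    }
}

final class WalkerDemo {
    static int run() {
        Node tree = new Node("CLASS_DEF",
            new Node("METHOD_DEF", new Node("LITERAL_THROWS")),
            new Node("METHOD_DEF"),
            new Node("METHOD_DEF", new Node("LITERAL_THROWS")));
        SketchWalker walker = new SketchWalker();
        CountingCheck check = new CountingCheck();
        walker.register(check);
        walker.walk(tree);
        return check.visits;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 2
    }
}
```

Because the walker looks checks up by token type, a check never sees nodes it did not register for, which keeps each check small and focused on one kind of construct.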



Your first custom check

Imagine that you want to implement a rule, or check, that gives you a warning when you put java.lang.Exception in a throws clause. In a Java class tree, you can specify that whenever the TreeWalker walks through a throws clause (the LITERAL_THROWS token in Figure 2), it should visit your Check class.

Figure 2. Sample throws clause

To achieve this, you need to specify a token (LITERAL_THROWS) in the getDefaultTokens() method of your custom Check implementation. This token tells the TreeWalker to call the IllegalExceptionThrowsCheck class, as shown in Listing 2.

Listing 2. getDefaultTokens() method definition in IllegalExceptionThrowsCheck

...
public final class IllegalExceptionThrowsCheck extends Check {

  @Override
  public int[] getDefaultTokens() {
    return new int[] { TokenTypes.LITERAL_THROWS };
  }
             ...
             ...
}

getDefaultTokens() is an abstract method of the Check class; you need to implement it in every custom Check class. TreeWalker uses the getDefaultTokens() method to determine the tokens on which it should call the check. In the current scenario, you've specified the LITERAL_THROWS token as the trigger to call IllegalExceptionThrowsCheck.
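Once the check is called for a throws clause, it still has to decide whether that clause names java.lang.Exception. The sketch below shows only that decision, working on a plain list of declared type names instead of Checkstyle's tree. The class `ThrowsClauseSketch` and its method are hypothetical, and treating the simple name Exception as java.lang.Exception is an assumption that would be wrong in a file that imports a different class called Exception.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical decision logic for the "no java.lang.Exception in throws" rule.
final class ThrowsClauseSketch {

    // Returns the declared names that the rule would flag, in declaration order.
    static List<String> illegalThrows(List<String> declared) {
        List<String> flagged = new ArrayList<>();
        for (String name : declared) {
            // Assumption: a bare "Exception" refers to java.lang.Exception.
            if (name.equals("Exception") || name.equals("java.lang.Exception")) {
                flagged.add(name);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        List<String> declared = Arrays.asList("IOException", "Exception");
        System.out.println(illegalThrows(declared)); // prints [Exception]
    }
}
```

Subclasses such as IOException are deliberately left alone here, since the rule described above targets only the overly general java.lang.Exception.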

As you might have noticed, getDefaultTokens() returns an array, so a check can register for more than one token. The check will be called for every token you return from this method. For example, a Check class with the getDefaultTokens() implementation in Listing 3 will be called whenever TreeWalker encounters a for, while, or do-while token.

Listing 3. getDefaultTokens() method definition with multiple tokens returned

public int[] getDefaultTokens() {
  return new int[] { TokenTypes.FOR_CONDITION, TokenTypes.LITERAL_WHILE,
      TokenTypes.DO_WHILE };
}

ShriKant Vashishtha is a principal consultant at Xebia, specializing in agile offshore software development and consulting. He has more than ten years of experience in the IT industry and is involved in designing technical architectures for various large-scale Java EE projects, applying Agile methodologies like Scrum and Agile-RUP. ShriKant holds a bachelor's degree in engineering from the Motilal Nehru National Institute of Technology in Allahabad, India.

Abhishek Gupta currently works as a technical lead for Tata Consultancy Services in India. He has more than four years of experience in the IT industry and is involved in designing technical architectures for various large-scale Java EE and Java ME projects for the transport, banking, and retail industries. Abhishek holds a bachelor's degree in engineering from Uttar Pradesh Technical University in Lucknow, India.

