Crash Course in Lightweight Code Review

royy 2010-12-02

展开全文

Crash Course in Lightweight Code Review

Get results in a week

By Jason Cohen
March 03, 2009
URL:http://www./architecture-and-design/215800147

Jason Cohen is the Founder of Smart Bear Software and author of Best Kept Secrets of Peer Code Review.

Most programmers agree that another pair of eyes on your code will uncover bugs and disseminate knowledge across development teams. But many also recognize that peer review can waste a lot of time.

So how do you get started in such a way that you 1) don't waste time, 2) match the process to your team and your goals, and 3) have a clear way to evaluate results? So many code review techniques exist, and each with pros and cons, so which are right for your team? Even if you're unwilling to spend the time to review all your code, perhaps spending a little time reviewing a specific subset would be worthwhile.

The only way to know is to try it for yourself. Use these tips to simplify, expedite, and measure the process.

What Is Code Review and How Do You Do It?

First, let's make sure we're on the same page. What is "code review?" The answer varies depending on whom you ask, and when; code review has changed dramatically in recent years.

Thirty years ago, if you wanted a well-known and measurable process that delivered results, your only choice was the dreaded Formal Inspection -- a seven-phase, multi-meeting, paperwork-heavy system popularized by Michael Fagan at IBM. Properly done, it takes over ten person-hours to review at most 200 lines of code -- a sluggish rate that makes this system impractical (not to mention excruciating) even if it does uncover bugs.

In the past 10 years a variety of evidence [1][2][3][4] has shown that many of the components of the traditional "heavyweight" inspection process are inefficient or even unnecessary. "Lightweight" modern code review methods include:

Over-the-Shoulder: The reviewer physically looks over the author's shoulder and is walked through the code.
- Pros: Easy to implement, no tools required.
- Cons: Interrupts reviewers, hard to know which code is reviewed, impossible to measure and track. Requires coordinating with someone else's schedule.
Email Pass-Around: Code differences are emailed automatically by the version control system. Reviewers respond to changes when they have time.
- Pros: Easy to set up, nothing to purchase, works remotely.
- Cons: No tracking -- don't know if found defects were fixed, often reviews never happen because everyone assumes someone else looked at it, no metrics; reviews "fizzle out" instead of finishing. Hard to follow conversations as emails get replied to and forwarded among several people.
Tool-Based Review: Use a development tool specifically designed to assist in code review. A variety of open-source and commercial tools exist.
- Pros: Removes most of the busywork associated with file-collection, discussion/defect management, and metrics/reporting. Tracks reviews and defects.
- Cons: Have to learn and integrate another tool, can cost money.
Pair-Programming: Two developers at the same keyboard and monitor working on the code together.
- Pros: Deep level of review; teams know review is happening but it's difficult to track.
- Cons: Large time investment, some developers dislike it. Requires coordinating with someone else's schedule.

Pick a method that most closely matches the existing culture of the team. Use the pros, cons and amount of time you have to select one and stick with it for one week before you start to make changes.

Getting Started: Baby Steps

It's difficult to introduce a process that involves a large disruption: projects get behind, people rebel, and the process fails -- regardless of potential future efficiency.

It's even harder when there's push-back from developers who aren't yet convinced the process makes sense. This effect is common in code review because people recall previous bad experiences with time-sucking formal inspections or with review processes that were designed with more rules than thought. There's also the worry of hurting people's feelings.

So the best way to introduce code review is at a small scale, with only a handful of files, incrementally, and with full transparency to the team. Taking baby steps is especially effective against those who say "We cannot change our process right now."

Given this, you'll want to select a style of code review that is lightweight and easy yet provides transparency by tracking things like "number of defects found" and "time spent in review." Any of the methods listed above will work; some are easier to get going but require more manual work to collect metrics. A good compromise as you're getting started is to use a code review tool for free, either by trialing a commercial tool or using an open-source tool. That way you get the benefit of structured guidance through the review but you can do it without dipping into the budget.

Low-hanging Fruit: Review What Matters Most

Besides the social and training benefits of introducing a small change to your process at a time, you may find that you can reap significant benefits even if you never get to the point where you're reviewing every line of code. If it turns out to be valuable only in specific circumstances, that's okay. Not every process needs to cover every file in every project over all time and all developers.

So if you're only going to do a little bit of review, what should you review to have the greatest impact?

Here are some ideas:

Review changes to the stable branch
Review changes to the core module -- the one all other code depends on
Review changes to the "Top 10 Scariest Files" as voted by the developers
Review only unit tests
Review only incremental changes, not entire files
Review code whenever a developer feels it's necessary

Once you get comfortable you can cast your net to a wider set of code if it makes sense.

By then you'll have a good idea of what you're getting from review and how much time it's taking, so you'll have the information you need to decide exactly how to expand the process.

Not All Bugs are Created Equal

It's worth mentioning that not all bugs are easy to find in code review. GUI problems are a great example. GUI code is often generated automatically and it's almost impossible to read the code and know if the GUI makes sense. In QA, you can instantly spot visual flaws, and mechanical problems are usually straightforward as well. Similarly, fixing bugs in GUIs is usually a simple, fast matter.

Clearly the type of bug found in code review matters to your efficiency and effectiveness calculations. Track the "category" of bug so you can determine efficiency by type. Example types are:

Algorithm
Build
Data Access
Documentation
Error-Handling
Interface
Maintainability
Performance
Robustness
Style
Testing
User Interface

This way you can empirically determine where your reviews are efficient and where they're a waste of time.

Measure Success

Code review takes up valuable developer hours, so it'd better be worth it!

Researchers have proposed many ways of measuring the productiveness and efficiency of code review, usually centered on the number of defects uncovered per thousand lines of code. But the clearest way is to use the two things that matter most: time and bugs.

The easiest way to visual this is: "How much time does it take us to find and fix one bug?" I call this "Time per Bug Fix." The formula is simply: (Total Time Spent ) / ( Number of Defects Found ).

This metric is clear and tangible; you can easily understand what it means and why it's interesting. It's also easy to compare to other processes, even if you're just estimating.

For example, how long does it take to find and fix a bug found in QA? A QA person has to run through some code, discover a problem, create a trouble ticket, and write down how to reproduce the problem. Then, a developer has to go find the bug - often the evidence in QA is an error dialog that doesn't help you understand the root cause of the problem - then fix it. QA then has to verify the fix before it can be considered finished.

A typical value for Time per Bug Fix for a lightweight peer review process is 5 to 15 minutes. The QA cycle described above is surely much longer than that. And let's not even talk about the damage a bug can do (in both costs and reputation) if it escapes QA and gets into customers' hands.

You can get your own number in only a week. Have everyone review code for twenty minutes a day for five days and you'll have enough data for a meaningful result.

Start Today

With a sufficiently lightweight peer code review process, focused on a specific subset of your code, designed to look for a specific set of problems, there's no doubt you'll able to find and fix defects faster and more easily than any other method.

Now you have the background to test and build a process that gets you there, so there's no excuse! Get started today.

References

[1] Lawrence G. Votta, Jr., Does every inspection need a meeting?, Proceedings of the 1st ACM SIGSOFT symposium on Foundations of software engineering, p.107-114, December 08-10, 1993, Los Angeles, California, United States

[2] Reidar Conradi, Parastoo Mohagheghi, Tayyaba Arif, Lars Christian Hegde, Geir Arne Bunde, and Anders Pedersen; Object-Oriented Reading Techniques for Inspection of UML Models - An Industrial Experiment. In European Conference on Object-Oriented Programming ECOOP'03. Springer-Verlag, Darmstadt, Germany, pages 483-501

[3] Kelly, D. and Shepard, T. 2003. An experiment to investigate interacting versus nominal groups in software inspection. In Proceedings of the 2003 Conference of the Centre For Advanced Studies on Collaborative Research (Toronto, Ontario, Canada, October 06 - 09, 2003). IBM Centre for Advanced Studies Conference. IBM Press, 122-134.

[4] In Proceedings of the 6th European Conference Held Jointly with the 5th ACM SIGSOFT international Symposium on Foundations of Software Engineering (Zurich, Switzerland, September 22 - 25, 1997). M. Jazayeri and H. Schauer, Eds. Foundations of Software Engineering. Springer-Verlag New York, New York, NY, 294-309.