Dr. Dobb's | Improving the Development Process | May 19, 2010

Improving the Development Process

It's just as important to have good development processes as it is a good system architecture

By Eric J. Bruno
May 19, 2010
URL:http://www./tools/224900379

Eric J. Bruno is a contributing editor to Dr. Dobb's. He can be contacted at eric@ericbruno.com.

For an application's long-term success, it's just as important to have good development processes as it is a good system architecture. In my experience, even the best development organizations -- those that manage their design and coding tasks well -- tend to leave application deployment and management until the end. The neglected work includes the proper use of a source code repository such as CVS, particularly when multiple development projects are working in parallel. Additionally, the source code repository should be used, along with a scripted procedure, to pull and deploy code to production environments.

Too many organizations rely on developers or system administrators to build software releases by hand, and then manually deploy them to the appropriate servers in production. This manual process applies both to new software releases and to the provisioning of new servers. Further, they rely on developers or other build specialists to manage parallel programming projects in the same code base, again by hand. These approaches are fraught with danger because they rely on manual processes, as well as specialized knowledge only a few people in the organization may possess. To remedy this, I suggest the following:

Parallel development: Use source code branching along with a process that I've found to work well (described below).
Controlled builds: Use an automated build process that pulls code from a repository by label and compiles on a dedicated build machine.
Software deployments: Use a combination of Ant scripts and source code repository labels to deploy the proper artifacts to different types of servers (e.g., web server, application server, database, and so on).

Before getting into the details, let's cover some basics that, hopefully, you're already following. When developing code for any system, it's best to adhere to the following development process principles:

Ensure that all code is kept in a source-code repository (e.g., CVS or Subversion).
Perform system builds on a regular schedule that may change according to development phase (e.g., once per week early in a development cycle, or once per day towards the end).
Ensure that all code checked into the repository compiles, is unit tested to some degree, and does not break the build in any way.
Any build that is a candidate for release and is either deployed to a test system or a production system is built with the latest code, and is tagged (labeled) in the source-code repository. The build should then be "pulled" from the repository by tag to ensure consistent releases across time and servers.
Use branching to ensure that changes to a particular release can be made to that release's specific code base, even when development has begun on the next release; i.e. parallel development.

Although I use CVS for examples in this article, most of the concepts should apply to other source-code repositories as well, be it Subversion, Git, or Mercurial. Let's examine the procedure for handling parallel application development first.

Improving Release Management

Based on experience with multiple projects, I've settled upon a way of working with CVS that seems to work best. The process is based on code branching. It's important to note that I've seen projects go very wrong through the improper use of branching, so I've come up with a recipe that works well.

Even if you don't maintain parallel programming teams, there may be times when you'll need to work on two versions of the same application at one time. For instance, when a software release is deployed to production, it's important that you be able to get to the exact source code and all supporting files that were used for that release, even if coding has progressed since then, in case bugs are found in production.

To achieve this, development is done in the HEAD (sometimes called the "tip") of the source code repository, and each release cycle must be performed in its own branch. This branch, uniquely labeled to represent the particular release being tested/deployed, is created when development is complete, and a comprehensive quality-assurance (QA) cycle is about to begin prior to its release to production.

Also note that there is only one active source code branch at any one time. This rule should be strictly followed. The branch is created when development is "feature complete," and a QA cycle is about to begin. The following steps discuss this procedure in more detail:

1. Create a new branch with the release cycle/version/name as part of the branch name (e.g., BRANCH_MYAPP_v_2_34). Important note: At this time, development for the next release can begin in the HEAD again. The branch ensures you can get back to this release's specific code base.
2. Pull the latest code from the branch created in step 1, and do a complete build on a trusted machine. This can be a dedicated build machine, or a developer's machine that is considered the build master.
3. Place the resulting binaries back into the CVS branch.
4. Label the branch with a unique label for that release and build (e.g., MYAPP_v_2_34_build_001).
5. Pull the binaries and all supporting files (e.g., configuration files) from the CVS branch using the label created in step 4, and deploy to your QA servers.
6. If bugs are found, perform code fixes in the branch (not the HEAD), and repeat the QA cycle starting from step 2. Note that this will result in an incremented label in step 4 due to the new build number.
7. When the release passes QA, pull the binaries and supporting files from the branch using the most recent label created in step 4, and deploy to production.
8. Merge the branch back into the CVS HEAD, paying careful attention to any potential conflicts (remember that development may have started for the next release there). In my experience conflicts are rare, and are easily resolved when they do occur. However, they should never be resolved automatically; it's crucial to review the conflicts and make careful decisions to resolve them.
9. Development for the next release can continue in the CVS HEAD, and this process is repeated when that cycle is "code complete."
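The naming and command sequence in the steps above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function names (branch_name, build_label, release_commands, merge_command) are hypothetical, the name patterns are copied from the examples in steps 1 and 4, and the CVS commands are printed rather than executed so that you can review them first.

```shell
#!/bin/sh
# Hypothetical helpers that follow the article's naming examples.
# Build numbers are passed as plain integers (1, 2, 12), without leading zeros.

# branch_name APP MAJOR MINOR  ->  BRANCH_MYAPP_v_2_34
branch_name() {
    printf 'BRANCH_%s_v_%s_%s\n' "$1" "$2" "$3"
}

# build_label APP MAJOR MINOR BUILD  ->  MYAPP_v_2_34_build_001
build_label() {
    printf '%s_v_%s_%s_build_%03d\n' "$1" "$2" "$3" "$4"
}

# Print (do not run) the CVS commands for one pass through steps 1 to 4.
# The branch is only created for the first build of a release.
release_commands() {
    app=$1 major=$2 minor=$3 build=$4
    branch=$(branch_name "$app" "$major" "$minor")
    label=$(build_label "$app" "$major" "$minor" "$build")
    if [ "$build" -eq 1 ]; then
        echo "cvs rtag -b $branch $app"            # step 1: create the branch
    fi
    echo "cvs checkout -r $branch $app"            # step 2: pull the branch, then build
    echo "cvs commit -m 'Binaries for $label'"     # step 3: check the binaries in
    echo "cvs tag $label"                          # step 4: label this build
}

# Print the merge command for step 8, to be run in a HEAD working copy.
merge_command() {
    echo "cvs update -j $(branch_name "$1" "$2" "$3")"
}
```

For example, release_commands MYAPP 2 34 2 prints the commands for the second QA build of release 2.34, with the label MYAPP_v_2_34_build_002. The merge in step 8 still needs a manual review of any conflicts before committing.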

Because the code in a branch is merged into the HEAD when it passes QA and is released, the HEAD is sure to have all of the code changes (bug fixes) made during the QA cycle, even if development has progressed in the HEAD for the next release. Also, if a problem arises in an earlier release, because the branches are not deleted or removed in any way, the precise code base is always available for every release ever made. This guarantees that you can recreate any revision of your application at any point in the future.
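As a rough sketch of how an earlier release might be retrieved or compared later, the helpers below print the relevant CVS commands for a given label. The function names and the destination directory are hypothetical; cvs export and cvs rdiff are standard CVS commands, but check the options against your CVS version before relying on them.

```shell
#!/bin/sh
# Hypothetical helpers; they print the commands instead of running them.

# export_release_command LABEL MODULE DEST
# Exports the files for LABEL (without CVS admin directories) into DEST.
export_release_command() {
    label=$1 module=$2 dest=$3
    echo "cvs export -r $label -d $dest $module"
}

# compare_releases_command OLD_LABEL NEW_LABEL MODULE
# Summarizes which files changed between two labeled builds.
compare_releases_command() {
    echo "cvs rdiff -s -r $1 -r $2 $3"
}
```

For instance, export_release_command MYAPP_v_2_34_build_001 MYAPP myapp_2_34 prints the command that recreates that build's files in a directory named myapp_2_34.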

This process works well for one development group working on one release at a time, as well as multiple development groups working on multiple releases at a time. The most efficient approach to follow involves two parallel development groups, where you align the releases so that when one development group is in a QA/bug fix/release cycle, the second development group is coding the next release. As a result, if you plan each release to be roughly equal in terms of size and development time, the second development group should be entering QA as the first development group deploys its release and begins coding the next version.

This swapping of roles between the development groups (when one is coding, the other is testing/bug fixing) ensures the most efficient use of resources, and that the release frequency is increased. This process has the following advantages:

Both of your development groups work at a reasonable pace
You achieve twice as many software releases in the same time frame
New features are made available to users sooner, without sacrificing quality

Next, let's dive deeper into improvements to the software deployment process. These improvements build from the process we just examined. The example we'll follow is for enterprise (hosted) applications, but the process should apply to any type of software.

Improving Release Deployment

The CVS development and branching procedures outlined in the previous section are important to ensure that the exact code and configuration for every release can be retrieved when needed. They also provide an efficient process for parallel development groups to follow. However, it's also important that software deployment be done efficiently, and without error. To make this process painless and error-free, the solution I use is to combine Ant scripts with CVS, using the proper software release label.

Ant scripts that take as parameters the task/software name, the software's CVS label, and the server name to deploy to, make for a quick, easy, and precise deployment process. For example, look at the following command:

> ant -buildfile deploy.xml -Dcvsuser=ebruno -Dlabel=MYAPP_2_234_build_012 -Denvironment_name=prodserver_01 deploy_app

In this command, the first parameter names the Ant script to run; next is the CVS username to use; next is the CVS label to use to pull the release; next is the server to deploy to; and the final parameter is the Ant target to run. In this case, the target, deploy_app, clearly indicates that we are deploying software.

Ant Tasks for Application Deployment

Let's take a closer look at the Ant scripts themselves, along with the shell script that makes it easier to pass in all of those parameters. For instance, the shell script in Example 1 sets the path, classpath, and Ant options (such as heap size) to run Ant. It also collects the parameters (CVS user, CVS label, and the environment name), and calls the Ant script to execute the deployment.

#! /bin/sh
# Locations of Ant and Java; adjust for your build machine
ANT_HOME=/usr/local/ant
JAVA_HOME=/usr/java
ANT_OPTS="-Xmx512m"
PATH=${PATH}:${ANT_HOME}/bin
CLASSPATH=$CLASSPATH:$ANT_HOME/lib:$ANT_HOME/lib/jakarta-ant-1.4.1-optional.jar
export ANT_HOME JAVA_HOME ANT_OPTS PATH CLASSPATH
CVSUSER=$1
RELEASE=$2
## ex: qa, prodserver_01, prodserver_02, etc.
ENVIRONMENT=$3
## true or false
FORCE_CHECKOUT=$4
# Pass the arguments to Ant as -D properties and run the deploy_app target
ant -buildfile deploy.xml -Dcvsuser="$CVSUSER" -Dlabel="$RELEASE" \
    -Denvironment_name="$ENVIRONMENT" -Dforce_checkout="$FORCE_CHECKOUT" deploy_app

Example 1: The shell script to call the Ant script for release deployment.
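Example 1 passes its four arguments straight through to Ant. A minimal sketch of a guard that could run before the ant command is shown below, assuming you want the script to stop early on a missing or malformed argument. The function name validate_deploy_args and the script name deploy.sh in the usage message are hypothetical and not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper: check the arguments expected by the deployment script.
# Returns 0 when the arguments look usable, 2 otherwise.
validate_deploy_args() {
    if [ "$#" -ne 4 ]; then
        echo "usage: deploy.sh <cvsuser> <label> <environment> <true|false>" >&2
        return 2
    fi
    # The first three arguments must not be empty
    for arg in "$1" "$2" "$3"; do
        if [ -z "$arg" ]; then
            echo "cvsuser, label and environment must not be empty" >&2
            return 2
        fi
    done
    # The fourth argument is the force_checkout flag
    case $4 in
        true|false) ;;
        *)
            echo "force_checkout must be true or false, got: $4" >&2
            return 2
            ;;
    esac
    return 0
}
```

In the deployment script, a line such as validate_deploy_args "$@" || exit 2 placed before the ant command would stop the run before any backup or checkout is attempted.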

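The parameters passed into this shell script are handed to Ant as -D properties in the line that executes the ant command at the end. Inside the Ant script, the same property names are also defined from environment variables; because a property set on the command line takes precedence over a property definition in the build file, the -D values are the ones used, and the environment variables act only as a fallback. The relevant part of the script is shown here (see Listing 1 for the entire script):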

...
<property environment="env"/>
<property name="cvsuser" value="${env.CVSUSER}" />
<property name="label" value="${env.LABEL}" />
<property name="environment_name" value="${env.ENVIRONMENT_NAME}" />
<property name="force_checkout" value="${env.FORCE_CHECKOUT}" />
...

The properties listed here take the values entered in the shell script and specified as -D command-line parameters when calling Ant. There are other properties, such as the CVS project name and the name of the application archive file being deployed, that are hard-coded in this example. You'll need to replace these with the correct names for your project, or you can pass them as additional -D parameters if you choose.

<project name="My Application Release" default="init" basedir=".">
  <!--
    Values normally arrive as -D parameters on the command line;
    the environment variables below are only a fallback
  -->
  <property environment="env"/>
  <property name="cvsuser" value="${env.CVSUSER}" />
  <property name="label" value="${env.LABEL}" />
  <property name="environment_name" value="${env.ENVIRONMENT_NAME}" />
  <property name="force_checkout" value="${env.FORCE_CHECKOUT}" />
  <!--
    The following could also be set from the script used to run deploy.xml
    These are examples. Replace with your project's name in CVS, etc.
  -->
  <property name="cvsproject" value="MYAPP"/>
  <property name="ear" value="myapp_ear"/>
  <!--
    These params store backup location (to store current release)
    and the directory to deploy the new release to
  -->
  <property name="application_dir" location="/apps"/>
  <property name="backup_dir" location="/apps/backups"/>
  <property name="release_dir" location="/apps/releases"/>
  <property name="cvs_checkout_dest" value="${release_dir}/${label}" />

  <target name="init">
    <echo message="CVS User is ${cvsuser}"/>
    <!-- create the timestamp -->
    <tstamp>
      <format property="backup.date" pattern="yyyy.MM.dd" locale="en"/>
    </tstamp>
    <!-- has backup been made? -->
    <available file="${backup_dir}/${cvsproject}_bak_${backup.date}"
               type="dir" property="backup.present"/>
  </target>

  <target name="check_release">
    <!--
      Skip checkout if already there and not forcing a new one
    -->
    <condition property="skip.checkout">
      <and>
        <available file="${cvs_checkout_dest}/${cvsproject}"
                   type="dir" property="release.present"/>
        <isfalse value="${force_checkout}" />
      </and>
    </condition>
    <echo message="skip checkout=${skip.checkout}" />
  </target>

  <target name="cvs_checkout"
          depends="init,check_release"
          unless="skip.checkout"
          description="check out project from cvs">
    <!-- delete existing files before checking out -->
    <delete dir="${cvs_checkout_dest}/${cvsproject}"
            quiet="true" failonerror="false" />
    <!-- perform the checkout (replace cvs server details with your own) -->
    <echo message="Checking out release..."/>
    <mkdir dir="${cvs_checkout_dest}" />
    <cvs command="checkout"
         cvsRoot=":pserver:${cvsuser}@192.168.1.2/repository/cvs"
         dest="${cvs_checkout_dest}"
         package="${cvsproject}/${ear}"
         tag="${label}"
    />
    <echo message="Removing CVS-specific files..."/>
    <delete includeEmptyDirs="true">
      <fileset dir="${cvs_checkout_dest}" defaultexcludes="no">
        <include name="**/CVS/**" />
      </fileset>
    </delete>
    <echo message="Creating version.txt file..."/>
    <!-- update version number -->
    <propertyfile file="${cvs_checkout_dest}/${cvsproject}/${ear}/version.txt">
      <entry key="version" value="${label}" />
    </propertyfile>
  </target>

  <target name="backup"
          depends="init"
          unless="backup.present"
          description="backup old release">
    <echo message="Backing up old release..."/>
    <mkdir dir="${backup_dir}/${cvsproject}_bak_${backup.date}" />
    <copy todir="${backup_dir}/${cvsproject}_bak_${backup.date}">
      <fileset dir="${application_dir}/${ear}">
        <!-- you can exclude directories here -->
        <exclude name="data/**" />
      </fileset>
    </copy>
  </target>

  <target name="deploy_app"
          depends="cvs_checkout,backup"
          description="deploy the application">
    <echo message="Deploying new release..."/>
    <!-- copy application files -->
    <copy todir="${application_dir}"
          preservelastmodified="true" overwrite="true">
      <fileset dir="${cvs_checkout_dest}/${cvsproject}" includes="${ear}/**" />
    </copy>
    <!-- move environment properties -->
    <copy todir="${application_dir}/properties"
          preservelastmodified="true" overwrite="true">
      <fileset dir="${application_dir}/${ear}/properties/${environment_name}" />
    </copy>
    <!-- move environment scripts and make executable -->
    <!-- note: server_type is not defined in this listing; supply it
         (for example as an additional -D parameter) or adjust this path -->
    <copy todir="${application_dir}/scripts"
          preservelastmodified="true" overwrite="true">
      <fileset dir="${application_dir}/scripts/${environment_name}/${server_type}" />
    </copy>
    <chmod dir="${application_dir}/scripts" perm="ugo+x" includes="**/*"/>
  </target>
</project>

Listing 1: Ant script that pulls from CVS and deploys to the specified server.

The rest of the Ant script contains five targets:

init reports the CVS user, creates a date stamp, and checks whether a backup has already been made for that date, in preparation for backing up the existing deployment.
check_release determines if we can skip the CVS checkout process. This is done when deploying the same release to multiple servers. The checkout is done once, and the same files are used to deploy multiple times. This can be overridden by setting the FORCE_CHECKOUT parameter to true.
backup backs up the existing application files on the server where you're deploying the new release. This is done in case you need to quickly get back to the original files.
cvs_checkout checks out the release by the specified label from CVS, and creates a file to record that release's version number. You can use this as a sanity check to ensure the deployment actually occurred.
deploy_app is the main target that calls all of the other targets (via the dependency tree) to back up the existing application files, check out the new release from CVS, and then copy the files to the right places on the specified server.
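To illustrate how check_release lets one checkout serve several deployments, here is a minimal sketch that prints one ant command per environment and forces a checkout (if requested) only on the first run. The function name deploy_commands is hypothetical, and the sketch assumes that every run can see the same release directory, which the listing does not state.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) one Ant command per environment.
# usage: deploy_commands CVSUSER LABEL FORCE_FIRST ENV [ENV ...]
deploy_commands() {
    cvsuser=$1 label=$2 force=$3
    shift 3
    for env_name in "$@"; do
        echo "ant -buildfile deploy.xml -Dcvsuser=$cvsuser -Dlabel=$label -Denvironment_name=$env_name -Dforce_checkout=$force deploy_app"
        # Later runs reuse the files checked out by the first run
        force=false
    done
}
```

Calling deploy_commands ebruno MYAPP_v_2_34_build_001 true prodserver_01 prodserver_02 prints two commands; only the first carries -Dforce_checkout=true.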

Ant has built-in support for source code repositories such as CVS. For instance, the cvs_checkout target uses the following task to do its work:

<cvs command="checkout"
cvsRoot=":pserver:${cvsuser}@192.168.1.2/repository/cvs"
dest="${cvs_checkout_dest}"
package="${cvsproject}/${ear}"
tag="${label}"
/>

Remember to replace the CVS server entry with the address of your actual CVS server, and the path to your CVS repository. The package name, tag name (label), and destination directory are all derived from the parameters you specified in the shell script earlier.

This sample Ant script assumes you're deploying a Java EE enterprise application archive (EAR) file. However, you can pull any number of files and file types. The deploy_app target contains the entries to copy the files to the server, as shown here:

<!-- copy application files -->
<copy todir="${application_dir}"
preservelastmodified="true" overwrite="true">
<fileset dir="${cvs_checkout_dest}/${cvsproject}"
includes="${ear}/**" />
</copy>

The parameters in this Ant task are pretty self-explanatory. The example script contains additional entries that copy other files, such as properties files and execution scripts to start and stop the server, simply as an example. You can take this further, and create scripts that deploy the actual binaries and other files for your web server, application server, database, and so on. This will enable you to provision new servers identically when you need to, and then deploy your software to them from that point forward.
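Once the files are copied, the version.txt file written by the cvs_checkout target offers a quick check that the expected release is in place. The sketch below assumes the file is in the key=value form produced by Ant's propertyfile task and that it ends up under the application directory (with the listing's example values, /apps/myapp_ear/version.txt). The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for checking a deployment against its label.

# deployed_version FILE: print the value of the "version" key in FILE
deployed_version() {
    sed -n 's/^version=//p' "$1"
}

# verify_deployment FILE LABEL: succeed only if FILE records LABEL
verify_deployment() {
    actual=$(deployed_version "$1") || return 1
    [ "$actual" = "$2" ]
}
```

For example, verify_deployment /apps/myapp_ear/version.txt MYAPP_v_2_34_build_001 returns a non-zero status if the file is missing or records a different label, so it can be used as the last step of a deployment script.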

Conclusion

The success of a software project goes beyond good architecture and code; it relies on good processes and procedures throughout the development, testing, and deployment phases as well. This article illustrates that maintaining good source-code repository practices for release management and parallel development teams is possible if a proven plan is followed. It also illustrates that maintaining Ant scripts to automate the deployment of labeled releases from your source code repository is not difficult. You can even use them to provision entire servers into your data center. In the end, having a reliable, error-free deployment process will save you a lot of time and headaches. Using the processes and scripts provided here in your own projects can help you achieve that.