Central Iowa Railroad Herald

CIRR.COM

Compile Time

Managing Source Code and Development Environments


This month, we talk about source code management and maintaining development environments. I'm sure nearly everyone is familiar with the former, but perhaps not with the latter.

I hope I don't need to explain why source management is important, nay, vital to any software development project. Source management tools allow the developers (and testers, and documenters) to track changes to the code base over time, hopefully with useful comments about each of the changes. A good source control system will also track who made the change, and when it was made. Such information allows everyone on the team to identify what changes were made, who made them, and when (if for no other reason than to figure out who broke something :-).

Development environment management is something that is not widely practiced in the open source development arena. On commercial/professional projects, the build system is tightly controlled, probably as tightly as the software source, if not more so. The goal is to have a completely reproducible build environment for the software product. Once the build environment is defined, it is frequently set up as a box containing the approved tools, libraries, and other components. Once set up, the box does not change without going through some change control mechanism, where all the developers (testers, documenters, and perhaps others) have a say in the modifications proposed.

Upon release of the software product, the sources are tagged in the source repository, the build environment is tagged, and perhaps archived (both sources and build environment.) This assures that the release may be regenerated at any point in the future.
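
One low-tech way to start down this road is simply to write down what the build box contains. What follows is a minimal sketch, not a complete solution: the tool list (sh, make, cc) and the manifest file name are my assumptions for illustration, and a real project would list its own compilers, libraries, and versions.

```shell
#!/bin/sh
# Sketch: record which build tools are in use, so the environment can be
# compared against (or rebuilt from) the list later.
# The manifest location and the tool list are assumptions for illustration.
manifest="${TMPDIR:-/tmp}/build-env-manifest.txt"
tools="sh make cc"

: > "$manifest"                       # start with an empty manifest
for tool in $tools; do
    path=$(command -v "$tool" 2>/dev/null) || path=""
    if [ -n "$path" ] && [ -f "$path" ]; then
        # Record the tool name, where it lives, and a checksum of the binary.
        sum=$(cksum < "$path" | awk '{print $1}')
        printf '%s\t%s\t%s\n' "$tool" "$path" "$sum" >> "$manifest"
    else
        printf '%s\tNOT FOUND\n' "$tool" >> "$manifest"
    fi
done
cat "$manifest"
```

Checking a manifest like this into the source repository alongside the release tag keeps the record of the sources and the record of the build environment in step.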

Having spent much space talking about build environments, let's start discussing some of the tools and processes used in source and build environment control.

Managing sources

There have been many mechanisms used to manage sources and releases over the years. The earliest used on UNIX-like systems had to be the making of a magnetic tape (remember those?) of the sources as the release was completed. This provided a snapshot of the sources at the time of release, and if the developers were really thinking ahead, they put a copy of the build tools and libraries on there as well.

Unfortunately, this did nothing to preserve the intermediate versions of the sources. Those intermediate versions of the sources were lost to eternity (and the developers!)

File based source control

The first "real" source management systems on UNIX were file based. SCCS and RCS are two that have survived, and been fairly widely distributed over time. Both SCCS (and its GNU clone, CSSC) and RCS are interested in operating on files, not on projects or systems.

Both SCCS and RCS support the concept of branches, the idea that several different streams of development can take place in the same file, starting from different revisions. This allows patches to be made and checked in on the release branch of a file, allowing new development to continue on the main branch concurrently.

RCS also provides a way to tag a revision with a symbolic name at the per file level. SCCS doesn't provide a similar function, therefore when releases are made using SCCS, everything tends to get checked in with a new major version number (for example, 4.4 BSD asked that everything be checked in with an SCCS major version number of 8.)
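
For the curious, per-file tagging in RCS looks roughly like the sketch below. It assumes the RCS tools (ci, rcs, rlog) are installed, and skips itself if they are not; the scratch directory, the file hello.c, and the tag name V1_1 are made up for the example.

```shell
#!/bin/sh
# Sketch: give an RCS revision a symbolic name, one file at a time.
if command -v ci >/dev/null 2>&1 && command -v rcs >/dev/null 2>&1; then
    tmp=$(mktemp -d) || exit 1
    cd "$tmp" || exit 1
    echo 'int main(void) { return 0; }' > hello.c
    # Check in revision 1.1. -u leaves a read-only working copy behind,
    # and -t- supplies the file description so ci does not prompt for it.
    ci -q -u -t-'demo file' hello.c </dev/null
    # Attach the symbolic name V1_1 to the latest revision of this one file.
    rcs -q -nV1_1: hello.c
    # rlog -h prints the file header, which lists the symbolic names.
    tag_line=$(rlog -h hello.c | grep 'V1_1')
else
    tag_line="skipped"                # RCS is not installed here
fi
echo "$tag_line"
```

Note that the name is attached to hello.c alone. Tagging a whole release means repeating the rcs command for every file, which is exactly the bookkeeping the wrapper systems set out to automate.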

This led many people (and companies) to write wrappers around RCS and SCCS that worked on modules, projects, or systems. Given the tool kit approach of UNIX, this made sense. In practice, there were often failings. As an example, there was a system developed over RCS (an internal tool known as MCS at a super-mini-computer vendor) which handled branches poorly. So poorly, in fact, that almost no one besides the original author knew how to use them, and by the time others divined the process, the rest of the development staff were unwilling to use it, ultimately replacing it with a commercial product.

RCS's tagging mechanism and its merging tools made it the common basis for module based source management systems. RCS is still the basis of the currently most popular Open Source SCM, CVS.

Module based source control

Most development groups had determined that a module based source management system was far more desirable than a file based system. Quite a number showed up on comp.sources.unix and comp.sources.misc during the 1980's. As mentioned above, these systems were frequently based on RCS (which itself had originally been posted to comp.sources.unix in the middle '80's.)

There are several module style source management systems (SCMs) in the commercial domain as well. The best known of them is probably Rational's ClearCase management system. Other systems known to the author are Perforce and BitKeeper.

A recent addition in the Open Source arena is Subversion, which looks to version the entire tree, not just the files. Thus, it is inherently module based. Commercial products doing this include Perforce and ClearCase. By versioning the entire tree, you effectively version all the directory changes as well as all the file changes.

Subversion uses one version number for the entire tree, rather than each file having individual versions. Personally, I haven't decided if this is good or bad. Since Subversion also supports symbolic tags, the point is moot.
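
To make the tree-wide numbering concrete, here is a small sketch. It assumes svn and svnadmin are installed (and skips itself if they are not), and it works in a throwaway repository under a temporary directory; the file names a.txt and b.txt are invented for the example.

```shell
#!/bin/sh
# Sketch: show that Subversion numbers the whole tree, not each file.
LC_ALL=C; export LC_ALL               # keep the "svn info" labels in English
if command -v svn >/dev/null 2>&1 && command -v svnadmin >/dev/null 2>&1; then
    tmp=$(mktemp -d) || exit 1
    svnadmin create "$tmp/repo"
    svn checkout -q "file://$tmp/repo" "$tmp/wc"
    cd "$tmp/wc" || exit 1
    echo one > a.txt
    echo two > b.txt
    svn add -q a.txt b.txt
    svn commit -q -m 'add a.txt and b.txt'    # first commit to the tree
    echo more >> a.txt
    svn commit -q -m 'change only a.txt'      # second commit; b.txt untouched
    svn update -q                             # bring the whole tree up to date
    # The tree carries a single revision number...
    tree_rev=$(svn info . | sed -n 's/^Revision: //p')
    # ...while each file remembers the revision in which it last changed.
    b_changed=$(svn info b.txt | sed -n 's/^Last Changed Rev: //p')
else
    tree_rev="skipped"; b_changed="skipped"   # Subversion is not installed
fi
echo "tree revision: $tree_rev, b.txt last changed in: $b_changed"
```

The point to notice is that b.txt is part of the newer tree revision even though nobody edited it in the second commit.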

User Interface

The commercial source management systems all seem to have some sort of GUI interface, as well as a command line interface.

ClearCase requires you to start a subshell with additional environment variables, and some other magic, to allow you to work in a shell environment. Additionally, ClearCase requires an additional kernel module to implement a ClearCase file system for the developers' work trees.

Perforce seems to be an interesting cross between CVS and ClearCase. It requires a dedicated server process to be running, but doesn't require a special file system to be added to the clients or the servers. (NB: I haven't extensively used Perforce yet; I'm probably missing everyone's favorite feature.)

Subversion claims to be attempting to mimic the CVS interface as much as possible, in an effort to make the conversion easier. On the less positive side, Subversion requires a reasonable amount of additional software on the server (in a client server environment). Not as much as Gnome or KDE, but it is more than just Subversion itself. The additional software is primarily Apache 2.0, Apache's WebDAV support, and Python. (See the Subversion documentation for the full list of required software.)

Using the Tools

We're about to look at using three different tools to control a small source tree. The source tree is from a couple of columns ago, the autoconf version of logger. We'll be looking at doing different common actions using each of cvs, subversion and p4.

The tasks represent those commonly used in the software development process. This is not intended as a complete review of any of the systems presented.

The tasks we're going to go through are setting up the repository, adding, editing, deleting, and tagging files. Without further ado, let's start with setting up the repository.

Using Perforce, setting up the repository seems exceedingly simple. Just create the repository's root directory, change to it, and start the daemon, p4d. To support multiple repositories, you start multiple daemons, each listening on a different port.

To actually connect to the repository, you create a client instance, using p4 client. You'll get dropped into your $EDITOR and asked to fill out a form. The defaults are probably acceptable if you're in the directory you wish to work in.

Using Subversion, setting up a repository is fairly simple and straightforward as well. The primary command used is svnadmin. Using the create subcommand, specify the repository's root directory as the second argument. An example (creating a repository in your home directory):

	% svnadmin create ~/repository

Setting up a repository using cvs used to be a fairly daunting task, but in recent releases, it has become much easier. The root directory is specified using the -d option to cvs, and repository creation is indicated using the init subcommand. An example (again, creating a repository under your home directory):

	% cvs -d ~/repository init

Adding files to a new or existing repository is similar for all of the SCM packages as well. Check out the directory (if it has not already been checked out), copy in or create the new files, and use the respective add command to enter the files into the repository. All the systems under review require a following commit command (submit, in Perforce's case) to actually add the new files to the repository.

For Perforce, the commands are as follows:

	% cd AutoConf
	% p4 client
	% p4 add COPYRIGHT Makefile.am features.h snprintf.c syslog.c
	% p4 add aclocal.m4 logger.1 strftime.c syslog.h MANIFEST
	% p4 add configure.in logger.c syslog.3
	% p4 submit

p4 client establishes a "workspace" and a client identity with the Perforce server. Included in the "workspace" information is the directory where the sources will be manipulated. The default value provided is $cwd.

p4 submit places you in $EDITOR, where the appropriate update message may be written.

Adding new files for Subversion is also fairly easy. It looks very similar to how CVS does the same task (see below for CVS.) The command sequence is as follows:

	% svn checkout file://$HOME/repository .
	% mkdir AutoConf
	% svn add AutoConf
	% cd AutoConf
	% cp $oldsrc/* .
	% svn add COPYRIGHT Makefile.am features.h snprintf.c syslog.c
	% svn add aclocal.m4 logger.1 strftime.c syslog.h MANIFEST
	% svn add configure.in logger.c syslog.3
	% svn commit -m 'logger -- a simple syslog replacement'

Adding files to cvs closely follows the process used by Subversion. This isn't terribly surprising, as the Subversion authors have expressed their intent to make Subversion's command line interface close to the command line interface of cvs.

The command sequence for adding files to CVS is:

	% cvs -d $HOME/repository checkout .
	% mkdir AutoConf
	% cvs add AutoConf
	% cd AutoConf
	% cp $oldsrc/* .
	% cvs add COPYRIGHT Makefile.am features.h snprintf.c syslog.c
	% cvs add aclocal.m4 logger.1 strftime.c syslog.h MANIFEST
	% cvs add configure.in logger.c syslog.3
	% cvs commit -m 'logger -- a simple syslog replacement'
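
Before running the commit, it is worth asking the tool what it thinks is pending. The sketch below wraps the three status commands in one small shell function. The function name pending is my own invention, and it assumes you are sitting in a working directory managed by the tool you name.

```shell
# Sketch: list pending (uncommitted) changes for the named tool.
# "pending" is a hypothetical helper, not part of any of the three packages.
pending() {
    case "$1" in
        svn) svn status ;;            # A = scheduled for add, M = modified
        cvs) cvs -nq update ;;        # -n: change nothing, just report
        p4)  p4 opened ;;             # files opened for add, edit, or delete
        *)   echo "usage: pending svn|cvs|p4" >&2
             return 2 ;;
    esac
}
```

For example, running "pending svn" inside AutoConf after the svn add commands should list the newly added files, each marked as scheduled for addition.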

Editing files is the next major task on our list. CVS and Subversion require no special steps to make a file ready to edit, as both check the files out read-write. Perforce checks files out read-only by default, and you have to explicitly request to edit them. To check files out for read-write access using Perforce, the following command is executed:

	% p4 edit COPYRIGHT
	//depot/AutoConf/COPYRIGHT#1 - opened for edit
	% $EDITOR COPYRIGHT

I hope it is now obvious how to check in a changed file.

For Perforce:

	% p4 submit COPYRIGHT

For Subversion:

	% svn commit COPYRIGHT

For CVS:

	% cvs commit COPYRIGHT

All of the preceding check-in examples cause the respective SCM system to start the user's $EDITOR, where the appropriate form is filled out. CVS and Subversion have a command line shorthand as well, in the -m option, which takes an argument describing the reason for the change.

Removing files from the repository is occasionally required. We all wish to believe that everything we write will last forever, but there comes a time when something is no longer needed. With that in mind, we'll look at removing files in each of our tools under study.

In Perforce:

	% p4 delete aclocal.m4
	//depot/AutoConf/aclocal.m4#1 - opened for delete
	% p4 submit aclocal.m4
	Change 3 created with 1 open file(s).
	Submitting change 3.
	Locking 1 files ...
	# $EDITOR runs here.
	delete //depot/AutoConf/aclocal.m4#2
	Change 3 submitted.

Perforce seems to be rather verbose about what it is doing.

In Subversion:

	% svn delete aclocal.m4
	D  aclocal.m4
	% svn commit -m "auto-generated file, oops" aclocal.m4
	Deleting       aclocal.m4

	Committed revision 2.
	%

In CVS:

	% cvs delete aclocal.m4
	cvs remove: file `aclocal.m4' still in working directory
	cvs remove: 1 file exists; remove it first
	% rm aclocal.m4
	% cvs delete aclocal.m4
	cvs remove: scheduling `aclocal.m4' for removal
	cvs remove: use 'cvs commit' to remove this file permanently
	% cvs commit -m "auto-generated file, oops" aclocal.m4
	Removing aclocal.m4;
	/home/boomer/eric/tmp/CVS/AutoConf/aclocal.m4,v  <--  aclocal.m4
	new revision: delete; previous revision: 1.1
	done
	% 

As seen above, all three systems treat the removal of a file as just another change to be made to the repository.

Next up on our list of tasks is tagging the module. By tagging, I am speaking of applying a symbolic name to a revision in each file in the repository and/or the repository itself.

Tagging a repository or module in Perforce isn't as obvious as one might hope, at least having used other SCM systems. Perforce's marking subcommand is called label, and like many of the Perforce commands, puts you in the editor to fill out a form. The command line portion looks like this:

	% p4 label V1_1
	Label V1_1 saved.
	%

Doing the same task in Subversion uses the copy subcommand:

	% cd ..		# be in the directory containing the
			# source directory
	% svn copy AutoConf AutoConf-V1_1
	% svn commit -m "Released 1.1" AutoConf-V1_1
	Adding         AutoConf-V1_1

	Committed revision 3.
	%

And doing the task under CVS looks like this:

	% cvs tag V1_1
	cvs tag: Tagging .
	T COPYRIGHT
	T MANIFEST
	T Makefile.am
	T configure.in
	T features.h
	T logger.1
	T logger.c
	T snprintf.c
	T strftime.c
	T syslog.3
	T syslog.c
	T syslog.h
	%
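
A tag is only useful if you can get the tagged sources back out again. The sketch below does the round trip with Subversion in a throwaway repository. It assumes svn and svnadmin are installed (and skips itself if not), and the temporary paths and file contents are made up. The comments give what I believe to be the CVS and Perforce equivalents, which are not run here.

```shell
#!/bin/sh
# Sketch: tag a tree with "svn copy", keep working, then fetch the tagged tree.
if command -v svn >/dev/null 2>&1 && command -v svnadmin >/dev/null 2>&1; then
    tmp=$(mktemp -d) || exit 1
    svnadmin create "$tmp/repo"
    svn checkout -q "file://$tmp/repo" "$tmp/wc"
    cd "$tmp/wc" || exit 1
    mkdir AutoConf
    echo 'release 1.1' > AutoConf/COPYRIGHT
    svn add -q AutoConf                       # adds the directory and its file
    svn commit -q -m 'initial import'
    svn update -q
    svn copy -q AutoConf AutoConf-V1_1        # the "tag" is a copy of the tree
    svn commit -q -m 'Released 1.1'
    echo 'later work' >> AutoConf/COPYRIGHT   # development carries on
    svn commit -q -m 'post-release change'
    # Fetch the release as it was tagged, into a fresh directory.
    svn checkout -q "file://$tmp/repo/AutoConf-V1_1" "$tmp/release"
    tagged=$(cat "$tmp/release/COPYRIGHT")
else
    tagged="skipped"                          # Subversion is not installed
fi
echo "$tagged"

# Equivalents, to the best of my knowledge (not executed in this sketch):
#   CVS:      cvs -d $HOME/repository checkout -r V1_1 AutoConf
#   Perforce: p4 labelsync -l V1_1    (attach the current files to the label)
#             p4 sync @V1_1           (bring the workspace to the label)
```

The later change to AutoConf/COPYRIGHT does not show up in the tagged copy, which is the whole point of tagging a release.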

Committing to an SCM system

A source control system is not something that should be chosen lightly. It can have ramifications on how the project operates, and how challenging it is to maintain the software. Evaluate many systems, and choose the one that best fulfills your project's needs.

That said, the current favorite SCM in the Open Source world is CVS. SourceForge has helped to make it the most popular system in the world. Many projects are using it, including GNU for the GCC compiler suite, the Zebra Project for the Zebra router, FreeBSD, NetBSD, and OpenBSD for the entire operating system and utilities trees.

On the other hand, Subversion looks to be an up and coming competitor to CVS. I think I'll be trying it on a project in the future.

On the commercial side of the world, Rational's ClearCase seems to be the dominant product. Many major companies, such as Hewlett Packard, use ClearCase to maintain their OS and other software suites.

From our examination of Perforce, and its free two user licenses, it looks to be quite a nice package. The FreeBSD project is using Perforce to maintain some of the new hardware architecture ports, at least during the early phases of the projects.

I didn't delve into BitKeeper, but I understand that Linus has selected it as the repository for the Linux kernel.

As always, understand your needs, check out the choices, and make a decision. I hope this article has provided enough of an overview of three popular systems to get you started.

BIO: Eric Schnoebelen has been doing development on UNIX-like systems for 20 years. Eric may be reached at compiletime@cirr.com. Past columns can be found at http://www.cirr.com/compile-time


Copyright 2000,2001 Central Iowa (Model) Railroad
$Id: 2003-Feb.html,v 1.2 2002/11/27 03:58:26 eric Exp $