cvstrac: Git Trac

GIT Trac

What The?

http://git.or.cz/

Strictly speaking, GIT isn't an SCM. Close enough.

Be warned that I'm a complete git newbie, so some of the details of how git actually works may be incorrect.

Building and Installing

DownloadCvstrac from CVS
Build CVSTrac, following the /cvstrac/COMPILING instructions
Build GitTrac:

  make APPNAME=gittrac all

You now have a gittrac executable. Copy it to wherever you want.

Configuration

Initialize the database:

  gittrac init /path/to/database(s) <project>

Start the server:

  gittrac server 8008 /path/to/database(s) <project>

Point your browser to http://localhost:8008/

Log in as setup/setup and configure. As with CVSTrac, you'll need to configure the repository path and reread the repository. The repository can be either a bare repository (e.g. /var/ftp/pub/project.git) or a private repository (e.g. /home/me/src/project/.git).

If CVS-like modules are desired (i.e. the ability to check out individual subcomponents but have GitTrac reference all of them), the repository can be a directory holding git repositories. Note that (for now) GitTrac will get the module names from the directory names, so it's probably best to drop the .git extension common to GIT repository names. When using modules, GIT references will be prefixed by the module name, as in cvstrac:1190d8d0.

Notes

You may be able to move a CVS/CVSTrac repository to GIT/GitTrac. See my GitTracMigration notes.
As with CVSTrac and SvnTrac, GitTrac does not alter the repository. So you can safely mess around all you want and not harm your source code.
GitTrac uses the GIT revision hashes directly. However, it does represent them in a compressed fashion (e.g. 1190d8d0) when space is an issue.
GitTrac's file browser shows the entire history of the GIT tree in a single view. There's no way to, say, see the whole repository at a specific revision. This isn't normally an issue except when dealing with deleted files, particularly files which are deleted in one branch and still live in other branches.
GIT heads and tags are handled by creating Milestones referencing the latest check-in for that head/tag. This is maybe not the most natural way to represent them. One convenience I'd recommend is creating a couple of reports like "Heads" and "Tags" and embedding them in wiki pages somewhere convenient. An example report for listing all heads would be:

  SELECT
    chng(cn) AS 'Cn',
    substr(branch,12,length(branch)-11) AS 'Name',
    sdate(chng.date) AS 'Last',
    directory AS 'Object',
    wiki(message) AS 'Message'
  FROM chng
  WHERE branch LIKE 'refs/heads/%'
  ORDER BY date DESC

GitTrac is mainly intended as a proof-of-concept. There's been enough testing to demonstrate that it works, and the actual GIT-specific code is relatively small, but feedback has been pretty limited. So while it works for me, it may not work for everyone.
If someone manages to pull the entire Linux kernel tree into GitTrac, I really want to hear about it.

How and Why

These are some working notes of the process for adding a new ScmTrac. They're hardly complete, probably not generic enough, and you still need to have some idea of how CVSTrac really works under the hood.

Steps to add GIT support to CVSTrac (#476):

First Steps

cp cvs.c git.c
Add git to the src array in /cvstrac/makemake.tcl, then:
tclsh makemake.tcl >main.mk
Edit /cvstrac/main.c and add a section for git_init() at the top of main().
Edit /cvstrac/git.c, implementing git_init() and the minimal set of g.scm callbacks (pxHistoryUpdate, pxDiffVersions, pxDiffChng, and pxDumpVersion).
There may be a need for git-specific items elsewhere in the code. grep for g.scm.zSCM elsewhere in the source and adjust as needed.

Getting something to work

The file /cvstrac/git.c is going to be where most of the action is found. Start by going through and removing stuff that the GIT code won't use (if you get too aggressive, steal it back from cvs.c). The only public function in there is git_init(), but give everything a git_ prefix for consistency.

git_init() simply initializes the g.scm structure with some housekeeping information and some pointers to callback functions which do all the SCM-specific work. This is called from /cvstrac/main.c once it figures out what SCM subsystem will be used (which it determines from the executable name).
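As a rough illustration of the shape this takes, the sketch below fills in a callback table and guesses the SCM from the executable name. Only git_init(), g.scm, zSCM and the four px callback names come from these notes. The struct layout, the callback signatures, the stub bodies and the scm_from_exe() helper are assumptions made for the example, and the real CVSTrac code will differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the real g.scm structure. */
struct Scm {
  const char *zSCM;                       /* short SCM name, e.g. "git" */
  int (*pxHistoryUpdate)(int isReread);   /* pull new changes into the db */
  int (*pxDiffVersions)(const char *zOld, const char *zNew, const char *zFile);
  int (*pxDiffChng)(int cn, int bRaw);
  int (*pxDumpVersion)(const char *zVers, const char *zFile, int bRaw);
};

/* Hypothetical global, mirroring the "g.scm" naming used in these notes. */
struct Global { struct Scm scm; } g;

/* Stub callbacks; a real port does the SCM-specific work in these. */
static int git_history_update(int isReread){ (void)isReread; return 0; }
static int git_diff_versions(const char *zOld, const char *zNew,
                             const char *zFile){
  (void)zOld; (void)zNew; (void)zFile;
  return 0;
}
static int git_diff_chng(int cn, int bRaw){ (void)cn; (void)bRaw; return 0; }
static int git_dump_version(const char *zVers, const char *zFile, int bRaw){
  (void)zVers; (void)zFile; (void)bRaw;
  return 0;
}

/* The only public entry point: wire the git callbacks into g.scm. */
void git_init(void){
  g.scm.zSCM = "git";
  g.scm.pxHistoryUpdate = git_history_update;
  g.scm.pxDiffVersions = git_diff_versions;
  g.scm.pxDiffChng = git_diff_chng;
  g.scm.pxDumpVersion = git_dump_version;
}

/* Hypothetical helper: guess the SCM from argv[0], so that
** "/usr/local/bin/gittrac" selects "git". Falls back to "cvs". */
const char *scm_from_exe(const char *zExe){
  const char *zBase = strrchr(zExe, '/');
  zBase = zBase ? zBase+1 : zExe;
  if( strncmp(zBase, "git", 3)==0 ) return "git";
  if( strncmp(zBase, "svn", 3)==0 ) return "svn";
  return "cvs";
}
```

In this sketch, main() would call scm_from_exe(argv[0]) and then the matching init function. Everything outside git.c then goes through the g.scm pointers and never calls git code directly.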

Once we get the initialization stuff out of the way, we basically have to start in on git_history_update() (the pxHistoryUpdate callback). This is called periodically to find new changes in the repository. Practically speaking, this means it needs to efficiently answer "show me any changes since revision n", for some n, and then pull those changes into the CHNG, FILE and FILECHNG tables in some way that makes sense to the end user. How the repository file tree is represented to the user really doesn't affect CVSTrac, so go with something "natural" for GIT users. What we care about are the entries in the CHNG table, which should, as much as possible, look like atomic changesets. That should be feasible for any SCM more modern than CVS.

The first thing that has to be dealt with in git_history_update() is how to handle git's concept of HEADS. Unlike CVS and Subversion, there's no concept of a linear "list of changes" that covers all repository activity. At least, I can't find one. For each "head", however, it's possible to trace everything back. The catch is that we're walking up a tree from its leaf nodes, and a certain number of commits in each list are going to be shared between different heads. AFAICT it's not possible to determine from the revision history which "head" is what we'd call the trunk in CVS or Subversion. git does have a concept of HEAD, but it's more of a working concept similar to CVS's "sticky" tags. In other words, while git most certainly supports a rich concept of branching (and merging), we can't reliably name any of the nodes in a git tree by inspecting the tree after the fact.

In the short (and probably long) term, this means we're going to ignore the whole issue of git branches. We'll handle the naming of the branches in the form of tags and heads, but we're unable to associate any particular check-in with a specific branch or tag.
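The practical upshot of shared history can be sketched as follows: walk back from every head, but record each commit only once, no matter which head reached it first. The struct Commit type, the seen flag and walk_head() are hypothetical names for this illustration. Real code would read commits from git and would probably avoid recursion on a deep history.

```c
#include <stddef.h>

#define MAX_PARENTS 2

/* Hypothetical in-memory commit node; unused parent slots are NULL. */
struct Commit {
  const char *zHash;
  struct Commit *apParent[MAX_PARENTS];
  int seen;                 /* set once the commit has been recorded */
};

/* Walk back from one head, recording each commit at most once.
** Returns the number of commits newly recorded by this walk. */
int walk_head(struct Commit *p){
  int n, i;
  if( p==NULL || p->seen ) return 0;   /* already reached via another head */
  p->seen = 1;                         /* this is where a CHNG row would go */
  n = 1;
  for(i=0; i<MAX_PARENTS; i++){
    n += walk_head(p->apParent[i]);
  }
  return n;
}
```

Calling walk_head() once per head means a second head that shares most of its history with the first only contributes the commits unique to it. Nothing in the walk says which branch a commit "belongs" to, which is the limitation described above.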

Once past that issue, the actual procedure for pulling git revision information out is trivial. git was designed for having stuff like this built around it and it really shows. The lack of an easy way to diff blobs is a bit odd, but otherwise things just work the way a C coder would want them to.

Git tags are handled simply as a variation of heads. In fact, they're handled identically with the same subroutines. Rather than lose the reference information, we transform all heads and tags into milestones which, via wiki markup, link to the "last" commit under the tag/head. If a tag or head moves (and heads move a lot), the milestone is updated. One immediate consequence is that with just a glance at the timeline, a user can see which head is getting the most activity.
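The create-or-move logic for these milestones might look roughly like the sketch below. It uses a small in-memory table where GitTrac would use its database, and struct RefMilestone, aRef and update_ref_milestone() are all names invented for the example.

```c
#include <string.h>

#define MAX_REFS 64

/* Hypothetical record tying a head or tag name to the commit that its
** milestone currently links to. */
struct RefMilestone {
  char zRef[128];     /* e.g. "refs/heads/master" */
  char zCommit[41];   /* up to 40 hex digits plus NUL */
};

static struct RefMilestone aRef[MAX_REFS];
static int nRef = 0;

/* Record that zRef now points at zCommit. Heads and tags share this one
** routine. Returns 1 if a milestone was created or moved, 0 if nothing
** changed, and -1 if the table is full or an argument is too long. */
int update_ref_milestone(const char *zRef, const char *zCommit){
  int i;
  if( strlen(zRef)>=sizeof(aRef[0].zRef) ) return -1;
  if( strlen(zCommit)>=sizeof(aRef[0].zCommit) ) return -1;
  for(i=0; i<nRef; i++){
    if( strcmp(aRef[i].zRef, zRef)==0 ){
      if( strcmp(aRef[i].zCommit, zCommit)==0 ) return 0;  /* unchanged */
      strcpy(aRef[i].zCommit, zCommit);                    /* ref moved */
      return 1;
    }
  }
  if( nRef>=MAX_REFS ) return -1;
  strcpy(aRef[nRef].zRef, zRef);
  strcpy(aRef[nRef].zCommit, zCommit);
  nRef++;
  return 1;
}
```

A return value of 1 corresponds to the case where the milestone would be written or rewritten, so a busy head shows up repeatedly in the timeline.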

Files and Revisions

The other mandatory callbacks are all related to producing diffs and/or file contents. pxDiffVersions outputs an HTML diff of two versions of a file, pxDiffChng outputs a patchset (a diff of an entire checkin, either raw or HTML), and pxDumpVersion outputs a specific version of a file (either raw or HTML, possibly filtered).

Cosmetic Details

Code like:

  if( !strcmp(g.scm.zSCM,"cvs") ){
    /* CVS-specific behaviour goes here */
  }

is found in various places. You probably want to look around and either disable certain functions which aren't relevant for GIT or enable certain functions which are applicable but aren't used by all other SCMs.

In the case of git, the user-file things (like CVSROOT/passwd) aren't ever going to be used. Since the corresponding g.scm callbacks are empty, those items aren't displayed in git mode.

In addition, it's necessary to adjust how changesets/revisions/hashes are displayed, since the usual 40-character hexadecimal hashes make a hash of our layout code and aren't meaningful in most situations anyway. We handle that by creating a printable_vers() function to turn something like:

  1190649aaff433501d6e6c92deb8a0f201fdd8d0

into just

  1190649a

Isn't that better? printable_vers() applies this kind of transformation to any long version string, so it's SCM-generic. Mind you, not all SCMs might want this sort of "cleanup".
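A minimal sketch of such a helper follows, assuming the leading-prefix form shown in the example above and treating "long" as anything over 8 characters. The buffer-passing signature and the SHORT_VERS_LEN threshold are assumptions for illustration; the actual printable_vers() in the CVSTrac source may use a different signature and a different abbreviation scheme.

```c
#include <stddef.h>
#include <string.h>

#define SHORT_VERS_LEN 8   /* assumed cut-off for a "long" version string */

/* Copy zVers into zBuf, keeping only the first SHORT_VERS_LEN characters
** when the version string is longer than that. Short versions (such as a
** CVS revision like "1.12") pass through unchanged. Always NUL-terminates
** zBuf when nBuf>0 and returns the printable string. */
const char *printable_vers(const char *zVers, char *zBuf, size_t nBuf){
  size_t n = strlen(zVers);
  if( nBuf==0 ) return "";
  if( n>SHORT_VERS_LEN ) n = SHORT_VERS_LEN;   /* long hash: keep a prefix */
  if( n>=nBuf ) n = nBuf-1;                    /* never overrun the buffer */
  memcpy(zBuf, zVers, n);
  zBuf[n] = 0;
  return zBuf;
}
```

Because the function looks only at string length and never at g.scm.zSCM, it stays SCM-generic, as described above.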
