**GIT Trac**

*What The?*

http://git.or.cz/

Strictly speaking, GIT isn't an SCM. Close enough.

Be warned that I'm a complete git newbie so some of the details of how git
actually works may be incorrect.

*Building and Installing*

*: DownloadCvstrac from CVS
*: Build CVSTrac, following the /cvstrac/COMPILING instructions
*: Build GitTrac:

  make APPNAME=gittrac all

You now have a =gittrac= executable. Copy it wherever you want.

*Configuration*

Initialize the database:

  gittrac init /path/to/database(s) <project>

Start the server:

  gittrac server 8008 /path/to/database(s) <project>

Point your browser to http://localhost:8008/

Log in as setup/setup and configure. As with CVSTrac, you'll need to configure
the repository path and reread the repository. The repository can be either a
_bare_ repository (e.g. =/var/ftp/pub/project.git=) or a _private_ repository
(e.g. =/home/me/src/project/.git=).

If CVS-like modules are desired (i.e. the ability to checkout individual
subcomponents but have GitTrac reference all of them), the repository can be
a directory holding git repositories. Note that (for now) GitTrac
will get the module names from the directory names, so it's probably
best to drop the .git extension common to GIT repository names. When using
modules, GIT references will be prefixed by module name, as in
{quote:cvstrac:1190d8d0}.

*Notes*

*: You _may_ be able to move a CVS/CVSTrac repository to GIT/GitTrac. See my
GitTracMigration notes.
*: As with CVSTrac and SvnTrac, GitTrac does _not_ alter the repository. So you
can safely mess around all you want and not harm your source code.
*: GitTrac uses the GIT revision hashes directly. However, it does represent
them in a compressed fashion (e.g. =1190d8d0=) when space is an issue.
*: GitTrac's file browser shows the entire history of the GIT tree in a single
view. There's no way to, say, see the whole repository at a specific revision.
This isn't normally an issue except when dealing with deleted files,
particularly files which are deleted in one branch and still live in others.
*: GIT heads and tags are handled by creating Milestones referencing the latest
check-in for that head/tag. This is maybe not the most natural way to represent
these. One convenience I'd recommend is creating a couple of reports like
"Heads" and "Tags" and embedding them in some wiki pages somewhere convenient.
An example report for listing all heads would be:

  SELECT
    chng(cn) AS 'Cn',
    substr(branch,12,length(branch)-11) AS 'Name',
    sdate(chng.date) AS 'Last',
    directory AS 'Object',
    wiki(message) AS 'Message'
  FROM chng
  WHERE branch LIKE 'refs/heads/%'
  ORDER BY date DESC

*: GitTrac is mainly intended as a proof-of-concept. There's been enough
testing to demonstrate that it works, and the actual GIT-specific code is
relatively small, but feedback has been pretty limited. So while it works for
me, it may not work for everyone.
*: If someone manages to pull the entire Linux kernel tree into GitTrac, I
really {link: mailto:cpb@cpan.org want to hear about it}.

*How and Why*

These are some working notes of the process for adding a new ScmTrac. They're
hardly complete, probably not generic enough, and you still need to have some
idea of how CVSTrac really works under the hood.

Steps to add GIT support to CVSTrac (#476):

*First Steps*

*: =cp cvs.c git.c=
*: Add =git= to the _src_ array in /cvstrac/makemake.tcl, then:
*: =tclsh makemake.tcl >main.mk=
*: Edit /cvstrac/main.c and add a section for =git_init()= at the top of
=main()=.
*: Edit /cvstrac/git.c, implementing =git_init()= and the minimal set of g.scm
callbacks (=pxHistoryUpdate=, =pxDiffVersions=, =pxDiffChng=, and
=pxDumpVersion=).
*: There may be a need for git-specific items elsewhere in the code. =grep= for
=g.scm.zSCM= elsewhere in the source and adjust as needed.

*Getting something to work*

The file /cvstrac/git.c is going to be where most of the action is found. Start
by going through and removing stuff that the GIT code won't use (if you get too
aggressive, steal it back from =cvs.c=). The only _public_ function in there
is =git_init()=, but call everything =git_= for consistency.

=git_init()= simply initializes the =g.scm= structure with some housekeeping
information and some pointers to callback functions which do all the
SCM-specific work. This is called from /cvstrac/main.c once it figures out what
SCM subsystem will be used (which it determines from the executable name).
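
As a rough illustration of the above, here is a minimal sketch of a callback
table and its initializer. The field names =zSCM=, =pxHistoryUpdate=,
=pxDiffVersions=, =pxDiffChng= and =pxDumpVersion= come from these notes;
everything else (the =zName= field, all the function signatures, and the
=scm_init_from_argv0()= helper) is an assumption made up for the example and
is not the real CVSTrac declaration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for CVSTrac's g.scm structure.
** Field names follow the notes; the signatures are assumptions. */
struct Scm {
  const char *zSCM;    /* short identifier, e.g. "cvs" or "git" */
  const char *zName;   /* display name (assumed field) */
  int (*pxHistoryUpdate)(int isReread);
  int (*pxDiffVersions)(const char *zV1, const char *zV2, const char *zFile);
  int (*pxDiffChng)(int cn, int bRaw);
  int (*pxDumpVersion)(const char *zVers, const char *zFile, int bRaw);
};

static struct { struct Scm scm; } g;   /* mirrors the g.scm naming */

/* Stub callbacks; a real git.c would talk to the repository here. */
static int git_history_update(int isReread){ (void)isReread; return 0; }
static int git_diff_versions(const char *zV1, const char *zV2,
                             const char *zFile){
  (void)zV1; (void)zV2; (void)zFile; return 0;
}
static int git_diff_chng(int cn, int bRaw){ (void)cn; (void)bRaw; return 0; }
static int git_dump_version(const char *zVers, const char *zFile, int bRaw){
  (void)zVers; (void)zFile; (void)bRaw; return 0;
}

/* The only public function: fill in housekeeping data and callbacks. */
void git_init(void){
  g.scm.zSCM = "git";
  g.scm.zName = "GIT";
  g.scm.pxHistoryUpdate = git_history_update;
  g.scm.pxDiffVersions = git_diff_versions;
  g.scm.pxDiffChng = git_diff_chng;
  g.scm.pxDumpVersion = git_dump_version;
}

/* Assumed dispatch: choose the SCM from the executable's base name.
** Returns 1 if the name selected git, 0 otherwise. */
int scm_init_from_argv0(const char *zArgv0){
  const char *zBase = strrchr(zArgv0, '/');
  zBase = zBase ? zBase+1 : zArgv0;
  if( strncmp(zBase, "git", 3)==0 ){
    git_init();
    return 1;
  }
  return 0;  /* a real main() would fall through to the other SCMs */
}
```

With this layout the rest of the program only ever calls through =g.scm=, so
adding another SCM means adding another =xxx_init()= and one more branch in
the dispatch.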

Once we get the initialization stuff out of the way, we basically have to start
in on =git_history_update()= (the =pxHistoryUpdate= callback). This is called
periodically to find new changes from the repository. Practically speaking,
this means it's going to get called and will need to efficiently say "show me
any changes since revision _n_", for some _n_, and then suck in the changes
into the _CHNG_, _FILE_ and _FILECHNG_ tables in some way that makes sense to
the end user. How the repository file tree is represented to the user really
doesn't affect CVSTrac, so go with something "natural" for GIT users. What we
care about is the entries in the _CHNG_ table, which should, as much as possible,
look like atomic changesets. That should be feasible for any SCM more modern
than CVS.
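
One way to ask git for "any changes since revision _n_" is =git rev-list= with
an excluded revision (the =^rev= syntax). The sketch below only builds such a
command line. The helper name =git_build_revlist_cmd()= and its parameters are
hypothetical, and this is not necessarily how GitTrac itself queries the
repository.

```c
#include <stdio.h>

/* Hypothetical helper: format a command that lists the commits reachable
** from zHead but not from zLast, oldest first. zLast may be NULL or empty
** on the first run, in which case the whole history of zHead is listed.
** Arguments are wrapped in single quotes but NOT escaped, so this sketch
** assumes they contain no quote characters.
** Returns 0 on success, -1 if the buffer is too small. */
int git_build_revlist_cmd(char *zBuf, size_t nBuf, const char *zGitDir,
                          const char *zHead, const char *zLast){
  int n;
  if( zLast && zLast[0] ){
    n = snprintf(zBuf, nBuf,
                 "git --git-dir='%s' rev-list --reverse '%s' '^%s'",
                 zGitDir, zHead, zLast);
  }else{
    n = snprintf(zBuf, nBuf,
                 "git --git-dir='%s' rev-list --reverse '%s'",
                 zGitDir, zHead);
  }
  return (n<0 || (size_t)n>=nBuf) ? -1 : 0;
}
```

The resulting string could be handed to =popen()= and each output line (one
hash per line) turned into a _CHNG_ entry.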

The first thing that has to be dealt with in =git_history_update()= is how to
handle git's concept of HEADS. Unlike CVS and Subversion, there's no concept of
a linear "list of changes" that covers all repository activity. At least, I
can't find one. For each "head", however, it's possible to trace everything
back. The catch, however, is that we're walking up a tree from its leaf nodes
and a certain number of commits in each list are going to be shared between
different heads. AFAICT it's _not possible_ to determine from the revision
history which "head" is what, in CVS or Subversion, we'd call the trunk. git
does have a concept of HEAD, but it's more of a working concept similar to
CVS's "sticky" tags. In other words, while git most certainly supports a rich
concept of branching (and merging), we can't reliably _name_ any of the nodes
in a git tree by inspecting the tree after the fact.

In the short (and probably long) term, this means we're going to ignore the
whole issue of git branches. We'll handle the _naming_ of the branches in the
form of tags and heads, but we're unable to associate any particular checkin
with a specific branch or tag.
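
Since different heads share commits, walking back from each head in turn needs
some way to avoid recording the same commit twice. Here's a toy sketch of that
idea using an in-memory commit graph and a "seen" flag. The =Commit= structure
and =walk_head()= are invented for illustration; real code would read commits
and parents from the repository.

```c
#include <stdlib.h>

#define MAX_PARENTS 2

/* Toy commit node for the example. */
struct Commit {
  const char *zHash;
  int aParent[MAX_PARENTS];  /* indexes into the commit array, -1 if none */
  int seen;                  /* set once the commit has been recorded */
};

/* Walk back from one head, recording every commit not already seen.
** Returns the number of newly recorded commits, or -1 on allocation
** failure. Each commit is pushed at most once, so nCommit stack slots
** are always enough. */
int walk_head(struct Commit *aCommit, int nCommit, int iHead){
  int *aStack;
  int nStack = 0, nNew = 0;
  if( nCommit<=0 || iHead<0 || iHead>=nCommit ) return 0;
  if( aCommit[iHead].seen ) return 0;
  aStack = malloc(sizeof(int)*(size_t)nCommit);
  if( aStack==NULL ) return -1;
  aCommit[iHead].seen = 1;
  aStack[nStack++] = iHead;
  while( nStack>0 ){
    int i = aStack[--nStack];
    int j;
    nNew++;   /* this is where a CHNG entry would be created */
    for(j=0; j<MAX_PARENTS; j++){
      int p = aCommit[i].aParent[j];
      if( p>=0 && p<nCommit && !aCommit[p].seen ){
        aCommit[p].seen = 1;
        aStack[nStack++] = p;
      }
    }
  }
  free(aStack);
  return nNew;
}
```

Note that nothing here says which head a shared commit "belongs" to: whichever
head gets walked first picks it up, which is exactly the naming problem
described above.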

Once past that issue, the actual procedure for pulling git revision information
out is trivial. git was designed for having stuff like this built around it and
it really shows. The lack of an easy way to diff blobs is a bit odd, but
otherwise things just work the way a C coder would want them to.

Git tags are handled simply as a variation of heads. In fact, they're handled
identically with the same subroutines. Rather than lose the reference
information, we transform all heads and tags into milestones which, via
wiki markup, link to the "last" commit under the tag/head. If a tag or head
moves (and heads move a lot), the milestone is updated. One immediate
consequence is that with just a glance at the timeline, a user can see which
head is getting the most activity.
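
The "one milestone per head or tag, updated when it moves" bookkeeping can be
sketched with a small in-memory table. GitTrac keeps milestones in its
database, so the =RefTable= structure, the =ref_milestone_update()= function
and the return codes below are all hypothetical stand-ins.

```c
#include <stdio.h>
#include <string.h>

#define MAX_REFS 32
#define HASH_LEN 40

/* One milestone per ref: the ref name and the commit it points at. */
struct RefMilestone {
  char zRef[128];
  char zHash[HASH_LEN+1];
};

struct RefTable {
  struct RefMilestone a[MAX_REFS];
  int n;
};

enum { REF_FULL = -1, REF_UNCHANGED = 0, REF_CREATED = 1, REF_MOVED = 2 };

/* Record that zRef (a head or a tag, handled identically) now points at
** zHash. Creates the milestone if it's new, updates it if the ref moved. */
int ref_milestone_update(struct RefTable *p, const char *zRef,
                         const char *zHash){
  int i;
  for(i=0; i<p->n; i++){
    if( strcmp(p->a[i].zRef, zRef)==0 ){
      if( strcmp(p->a[i].zHash, zHash)==0 ) return REF_UNCHANGED;
      snprintf(p->a[i].zHash, sizeof(p->a[i].zHash), "%s", zHash);
      return REF_MOVED;
    }
  }
  if( p->n>=MAX_REFS ) return REF_FULL;
  snprintf(p->a[p->n].zRef, sizeof(p->a[p->n].zRef), "%s", zRef);
  snprintf(p->a[p->n].zHash, sizeof(p->a[p->n].zHash), "%s", zHash);
  p->n++;
  return REF_CREATED;
}
```

Calling this for every entry under =refs/heads/= and =refs/tags/= on each
history update would keep the milestones in step with the repository.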

*Files and Revisions*

The other _mandatory_ callbacks are all related to producing diffs and/or file
contents. =pxDiffVersions= outputs an HTML diff of two versions of a file,
=pxDiffChng= outputs a patchset (a diff of an entire checkin, either raw or
HTML), and =pxDumpVersion= outputs a specific version of a file (either raw or
HTML, possibly filtered).

*Cosmetic Details*

Code like:

  if( !strcmp(g.scm.zSCM,"cvs") ){
    /* CVS-only behaviour goes here */
  }

is found in various places. You probably want to look around and either disable
certain functions which aren't relevant for GIT or enable certain functions
which are applicable but aren't used by all other SCMs.

In the case of git, the user-file things (like CVSROOT/passwd) aren't ever
going to be used. Since the corresponding =g.scm= callbacks are empty, those
items aren't displayed in git mode.
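
The "empty callback means the feature is hidden" pattern looks roughly like
this. The hook names =pxUserRead= and =pxUserWrite= are assumptions for the
sake of the example, as are the two sample tables.

```c
#include <stddef.h>

/* Hypothetical subset of the callback table: optional user-file hooks. */
struct ScmUserHooks {
  int (*pxUserRead)(void);   /* assumed name: import users from the SCM */
  int (*pxUserWrite)(void);  /* assumed name: export users to the SCM */
};

/* Stub standing in for a CVS implementation of the read hook. */
static int cvs_user_read(void){ return 0; }

/* Sample tables: CVS provides a hook, git leaves both empty. */
const struct ScmUserHooks cvsHooks = { cvs_user_read, NULL };
const struct ScmUserHooks gitHooks = { NULL, NULL };

/* Returns 1 if the setup pages should offer the user-file options. */
int scm_has_user_files(const struct ScmUserHooks *p){
  return p->pxUserRead!=NULL || p->pxUserWrite!=NULL;
}
```

Checking for a missing callback avoids yet another =strcmp()= on the SCM name
and keeps working when a new SCM is added.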

In addition, it's necessary to adjust how changesets/revisions/hashes are
displayed since the usual 40 character hexadecimal hashes make a hash of our
layout code and aren't meaningful in most situations anyways. We handle that by
creating a =printable_vers()= function to turn something like:

   1190649aaff433501d6e6c92deb8a0f201fdd8d0

into just

   1190649a

Isn't that better? =printable_vers()= applies this kind of transformation to
any long version string, so it's SCM-generic. Mind you, not all SCMs might
want this sort of "cleanup".
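
For the curious, a shortening function along these lines could look like the
sketch below. It is _not_ the actual =printable_vers()= code: the name
=short_vers()=, the caller-supplied buffer, and the 16 character threshold for
what counts as a "long" version string are all assumptions. It keeps the first
eight characters, matching the example above.

```c
#include <stddef.h>
#include <string.h>

#define SHORT_VERS_LEN 8    /* characters kept from a long version string */
#define LONG_VERS_MIN  16   /* assumed threshold for "long" */

/* Hypothetical stand-in for printable_vers(): copy zVers into zBuf,
** keeping only the first SHORT_VERS_LEN characters when the string is
** long. Short versions (e.g. CVS-style "1.24") are copied unchanged.
** Returns zBuf. */
const char *short_vers(const char *zVers, char *zBuf, size_t nBuf){
  size_t n = strlen(zVers);
  if( nBuf==0 ) return zBuf;
  if( n>=LONG_VERS_MIN ) n = SHORT_VERS_LEN;
  if( n>=nBuf ) n = nBuf-1;   /* never overrun the caller's buffer */
  memcpy(zBuf, zVers, n);
  zBuf[n] = 0;
  return zBuf;
}
```

Because it only looks at the length, it doesn't need to know which SCM
produced the version string.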