Merging subversion forks

or, "How I learned to stop worrying and love the fork."

15 April 2008   Charles Roth

Introduction
I am currently the subversion-meister and "best practices" guru for a small company with five J2EE developers spread across four offices on two continents.  Oy.  Pray for me.

Merging development 'forks' was getting out of control, so I enforced some of the classic distributed-development tricks:

But merging each branch back into the trunk was still a pain. 

Then one day the light dawned... merging two forked versions of a file is really merging three files, not two: the forked versions, plus the most recent common ancestor. 

Flashback
Many years ago, dissatisfied with all of the existing file "difference" tools, I wrote my own in 'C', called (d'oh) 'merge2'.  It's meant to be used with an editor like 'vi' (or, I suppose, emacs), where one can easily manipulate both lines and columns.

Merge2 interleaves two (similar) files into one file, with this twist: lines that only appear in the first file get a "1" in column 1; lines only in the second file get a "2"; and lines common to both get a " ".  Merge2 is very smart about finding common blocks of text, and ignores (for the purpose of finding matching lines) indentation, so the result is a minimum of extraneous "1"s and "2"s.

So, for example, here's the result of a 'merge2' on two slightly different versions (two differently drunken typists?) of a bit of A. E. Housman...

   Ale, man!  Ale's the stuff to drink!
  1For fellows whom it hurts to thing.
  2For fellows whom it hurts to think.
   Look into the pewter pot
  1To see the world as the world's not.
  2To see the globe as the globe's not.
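
For readers who want to experiment with the idea, here is a minimal sketch of the interleaving in Python.  It is not the original 'C' merge2: it assumes the standard difflib module is a reasonable stand-in for merge2's block matching, and the function name and the list-of-lines interface are illustrative only.

```python
import difflib

def merge2(lines1, lines2):
    """Interleave two lists of lines, prefixing each with '1', '2', or ' '."""
    # Match on lines with surrounding whitespace stripped, so that
    # indentation changes alone do not count as differences.
    keys1 = [s.strip() for s in lines1]
    keys2 = [s.strip() for s in lines2]
    matcher = difflib.SequenceMatcher(None, keys1, keys2, autojunk=False)
    merged = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Common block: emit the first file's copy, marked with a space.
            merged.extend(" " + s for s in lines1[i1:i2])
        else:
            # Differing block: first file's lines, then second file's lines.
            merged.extend("1" + s for s in lines1[i1:i2])
            merged.extend("2" + s for s in lines2[j1:j2])
    return merged
```

Calling merge2 on the two Housman versions, one string per line, yields the marked lines in the same order as the listing above.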

The Light Dawns
So the revelation was to apply merge2 three times.

  1. Merge2 the common ancestor with the current version in branch A.
  2. Merge2 the common ancestor with the current version in branch B.
  3. Merge2 the results of merge #1 and #2!
Mix this in with a little bit of tweaking of the first columns to mark the origins of the changes, and suddenly everything becomes clear.
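
The three steps can be sketched in Python as follows.  This is a reconstruction under assumptions, not the actual mergeFork source: difflib stands in for merge2's matching, the names merge2 and merge_fork are illustrative, and the column "tweaking" is assumed to map the outer marker to the branch (A or B) and the inner marker to the age of the line (0 for ancestor, 1 for changed).

```python
import difflib

def merge2(a, b, key=str.strip):
    """Two-way interleave; returns (marker, line) pairs, marker ' ', '1' or '2'."""
    matcher = difflib.SequenceMatcher(
        None, [key(x) for x in a], [key(x) for x in b], autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out += [(" ", x) for x in a[i1:i2]]
        else:
            out += [("1", x) for x in a[i1:i2]]
            out += [("2", x) for x in b[j1:j2]]
    return out

def merge_fork(base, branch_a, branch_b):
    """Three-way view of two branches against their common ancestor."""
    # Steps 1 and 2: merge the ancestor with each branch, keeping the
    # one-character marker glued to the front of every line.
    marked_a = [m + x for m, x in merge2(base, branch_a)]
    marked_b = [m + x for m, x in merge2(base, branch_b)]
    # Step 3: merge the two marked results.  Lines match only if both the
    # marker and the (indentation-stripped) text agree.
    side = {" ": " ", "1": "A", "2": "B"}   # which result the line came from
    age = {" ": " ", "1": "0", "2": "1"}    # ancestor line vs. changed line
    result = []
    for m, line in merge2(marked_a, marked_b,
                          key=lambda s: s[:1] + s[1:].strip()):
        result.append(side[m] + age[line[0]] + line[1:])
    return result
```

A line untouched in both branches ends up with two leading spaces; everything else carries a two-character marker saying which branch it came from and whether it is the ancestor's text or the changed text.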

Example
Here's an example.  The original ancestor has a bunch of typos and mis-rememberings (shown in red):

   Oh many a peer of England brews
   Livelier licquor than the Muse,
   And malt does more than Milton can
   To justify god's ways to man.
   Ale man, ale's the stuff to drink (missing comma)
   For fellows whom it hurts to thing:
   Look into the pewter pot
   To see the globe as the globe's not.
Here's the result of person A's corrections, in blue:
   Oh many a peer of England brews
   Livelier liquor than the Muse,
   And malt does more than Milton can
   To justify God's ways to man.
   Ale man, ale's the stuff to drink
   For fellows whom it hurts to thing:
   Look into the pewter pot
   To see the globe as the globe's not.
Person B isn't as good at spelling, but he did remember a few more lines:
   Oh many a peer of England brews
   Livelier licquor than the Muse,
   And malt does more than Milton can
   To justify god's ways to man.
   Ale, man, ale's the stuff to drink
   For fellows whom it hurts to think:
   Look into the pewter pot
   To see the world as the world's not.
   And faith, 'tis pleasant till 'tis past:
   The mischief is that 'twill not last.
Now, here's the result when run through 'mergeFork', which does the previously-described three-way merge.  Only now the first two columns indicate the source of the changes.  With this information, the proper merge jumps out pretty easily:
  Oh many a peer of England brews
A0Livelier licquor than the Muse,
A1Livelier liquor than the Muse,
B Livelier licquor than the Muse,
  And malt does more than Milton can
A0To justify god's ways to man.
A1To justify God's ways to man.
A Ale man, ale's the stuff to drink
A For fellows whom it hurts to thing:
B To justify god's ways to man.
B0Ale man, ale's the stuff to drink
B0For fellows whom it hurts to thing:
B1Ale, man, ale's the stuff to drink
B1For fellows whom it hurts to think:
  Look into the pewter pot
A To see the globe as the globe's not.
B0To see the globe as the globe's not.
B1To see the world as the world's not.
B1And faith, 'tis pleasant till 'tis past:
B1The mischief is that 'twill not last.

Usage
MergeFork merges (one file at a time) from two checked-out subversion branches.  You have to know the revision number of the common ancestor, and the revision numbers of the branches that you are dealing with.  It automatically launches 'vi' on the merged result.

Here's the "usage" text displayed when running mergeFork w/o arguments:

Usage: mergeFork revBase branchDirA revA branchDirB revB filename
 
For example: 
  Checked-out trunk in directory 'trunk'.
  Checked-out branch alpha in directory 'alpha'.
  Trying to merge a fork in a file stuff.java.
  Alpha branched at rev 100, now at rev 120.
  Trunk is at rev 130.
 
mergeFork 100 alpha 120 trunk 130 stuff.java
  produces merged file 'm', each line has 2 leading chars:
  A0 means original (rev 100) line from alpha
  A1 means changed line in alpha.
  B0,B1 similarly for trunk.
  A  means unchanged line in alpha that is not in trunk
  B  similarly for trunk
   0 means original in both alpha and trunk

Downloads

Feedback
...is welcome, at roth@thedance.net.  I hope this helps, pass it on!