Merging Subversion forks
or, "How I learned to stop worrying and love the fork."
15 April 2008 Charles Roth
Introduction
I am currently the subversion-meister and "best practices" guru
for a small company with 5 J2EE developers spread across four
offices on two continents.
Oy. Pray for me.
Merging development 'forks' was getting out of control, so I enforced
some of the classic distributed-development tricks:
- Short commit cycles, ~2 weeks.
("Short" to non-XP-ers, anyway. But this group hasn't historically
done unit tests, so... but I digress.)
- Version history comments in each class header.
- Separate branches for each mini-project.
But merging each branch back into the trunk was still a pain.
Then one day the light dawned... merging two forked versions of a file
is really merging three files, not two: the forked versions,
plus the most recent common ancestor.
Flashback
Many years ago, dissatisfied with all of the
existing file "difference" tools, I wrote my own in 'C',
called (d'oh) 'merge2'.
It's meant to be used with an editor like 'vi' (or, I suppose,
emacs), where one can easily manipulate both lines
and columns.
Merge2 interleaves two (similar) files into one file, with this
twist: lines that only appear in the first file get a "1" in column 1;
lines only in the second file get a "2"; and lines common to both
get a " ".
Merge2 is very smart about finding common blocks of text,
and ignores (for the purpose of finding matching lines) indentation,
so the result is a minimum of extraneous "1"s and "2"s.
So, for example, here's the result of a 'merge2' on
two slightly different versions (two differently drunken typists?)
of a bit of A. E. Housman...
 Ale, man! Ale's the stuff to drink!
1For fellows whom it hurts to thing.
2For fellows whom it hurts to think.
 Look into the pewter pot
1To see the world as the world's not.
2To see the globe as the globe's not.
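For readers without the C source to hand, here is a minimal sketch of the same idea in Python. It is not the real merge2: it leans on the standard library's difflib to find the common blocks, and the function name, the tag parameters, and the choice to print the first file's copy of a matching line are all my assumptions for illustration.

```python
import difflib


def merge2(lines1, lines2, tag1="1", tag2="2"):
    """Interleave two line lists, marking each line's origin in column 1.

    Sketch only: difflib stands in for merge2's own matching logic.
    """
    # Match on left-stripped text so indentation changes are not differences.
    keys1 = [s.lstrip() for s in lines1]
    keys2 = [s.lstrip() for s in lines2]
    matcher = difflib.SequenceMatcher(None, keys1, keys2, autojunk=False)

    merged = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            # Common lines get a blank in column 1 (first file's copy is kept).
            merged += [" " + s for s in lines1[i1:i2]]
        else:
            # Lines only in the first file, then lines only in the second.
            merged += [tag1 + s for s in lines1[i1:i2]]
            merged += [tag2 + s for s in lines2[j1:j2]]
    return merged


if __name__ == "__main__":
    first = [
        "Ale, man! Ale's the stuff to drink!",
        "For fellows whom it hurts to thing.",
        "Look into the pewter pot",
        "To see the world as the world's not.",
    ]
    second = [
        "Ale, man! Ale's the stuff to drink!",
        "For fellows whom it hurts to think.",
        "Look into the pewter pot",
        "To see the globe as the globe's not.",
    ]
    print("\n".join(merge2(first, second)))
```

Run on the two Housman variants, this sketch should give the same interleaving as the listing above.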
The Light Dawns
So the revelation was to apply merge2 three times.
- Merge2 the common ancestor with the current version in branch A.
- Merge2 the common ancestor with the current version in branch B.
- Merge2 the results of merge #1 and #2!
Mix this in with a little bit of tweaking of the first columns to mark
the origins of the changes, and suddenly everything becomes clear.
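The three steps can be sketched in a few lines of Python. This is an illustration under assumptions, not the actual mergeFork: difflib does the line matching, the names merge2 and merge_fork and their parameters are made up for the sketch, and (unlike the real merge2) it does not ignore indentation.

```python
import difflib


def merge2(old, new, old_tag, new_tag):
    """Interleave two line lists; a one-character tag in column 1 marks origin."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    out = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            out += [" " + s for s in old[i1:i2]]          # in both inputs
        else:
            out += [old_tag + s for s in old[i1:i2]]      # only in 'old'
            out += [new_tag + s for s in new[j1:j2]]      # only in 'new'
    return out


def merge_fork(ancestor, branch_a, branch_b):
    """Three-way view: two-character prefix says where each line came from."""
    # Merges #1 and #2: ancestor against each branch.
    # "0" marks an ancestor line the branch changed, "1" the branch's new line.
    merged_a = merge2(ancestor, branch_a, "0", "1")
    merged_b = merge2(ancestor, branch_b, "0", "1")
    # Merge #3: compare the two tagged results. Lines found in only one of
    # them get an "A" or "B" in front of the tag they already carry.
    return merge2(merged_a, merged_b, "A", "B")


if __name__ == "__main__":
    ancestor = ["alpha", "beta", "gamma"]
    branch_a = ["alpha", "BETA", "gamma"]           # A fixes line 2
    branch_b = ["alpha", "beta", "gamma", "delta"]  # B appends a line
    print("\n".join(merge_fork(ancestor, branch_a, branch_b)))
```

The "tweaking of the first columns" falls out of the third pass: each line already carries a one-character tag from the first two merges, so the final merge only has to put the branch letter in front of it.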
Example
Here's an example. The original ancestor has a bunch of typos
and mis-rememberings (shown in red):
Oh many a peer of England brews
Livelier licquor than the Muse,
And malt does more than Milton can
To justify god's ways to man.
Ale man, ale's the stuff to drink (missing comma)
For fellows whom it hurts to thing:
Look into the pewter pot
To see the globe as the globe's not.
Here's the result of person A's corrections, in blue:
Oh many a peer of England brews
Livelier liquor than the Muse,
And malt does more than Milton can
To justify God's ways to man.
Ale man, ale's the stuff to drink
For fellows whom it hurts to thing:
Look into the pewter pot
To see the globe as the globe's not.
Person B isn't as good at spelling, but he did remember a few more lines:
Oh many a peer of England brews
Livelier licquor than the Muse,
And malt does more than Milton can
To justify god's ways to man.
Ale, man, ale's the stuff to drink
For fellows whom it hurts to think:
Look into the pewter pot
To see the world as the world's not.
And faith, 'tis pleasant till 'tis past:
The mischief is that 'twill not last.
Now, here's the result when run through 'mergeFork',
which does the previously-described three-way merge.
Only now the first two columns indicate the source
of the changes:
- "A0" or "B0" means the common ancestor.
- "A " or "B " means the line is in both the ancestor and the current
version of A's or B's file
(i.e. this line did not change in A's or B's edits)
- "A1" means A's edits
- "B1" means B's edits
With this information, the proper merge jumps out pretty easily:
  Oh many a peer of England brews
A0Livelier licquor than the Muse,
A1Livelier liquor than the Muse,
B Livelier licquor than the Muse,
  And malt does more than Milton can
A0To justify god's ways to man.
A1To justify God's ways to man.
A Ale man, ale's the stuff to drink
A For fellows whom it hurts to thing:
B To justify god's ways to man.
B0Ale man, ale's the stuff to drink
B0For fellows whom it hurts to thing:
B1Ale, man, ale's the stuff to drink
B1For fellows whom it hurts to think:
  Look into the pewter pot
A To see the globe as the globe's not.
B0To see the globe as the globe's not.
B1To see the world as the world's not.
B1And faith, 'tis pleasant till 'tis past:
B1The mischief is that 'twill not last.
Usage
MergeFork merges (one file at a time) from two checked-out Subversion
branches.
You have to know the revision number of the common ancestor, and the
current revision numbers of the two branches you are merging.
MergeFork automatically launches 'vi' on the merged result.
Here's "usage" text displayed when running mergeFork w/o arguments:
Usage: mergeFork revBase branchDirA revA branchDirB revB filename
For example:
Checked-out trunk in directory 'trunk'.
Checked-out branch alpha in directory 'alpha'.
Trying to merge a fork in a file stuff.java.
Alpha branched at rev 100, now at rev 120.
Trunk is at rev 130.
mergeFork 100 alpha 120 trunk 130 stuff.java
produces merged file 'm', each line has 2 leading chars:
A0 means original (rev 100) line from alpha
A1 means changed line in alpha.
B0,B1 similarly for trunk.
A means unchanged line in alpha that is not in trunk
B similarly for trunk
0 means original in both alpha and trunk
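To make the arguments concrete, here is a guess at the file versions a mergeFork-like script would need to pull out of Subversion. I am assuming it uses 'svn cat -r REV PATH' (a real svn subcommand) on each checked-out directory. The actual script may well fetch them differently, and the helper name and the dictionary keys are invented for this sketch.

```python
def svn_cat_commands(rev_base, dir_a, rev_a, dir_b, rev_b, filename):
    """Build (but do not run) the 'svn cat' commands for the four file versions.

    Hypothetical helper: the keys mirror the A0/A1/B0/B1 prefixes in the
    merged output, which is an assumption about how the versions map.
    """
    def cat(rev, branch_dir):
        # 'svn cat -r REV PATH' prints the file's contents as of revision REV.
        return ["svn", "cat", "-r", str(rev), f"{branch_dir}/{filename}"]

    return {
        "A0": cat(rev_base, dir_a),  # ancestor version, seen from branch A
        "A1": cat(rev_a, dir_a),     # current version in branch A
        "B0": cat(rev_base, dir_b),  # ancestor version, seen from branch B
        "B1": cat(rev_b, dir_b),     # current version in branch B
    }


if __name__ == "__main__":
    commands = svn_cat_commands(100, "alpha", 120, "trunk", 130, "stuff.java")
    for label, argv in commands.items():
        print(label, " ".join(argv))
```

Each argument list could be handed to subprocess.run(argv, capture_output=True, text=True) and the output split into lines for merging. If you don't know where a branch started, 'svn log --stop-on-copy' run in the branch directory stops at the revision that created it.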
Downloads
Feedback
...is welcome, at roth@thedance.net.
I hope this helps, pass it on!