Re: linux-2.5.4-pre1 - bitkeeper testing

Larry McVoy (lm@bitmover.com)
Mon, 11 Feb 2002 14:14:04 -0800


On Mon, Feb 11, 2002 at 12:20:57AM -0800, Josh MacDonald wrote:
> Bounding the chain length is easy, it just means that instead of
> storing 1000 deltas in a chain you store 50 fully-expanded versions
> and 50 delta chains of max length 20. Of course this means a little
> extra storage, but really how much? Observe that storing 50 out of
> 1000 versions is only 5%, which is pretty good as delta-compression
> ratios go. A typical delta is usually larger than 5% of the file
> size. I won't bore you all by carrying out the math, it can easily be
> found in either my report or his. The point is that bounding the
> chain length by introducing full copies every once in a while does not
> dramatically hurt your compression ratio.

How about some numbers which contrast your claims? Here are the diff
sizes for all the changes in Linus' BK tree. Note that he is importing
patches which may actually be bigger than your typical checkin, but
no matter, the point stands even if they represent exactly one checkin
per patch.

In this tree, at least, a typical delta is less than .63% of the file
size. And if you are measuring against the revision history size,
then your numbers are even more off.

2198 >= 20.0000000%
1647 >= 10.0000000%
2240 >= 5.0000000%
2508 >= 2.5000000%
2879 >= 1.2500000%
2983 >= 0.6250000%
2962 >= 0.3125000%
2564 >= 0.1562500%
1581 >= 0.0781250%
919 >= 0.0390625%
400 >= 0.0195312%
162 >= 0.0097656%
50 >= 0.0048828%
12 >= 0.0024414%
105 >= 0.0012207%

-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/