It would be nice to make darcs record (and pull, and apply) take O(1) time,
rather than being O(N) in the number of patches in the repository. The problem
is that when we write out the new the hashed inventory, we have to hash all the
older inventories, even though we haven't modified them. If we simply cache
the hash of each inventory as we read it (in the PatchSet), and remove the hash
if we modify it, then we can make writing out the new hashed inventory O(1)...
well, actually, O(number of patches since last tag).
On my fast computers, this doesn't seem like a big issue, but when running darcs
on the darcs repository, it takes a few seconds to write the inventories, so
it's well worth fixing--it'll only be worse on a repository with a long history.
This is unfortunately quite an invasive change, since it has to change the type
of PatchSet p, which is currently just RL (RL PatchInfoAnd p). As an example,
you could look at PatchInfoAnd, which caches the hash of each patch in the same
way that we'd like to cache the hash of each tag inventory.
I imagine something like
data PatchSet p C(z) where
PatchSet :: Maybe String -> RL (PatchInfoAnd p) C(y z)
-> PatchSet p C(y) -> PatchSet p C(z)
The tricky part will be writing a safe interface, so that we can't accidentally
modify a PatchSet without setting its hash value to Nothing, but that is
sufficiently expressive that it will be practical to convert the entire code to
use it. For read-only purposes, of course, and as a transition, we can easily
write converters to and from RL (RL (PatchInfoAnd p)), which perhaps will allow
a gradual conversion, with unfixed code just running more slowly (at its current
pace).
David
|