Issue 2605 darcs crashes with duplicate patches

Title darcs crashes with duplicate patches
Priority Status unknown
Milestone Resolved in
Superseder Nosy List bf
Assigned To

Created on 2018-10-28.16:26:27 by bf, last changed 2018-11-09.08:23:13 by bf.

msg20428 (view) Author: bf Date: 2018-10-28.16:26:25
Note that this is /not/ about Duplicate patches as in RepoPatchV2. This bug 
manifests itself with darcs-1 and darcs-2 patches.

The test is simple and is based on the scenario described in Appendix B of 
the camp paper:

In separate repos create patches A1 and A2 that both make the same 
identical change. Record B on top of A1 such that A1:>B does not commute. 
Clone that and pull A2 with --reorder-patches and --allow-conflicts, giving 
A2:>A1:>B. Then obliterate A1. We now have two repos R1=A1:>B and R2=A2:>B, 
where in each of the two repos B "depends" on A1/A2. However, B really 
depends on the /content/ of A1/A2, not on the /name/ of the patches, which 
is why we could obliterate A1 in the first place.

If we then try to merge R1 and R2, Darcs crashes because it cannot commute 
the common B with either A1 or A2.
msg20431 (view) Author: bf Date: 2018-10-28.18:36:53
In the camp paper Ian draws the conclusion that we need to add "names" 
(universally unique identities) to prim patches. He claims that otherwise "the simple natural merge algorithm breaks down in the 
presence of duplicate patches". This is correct. However, I don't 
agree with the conclusion. Instead I think we should extend the merge 
algorithm so that it can cope with such situations, for two reasons:

First, Darcs has /always/ allowed us to create situations like that. 
Adding identities to prim patches would mean that we can't fix this 
until we make yet another incompatible change to the patch format.

Second, being able to replace an implicit dependency with another one 
that makes the same change is a very useful feature.

To fully implement that feature we must accept that we cannot define a 
'merge' function for sequences of named patches with the same simple 
type as for unnamed patches. If we can't pull (all) common patches to 
the common past, then the result of merging ps:\/:qs no longer has the 
simple form (qs':/\:ps'). In general we can only expect it to give us 
a /partial/ merge (as:\/:bs) -> (bs':/\:as'), where the as and bs are 
initial stretches of ps and qs up to the first common patch. We have 
to return whatever remains of both sequences, too. With the remainders 
we can now (again) compute a Fork with all common patches that /can/ 
be commuted to the heads and the remaining  diverging sequences, for 
which we do the next partial merge, and so on, until both lists are 

Implementing that is not difficult. The main problem is that a lot of 
the command implementations need to be adapted to this new merging 
scheme. In particular, I am currently unsure if and how it would 
interact with patch selection.
msg20432 (view) Author: bf Date: 2018-10-31.18:15:35
My last comment was incorrect and too pessimistic. In fact our merge
function for lists of named patches (merge2FL) already deals with
duplicate patches just fine -- as long as they can be commuted to a
common initial segment. The algorithm can be easily fixed to no longer
depend on that without changing the type of the function. I have a patch
that solves this issue, including a (minor) extension to patch
selection, so that patches common to remote and local repos won't be
offered to the user in the apply, push, pull, and send commands.
msg20452 (view) Author: bf Date: 2018-11-09.08:23:11
See patch1758.
Date User Action Args
2018-10-28 16:26:27bfcreate
2018-10-28 18:36:56bfsetmessages: + msg20431
2018-10-31 18:15:37bfsetmessages: + msg20432
2018-11-09 08:23:13bfsetmessages: + msg20452