Issue 27: patch ids are not collision-free

Title	patch ids are not collision-free
Priority	bug	Status	resolved
Milestone		Resolved in
Superseder		Nosy List	Serware, darcs-devel, dmitry.kurochkin, gwern, kowey, markstos, nwf, thorkilnaur, tommy, zooko
Assigned To	droundy	Topics

Created on 2005-11-29.02:44:14 by zooko, last changed 2009-10-23.23:27:51 by admin.

Messages
msg99 (view)	Author: zooko	Date: 2005-11-29.02:44:13
I just got bitten by this bug again. The bug is that darcs relies on gettimeofday to make patch names collision-free, and this is not always sufficient. In particular, when one is playing with automated usage of darcs, such as tailor, deps.pl, and such like, multiple darcs records might be issued within the same fraction of a second on the same machine by the same user. When patches like that get created, then later darcs generates internal errors like this one when it finds two such patches in the same repo: "Fail: bug in get_extra commuting patch:" As has been discussed more than once on the mailing list, adding a sufficient number of random bytes would be sufficient to make patch ids collision free. Every major platform today offers the equivalent of /dev/urandom.
msg100 (view)	Author: zooko	Date: 2005-11-29.02:48:25
Alternatively, instead of making patch ids be collision free by randomness, we could hash in the entire body of the patch, so that patch ids would collide if and only if the author, timestamp, log, and body were identical. That would be perfect for tailor, and probably wouldn't cause a problem for anyone else.
msg101 (view)	Author: beschmi	Date: 2005-11-29.03:39:13
Using the body (patch representation) is problematic, because the same patch can have different bodies [1]. You could use the initial representation when the patch is recorded. Then you would have to store it somewhere because you can't simply recalculate the id without the old body when the body has changed. The only problem with the random ID i see is when two people want to record an identical patch (eg converting the same upstream repo with tailor using the CVS commit date) and get different random bytes. Then darcs can't recognize that the two patches are intended to be identical and you get conflicts. But you can always add a flag to darcs that lets tailor supply a string that is somehow derived from the contents of the commit instead of a random string in this special case. [1] http://www.abridgegame.org/pipermail/darcs-users/2005-November/008962.html
msg102 (view)	Author: zooko	Date: 2005-11-29.11:47:12
We could store just the patch id rather than the original body, as the patch id doesn't need to be recomputed. However, I no longer like this approach because it would allow for collisions among patches that ought to be distinct. So I urge darcs to take the simpler solution of adding randomness to the patch metadata.
msg104 (view)	Author: droundy	Date: 2005-11-29.12:44:04
On Tue, Nov 29, 2005 at 03:39:13AM +0000, Benedikt Schmidt wrote: > The only problem with the random ID i see is when two people want to > record an identical patch (eg converting the same upstream repo with > tailor using the CVS commit date) and get different random bytes. Then > darcs can't recognize that the two patches are intended to be identical > and you get conflicts. But you can always add a flag to darcs that lets > tailor supply a string that is somehow derived from the contents of the > commit instead of a random string in this special case. I think that this is enough of a problem that using a hash of content rather than random data makes a bit more sense. As Zooko points out, we can store it the same as we'd store the random data. We certainly could have a different option (for people who often record identical patches with the same patch name in the same second). The old idea was to stick it in the least significant digits of the date (and ignore it for date use purposes), so that older darcs would still be able to handle it. I still think this is a reasonable approach. Another approach would be to just check the latest patch to see if it has the same date and the same ID. Then keep looking back until you hit one with a different date. It's a lot cheaper than reading all the patches to make sure your new patch is unique, and gives a stronger guarantee of uniqueness (provided the same author isn't simultaneously recording patches with the same name in another repository--and at that stage I'll assert that the author is being silly), and doesn't require that we muck with the patch ID format. -- David Roundy http://www.darcs.net
msg108 (view)	Author: zooko	Date: 2005-11-29.14:29:54
> I think that this is enough of a problem that using a hash of content > rather than random data makes a bit more sense. As Zooko points out, we > can store it the same as we'd store the random data. We certainly could > have a different option (for people who often record identical patches with > the same patch name in the same second). My objection to this is somewhat weak, but there could be two patches with different identical bodies but different meaning due to different contexts which would get identical patch ids. However, I wouldn't object too strongly to the "hash the body" approach, because it would allow a certain kind of limited accidental convergence, especially for uses like tailor's. > The old idea was to stick it in the least significant digits of the date > (and ignore it for date use purposes), so that older darcs would still be > able to handle it. I still think this is a reasonable approach. Yes! Backwards compatibility makes this easier. Let's do it! > Another approach would be to just check the latest patch to see if it has > the same date and the same ID. Then keep looking back until you hit one > with a different date. It's a lot cheaper than reading all the patches > to make sure your new patch is unique, and gives a stronger guarantee of > uniqueness (provided the same author isn't simultaneously recording patches > with the same name in another repository--and at that stage I'll assert > that the author is being silly), and doesn't require that we muck with the > patch ID format. The three colliding patches that I generated overnight last night were generated in three separate repositories. Regards, Zooko
msg616 (view)	Author: zooko	Date: 2006-04-11.18:38:41
This bug bit me again today. I have a script (based on tailor) that automatically syncs between darcs and svn repositories. It generated two identical patchnames within the same second and gave me a corrupted repository that cannot be repaired with "darcs repair", but will instead require me to do manual surgery on the repository: zooko@hanford:/home/darcs/trunk-tailorizer$ time darcs repair -v -v -v Checking that patch names are unique... Error! Duplicate patch name: Mon Apr 3 12:13:50 PDT 2006 zooko * [4346]: deleted a random line to test automated svn->darcs sync
msg2573 (view)	Author: markstos	Date: 2008-01-19.02:43:13
David, Do the the hashed or darcs-2 repo formats resolve this issue? Mark
msg2597 (view)	Author: droundy	Date: 2008-01-19.13:32:24
No, the new formats don't help this, as darcs still relies on the path IDs being unique.
msg2833 (view)	Author: zooko	Date: 2008-01-28.14:33:17
I thought that the darcs-2 format defined the patch id as being the hash of the patch contents: http://lists.osuosl.org/pipermail/darcs-devel/2008-January/007021.html So therefore the ids of different patches are necessarily different. What am I missing?
msg2835 (view)	Author: droundy	Date: 2008-01-28.15:17:15
On Mon, Jan 28, 2008 at 02:33:19PM -0000, Zooko wrote: > I thought that the darcs-2 format defined the patch id as being the hash of the > patch contents: > > http://lists.osuosl.org/pipermail/darcs-devel/2008-January/007021.html > > So therefore the ids of different patches are necessarily different. > > What am I missing? No, that is just the filename of the patch. The patch id is unchanged. -- David Roundy Department of Physics Oregon State University
msg2846 (view)	Author: zooko	Date: 2008-01-28.15:56:07
On Jan 28, 2008, at 8:17 AM, David Roundy wrote: > On Mon, Jan 28, 2008 at 02:33:19PM -0000, Zooko wrote: >> I thought that the darcs-2 format defined the patch id as being >> the hash of the >> patch contents: >> >> http://lists.osuosl.org/pipermail/darcs-devel/2008-January/ >> 007021.html > No, that is just the filename of the patch. The patch id is > unchanged. I'm sorry if I'm being dense, but how can this be consistent with your earlier statement that "All data in the hashed format is hashed." (in the message referenced above), and the consequent idea that using a secure hash function could therefore facilitate "signing" of repositories? Is there a document that I can read to educate myself about this design? Regards, Zooko
msg3779 (view)	Author: zooko	Date: 2008-03-06.13:56:01
I just tried to convert the history of (the trunk of) twistedmatrix.com from SVN into darcs, and this bug -- patch ids are not collision-free -- prevents twisted from being importable into darcs. There are two patches in twisted's history with the same name, timestamp (to the second, at least), and author, but different contents.
msg3789 (view)	Author: markstos	Date: 2008-03-06.22:45:30
Zooko, I'm not surprised the issue persists, since it was never fixed. It can be worked around with an import tool that notices this situation and tweaks the time stamp by a second.
msg3881 (view)	Author: zooko	Date: 2008-03-13.13:28:48
Another person just got on IRC and said he had this problem, converting a project from CVS. Personally, I think that this is a bug in darcs, and I'm loathe to build work-around support for it in other tools. On the other hand, I think that David Roundy's current plan to add SHA-256 hashes of patches contents and context will fix this problem...
msg3899 (view)	Author: droundy	Date: 2008-03-17.22:05:21
On Thu, Mar 13, 2008 at 9:28 AM, Zooko <bugs@darcs.net> wrote: > Another person just got on IRC and said he had this problem, converting a > project from CVS. > > Personally, I think that this is a bug in darcs, and I'm loathe to build > work-around support for it in other tools. On the other hand, I think that > David Roundy's current plan to add SHA-256 hashes of patches contents and > context will fix this problem... Given that this is a bug which is only triggered by tailor (or perhaps almost only?), I think it's reasonable to put the workaround in there rather than in darcs. The potential darcs feature is really only a workaround for the behavior of tailor, in my opinion. A correct fix would be to crash with an error message when you try to record the second patch with the same date, author and name. David
msg3901 (view)	Author: zooko	Date: 2008-03-17.22:21:57
There have been several sightings of this problem, which fall into two categories: 1. Tailor, or another automated tool which creates darcs patches, created two with the same date, author, and name. 2. Some revision control history already exists in a different revision control format which has patches like this. Tailor then tries to create the equivalent darcs patches. These are two different cases, but I believe that both of them are valid. There is no a priori reason why a revision control system can require of its users that each "date, author, description" tuple gets used only once in history. Anyway, won't the planned feature of including secure hashes of patch contents (plus the planned feature of darcs-2-format converging equal patches) fix this? Regards, Zooko
msg3904 (view)	Author: markstos	Date: 2008-03-18.13:27:48
While it would be OK with me if this was addressed with darcs, I agree with David that it is appropriate that it be addressed in tailor, and similar tools. Tailor needs to understand each revision control system to convert between them. In this case, tailor needs to understand this facet of darcs and behave a little better. For the moment I see it as lower priority than the 2.0 release, unless it's relatively easy and involves a repo format change which would better to do now .
msg5749 (view)	Author: kowey	Date: 2008-08-28.08:37:17
So the issue that Zooko reported was that you could generate two different patches in a single repository with identical patch ids (which can confuse darcs terribly). I think we should consider http://bugs.darcs.net/issue1026 as another manifestation of the some fundamental problem (patch ids are not collision free). In issue1026, you have two unrelated repositories that share a common comment history (e.g. two different directories in a single CVS repositories). If you automatically convert these repositories to darcs using something like tailor, you can produce two different patches with the same patchinfo. Summarising the discussion below, both issues can apparently be resolved by computing a hash of the patch (and its context?) upon creation and storing it in the least significant digits of the date field. (Ian also suggests that within the same repository we should also count the uses of a patch name, to be extra safe). If I understand correctly, this does not entail a repo format change, and will decrease the probability of a patch id collision to be small enough that we can close this ticket. I'm marking it this as critical for darcs 2.1
msg5755 (view)	Author: nwf	Date: 2008-08-28.10:18:36
Is it impossible to use the hashed repository format for a patch info hash? I realize it couldn't be stored inside the file, but that seems OK, since you can grab it from the HashedIO layer itself. For non-hashed repositories, this might be a performance issue (though for non-hashed repositories, you could presumably append an optional field to the patch header containing the hash (and strip it back off on darcs convert) to mitigate this particular problem... or just convert to hashed). That is: I don't understand the "hide it in the date field" proposal's appeal, and the above seems backwards compatible... though I am still not terribly comfortable with this part of darcs's internals.
msg5756 (view)	Author: kowey	Date: 2008-08-28.10:37:40
> Is it impossible to use the hashed repository format for a patch info hash? I > realize it couldn't be stored inside the file, but that seems OK, since you can > grab it from the HashedIO layer itself. As I understand it, this needs to be something that is stored in the patch file itself and not recomputed on the fly. If I pull or push a patch from you, the contents and consequently its file hash may change through commutation. Does that make sense?
msg5758 (view)	Author: nwf	Date: 2008-08-28.11:25:47
On Thu, Aug 28, 2008 at 10:37:43AM -0000, Eric Kow wrote: > > Eric Kow <eric.kow@gmail.com> added the comment: > > > Is it impossible to use the hashed repository format for a patch info hash? I > > realize it couldn't be stored inside the file, but that seems OK, since you can > > grab it from the HashedIO layer itself. > > As I understand it, this needs to be something that is stored in the > patch file itself and not recomputed on the fly. If I pull or push a > patch from you, the contents and consequently its file hash may change > through commutation. If it really has to be stored in the file, why can't we just add another field to the PatchInfo structure at the top of each patch? Flag the change in _darcs/format so old versions don't puke unhelpfully... this seems preferable (though I obviously don't care so much about flag days) to changing the semantics of the date field. > Does that make sense? No; why would a patch object (serialized PatchInfoAnd, essentially) ever change during communication? If it could, it would seem that the global cache would be substantially less useful as several forms of the patch may all be in cache needlessly... Oh... I have a sneaking suspicion that the answer is that darcs rewrites named patches in the presence of conflictors... in this case, though, it would seem that the rewritten version _should_ have a different unique-ifying hash than the original? Thanks for your answers. --nwf;
msg5759 (view)	Author: kowey	Date: 2008-08-28.12:14:51
Hi, 2008/8/28 Nathaniel Filardo <bugs@darcs.net>: >> As I understand it, this needs to be something that is stored in the >> patch file itself and not recomputed on the fly. If I pull or push a >> patch from you, the contents and consequently its file hash may change >> through commutation. > > If it really has to be stored in the file, why can't we just add another > field to the PatchInfo structure at the top of each patch? Flag the change > in _darcs/format so old versions don't puke unhelpfully... this seems > preferable (though I obviously don't care so much about flag days) to > changing the semantics of the date field. Oh we could do this, but the appeal of packing this information into the date field is that it would be completely backwards compatible -- for purposes of getting the patch date, I think darcs only pays attention to the first 14 digits, eg 20080828124937. In fact, we could certainly change the PatchInfo code itself to add this field, and only do the actual on-file storage (and parsing) in the date field. It's not always possible to retain backwards compatibility, which is why it's nice to have _darcs/format as a means for older darcs to detect forward-compatibility issues; but I suppose it would be good if we can avoid resorting to it. >> Does that make sense? > > No; why would a patch object (serialized PatchInfoAnd, essentially) ever > change during communication? So here I'm talking about the patch itself and not the patchinfo. Well, it's really more basic than what you are suspecting. Say Arjan and Ganesh are starting from the same repository and editing a file "old". They make simultaneous changes which must be merged. Arjan darcs mv's "old" to "new" and Ganesh adds some hunks to "old". We'll call Ganesh's patch G. When Arjan pulls G so that it applies on top of his mv patch, G must be adjusted, i.e. replacing instances of "old" with "new". Do you agree with these core bits of patch theory? In practise, darcs stores Arjan's version of G (Ga) on disk. In principle it could have stored a canonical version of G instead, based on some minimal context, but that would mean that it would have to do extra work each time bringing G back up to his current context. Alternatively, we could store both the canonical version AND Arjan's current version (and hey, maybe this sort of redundancy would be useful in other ways). > If it could, it would seem that the global > cache would be substantially less useful as several forms of the patch may > all be in cache needlessly... Well, in practice you don't actually commute all that many patches, or the fact that you have commuted them doesn't really matter. Say we start from the repository with the three patches Prior X Y. If for some reason you have to commute the latter patches, Prior Y' X', yes it is true that you may get extra entries in the cache for Y' and X'. The good news is that if we both add the same patch on top of this (Prior X Y Z and Prior Y' X' Z), the new patch Z has the same exact representation (commuting X Y to Y' X' has no effect on anything outside of X and Y) and so it gets usefully cached. As an example, consider the darcs darcs repository. We can at least guarantee that all patches up to the 2.0.2 tag are cached. By itself the caching of that stuff (Prior) should be quite useful. Your repository may diverge from darcs.net if you made some local patches and then pulled some of our stuff, so you may lose some caching there. Furthermore, if your patches /do/ eventually make it into darcs.net (our repositories reconverge) any patches on top of the converged point will be identical and so caching becomes useful again. Also, in practice, commutation does not necessarily affect the patch contents. For example, patches that touch completely unrelated files commute without either patch having to change representation. > Oh... I have a sneaking suspicion that the answer is that darcs rewrites named > patches in the presence of conflictors... in this case, though, it would > seem that the rewritten version _should_ have a different unique-ifying hash > than the original? Well, the answer is that darcs almost /always/ rewrites patches when it has to commute something :-) Anyway, there is a chance that I am getting in over my head here, so I may have to defer to more knowledgeable darcs hackers from now on,
msg5764 (view)	Author: droundy	Date: 2008-08-28.13:34:26
On Thu, Aug 28, 2008 at 12:14:54PM -0000, Eric Kow wrote: > 2008/8/28 Nathaniel Filardo <bugs@darcs.net>: > >> As I understand it, this needs to be something that is stored in the > >> patch file itself and not recomputed on the fly. If I pull or push a > >> patch from you, the contents and consequently its file hash may change > >> through commutation. > > > > If it really has to be stored in the file, why can't we just add another > > field to the PatchInfo structure at the top of each patch? Flag the change > > in _darcs/format so old versions don't puke unhelpfully... this seems > > preferable (though I obviously don't care so much about flag days) to > > changing the semantics of the date field. > > Oh we could do this, but the appeal of packing this information into > the date field is that it would be completely backwards compatible -- > for purposes of getting the patch date, I think darcs only pays > attention to the first 14 digits, eg 20080828124937. In fact, we > could certainly change the PatchInfo code itself to add this field, > and only do the actual on-file storage (and parsing) in the date > field. > > It's not always possible to retain backwards compatibility, which is > why it's nice to have _darcs/format as a means for older darcs to > detect forward-compatibility issues; but I suppose it would be good if > we can avoid resorting to it. For what it's worth, I wouldn't choose to pack it into the date, I'd just stick it into the long comment (which would retain backwards compatibility), and then we could make newer darcs hide it. It doesn't matter what junk we stick in there, it could just as well be a large random number. But I don't really like the idea, as I think I may have mentioned--although perhaps that was on a duplicate of this bug report? The thing is that tailor can do this itself, and can make the junk something meaningful, and this issue cannot bite people who are using darcs in a reasonable non-automated manner, since you'd have to have a human record two patches within a second with the same patch name. Or two people recording two patches at the same time using the same patch name and pretending to be the same person. David
msg5767 (view)	Author: kowey	Date: 2008-08-28.14:42:45
> For what it's worth, I wouldn't choose to pack it into the date, I'd > just stick it into the long comment (which would retain backwards > compatibility), and then we could make newer darcs hide it. That could work too. I just figured that the date approach would involve the least hackery. > It > doesn't matter what junk we stick in there, it could just as well be a > large random number. Yep! I heartily agree. > But I don't really like the idea, as I think I > may have mentioned--although perhaps that was on a duplicate of this > bug report? Ah yes, you did say this in http://bugs.darcs.net/msg3899 > The thing is that tailor can do this itself, and can make the junk > something meaningful, and this issue cannot bite people who are using > darcs in a reasonable non-automated manner, since you'd have to have a > human record two patches within a second with the same patch name. Or > two people recording two patches at the same time using the same patch > name and pretending to be the same person. I think it's useful to practise defensive programming here. Who knows what other tools people might try to write (for example, Tortoise darcs)? Having the same patch name is actually very probable in some uses of darcs. Some people are fond of just calling their minor patches 'Wibble' -- hmm maybe we can call this the "Wibble" problem -- so you could imagine somebody doing something like: darcs record -am Wibble foo darcs record -am Wibble bar Well, that seems rather silly, but it isn't entirely far-fetched.
msg5768 (view)	Author: kowey	Date: 2008-08-28.14:53:30
>> It >> doesn't matter what junk we stick in there, it could just as well be a >> large random number. > > Yep! I heartily agree. Oh wait, I don't agree quite so heartily. I guess we'd still want something reasonably deterministic, but otherwise, it doesn't really matter.
msg5769 (view)	Author: droundy	Date: 2008-08-28.15:08:47
On Thu, Aug 28, 2008 at 02:42:48PM -0000, Eric Kow wrote: > > For what it's worth, I wouldn't choose to pack it into the date, I'd > > just stick it into the long comment (which would retain backwards > > compatibility), and then we could make newer darcs hide it. > > That could work too. I just figured that the date approach would > involve the least hackery. I think that the opposite is true. It's much nicer to put the data in the current free-form section of the patch info, rather than in a section that older darcs tries to parse. > > But I don't really like the idea, as I think I may have > > mentioned--although perhaps that was on a duplicate of this bug > > report? > > Ah yes, you did say this in http://bugs.darcs.net/msg3899 > > > The thing is that tailor can do this itself, and can make the junk > > something meaningful, and this issue cannot bite people who are > > using darcs in a reasonable non-automated manner, since you'd have > > to have a human record two patches within a second with the same > > patch name. Or two people recording two patches at the same time > > using the same patch name and pretending to be the same person. > > I think it's useful to practise defensive programming here. Who knows > what other tools people might try to write (for example, Tortoise > darcs)? > > Having the same patch name is actually very probable in some uses of > darcs. Some people are fond of just calling their minor patches > 'Wibble' -- hmm maybe we can call this the "Wibble" problem -- so you > could imagine somebody doing something like: > darcs record -am Wibble foo > darcs record -am Wibble bar > > Well, that seems rather silly, but it isn't entirely far-fetched. But darcs record has a one-second delay built into it (to keep the timestamps of the working directory from matching those in the pristine cache), so unless they're doing something very odd (or their computer's clock is broken somehow?), this won't cause any trouble. The problem really is (in practice) only with programs that use the --pipe interface to manually set the date. David
msg5770 (view)	Author: droundy	Date: 2008-08-28.15:09:45
Setting this as not ReleaseCritical, since I can't see how something can both be release-critical and almost wont-fix.
msg5771 (view)	Author: zooko	Date: 2008-08-28.15:31:57
> In practise, darcs stores Arjan's version of G (Ga) on disk. In > principle it could have stored a canonical version of G instead, based > on some minimal context, but that would mean that it would have to do > extra work each time bringing G back up to his current context. > Alternatively, we could store both the canonical version AND Arjan's > current version (and hey, maybe this sort of redundancy would be > useful in other ways). There are three different properties that patch ids might have. Let's examine each in turn. property 1: collision-resistance It is necessary for correctness (in my view) that the mapping from patch to patchid is injective -- that there is at most one patch which can produce a given patchid. A.k.a. patch ids are "collision- free" or "collision-resistant". property 2: convergence It is not necessary, but it might turn out to be useful, if the mapping is "convergent" in the sense that separate computations which start with the same patch result in the same patchid. This might turn out to be useful if we wanted convergence as a feature -- two people doing "darcs record" on the same textual changes both get the "same" darcs patch. It might also be useful in the future for efficiency, as Nathaniel Filardo observed, by reducing the number of patch ids that darcs has to keep track of. property 3: self-authenticating In the future I want, given a patchid that I got from a trusted source and a patch that I got from an untrusted source, to be able to verify that the patch matches the patch id, so that I know that the trusted source who gave me the patchid would accept this patch as an acceptable patch for that patchid. This 3rd property is potentially useful in issue992 -- short, secure, fast version identifiers. http://bugs.darcs.net/issue992 Here are a few ways to do it which have different combinations of these properties. Please see the end of this note, which argues that throwing in timestamps, "count of patches by this name" and other stuff helps with none of these properties. option 1: patchids are randomly generated 192-bit numbers This is definitely collision-resistant (up to the negligible chance of an accidental collision), and it has no convergence -- any time anyone computes a patchid from a patch, they get a new patchid. Observe that this is not a problem in current darcs because the patchid is included with the patch whenever the patch is transmitted. It has no self-authentication. option 2: patchids are collision-resistant hashes of the patch and the entire current context This is still collision-resistant (at least as much as the hash function we use), and it has convergence for the case that two people are computing patchids of the same patch in the same context. It is self-authenticating, but only if the person doing the checking has access to the entire context, and can figure out exactly which context was used. (See Nathaniel Filardo's patch to implement short, secure, fast version identifiers.) option 3: patchids are collision-resistant hashes of the patch and the minimal context This is still collision-resistant, and it has more convergence: now two people who are computing a patch id for the same patch, even in the presence of extraneous other patches which are not part of the minimal context. It is self-authenticating, and it might turn out to be easier to actually check in practice, since the checker knows how to find the minimal context of the patch in order to use that minimal context in the check. option 4: patchids are collision-resistant hashes of just the text of the patch, devoid of context This seems problematic to me, because then darcs might get confused about two textually identical patches which were created in disjoint contexts. Let us speak no more of it in this note. Now, observe that once you've chosen one of these three basic strategies, then throwing in the timestamp, the count of the patches by this name, and so forth can't help and can hurt: it does not increase collision-resistance (since all of these proposals except the weird 4th one already have collision-resistance), and it does not help with self-authentication, but it might hurt with self- authentication (if the added data that gets thrown in is not available at check-time or if it isn't clear which data elements need to be used in the check). Finally, it can't help and can hurt with convergence, since it is never the case that we want two patches which are otherwise identical to be treated as distinct patches due only to their timestamp or count. The same argument can be made about author name and patch description. Perhaps it is the case that we want two patches which are identical in all ways except for their patch comment to be treated as identical patches. Or perhaps we want them to be treated as different. I'm not sure. But I am sure that we should not throw data into the secure hash function unless we are sure that we want to cause divergence based on that data. By the way, I'm currently in favor of strategy #2, because #3 requires more computational work when recording a patch, and because strategy #1 doesn't seem to facilitate issue992 as well. However, I think any of the three strategies are okay. Regards, Zooko
msg5772 (view)	Author: kowey	Date: 2008-08-28.15:47:18
> But darcs record has a one-second delay built into it (to keep the > timestamps of the working directory from matching those in the > pristine cache) Hmm. I just did the Wibble test (no automation except for doing two records at once with a semi-colon) and got a Duplicate patch error. darcs init touch a b darcs add a b echo a > a echo b > b darcs record -am Wibble a; darcs record -am Wibble b darcs check Also, I didn't find any mentions of sleep or wait_a_moment in Record (on the other hand, I do see that Apply, Obliterate, Rollback and Unrecord use it before applying patches to the working directory). Is it the case that record should have a one-second delay and doesn't?
msg5773 (view)	Author: droundy	Date: 2008-08-28.15:55:58
On Thu, Aug 28, 2008 at 11:31 AM, zooko <zooko@zooko.com> wrote: > There are three different properties that patch ids might have. Let's > examine each in turn. > > property 1: collision-resistance > > It is necessary for correctness (in my view) that the mapping from patch to > patchid is injective -- that there is at most one patch which can produce a > given patchid. A.k.a. patch ids are "collision-free" or > "collision-resistant". The question becomes under what circumstances we need to be collision-resistant. In the presence of malicious users we cannot be collision-resistant, unless we also have the self-authenticating property. I would call users who intentionally manipulate the date in such a way as to create two identical patches with the same name something very close to malicious. > property 2: convergence > > It is not necessary, but it might turn out to be useful, if the mapping is > "convergent" in the sense that separate computations which start with the > same patch result in the same patchid. This might turn out to be useful if > we wanted convergence as a feature -- two people doing "darcs record" on the > same textual changes both get the "same" darcs patch. It might also be > useful in the future for efficiency, as Nathaniel Filardo observed, by > reducing the number of patch ids that darcs has to keep track of. I don't think this is a good property to have. > property 3: self-authenticating > > In the future I want, given a patchid that I got from a trusted source and a > patch that I got from an untrusted source, to be able to verify that the > patch matches the patch id, so that I know that the trusted source who gave > me the patchid would accept this patch as an acceptable patch for that > patchid. I can see that this could be useful, but do not see a practical path to achieve this. I don't want record to be O(N) in the length of the repository history, let alone O(N^2). > Here are a few ways to do it which have different combinations of these > properties. Please see the end of this note, which argues that throwing in > timestamps, "count of patches by this name" and other stuff helps with none > of these properties. > > > option 1: patchids are randomly generated 192-bit numbers > > This is definitely collision-resistant (up to the negligible chance of an > accidental collision), and it has no convergence -- any time anyone computes > a patchid from a patch, they get a new patchid. Observe that this is not a > problem in current darcs because the patchid is included with the patch > whenever the patch is transmitted. It has no self-authentication. Agreed. It's not a bad approach, if we really don't trust our users to be reasonable. > option 2: patchids are collision-resistant hashes of the patch and the > entire current context > > This is still collision-resistant (at least as much as the hash function we > use), and it has convergence for the case that two people are computing > patchids of the same patch in the same context. It is self-authenticating, > but only if the person doing the checking has access to the entire context, > and can figure out exactly which context was used. (See Nathaniel > Filardo's patch to implement short, secure, fast version identifiers.) Also not a bad approach, if we don't trust our users. A little more tailor-friendly. As you point out, it's not robustly self-authenicating, so I'd call that a red herring. > option 3: patchids are collision-resistant hashes of the patch and the > minimal context > > This is still collision-resistant, and it has more convergence: now two > people who are computing a patch id for the same patch, even in the presence > of extraneous other patches which are not part of the minimal context. It > is self-authenticating, and it might turn out to be easier to actually check > in practice, since the checker knows how to find the minimal context of the > patch in order to use that minimal context in the check. This is only convergent if the two people also have the same name and give the patch the same name, and also the same date. In other words, the convergence benefit here is a red herring. The self-authentication properties are real, but the cost is that this is O(N^2) in the size and duration of the repository. Note that in darcs-2 the minimal context is not a set of named patches, but rather a set of primitive patches. Also note that we'd have to introduce a canonical primitive-patch sorting function for this hash to be useful for self-authentication, which is another unsolved problem. > option 4: patchids are collision-resistant hashes of just the text of the > patch, devoid of context > > This seems problematic to me, because then darcs might get confused about > two textually identical patches which were created in disjoint contexts. > Let us speak no more of it in this note. Agreed. This is has no real benefits. > Now, observe that once you've chosen one of these three basic strategies, > then throwing in the timestamp, the count of the patches by this name, and > so forth can't help and can hurt: it does not increase collision-resistance > (since all of these proposals except the weird 4th one already have > collision-resistance), and it does not help with self-authentication, but it > might hurt with self-authentication (if the added data that gets thrown in > is not available at check-time or if it isn't clear which data elements need > to be used in the check). Finally, it can't help and can hurt with > convergence, since it is never the case that we want two patches which are > otherwise identical to be treated as distinct patches due only to their > timestamp or count. The timestamp is necessary to be included (not in a hash, but in the ID), because our users want to know when changes were made. > The same argument can be made about author name and patch description. > Perhaps it is the case that we want two patches which are identical in all > ways except for their patch comment to be treated as identical patches. Or > perhaps we want them to be treated as different. I'm not sure. But I am > sure that we should not throw data into the secure hash function unless we > are sure that we want to cause divergence based on that data. Again, these need to be included, because our users want to know who made the changes. They don't need to go into any sort of hash, but we need to track them, and if two people make the same change, we need to track that. So yes, we do need to leave the name of the patch as part of its identifier. > By the way, I'm currently in favor of strategy #2, because #3 requires more > computational work when recording a patch, and because strategy #1 doesn't > seem to facilitate issue992 as well. However, I think any of the three > strategies are okay. I just am not convinced that this is a real issue. As I mention above, we can't protect against malicious users using anything less than #3, which really is an O(N^2) solution that also requires the creation of a new primitive operation on patches (sorting). It might not be too hard to figure out the sorting problem, but any bugs or changes in that function would break repositories. I suppose there's something to be said for protecting against stupid users, but I don't at all like the idea of making record O(N) in order to do so. In a hashed repository, we #2 can be done in O(1) time, since we already store most of the hashes in the context, and I suppose on darcs-1 repositories we could just hash the inventory file. If we chose #1, I believe tailor users would be up in arms. David
msg5774 (view)	Author: droundy	Date: 2008-08-28.15:59:46
On Thu, Aug 28, 2008 at 11:47 AM, Eric Kow <bugs@darcs.net> wrote: >> But darcs record has a one-second delay built into it (to keep the >> timestamps of the working directory from matching those in the >> pristine cache) > > Hmm. I just did the Wibble test (no automation except for doing two > records at once with a semi-colon) and got a Duplicate patch error. > > darcs init > touch a b > darcs add a b > echo a > a > echo b > b > darcs record -am Wibble a; darcs record -am Wibble b > darcs check > > Also, I didn't find any mentions of sleep or wait_a_moment in Record > (on the other hand, I do see that Apply, Obliterate, Rollback and > Unrecord use it before applying patches to the working directory). > > Is it the case that record should have a one-second delay and doesn't? You're right, I was being stupid. Record doesn't need the delay because it doesn't touch the working directory. But of course, this kind of issue can be most readily fixed by implementing a very simple check as to whether there is a patch with an identical name among the most recently-recorded patches in the repository. In fact, this check ought to be there, because any junk data (i.e. hash or random number) isn't going to be human-friendly, and it's not nice to our users to allow them to create two distinct patches with indistinguishable (to them) names. At least the date should be different. Otherwise something like darcs changes will show two identical patches, which is downright unfriendly. David
msg5775 (view)	Author: zooko	Date: 2008-08-28.16:14:10
On Aug 28, 2008, at 9:56 AM, David Roundy wrote: > > The timestamp is necessary to be included (not in a hash, but in the > ID), because our users want to know when changes were made. I agree that the timestamp, author, patch description are important to have around for human use, and that they don't have to be included in the hash or otherwise included in the patch identifier. I think you and I have differing intuitions about the value of facilitating the possible future development of secure identifiers. I think that such identifiers, like those currently used in git etc., could be implemented in the future by someone, either in darcs or in an extension tool that uses darcs, or in a successor to darcs, and I hope that we don't make design decisions which hinder such a development. But maybe it will never happen. Making predictions is hard, especially about the future. See Nathaniel Filardo's contribution to issue992 for an example of a development along those lines which is already implemented. > If we chose #1, I believe tailor users would be up in arms. Why is that? Option 1 is that the identifier of a patch is nothing more than a randomly chosen 192-bit number. I am one of the world's heaviest tailor users -- I routinely convert the entire history of open source software projects into darcs -- multiple years, tens of thousands of patches -- just my own personal use. Regards, Zooko --- http://allmydata.org -- Tahoe, the Least-Authority Filesystem http://allmydata.com -- back up all your files for $5/month
msg5777 (view)	Author: droundy	Date: 2008-08-28.17:46:25
On Thu, Aug 28, 2008 at 12:13 PM, zooko <zooko@zooko.com> wrote: > On Aug 28, 2008, at 9:56 AM, David Roundy wrote: >> The timestamp is necessary to be included (not in a hash, but in the >> ID), because our users want to know when changes were made. > > I agree that the timestamp, author, patch description are important to have > around for human use, and that they don't have to be included in the hash or > otherwise included in the patch identifier. You have some unusual idea that we're talking about completely rewriting darcs to introduce some new patch identifier concept. That doesn't exist in darcs right now, and I don't expect to add it any time soon. What you're describing cannot happen before darcs 3, since it's way too large a change in darcs' semantics. The patch identifier is the metadata describing the patch. > I think you and I have differing intuitions about the value of facilitating > the possible future development of secure identifiers. I think that such > identifiers, like those currently used in git etc., could be implemented in > the future by someone, either in darcs or in an extension tool that uses > darcs, or in a successor to darcs, and I hope that we don't make design > decisions which hinder such a development. But maybe it will never happen. > Making predictions is hard, especially about the future. See Nathaniel > Filardo's contribution to issue992 for an example of a development along > those lines which is already implemented. I think that adding complexity without adding value in the interest of possible future development is almost always wrong. And noone has yet suggested a plausible mechanism for creating such a scheme. (And yes, I did just take a look at issue992, thanks for pointing it out.) >> If we chose #1, I believe tailor users would be up in arms. > > Why is that? Option 1 is that the identifier of a patch is nothing more > than a randomly chosen 192-bit number. I am one of the world's heaviest > tailor users -- I routinely convert the entire history of open source > software projects into darcs -- multiple years, tens of thousands of patches > -- just my own personal use. Because I know there are tailor users who apply tailor twice and expect to get two identical repositories (apart from added patches). David
msg5779 (view)	Author: nwf	Date: 2008-08-28.17:50:38
On Thu, Aug 28, 2008 at 12:14:54PM -0000, Eric Kow wrote: > Say Arjan and Ganesh are starting from the same repository and editing > a file "old". They make simultaneous changes which must be merged. > Arjan darcs mv's "old" to "new" and Ganesh adds some hunks to "old". > We'll call Ganesh's patch G. When Arjan pulls G so that it applies on > top of his mv patch, G must be adjusted, i.e. replacing instances of > "old" with "new". Do you agree with these core bits of patch theory? I agree that the patch must be adjusted logically, but I disagree that we should be mangling the patch that came from upstream; see below. I am not very clear on how this works as patches "complete the loop" ... if a patch from Ganesh's repo, G, is stored in Arjan's as Ga, when Ganesh next pulls from Arjan, is G replaced by Ga, are G and Ga both now in Ganesh's repo, or is just Ga in Ganesh's repo? > In practise, darcs stores Arjan's version of G (Ga) on disk. In > principle it could have stored a canonical version of G instead, based > on some minimal context, but that would mean that it would have to do > extra work each time bringing G back up to his current context. > Alternatively, we could store both the canonical version AND Arjan's > current version (and hey, maybe this sort of redundancy would be > useful in other ways). Right; having both the patch as pulled from the remote repository and a local superseder seems like it would make life easier when trying to figure out "what happened" internally (I've had to go trolling _darcs/patches before to try to figure out what broke), would make comparisons between repositories easier (if only by hand; perhaps Darcs wouldn't have a need for such functionality), would allow us to use HashedIO hashes as patch identifiers across the board, and would seem to be a good start on issue891. Be well. --nwf;
msg5782 (view)	Author: zooko	Date: 2008-08-28.19:00:27
On Aug 28, 2008, at 11:46 AM, David Roundy wrote: > >> I agree that the timestamp, author, patch description are >> important to have >> around for human use, and that they don't have to be included in >> the hash or >> otherwise included in the patch identifier. > > You have some unusual idea that we're talking about completely > rewriting darcs to introduce some new patch identifier concept. I'm sorry if my terminology makes it sound like I have incorrect assumptions about darcs development. What I intended to say was that those bits of information don't have to be included in the patch identifier in order to achieve any of the three properties of patch identifiers that we were discussing. If I understand correctly, this is in agreement with what you said in your previous note. Let me make sure I understand your other points: you believe that the convergence property is necessary, at least for the use case of duplicate runs of tailor producing identical repositories, you continue to be skeptical that it will ever be possible to implement self-authenticating identifiers in a way compatible with patch theory, and you think that some of the strategies under discussion are unnecessarily complex. I guess you mean strategy #3 (hash including minimal context), which is more complex than either strategy #1 (random id) or strategy #2 (hash including current context). Is that right? Regards, Zooko --- http://allmydata.org -- Tahoe, the Least-Authority Filesystem http://allmydata.com -- back up all your files for $5/month
msg5784 (view)	Author: kowey	Date: 2008-08-28.19:38:27
> I am not very clear on how this works as patches "complete the loop" ... if > a patch from Ganesh's repo, G, is stored in Arjan's as Ga, when Ganesh next > pulls from Arjan, is G replaced by Ga, are G and Ga both now in Ganesh's > repo, or is just Ga in Ganesh's repo? It remains G in Ganesh's repository and Ga in Arjan's one. Darcs knows that both patches are the same because they have the same patchinfo. This seems weird at first, but it helps to think that all patches are inherently mangled. There is nothing about the patch as it is created upstream that makes it the "correct" representation, unless you wish to consider the minimal context version. > Right; having both the patch as pulled from the remote repository and a > local superseder seems like it would make life easier when trying to figure > out "what happened" internally (I've had to go trolling _darcs/patches > before to try to figure out what broke), I understand that pain. If you are interested in this, it might be worthwhile for you to try and implement a minimal_context function. Even if we don't wind up using it for this issue, it could still be handy for other purposes, or at the very least, a good way to get a handle on how darcs works Thanks! ( especially for braving darcs' guts and having waded in so far! I'm still safely treading water at the UI level :-D )
msg5789 (view)	Author: droundy	Date: 2008-08-28.23:13:09
On Thu, Aug 28, 2008 at 2:59 PM, zooko <zooko@zooko.com> wrote: > On Aug 28, 2008, at 11:46 AM, David Roundy wrote: >>> I agree that the timestamp, author, patch description are important to >>> have >>> around for human use, and that they don't have to be included in the hash >>> or >>> otherwise included in the patch identifier. >> >> You have some unusual idea that we're talking about completely >> rewriting darcs to introduce some new patch identifier concept. > > I'm sorry if my terminology makes it sound like I have incorrect assumptions > about darcs development. What I intended to say was that those bits of > information don't have to be included in the patch identifier in order to > achieve any of the three properties of patch identifiers that we were > discussing. If I understand correctly, this is in agreement with what you > said in your previous note. I'm still not sure what you mean by patch identifier. I think you mean the extra junk that you'd like to have added to the patch identifiers? The thing that currently doesn't exist? > Let me make sure I understand your other points: you believe that the > convergence property is necessary, at least for the use case of duplicate > runs of tailor producing identical repositories, you continue to be > skeptical that it will ever be possible to implement self-authenticating > identifiers in a way compatible with patch theory, and you think that some > of the strategies under discussion are unnecessarily complex. I guess you > mean strategy #3 (hash including minimal context), which is more complex > than either strategy #1 (random id) or strategy #2 (hash including current > context). > > Is that right? Right. I don't particularly care about convergence, but I believe other people do care, and I don't see any reason to break current functionality in order to fix an loophole that has no real detrimental effect in practice. #2 is marginally better, but if it's done in any way that could provide hope of self-authentication, it would be too slow, and #3 is guaranteed to be too slow. So we're left with a #2 that only solves the problem of collision-resistance. (which is, of course, the title of this bug...) But is that a worthwhile tradeoff, including useless data in every patch identifier, just so we won't have collisions, when there is a friendlier way to eliminate the probability of collisions (i.e. to check for collisions). David
msg5790 (view)	Author: dagit	Date: 2008-08-29.00:42:02
After re-reading much of this thread it seems the decision is, "Won't fix because it's not darcs' problem, it's tailor's problem". If I understand correctly, the reasoning is darcs can/could make the patchinfo unique per branch at creation time. I think, this is to be done by checking if the name is already used and either complaining or using a new name. So instead of making patch ids collision free, we'll ensure that each branch is free of duplicate patchinfos. Unfortunately, this doesn't address the other problem that's emerged in this discussion, which is, what to do about repositories that have identical patchinfos but different patches? For example see issue1026. Because cases like issue1026 can cause a crash in darcs that is semantically equivalent to a segfault or buffer overflow, we really should fix this! I've just made a big claim. Here is how it works: 1) Darcs trusts patchinfos to be consistent across patchsets. 2) When you do a pull, darcs looks at the set of patches in both repositories and tries to figure out which patches they have in common and which patches exist only in each repository (get_common_and_uncommon). There is also a bias to work with patch contents from local copies of the patches. This is intended as an optimization, but it could be meaningful in this context. 3) To do get_common_and_uncommon, darcs grabs the the patchinfos for every patch in both sequences and takes the intersection and calls these the common patchinfos. 4) Darcs takes the common patchinfos and uses them to decide which patches in each patchset can be commuted out of the patchset. Darcs assumes if the patch is not a common patch but it's sitting between common patches then it was commuted there previously. And this assumption is true when there is an injection from patchinfos to patches (up to transformations generated by commute). 5) In the case of "bug in get_extra" you're getting lucky in a sense. That's the case where darcs is doing the equivalent of an unsafe C program trusting user input and dying with a segfault. Darcs catches the error and aborts. 6) The unlucky case. Now imagine if we had two different patches with the same patchinfos but they just happen to commute with all the right patches so that no bug in get_extra is triggered. Darcs has now corrupted your repository and you can expect it is now in some ill-defined state. This is akin to a buffer overflow exploit. Now, some people might argue that user's shouldn't be "stupid" and create different patches in different branches that share the same patchinfo. I think that's blaming the victim. As a respectable piece of software Darcs shouldn't let itself do unsafe things with user input. I'm not saying we have handle this situation by doing a successful get_common_and_uncommon inspite of duplicate patchinfos. I am saying that we need to detect #6 and at a minimum respond like we do in #5. It would be nice if someone could construct an example of #6 leading to corruption so we can include it in the buggy tests. Jason
msg5791 (view)	Author: zooko	Date: 2008-08-29.02:08:22
>> ... those bits of >> information don't have to be included in the patch identifier in >> order to >> achieve any of the three properties of patch identifiers that we were >> discussing. If I understand correctly, this is in agreement with >> what you >> said in your previous note. > > I'm still not sure what you mean by patch identifier. I think you > mean the extra junk that you'd like to have added to the patch > identifiers? The thing that currently doesn't exist? No, I mean the same thing that you mean by "patch identifier" (although the statement is also true of other possible identifiers of patches). > #2 is marginally better, but if it's done in any way that could > provide hope of self-authentication, it would be too slow, and #3 is > guaranteed to be too slow. > ... > So we're left with a #2 that only solves > the problem of collision-resistance. Could you tell me what you mean by this alternate form of #2 -- the one which could provide hope of self-authentication but is too slow -- as opposed to the form of #2 which provides only collision- resistance? The strategy #2 that I am thinking of is to take the secure hash of the patch contents and the current context, and include the result value in the patch id. This would provide collision-resistance and would be fast (I think). You think (if I understand correctly) that it would not provide hope of self-authentication in the future. I think that it might, although I haven't worked out the details enough to be sure that it would be practical. > But is that a worthwhile tradeoff, including useless > data in every patch identifier, just so we won't have collisions, when > there is a friendlier way to eliminate the probability of collisions > (i.e. to check for collisions). Can you tell me how this strategy -- let's call it strategy #5: "check for collisions" -- is friendlier than #2? Possibly because you are envisioning that the secure hash value would appear in the user interface in a way that would make that user interface less friendly? Any of these strategies would also mean that we don't need to have the "sleep for one second" feature in order to guarantee unique timestamps, although apparently it turns out that we don't use that feature in darcs record -- I'm a bit confused about that. Regards, Zooko --- http://allmydata.org -- Tahoe, the Least-Authority Filesystem http://allmydata.com -- back up all your files for $5/month
msg6033 (view)	Author: droundy	Date: 2008-09-17.15:22:13
I'm pushing a fix for this, using the random number approach. We'll see if tailor users rise up in arms about this... if so, it's easy enough to change. Zooko, I'm sorry for being rude about this, and thank you for being stubborn. David
msg6036 (view)	Author: droundy	Date: 2008-09-17.18:49:38
The following patch updated the status of issue27 to be resolved: * resolve issue27: add junk to patch identifiers. Ignore-this: b91ab6f6e05e0fda25488fa51653b741
msg7167 (view)	Author: gwern	Date: 2009-01-23.17:51:57
> Having the same patch name is actually very probable in some uses of > darcs. Some people are fond of just calling their minor patches > 'Wibble' -- hmm maybe we can call this the "Wibble" problem -- so you > could imagine somebody doing something like: > darcs record -am Wibble foo > darcs record -am Wibble bar > > Well, that seems rather silly, but it isn't entirely far-fetched. Actually, it isn't silly at all. Suppose one is using Darcs for storing a wiki (like Gitit or Orchid do). One of the fundamental justifications of a wiki is allowing small minor edits. Obviously if one is correcting the spelling of a single word, it makes no sense to spend a full sentence summarizing one's change. One uses the shorthand- 'sp', or 'fmt' or 'cpedit' or 'rv' or 'rvv' or 'ln' or '+See also' etc. This jargon is very useful for speeding up review of edits, particularly subtle vandalism (a reversion without a summary like 'rvv' can easily look like an editing dispute). I daresay on the English Wikipedia alone there must be millions of edits which have the same summary as another edit. (With 279,793,198 edits in total on en, I suspect there are many very long/improbable edit summaries which are duplicates...) Now, this might seem to be a problem only for large wikis, but it's very easy to run into. When I was playing with the first release of Orchid, I made a test edit messing with some of the markup, which I summarized, of course, as 'fmt'. Later, I was adding some content and I saw I made a syntax error. My correction had the summary 'fmt'... (Orchid didn't allow this to be saved because it wasn't a unique summary and Orchid worked on unique summaries. Now that Orchid has moved to use filestore - which uses darcs's hashs as IDs and specifically avoids the summaries - this isn't a problem anymore. But it was.)
msg7172 (view)	Author: kowey	Date: 2009-01-24.08:58:56
On Fri, Jan 23, 2009 at 17:52:02 -0000, gwern wrote: > (Orchid didn't allow this to be saved because it wasn't a unique summary and > Orchid worked on unique summaries. Now that Orchid has moved to use filestore - > which uses darcs's hashs as IDs and specifically avoids the summaries - this > isn't a problem anymore. But it was.) So, darcs 2.2.0 generates some random data in the patch log now (using an Ignore-this) field... so by rights it should work even if Orchid is relying on the logs (note logs and not just summary though)

History
Date	User	Action	Args
2005-11-29 02:44:14	zooko	create
2005-11-29 02:48:25	zooko	set	status: unread -> unknown nosy: droundy, tommy, zooko messages: + msg100
2005-11-29 03:39:13	beschmi	set	nosy: + beschmi messages: + msg101
2005-11-29 11:47:12	zooko	set	nosy: droundy, tommy, beschmi, zooko messages: + msg102
2005-11-29 12:44:04	droundy	set	nosy: droundy, tommy, beschmi, zooko messages: + msg104
2005-11-29 14:29:55	zooko	set	nosy: droundy, tommy, beschmi, zooko messages: + msg108
2006-04-11 18:38:43	zooko	set	nosy: droundy, tommy, beschmi, zooko messages: + msg616
2008-01-19 02:43:14	markstos	set	nosy: + markstos, kowey messages: + msg2573
2008-01-19 13:32:25	droundy	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko messages: + msg2597
2008-01-19 15:59:16	markstos	set	topic: + Darcs2 nosy: droundy, tommy, beschmi, kowey, markstos, zooko
2008-01-28 14:33:19	zooko	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko messages: + msg2833
2008-01-28 15:17:17	droundy	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko messages: + msg2835
2008-01-28 15:56:08	zooko	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko messages: + msg2846
2008-02-13 22:40:56	markstos	set	status: unknown -> deferred nosy: droundy, tommy, beschmi, kowey, markstos, zooko
2008-03-06 13:56:02	zooko	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko messages: + msg3779
2008-03-06 22:45:31	markstos	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko messages: + msg3789
2008-03-13 13:28:50	zooko	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko messages: + msg3881
2008-03-17 22:05:24	droundy	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko messages: + msg3899
2008-03-17 22:22:00	zooko	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko messages: + msg3901
2008-03-18 13:27:49	markstos	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko messages: + msg3904
2008-03-31 15:38:51	droundy	link	issue772 superseder
2008-08-28 08:12:34	kowey	link	issue1026 superseder
2008-08-28 08:37:21	kowey	set	status: deferred -> unknown nosy: + Serware, dagit topic: + Target-2.0, - Darcs2 messages: + msg5749
2008-08-28 09:06:44	kowey	set	status: unknown -> needs-reproduction nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, Serware
2008-08-28 10:18:38	nwf	set	nosy: + nwf messages: + msg5755
2008-08-28 10:37:43	kowey	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, Serware, nwf messages: + msg5756
2008-08-28 11:25:50	nwf	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, Serware, nwf messages: + msg5758
2008-08-28 12:14:54	kowey	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, Serware, nwf messages: + msg5759
2008-08-28 13:34:30	droundy	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, Serware, nwf messages: + msg5764
2008-08-28 14:42:48	kowey	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, Serware, nwf messages: + msg5767
2008-08-28 14:53:32	kowey	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, Serware, nwf messages: + msg5768
2008-08-28 15:08:49	droundy	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, Serware, nwf messages: + msg5769
2008-08-28 15:09:48	droundy	set	topic: - Target-2.0 nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, Serware, nwf messages: + msg5770
2008-08-28 15:32:00	zooko	set	nosy: + serware messages: + msg5771
2008-08-28 15:47:20	kowey	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, serware, Serware, nwf messages: + msg5772
2008-08-28 15:56:00	droundy	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, serware, Serware, nwf messages: + msg5773
2008-08-28 15:59:48	droundy	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, serware, Serware, nwf messages: + msg5774
2008-08-28 16:14:13	zooko	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, serware, Serware, nwf messages: + msg5775
2008-08-28 17:46:28	droundy	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, serware, Serware, nwf messages: + msg5777
2008-08-28 17:50:41	nwf	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, serware, Serware, nwf messages: + msg5779
2008-08-28 19:00:30	zooko	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, serware, Serware, nwf messages: + msg5782
2008-08-28 19:38:29	kowey	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, serware, Serware, nwf messages: + msg5784
2008-08-28 23:13:12	droundy	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, serware, Serware, nwf messages: + msg5789
2008-08-29 00:42:05	dagit	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, serware, Serware, nwf messages: + msg5790
2008-08-29 02:08:30	zooko	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, serware, Serware, nwf messages: + msg5791
2008-09-17 12:06:46	dmitry.kurochkin	set	nosy: + dmitry.kurochkin
2008-09-17 15:22:15	droundy	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, dmitry.kurochkin, serware, Serware, nwf messages: + msg6033
2008-09-17 15:22:30	droundy	set	status: needs-reproduction -> has-patch nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, dmitry.kurochkin, serware, Serware, nwf
2008-09-17 15:22:57	droundy	set	nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, dmitry.kurochkin, serware, Serware, nwf assignedto: droundy
2008-09-17 18:49:40	droundy	set	status: has-patch -> resolved-in-unstable nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, dmitry.kurochkin, serware, Serware, nwf messages: + msg6036
2009-01-23 17:52:01	gwern	set	nosy: + gwern, simon, thorkilnaur messages: + msg7167
2009-01-23 22:41:41	droundy	set	nosy: - droundy
2009-01-24 08:58:58	kowey	set	nosy: tommy, beschmi, kowey, markstos, zooko, dagit, simon, thorkilnaur, gwern, dmitry.kurochkin, serware, Serware, nwf messages: + msg7172
2009-04-22 03:26:15	twb	set	status: resolved-in-unstable -> resolved nosy: tommy, beschmi, kowey, markstos, zooko, dagit, simon, thorkilnaur, gwern, dmitry.kurochkin, serware, Serware, nwf
2009-08-06 20:29:31	admin	set	nosy: - beschmi
2009-08-10 23:56:28	admin	set	nosy: - dagit
2009-08-25 17:39:45	admin	set	nosy: + darcs-devel, - simon
2009-08-27 14:17:51	admin	set	nosy: tommy, kowey, markstos, darcs-devel, zooko, thorkilnaur, gwern, dmitry.kurochkin, serware, Serware, nwf
2009-10-23 22:44:26	admin	set	nosy: - Serware
2009-10-23 23:27:51	admin	set	nosy: + Serware, - serware

Issue 27 patch ids are not collision-free