Created on 2008-03-31.14:44:42 by lele, last changed 2009-08-27.13:57:23 by admin.
msg4126 (view) |
Author: lele |
Date: 2008-03-31.14:44:39 |
|
Hi all,
I'm facing unexpected trouble trying to merge two different
repositories into one. No matter which direction, or darcs1/darcs2, I
always trigger that "bug in get_extra commuting patch".
I tailorized two different subtrees of a Subversion repository into
two distinct darcs repositories.
Since the two are effectively wired to each other, I'd like to have a
single repository with the two subtrees.
So I basically did:
$ cd /tmp
$ darcs get .../tailorized/repo-A
$ cd repo-A
$ darcs pull .../tailorized/repo-B
and I get the error almost immediately at pull time, with the error
reporting a patch in repo-A. The same happens if I swap the order
(that is, trying to pull repo-A into repo-B): in this case, the error
message mentions one patch of repo-B.
Then I rebuilt an up-to-date darcs2 binary, and tried the same (with
and without --hashed) with it, obtaining the very same result.
repo-A has 391 patches while repo-B has only 113 and, as said, by
definition the two sets are completely non-overlapping:
$ ls -l repo-A
drwxrwxr-x 6 lele lele 4096 2008-03-31 15:55 _darcs
drwxrwxr-x 7 lele lele 4096 2008-03-31 15:28 gam-database-pg
$ du -sh repo-A
17M
$ ls -l repo-B
drwxrwxr-x 6 lele lele 4096 2008-03-31 15:55 _darcs
drwxrwxr-x 3 lele lele 4096 2008-03-31 15:29 tools
$ du -sh repo-B
1,4M
As the material is completely under GPL, I have no problem sharing it,
should that help in any way. Please, let me know if there's anything
else I could try.
Thank you in advance,
ciao, lele.
--
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
lele@nautilus.homeip.net | -- Fortunato Depero, 1929.
|
msg4131 (view) |
Author: droundy |
Date: 2008-03-31.14:53:21 |
|
On Mon, Mar 31, 2008 at 02:44:42PM -0000, Lele Gaifax wrote:
> Then I rebuilt an up-to-date darcs2 binary, and tried the same (with
> and without --hashed) with it, obtaining the very same result.
Could you try using the --darcs-2 format?
> $ du -sh repo-A
> 17M
It's distinctly possible (in fact, downright likely) that what you're
seeing is an out-of-memory error. It's a horrible error message for an
out-of-memory error, but given what you describe this bug shouldn't happen
(even with darcs-1). Anyhow, without seeing the repositories, or the
patches involved in the commutation, this is all I can guess.
> As the material is completely under GPL, I have no problem sharing it,
> should that help in any way. Please, let me know if there's anything
> else I could try.
That would be great, if you could give us a couple of URLs to get from.
--
David Roundy
Department of Physics
Oregon State University
_______________________________________________
darcs-devel mailing list
darcs-devel@darcs.net
http://lists.osuosl.org/mailman/listinfo/darcs-devel
|
msg4134 (view) |
Author: droundy |
Date: 2008-03-31.15:20:54 |
|
On Mon, 31 Mar 2008 07:45:16 -0700
David Roundy <droundy@darcs.net> wrote:
> On Mon, Mar 31, 2008 at 02:44:42PM -0000, Lele Gaifax wrote:
> > Then I rebuilt an up-to-date darcs2 binary, and tried the same (with
> > and without --hashed) with it, obtaining the very same result.
>
> Could you try using the --darcs-2 format?
Uhm, not immediately: if I understand correctly, I cannot migrate to
that format, but I should use "darcs2 init --darcs-2" in the
tailorization step... Am I right?
> It's distinctly possible (in fact, downright likely) that what you're
> seeing is an out-of-memory error.
This seems strange, because I get the error almost immediately,
without any apparent load on the machine...
>
> > As the material is completely under GPL, I have no problem sharing
> > it, should that help in any way.
>
> That would be great, if you could give us a couple of URLs to get
> from.
Sorry, here it is:
http://artiemestieri.tn.it/~lele/issue772.tar.bz2
It contains the two original darcs1 repositories without the pristine
trees.
thank you,
ciao, lele.
--
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
lele@nautilus.homeip.net | -- Fortunato Depero, 1929.
|
msg4139 (view) |
Author: droundy |
Date: 2008-03-31.15:38:34 |
|
So the problem is that you've got two changes with identical names (and dates,
etc) that describe different changes:
Sat Sep 25 14:01:03 PDT 2004 lele
* Rudimentale indice degli script
If you fix tailor to generate unique ids for patches, that would fix this.
This is a duplicate of issue27. It's debatable whether this is a bug in darcs
or a bug in tailor. Zooko would argue, I'm sure, that darcs shouldn't give you
the power to shoot yourself in the foot. I tend to disagree. I consider it a
feature that you (as the author of tailor) can precisely specify the patch ID of
patches you're converting. Anyhow, an easy fix (what Zooko wants us to do in
darcs) is to add a bit of garbage into the long message of each patch. If you
prefix this garbage with something reasonable, we may even add a feature to hide
that garbage from our users.
David
|
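The collision described above can be illustrated with a small sketch. It assumes, purely for illustration, that a patch's identity is derived from its metadata (author, date, name, long message) and not from its contents; the `patch_id` helper and its field layout are hypothetical and are not darcs' actual implementation.

```python
import hashlib


def patch_id(author: str, date: str, name: str, long_message: str = "") -> str:
    """Hypothetical patch identity: a hash over the metadata only."""
    fields = "\n".join([author, date, name, long_message])
    return hashlib.sha1(fields.encode("utf-8")).hexdigest()


# Metadata quoted in the message above; the same values occur in both repositories.
AUTHOR = "lele"
DATE = "Sat Sep 25 14:01:03 PDT 2004"
NAME = "Rudimentale indice degli script"

# The two patches change different files, but the contents never enter the id.
id_in_repo_a = patch_id(AUTHOR, DATE, NAME)
id_in_repo_b = patch_id(AUTHOR, DATE, NAME)

ids_collide = id_in_repo_a == id_in_repo_b
print(ids_collide)
```

Under that assumption the two ids are equal by construction, which is why anything added to the name or long message is enough to tell the patches apart.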
msg4146 (view) |
Author: lele |
Date: 2008-03-31.16:52:41 |
|
On Mon, 31 Mar 2008 15:38:35 -0000
David Roundy <bugs@darcs.net> wrote:
>
>
> So the problem is that you've got two changes with identical names
> (and dates, etc) that describe different changes:
>
> Sat Sep 25 14:01:03 PDT 2004 lele
> * Rudimentale indice degli script
>
> If you fix tailor to generate unique ids for patches, that would fix
> this.
>
> This is a duplicate of issue27. It's debatable whether this is a bug
> in darcs or a bug in tailor. Zooko would argue, I'm sure, that darcs
> shouldn't give you the power to shoot yourself in the foot. I tend
> to disagree. I consider it a feature that you (as the author of
> tailor) can precisely specify the patch ID of patches you're
> converting. Anyhow, an easy fix (what Zooko wants us to do in darcs)
> is to add a bit of garbage into the long message of each patch. If
> you prefix this garbage with something reasonable, we may even add a
> feature to hide that garbage from our users.
Thank you David,
And I now understand better how issue27 was born :)
So, once you know what the problem is, it's very easy to install a
workaround in tailor, just changing the "patch-name-format" option.
Is there any way for darcs to be more precise in its error message?
Could it diagnose that a duplicate id is the reason behind it?
thank you again,
ciao, lele.
--
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
lele@nautilus.homeip.net | -- Fortunato Depero, 1929.
|
msg4148 (view) |
Author: droundy |
Date: 2008-03-31.18:31:58 |
|
On Mon, Mar 31, 2008 at 04:52:43PM -0000, Lele Gaifax wrote:
>
>
> On Mon, 31 Mar 2008 15:38:35 -0000
> David Roundy <bugs@darcs.net> wrote:
>
> >
> >
> > So the problem is that you've got two changes with identical names
> > (and dates, etc) that describe different changes:
> >
> > Sat Sep 25 14:01:03 PDT 2004 lele
> > * Rudimentale indice degli script
> >
> > If you fix tailor to generate unique ids for patches, that would fix
> > this.
> >
> > This is a duplicate of issue27. It's debatable whether this is a bug
> > in darcs or a bug in tailor. Zooko would argue, I'm sure, that darcs
> > shouldn't give you the power to shoot yourself in the foot. I tend
> > to disagree. I consider it a feature that you (as the author of
> > tailor) can precisely specify the patch ID of patches you're
> > converting. Anyhow, an easy fix (what Zooko wants us to do in darcs)
> > is to add a bit of garbage into the long message of each patch. If
> > you prefix this garbage with something reasonable, we may even add a
> > feature to hide that garbage from our users.
>
> Thank you David,
You're welcome!
> And I now understand better how issue27 was born :)
>
> So, once you know what the problem is, it's very easy to install a
> workaround in tailor, just changing the "patch-name-format" option.
Indeed. In fact, in tailor you needn't add random garbage, but could
instead add a little note indicating that the change was generated by
tailor running on a particular repository. This wouldn't fix all the
issue27 problems (e.g. if one svn repository has two different changes with
identical names and dates), but it would fix this particular problem, and
would also add human-friendly information.
> Is there any way for darcs to be more precise in its error message?
> Could it diagnose that a duplicate id is the reason behind it?
That would be hard. The trouble is that darcs assumes that two changes
with the same name are the same change. In this case, darcs then tries to
move those two changes into the same context, but is unable to do so,
because the only common context is the empty repository, and each of these
changes has some sort of a dependency. It'd be hard for darcs to figure
out that this is what happened (just as it was hard for me to figure out
what had happened, and I'm smarter than darcs is...). It might be able to
hazard a guess, but most often this particular bug message is actually
related to conflicts.
--
David Roundy
Department of Physics
Oregon State University
|
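As the message above explains, darcs itself treats two patches with the same id as the same change, so it cannot easily report this situation. A converter or wrapper script that still has both sets of patches at hand could, however, check for it before a pull. The following is only a sketch of such an external pre-check: the function name, the dictionary layout (patch id mapped to raw patch contents), and the sample data are all assumptions made for illustration.

```python
from typing import Dict, List


def same_id_different_contents(
    repo_a: Dict[str, bytes], repo_b: Dict[str, bytes]
) -> List[str]:
    """Return ids present in both repositories whose contents differ."""
    shared_ids = repo_a.keys() & repo_b.keys()
    return sorted(pid for pid in shared_ids if repo_a[pid] != repo_b[pid])


# Made-up sample data: one id is shared, but the recorded contents differ.
SHARED_ID = "lele / Sat Sep 25 14:01:03 PDT 2004 / Rudimentale indice degli script"
repo_a = {
    SHARED_ID: b"contents recorded in repo-A",
    "only in A": b"some other change",
}
repo_b = {
    SHARED_ID: b"contents recorded in repo-B",
    "only in B": b"yet another change",
}

suspects = same_id_different_contents(repo_a, repo_b)
print(suspects)
```

Any id reported here would be a candidate for renaming before the two repositories are merged.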
msg4152 (view) |
Author: zooko |
Date: 2008-03-31.19:39:03 |
|
On Mar 31, 2008, at 9:38 AM, David Roundy wrote:
>
> This is a duplicate of issue27. It's debatable whether this is a
> bug in darcs
> or a bug in tailor. Zooko would argue, I'm sure, that darcs
> shouldn't give you
> the power to shoot yourself in the foot.
My argument is that patch ids should be unique, so that it is
impossible for there to exist two different patches with the same
patch id. This is a property that monotone guaranteed, which git
adopted, and which mercurial and bzr now offer as well.
It seems like it could be a useful property to rely upon. It could
theoretically be used by a future version of darcs to
cryptographically verify the provenance of a repository, the way that
monotone and the others already do.
The "garbage" that David referred to in his note would be a secure
hash of the contents of the patch and the context of the patch, just
as is done in the other revision control tools.
Hopefully it wouldn't need to be encoded into the long patch
description itself, however, since that could collide with other uses
of the long patch description. Hopefully the patch hash information
could be stored with the patch description in a separate field, just
like monotone and the others do.
Regards,
Zooko
|
msg4153 (view) |
Author: droundy |
Date: 2008-03-31.20:29:22 |
|
On Mon, Mar 31, 2008 at 01:32:43PM -0600, zooko wrote:
> On Mar 31, 2008, at 9:38 AM, David Roundy wrote:
> >This is a duplicate of issue27. It's debatable whether this is a bug in
> >darcs or a bug in tailor. Zooko would argue, I'm sure, that darcs
> >shouldn't give you the power to shoot yourself in the foot.
>
> My argument is that patch ids should be unique, so that it is
> impossible for there to exist two different patches with the same
> patch id. This is a property that monotone guaranteed, which git
> adopted, and which mercurial and bzr now offer as well.
>
> It seems like it could be a useful property to rely upon. It could
> theoretically be used by a future version of darcs to cryptographically
> verify the provenance of a repository, the way that monotone and the
> others already do.
You can never rely upon this property in the presence of hostile attackers,
and in the absence of hostile attackers, the existing behavior is
adequate. I would say that if one developer creates two patches with the
same name at the same time, he's most likely hostile or he's using a
poorly-designed tool. If one developer creates patches using another
developer's name then he's definitely hostile (or perhaps confused as to
his identity).
> The "garbage" that David referred to in his note would be a secure hash
> of the contents of the patch and the context of the patch, just as is
> done in the other revision control tools.
No, it would be a secure hash of the contents of the patch and its context
*at the time that it's created*, which is something that cannot be verified
or used in any way (except perhaps if you're lucky, or don't use much of
darcs' functionality). So assuming it's a secure hash, then this is
garbage.
The key mistake you're making is that you seem to assume that we could
check this hash, but it's an uncheckable hash, because there's no reason to
believe we could ever again recreate that context. So this information is
no more useful than a truly random number. Its only advantage over a few
bytes from /dev/random would be (a) that it doesn't deplete your entropy
pool and (b) that tools like tailor would generate the same output when run
twice on the same repository.
> Hopefully it wouldn't need to be encoded into the long patch
> description itself, however, since that could collide with other uses
> of the long patch description. Hopefully the patch hash information
> could be stored with the patch description in a separate field, just
> like monotone and the others do.
Sorry, it *would* be encoded into the long patch description itself. As
I've explained to you before. I'm not going to break
backwards-compatibility.
--
David Roundy
Department of Physics
Oregon State University
|
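The difference between the two kinds of "garbage" discussed above can be sketched as follows. Both helper names, the 16-character length, and the choice of SHA-256 are assumptions for illustration only; neither function is part of darcs or tailor.

```python
import hashlib
import os


def repeatable_salt(patch_contents: bytes, context: bytes) -> str:
    """Salt derived from the patch and its context at record time.

    It is deterministic, so a converter run twice on the same input
    produces the same salt. It can only be re-checked by someone who
    can still reconstruct the same context.
    """
    digest = hashlib.sha256(context + b"\0" + patch_contents).hexdigest()
    return digest[:16]


def random_salt() -> str:
    """Salt drawn from the operating system's random source."""
    return os.urandom(8).hex()


first_run = repeatable_salt(b"hunk ./README 1", b"context at record time")
second_run = repeatable_salt(b"hunk ./README 1", b"context at record time")
print(first_run == second_run)
print(random_salt())
```

Both salts make the patch id unique in practice; only the first gives repeatable output, which is property (b) in the message above.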
msg4156 (view) |
Author: zooko |
Date: 2008-03-31.20:47:04 |
|
> The key mistake you're making is that you seem to assume that we could
> check this hash, but it's an uncheckable hash, because there's no
> reason to
> believe we could ever again recreate that context.
This is what I meant by "theoretically could be used by a future
version of darcs". It is theoretically possible that a future
version of darcs could get access to the context.
If in the future there were an extension to darcs to provide such
contexts, then darcs would gain the same provenance guarantee that
the other decentralized revision control tools offer without losing
its unique flexibility.
Perhaps such an extension is too difficult to implement, but perhaps
not.
> So this information is
> no more useful than a truly random number. Its only advantage over
> a few
> bytes from /dev/random would be (a) that it doesn't deplete your
> entropy
> pool
/dev/urandom suffices for that, and is no less secure than /dev/random.
> and (b) that tools like tailor would generate the same output when run
> twice on the same repository.
This would be a nice property for it to have.
> Sorry, it *would* be encoded into the long patch description
> itself. As
> I've explained to you before. I'm not going to break
> backwards-compatibility.
I see.
Regards,
Zooko
|
msg4157 (view) |
Author: droundy |
Date: 2008-03-31.20:55:39 |
|
On Mon, Mar 31, 2008 at 02:40:48PM -0600, zooko wrote:
> >The key mistake you're making is that you seem to assume that we could
> >check this hash, but it's an uncheckable hash, because there's no reason
> >to believe we could ever again recreate that context.
>
> This is what I meant by "theoretically could be used by a future
> version of darcs". It is theoretically possible that a future
> version of darcs could get access to the context.
No, if a patch in this context is obliterated (and the file describing that
patch is deleted, or this copy of the repository is deleted), then there is
absolutely no way any possible future version of darcs (okay, maybe I
should add that the hard drive was thrown into a volcano) could reconstruct
that context. Barring a search of all possible patch names that might have
ever been created.
> If in the future there were an extension to darcs to provide such
> contexts, then darcs would gain the same provenance guarantee that
> the other decentralized revision control tools offer without losing
> its unique flexibility.
As mentioned above, there is no possible way such an extension could be
written.
> Perhaps such an extension is too difficult to implement, but perhaps
> not.
No, it's not difficult. It's impossible.
If we modified a future version of darcs to store all patches ever
recorded, and transmit all such patches to every other repository it comes
in contact with, then this hash could be useful for verification purposes.
But until we make that change, there's just no reason to store it except as
repeatable pseudorandom garbage.
--
David Roundy
Department of Physics
Oregon State University
|
msg4161 (view) |
Author: lele |
Date: 2008-03-31.22:58:09 |
|
On Mon, 31 Mar 2008 18:32:00 -0000
David Roundy <bugs@darcs.net> wrote:
> > So, once you know what the problem is, it's very easy to install a
> > workaround in tailor, just changing the "patch-name-format" option.
>
> Indeed. In fact, in tailor you needn't add random garbage, but could
> instead add a little note indicating that the change was generated by
> tailor running on a particular repository. This wouldn't fix all the
> issue27 problems (e.g. if one svn repository has two different
> changes with identical names and dates), but it would fix this
> particular problem, and would also add human-friendly information.
Well, I think the patch-name-format option offers a good workaround to
that problem as well: by default it rewrites the upstream changelog
prepending something like "[upstream-svn-repo @ 1234]" (where 1234 is
the upstream svn revid) to its text, so effectively those different
changes with identical names and dates [and author, I may add] would
produce /different/ darcs hashes.
I experienced the problem myself exactly because, for the very first
time, I changed that option to avoid that prefix :-)
So the solution for both issues, at least from the tailor point of
view, is just a matter of differentiating in some way the
patch-name-format option.... that is, trusting tailor's default ;-)
I'll add a note about this in the README.
ciao, lele.
--
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
lele@nautilus.homeip.net | -- Fortunato Depero, 1929.
|
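The effect of the default prefix described above can be sketched like this. The bracketed shape follows the "[upstream-svn-repo @ 1234]" example in the message; the `tailored_name` function, the repository labels, and the revision numbers are hypothetical and do not reproduce tailor's actual code or configuration syntax.

```python
def tailored_name(upstream_repo: str, revision: int, original_name: str) -> str:
    """Prefix the upstream changelog with its repository and revision."""
    return f"[{upstream_repo} @ {revision}] {original_name}"


LOG = "Rudimentale indice degli script"

# Two upstream changes with the same log text, author and date.
# The labels and revision numbers below are made up for the example.
name_a = tailored_name("upstream-svn-repo/subtree-A", 1234, LOG)
name_b = tailored_name("upstream-svn-repo/subtree-B", 1235, LOG)

print(name_a)
print(name_b)
print(name_a != name_b)
```

Because the prefix differs whenever the upstream revision or subtree differs, the resulting patch names differ too, even when everything else about the two changes is identical.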
msg4164 (view) |
Author: droundy |
Date: 2008-04-01.13:11:23 |
|
On Mon, Mar 31, 2008 at 10:58:11PM -0000, Lele Gaifax wrote:
> So the solution for both issues, at least from the tailor point of
> view, is just a matter of differentiating in some way the
> patch-name-format option.... that is, trusting tailor's default ;-)
Ah, that explains why you haven't fixed this! :) (i.e. it's already been
fixed by default.)
> I'll add a note about this in the README.
Thanks!
--
David Roundy
Department of Physics
Oregon State University
|
|
Date | User | Action | Args |
2008-03-31 14:44:42 | lele | create | |
2008-03-31 14:53:23 | droundy | set | status: unread -> unknown; nosy: + darcs-devel, droundy; messages: + msg4131 |
2008-03-31 14:53:50 | droundy | set | priority: bug; nosy: droundy, tommy, beschmi, kowey, darcs-devel, lele |
2008-03-31 14:53:59 | droundy | set | nosy: - darcs-devel |
2008-03-31 15:20:56 | droundy | set | nosy: + darcs-devel; messages: + msg4134 |
2008-03-31 15:38:35 | droundy | set | nosy: droundy, tommy, beschmi, kowey, darcs-devel, lele; messages: + msg4139 |
2008-03-31 15:38:51 | droundy | set | status: unknown -> duplicate; nosy: droundy, tommy, beschmi, kowey, darcs-devel, lele; superseder: + patch ids are not collision-free |
2008-03-31 16:52:43 | lele | set | nosy: droundy, tommy, beschmi, kowey, darcs-devel, lele; messages: + msg4146 |
2008-03-31 18:32:00 | droundy | set | nosy: droundy, tommy, beschmi, kowey, darcs-devel, lele; messages: + msg4148 |
2008-03-31 19:39:05 | zooko | set | nosy: + zooko; messages: + msg4152 |
2008-03-31 20:29:24 | droundy | set | nosy: droundy, tommy, beschmi, kowey, darcs-devel, zooko, lele; messages: + msg4153 |
2008-03-31 20:33:04 | droundy | set | nosy: - droundy, darcs-devel |
2008-03-31 20:47:05 | zooko | set | nosy: + darcs-devel, droundy; messages: + msg4156 |
2008-03-31 20:55:40 | droundy | set | nosy: droundy, tommy, beschmi, kowey, darcs-devel, zooko, lele; messages: + msg4157 |
2008-03-31 22:58:11 | lele | set | nosy: droundy, tommy, beschmi, kowey, darcs-devel, zooko, lele; messages: + msg4161 |
2008-04-01 13:11:24 | droundy | set | nosy: droundy, tommy, beschmi, kowey, darcs-devel, zooko, lele; messages: + msg4164 |
2008-04-01 13:16:42 | droundy | set | status: duplicate -> resolved; nosy: droundy, tommy, beschmi, kowey, darcs-devel, zooko, lele |
2009-08-06 17:57:45 | admin | set | nosy: + markstos, jast, Serware, dmitry.kurochkin, dagit, mornfall, simon, thorkilnaur, - droundy, lele |
2009-08-06 21:01:29 | admin | set | nosy: - beschmi |
2009-08-10 22:19:03 | admin | set | nosy: + lele, - markstos, jast, dagit, Serware, mornfall |
2009-08-25 18:08:15 | admin | set | nosy: - simon |
2009-08-27 13:57:23 | admin | set | nosy: tommy, kowey, darcs-devel, zooko, lele, thorkilnaur, dmitry.kurochkin |
|