darcs

Issue 815 corrupt patch in darcs-2

Title corrupt patch in darcs-2
Priority bug Status duplicate
Milestone Resolved in
Superseder rmfile patch created without the actual content removal (tailor)
View: 693
Nosy List darcs-devel, dmitry.kurochkin, ertai, kowey, thorkilnaur, tommy, zooko
Assigned To
Topics

Created on 2008-04-25.16:56:58 by zooko, last changed 2009-10-24.00:05:53 by admin.

Files
File name Uploaded Type Edit Remove
0000000142-8a9a56b2d267a619e1931053128a1edc6db2da2fd9545ef33dfa789ce71c261e zooko, 2008-04-25.16:56:56 application/octet-stream
bash_history-from-issue815 zooko, 2008-04-26.01:51:16 application/octet-stream
ps.tar.bz2 zooko, 2008-04-26.01:50:28 application/octet-stream
trackbug2.tar.bz2 zooko, 2008-05-01.18:30:42 application/octet-stream
Messages
msg4335 (view) Author: zooko Date: 2008-04-25.16:56:56
Folks:

This is the first serious bug I've encountered in darcs-2.  I  
recorded a patch, and it offered me the "rm file" change without  
first offering me the "delete every line of the file change".  I  
thought this was mildly unusual, but I assumed that darcs knew what  
it was doing and recorded that patch and pushed it into the central,  
append-only repository for my project:

http://allmydata.org/trac/tahoe/changeset/2504

However, now certain operations on this repository yield darcs errors:

> allmydata/tahoe/trunk-bug$ darcs query contents --quiet --match  
> "hash 20071207003658- 
> e01fd-9c5c4455756f14fb24bf465869d43a9b78e7d1e0.gz" "src/allmydata/ 
> client.py"
>
> darcs failed:  Error applying hunk to file ./misc/hatch-eggs.py

Or equivalently:

http://allmydata.org/trac/tahoe/browser/src/allmydata/client.py?rev=1656

This is a major problem for me -- the central, canonical repository  
for the open source project and company seems to have some corruption  
now (although the consequences of this corruption are minor).

So I have a few questions:

1.  Why did darcs-2 offer to record removal of a file without  
recording removal of its contents?

2.  How can I repair my central repository with minimal disruption to  
the other programmers who rely on it?

Regards,

Zooko

attached is the offending patch file:
Attachments
msg4339 (view) Author: kowey Date: 2008-04-25.17:15:10
Escalating this to David-level (i.e. making him nosy)
msg4340 (view) Author: kowey Date: 2008-04-25.17:18:28
Zooko, could you make a cp -R copy of the repository and see what happens if you
unpull the offending patch in your local copy?  If that succeeds, maybe doing
the unpull is the way to go?

Users who have already pulled the patch will have do the same

Also, a darcs check on your local repo might help.

Hmm... maybe I should have asked for these before escalating.
msg4353 (view) Author: zooko Date: 2008-04-25.18:39:46
On Apr 25, 2008, at 11:18 AM, Eric Kow wrote:
> Zooko, could you make a cp -R copy of the repository and see what  
> happens if you
> unpull the offending patch in your local copy?

Yes, I will, and just so you know, you can try the same experiment  
yourself:

$ time darcs get http://allmydata.org/source/tahoe/trunk-hashedformat
# (took 40 seconds)

$ darcs check
The repository is consistent!

$ darcs oblit -p'hatch-eggs'
Wed Apr 23 14:41:56 MDT 2008  zooko@zooko.com
   * setup: remove the misc/hatch-eggs.py script, which we no longer  
use since [1937]
Shall I obliterate this patch? (1/1)  [ynWsfvpxdaqjk], or ? for help: y
Finished obliterating.
HACK wonwin-mcbrootles-computer:~/darcsbug/trunk-hashedformat$ darcs  
check
Looks like we have a difference...
Difference:  hunk ./misc/hatch-eggs.py 1
... I hereby ellide the entire contents of that file in order to send  
a shorter e-mail ...

Inconsistent repository!

I also tried the same thing with http://allmydata.org/source/tahoe/ 
trunk (which is in darcs-1-format a.k.a. old-fashioned-inventory  
format), and got the same result.  (Except the darcs get took 8  
minutes instead of 40 seconds.)

Then I tried "darcs repair" and then "darcs check".  The combination  
of obliterate followed by repair fixed it in both types of repository.

> If that succeeds, maybe doing
> the unpull is the way to go?
>
> Users who have already pulled the patch will have do the same

This would be either extremely inconvenient, or impossible, or would  
result in my development team abandoning darcs for good.  So, there's  
a chance it might work.  ;-)

Is there any way that a future version of darcs could be made  
'robust' enough to do something sensible with a repository that is  
corrupted in this sense?  For example, if "darcs repair" would detect  
that there was a patch that created this file, and added contents  
into it, and a later patch that removed the file, then darcs repair  
could decide that the later patch really had to have deleted the  
contents.

So, supposing that my development team would put up with obliterating  
the offending patch and then running darcs repair on all of our  
repositories.  This would also entail doing a resync of our trac  
instance [1] in order to make the obliterated patch disappear from  
the trac timeline (and cease offsetting the trac count-of-patches  
"revision number' by 1), and it might mean that the version numbers  
on some of our automatically built packages [2, 3] would be wrong.   
All of this just might be acceptable, if we understood what had gone  
wrong and knew that we could avoid this happening again in the future.

But at the moment, I don't understand why darcs-2 produced a patch  
like this.  I've reviewed the relevant part of my shell history  
(appended).  It looks like I ran "darcs amend" shortly after  
recording the relevant patch.  Perhaps I accidentally amended it to  
leave out the hunk while retaining the rm?  Or I amended it to add  
the rm without adding the hunk?  In my experience "darcs amend"  
correctly enforces the consistency constrain on this sort of thing,  
so I don't know how this happened.  The exact-version of darcs that  
I'm using is appended.

Regards,

Zooko

[1] http://allmydata.org
[2] http://allmydata.org/source/tahoe/tarballs/
[3] http://allmydata.org/debian/

------- included shell history
rm misc/hatch-eggs.py
darcs record --ask
time darcs pull ../prod/
time make test 2>&1 | tee zooko.make_test.log.txt
darcs amend
cd ../prod/
darcs oblit
time darcs pull ../trunk
darcs oblit
time make test 2>&1 | tee zooko.make_test.log.txt
time darcs push zooko@dev.allmydata.com:/home/darcs/tahoe/prod
darcs push ../trunk
cd ../trunk
time darcs push zooko@dev.allmydata.com:/home/darcs/tahoe/trunk
-------

Hm.  I think that there may have been an earlier patch that rm'ed  
misc/hatch-eggs.py, in the "../prod" repository, and the "oblit" in  
prod was to remove that old patch.  I'm not sure.

------- included darcs exact-version
darcs compiled on Apr  8 2008, at 18:37:34
# configured Tue Apr  8 18:35:12 PDT 2008
./configure --prefix=/usr/local/stow/darcs-2.0.0

Context:

[update README url links
gwern0@gmail.com**20080407201333]

[add a bit more debugging info to repository identification.
David Roundy <droundy@darcs.net>**20080408151912]

[resolve issue385: don't worry if we can't get local changes.
David Roundy <droundy@darcs.net>**20080408151823]

[add test for issue385.
David Roundy <droundy@darcs.net>**20080408151808]

[resolve issue786:  implement oops command.
David Roundy <droundy@darcs.net>**20080408145856]

[fix URL in network test.
David Roundy <droundy@darcs.net>**20080408145407]

[fix manual bug.
David Roundy <droundy@darcs.net>**20080407191736]

[add new show bug command (hidden) to see what darcs will report if  
we encounter a bug.
David Roundy <droundy@darcs.net>**20080407175410]

[automatically work out the version of the stable release.
David Roundy <droundy@darcs.net>**20080407171850]

[set prefs again (they got lost on convert).
David Roundy <droundy@darcs.net>**20080407171559]

[update darcs repository URL.
David Roundy <droundy@darcs.net>**20080407164601]

[fix up website for new release.
David Roundy <droundy@darcs.net>**20080407164010]

[simplify determine_release_state.pl.
David Roundy <droundy@darcs.net>**20080407153000]

[make determine_release_state.pl use changes --count.
David Roundy <droundy@darcs.net>**20080407152347]

[add --count output option to changes.
David Roundy <droundy@darcs.net>**20080407151825]

[TAG 2.0.0
David Roundy <droundy@darcs.net>**20080407150638]
-------
msg4355 (view) Author: zooko Date: 2008-04-25.19:07:38
I tried one more experiment:

I converted a repository (hashed-format) to darcs-2-format.  It was  
exactly the same -- 'darcs check' passed, but 'darcs query contents -- 
quiet --match "hash 20071207003658- 
e01fd-9c5c4455756f14fb24bf465869d43a9b78e7d1e0.gz" "src/allmydata/ 
client.py"' gives an error, obliterating the corrupted patch yielded  
and inconsistent repository, and darcs repair fixed it.

Regards,

Zooko
msg4357 (view) Author: zooko Date: 2008-04-26.01:50:28
I've been unable to reproduce this from a clean starting condition.  (I
basically spent all day today alternately trying to reproduce it and to devise a
fix for it.)  My best guess at this point (thanks to Simon Michael for the idea)
is that I had a corrupted patch that was generated by a prerelease of darcs-2
sitting around in my repo or in my global cache, and that patch caused the
modern corrupted patch (attached) to be produced by darcs-2, specifically by
getting darcs-2 to prompt me whether I wanted to record the rmfile change
without prompting me about the delete-the-lines change.  (This is similar to
what we suspect might have happened to me with issue687.)

This sounds a little far-fetched, but not impossible.  Does anyone know of bugs
in prereleases of darcs-2 that could have yielded corrupted patches?  I suppose
"darcs record" or "darcs amend" could have prompted me for the rmfile and not
the hunk if the pristine showed that the file was currently of zero length. 
Does anyone know of a bug in darcs before the 2.0 release that could have
corrupted pristine in such a way?

I scanned my global cache for patches having to do with hatch-eggs.py (attached,
as ps.tar.bz2), but saw nothing obviously culpable.  (It is strange that two of
them are identical except for curly brackets in the darcs patch syntax.)

Here is the complete shell history of the session which produced this patch on ,
attached as "bash_history-from-issue815".

If anyone has further suggestions for how to diagnose how this could have
happened, I would love to hear them.  I have made a .tar.bz2 of my entire (very
large) global cache in order to be able to inspect it later in search of
corrupted patches as soon as I know how to identify them.

Now, as to what to do about it I have two ideas:

First, it might be worth it for darcs apply to detect and reject bogus patches
like this.  This would have prevented the corrupt patch from getting into our
project's central, canonical, append-only repository, because when I pushed I
would have gotten "inapplicable patch!" because darcs had noticed that the patch
tried to rm a file while the file still had hunks.

Likewise, such a change would cause "darcs check" to raise the alarm on this
patch.  Currently, with darcs-2.0 release or with current trunk, "darcs check"
gives a repository the blessing of "consistent" though in fact it isn't
consisten because it contains a patch like this.

(Thanks to Ganesh Sittampalam for that idea.)

Second, in this particular case, or possibly in a general purpose "darcs
repair"-like tool, one could fix this patch by simply determining what the file
contents have to be and inserting a hunk to remove those lines.  One then has to
fix the hash, and then one has to learn more about how darcs works than one
already knows in order to make the new patch take the place of the old
(corrupted) one.  :-)  (No-one to blame but myself for that idea.)

Regards,

Zooko
Attachments
msg4358 (view) Author: droundy Date: 2008-04-26.11:47:21
It looks like this isn't an isolate occurrence in your repository, Zooko.  :(

When I instrument darcs to add checks for this sort of occurrence (which we used
to have, but got removed in the name of optimization, I believe), I get

$ time darcs check
Unapplicable patch:
Fri Nov  9 16:21:12 PST 2007  zooko@zooko.com
  * remove parts of pycrypto that we are no longer going to use: SHA256 and RSA

darcs failed:  Cannot remove non-empty file
./src/allmydata/Crypto/PublicKey/__init__.py

Which shows that this has happened more than once.  :(
msg4359 (view) Author: droundy Date: 2008-04-26.12:05:29
I think it would indeed be possible (and reasonable) to add a special case in
repair for this situation.  And then I'd like to make it so that check fails on
this sort of corruption (as I just did, in my local darcs repository).

I'm not sure when I'll have time to make this fix myself--my wrists are hurting
already (at 8:00 on a Saturday morning), so I need to avoid using the computer
much today.

On the plus side, it seems that this is the kind of corruption you've happily
lived with since November, so waiting a little while for a fix shouldn't hurt. 
I don't see any fundamental challenge in making repair handle this.  It's just
very, very tedious, because it requires a pretty fundamental change in the
behavior of repair.

I suppose we could generalize this by writing a

data OkayFixedOrWrong p = Okay | Wrong | Fixed p

checkAndTryToFixPrim :: Slurpy -> Prim -> OkayFixedOrWrong (FL Prim)

which fixes patches that can be fixed, and complains about patches that are
wrong.  (And, of course, this could be used to implement the same functionality
at a higher level in RepoPatch.)

This would allow us to reuse the same infrastructure for other situations where
we might be able to fix a corrupt repository.
msg4360 (view) Author: droundy Date: 2008-04-26.12:06:38
Oh, and I should add that I'm not aware of any bugs that could lead to this
corruption, so we have to assume the bug is still present.  So attempts to
reproduce this would be appreciated...
msg4361 (view) Author: zooko Date: 2008-04-26.12:34:44
> It looks like this isn't an isolate occurrence in your repository,  
> Zooko.  :(
>
> When I instrument darcs to add checks for this sort of occurrence  
> (which we used
> to have, but got removed in the name of optimization, I believe), I  
> get
>
> $ time darcs check
> Unapplicable patch:
> Fri Nov  9 16:21:12 PST 2007  zooko@zooko.com
>   * remove parts of pycrypto that we are no longer going to use:  
> SHA256 and RSA

November 9, 2007!  So that means that this happened with darcs-1.0.9  
or an interim version of darcs between 1.0.9 and darcs-2.0.

> Oh, and I should add that I'm not aware of any bugs that could lead  
> to this
> corruption, so we have to assume the bug is still present.  So  
> attempts to
> reproduce this would be appreciated...

I've been trying to figure out what could be causing this.  I tend to  
use obliterate and amend freely, and to navigate through large  
collections of changes with the interactive commands.

I remember in the past there was a bug in darcs in which if you made  
some interactive choices, and then navigated back up with "k" and  
made an incompatible choice, darcs would be confused.

However, yesterday I tried to reproduce this by hitting "n" to a hunk  
of removed lines, so darcs skipped the subsequent rmfile change, then  
I navigated back up to the rmfile change with "k" and hit "y", and  
darcs did the right thing which was to include the hunk of removed  
lines in the resulting patch.

Now, when the most recent corruped patch was created, I distinctly  
remember being surprised that darcs was offering me an rmfile without  
first a hunk of removed lines, so this is a clue.  This suggests that  
at that time the pristine version was corrupted in such a way that  
darcs thought that the file was already of zero length.  I also  
remembering seeing the hunk of removed lines earlier in the session,  
so at some point during that session the pristine version must have  
been corrupted.

In order to learn more about this, perhaps more people should use  
that self-check feature that David mentioned.  I myself started  
learning a little Haskell yesterday in order to implement such a  
feature.  :-)  If other people run that check on their repositories  
and find no such corrupted patches, then at least we know that it is  
something specific to my behavior, my versions of darcs, or my  
operating systems.  If other people do find such corrupted patches,  
then that is interesting, too.

Regards,

Zooko

P.S.  Please take it easy on your hands/wrists/arms!
msg4364 (view) Author: zooko Date: 2008-04-26.13:01:16
On Apr 26, 2008, at 6:05 AM, David Roundy wrote:
>
> On the plus side, it seems that this is the kind of corruption  
> you've happily
> lived with since November, so waiting a little while for a fix  
> shouldn't hurt.

Agreed.  And thank you for your attention to this matter.

Regards,

Zooko
msg4385 (view) Author: kowey Date: 2008-04-29.18:16:51
Zooko: I thought you might be interested in issue623
msg4386 (view) Author: kowey Date: 2008-04-29.18:18:23
Correction, issue693
msg4432 (view) Author: zooko Date: 2008-05-01.18:30:42
Now I find, in a different repository, the opposite problem -- there is a patch
which attempts to remove a file, and correctly removes the contents of the file
first, but there is no other patch which created that file or set those
contents.  Attached as "trackbug2.tar.bz2".
Attachments
msg4433 (view) Author: zooko Date: 2008-05-01.20:29:19
Oh dear.  Today, for reasons that I don't understand, any browsing of the
allmydata.org source code yields this error:

http://allmydata.org/trac/tahoe/browser

Running (cd /home/source/darcs/tahoe/trunk-hashedformat; TZ=UTC darcs query
contents --quiet --match "hash
20080104182742-92b7f-e8fd5c081c1f42ba0515e8d9f5ea23947e51c0c8.gz" "COPYING.GPL")
failed: 2, darcs failed: Error applying hunk to file ./misc/hatch-eggs.py :

This makes it more urgent for me to fix this.
msg4434 (view) Author: zooko Date: 2008-05-01.20:30:17
Looks like this is a duplicate of issue693.
History
Date User Action Args
2008-04-25 16:56:58zookocreate
2008-04-25 16:57:55zookolinkissue816 superseder
2008-04-25 17:15:12koweysetpriority: bug
nosy: + kowey, droundy
status: unread -> unknown
messages: + msg4339
2008-04-25 17:18:29koweysetnosy: droundy, tommy, beschmi, kowey, zooko
messages: + msg4340
2008-04-25 18:39:50zookosetnosy: droundy, tommy, beschmi, kowey, zooko
messages: + msg4353
2008-04-25 19:07:41zookosetnosy: droundy, tommy, beschmi, kowey, zooko
messages: + msg4355
2008-04-26 01:50:30zookosetfiles: + ps.tar.bz2
nosy: droundy, tommy, beschmi, kowey, zooko
messages: + msg4357
2008-04-26 01:51:17zookosetfiles: + bash_history-from-issue815
nosy: droundy, tommy, beschmi, kowey, zooko
2008-04-26 11:47:23droundysetnosy: droundy, tommy, beschmi, kowey, zooko
messages: + msg4358
2008-04-26 12:05:31droundysetnosy: droundy, tommy, beschmi, kowey, zooko
messages: + msg4359
2008-04-26 12:06:40droundysetnosy: droundy, tommy, beschmi, kowey, zooko
messages: + msg4360
2008-04-26 12:34:47zookosetnosy: droundy, tommy, beschmi, kowey, zooko
messages: + msg4361
2008-04-26 13:01:19zookosetnosy: droundy, tommy, beschmi, kowey, zooko
messages: + msg4364
2008-04-29 18:16:53koweysetnosy: + dagit
superseder: + get => out of memory (only on local filesystem!)
messages: + msg4385
2008-04-29 18:18:25koweysetnosy: droundy, tommy, beschmi, kowey, zooko, dagit
superseder: + rmfile patch created without the actual content removal (tailor), - get => out of memory (only on local filesystem!)
messages: + msg4386
2008-05-01 18:30:44zookosetfiles: + trackbug2.tar.bz2
nosy: droundy, tommy, beschmi, kowey, zooko, dagit
messages: + msg4432
2008-05-01 20:29:21zookosetnosy: droundy, tommy, beschmi, kowey, zooko, dagit
messages: + msg4433
2008-05-01 20:30:18zookosetstatus: unknown -> duplicate
nosy: droundy, tommy, beschmi, kowey, zooko, dagit
messages: + msg4434
2008-09-16 10:40:36ertaisetnosy: + ertai
2009-08-06 17:58:11adminsetnosy: + markstos, jast, Serware, dmitry.kurochkin, darcs-devel, mornfall, simon, thorkilnaur, - droundy, ertai
2009-08-06 21:03:13adminsetnosy: - beschmi
2009-08-10 22:20:00adminsetnosy: + ertai, - markstos, darcs-devel, jast, Serware, mornfall
2009-08-11 00:11:12adminsetnosy: - dagit
2009-08-25 18:08:27adminsetnosy: + darcs-devel, - simon
2009-08-27 13:58:05adminsetnosy: tommy, kowey, darcs-devel, zooko, thorkilnaur, ertai, dmitry.kurochkin
2009-10-23 22:40:50adminsetnosy: + nicolas.pouillard, - ertai
2009-10-24 00:05:53adminsetnosy: + ertai, - nicolas.pouillard