I have darcs 2.3.1 on server (linux) and client (osx) and this has been
working fine for months. Recently darcs get over http started hanging
for one of the three repositories. You can see this happening on this
buildbot:
http://tahoe-lafs.org/buildbot-zfec/builders/mac-amd64-osx-10.4.11-ootles
e.g.:
http://tahoe-lafs.org/buildbot-zfec/builders/mac-amd64-osx-10.4.11-ootles/builds/30
It doesn't happen for the same client and server on other repositories,
e.g.:
http://tahoe-lafs.org/buildbot/builders/zooko%20ootles%20Mac-amd64%2010.4/builds/409
Other builders building at the same time from the same repo on the same
server but from a different client had no problem, e.g.:
http://tahoe-lafs.org/buildbot-zfec/builders/buildbot.rubenkerkhof.com/builds/19
This happens with this newly created repo (ticket1026-hashedformat) but
not with the repo that it came from (trunk-hashedformat).
I deleted the ~/.darcs/cache on the server side--this didn't change the
behavior.
I upgraded the client from 2.3.1 to 2.4.4 and that didn't change the
behavior. I upgraded the server from 2.3.1 to 2.4.4 and that didn't
change the behavior.
I ran "darcs check" and it reported all clean.
I rebooted the server and that didn't change the behavior.
Next I decided to try backing up the repo and then obliterating recent
patches to that branch. My first such attempt yielded surprising errors:
source@org:~/darcs/zfec/ticket1026-hashedformat$ darcs oblit
Sun May 23 14:59:52 PDT 2010 zooko@zooko.com
* setup: add misc/run_trial.py
Shall I obliterate this patch? (1/15) [ynWvplxdaqjk], or ? for help: y
Sun May 23 14:52:46 PDT 2010 zooko@zooko.com
* setup: bundle the latest zetuptoolz as unpacked source egg and
change setup.py to load it directly instead of with ez_setup.py
Shall I obliterate this patch? (2/15) [ynWvplxdaqjk], or ? for help: d
Warning: ./zfec/misc/run_trial.py-0: takeFile ./zfec/misc/run_trial.py-0
in /home/source/darcs/zfec/ticket1026-hashedformat: openFd: permission
denied (Permission denied)
Warning: Cannot remove non-empty file ./zfec/misc/run_trial.py
Warning: Not deleting ./zfec/misc because it is not empty.
Finished obliterating.
Hm, this means that there are permissions problems on the directory, right?
source@org:~/darcs/zfec/ticket1026-hashedformat$ ls -al zfec/misc/
total 12
drwxr-sr-x 2 zooko source 4096 May 23 15:09 .
drwxrwsr-x 8 source source 4096 May 23 15:09 ..
-rw-r--r-- 1 zooko source 44 May 23 15:09 run_trial.py
Yes the "source" user has read but not write permissions on that directory.
However, running chmod -R g+u on the repo didn't fix the problem. So I
restored the repo from backup, did the chmod, and then tried the oblit
again:
source@org:~/darcs/zfec/ticket1026-hashedformat$ darcs oblit
Sun May 23 14:59:52 PDT 2010 zooko@zooko.com
* setup: add misc/run_trial.py
Shall I obliterate this patch? (1/15) [ynWvplxdaqjk], or ? for help: y
Sun May 23 14:52:46 PDT 2010 zooko@zooko.com
* setup: bundle the latest zetuptoolz as unpacked source egg and
change setup.py to load it directly instead of with ez_setup.py
Shall I obliterate this patch? (2/15) [ynWvplxdaqjk], or ? for help: d
Finished obliterating.
Next I tried the darcs get again, and it still hangs, but now it hangs
in a different place! Every run of the darcs get up until now has hung
after outputting like this (with --version --debug --timing --partial):
Sun May 23 23:16:45 MDT 2010: Adding new file to existing UrlRequest.
Sun May 23 23:16:45 MDT 2010: URL.waitUrl
http://tahoe-lafs.org/source/zfec/ticket1026-hashedformat/_darcs/pristine.hashed/0000000284-8d5c515fc53ccd3deada935baa77b78f474dbc37129a8ead02d4b80226d71ec3
But now that I've obliterated this patch, now it ends like this:
URL.waitUrl
http://tahoe-lafs.org/source/zfec/ticket1026-hashedformat/_darcs/pristine.hashed/0000000284-21808dcf8927e10480f7351b7ec5977531b23e031a79a5338325d0470837d38c
Sun May 23 23:21:28 MDT 2010: Adding new file to existing UrlRequest.
oh and when I try the same thing again then this time it ends like this:
Sun May 23 23:22:28 MDT 2010: Adding new file to existing UrlRequest.
Sun May 23 23:22:28 MDT 2010: URL.waitUrl
http://tahoe-lafs.org/source/zfec/ticket1026-hashedformat/_darcs/pristine.hashed/0000000284-21808dcf8927e10480f7351b7ec5977531b23e031a79a5338325d0470837d38c
Oh I just noticed that I have "--partial" in there. That's not right.
What if I replace it with "--lazy"... Same thing:
Sun May 23 23:25:07 MDT 2010: Adding new file to existing UrlRequest.
Sun May 23 23:25:07 MDT 2010: URL.waitUrl
http://tahoe-lafs.org/source/zfec/ticket1026-hashedformat/_darcs/pristine.hashed/0000000284-21808dcf8927e10480f7351b7ec5977531b23e031a79a5338325d0470837d38c
Okay, I'll try obliterating some more patches and see what happens...
source@org:~/darcs/zfec/ticket1026-hashedformat$ darcs oblit
Sun May 23 14:52:46 PDT 2010 zooko@zooko.com
* setup: bundle the latest zetuptoolz as unpacked source egg and
change setup.py to load it directly instead of with ez_setup.py
Shall I obliterate this patch? (1/14) [ynWvplxdaqjk], or ? for help: y
Sun May 23 14:13:42 PDT 2010 zooko@zooko.com
* setup: remove bundled copy of zetuptoolz -- we can just rely on the
ez_setup.py bootstrapping script installing the normal setuptools (or
whatever passes for the normal setuptools on your system)
Shall I obliterate this patch? (2/14) [ynWvplxdaqjk], or ? for help: d
Finished obliterating.
Now it hangs like this:
Sun May 23 23:25:48 MDT 2010: Adding new file to existing UrlRequest.
Sun May 23 23:25:48 MDT 2010: URL.waitUrl
http://tahoe-lafs.org/source/zfec/ticket1026-hashedformat/_darcs/pristine.hashed/0000000284-372cc4d342e54de2702fff72b6e4e2b6a99c1e37015d57cd56d2e72669214c80
Then I deleted the ~/.darcs/cache on the client side--that didn't change
the behavior.
Then I obliterated another patch:
source@org:~/darcs/zfec/ticket1026-hashedformat$ darcs oblit
Sun May 23 14:13:42 PDT 2010 zooko@zooko.com
* setup: remove bundled copy of zetuptoolz -- we can just rely on the
ez_setup.py bootstrapping script installing the normal setuptools (or
whatever passes for the normal setuptools on your system)
Shall I obliterate this patch? (1/13) [ynWvplxdaqjk], or ? for help: y
Sat May 15 20:01:22 PDT 2010 zooko@zooko.com
* setup: new improved show-tool-versions.sh
Shall I obliterate this patch? (2/13) [ynWsfvplxdaqjk], or ? for help: d
Finished obliterating.
Now it hangs on:
Sun May 23 23:28:00 MDT 2010: Adding new file to existing UrlRequest.
Sun May 23 23:28:00 MDT 2010: URL.waitUrl
http://tahoe-lafs.org/source/zfec/ticket1026-hashedformat/_darcs/pristine.hashed/0000000284-7bcd4756519929c587b779cf72c63114aeb9e0650c94223717271ba124945cc2
Now I have obliterated all the patches that distinguish the
ticket1026-hashedformat branch from the trunk-hashedformat branch! And
yet if I try the darcs get again on the ticket1026-hashedformat branch
then the get hangs, but if I do the same command on the
trunk-hashedformat branch then the get succeeds.
Okay, I stop here. I will attach an archive file containing the complete
current contents of the ticket1026-hashedformat and trunk-hashedformat
directories, named "both.7z".
Aha! But wait! I just ran "darcs optimize --uncompress" on the
ticket1026-hashedformat directory and now the darcs get from it
succeeds! Here I now attach the archive showing the contents of that
directory after optimize --uncompress, named
"ticket1026-hashedformat-uncompressed.7z".
Just to be sure, I went back and restored to the version of
ticket1026-hashedformat from the both.7z archive, and then the client
quickly completed its darcs get until I thought to remove the client's
cache after which the client went back to hanging whenever I did a darcs
get.
Now, I did some darcs obliterating in the ticket1026-hashedformat
repository earlier today, so if there is a bug in darcs obliterate in
darcs-2.3.1 then that might explain the repository corruption.
Oh wait. Dammit. I tried to reproduce it one last time by running "darcs
optimize --uncompress" again and this time it did *not* fix the problem,
even after removing the client cache. :-( So now I'm confused. Also I
just noticed this in the get output from the failed gets:
Sun May 23 23:42:48 MDT 2010: URL.waitNextUrl succeeded:
http://tahoe-lafs.org/source/zfec/ticket1026-hashedformat/_darcs/prefs/prefs
_darcs/pref
s/prefs
Sun May 23 23:42:48 MDT 2010: Done identifying repository .
Sun May 23 23:42:48 MDT 2010: Identified darcs-1 repo:
/Users/zfecbuildslave/tmp/build
Sun May 23 23:42:48 MDT 2010: Writing working directory contents...
Sun May 23 23:42:48 MDT 2010: Copying hashed pristine tree:
0000000284-7bcd4756519929c587b779cf72c63114aeb9e0650c94223717271ba124945cc2
Sun May 23 23:42:48 MDT 2010: Beginning copying pristine
Sun May 23 23:42:48 MDT 2010: Reading hash file
0000000284-7bcd4756519929c587b779cf72c63114aeb9e0650c94223717271ba124945cc2
from pristine.hashed/
Sun May 23 23:42:48 MDT 2010: I'm doing copyFileUsingCache on
pristine.hashed/0000000284-7bcd4756519929c587b779cf72c63114aeb9e0650c94223717271ba1
24945cc2
Sun May 23 23:42:48 MDT 2010: URL.copyUrlWithPriority
(http://tahoe-lafs.org/source/zfec/ticket1026-hashedformat/_darcs/pristine.hashed/000000028
4-7bcd4756519929c587b779cf72c63114aeb9e0650c94223717271ba124945cc2
->
/home/source/.darcs/cache/pristine.hashed/0000000284-7bcd4756519929c587b779cf72c63114aeb9e0650c94223717271ba124945cc2)
Sun May 23 23:42:48 MDT 2010: URL.urlThread
(http://tahoe-lafs.org/source/zfec/ticket1026-hashedformat/_darcs/pristine.hashed/0000000284-7bcd4756
519929c587b779cf72c63114aeb9e0650c94223717271ba124945cc2
->
/home/source/.darcs/cache/pristine.hashed/0000000284-7bcd4756519929c587b779cf72c63114aeb9e0650c94223717271ba124945cc2)
Sun May 23 23:42:48 MDT 2010: URL.request_url
(http://tahoe-lafs.org/source/zfec/ticket1026-hashedformat/_darcs/pristine.hashed/0000000284-7bcd47
56519929c587b779cf72c63114aeb9e0650c94223717271ba124945cc2
->
/home/source/.darcs/cache/pristine.hashed/0000000284-7bcd4756519929c587b779cf72c63114aeb9e0650c94223717271ba124945cc2)
Sun May 23 23:42:48 MDT 2010: URL.waitUrl
http://tahoe-lafs.org/source/zfec/ticket1026-hashedformat/_darcs/pristine.hashed/0000000284-7bcd4756519
929c587b779cf72c63114aeb9e0650c94223717271ba124945cc2
Sun May 23 23:42:48 MDT 2010: Failed to start download URL
http://tahoe-lafs.org/source/zfec/ticket1026-hashedformat/_darcs/pristine.hashed/00000
00284-7bcd4756519929c587b779cf72c63114aeb9e0650c94223717271ba124945cc2:
fopen() failed
Sun May 23 23:42:48 MDT 2010: Failed to download URL
http://tahoe-lafs.org/source/zfec/ticket1026-hashedformat/_darcs/pristine.hashed/0000000284-
7bcd4756519929c587b779cf72c63114aeb9e0650c94223717271ba124945cc2:
fopen() failed
Sun May 23 23:42:48 MDT 2010: In fetchFileUsingCachePrivate I'm going
manually
Sun May 23 23:42:48 MDT 2010: getting
0000000284-7bcd4756519929c587b779cf72c63114aeb9e0650c94223717271ba124945cc2
Sun May 23 23:42:48 MDT 2010: from
http://tahoe-lafs.org/source/zfec/ticket1026-hashedformat/_darcs/pristine.hashed/0000000284-7bcd4756519929
c587b779cf72c63114aeb9e0650c94223717271ba124945cc2
Sun May 23 23:42:48 MDT 2010: URL.copyUrlWithPriority
(http://tahoe-lafs.org/source/zfec/ticket1026-hashedformat/_darcs/pristine.hashed/000000028
4-7bcd4756519929c587b779cf72c63114aeb9e0650c94223717271ba124945cc2
-> ./darcs17275)
Sun May 23 23:42:48 MDT 2010: URL.urlThread
(http://tahoe-lafs.org/source/zfec/ticket1026-hashedformat/_darcs/pristine.hashed/0000000284-7bcd4756
519929c587b779cf72c63114aeb9e0650c94223717271ba124945cc2
-> ./darcs17275)
Sun May 23 23:42:48 MDT 2010: Adding new file to existing UrlRequest.
Sun May 23 23:42:48 MDT 2010: URL.waitUrl
http://tahoe-lafs.org/source/zfec/ticket1026-hashedformat/_darcs/pristine.hashed/0000000284-7bcd4756519
929c587b779cf72c63114aeb9e0650c94223717271ba124945cc2
See that fopen() failed? That does not appear in any log file from a
successful download and it appears in every log file from a hung
download. Well, why does that fail? That file is world-readable. Here I
am su'ing to be the www-data user and catting that file:
$ cat
/home/source/darcs/zfec/ticket1026-hashedformat/_darcs/pristine.hashed/0000000284-7bcd4756519929c587b779cf72c63114aeb9e0650c94223717271ba124945cc2
MMjD1####N;####+ڼ#########+o#.{z#ί######c7@
##X=###}O"T#######>EZ#####################
Okay now I'm really going to stop. Here is an archive that has these two
repositories which have the same set of patches from darcs's point of
view, yet pulling from the trunk-hashedformat one always succeeds and
pulling from the ticket1026-hashedformat one always gives this fopen()
failure and hangs. It is named "both-uncompressed.7z".
Attachments
|