darcs

Issue 1420 Possible bug in URL.waitNextUrl: at src/URL.hs:236 (2.2.0)

Title Possible bug in URL.waitNextUrl: at src/URL.hs:236 (2.2.0)
Priority bug Status duplicate
Milestone Resolved in
Superseder Another possible bug in URL.waitNextUrl: curl_multi_perform() - no running handles (issue 1368)
Nosy List corinna.anderson, darcs-devel, dmitry.kurochkin, kowey, mornfall, thorkilnaur, zooko
Assigned To
Topics HTTP

Created on 2009-04-07.19:13:56 by zooko, last changed 2010-03-30.13:18:29 by kowey.

Files
File name Uploaded Type
infinite-pull.sh kowey, 2009-06-03.08:44:14 application/x-sh
many-pull.sh kowey, 2009-06-03.12:08:37 application/x-sh
many-pull.sh kowey, 2009-06-04.15:14:54 application/x-sh
Messages
msg7597 (view) Author: zooko Date: 2009-04-07.19:13:54
A routine checkout just failed:

http://allmydata.org/buildbot/builders/dapper/builds/2241/steps/darcs/logs/stdio

"""
darcs: bug in darcs!
Possible bug in URL.waitNextUrl:  at src/URL.hs:236 compiled Jan 18 2009 11:53:51
I'm unable to check http://darcs.net/maintenance to see if this version is supported.
If it is supported, please report this to bugs@darcs.net
If possible include the output of 'darcs --exact-version'.
"""

The darcs executable is the darcs-2.2.0 linux binary executable  
distributed by Petr Rockai.

Regards,

Zooko
msg7637 (view) Author: kowey Date: 2009-04-09.12:19:06
I wasn't able to open that URL:
http://allmydata.org/buildbot/builders/dapper/builds/2241/steps/darcs/logs/stdio

Also, Dmitry, what do we mean when we say it's a possible bug?

Is this just transient network failure?
msg7666 (view) Author: dmitry.kurochkin Date: 2009-04-10.12:38:55
I believe this is a duplicate of issue1368. "Possible bug" means we got
something we should never get: an "inconsistent state" between the Haskell and C
parts. I reviewed the URL module and found one place where it can happen: when
we request a URL and curl cannot add it for some reason.
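
To make the "inconsistent state" concrete, here is a toy model of the failure mode described above. It is not the darcs URL module or its C glue: the class names, the `reject` parameter, and the `check_errors` switch are all invented for illustration. It only shows how ignoring a failed "add" on the C side leaves the caller believing a download is pending when nothing is running.

```python
# Toy model only: none of these names come from darcs or libcurl.
class FakeCurlMulti:
    """Stands in for the C side: holds the transfers that were accepted."""

    def __init__(self, reject=()):
        self.reject = set(reject)  # URLs this fake "curl" refuses to add
        self.handles = []

    def add(self, url):
        if url in self.reject:
            return "could not add handle"  # error string, like a C return code
        self.handles.append(url)
        return None

    def wait_next(self):
        # Next finished URL, or None when no handle is running.
        return self.handles.pop(0) if self.handles else None


class Downloader:
    """Stands in for the Haskell side: remembers which URLs are in progress."""

    def __init__(self, curl, check_errors):
        self.curl = curl
        self.check_errors = check_errors
        self.in_progress = set()

    def request(self, url):
        err = self.curl.add(url)
        if err is not None and self.check_errors:
            raise OSError(f"{url}: {err}")  # report the failure immediately
        self.in_progress.add(url)  # without the check, the two sides diverge here

    def wait_next_url(self):
        if not self.in_progress:
            return None
        done = self.curl.wait_next()
        if done is None:
            # We think a download is pending, but the C side has nothing running.
            raise RuntimeError("Possible bug in wait_next_url: inconsistent state")
        self.in_progress.discard(done)
        return done
```

With `check_errors=False` the rejected URL surfaces later as the unhelpful "possible bug" error; with `check_errors=True` the same situation is reported at request time as an ordinary error.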

--debug --debug-http would really help to identify the problem. I have asked for
logs in issue1368 but there is no response.

I failed to reproduce this myself. So without logs I can just stare at the code
and try to understand what goes wrong. Not very productive...

It can be just a network failure which is not reported properly.

The curl error handling when a URL is requested should be improved, and I plan
to fix it. Unfortunately, darcs cannot easily be built on my Debian system
anymore due to the missing haskeline dependency. It has been sitting in the new
queue for 3 weeks already... I tried to build haskeline from sources but it
failed with some weird error. I should look at it again but I am tight on free
time.

Regards,
  Dmitry
msg7668 (view) Author: zooko Date: 2009-04-10.15:18:36
kowey: you were unable to open that URL due to a large network outage in
Northern California yesterday.  You should be able to reach it now.

Dmitry: that buildbot has been doing that same "darcs get" many times a day for
months and this is the first such failure, so I guess it is a very rare race
condition that will be hard to reproduce.  Also many other of my buildbots have
been doing similar "darcs gets" during the same time, and none of them exhibited
that problem.
msg7670 (view) Author: dmitry.kurochkin Date: 2009-04-10.18:14:45
Eric, Zooko: what do you think about running buildbots with --debug and
--debug-http flags? This way we can get more info when such failures happen.

Regards,
  Dmitry
msg7671 (view) Author: zooko Date: 2009-04-10.18:29:29
Oh look, it happened again:

http://allmydata.org/buildbot/builders/clean/builds/1952/steps/darcs/logs/stdio

This buildslave is also using Petr Rockai's executable:

http://allmydata.org/buildbot/builders/feisty2.5/builds/2226/steps/show-tool-

I clicked the button to rebuild the same thing, and the second time it didn't
fail in the same way:

http://allmydata.org/buildbot/builders/clean/builds/1953

I think I will try building darcs myself on those boxes where I'm using Petr's
build.  Oh wait, no the reason I'm using Petr's build is that those boxes have
too old of a version of GHC.  Hm...  :-/
msg7674 (view) Author: kowey Date: 2009-04-10.21:49:02
On Fri, Apr 10, 2009 at 18:14:48 -0000, Dmitry Kurochkin wrote:
> Eric, Zooko: what do you think about running buildbots with --debug and
> --debug-http flags? This way we can get more info when such failures happen.

Well, this is on the tahoe buildbot, but I suppose it could be just as
useful to do it on the darcs buildbot?

I wonder if it'll make things unreadable, but then I guess we almost
never read these until we need to anyway.
msg7676 (view) Author: kowey Date: 2009-04-10.22:25:11
On Fri, Apr 10, 2009 at 18:29:32 -0000, Zooko wrote:
> Oh look, it happened again:
> 
> http://allmydata.org/buildbot/builders/clean/builds/1952/steps/darcs/logs/stdio

Could you modify the tahoe buildbot config so that it's calling darcs
with the --debug and the --debug-http flags should this happen again?
msg7682 (view) Author: dmitry.kurochkin Date: 2009-04-11.06:48:57
I did not realize that it is tahoe buildbot. I think we should turn on debug
logging for darcs buildbot in the first place. But it would be nice if Zooko
enables it for tahoe buildbot as well since the problem happens there.

Let's make it --debug only for now. --debug-http is curl debugging and would add
a great deal of unneeded info.
msg7689 (view) Author: kowey Date: 2009-04-11.12:37:12
On Sat, Apr 11, 2009 at 06:49:00 -0000, Dmitry Kurochkin wrote:
> I did not realize that it is tahoe buildbot. I think we should turn on debug
> logging for darcs buildbot in the first place.

Is this something you could look into?
 darcs get http://code.haskell.org/darcs/buildbot
and the quickstart.  My hope is that the quickstart really is quick.

It shouldn't require too many skills.  The thing I got stuck on is that
I tried googling for the buildbot API, but I couldn't find the darcs
class, so I don't know if it's easy to pass arbitrary flags to it.

Thanks!
msg7692 (view) Author: dmitry.kurochkin Date: 2009-04-11.14:15:21
On Sat, Apr 11, 2009 at 4:37 PM, Eric Kow <bugs@darcs.net> wrote:
>
> Eric Kow <kowey@darcs.net> added the comment:
>
> On Sat, Apr 11, 2009 at 06:49:00 -0000, Dmitry Kurochkin wrote:
>> I did not realize that it is tahoe buildbot. I think we should turn on debug
>> logging for darcs buildbot in the first place.
>
> Is this something you could look into?
>  darcs get http://code.haskell.org/darcs/buildbot
> and the quickstart.  My hope is that the quickstart really is quick.
>
> It shouldn't require too many skills.  The thing I got stuck on is that
> I tried googling for the buildbot API, but I couldn't find the darcs
> class, so I don't know if it's easy to pass arbitrary flags to it.

As far as I can see, the Darcs class (like the other VCS adapters) does not
support passing arbitrary arguments. I think we should go another way:
either configure darcs to add --debug via $HOME/.darcs/defaults for the
buildbot user, or add a DARCS_DEBUG environment variable to enable debug.
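
One way to picture the environment-variable option without changing darcs itself is a small wrapper that buildbot would call in place of the real binary. The sketch below is hypothetical: `DARCS_DEBUG` is only the name proposed in this message and is interpreted by the wrapper, not by darcs, and `darcs_argv` is an invented helper.

```python
import os


def darcs_argv(args, env=None, darcs="darcs"):
    """Build a darcs command line, adding --debug when DARCS_DEBUG is set.

    DARCS_DEBUG is handled here by the wrapper (an assumption for this
    sketch); darcs itself is not assumed to know about it.
    """
    env = os.environ if env is None else env
    argv = [darcs, *args]
    if env.get("DARCS_DEBUG") and args and "--debug" not in args:
        # Flags follow the subcommand, e.g. darcs get --debug URL
        argv.insert(2, "--debug")
    return argv
```

A wrapper script would then pass the resulting list to `subprocess.run`. The defaults-file route avoids the wrapper entirely, which is probably why it is listed first.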

Regards,
  Dmitry

msg7868 (view) Author: kowey Date: 2009-06-03.08:44:14
I'm attaching a script which may help us to trigger the bug.
I have a user who gets these fairly frequently.  I'll see if she has the time to
try running this script for me.
msg7870 (view) Author: dmitry.kurochkin Date: 2009-06-03.09:09:41
This still happens on the latest darcs?

Debug output (--debug, not --debug-http) is the first thing I would like to look at.

It would be really nice to get this fixed for 2.3.

Regards,
  Dmitry
msg7871 (view) Author: kowey Date: 2009-06-03.12:08:37
I'm adding Corinna Anderson to this ticket.
[I hope that Roundup does the right thing when I CC her]

Corinna reported this output:

Pulling from "http://code.haskell.org/GenI/OT-GenI"...
darcs: bug in darcs!
Possible bug in URL.waitNextUrl:  at src/URL.hs:236 compiled Jan 18 2009 11:53:51
I'm unable to check http://darcs.net/maintenance to see if this version is supported.
If it is supported, please report this to bugs@darcs.net
If possible include the output of 'darcs --exact-version'.

Corinna: two questions.  First, when you run the pull a second time,
does it succeed?

Secondly, if so we would like your help in attempting to reproduce
this bug.  Could you please do the following:

1) mkdir /tmp/darcs-test

2) save the attached script in /tmp/darcs-test

3) cd /tmp/darcs-test
   chmod u+x many-pull.sh
   ./many-pull.sh

....

The script is set to perform a darcs pull a thousand times.  If it
encounters any errors -- which I hope it does because it means we'll
have found a means of reproducing the bug -- please send us the
/tmp/darcs-log file

Thanks!
msg7885 (view) Author: kowey Date: 2009-06-04.15:09:27
We have a rough appointment on 15 June with Corinna.  If somebody could be
available on Jabber, it'd be great.

Zooko: in the meantime, maybe the many-pull.sh script (on the tracker website)
can trigger this bug for you?  If not, perhaps try with the Tahoe repository?
msg7886 (view) Author: kowey Date: 2009-06-04.15:14:54
On Thu, Jun 04, 2009 at 15:09:30 -0000, Eric Kow wrote:
> Zooko: in the meantime, maybe the many-pull.sh script (on the tracker website)
> can trigger this bug for you?  If not, perhaps try with the Tahoe repository?

Whoops!  It occurred to me that this test is meaningless without
--no-cache

Updated script attached.
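
The attached many-pull.sh is not reproduced in this thread, so the following is only a sketch of the same idea: run a command many times and keep the output of every failing run. The helper name `repeat_command` is invented here, and the darcs flags in the usage comment are the ones discussed in this issue.

```python
import subprocess


def repeat_command(cmd, times, log_path):
    """Run cmd `times` times and append the output of each failing run to log_path.

    Returns the number of runs that exited with a non-zero status.
    """
    failures = 0
    with open(log_path, "a", encoding="utf-8") as log:
        for attempt in range(1, times + 1):
            result = subprocess.run(
                cmd,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,  # keep stdout and stderr interleaved
                text=True,
                errors="replace",
            )
            if result.returncode != 0:
                failures += 1
                log.write(f"--- attempt {attempt} exited with {result.returncode} ---\n")
                log.write(result.stdout)
    return failures


# Possible use, run from inside a local clone of the repository:
#   repeat_command(
#       ["darcs", "pull", "--all", "--no-cache", "--debug",
#        "http://code.haskell.org/GenI/OT-GenI"],
#       1000, "/tmp/darcs-log")
```

As noted above, --no-cache matters for this kind of test: without it, repeated pulls of the same repository may be served from the local cache and never exercise the HTTP code.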
msg7892 (view) Author: zooko Date: 2009-06-04.18:02:00
I haven't had a chance to look at this yet, but coincidentally it  
just happened again on one of the tahoe buildbots:

http://allmydata.org/buildbot/builders/clean/builds/2001

Regards,

Zooko
msg7893 (view) Author: dmitry.kurochkin Date: 2009-06-04.18:21:44
Hi Zooko.

As I have said before, the issue is probably resolved by:

Fri Apr 10 18:51:26 MSD 2009  Dmitry Kurochkin <dmitry.kurochkin@gmail.com>
  * Properly handle errors from request_url.

And buildbot says darcs was compiled Jan 18 2009. So it does not have this patch.
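
The comparison behind that conclusion can be checked mechanically. The sketch below pulls the "compiled" timestamp out of the bug message and compares it with the patch date quoted above; the helper name `compile_time` is invented, and the MSD time zone of the patch is ignored, which does not matter with almost three months between the two dates.

```python
import re
from datetime import datetime

# Matches e.g. "compiled Jan 18 2009 11:53:51", even if wrapped across lines.
COMPILED_RE = re.compile(
    r"compiled\s+([A-Z][a-z]{2})\s+(\d{1,2})\s+(\d{4})\s+(\d{2}:\d{2}:\d{2})"
)


def compile_time(bug_message):
    """Return the build timestamp found in a darcs bug message, or None."""
    match = COMPILED_RE.search(bug_message)
    if match is None:
        return None
    month, day, year, clock = match.groups()
    return datetime.strptime(f"{month} {day} {year} {clock}", "%b %d %Y %H:%M:%S")


message = (
    "Possible bug in URL.waitNextUrl:  at src/URL.hs:236 compiled Jan 18\n"
    "2009 11:53:51"
)
patch_recorded = datetime(2009, 4, 10, 18, 51, 26)  # time zone (MSD) ignored
predates_patch = compile_time(message) < patch_recorded
```

Here `predates_patch` is true, so a binary carrying this timestamp cannot include the patch.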

Can you update darcs and see if it still happens?

Anyway, the bug message does not give us much info. To really debug the issue I
need --debug log (and if that is not enough --debug-http). Can we think of
something to run buildbot darcs checkouts (and other HTTP related commands, if
any) with --debug flag?

Regards,
  Dmitry
msg7894 (view) Author: zooko Date: 2009-06-04.19:06:42
That machine has ghc-6.6, which I think is too old to build darcs.

Maybe someone could provide me a darcs executable with darcs-2.0.2 plus this
patch and I could try it?

Also, what was that about turning on debugging on this system?  I would be happy
to, but can I do it by editing a config file instead of a command-line (because
buildbot is executing the command-line).  Maybe add "pull --debug" to
"~/.darcs/prefs/defaults" ?  Would that do it?
msg7895 (view) Author: kowey Date: 2009-06-04.19:11:29
On Thu, Jun 04, 2009 at 19:06:47 -0000, Zooko wrote:
> That machine has ghc-6.6, which I think is too old to build darcs.

For what it's worth, we're still planning on supporting GHC 6.6 for
just one last release (darcs 2.3).

Getting it to work is a bit involved, but doable:
  http://wiki.darcs.net/GHC6.6

The darcs.net server still uses GHC 6.6

> Also, what was that about turning on debugging on this system?  I would be happy
> to, but can I do it by editing a config file instead of a command-line (because
> buildbot is executing the command-line).  Maybe add "pull --debug" to
> "~/.darcs/prefs/defaults" ?  Would that do it?

pull debug
pull debug-http
pull no-cache
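
For anyone scripting this on several buildslaves, here is a small sketch that appends such lines to a defaults file only if they are missing. `ensure_defaults` is an invented helper, not part of darcs. Note also that msg7692 names `$HOME/.darcs/defaults` as the per-user file, whereas the question above mentions `~/.darcs/prefs/defaults`; as far as I know darcs reads the former (and `_darcs/prefs/defaults` inside a repository), so the path is worth double-checking before relying on it.

```python
from pathlib import Path


def ensure_defaults(path, wanted):
    """Append each line of `wanted` to the defaults file unless already present.

    Returns the list of lines that were actually added.
    """
    path = Path(path).expanduser()
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    present = {line.strip() for line in text.splitlines()}
    added = [line for line in wanted if line not in present]
    if added:
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("a", encoding="utf-8") as handle:
            if text and not text.endswith("\n"):
                handle.write("\n")  # do not glue onto an unterminated last line
            for line in added:
                handle.write(line + "\n")
    return added


# Possible use (path is an assumption, see the note above):
#   ensure_defaults("~/.darcs/defaults", ["pull debug"])
```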
msg7896 (view) Author: dmitry.kurochkin Date: 2009-06-04.19:16:14
On Thu, Jun 4, 2009 at 11:10 PM, Eric Kow <kowey@darcs.net> wrote:
> On Thu, Jun 04, 2009 at 19:06:47 -0000, Zooko wrote:
>> That machine has ghc-6.6, which I think is too old to build darcs.
>
> For what it's worth, we're still planning on supporting GHC 6.6 for
> just one last release (darcs 2.3).
>
> Getting it to work is a bit involved, but doable:
>  http://wiki.darcs.net/GHC6.6
>
> The darcs.net server still uses GHC 6.6
>
>> Also, what was that about turning on debugging on this system?  I would be happy
>> to, but can I do it by editing a config file instead of a command-line (because
>> buildbot is executing the command-line).  Maybe add "pull --debug" to
>> "~/.darcs/prefs/defaults" ?  Would that do it?
>
> pull debug
> pull debug-http
> pull no-cache

Let's avoid debug-http for now. It would add a lot of extra output.

Regards,
  Dmitry

msg7897 (view) Author: zooko Date: 2009-06-04.19:25:38
Okay I added "pull debug" and now this builder has debug information:

http://allmydata.org/buildbot/builders/clean

Also this one, which is the same buildslave:

http://allmydata.org/buildbot/builders/feisty2.5

I didn't add no-cache, because I've been occasionally getting the  
error without adding no-cache, so unless you really want me to, I  
don't think I should change that.

Also I'm trying to build a new darcs executable with Dmitry's patch.

Regards,

Zooko
msg7898 (view) Author: kowey Date: 2009-06-04.19:41:35
On Thu, Jun 04, 2009 at 13:21:41 -0600, Zooko Wilcox-O'Hearn wrote:
> I didn't add no-cache, because I've been occasionally getting the error 
> without adding no-cache, so unless you really want me to, I don't think I 
> should change that.

No, I got confused.

No-cache makes sense if you're trying to trigger the bug by repeatedly
pulling from the same repo (as in the script I mentioned).
msg7939 (view) Author: zooko Date: 2009-07-08.14:10:19
This happened again on the same buildslave:

http://allmydata.org/buildbot/builders/feisty2.5/builds/2331/steps/darcs/logs/stdio


Oh, I forgot that Dmitry asked me to try a newer version of darcs.  Maybe after
work today I can look for a statically linked binary of a new darcs for Linux. 

Also, I don't know why that machine doesn't have --debug, which I thought I had
configured...  :-(

Thanks!
msg8701 (view) Author: kowey Date: 2009-09-05.06:20:07
I believe Corinna and Zooko are experiencing the same bug.  Dmitry thinks that
it is resolved in 2.3.0.

So I intend to set up another appointment with Corinna so that we can (i)
attempt to reproduce this with 2.2.0 and (ii) see if we can still reproduce it
with 2.3.0.

Alternatively, if somebody could put together a 2.3.0 binary (Petr?), perhaps
Zooko could install it and we could wait a few months to see if it happens again.
msg10577 (view) Author: kowey Date: 2010-03-30.13:18:28
Corinna and I never managed to reproduce this with my infinite pull
script.  Anyway, I suppose we'll have to assume this is a duplicate of
issue1368...
History
Date User Action Args
2009-04-07 19:13:56 zooko create
2009-04-09 12:19:08 kowey set priority: bug
    status: unread -> waiting-for
    messages: + msg7637
    nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin
2009-04-10 12:38:58 dmitry.kurochkin set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin
    messages: + msg7666
2009-04-10 12:39:49 dmitry.kurochkin set topic: + HTTP
    nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin
    assignedto: dmitry.kurochkin
2009-04-10 15:18:38 zooko set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin
    messages: + msg7668
2009-04-10 18:14:48 dmitry.kurochkin set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin
    messages: + msg7670
2009-04-10 18:29:32 zooko set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin
    messages: + msg7671
2009-04-10 21:49:05 kowey set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin
    messages: + msg7674
2009-04-10 22:25:13 kowey set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin
    messages: + msg7676
2009-04-11 06:49:00 dmitry.kurochkin set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin
    messages: + msg7682
2009-04-11 12:37:14 kowey set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin
    messages: + msg7689
2009-04-11 14:15:24 dmitry.kurochkin set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin
    messages: + msg7692
    title: Possible bug in URL.waitNextUrl: at src/URL.hs:236 -> Possible bug in URL.waitNextUrl: at src/URL.hs:236
2009-06-03 08:44:19 kowey set files: + infinite-pull.sh
    nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin
    messages: + msg7868
2009-06-03 09:09:43 dmitry.kurochkin set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin
    messages: + msg7870
2009-06-03 12:08:42 kowey set files: + many-pull.sh
    nosy: + corinna.anderson
    messages: + msg7871
2009-06-04 15:09:30 kowey set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin, corinna.anderson
    messages: + msg7885
2009-06-04 15:14:56 kowey set files: + many-pull.sh
    nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin, corinna.anderson
    messages: + msg7886
2009-06-04 18:02:03 zooko set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin, corinna.anderson
    messages: + msg7892
2009-06-04 18:21:47 dmitry.kurochkin set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin, corinna.anderson
    messages: + msg7893
2009-06-04 19:06:47 zooko set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin, corinna.anderson
    messages: + msg7894
2009-06-04 19:11:31 kowey set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin, corinna.anderson
    messages: + msg7895
2009-06-04 19:16:16 dmitry.kurochkin set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin, corinna.anderson
    messages: + msg7896
2009-06-04 19:25:41 zooko set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin, corinna.anderson
    messages: + msg7897
2009-06-04 19:41:37 kowey set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin, corinna.anderson
    messages: + msg7898
2009-07-08 14:10:27 zooko set nosy: kowey, zooko, simon, thorkilnaur, dmitry.kurochkin, corinna.anderson
    messages: + msg7939
2009-08-25 17:43:04 admin set nosy: + darcs-devel, - simon
2009-08-27 14:20:47 admin set nosy: kowey, darcs-devel, zooko, thorkilnaur, dmitry.kurochkin, corinna.anderson
2009-09-05 06:20:10 kowey set status: waiting-for -> needs-reproduction
    nosy: + mornfall
    assignedto: dmitry.kurochkin -> kowey
    messages: + msg8701
    title: Possible bug in URL.waitNextUrl: at src/URL.hs:236 -> Possible bug in URL.waitNextUrl: at src/URL.hs:236 (2.2.0)
2010-03-30 13:18:29 kowey set status: needs-reproduction -> duplicate
    messages: + msg10577
    superseder: + Another possible bug in URL.waitNextUrl: curl_multi_perform() - no running handles
    assignedto: kowey ->