darcs

Issue 1808 Another possible bug in URL.waitNextUrl: curl_multi_perform() - no running handles

Title: Another possible bug in URL.waitNextUrl: curl_multi_perform() - no running handles
Priority: bug
Status: resolved
Milestone:
Resolved in:
Superseder:
Nosy List: darcs-devel, dmitry.kurochkin, fplccl, kowey, sionescu, thorkilnaur, twb, zooko
Assigned To:
Topics: HTTP

Created on 2010-04-02.09:33:16 by sionescu, last changed 2010-06-03.22:41:24 by kowey.

Messages
msg10618 Author: sionescu Date: 2010-04-02.09:33:15
tmp $ darcs get http://www.foldr.org/~michaelw/projects/redshank
darcs: bug at src/URL.hs:246 compiled Mar 31 2010 22:25:30
Another possible bug in URL.waitNextUrl:  curl_multi_perform() - no running handles
See http://wiki.darcs.net/index.html/BugTrackerHowto for help on bug reporting.
Identifying repository http://www.foldr.org/~michaelw/projects/redshank format

tmp $ darcs --exact-version
darcs compiled on Mar 31 2010, at 22:25:30

Context:

[TAG 2.4
Reinier Lamers <tux_rocker@reinier.de>**20100226180900
 Ignore-this: 36ce0456c214345f55a7bc5fc142e985
]

-- 
Stelian Ionescu a.k.a. fe[nl]ix
Quidquid latine dictum sit, altum videtur.
http://common-lisp.net/project/iolib
msg10619 Author: kowey Date: 2010-04-02.09:36:31
Hi Stelian,

This looks like another duplicate of issue1368.  We now have quite a
sizeable collection of these happening.  I'll send you more questions in
a moment.
msg10620 Author: kowey Date: 2010-04-02.09:39:22
Hi Stelian,

You filed issue1808, which is a duplicate of this one.  You're
definitely not alone!  We've seen at least 4 people affected by this,
but so far nobody has been able to supply us with information for
debugging.  Can you help?

Did you build Darcs yourself?  If so, could you let us know

 - GHC version
 - libcurl version
 - if pipelining is enabled
 - if you can reproduce it also while using --debug-http

Thanks!
msg10629 Author: sionescu Date: 2010-04-02.13:22:14
On Fri, 2010-04-02 at 09:39 +0000, Eric Kow wrote:
> Eric Kow <kowey@darcs.net> added the comment:
> 
> Hi Stelian,
> 
> You filed issue1808, which is a duplicate of this one.  You're
> definitely not alone!  We've seen at least 4 people affected by this,
> but so far nobody has been able to supply us with information for
> debugging.  Can you help?
> 
> Did you build Darcs yourself?  If so, could you let us know

Built using the Gentoo port. The build script is:
http://sources.gentoo.org/viewcvs.py/*checkout*/gentoo-x86/dev-vcs/darcs/darcs-2.4.ebuild

> 
>  - GHC version

tmp $ ghc --version
The Glorious Glasgow Haskell Compilation System, version 6.10.4

>  - libcurl version

net-misc/curl-7.20.0-r2 (Gentoo)

>  - if pipelining is enabled

How do I find that out ?

>  - if you can reproduce it also while using --debug-http

tmp $ darcs get --debug-http http://www.foldr.org/~michaelw/projects/redshank
* About to connect() to www.foldr.org port 80 (#0)
*   Trying 2001:470:1f0b:15ba::1... * Failed to connect to 2001:470:1f0b:15ba::1: Network is unreachable
* Success
* couldn't connect to host
* Expire cleared
* Closing connection #0
* About to connect() to www.foldr.org port 80 (#0)
*   Trying 2001:470:1f0b:15ba::1... * Failed to connect to 2001:470:1f0b:15ba::1: Network is unreachable
* Success
* couldn't connect to host
* Expire cleared
* Closing connection #0
darcs: bug at src/URL.hs:246 compiled Mar 31 2010 22:25:30
Another possible bug in URL.waitNextUrl:  curl_multi_perform() - no running handles
See http://wiki.darcs.net/index.html/BugTrackerHowto for help on bug reporting.
withSignalsHandled: Interrupted!                                              


As a first guess I'd say that this is a bug in curl, because I've seen it
in git too (oh, the irony). My network interface has a link-local
auto-configured IPv6 address, but since my ISP uses IPv4 I have no IPv6
gateway to the internet. Yet curl tries to connect to the IPv6 address
of the remote host without falling back to the IPv4 address.
My current workaround is to add the remote host's IPv4 address to
my /etc/hosts.
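
For illustration, the fallback behaviour expected here can be sketched as follows. This is only a minimal model, not curl's or glibc's actual address-selection code: `connect_with_fallback` and `fake_connect` are hypothetical names, and the addresses are documentation-range placeholders rather than the real ones for www.foldr.org.

```python
import socket


def connect_with_fallback(candidates, connect):
    """Try each (family, address) candidate in order and return the first success."""
    errors = []
    for family, address in candidates:
        try:
            return connect(family, address)
        except OSError as exc:
            # Remember the failure and move on to the next candidate
            # instead of giving up after the first unreachable address.
            errors.append((address, str(exc)))
    raise OSError(f"all {len(errors)} candidate addresses failed: {errors}")


def fake_connect(family, address):
    """Hypothetical stand-in for a real connect() on a host with no IPv6 route."""
    if family == socket.AF_INET6:
        raise OSError(101, "Network is unreachable")
    return f"connected to {address}"


# Placeholder addresses (assumptions): the IPv6 candidate is listed first,
# as in the --debug-http output, followed by an IPv4 candidate.
candidates = [
    (socket.AF_INET6, "2001:db8::1"),
    (socket.AF_INET, "192.0.2.10"),
]
result = connect_with_fallback(candidates, fake_connect)
print(result)
```

With the IPv6 attempt failing, the sketch moves on to the IPv4 candidate. The /etc/hosts workaround gets a similar effect by making the IPv4 address the one that name resolution returns.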

-- 
Stelian Ionescu a.k.a. fe[nl]ix
Quidquid latine dictum sit, altum videtur.
http://common-lisp.net/project/iolib
msg10630 Author: kowey Date: 2010-04-02.13:58:47
Hi, Zooko, Dmitry, Trent: news!

Stelian: thanks for the below.  We may *finally* be able to start making
progress on this with the details you've provided.

One more question: is this intermittent or systematic?  If it's systematic,
maybe I should not assume you have the same bug here, and re-open issue1808.

requests for Zooko
------------------
1. Please have a look at the --debug-http output and Stelian's guess below.
   Do they ring any bells?

2. Could you configure your build slave (marked as Ubuntu Gutsy at the time
   of this filing) to use darcs get --debug-http?

  darcs get --verbose --partial --repo-name build http://allmydata.org/ source/tahoe/server-hashedformat

comments for Dmitry, Trent
--------------------------
Hi guys, we're quickly entering over-Eric's-head territory (in all its
vastness).

Trent: I suppose this is still very different from issue1770 (particularly
       since you believe pipelining is not involved), but I'm hoping this gives
       some ideas of a common pattern regarding curl.

Dmitry: If you have a moment, any ideas?

On Fri, Apr 02, 2010 at 15:20:26 +0200, Stelian Ionescu wrote:
> Built using the Gentoo port. The build script is:
> http://sources.gentoo.org/viewcvs.py/*checkout*/gentoo-x86/dev-vcs/darcs/darcs-2.4.ebuild

Thanks!  This is a nice step forward because we've determined that you have
pipelining enabled (I think)

> net-misc/curl-7.20.0-r2 (Gentoo)

And that by rights your libcurl should be non-buggy (unless there's another
bug we don't know about)
 
> >  - if pipelining is enabled
> 
> How do I find that out ?

Some background: if you cabal install darcs with no flags, you get a version of
Darcs where pipelining has to be explicitly enabled at runtime; however, if you
pass in -fcurl-pipelining, then the default changes (so that you have to
explicitly *disable* it at runtime).  In the future, when Cabal ticket
http://hackage.haskell.org/trac/hackage/ticket/342 is fixed, we will automatically
set -fcurl-pipelining depending on whether your libcurl is advanced enough to
avoid a pipelining bug with HTTP proxies (>= 7.19.1, I think).
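
As a rough sketch of the version check described above (not the actual Cabal or Darcs build logic), the decision could look like this. The function names are hypothetical, and the 7.19.1 threshold is taken from the tentative figure in this message, so treat it as an assumption.

```python
import re

# Assumed threshold, taken from the tentative ">= 7.19.1" above.
MIN_PIPELINING_VERSION = (7, 19, 1)


def parse_curl_version(text):
    """Extract the first major.minor.patch triple from a version string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def pipelining_default(version_text):
    """Return True if pipelining could be on by default for this libcurl."""
    return parse_curl_version(version_text) >= MIN_PIPELINING_VERSION


# The Gentoo package string reported earlier in this ticket.
print(pipelining_default("net-misc/curl-7.20.0-r2"))
```

Comparing tuples of integers avoids the pitfall of comparing version strings lexically, where "7.9.0" would sort after "7.19.1".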

Anyway, your Gentoo ebuild file seems to indicate pipelining is enabled by default:

src_configure() {
        # Use curl for net stuff to avoid strict version dep on HTTP and network

        cabal_src_configure \
                --flags=curl \
                --flags=-http \
                --flags=curl-pipelining \
                --flags=color \
                --flags=terminfo \
                --flags=mmap
}

> 
> tmp $ darcs get --debug-http http://www.foldr.org/~michaelw/projects/redshank
> * About to connect() to www.foldr.org port 80 (#0)
> *   Trying 2001:470:1f0b:15ba::1... * Failed to connect to 2001:470:1f0b:15ba::1: Network is unreachable
> * Success
> * couldn't connect to host
> * Expire cleared
> * Closing connection #0
> * About to connect() to www.foldr.org port 80 (#0)
> *   Trying 2001:470:1f0b:15ba::1... * Failed to connect to 2001:470:1f0b:15ba::1: Network is unreachable
> * Success
> * couldn't connect to host
> * Expire cleared
> * Closing connection #0
> darcs: bug at src/URL.hs:246 compiled Mar 31 2010 22:25:30
> Another possible bug in URL.waitNextUrl:  curl_multi_perform() - no running handles
> See http://wiki.darcs.net/index.html/BugTrackerHowto for help on bug reporting.
> withSignalsHandled: Interrupted!                                              
> 
> 
> As a first guess I'd say that this is a bug in curl, because I've seen it
> in git too (oh, the irony). My network interface has a link-local
> auto-configured IPv6 address, but since my ISP uses IPv4 I have no IPv6
> gateway to the internet. Yet curl tries to connect to the IPv6 address
> of the remote host without falling back to the IPv4 address.
> My current workaround is to add the remote host's IPv4 address to
> my /etc/hosts.

This is interesting.  Maybe if it's a curl bug that's triggering this
instance of the problem, and if other instances are not related to this
problem, fixing the curl bug will also solve them by coincidence.

But let's dig around for some more information first.

Do you have any of the skills needed to boil this down into some sort
of example for the curl guys?

-- 
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
PGP Key ID: 08AC04F9
msg10632 Author: sionescu Date: 2010-04-02.15:01:25
On Fri, 2010-04-02 at 14:58 +0100, Eric Kow wrote:
> Hi, Zooko, Dmitry, Trent: news!
> 
> Stelian: thanks for the below.  We may *finally* be able to start making
> progress on this with the details you've provided.
> 
> One more question: is this intermittent or systematic?  If it's systematic,
> maybe I should not assume you have the same bug here, and re-open issue1808.

It's systematic, which is why I thought that the bug might lie in a
faulty address ordering/selection algorithm in curl.
That said, though I haven't yet encountered this bug outside of
darcs/git (both using curl), it might well be that the bug is in
glibc (2.11).
Compiling curl with --disable-ipv6 "fixes" this, but that's inconclusive.

> > As a first guess I'd say that this is a bug in curl, because I've seen it
> > in git too (oh, the irony). My network interface has a link-local
> > auto-configured IPv6 address, but since my ISP uses IPv4 I have no IPv6
> > gateway to the internet. Yet curl tries to connect to the IPv6 address
> > of the remote host without falling back to the IPv4 address.
> > My current workaround is to add the remote host's IPv4 address to
> > my /etc/hosts.
> 
> This is interesting.  Maybe if it's a curl bug that's triggering this
> instance of the problem, and if other instances are not related to this
> problem, fixing the curl bug will also solve them by coincidence.
> 
> But let's dig around for some more information first.
> 
> Do you have any of the skills needed to boil this down into some sort
> of example for the curl guys?

Yes, and I'll try to debug it sometime this weekend, but right now I'm a
bit busy

-- 
Stelian Ionescu a.k.a. fe[nl]ix
Quidquid latine dictum sit, altum videtur.
http://common-lisp.net/project/iolib
msg10633 Author: kowey Date: 2010-04-02.15:34:55
Thanks for the help!

I've moved msg10620, msg10621 and msg10630 back to this ticket from
issue1368.  Sorry for the admin noise caused by my premature merging.

My current belief is that you've uncovered a new bug with similar
symptoms to all the other things that trigger this error message.  I'll
be a bit more careful with this HTTP stuff in the future.
msg11104 Author: kowey Date: 2010-05-24.07:10:09
Hi Stelian,

How does the current darcs HEAD cope with this?  We have this patch now,
which, if nothing else, removes this particular error message:

Thu Apr 15 23:47:39 BST 2010  Dmitry Kurochkin <dmitry.kurochkin@gmail.com>
  * Fix hscurl.c when URL is downloaded during the first call to curl_multi_perform.
  Turns out that the first call to curl_multi_perform() can fetch the URL or
  result in error. I can easily reproduce this using HTTP server on localhost.
  This means that situation when running_handles is zero is valid, so remove the
  error and handle it correctly.
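
The control flow that patch description implies can be modelled roughly as below. This is a toy sketch only: `FakeMulti`, `perform`, `read_finished` and `wait_next_url` are hypothetical names for illustration, not the libcurl API and not the code in hscurl.c or URL.hs.

```python
class FakeMulti:
    """Hypothetical stand-in for a curl multi handle (NOT the libcurl API)."""

    def __init__(self, steps):
        # Each step is (running_handles, finished) as seen after one perform()
        # call, where finished is a list of (url, outcome) pairs.
        self._steps = list(steps)
        self._finished = []

    def perform(self):
        """Advance all transfers once and return how many are still running."""
        running, finished = self._steps.pop(0)
        self._finished.extend(finished)
        return running

    def read_finished(self):
        """Pop one completed (or failed) transfer, or return None if there is none."""
        return self._finished.pop(0) if self._finished else None


def wait_next_url(multi):
    """Return the next finished (url, outcome) pair, or None if nothing is pending."""
    while True:
        running = multi.perform()
        finished = multi.read_finished()
        if finished is not None:
            # A transfer may finish or fail during the very first perform()
            # call, so zero running handles is a valid state, not a bug.
            return finished
        if running == 0:
            return None


# A transfer that fails immediately, as in the IPv6 case reported above.
first_call_error = FakeMulti([(0, [("http://localhost/repo", "couldn't connect to host")])])
outcome = wait_next_url(first_call_error)
print(outcome)
```

The point of the sketch is the ordering: the finished queue is checked before the running count, so a transfer that ends on the first call is reported as a normal result rather than tripping the "no running handles" error.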
msg11227 Author: fplccl Date: 2010-06-03.22:10:58
Same bug, with the specified patches applied.

user@host $ darcs --exact-version
darcs compiled on Jun  3 2010, at 21:02:56

Context:

[TAG 2.4.4
Eric Kow <kowey@darcs.net>**20100515090819
 Ignore-this: 7d1a0e6a17c2be314f6ab1607bbcac13
] 

/usr/portage/dev-vcs/darcs/darcs-2.4.4.ebuild:
src_configure() {
	...
        cabal_src_configure \
                --flags=curl \
                --flags=-http \
                --flags=curl-pipelining \
                --flags=color \
                --flags=terminfo \
                --flags=mmap \
                $threaded_flag \
                $(cabal_flag test)
}

net-misc/curl-7.20.0-r2

Reproducing.

user@host $ darcs get --lazy "http://code.haskell.org/gentoo/gentoo-haskell/"

This is the gentoo-haskell darcs overlay.

Please report bugs in the IRC channel #gentoo-haskell on the freenode
network, or send mail to haskell@gentoo.org.

Please do *not* report bugs from this overlay at bugs.gentoo.org.
**********************
darcs: bug at src/URL.hs:246 compiled Jun  3 2010 21:02:56
Another possible bug in URL.waitNextUrl:  curl_multi_perform() - no running handles
msg11232 Author: kowey Date: 2010-06-03.22:41:23
Hi fplccl, there's a subtle difference between this bug and issue1368. 
I think here I'm erring on the side of conservatism by supposing there
may be some peculiarity to Stelian's config (see msg10629).

Anyway, I think I'll move your message somewhere more appropriate and
comment on it there.

Meanwhile, I'm re-setting this to resolved on the expectation that the
patch in Darcs HEAD (not 2.4.4, but the unstable Darcs) solves the
problem.  More later.  If anything else goes wrong with that patch in
place, I'd like Stelian to report a new bug.
History
Date                 User      Action  Args
2010-04-02 09:33:16  sionescu  create
2010-04-02 09:36:33  kowey     set     status: unknown -> duplicate
                                       priority: bug
                                       title: Bug at src/URL.hs:246 with darcs 2.4 -> Another possible bug in URL.waitNextUrl: curl_multi_perform() - no running handles
                                       nosy: + kowey
                                       messages: + msg10619
                                       topic: + HTTP
2010-04-02 09:36:46  kowey     set     superseder: + Another possible bug in URL.waitNextUrl: curl_multi_perform() - no running handles
2010-04-02 15:30:12  admin     set     nosy: + zooko, thorkilnaur
                                       messages: + msg10620, msg10629
2010-04-02 15:30:45  admin     set     nosy: + twb, naur
                                       messages: + msg10630, msg10632
2010-04-02 15:34:56  kowey     set     status: duplicate -> has-patch
                                       messages: + msg10633
                                       superseder: - Another possible bug in URL.waitNextUrl: curl_multi_perform() - no running handles
                                       assignedto: sionescu
2010-05-24 07:10:10  kowey     set     status: has-patch -> waiting-for
                                       messages: + msg11104
2010-06-03 22:10:59  fplccl    set     assignedto: sionescu -> kowey
                                       messages: + msg11227
                                       nosy: + fplccl
2010-06-03 22:41:24  kowey     set     status: waiting-for -> resolved
                                       nosy: - naur
                                       messages: + msg11232
                                       assignedto: kowey ->