darcs

Issue 879 show authors => hangs (2.0.0 + 268)

Title show authors => hangs (2.0.0 + 268)
Priority bug Status duplicate
Milestone Resolved in
Superseder darcs hangs when invoked with no arguments and pager quits
View: 1278
Nosy List Serware, darcs-devel, dmitry.kurochkin, kowey, lele, mornfall, thorkilnaur, tommy
Assigned To lele
Topics Regression

Created on 2008-05-26.14:27:54 by lele, last changed 2009-10-23.23:29:19 by admin.

Messages
msg4863 (view) Author: lele Date: 2008-05-26.14:27:52
Doing a "darcs query authors" when darcs decides it needs to spawn a
pager and the reenters, you get it hung forever, eating all CPU
time. A "kill -9" is needed.

This is with 2.0.0 (2.0.0 (+ 268 patches)), with ghc 6.8.3: I'm quite
sure it worked ok till, say, +230.

ciao, lele.
-- 
nickname: Lele Gaifax    | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas    | comincerò ad aver paura di chi mi copia.
lele@nautilus.homeip.net |                 -- Fortunato Depero, 1929.
msg4867 (view) Author: lele Date: 2008-05-26.18:32:52
On Mon, 26 May 2008 14:27:55 -0000
Lele Gaifax <bugs@darcs.net> wrote:

> Doing a "darcs query authors" when darcs decides it needs to spawn a
> pager and the reenters, you get it hung forever, eating all CPU
> time. A "kill -9" is needed.
> 
> This is with 2.0.0 (2.0.0 (+ 268 patches)), with ghc 6.8.3: I'm quite
> sure it worked ok till, say, +230.

Ok, I can reliably track the culprit to the patch

Thu May 22 18:06:17 CEST 2008  David Roundy <droundy@darcs.net>
  * bump optimization level up to -O2.
  I'm not quite sure why we ever dropped it down... presumably someone was
  impatient with compile times.

A double "make clean; make", with and without that specific patch,
shows that using -O2 causes (at least) that strange problem. So maybe
it was not just impatience...

hth,
ciao, lele.
-- 
nickname: Lele Gaifax    | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas    | comincerò ad aver paura di chi mi copia.
lele@nautilus.homeip.net |                 -- Fortunato Depero, 1929.
msg4880 (view) Author: kowey Date: 2008-05-29.17:04:31
Lele, please submit a rollback patch for this, making sure it really does solve
the problem. Hopefully David can have a look (or just apply it...) and see what
the root cause is.
msg4888 (view) Author: lele Date: 2008-05-30.06:38:40
On Thu, 29 May 2008 17:04:35 -0000
Eric Kow <bugs@darcs.net> wrote:

> Lele, please submit a rollback patch for this, making sure it really
> does solve the problem. Hopefully David can have a look (or just
> apply it...) and see what the root cause is.

My problem is that "making sure it really solves the problem"
part. For sure *all* the times I recompiled darcs with that patch *on
this machine*, an ubuntu 8.04 on amd64, with ghc 6.8.2, it hangs.

But yesterday I pulled a few recent patches from the darcs.net and
recompiled: even if I skipped the mentioned patch, I got the same
behaviour... :-|

This was the first and only time I got the hang /without/ the -O2
patch, but nevertheless I checked twice before recompilation, the
patch wasn't there but the newly built darcs exhibited the hang.

A "make clean; make" cycle solved the problem.

I did some other investigation, that didn't enlight me. With an
"hanging darcs":

* "darcs query authors | less" does not hang
* "strace query authors" obviously shows a lot of "wait4()" calls but
  as soon as I hit 'q' and the pager exits, darcs regains control and
  exits without problems

So, while I'm sure it makes a difference, it isn't clear to me if
there is effectively some problem in darcs or not, possibly exposed by
different optimization levels...

Is anybody else experiencing the same behaviour?

ciao, lele.
-- 
nickname: Lele Gaifax    | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas    | comincerò ad aver paura di chi mi copia.
lele@nautilus.homeip.net |                 -- Fortunato Depero, 1929.
msg4889 (view) Author: kowey Date: 2008-05-30.10:16:46
Ok, so if you pull the optimisation patch, and make clean (maybe even make
maintainer-clean) and make again, does it still hang?
msg4891 (view) Author: lele Date: 2008-05-30.11:49:01
On Fri, 30 May 2008 10:16:48 -0000
Eric Kow <bugs@darcs.net> wrote:

> Ok, so if you pull the optimisation patch, and make clean (maybe even
> make maintainer-clean) and make again, does it still hang?

Oh boy, I'm really sorry for that, but this time (going thru
maintainer-clean) recompiling *with* that patch I got a non-hanging
darcs...

So, I'll keep this monitored, but seems a rather strange and
intermittent compilation bug that affects only me.

ciao, lele.
-- 
nickname: Lele Gaifax    | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas    | comincerò ad aver paura di chi mi copia.
lele@nautilus.homeip.net |                 -- Fortunato Depero, 1929.
msg4893 (view) Author: kowey Date: 2008-05-30.11:58:14
Ok, well, I'm going to mark this as resolved in the meantime.  Please re-open if
it strikes back.
msg4917 (view) Author: lele Date: 2008-06-02.12:47:00
On Fri, 30 May 2008 11:58:15 -0000
Eric Kow <bugs@darcs.net> wrote:

> 
> Eric Kow <eric.kow@gmail.com> added the comment:
> 
> Ok, well, I'm going to mark this as resolved in the meantime.  Please
> re-open if it strikes back.

I dunno if, and how, I should reopen this, but I have one more
observation as it just happened again.

I ran "darcs query authors" once, immediately found what I was looking
for and exited shortly after to the shell prompt, without problems.

A minute later I needed another info, and repeated the query: this time
it took a little longer to find it out... and the issue reappeared!

I repeated the two steps several times: if I exit immediately from the
pager, darcs regains control and correctly quits, while if I wait
"enough" before quitting the pager (less, in my case), darcs hungs
when it reenters.

How long is "enough" seems related to the number of patches: the
problem appears when my mental counter exceeds 8-10 on the tailor
repository (1600 patches), while on the darcs one (5782 patches) it
happens only after at least 25 mental-ticks...

This "have to wait in the pager" to see the problem may have fooled me
when I was trying to figure out if the -O2 patch was the culprit.

Hope this helps,
ciao, lele.
-- 
nickname: Lele Gaifax    | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas    | comincerò ad aver paura di chi mi copia.
lele@nautilus.homeip.net |                 -- Fortunato Depero, 1929.
msg4919 (view) Author: kowey Date: 2008-06-02.13:09:42
> How long is "enough" seems related to the number of patches: the
> problem appears when my mental counter exceeds 8-10 on the tailor
> repository (1600 patches), while on the darcs one (5782 patches) it
> happens only after at least 25 mental-ticks...

[oops, resending to list]

Maybe we can reproduce this more easily by setting PAGER to varying
values of sleep...
msg4932 (view) Author: lele Date: 2008-06-03.14:54:23
On Mon, 2 Jun 2008 14:18:34 +0100
"Eric Kow" <eric.kow@gmail.com> wrote:

> Maybe we can reproduce this more easily by setting PAGER to varying
> values of sleep...

I tried that, with this simple script:

    #!/bin/bash

    i=0
    while true
    do
      i=$[i+3]
      cat >/tmp/mypager <<EOF
    echo Waiting for $i seconds...
    sleep $i
    exit 0
    EOF
      chmod 755 /tmp/mypager
      DARCS_PAGER=/tmp/mypager darcs query a
      echo ok
    done

but for some reason, that does not seem to trigger the problem, at
least not in the first minute or so...

Instead I gave strace enough time: when darcs behave normally, I see
this trace:

    ... 
    wait4(24975, 0x7fff1c389554, 0, NULL)   = ? ERESTARTSYS (To be restarted)
    --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
    rt_sigreturn(0x1a)                      = 61
    wait4(24975, 0x7fff1c389554, 0, NULL)   = ? ERESTARTSYS (To be restarted)
    --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
    rt_sigreturn(0x1a)                      = 61
    wait4(24975, 0x7fff1c389554, 0, NULL)   = ? ERESTARTSYS (To be restarted)
    --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
    write(4, "\377", 1)                     = 1
    rt_sigreturn(0x2)                       = 61
    wait4(24975, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 24975
    --- SIGCHLD (Child exited) @ 0 (0) ---
    unlink("/tmp/darcs-pager24972")         = 0
    getcwd("/home/lele/wip/darcs-2.0", 4096) = 25
    chdir("/home/lele/wip/darcs-2.0")       = 0
    chdir("/home/lele/wip/darcs-2.0")       = 0
    getrusage(RUSAGE_SELF, {ru_utime={0, 192012}, ru_stime={0, 24001}, ...}) = 0
    write(4, "\376", 1)                     = 1
    clock_gettime(0xfffffffe /* CLOCK_??? */, {0, 214695426}) = 0
    rt_sigprocmask(SIG_BLOCK, [HUP INT ABRT PIPE TERM], [], 8) = 0
    rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
    futex(0xbe0eac, 0x85 /* FUTEX_??? */, 1) = 1
    futex(0xbe0ed8, 0x81 /* FUTEX_??? */, 1) = 1
    futex(0xbcad80, 0x81 /* FUTEX_??? */, 1) = 1
    futex(0xbe0eac, 0x85 /* FUTEX_??? */, 1) = 1
    futex(0xbe0ed8, 0x81 /* FUTEX_??? */, 1) = 1
    futex(0xbcad80, 0x81 /* FUTEX_??? */, 1) = 1
    sched_yield()                           = 0
    timer_settime(0, 0, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
    timer_delete(0)                         = 0
    fcntl(0, F_GETFL)                       = 0x8002 (flags O_RDWR|O_LARGEFILE)
    fcntl(1, F_GETFL)                       = 0x8002 (flags O_RDWR|O_LARGEFILE)
    fcntl(2, F_GETFL)                       = 0x8002 (flags O_RDWR|O_LARGEFILE)
    rt_sigprocmask(SIG_BLOCK, [TTOU], [], 8) = 0
    rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
    getrusage(RUSAGE_SELF, {ru_utime={0, 192012}, ru_stime={0, 24001}, ...}) = 0
    exit_group(0)                           = ?

But after "threshold" seconds I get this instead:

    ...
    wait4(26621, 0x7fff8c6c6884, 0, NULL)   = ? ERESTARTSYS (To be restarted)
    --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
    rt_sigreturn(0x1a)                      = 61
    wait4(26621, 0x7fff8c6c6884, 0, NULL)   = ? ERESTARTSYS (To be restarted)
    --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
    write(4, "\377", 1)                     = 1
    rt_sigreturn(0x2)                       = 61
    wait4(26621, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 26621
    --- SIGCHLD (Child exited) @ 0 (0) ---
    unlink("/tmp/darcs-pager26618")         = 0
    getcwd("/home/lele/wip/darcs-2.0", 4096) = 25
    chdir("/home/lele/wip/darcs-2.0")       = 0
    chdir("/home/lele/wip/darcs-2.0")       = 0
    getrusage(RUSAGE_SELF, {ru_utime={0, 208013}, ru_stime={0, 28001}, ...}) = 0
    write(4, "\376", 1)                     = 1
    clock_gettime(0xfffffffe /* CLOCK_??? */, {0, 231146281}) = 0
    rt_sigprocmask(SIG_BLOCK, [HUP INT ABRT PIPE TERM], [], 8) = 0
    rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
    futex(0xbe10dc, 0x85 /* FUTEX_??? */, 1) = 1
    sched_yield()                           = 0
    sched_yield()                           = 0
    sched_yield()                           = 0
    sched_yield()                           = 0

and that final "sched_yield()" goes on forever... That's the loop that
requires a "kill -9" to be stopped.

ciao, lele.
-- 
nickname: Lele Gaifax    | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas    | comincerò ad aver paura di chi mi copia.
lele@nautilus.homeip.net |                 -- Fortunato Depero, 1929.
msg6921 (view) Author: mornfall Date: 2008-12-28.12:06:41
Hi Lele, can you still reproduce this problem with 2.2.0pre1 or such? I believe 
there have been some pager-related changes in the meantime. I couldn't 
reproduce on Debian myself, waiting fairly long in the pager before quitting 
it. Darcs apparently never went into an infinite spin for me. (I have the exact 
same less version, it seems...)
msg7136 (view) Author: mornfall Date: 2009-01-18.14:27:27
Let's pull all those hanging reports together.
msg8697 (view) Author: kowey Date: 2009-09-04.23:07:54
I saw 'pager', 'hang' and '64' in this report, so I'm assuming this really is a
duplicate of issue1278 and closing it accordingly.  Please re-open and yell at
me if this was a silly thing to do.
History
Date User Action Args
2008-05-26 14:27:54lelecreate
2008-05-26 18:32:55lelesetstatus: unread -> unknown
nosy: + droundy
messages: + msg4867
2008-05-29 16:53:35koweylinkissue880 superseder
2008-05-29 17:04:35koweysetstatus: unknown -> has-patch
priority: urgent
title: darcs query authors does not terminate correctly -> show authors => hangs (2.0.0 + 268)
nosy: + Serware, kowey
messages: + msg4880
topic: + Regression, Target-2.0
assignedto: lele
2008-05-30 06:38:42lelesetnosy: droundy, tommy, beschmi, kowey, dagit, lele, Serware
messages: + msg4888
2008-05-30 10:16:48koweysetnosy: droundy, tommy, beschmi, kowey, dagit, lele, Serware
messages: + msg4889
2008-05-30 11:49:04lelesetnosy: droundy, tommy, beschmi, kowey, dagit, lele, Serware
messages: + msg4891
2008-05-30 11:58:15koweysetpriority: urgent -> bug
status: has-patch -> resolved
messages: + msg4893
nosy: droundy, tommy, beschmi, kowey, dagit, lele, Serware
2008-06-02 12:47:05lelesetstatus: resolved -> unknown
nosy: droundy, tommy, beschmi, kowey, dagit, lele, Serware
messages: + msg4917
2008-06-02 13:09:45koweysetnosy: droundy, tommy, beschmi, kowey, dagit, lele, Serware
messages: + msg4919
2008-06-03 14:54:25lelesetnosy: droundy, tommy, beschmi, kowey, dagit, lele, Serware
messages: + msg4932
2008-12-28 12:06:51mornfallsetstatus: unknown -> waiting-for
nosy: + dmitry.kurochkin, mornfall, simon, thorkilnaur
topic: - Target-2.0
messages: + msg6921
2009-01-18 14:27:31mornfallsetnosy: droundy, tommy, beschmi, kowey, dagit, lele, simon, thorkilnaur, dmitry.kurochkin, Serware, mornfall
superseder: + darcs hangs when invoked with no arguments and pager quits
messages: + msg7136
2009-08-06 17:58:52adminsetnosy: + markstos, jast, darcs-devel, zooko, - droundy, lele
2009-08-06 21:05:58adminsetnosy: - beschmi
2009-08-10 22:21:00adminsetnosy: + lele, - markstos, darcs-devel, zooko, jast
2009-08-11 00:15:18adminsetnosy: - dagit
2009-08-25 17:37:20adminsetnosy: + darcs-devel, - simon
2009-08-27 14:15:10adminsetnosy: tommy, kowey, darcs-devel, lele, thorkilnaur, dmitry.kurochkin, Serware, mornfall
2009-09-04 23:07:57koweysetstatus: waiting-for -> duplicate
nosy: tommy, kowey, darcs-devel, lele, thorkilnaur, dmitry.kurochkin, Serware, mornfall
messages: + msg8697
2009-10-23 22:43:21adminsetnosy: + serware, - Serware
2009-10-23 23:29:19adminsetnosy: + Serware, - serware