Created on 2008-05-26.14:27:54 by lele, last changed 2008-06-03.14:54:25 by lele.
| msg4932 (view) |
Author: lele |
Date: 2008-06-03.14:54:23 |
|
On Mon, 2 Jun 2008 14:18:34 +0100
"Eric Kow" <eric.kow@gmail.com> wrote:
> Maybe we can reproduce this more easily by setting PAGER to varying
> values of sleep...
I tried that, with this simple script:
#!/bin/bash
i=0
while true
do
i=$[i+3]
cat >/tmp/mypager <<EOF
echo Waiting for $i seconds...
sleep $i
exit 0
EOF
chmod 755 /tmp/mypager
DARCS_PAGER=/tmp/mypager darcs query a
echo ok
done
but for some reason, that does not seem to trigger the problem, at
least not in the first minute or so...
Instead I gave strace enough time: when darcs behave normally, I see
this trace:
...
wait4(24975, 0x7fff1c389554, 0, NULL) = ? ERESTARTSYS (To be restarted)
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = 61
wait4(24975, 0x7fff1c389554, 0, NULL) = ? ERESTARTSYS (To be restarted)
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = 61
wait4(24975, 0x7fff1c389554, 0, NULL) = ? ERESTARTSYS (To be restarted)
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
write(4, "\377", 1) = 1
rt_sigreturn(0x2) = 61
wait4(24975, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 24975
--- SIGCHLD (Child exited) @ 0 (0) ---
unlink("/tmp/darcs-pager24972") = 0
getcwd("/home/lele/wip/darcs-2.0", 4096) = 25
chdir("/home/lele/wip/darcs-2.0") = 0
chdir("/home/lele/wip/darcs-2.0") = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 192012}, ru_stime={0, 24001}, ...}) = 0
write(4, "\376", 1) = 1
clock_gettime(0xfffffffe /* CLOCK_??? */, {0, 214695426}) = 0
rt_sigprocmask(SIG_BLOCK, [HUP INT ABRT PIPE TERM], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
futex(0xbe0eac, 0x85 /* FUTEX_??? */, 1) = 1
futex(0xbe0ed8, 0x81 /* FUTEX_??? */, 1) = 1
futex(0xbcad80, 0x81 /* FUTEX_??? */, 1) = 1
futex(0xbe0eac, 0x85 /* FUTEX_??? */, 1) = 1
futex(0xbe0ed8, 0x81 /* FUTEX_??? */, 1) = 1
futex(0xbcad80, 0x81 /* FUTEX_??? */, 1) = 1
sched_yield() = 0
timer_settime(0, 0, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
timer_delete(0) = 0
fcntl(0, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
fcntl(1, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
fcntl(2, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
rt_sigprocmask(SIG_BLOCK, [TTOU], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 192012}, ru_stime={0, 24001}, ...}) = 0
exit_group(0) = ?
But after "threshold" seconds I get this instead:
...
wait4(26621, 0x7fff8c6c6884, 0, NULL) = ? ERESTARTSYS (To be restarted)
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = 61
wait4(26621, 0x7fff8c6c6884, 0, NULL) = ? ERESTARTSYS (To be restarted)
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
write(4, "\377", 1) = 1
rt_sigreturn(0x2) = 61
wait4(26621, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 26621
--- SIGCHLD (Child exited) @ 0 (0) ---
unlink("/tmp/darcs-pager26618") = 0
getcwd("/home/lele/wip/darcs-2.0", 4096) = 25
chdir("/home/lele/wip/darcs-2.0") = 0
chdir("/home/lele/wip/darcs-2.0") = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 208013}, ru_stime={0, 28001}, ...}) = 0
write(4, "\376", 1) = 1
clock_gettime(0xfffffffe /* CLOCK_??? */, {0, 231146281}) = 0
rt_sigprocmask(SIG_BLOCK, [HUP INT ABRT PIPE TERM], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
futex(0xbe10dc, 0x85 /* FUTEX_??? */, 1) = 1
sched_yield() = 0
sched_yield() = 0
sched_yield() = 0
sched_yield() = 0
and that final "sched_yield()" goes on forever... That's the loop that
requires a "kill -9" to be stopped.
ciao, lele.
--
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
lele@nautilus.homeip.net | -- Fortunato Depero, 1929.
|
| msg4919 (view) |
Author: kowey |
Date: 2008-06-02.13:09:42 |
|
> How long is "enough" seems related to the number of patches: the
> problem appears when my mental counter exceeds 8-10 on the tailor
> repository (1600 patches), while on the darcs one (5782 patches) it
> happens only after at least 25 mental-ticks...
[oops, resending to list]
Maybe we can reproduce this more easily by setting PAGER to varying
values of sleep...
|
| msg4917 (view) |
Author: lele |
Date: 2008-06-02.12:47:00 |
|
On Fri, 30 May 2008 11:58:15 -0000
Eric Kow <bugs@darcs.net> wrote:
>
> Eric Kow <eric.kow@gmail.com> added the comment:
>
> Ok, well, I'm going to mark this as resolved in the meantime. Please
> re-open if it strikes back.
I dunno if, and how, I should reopen this, but I have one more
observation as it just happened again.
I ran "darcs query authors" once, immediately found what I was looking
for and exited shortly after to the shell prompt, without problems.
A minute later I needed another info, and repeated the query: this time
it took a little longer to find it out... and the issue reappeared!
I repeated the two steps several times: if I exit immediately from the
pager, darcs regains control and correctly quits, while if I wait
"enough" before quitting the pager (less, in my case), darcs hungs
when it reenters.
How long is "enough" seems related to the number of patches: the
problem appears when my mental counter exceeds 8-10 on the tailor
repository (1600 patches), while on the darcs one (5782 patches) it
happens only after at least 25 mental-ticks...
This "have to wait in the pager" to see the problem may have fooled me
when I was trying to figure out if the -O2 patch was the culprit.
Hope this helps,
ciao, lele.
--
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
lele@nautilus.homeip.net | -- Fortunato Depero, 1929.
|
| msg4893 (view) |
Author: kowey |
Date: 2008-05-30.11:58:14 |
|
Ok, well, I'm going to mark this as resolved in the meantime. Please re-open if
it strikes back.
|
| msg4891 (view) |
Author: lele |
Date: 2008-05-30.11:49:01 |
|
On Fri, 30 May 2008 10:16:48 -0000
Eric Kow <bugs@darcs.net> wrote:
> Ok, so if you pull the optimisation patch, and make clean (maybe even
> make maintainer-clean) and make again, does it still hang?
Oh boy, I'm really sorry for that, but this time (going thru
maintainer-clean) recompiling *with* that patch I got a non-hanging
darcs...
So, I'll keep this monitored, but seems a rather strange and
intermittent compilation bug that affects only me.
ciao, lele.
--
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
lele@nautilus.homeip.net | -- Fortunato Depero, 1929.
|
| msg4889 (view) |
Author: kowey |
Date: 2008-05-30.10:16:46 |
|
Ok, so if you pull the optimisation patch, and make clean (maybe even make
maintainer-clean) and make again, does it still hang?
|
| msg4888 (view) |
Author: lele |
Date: 2008-05-30.06:38:40 |
|
On Thu, 29 May 2008 17:04:35 -0000
Eric Kow <bugs@darcs.net> wrote:
> Lele, please submit a rollback patch for this, making sure it really
> does solve the problem. Hopefully David can have a look (or just
> apply it...) and see what the root cause is.
My problem is that "making sure it really solves the problem"
part. For sure *all* the times I recompiled darcs with that patch *on
this machine*, an ubuntu 8.04 on amd64, with ghc 6.8.2, it hangs.
But yesterday I pulled a few recent patches from the darcs.net and
recompiled: even if I skipped the mentioned patch, I got the same
behaviour... :-|
This was the first and only time I got the hang /without/ the -O2
patch, but nevertheless I checked twice before recompilation, the
patch wasn't there but the newly built darcs exhibited the hang.
A "make clean; make" cycle solved the problem.
I did some other investigation, that didn't enlight me. With an
"hanging darcs":
* "darcs query authors | less" does not hang
* "strace query authors" obviously shows a lot of "wait4()" calls but
as soon as I hit 'q' and the pager exits, darcs regains control and
exits without problems
So, while I'm sure it makes a difference, it isn't clear to me if
there is effectively some problem in darcs or not, possibly exposed by
different optimization levels...
Is anybody else experiencing the same behaviour?
ciao, lele.
--
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
lele@nautilus.homeip.net | -- Fortunato Depero, 1929.
|
| msg4880 (view) |
Author: kowey |
Date: 2008-05-29.17:04:31 |
|
Lele, please submit a rollback patch for this, making sure it really does solve
the problem. Hopefully David can have a look (or just apply it...) and see what
the root cause is.
|
| msg4867 (view) |
Author: lele |
Date: 2008-05-26.18:32:52 |
|
On Mon, 26 May 2008 14:27:55 -0000
Lele Gaifax <bugs@darcs.net> wrote:
> Doing a "darcs query authors" when darcs decides it needs to spawn a
> pager and the reenters, you get it hung forever, eating all CPU
> time. A "kill -9" is needed.
>
> This is with 2.0.0 (2.0.0 (+ 268 patches)), with ghc 6.8.3: I'm quite
> sure it worked ok till, say, +230.
Ok, I can reliably track the culprit to the patch
Thu May 22 18:06:17 CEST 2008 David Roundy <droundy@darcs.net>
* bump optimization level up to -O2.
I'm not quite sure why we ever dropped it down... presumably someone was
impatient with compile times.
A double "make clean; make", with and without that specific patch,
shows that using -O2 causes (at least) that strange problem. So maybe
it was not just impatience...
hth,
ciao, lele.
--
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
lele@nautilus.homeip.net | -- Fortunato Depero, 1929.
|
| msg4863 (view) |
Author: lele |
Date: 2008-05-26.14:27:52 |
|
Doing a "darcs query authors" when darcs decides it needs to spawn a
pager and the reenters, you get it hung forever, eating all CPU
time. A "kill -9" is needed.
This is with 2.0.0 (2.0.0 (+ 268 patches)), with ghc 6.8.3: I'm quite
sure it worked ok till, say, +230.
ciao, lele.
--
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
lele@nautilus.homeip.net | -- Fortunato Depero, 1929.
|
Browse related patches:
unstable
|
stable
|
| Date |
User |
Action |
Args |
| 2008-06-03 14:54:25 | lele | set | nosy:
droundy, tommy, beschmi, kowey, dagit, lele, Serware messages:
+ msg4932 |
| 2008-06-02 13:09:45 | kowey | set | nosy:
droundy, tommy, beschmi, kowey, dagit, lele, Serware messages:
+ msg4919 |
| 2008-06-02 12:47:05 | lele | set | status: resolved -> chatting nosy:
droundy, tommy, beschmi, kowey, dagit, lele, Serware messages:
+ msg4917 |
| 2008-05-30 11:58:15 | kowey | set | priority: urgent -> bug status: in-progress -> resolved messages:
+ msg4893 nosy:
droundy, tommy, beschmi, kowey, dagit, lele, Serware |
| 2008-05-30 11:49:04 | lele | set | nosy:
droundy, tommy, beschmi, kowey, dagit, lele, Serware messages:
+ msg4891 |
| 2008-05-30 10:16:48 | kowey | set | nosy:
droundy, tommy, beschmi, kowey, dagit, lele, Serware messages:
+ msg4889 |
| 2008-05-30 06:38:42 | lele | set | nosy:
droundy, tommy, beschmi, kowey, dagit, lele, Serware messages:
+ msg4888 |
| 2008-05-29 17:04:35 | kowey | set | status: chatting -> in-progress topic:
+ Regression, ReleaseCritical_2.1 title: darcs query authors does not terminate correctly -> show authors => hangs (2.0.0 + 268) nosy:
+ Serware, kowey messages:
+ msg4880 priority: urgent assignedto: lele |
| 2008-05-29 16:53:35 | kowey | link | issue880 superseder |
| 2008-05-26 18:32:55 | lele | set | status: unread -> chatting nosy:
+ droundy messages:
+ msg4867 |
| 2008-05-26 14:27:54 | lele | create | |
|