Created on 2008-11-04.13:38:42 by simonmar, last changed 2010-06-15.21:51:36 by admin.
msg6590 (view) |
Author: simonmar |
Date: 2008-11-04.13:38:40 |
|
As requested by Eric, I'm submitting a new ticket on this. Note this slowdown is *not* caused by the GHC 6.8.2 getSymbolicLinkStatus bug.
2.0.2 (+ 326 patches)
> darcs +RTS --info
[("GHC RTS", "Yes")
,("GHC version", "6.8.3")
,("RTS way", "rts_thr")
,("Host platform", "x86_64-unknown-linux")
,("Build platform", "x86_64-unknown-linux")
,("Target platform", "x86_64-unknown-linux")
,("Compiler unregisterised", "NO")
,("Tables next to code", "YES")
]
With a fresh GHC testsuite repository (using darcs get --hashed http://darcs.haskell.org/testsuite):
~/darcs/testsuite > time darcs w -s
No changes!
16.48s real 9.64s user 5.18s system 89% darcs w -s
~/darcs/testsuite > time darcs w -s
No changes!
14.98s real 8.92s user 4.16s system 87% darcs w -s
ok, so that was on an NFS filesystem, so let's try locally:
/tmp > cp -r ~/darcs/testsuite .
/tmp > cd testsuite
/tmp/testsuite > time darcs w -s
No changes!
9.43s real 9.40s user 4.39s system 146% darcs w -s
/tmp/testsuite > time darcs w -s
No changes!
9.14s real 8.96s user 3.84s system 140% darcs w -s
And with an old darcs1 repository, again on NFS:
~/darcs/ghc-testing/testsuite > time darcs w -s
4.20s real 0.24s user 0.34s system 13% darcs w -s
~/darcs/ghc-testing/testsuite > time darcs w -s
2.38s real 0.22s user 0.39s system 25% darcs w -s
Fortunately the darcs-2 format is much better:
~/darcs/testsuite_0 > time darcs w -s
No changes!
3.79s real 1.17s user 0.79s system 51% darcs w -s
~/darcs/testsuite_0 > time darcs w -s
No changes!
2.65s real 1.13s user 0.70s system 69% darcs w -s
~/darcs/testsuite_0 > time darcs w -s
No changes!
1.67s real 1.11s user 0.74s system 110% darcs w -s
So I'm not sure how much you should care about this bug, since it only affects the transitional hashed format. Still, it will probably prevent me from using the hashed format for the GHC repos.
Cheers,
Simon
|
msg6614 (view) |
Author: mornfall |
Date: 2008-11-08.10:10:37 |
|
This is extremely weird, as it's unclear to me what difference there is between
hashed and darcs-2 in anything pertaining to whatsnew. Could you please call
"darcs revert" in both repos (saying "done" on first patch, ie. not reverting
anything) and re-running the checks? This "revert" should have the side-effect
of synchronising timestamps in working and pristine, so we can rule out any
timestamp-related issues.
Thank you.
|
msg6658 (view) |
Author: simonmar |
Date: 2008-11-12.14:15:11 |
|
> Petr Ročkai <me@mornfall.net> added the comment:
>
> This is extremely weird, as it's unclear to me what difference there is
> between
> hashed and darcs-2 in anything pertaining to whatsnew. Could you please
> call
> "darcs revert" in both repos (saying "done" on first patch, ie. not
> reverting
> anything) and re-running the checks? This "revert" should have the side-
> effect
> of synchronising timestamps in working and pristine, so we can rule out
> any
> timestamp-related issues.
Ok, I tried the 'darcs revert' trick, and it does indeed make a difference. However, it also made a difference on my darcs-1 repo, so we still have a performance discrepancy :-)
Here are the latest results. Two runs in each case, the first one to warm up the cache. All repos are NFS-mounted.
darcs-1:
$ time darcs w -s
M ./timeout/timeout.hs -32 +26
0.91s real 0.19s user 0.23s system 46% darcs w -s
$ time darcs w -s
M ./timeout/timeout.hs -32 +26
0.45s real 0.18s user 0.19s system 83% darcs w -s
(ok, I have a small local change there, hope it doesn't make a difference)
hashed:
$ time darcs w -s
No changes!
13.73s real 1.07s user 0.66s system 12% darcs w -s
$ time darcs w -s
No changes!
1.68s real 1.08s user 0.59s system 100% darcs w -s
(I think the first run fell afoul of load on our NFS server, ignore it)
darcs-2:
~/darcs/testsuite_0 > time darcs w -s
No changes!
3.23s real 0.97s user 0.49s system 45% darcs w -s
~/darcs/testsuite_0 > time darcs w -s
No changes!
1.71s real 1.10s user 0.55s system 96% darcs w -s
So there's roughly a factor of 4 difference between darcs-1 and hashed/darcs-2.
Again, this repo is available at http://darcs.haskell.org/testsuite, so you should be able to reproduce these results quite easily.
Cheers,
Simon
|
msg6882 (view) |
Author: mornfall |
Date: 2008-12-24.23:40:29 |
|
I believe most of this extra cost is coming from cache-related juggling in the
hashed codepaths. Last time I've checked, each filepath would get passed
through a regex and that alone could account for a nontrivial timeslice.
Moreover, the directory listings for hashed pristine live in compressed files
that need to be decompressed and parsed, which could account for another slice.
All in all, I'll look at profiles again and I'll hope to come up with some
improvements.
Thorkil, I'd like to make this a release goal for 2.3, but I don't yet know how
we want to mark our bugs. *wink*
|
msg6883 (view) |
Author: mornfall |
Date: 2008-12-24.23:44:07 |
|
Oh, I almost forgot, we should probably also make darcs whatsnew fix up
timestamps, like revert does, if that'd be possible without penalty on whatsnew
speed...
|
msg7253 (view) |
Author: kowey |
Date: 2009-01-31.21:30:14 |
|
Simon, did you "just" say NFS mounted?
I recently filed http://bugs.darcs.net/issue1293
Do you find darcs much faster when everything is on a local drive, (including
your global cache?)
Thanks!
|
msg7264 (view) |
Author: simonmar |
Date: 2009-02-03.12:04:59 |
|
Some of the slowdown I originally reported is probably due to having a global
cache in my NFS-mounted home directory (which I thought was disabled, but
apparently was not, but that's another bug report).
I can't symlink ~/.darcs/cache to a local drive, because the reason I have an
NFS-mounted home dir is so that I can use the same home dir on multiple
machines, so I would need a different symlink on each machine.
So now my global cache is empty, the timings look like this. First, NFS-mounted:
~/darcs/testsuite > darcs query repo
Type: darcs
Format: hashed
Root: /home/simonmar/darcs-all/work/testsuite
Pristine: HashedPristine
Cache: thisrepo:/home/simonmar/darcs-all/work/testsuite,
cache:/home/simonmar/.darcs/cache
boringfile Pref: .darcs-boring
Default Remote: /home/simonmar/ghc-HEAD/testsuite
Num Patches: 3047
~/darcs/testsuite > time darcs w -s
No changes!
8.57s real 0.94s user 0.54s system 17% darcs w -s
~/darcs/testsuite > time darcs w -s
No changes!
3.53s real 0.93s user 0.48s system 39% darcs w -s
~/darcs/testsuite > cd ../testsuite_0
~/darcs/testsuite_0 > darcs query repo
Type: darcs
Format: hashed, darcs-2
Root: /home/simonmar/darcs-all/work/testsuite_0
Pristine: HashedPristine
Cache: thisrepo:/home/simonmar/darcs-all/work/testsuite_0,
cache:/home/simonmar/.darcs/cache
Num Patches: 3047
~/darcs/testsuite_0 > time darcs w -s
No changes!
9.47s real 1.03s user 0.56s system 16% darcs w -s
~/darcs/testsuite_0 > time darcs w -s
No changes!
3.85s real 0.97s user 0.58s system 40% darcs w -s
When using a local drive, the time drops to about 1s, for both darcs-2 and
hashed repos.
|
msg7265 (view) |
Author: kowey |
Date: 2009-02-03.13:07:39 |
|
Thanks, I'll leave the timings for the others to puzzle over.
On Tue, Feb 03, 2009 at 12:05:11 -0000, Simon Marlow wrote:
> I can't symlink ~/.darcs/cache to a local drive, because the reason I have an
> NFS-mounted home dir is so that I can use the same home dir on multiple
> machines, so I would need a different symlink on each machine.
Just some more thoughts on a workaround: if the number of machines is
limited, one option might be to configure darcs to use the local hard
disk for /all/ the machines, i.e. have a sources entry for each one.
Lots of different caches, only one of which delivering hits any time.
I wonder if this all these cache lookup failures will cost, though.
Another variant on this idea might to have some sort of global naming
convention, so that, for example, every machine has a /local-hard-disk
|
msg7334 (view) |
Author: mornfall |
Date: 2009-02-17.14:46:58 |
|
I can report following, with a hashed format repository (everything with hot
cache on a local laptop drive):
15:41:51 | morn@eri:~/dev/darcs/ghc-testsuite -> time darcs wh
No changes!
darcs wh 0,91s user 0,09s system 93% cpu 1,068 total
15:41:57 | morn@eri:~/dev/darcs/ghc-testsuite -> time gorsvet wh
gorsvet wh 0,08s user 0,02s system 90% cpu 0,106 total
If hashed-storage changes are rolled into darcs for 2.3, you can expect about
ten-fold speedup for darcs whatsnew on hashed repositories (for the local case,
anyway). Gorsvet is an experimental implementation of subset of darcs
(currently whatsnew, diff and partially record) on top of darcslib and hashed-
storage.
I haven't had chance to benchmark NFS just yet, but I expect it to help there
as well, if for nothing else because it should cut the number of filesystem
calls to half.
|
msg7335 (view) |
Author: mornfall |
Date: 2009-02-17.14:50:15 |
|
Just to back up my filesystem call claim:
15:47:48 | morn@eri:~/dev/darcs/ghc-testsuite -> strace -c darcs wh
No changes!
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
86.92 0.003941 1 3378 read
8.56 0.000388 0 10464 lstat64
2.03 0.000092 0 1583 12 stat64
1.12 0.000051 0 1699 2 open
0.75 0.000034 0 1697 close
0.62 0.000028 0 642 642 link
[cut]
15:47:38 | morn@eri:~/dev/darcs/ghc-testsuite -> strace -c gorsvet wh
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
43.64 0.000048 0 5113 6 stat64
34.55 0.000038 0 544 chdir
21.82 0.000024 0 360 1 open
[cut]
|
msg7339 (view) |
Author: simonmar |
Date: 2009-02-18.09:13:37 |
|
> Petr Ročkai <me@mornfall.net> added the comment:
>
> Just to back up my filesystem call claim:
>
> 15:47:48 | morn@eri:~/dev/darcs/ghc-testsuite -> strace -c darcs wh
> No changes!
> % time seconds usecs/call calls errors syscall
> ------ ----------- ----------- --------- --------- ----------------
> 86.92 0.003941 1 3378 read
> 8.56 0.000388 0 10464 lstat64
> 2.03 0.000092 0 1583 12 stat64
> 1.12 0.000051 0 1699 2 open
> 0.75 0.000034 0 1697 close
> 0.62 0.000028 0 642 642 link
> [cut]
>
> 15:47:38 | morn@eri:~/dev/darcs/ghc-testsuite -> strace -c gorsvet wh
> % time seconds usecs/call calls errors syscall
> ------ ----------- ----------- --------- --------- ----------------
> 43.64 0.000048 0 5113 6 stat64
> 34.55 0.000038 0 544 chdir
> 21.82 0.000024 0 360 1 open
> [cut]
Nice going Petr, I look forward to seeing this in darcs proper!
Cheers,
Simon
|
msg7910 (view) |
Author: mornfall |
Date: 2009-06-11.23:01:28 |
|
Some progress update. The index code is on track for the 2.3 release planned
for July this year. Current timing in ghc-testsuite, with darcs goes as:
(; for i in `seq 1 20`; do; darcs wh; done; ) 0,66s user 0,39s system 99% cpu
1,055 total
(i.e. it almost finishes 20 times within a second... there's a non-negligible
jitter already at timespans this short)
I haven't done NFS testing yet myself, but rumour has it that the index helps
substantially on NFS as well.
|
msg8654 (view) |
Author: kowey |
Date: 2009-09-02.20:31:54 |
|
I assume we can call this one fixed.
|
|
Date |
User |
Action |
Args |
2008-11-04 13:38:42 | simonmar | create | |
2008-11-08 10:10:39 | mornfall | set | priority: bug nosy:
+ mornfall status: unread -> unknown messages:
+ msg6614 |
2008-11-12 14:15:14 | simonmar | set | nosy:
kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall messages:
+ msg6658 |
2008-11-24 10:01:09 | kowey | set | topic:
+ Performance nosy:
kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall |
2008-12-24 23:40:33 | mornfall | set | nosy:
kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall messages:
+ msg6882 assignedto: mornfall |
2008-12-24 23:44:09 | mornfall | set | nosy:
kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall messages:
+ msg6883 |
2008-12-28 11:41:09 | mornfall | set | topic:
+ Target-2.3 nosy:
kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall |
2009-01-31 21:30:16 | kowey | set | nosy:
kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall messages:
+ msg7253 |
2009-02-03 12:05:11 | simonmar | set | nosy:
kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall messages:
+ msg7264 |
2009-02-03 13:06:42 | kowey | set | nosy:
kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall title: darcs whatsnew significantly slower on a hashed repository -> global cache on network drive is slow |
2009-02-03 13:07:43 | kowey | set | nosy:
kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall messages:
+ msg7265 title: global cache on network drive is slow -> darcs whatsnew significantly slower on a hashed repository (cache on network drive) |
2009-02-17 14:47:02 | mornfall | set | nosy:
kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall messages:
+ msg7334 |
2009-02-17 14:50:19 | mornfall | set | nosy:
kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall messages:
+ msg7335 |
2009-02-18 09:13:40 | simonmar | set | nosy:
+ noaddress messages:
+ msg7339 |
2009-06-11 23:01:33 | mornfall | set | status: unknown -> has-patch nosy:
kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall, noaddress messages:
+ msg7910 |
2009-08-10 23:50:37 | admin | set | nosy:
- dagit |
2009-08-25 17:33:40 | admin | set | nosy:
+ darcs-devel, - simon |
2009-08-27 14:20:43 | admin | set | nosy:
kowey, darcs-devel, simonmar, thorkilnaur, dmitry.kurochkin, mornfall, noaddress |
2009-09-02 20:31:55 | kowey | set | status: has-patch -> resolved nosy:
kowey, darcs-devel, simonmar, thorkilnaur, dmitry.kurochkin, mornfall, noaddress messages:
+ msg8654 |
2009-10-23 22:37:29 | admin | set | nosy:
+ marlowsd, - simonmar |
2009-10-23 23:35:56 | admin | set | nosy:
+ simonmar, - marlowsd |
2010-06-15 21:51:34 | admin | set | milestone: 2.3.x |
2010-06-15 21:51:36 | admin | set | topic:
- Target-2.3 |
|