darcs

Issue 1202 darcs whatsnew significantly slower on a hashed repository (cache on network drive)

Title darcs whatsnew significantly slower on a hashed repository (cache on network drive)
Priority bug Status resolved
Milestone 2.3.x Resolved in
Superseder Nosy List darcs-devel, dmitry.kurochkin, kowey, mornfall, noaddress, simonmar, thorkilnaur
Assigned To mornfall
Topics Performance

Created on 2008-11-04.13:38:42 by simonmar, last changed 2010-06-15.21:51:36 by admin.

Messages
msg6590 (view) Author: simonmar Date: 2008-11-04.13:38:40
As requested by Eric, I'm submitting a new ticket on this.  Note this slowdown is *not* caused by the GHC 6.8.2 getSymbolicLinkStatus bug.

2.0.2 (+ 326 patches)
> darcs +RTS --info
 [("GHC RTS", "Yes")
 ,("GHC version", "6.8.3")
 ,("RTS way", "rts_thr")
 ,("Host platform", "x86_64-unknown-linux")
 ,("Build platform", "x86_64-unknown-linux")
 ,("Target platform", "x86_64-unknown-linux")
 ,("Compiler unregisterised", "NO")
 ,("Tables next to code", "YES")
 ]

With a fresh GHC testsuite repository (using darcs get --hashed http://darcs.haskell.org/testsuite):

~/darcs/testsuite > time darcs w -s
No changes!
16.48s real   9.64s user   5.18s system   89% darcs w -s
~/darcs/testsuite > time darcs w -s
No changes!
14.98s real   8.92s user   4.16s system   87% darcs w -s

ok, so that was on an NFS filesystem, so let's try locally:

/tmp > cp -r ~/darcs/testsuite .
/tmp > cd testsuite
/tmp/testsuite > time darcs w -s
No changes!
9.43s real   9.40s user   4.39s system   146% darcs w -s
/tmp/testsuite > time darcs w -s
No changes!
9.14s real   8.96s user   3.84s system   140% darcs w -s

And with an old darcs1 repository, again on NFS:

~/darcs/ghc-testing/testsuite > time darcs w -s
4.20s real   0.24s user   0.34s system   13% darcs w -s
~/darcs/ghc-testing/testsuite > time darcs w -s
2.38s real   0.22s user   0.39s system   25% darcs w -s

Fortunately the darcs-2 format is much better:

~/darcs/testsuite_0 > time darcs w -s
No changes!
3.79s real   1.17s user   0.79s system   51% darcs w -s
~/darcs/testsuite_0 > time darcs w -s
No changes!
2.65s real   1.13s user   0.70s system   69% darcs w -s
~/darcs/testsuite_0 > time darcs w -s
No changes!
1.67s real   1.11s user   0.74s system   110% darcs w -s

So I'm not sure how much you should care about this bug, since it only affects the transitional hashed format.  Still, it will probably prevent me from using the hashed format for the GHC repos.

Cheers,
        Simon
msg6614 (view) Author: mornfall Date: 2008-11-08.10:10:37
This is extremely weird, as it's unclear to me what difference there is between
hashed and darcs-2 in anything pertaining to whatsnew. Could you please call
"darcs revert" in both repos (saying "done" on first patch, ie. not reverting
anything) and re-running the checks? This "revert" should have the side-effect
of synchronising timestamps in working and pristine, so we can rule out any
timestamp-related issues.

Thank you.
msg6658 (view) Author: simonmar Date: 2008-11-12.14:15:11
> Petr Ročkai <me@mornfall.net> added the comment:
>
> This is extremely weird, as it's unclear to me what difference there is
> between
> hashed and darcs-2 in anything pertaining to whatsnew. Could you please
> call
> "darcs revert" in both repos (saying "done" on first patch, ie. not
> reverting
> anything) and re-running the checks? This "revert" should have the side-
> effect
> of synchronising timestamps in working and pristine, so we can rule out
> any
> timestamp-related issues.

Ok, I tried the 'darcs revert' trick, and it does indeed make a difference.  However, it also made a difference on my darcs-1 repo, so we still have a performance discrepancy :-)

Here are the latest results.  Two runs in each case, the first one to warm up the cache.  All repos are NFS-mounted.

darcs-1:

$ time darcs w -s
M ./timeout/timeout.hs -32 +26
0.91s real   0.19s user   0.23s system   46% darcs w -s
$ time darcs w -s
M ./timeout/timeout.hs -32 +26
0.45s real   0.18s user   0.19s system   83% darcs w -s

(ok, I have a small local change there, hope it doesn't make a difference)

hashed:

$ time darcs w -s
No changes!
13.73s real   1.07s user   0.66s system   12% darcs w -s
$ time darcs w -s
No changes!
1.68s real   1.08s user   0.59s system   100% darcs w -s

(I think the first run fell afoul of load on our NFS server, ignore it)

darcs-2:

~/darcs/testsuite_0 > time darcs w -s
No changes!
3.23s real   0.97s user   0.49s system   45% darcs w -s
~/darcs/testsuite_0 > time darcs w -s
No changes!
1.71s real   1.10s user   0.55s system   96% darcs w -s

So there's roughly a factor of 4 difference between darcs-1 and hashed/darcs-2.

Again, this repo is available at http://darcs.haskell.org/testsuite, so you should be able to reproduce these results quite easily.

Cheers,
        Simon
msg6882 (view) Author: mornfall Date: 2008-12-24.23:40:29
I believe most of this extra cost is coming from cache-related juggling in the 
hashed codepaths. Last time I've checked, each filepath would get passed 
through a regex and that alone could account for a nontrivial timeslice. 
Moreover, the directory listings for hashed pristine live in compressed files 
that need to be decompressed and parsed, which could account for another slice. 
All in all, I'll look at profiles again and I'll hope to come up with some 
improvements.

Thorkil, I'd like to make this a release goal for 2.3, but I don't yet know how 
we want to mark our bugs. *wink*
msg6883 (view) Author: mornfall Date: 2008-12-24.23:44:07
Oh, I almost forgot, we should probably also make darcs whatsnew fix up 
timestamps, like revert does, if that'd be possible without penalty on whatsnew 
speed...
msg7253 (view) Author: kowey Date: 2009-01-31.21:30:14
Simon, did you "just" say NFS mounted?

I recently filed http://bugs.darcs.net/issue1293

Do you find darcs much faster when everything is on a local drive, (including
your global cache?)

Thanks!
msg7264 (view) Author: simonmar Date: 2009-02-03.12:04:59
Some of the slowdown I originally reported is probably due to having a global
cache in my NFS-mounted home directory (which I thought was disabled, but
apparently was not, but that's another bug report).

I can't symlink ~/.darcs/cache to a local drive, because the reason I have an
NFS-mounted home dir is so that I can use the same home dir on multiple
machines, so I would need a different symlink on each machine.

So now my global cache is empty, the timings look like this.  First, NFS-mounted:

~/darcs/testsuite > darcs query repo
          Type: darcs
        Format: hashed
          Root: /home/simonmar/darcs-all/work/testsuite
      Pristine: HashedPristine
         Cache: thisrepo:/home/simonmar/darcs-all/work/testsuite,
cache:/home/simonmar/.darcs/cache
boringfile Pref: .darcs-boring
Default Remote: /home/simonmar/ghc-HEAD/testsuite
   Num Patches: 3047
~/darcs/testsuite > time darcs w -s
No changes!              
8.57s real   0.94s user   0.54s system   17% darcs w -s
~/darcs/testsuite > time darcs w -s
No changes!              
3.53s real   0.93s user   0.48s system   39% darcs w -s
~/darcs/testsuite > cd ../testsuite_0 
~/darcs/testsuite_0 > darcs query repo
          Type: darcs
        Format: hashed, darcs-2
          Root: /home/simonmar/darcs-all/work/testsuite_0
      Pristine: HashedPristine
         Cache: thisrepo:/home/simonmar/darcs-all/work/testsuite_0,
cache:/home/simonmar/.darcs/cache
   Num Patches: 3047
~/darcs/testsuite_0 > time darcs w -s  
No changes!              
9.47s real   1.03s user   0.56s system   16% darcs w -s
~/darcs/testsuite_0 > time darcs w -s
No changes!              
3.85s real   0.97s user   0.58s system   40% darcs w -s

When using a local drive, the time drops to about 1s, for both darcs-2 and
hashed repos.
msg7265 (view) Author: kowey Date: 2009-02-03.13:07:39
Thanks, I'll leave the timings for the others to puzzle over.

On Tue, Feb 03, 2009 at 12:05:11 -0000, Simon Marlow wrote:
> I can't symlink ~/.darcs/cache to a local drive, because the reason I have an
> NFS-mounted home dir is so that I can use the same home dir on multiple
> machines, so I would need a different symlink on each machine.

Just some more thoughts on a workaround: if the number of machines is
limited, one option might be to configure darcs to use the local hard
disk for /all/ the machines, i.e. have a sources entry for each one.
Lots of different caches, only one of which delivering hits any time.
I wonder if this all these cache lookup failures will cost, though.

Another variant on this idea might to have some sort of global naming
convention, so that, for example, every machine has a /local-hard-disk
msg7334 (view) Author: mornfall Date: 2009-02-17.14:46:58
I can report following, with a hashed format repository (everything with hot 
cache on a local laptop drive):

15:41:51 | morn@eri:~/dev/darcs/ghc-testsuite -> time darcs wh
No changes!              
darcs wh  0,91s user 0,09s system 93% cpu 1,068 total
15:41:57 | morn@eri:~/dev/darcs/ghc-testsuite -> time gorsvet wh
gorsvet wh  0,08s user 0,02s system 90% cpu 0,106 total

If hashed-storage changes are rolled into darcs for 2.3, you can expect about 
ten-fold speedup for darcs whatsnew on hashed repositories (for the local case, 
anyway). Gorsvet is an experimental implementation of subset of darcs 
(currently whatsnew, diff and  partially record) on top of darcslib and hashed-
storage.

I haven't had chance to benchmark NFS just yet, but I expect it to  help there 
as well, if for nothing else because it should cut the number of filesystem 
calls to half.
msg7335 (view) Author: mornfall Date: 2009-02-17.14:50:15
Just to back up my filesystem call claim:

15:47:48 | morn@eri:~/dev/darcs/ghc-testsuite -> strace -c darcs wh
No changes!              
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 86.92    0.003941           1      3378           read
  8.56    0.000388           0     10464           lstat64
  2.03    0.000092           0      1583        12 stat64
  1.12    0.000051           0      1699         2 open
  0.75    0.000034           0      1697           close
  0.62    0.000028           0       642       642 link
[cut]

15:47:38 | morn@eri:~/dev/darcs/ghc-testsuite -> strace -c gorsvet wh
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 43.64    0.000048           0      5113         6 stat64
 34.55    0.000038           0       544           chdir
 21.82    0.000024           0       360         1 open
[cut]
msg7339 (view) Author: simonmar Date: 2009-02-18.09:13:37
> Petr Ročkai <me@mornfall.net> added the comment:
>
> Just to back up my filesystem call claim:
>
> 15:47:48 | morn@eri:~/dev/darcs/ghc-testsuite -> strace -c darcs wh
> No changes!
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  86.92    0.003941           1      3378           read
>   8.56    0.000388           0     10464           lstat64
>   2.03    0.000092           0      1583        12 stat64
>   1.12    0.000051           0      1699         2 open
>   0.75    0.000034           0      1697           close
>   0.62    0.000028           0       642       642 link
> [cut]
>
> 15:47:38 | morn@eri:~/dev/darcs/ghc-testsuite -> strace -c gorsvet wh
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  43.64    0.000048           0      5113         6 stat64
>  34.55    0.000038           0       544           chdir
>  21.82    0.000024           0       360         1 open
> [cut]

Nice going Petr, I look forward to seeing this in darcs proper!

Cheers,
        Simon
msg7910 (view) Author: mornfall Date: 2009-06-11.23:01:28
Some progress update. The index code is on track for the 2.3 release planned
for July this year. Current timing in ghc-testsuite, with darcs goes as:
(; for i in `seq 1 20`; do; darcs wh; done; )  0,66s user 0,39s system 99% cpu
1,055 total

(i.e. it almost finishes 20 times within a second... there's a non-negligible
jitter already at timespans this short)

I haven't done NFS testing yet myself, but rumour has it that the index helps
substantially on NFS as well.
msg8654 (view) Author: kowey Date: 2009-09-02.20:31:54
I assume we can call this one fixed.
History
Date User Action Args
2008-11-04 13:38:42simonmarcreate
2008-11-08 10:10:39mornfallsetpriority: bug
nosy: + mornfall
status: unread -> unknown
messages: + msg6614
2008-11-12 14:15:14simonmarsetnosy: kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall
messages: + msg6658
2008-11-24 10:01:09koweysettopic: + Performance
nosy: kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall
2008-12-24 23:40:33mornfallsetnosy: kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall
messages: + msg6882
assignedto: mornfall
2008-12-24 23:44:09mornfallsetnosy: kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall
messages: + msg6883
2008-12-28 11:41:09mornfallsettopic: + Target-2.3
nosy: kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall
2009-01-31 21:30:16koweysetnosy: kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall
messages: + msg7253
2009-02-03 12:05:11simonmarsetnosy: kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall
messages: + msg7264
2009-02-03 13:06:42koweysetnosy: kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall
title: darcs whatsnew significantly slower on a hashed repository -> global cache on network drive is slow
2009-02-03 13:07:43koweysetnosy: kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall
messages: + msg7265
title: global cache on network drive is slow -> darcs whatsnew significantly slower on a hashed repository (cache on network drive)
2009-02-17 14:47:02mornfallsetnosy: kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall
messages: + msg7334
2009-02-17 14:50:19mornfallsetnosy: kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall
messages: + msg7335
2009-02-18 09:13:40simonmarsetnosy: + noaddress
messages: + msg7339
2009-06-11 23:01:33mornfallsetstatus: unknown -> has-patch
nosy: kowey, dagit, simonmar, simon, thorkilnaur, dmitry.kurochkin, mornfall, noaddress
messages: + msg7910
2009-08-10 23:50:37adminsetnosy: - dagit
2009-08-25 17:33:40adminsetnosy: + darcs-devel, - simon
2009-08-27 14:20:43adminsetnosy: kowey, darcs-devel, simonmar, thorkilnaur, dmitry.kurochkin, mornfall, noaddress
2009-09-02 20:31:55koweysetstatus: has-patch -> resolved
nosy: kowey, darcs-devel, simonmar, thorkilnaur, dmitry.kurochkin, mornfall, noaddress
messages: + msg8654
2009-10-23 22:37:29adminsetnosy: + marlowsd, - simonmar
2009-10-23 23:35:56adminsetnosy: + simonmar, - marlowsd
2010-06-15 21:51:34adminsetmilestone: 2.3.x
2010-06-15 21:51:36adminsettopic: - Target-2.3