Created on 2008-07-25.10:10:25 by kowey, last changed 2009-10-24.00:05:57 by admin.
msg5264 (view) |
Author: kowey |
Date: 2008-07-25.10:10:21 |
|
I haven't seen a ticket on this yet.
Simon M mentions that the hashed format is slow enough for darcs
whatsnew that he has switched back to the original format.
Presumably, if we fix whatever makes the hashed format significantly
slower, this will also translate to gains for the darcs-2 format. It
would be useful if somebody would check and see if how the hashed
format compares with darcs-2.
If only there was more work going on with the maybench ([darcs]
benchmarking) project...
|
msg5266 (view) |
Author: kowey |
Date: 2008-07-25.10:13:42 |
|
Adding relevant people and marking urgent.
Darcs-2 performance is a big barrier to adoption for people with biggish
repositories, which is bad because biggish repositories are especially
susceptible to the kinds of problems that darcs fixes. Plus, the hashed format
thing should be a clue, eh?
|
msg5269 (view) |
Author: ertai |
Date: 2008-07-25.13:25:41 |
|
I remember that some "whatsnew" optimizations was harder to make with the
hashed format.
> If only there was more work going on with the maybench ([darcs]
> benchmarking) project...
My bad, partly...
|
msg5283 (view) |
Author: droundy |
Date: 2008-07-30.17:19:48 |
|
I haven't looked into this, and don't have time to look into it, but it's
probably just a strictness bug.
David
|
msg5304 (view) |
Author: mornfall |
Date: 2008-08-07.09:54:43 |
|
I have some possible insight on this issue. On systems using GHC 6.8.2 (not all
of them, but many -- including all Debian x86 installs from packages),
getSymbolicLinkStatus is utterly broken, returning bogus mtimes, which completely
kills any mtime-based optimisation. This means we have been basically running
with --ignore-times here... The difference is very significant on a mostly clean
working tree and much more so for hashed repository, as the slurpHashed there
computes sha256 of every file in pristine (through fetchFileUsingCachePrivate),
unlike the old-fashioned one.
|
msg5319 (view) |
Author: markstos |
Date: 2008-08-07.14:17:12 |
|
Given mornfall's comment, can we get anyone to re-test with 6.8.3 or newer as a
first step?
I suppose a direct benchmark between 6.8.2 and a newer GHC would be especially
useful.
|
msg5340 (view) |
Author: mornfall |
Date: 2008-08-08.23:37:29 |
|
Ok, after fixing that mtime borkage, times on my repository (2200 files or so) go
like .5s with working mtime and 3.5 with non-working. However, on ghc repo, it
doesn't help, and now I found why: ghc-hashedrepo, as sitting on
darcs.haskell.org, is in some weird intermediate (not final) hashed format that
doesn't keep (original) filesizes in filenames. So in effect, it's running like
with --ignore-times still, because the HashedIO slurpy gives unknown_size and
smart_diff then decides it better should do a full diff, which it does. Which is
slow. Duh.
I am downlnoading ghc-HEAD-2008-07-24-ghc.tar.bz2 from darcs.haskell.org to
convert it to hashed myself tomorrow (it's going at 10K/s so no chance to do it
before passing out today).
|
msg5348 (view) |
Author: mornfall |
Date: 2008-08-09.11:27:30 |
|
13:10:46 | xroc@ann:~/dev/public/ghc-hashed -> time pdarcs-2.0.2-nolink wh -s --
No changes!
pdarcs-2.0.2-nolink wh -s --ignore-times 1,59s user 0,25s system 100% cpu 1,839
total
13:10:48 | xroc@ann:~/dev/public/ghc-hashed -> time pdarcs-2.0.2-nolink wh -s
No changes!
pdarcs-2.0.2-nolink wh -s 0,26s user 0,06s system 99% cpu 0,320 total
13:11:33 | xroc@ann:~/dev/public/ghc -> time pdarcs-2.0.2-nolink wh -s --ignore-
No changes!
pdarcs-2.0.2-nolink wh -s --ignore-times 0,38s user 0,14s system 99% cpu 0,525
total
13:11:34 | xroc@ann:~/dev/public/ghc -> time pdarcs-2.0.2-nolink wh -s
No changes!
pdarcs-2.0.2-nolink wh -s 0,04s user 0,04s system 97% cpu 0,086 total
Hmm, it still is a fair bit slower. From a profile, the old-fashioned version
takes .06 secs of cpu time, which isn't much more than git, really. It could
possibly be optimized further nevertheless.
Looking at the hashed profile, I see
slh Darcs.Repository.HashedIO 30.8 18.1
(this is the HashedIO slurpy)
and:
readroot Darcs.Repository.HashedIO
946 630 0.0 0.1 46.2 65.3
readdir Darcs.Repository.HashedIO
952 2417 0.0 1.7 38.5 64.5
unpackPS FastPackedString
1042 1208 15.4 42.4 15.4 42.4
and atop of that
slh Darcs.Repository.HashedIO
945 6603 30.8 18.1 84.6 90.8
So basically all the extra time (90 % of those 13 20ms ticks) is spent in
HashedIO Slurpy (well, not really that surprising, is it?). Now that unpackPS
figure could be addressed by using PackedString for the hashes (instead of
String), although it does seem to propagate quite far (it might even be
responsible for other bits of slowness, though -- the Hashed path does a fair bit
of string fiddling it seems).
|
msg5378 (view) |
Author: mornfall |
Date: 2008-08-11.15:22:10 |
|
I think we need to address some things here:
1) try to ensure we are not using broken ghc to build darcs -- and warn about it
if we do (alternatively, never use getSymbolicLinkStatus to get mtime)
2) warn users about obsolete repository formats (no size in hashed filename)
3) improve the actual performance of whatsnew on hashed, which is still .32
seconds compared to 0.09 or so (it is fairly bearable I think, but could use
improving)
|
msg5387 (view) |
Author: mornfall |
Date: 2008-08-11.23:39:51 |
|
For the bogus getSymbolicLinkStatus mtime problem (which by the way also makes
fileSize results bogus), see http://hackage.haskell.org/trac/ghc/ticket/2093
It has been fixed for 6.8.3 the ticket says, so we should probably just blacklist
6.8.2 for building darcs? Note that 6.8.2 is what Debian Lenny will release with,
so we are sort of screwed on that front. I have no idea about status of other
major distributions.
Igloo has suggested that it should be possible to work around that problem on
darcs side by foreign-importing lstat ourselves.
|
msg5395 (view) |
Author: kowey |
Date: 2008-08-12.09:31:05 |
|
Well, do we do have a Workaround.hs mechanism that we could use to implement our
own getSymbolicLinkStatus (Ian's solution). Is there a way to detect buggy
getSymbolicLinkStatus directly (as opposed to checking GHC version?)
In any case, we should probably enter it into darcs folklore that compiling with
GHC 6.8.3 is way preferable to 6.8.2, for performance reasons. We would need a
comparison, though... especially under Windows.
|
msg5396 (view) |
Author: mornfall |
Date: 2008-08-12.09:39:36 |
|
I actually don't think this applies to Windows at all. The comparison goes as
3.5s vs .5s on a medium-sized repository (a little bigger than ghc I think, at
least to number of files -- smaller in number of patches). For whatsnew, that is.
I think that's a good enough comparison -- 600% speedup?
|
msg5511 (view) |
Author: gwern |
Date: 2008-08-14.17:34:38 |
|
I recently converted a up-to-date GHC repo to darcs-2 format. Here's a timing
with the original:
gwern@craft:16593~/bin/ghc>time =darcs query repo
[ 1:30PM]
Type: darcs
Show: Repo "/home/gwern/bin/ghc" [FixFilePath "/home/gwern/bin/ghc"
"",Verbose] (RF [["darcs-1.0"]]) (DarcsRepository (PlainPristine
"_darcs/pristine") thisrepo:/home/gwern/bin/ghc
cache:/home/gwern/.darcs/cache
)
Format: darcs-1.0
Root: /home/gwern/bin/ghc
Pristine: PlainPristine "_darcs/pristine"
Cache: thisrepo:/home/gwern/bin/ghc, cache:/home/gwern/.darcs/cache
boringfile Pref: .darcs-boring
Default Remote: http://darcs.haskell.org/ghc
Num Patches: 19263
=darcs query repo 0.05s user 0.02s system 32% cpu 0.209 total
gwern@craft:16589~/bin/ghc>time =darcs whatsnew
[ 1:29PM]
diffing dir...
binary ./WindowsInstaller/Glasgow\32\Haskell\32\Compiler.ism
rmfile ./WindowsInstaller/Glasgow\32\Haskell\32\Compiler.ism
=darcs whatsnew 0.04s user 0.01s system 44% cpu 0.107 total
Okay, that's good. No one can complain about a 0.1s whatsnew.
On the converted repo:
gwern@craft:16597~/bin/ghc-darcs2>time =darcs query repo
[ 1:31PM]
Type: darcs
Show: Repo "/home/gwern/bin/ghc-darcs2" [FixFilePath
"/home/gwern/bin/ghc-darcs2" "",Verbose] (RF [["hashed"],["darcs-2"]])
(DarcsRepository HashedPristine thisrepo:/home/gwern/bin/ghc-darcs2
cache:/home/gwern/.darcs/cache
)
Format: hashed, darcs-2
Root: /home/gwern/bin/ghc-darcs2
Pristine: HashedPristine
Cache: thisrepo:/home/gwern/bin/ghc-darcs2, cache:/home/gwern/.darcs/cache
Num Patches: 19263
=darcs query repo 0.21s user 0.02s system 72% cpu 0.310 total
gwern@craft:16600~/bin/ghc-darcs2>time =darcs whatsnew
[ 1:31PM]
diffing dir...
binary ./WindowsInstaller/Glasgow\32\Haskell\32\Compiler.ism
rmfile ./WindowsInstaller/Glasgow\32\Haskell\32\Compiler.ism
=darcs whatsnew 0.10s user 0.03s system 55% cpu 0.236 total
I would be more concerned about this doubling of time, if the base weren't so
small. :)
The important thing is that I'm on a 64-bit Ubuntu Heron, so I avoid the
mentioned file bug. I don't think Windows has anything to do with it; so I agree
with Mornfall here.
Given this major performance regression - and also the bug with mixing -profile
and -threaded - telling people to avoid 6.8.2 in favor of 6.8.3 is starting to
sound really good to me (as opposed to doing funky FFI stuff and also
complicating the autoconf stuff even further).
|
msg5879 (view) |
Author: markstos |
Date: 2008-09-01.21:48:01 |
|
I'm not sure how helpful this is, but I filled a GHC bug with Ubuntu regarding
6.8.2:
https://bugs.launchpad.net/ubuntu/+source/ghc6/+bug/263773
|
msg5883 (view) |
Author: markstos |
Date: 2008-09-02.18:48:12 |
|
Ubuntu is taking the GHC bug report seriously, but they are on the fence about
whether they can get it done in time for Intreprid, due to manpower. Is there
anyone from the darcs camp that could help? And could a Haskeller pass this on
to the Haskell community, since it affects all GHC uses, and not just darcs?
You can follow the discussion to date here:
https://bugs.launchpad.net/ubuntu/+source/ghc6/+bug/263773
It doesn't sound like hard work, just a lot of it.
|
msg6168 (view) |
Author: mornfall |
Date: 2008-09-29.11:39:58 |
|
A version of ghc 6.8.2 with the relevant fix backported has been uploaded to
Debian unstable. Thanks Kari Pahula. We'll see what can be done about pushing
that fix through to Lenny...
|
msg6170 (view) |
Author: markstos |
Date: 2008-09-29.13:46:04 |
|
mornfall,
Would you be able to test that binary generated for Ubuntu is fixed as well?
http://launchpadlibrarian.net/18008357/darcs_2.0.2-2ubuntu2_i386.deb
Their bug report is here:
https://bugs.launchpad.net/ubuntu/+source/ghc6/+bug/263773
|
msg6171 (view) |
Author: mornfall |
Date: 2008-09-29.14:28:13 |
|
Yes, I can confirm the Ubuntu deb being fixed. I am uploading fixed darcs to
Debian unstable in a little while as well.
|
msg6173 (view) |
Author: markstos |
Date: 2008-09-29.15:06:50 |
|
So this is ticket resolvable now, or is there another part to
"darcs whatsnew significantly slower on hashed format" that is separate from the
ghc 6.8.2 bug?
Thanks to everyone who helped address this!
|
msg6178 (view) |
Author: markstos |
Date: 2008-09-30.13:46:12 |
|
> Why did they just take that one fix from 6.8.3? There were a bunch of other
> really important fixes in 6.8.3, and since you still have to recompile all the
> libraries anyway it would seem sensible to just move to 6.8.3.
Simon,
They explained this in the Ubuntu bug thread:
1. There is not time for a full re-build of everything that is effected before
the Intrepid release-- it was considered.
2. It was determined that this one fix could be done without breaking the ABI,
and be done in time.
https://bugs.launchpad.net/ubuntu/+source/ghc6/+bug/263773
|
msg6588 (view) |
Author: kowey |
Date: 2008-11-04.12:47:04 |
|
Simon Marlow : you reported this offhand, that hashed repositories were
significantly slower than old-fashioned ones. The fix has been to compile darcs
with GHC 6.8.3. I'm going to mark this as resolved. If you notice this
behaviour still (perhaps under Windows? Petr says the bug/fix does not affect
Windows in any way), please open a new ticket
|
msg6592 (view) |
Author: markstos |
Date: 2008-11-04.15:50:08 |
|
I agree with the resolved status, but want to note that I think both Debian and
Ubuntu patched their versions of GHC 6.8.2 to quit having the related bug as well.
|
msg7319 (view) |
Author: kowey |
Date: 2009-02-15.18:46:31 |
|
Just noting from
http://packages.debian.org/changelogs/pool/main/g/ghc6/ghc6_6.8.2dfsg1-1/changelog
that the patched version of GHC 6.8.2 seems to have made it in indeed. In
particular, this is closed: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=500407
|
msg7398 (view) |
Author: zooko |
Date: 2009-03-04.15:49:32 |
|
I guess this has been resolved by Debian and Ubuntu patching their ghc 6.8.2.
|
|
Date |
User |
Action |
Args |
2008-07-25 10:10:25 | kowey | create | |
2008-07-25 10:11:00 | kowey | link | issue972 superseder |
2008-07-25 10:13:48 | kowey | set | priority: urgent nosy:
+ droundy, simonmar topic:
+ Performance status: unread -> unknown messages:
+ msg5266 |
2008-07-25 13:25:44 | ertai | set | nosy:
+ ertai messages:
+ msg5269 |
2008-07-30 17:19:51 | droundy | set | nosy:
droundy, tommy, beschmi, kowey, dagit, simonmar, ertai messages:
+ msg5283 |
2008-08-07 09:54:47 | mornfall | set | nosy:
+ mornfall messages:
+ msg5304 |
2008-08-07 13:34:59 | markstos | set | nosy:
+ markstos |
2008-08-07 14:17:15 | markstos | set | nosy:
droundy, tommy, beschmi, kowey, markstos, dagit, simonmar, ertai, mornfall messages:
+ msg5319 |
2008-08-08 23:37:33 | mornfall | set | nosy:
droundy, tommy, beschmi, kowey, markstos, dagit, simonmar, ertai, mornfall messages:
+ msg5340 |
2008-08-09 11:27:34 | mornfall | set | nosy:
droundy, tommy, beschmi, kowey, markstos, dagit, simonmar, ertai, mornfall messages:
+ msg5348 |
2008-08-11 15:22:13 | mornfall | set | nosy:
droundy, tommy, beschmi, kowey, markstos, dagit, simonmar, ertai, mornfall messages:
+ msg5378 assignedto: mornfall |
2008-08-11 23:39:54 | mornfall | set | nosy:
droundy, tommy, beschmi, kowey, markstos, dagit, simonmar, ertai, mornfall messages:
+ msg5387 |
2008-08-12 09:31:08 | kowey | set | nosy:
droundy, tommy, beschmi, kowey, markstos, dagit, simonmar, ertai, mornfall messages:
+ msg5395 |
2008-08-12 09:39:39 | mornfall | set | nosy:
droundy, tommy, beschmi, kowey, markstos, dagit, simonmar, ertai, mornfall messages:
+ msg5396 |
2008-08-12 16:36:45 | kowey | link | issue986 superseder |
2008-08-14 17:11:29 | gwern | set | nosy:
+ darcs-devel, gwern |
2008-08-14 17:34:40 | gwern | set | nosy:
droundy, tommy, beschmi, kowey, markstos, darcs-devel, dagit, simonmar, gwern, ertai, mornfall messages:
+ msg5511 |
2008-09-01 21:48:03 | markstos | set | nosy:
droundy, tommy, beschmi, kowey, markstos, darcs-devel, dagit, simonmar, gwern, ertai, mornfall messages:
+ msg5879 |
2008-09-02 18:48:14 | markstos | set | nosy:
droundy, tommy, beschmi, kowey, markstos, darcs-devel, dagit, simonmar, gwern, ertai, mornfall messages:
+ msg5883 |
2008-09-29 11:40:00 | mornfall | set | nosy:
+ simon, thorkilnaur messages:
+ msg6168 |
2008-09-29 13:46:06 | markstos | set | nosy:
droundy, tommy, beschmi, kowey, markstos, darcs-devel, dagit, simonmar, simon, thorkilnaur, gwern, ertai, mornfall messages:
+ msg6170 title: darcs whatsnew significantly slower on hashed format -> darcs whatsnew significantly slower on hashed format (ghc 6.8.2 bug affecting Debian and Ubuntu) |
2008-09-29 14:28:14 | mornfall | set | nosy:
droundy, tommy, beschmi, kowey, markstos, darcs-devel, dagit, simonmar, simon, thorkilnaur, gwern, ertai, mornfall messages:
+ msg6171 |
2008-09-29 15:06:51 | markstos | set | nosy:
droundy, tommy, beschmi, kowey, markstos, darcs-devel, dagit, simonmar, simon, thorkilnaur, gwern, ertai, mornfall messages:
+ msg6173 |
2008-09-30 13:46:15 | markstos | set | nosy:
droundy, tommy, beschmi, kowey, markstos, darcs-devel, dagit, simonmar, simon, thorkilnaur, gwern, ertai, mornfall messages:
+ msg6178 |
2008-11-04 12:47:06 | kowey | set | nosy:
+ dmitry.kurochkin messages:
+ msg6588 |
2008-11-04 12:50:22 | kowey | set | status: unknown -> resolved nosy:
droundy, tommy, beschmi, kowey, markstos, darcs-devel, dagit, simonmar, simon, thorkilnaur, gwern, ertai, dmitry.kurochkin, mornfall |
2008-11-04 15:50:10 | markstos | set | status: resolved -> unknown nosy:
droundy, tommy, beschmi, kowey, markstos, darcs-devel, dagit, simonmar, simon, thorkilnaur, gwern, ertai, dmitry.kurochkin, mornfall messages:
+ msg6592 |
2008-11-04 16:52:39 | kowey | set | status: unknown -> resolved nosy:
droundy, tommy, beschmi, kowey, markstos, darcs-devel, dagit, simonmar, simon, thorkilnaur, gwern, ertai, dmitry.kurochkin, mornfall |
2009-02-15 18:46:34 | kowey | set | status: resolved -> unknown nosy:
droundy, tommy, beschmi, kowey, markstos, darcs-devel, dagit, simonmar, simon, thorkilnaur, gwern, ertai, dmitry.kurochkin, mornfall messages:
+ msg7319 |
2009-02-15 20:21:28 | droundy | set | nosy:
- darcs-devel |
2009-03-04 15:49:35 | zooko | set | status: unknown -> resolved nosy:
+ zooko messages:
+ msg7398 |
2009-08-06 17:59:23 | admin | set | nosy:
+ jast, Serware, darcs-devel, - droundy, simonmar, gwern, ertai |
2009-08-06 21:10:02 | admin | set | nosy:
- beschmi |
2009-08-10 22:21:55 | admin | set | nosy:
+ gwern, simonmar, ertai, - darcs-devel, jast, Serware |
2009-08-11 00:19:31 | admin | set | nosy:
- dagit |
2009-08-25 17:29:47 | admin | set | nosy:
+ darcs-devel, - simon |
2009-08-27 14:15:54 | admin | set | nosy:
tommy, kowey, markstos, darcs-devel, zooko, simonmar, thorkilnaur, gwern, ertai, dmitry.kurochkin, mornfall |
2009-10-23 22:38:29 | admin | set | nosy:
+ marlowsd, - simonmar |
2009-10-23 22:40:54 | admin | set | nosy:
+ nicolas.pouillard, - ertai |
2009-10-23 23:36:58 | admin | set | nosy:
+ simonmar, - marlowsd |
2009-10-24 00:05:57 | admin | set | nosy:
+ ertai, - nicolas.pouillard |
|