Issue 814: darcs whatsnew -s does too much work

Title	darcs whatsnew -s does too much work
Priority	bug	Status	given-up
Milestone		Resolved in
Superseder		Nosy List	darcs-devel, dmitry.kurochkin, gwern, kowey, thorkilnaur, tommy
Assigned To		Topics	Performance, ThePendingPatch

Created on 2008-04-25.04:55:21 by dagit, last changed 2017-07-30.23:22:13 by gh.

Files
File name	Uploaded	Type	Edit	Remove
make-whatsnew-lazier.dpatch	dagit, 2008-04-25.18:14:18	text/x-darcs-patch

Messages
msg4328 (view)	Author: dagit	Date: 2008-04-25.04:55:20
The following commands work on OS X, and cause darcs to read bigtempfile before telling you that it is to be added. For other unices, use --bytes= instead of -c, the --darcs-2 to init doesn't matter, but I wanted to demonstrate that darcs-2 repos do not fix this issue. time head -c 17179869184 /dev/zero > bigtempfile darcs init --darcs-2 darcs add bigtempfile darcs whats -s You'll notice this takes much longer than it should. I'm pretty sure this is because adding the file puts it pending and the summary needs to know if this is a binary or non-binary patch. Determining that a file is not binary requires reading the whole file in smart_diff and diff_files because has_bin calls is_funky to look at every char in the file checking for characters that are typical in binary files.
msg4338 (view)	Author: gwern	Date: 2008-04-25.16:59:22
That seems to work here: gwern@localhost:2249~/foo>time (head -c 10179869184 /dev/zero > bigtempfile) && time (darcs init --darcs-2) && time (darcs add bigtempfile) && time (darcs whats -s) (; head -c 10179869184 /dev/zero > bigtempfile; ) 1.52s user 37.22s system 15% cpu 4:07.02 total (; darcs init --darcs-2; ) 0.00s user 0.01s system 0% cpu 2.590 total (; darcs add bigtempfile; ) 0.00s user 0.00s system 0% cpu 1.997 total A ./bigtempfile (; darcs whats -s; ) 2.03s user 4.43s system 7% cpu 1:27.42 total (I reduced the file size because of ulimit considerations here.) --- I find myself wondering - if the slowdown is due to the double memchr loop over the entire file concerned (in the worst case, it's a huge text file and memchr doesn't bail out early, so the entire file gets scanned front to back, front to back), then what can we do to optimize it? The obvious solution is to just scan some fixed bytes (or percentage), ie: return !!(memchr(s, 0, 1024) \|\| memchr(s, 26, 1024)); or maybe return !!(memchr(s, 0, (len / 10) \|\| memchr(s, 26, (len / 10))); This would probably cause us to miss some binary files and interpret them as text files, but: A) Is that really so bad? (I'm serious here, I don't know how bad it is to mistakenly interpret a binary file as being text) B) How many binary files don't have the magical characters anywhere near the beginning? Is there any smarter definition than just 'neither \0 or ^Z appear anywhere in the file'? The timing for (len/10) is: gwern@localhost:2232~/foo>time darcs whatsnew -s [12:55PM] A ./bigtempfile darcs whatsnew -s 1.79s user 3.67s system 8% cpu 1:01.55 total Unfortunately, this change seems to fail the merging_newlines, an add test, either-dependency.sh test, and it seems whatsnew.pl (but I'm not sure they're supposed to pass or if failures are bad). Looking at the memchr man page, I see that there is a reverse memchr, 'memrchr'. I don't know much about low-level stuff, but it strikes me as possible that if we changed the definition to return !!(memchr(s, 0, len) \|\| memrchr(s, 26, len)); we'd get better cache locality - we'd scan forward through the long ByteString, and then we'd reverse course and scan backwards, which allows the cached end of the string to be reused, instead of having to restart at the beginning and re-reading the entire darn thing! Using reverse memchr on the dummy repo I gave timings for above gives the result: gwern@localhost:2220~/foo>time darcs whatsnew -s [12:46PM] A ./bigtempfile darcs whatsnew -s 1.63s user 3.55s system 10% cpu 50.062 total Which is a noticeable improvement, but I fear it caused some Perl tests to fail :( Combining memrchr with the (len/10) change brings it down to 47s, which doesn't surprise me - looking at top shows that darcs is not using much memory or CPU, so that leaves IO-bound stuff. A final suggestion is more difficult, since I'm not familiar with C: instead of looping over the string and matching against a single character, surely there must be some way to loop over the string once, matching against either \0 or ^Z; this would seem to me to be more efficient than looping twice (since we're interested in performance on strings that obviously blow all the caches).
msg4351 (view)	Author: dagit	Date: 2008-04-25.18:14:18
I disabled the test in is_funky and found that it was still taking ages to compute 'whatsnew -s'. After a fair bit of debugging trying to figure out where the culprit is, I found that it may be a problem with reading the pending or what we do with the pending after we read it. The attached patch passes all the tests on my system, but may not be the correct way to solve the problem. Also, I have tried to send this patch twice once to darcs-devel and once to darcs-users, but both times it was silently ignored. I'm not sure why. Attachments make-whatsnew-lazier.dpatch
msg4356 (view)	Author: gwern	Date: 2008-04-25.19:12:07
I've come up with a hideous new optimization: gwern@localhost:2234~/darcs.net_2>new [ 2:59PM] hunk ./src/fpstring.c 60 - return !!(memchr(s, 0, len) \|\| memchr(s, 26, len)); +#ifdef _WIN32 + return !!(memchr(s, 26, (len/10))); +#else + return !!(memchr(s, 0, (len/10))); +#endif which gives a timing of gwern@localhost:2231~/foo>time darcs whatsnew -s [ 2:55PM] A ./bigtempfile darcs whatsnew -s 1.64s user 3.92s system 10% cpu 55.468 total Of course, on the one hand, this is quite a nice speed boost for a nigh-trivial change; on the other hand this change would seem to imply that a Linux darcs could fail to understand a Windows binary and conversely. And on the gripping hand, I don't see any new test suite failures for this change in particular, so the 'specification', as it were, allows for this change. No doubt there are further better ways to do it; I was discussing this on #c, and I heard: 14:45 <@twkm> gwern: sounds horrible. you could dump the memchr's and use your own loop, though that might miss optimizations available. or go assembly -- to gain access to that optimal instruction sequence -- with the loop as a fall-back, for systems without assembly yet.
msg4649 (view)	Author: kowey	Date: 2008-05-14.08:35:22
Folks, what's the relationship between this bug and issue711? Duplicate?
msg4703 (view)	Author: gwern	Date: 2008-05-14.16:54:43
kowey, I don't think it's a dupe. The bug there is about 'whatsnew -sl' performing badly in directories with lots of uncontrolled files, right? This bug is about 'whatsnew -s' (no -l) performing badly in directories with controlled, but big, files. "`darcs2 whatsnew -s' only takes a small fraction of a second. As does `darcs2 whatsnew' without any options."
msg4704 (view)	Author: kowey	Date: 2008-05-14.16:56:42
Thanks for clarifying that!
msg7054 (view)	Author: thorkilnaur	Date: 2009-01-12.10:45:01
It appears that further work is needed on this issue: There is a patch attached that has not been applied, as far as I can see. So I need a volunteer to look into this, verifying whether the patch (or a modified version of it: Things have changed in the meantime) solves the reported performance problem. Alternatively, assuming the problem can be reproduced, to investigate further how to solve it. Thanks and regards Thorkil
msg7057 (view)	Author: kowey	Date: 2009-01-12.10:59:56
Marking target-2.3 (which is perhaps an implicit way of saying "we should work on this for the next darcs hacking sprint")
msg8629 (view)	Author: dagit	Date: 2009-09-02.06:07:03
This is very much related to the bad performance of "initial imports". What is really happening here is that when we read the pending we try to eagerly construct the pending patches. This is why the attached patch works. It adds an explicit short circuit so that the pending patches are not constructed. While this may very well solve the problem in this case, we should look at a more fundamental solution. See this thread for discussion: http://lists.osuosl.org/pipermail/darcs-users/2009-August/021076.html In summary, we could apply the patch here (as a bandaid, assuming the summary is still correct), but even if we do that we need to redesign the reading of pending because it has wider impact than whatsnew. Also, to answer Eric's question: Yes, I suspect issue711 is related. I haven't spent time debugging that issue, but if it is also constructing pending patches then the problem is related.
msg8631 (view)	Author: kowey	Date: 2009-09-02.07:18:46
Jason: It may be a good idea to send the patch to darcs-users so that it gets noticed
msg8635 (view)	Author: kowey	Date: 2009-09-02.14:21:09
Jason: please disregard my last suggestion to send the patch to the mailing list (I wasn't paying enough attention). I tend to think we should just accept the cost of a "no bandaids/shortcuts/hacks" attitude. Not perfectionism but trying to get things right more or less right the first time. We should fix performance by making things cleaner and better conceived, not by twiddling knobs or by adding complexity. Perhaps you could create a new ticket explaining the fundamental problem with our use of the pending patch problem and your proposed solution? This ticket can then point to that. Thanks!

History
Date	User	Action	Args
2008-04-25 04:55:21	dagit	create
2008-04-25 16:59:24	gwern	set	status: unread -> unknown nosy: + gwern messages: + msg4338
2008-04-25 18:14:20	dagit	set	files: + make-whatsnew-lazier.dpatch nosy: tommy, beschmi, dagit, gwern messages: + msg4351
2008-04-25 19:12:08	gwern	set	nosy: tommy, beschmi, dagit, gwern messages: + msg4356
2008-05-14 08:35:24	kowey	set	topic: + Performance nosy: + kowey messages: + msg4649
2008-05-14 16:54:45	gwern	set	nosy: tommy, beschmi, kowey, dagit, gwern messages: + msg4703
2008-05-14 16:56:44	kowey	set	nosy: tommy, beschmi, kowey, dagit, gwern messages: + msg4704
2009-01-12 10:45:04	thorkilnaur	set	status: unknown -> needs-reproduction nosy: + dmitry.kurochkin, simon, thorkilnaur messages: + msg7054
2009-01-12 11:00:08	kowey	set	topic: + Target-2.3 nosy: tommy, beschmi, kowey, dagit, simon, thorkilnaur, gwern, dmitry.kurochkin messages: + msg7057
2009-08-06 16:04:38	kowey	set	topic: + Target-2.4, - Target-2.3 nosy: tommy, beschmi, kowey, dagit, simon, thorkilnaur, gwern, dmitry.kurochkin
2009-08-06 21:03:11	admin	set	nosy: - beschmi
2009-08-11 00:11:07	admin	set	nosy: - dagit
2009-08-25 17:38:51	admin	set	nosy: + darcs-devel, - simon
2009-08-27 14:21:21	admin	set	nosy: tommy, kowey, darcs-devel, thorkilnaur, gwern, dmitry.kurochkin
2009-09-02 06:07:07	dagit	set	nosy: + dagit messages: + msg8629
2009-09-02 07:18:48	kowey	set	nosy: tommy, kowey, darcs-devel, dagit, thorkilnaur, gwern, dmitry.kurochkin messages: + msg8631
2009-09-02 14:21:15	kowey	set	topic: + ThePendingPatch, - Target-2.4 nosy: tommy, kowey, darcs-devel, dagit, thorkilnaur, gwern, dmitry.kurochkin messages: + msg8635
2009-09-29 06:48:29	dagit	set	nosy: - dagit
2017-07-30 23:22:13	gh	set	status: needs-reproduction -> given-up

Issue 814 darcs whatsnew -s does too much work

Attachments