Created on 2008-04-25.04:55:21 by dagit, last changed 2017-07-30.23:22:13 by gh.
msg4328 (view) |
Author: dagit |
Date: 2008-04-25.04:55:20 |
|
The following commands work on OS X, and cause darcs to read bigtempfile before
telling you that it is to be added.
For other unices, use --bytes= instead of -c, the --darcs-2 to init doesn't
matter, but I wanted to demonstrate that darcs-2 repos do not fix this issue.
time head -c 17179869184 /dev/zero > bigtempfile
darcs init --darcs-2
darcs add bigtempfile
darcs whats -s
You'll notice this takes much longer than it should. I'm pretty sure this is
because adding the file puts it pending and the summary needs to know if this is
a binary or non-binary patch. Determining that a file is not binary requires
reading the whole file in smart_diff and diff_files because has_bin calls
is_funky to look at every char in the file checking for characters that are
typical in binary files.
|
msg4338 (view) |
Author: gwern |
Date: 2008-04-25.16:59:22 |
|
That seems to work here:
gwern@localhost:2249~/foo>time (head -c 10179869184 /dev/zero > bigtempfile) &&
time (darcs init --darcs-2) && time (darcs add bigtempfile) && time (darcs whats -s)
(; head -c 10179869184 /dev/zero > bigtempfile; ) 1.52s user 37.22s system 15%
cpu 4:07.02 total
(; darcs init --darcs-2; ) 0.00s user 0.01s system 0% cpu 2.590 total
(; darcs add bigtempfile; ) 0.00s user 0.00s system 0% cpu 1.997 total
A ./bigtempfile
(; darcs whats -s; ) 2.03s user 4.43s system 7% cpu 1:27.42 total
(I reduced the file size because of ulimit considerations here.)
---
I find myself wondering - if the slowdown is due to the double memchr loop over
the entire file concerned (in the worst case, it's a huge text file and memchr
doesn't bail out early, so the entire file gets scanned front to back, front to
back), then what can we do to optimize it? The obvious solution is to just scan
some fixed bytes (or percentage), ie:
return !!(memchr(s, 0, 1024) || memchr(s, 26, 1024));
or maybe
return !!(memchr(s, 0, (len / 10) || memchr(s, 26, (len / 10)));
This would probably cause us to miss some binary files and interpret them as
text files, but:
A) Is that really so bad? (I'm serious here, I don't know how bad it is to
mistakenly interpret a binary file as being text)
B) How many binary files don't have the magical characters anywhere near the
beginning? Is there any smarter definition than just 'neither \0 or ^Z appear
anywhere in the file'?
The timing for (len/10) is:
gwern@localhost:2232~/foo>time darcs whatsnew -s
[12:55PM]
A ./bigtempfile
darcs whatsnew -s 1.79s user 3.67s system 8% cpu 1:01.55 total
Unfortunately, this change seems to fail the merging_newlines, an add test,
either-dependency.sh test, and it seems whatsnew.pl (but I'm not sure they're
supposed to pass or if failures are bad).
Looking at the memchr man page, I see that there is a reverse memchr, 'memrchr'.
I don't know much about low-level stuff, but it strikes me as possible that if
we changed the definition to
return !!(memchr(s, 0, len) || memrchr(s, 26, len));
we'd get better cache locality - we'd scan forward through the long ByteString,
and then we'd reverse course and scan backwards, which allows the cached end of
the string to be reused, instead of having to restart at the beginning and
re-reading the entire darn thing! Using reverse memchr on the dummy repo I gave
timings for above gives the result:
gwern@localhost:2220~/foo>time darcs whatsnew -s
[12:46PM]
A ./bigtempfile
darcs whatsnew -s 1.63s user 3.55s system 10% cpu 50.062 total
Which is a noticeable improvement, but I fear it caused some Perl tests to fail
:( Combining memrchr with the (len/10) change brings it down to 47s, which
doesn't surprise me - looking at top shows that darcs is not using much memory
or CPU, so that leaves IO-bound stuff.
A final suggestion is more difficult, since I'm not familiar with C: instead of
looping over the string and matching against a single character, surely there
must be some way to loop over the string once, matching against either \0 or ^Z;
this would seem to me to be more efficient than looping twice (since we're
interested in performance on strings that obviously blow all the caches).
|
msg4351 (view) |
Author: dagit |
Date: 2008-04-25.18:14:18 |
|
I disabled the test in is_funky and found that it was still taking ages to
compute 'whatsnew -s'. After a fair bit of debugging trying to figure out where
the culprit is, I found that it may be a problem with reading the pending or
what we do with the pending after we read it.
The attached patch passes all the tests on my system, but may not be the correct
way to solve the problem. Also, I have tried to send this patch twice once to
darcs-devel and once to darcs-users, but both times it was silently ignored.
I'm not sure why.
Attachments
|
msg4356 (view) |
Author: gwern |
Date: 2008-04-25.19:12:07 |
|
I've come up with a hideous new optimization:
gwern@localhost:2234~/darcs.net_2>new
[ 2:59PM]
hunk ./src/fpstring.c 60
- return !!(memchr(s, 0, len) || memchr(s, 26, len));
+#ifdef _WIN32
+ return !!(memchr(s, 26, (len/10)));
+#else
+ return !!(memchr(s, 0, (len/10)));
+#endif
which gives a timing of
gwern@localhost:2231~/foo>time darcs whatsnew -s
[ 2:55PM]
A ./bigtempfile
darcs whatsnew -s 1.64s user 3.92s system 10% cpu 55.468 total
Of course, on the one hand, this is quite a nice speed boost for a nigh-trivial
change; on the other hand this change would seem to imply that a Linux darcs
could fail to understand a Windows binary and conversely. And on the gripping
hand, I don't see any new test suite failures for this change in particular, so
the 'specification', as it were, allows for this change.
No doubt there are further better ways to do it; I was discussing this on #c,
and I heard:
14:45 <@twkm> gwern: sounds horrible. you could dump the memchr's and use your
own loop, though that might miss optimizations available. or
go assembly -- to gain access to that optimal instruction sequence
-- with the loop as a fall-back, for systems without
assembly yet.
|
msg4649 (view) |
Author: kowey |
Date: 2008-05-14.08:35:22 |
|
Folks, what's the relationship between this bug and issue711? Duplicate?
|
msg4703 (view) |
Author: gwern |
Date: 2008-05-14.16:54:43 |
|
kowey, I don't think it's a dupe. The bug there is about 'whatsnew -sl'
performing badly in directories with lots of uncontrolled files, right? This bug
is about 'whatsnew -s' (no -l) performing badly in directories with controlled,
but big, files.
"`darcs2 whatsnew -s' only takes a small fraction of a second.
As does `darcs2 whatsnew' without any options."
|
msg4704 (view) |
Author: kowey |
Date: 2008-05-14.16:56:42 |
|
Thanks for clarifying that!
|
msg7054 (view) |
Author: thorkilnaur |
Date: 2009-01-12.10:45:01 |
|
It appears that further work is needed on this issue: There is a patch attached
that has not been applied, as far as I can see. So I need a volunteer to look into
this, verifying whether the patch (or a modified version of it: Things have changed
in the meantime) solves the reported performance problem. Alternatively, assuming
the problem can be reproduced, to investigate further how to solve it.
Thanks and regards
Thorkil
|
msg7057 (view) |
Author: kowey |
Date: 2009-01-12.10:59:56 |
|
Marking target-2.3 (which is perhaps an implicit way of saying "we should work
on this for the next darcs hacking sprint")
|
msg8629 (view) |
Author: dagit |
Date: 2009-09-02.06:07:03 |
|
This is very much related to the bad performance of "initial imports".
What is really happening here is that when we read the pending we try to eagerly
construct the pending patches. This is why the attached patch works. It adds
an explicit short circuit so that the pending patches are not constructed.
While this may very well solve the problem in this case, we should look at a
more fundamental solution.
See this thread for discussion:
http://lists.osuosl.org/pipermail/darcs-users/2009-August/021076.html
In summary, we could apply the patch here (as a bandaid, assuming the summary is
still correct), but even if we do that we need to redesign the reading of
pending because it has wider impact than whatsnew.
Also, to answer Eric's question: Yes, I suspect issue711 is related. I haven't
spent time debugging that issue, but if it is also constructing pending patches
then the problem is related.
|
msg8631 (view) |
Author: kowey |
Date: 2009-09-02.07:18:46 |
|
Jason: It may be a good idea to send the patch to darcs-users so that it gets
noticed
|
msg8635 (view) |
Author: kowey |
Date: 2009-09-02.14:21:09 |
|
Jason: please disregard my last suggestion to send the patch to the mailing list
(I wasn't paying enough attention).
I tend to think we should just accept the cost of a "no
bandaids/shortcuts/hacks" attitude. Not perfectionism but trying to get things
right more or less right the first time. We should fix performance by making
things cleaner and better conceived, not by twiddling knobs or by adding complexity.
Perhaps you could create a new ticket explaining the fundamental problem with
our use of the pending patch problem and your proposed solution? This ticket
can then point to that.
Thanks!
|
|
Date |
User |
Action |
Args |
2008-04-25 04:55:21 | dagit | create | |
2008-04-25 16:59:24 | gwern | set | status: unread -> unknown nosy:
+ gwern messages:
+ msg4338 |
2008-04-25 18:14:20 | dagit | set | files:
+ make-whatsnew-lazier.dpatch nosy:
tommy, beschmi, dagit, gwern messages:
+ msg4351 |
2008-04-25 19:12:08 | gwern | set | nosy:
tommy, beschmi, dagit, gwern messages:
+ msg4356 |
2008-05-14 08:35:24 | kowey | set | topic:
+ Performance nosy:
+ kowey messages:
+ msg4649 |
2008-05-14 16:54:45 | gwern | set | nosy:
tommy, beschmi, kowey, dagit, gwern messages:
+ msg4703 |
2008-05-14 16:56:44 | kowey | set | nosy:
tommy, beschmi, kowey, dagit, gwern messages:
+ msg4704 |
2009-01-12 10:45:04 | thorkilnaur | set | status: unknown -> needs-reproduction nosy:
+ dmitry.kurochkin, simon, thorkilnaur messages:
+ msg7054 |
2009-01-12 11:00:08 | kowey | set | topic:
+ Target-2.3 nosy:
tommy, beschmi, kowey, dagit, simon, thorkilnaur, gwern, dmitry.kurochkin messages:
+ msg7057 |
2009-08-06 16:04:38 | kowey | set | topic:
+ Target-2.4, - Target-2.3 nosy:
tommy, beschmi, kowey, dagit, simon, thorkilnaur, gwern, dmitry.kurochkin |
2009-08-06 21:03:11 | admin | set | nosy:
- beschmi |
2009-08-11 00:11:07 | admin | set | nosy:
- dagit |
2009-08-25 17:38:51 | admin | set | nosy:
+ darcs-devel, - simon |
2009-08-27 14:21:21 | admin | set | nosy:
tommy, kowey, darcs-devel, thorkilnaur, gwern, dmitry.kurochkin |
2009-09-02 06:07:07 | dagit | set | nosy:
+ dagit messages:
+ msg8629 |
2009-09-02 07:18:48 | kowey | set | nosy:
tommy, kowey, darcs-devel, dagit, thorkilnaur, gwern, dmitry.kurochkin messages:
+ msg8631 |
2009-09-02 14:21:15 | kowey | set | topic:
+ ThePendingPatch, - Target-2.4 nosy:
tommy, kowey, darcs-devel, dagit, thorkilnaur, gwern, dmitry.kurochkin messages:
+ msg8635 |
2009-09-29 06:48:29 | dagit | set | nosy:
- dagit |
2017-07-30 23:22:13 | gh | set | status: needs-reproduction -> given-up |
|