Created on 2008-04-25.04:55:21 by dagit, last changed 2008-05-14.16:56:44 by kowey.
| msg4704 (view) |
Author: kowey |
Date: 2008-05-14.16:56:42 |
|
Thanks for clarifying that!
|
| msg4703 (view) |
Author: gwern |
Date: 2008-05-14.16:54:43 |
|
kowey, I don't think it's a dupe. The bug there is about 'whatsnew -sl'
performing badly in directories with lots of uncontrolled files, right? This bug
is about 'whatsnew -s' (no -l) performing badly in directories with controlled,
but big, files.
"`darcs2 whatsnew -s' only takes a small fraction of a second.
As does `darcs2 whatsnew' without any options."
|
| msg4649 (view) |
Author: kowey |
Date: 2008-05-14.08:35:22 |
|
Folks, what's the relationship between this bug and issue711? Duplicate?
|
| msg4356 (view) |
Author: gwern |
Date: 2008-04-25.19:12:07 |
|
I've come up with a hideous new optimization:
gwern@localhost:2234~/darcs.net_2>new
[ 2:59PM]
hunk ./src/fpstring.c 60
- return !!(memchr(s, 0, len) || memchr(s, 26, len));
+#ifdef _WIN32
+ return !!(memchr(s, 26, (len/10)));
+#else
+ return !!(memchr(s, 0, (len/10)));
+#endif
which gives a timing of
gwern@localhost:2231~/foo>time darcs whatsnew -s
[ 2:55PM]
A ./bigtempfile
darcs whatsnew -s 1.64s user 3.92s system 10% cpu 55.468 total
Of course, on the one hand, this is quite a nice speed boost for a nigh-trivial
change; on the other hand this change would seem to imply that a Linux darcs
could fail to understand a Windows binary and conversely. And on the gripping
hand, I don't see any new test suite failures for this change in particular, so
the 'specification', as it were, allows for this change.
No doubt there are further better ways to do it; I was discussing this on #c,
and I heard:
14:45 <@twkm> gwern: sounds horrible. you could dump the memchr's and use your
own loop, though that might miss optimizations available. or
go assembly -- to gain access to that optimal instruction sequence
-- with the loop as a fall-back, for systems without
assembly yet.
|
| msg4351 (view) |
Author: dagit |
Date: 2008-04-25.18:14:18 |
|
I disabled the test in is_funky and found that it was still taking ages to
compute 'whatsnew -s'. After a fair bit of debugging trying to figure out where
the culprit is, I found that it may be a problem with reading the pending or
what we do with the pending after we read it.
The attached patch passes all the tests on my system, but may not be the correct
way to solve the problem. Also, I have tried to send this patch twice once to
darcs-devel and once to darcs-users, but both times it was silently ignored.
I'm not sure why.
|
| msg4338 (view) |
Author: gwern |
Date: 2008-04-25.16:59:22 |
|
That seems to work here:
gwern@localhost:2249~/foo>time (head -c 10179869184 /dev/zero > bigtempfile) &&
time (darcs init --darcs-2) && time (darcs add bigtempfile) && time (darcs whats -s)
(; head -c 10179869184 /dev/zero > bigtempfile; ) 1.52s user 37.22s system 15%
cpu 4:07.02 total
(; darcs init --darcs-2; ) 0.00s user 0.01s system 0% cpu 2.590 total
(; darcs add bigtempfile; ) 0.00s user 0.00s system 0% cpu 1.997 total
A ./bigtempfile
(; darcs whats -s; ) 2.03s user 4.43s system 7% cpu 1:27.42 total
(I reduced the file size because of ulimit considerations here.)
---
I find myself wondering - if the slowdown is due to the double memchr loop over
the entire file concerned (in the worst case, it's a huge text file and memchr
doesn't bail out early, so the entire file gets scanned front to back, front to
back), then what can we do to optimize it? The obvious solution is to just scan
some fixed bytes (or percentage), ie:
return !!(memchr(s, 0, 1024) || memchr(s, 26, 1024));
or maybe
return !!(memchr(s, 0, (len / 10) || memchr(s, 26, (len / 10)));
This would probably cause us to miss some binary files and interpret them as
text files, but:
A) Is that really so bad? (I'm serious here, I don't know how bad it is to
mistakenly interpret a binary file as being text)
B) How many binary files don't have the magical characters anywhere near the
beginning? Is there any smarter definition than just 'neither \0 or ^Z appear
anywhere in the file'?
The timing for (len/10) is:
gwern@localhost:2232~/foo>time darcs whatsnew -s
[12:55PM]
A ./bigtempfile
darcs whatsnew -s 1.79s user 3.67s system 8% cpu 1:01.55 total
Unfortunately, this change seems to fail the merging_newlines, an add test,
either-dependency.sh test, and it seems whatsnew.pl (but I'm not sure they're
supposed to pass or if failures are bad).
Looking at the memchr man page, I see that there is a reverse memchr, 'memrchr'.
I don't know much about low-level stuff, but it strikes me as possible that if
we changed the definition to
return !!(memchr(s, 0, len) || memrchr(s, 26, len));
we'd get better cache locality - we'd scan forward through the long ByteString,
and then we'd reverse course and scan backwards, which allows the cached end of
the string to be reused, instead of having to restart at the beginning and
re-reading the entire darn thing! Using reverse memchr on the dummy repo I gave
timings for above gives the result:
gwern@localhost:2220~/foo>time darcs whatsnew -s
[12:46PM]
A ./bigtempfile
darcs whatsnew -s 1.63s user 3.55s system 10% cpu 50.062 total
Which is a noticeable improvement, but I fear it caused some Perl tests to fail
:( Combining memrchr with the (len/10) change brings it down to 47s, which
doesn't surprise me - looking at top shows that darcs is not using much memory
or CPU, so that leaves IO-bound stuff.
A final suggestion is more difficult, since I'm not familiar with C: instead of
looping over the string and matching against a single character, surely there
must be some way to loop over the string once, matching against either \0 or ^Z;
this would seem to me to be more efficient than looping twice (since we're
interested in performance on strings that obviously blow all the caches).
|
| msg4328 (view) |
Author: dagit |
Date: 2008-04-25.04:55:20 |
|
The following commands work on OS X, and cause darcs to read bigtempfile before
telling you that it is to be added.
For other unices, use --bytes= instead of -c, the --darcs-2 to init doesn't
matter, but I wanted to demonstrate that darcs-2 repos do not fix this issue.
time head -c 17179869184 /dev/zero > bigtempfile
darcs init --darcs-2
darcs add bigtempfile
darcs whats -s
You'll notice this takes much longer than it should. I'm pretty sure this is
because adding the file puts it pending and the summary needs to know if this is
a binary or non-binary patch. Determining that a file is not binary requires
reading the whole file in smart_diff and diff_files because has_bin calls
is_funky to look at every char in the file checking for characters that are
typical in binary files.
|
Browse related patches:
unstable
|
stable
|
| Date |
User |
Action |
Args |
| 2008-05-14 16:56:44 | kowey | set | nosy:
tommy, beschmi, kowey, dagit, gwern messages:
+ msg4704 |
| 2008-05-14 16:54:45 | gwern | set | nosy:
tommy, beschmi, kowey, dagit, gwern messages:
+ msg4703 |
| 2008-05-14 08:35:24 | kowey | set | topic:
+ Performance nosy:
+ kowey messages:
+ msg4649 |
| 2008-04-25 19:12:08 | gwern | set | nosy:
tommy, beschmi, dagit, gwern messages:
+ msg4356 |
| 2008-04-25 18:14:20 | dagit | set | files:
+ make-whatsnew-lazier.dpatch nosy:
tommy, beschmi, dagit, gwern messages:
+ msg4351 |
| 2008-04-25 16:59:24 | gwern | set | status: unread -> chatting nosy:
+ gwern messages:
+ msg4338 |
| 2008-04-25 04:55:21 | dagit | create | |
|