Created on 2006-01-04.17:14:56 by zooko, last changed 2008-10-07.14:54:37 by droundy.
| msg300 (view) |
Author: zooko |
Date: 2006-01-04.17:14:55 |
|
I created a 0.5 GiB file and ran darcs add on it. darcs consumed 1.5 GiB of RAM
during the darcs add and again during the darcs record. The record also ran for
35 minutes at maximum CPU on my high-powered workstation before I killed it.
It would be nice if darcs required enough RAM to store "only" one copy of the
patch. It would be nicer if darcs required less RAM -- using instead a fixed
maximum buffer of RAM and lazily processing the file as needed.
Hopefully the fact that laziness is one of the oldest, core features of the
design of Haskell means that it is relatively easy for programmers to implement
algorithms that do not eagerly consume RAM?
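The "fixed maximum buffer" idea above can be sketched as follows. This is only an illustration under stated assumptions, not darcs code: darcs is written in Haskell, and Python is used here because the transcript below already invokes it. The names `digest_stream` and `CHUNK_SIZE` are hypothetical, and hashing stands in for whatever per-file work record actually needs to do.

```python
import hashlib
import io

CHUNK_SIZE = 64 * 1024  # hypothetical fixed buffer size; not a darcs setting


def digest_stream(stream, chunk_size=CHUNK_SIZE):
    """Hash a binary stream while holding at most one chunk in memory."""
    hasher = hashlib.sha256()
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes object means end of stream
            break
        hasher.update(chunk)
        total += len(chunk)
    return hasher.hexdigest(), total


# Small in-memory stand-in for the large all-zero file used in this report.
digest, size = digest_stream(io.BytesIO(bytes(1024 * 1024)))
print(size)
```

With a real file, passing `open(path, "rb")` instead of the `io.BytesIO` object keeps peak memory near `chunk_size` regardless of how large the file is.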
The extreme CPU usage is perplexing. Are we trying to match the entire
contents of the binary file against a regex or something?
Regards,
Zooko
DARC yumyum:/mnt/sdb1/zooko/tmp$ time head --bytes=`python -c 'print 2**29'` /dev/zero > 0.5_GiB_file_a
real 0m1.790s
user 0m0.116s
sys 0m0.820s
DARC yumyum:/mnt/sdb1/zooko/tmp$
DARC yumyum:/mnt/sdb1/zooko/tmp$ l
drwxr-xr-x 5 zooko zooko 200 Jan 4 12:29 ./..
drwxrwxr-x 6 zooko zooko 184 Jan 4 12:36 ./_darcs
drwxrwxr-x 3 zooko zooko 104 Jan 4 12:36 ./.
-rw-rw-r-- 1 zooko zooko 536870912 Jan 4 12:36 ./0.5_GiB_file_a
DARC yumyum:/mnt/sdb1/zooko/tmp$ darcs add 0.5_GiB_file_a
DARC yumyum:/mnt/sdb1/zooko/tmp$ time darcs record
addfile ./0.5_GiB_file_a
Shall I record this patch? (1/?) [ynWsfqadjkc], or ? for help: y
binary ./0.5_GiB_file_a
Shall I record this patch? (2/?) [ynWsfqadjkc], or ? for help: y
What is the patch name? a
Do you want to add a long comment? [yn] n
Couldn't handle interrupt since darcs was in a sensitive job.
Couldn't handle interrupt since darcs was in a sensitive job.
Finished recording patch 'a'
real 35m49.529s
user 34m33.986s
sys 0m19.432s
DARC yumyum:/mnt/sdb1/zooko/tmp$
|
| msg301 (view) |
Author: zooko |
Date: 2006-01-04.17:18:19 |
|
Hm. I just noticed the "Finished recording patch" part. That seems like a
separate bug. I hit C-c after 35 minutes.
--Z
> DARC yumyum:/mnt/sdb1/zooko/tmp$ darcs add 0.5_GiB_file_a
> DARC yumyum:/mnt/sdb1/zooko/tmp$ time darcs record
> addfile ./0.5_GiB_file_a
> Shall I record this patch? (1/?) [ynWsfqadjkc], or ? for help: y
> binary ./0.5_GiB_file_a
> Shall I record this patch? (2/?) [ynWsfqadjkc], or ? for help: y
> What is the patch name? a
> Do you want to add a long comment? [yn] n
>
> Couldn't handle interrupt since darcs was in a sensitive job.
> Couldn't handle interrupt since darcs was in a sensitive job.
> Finished recording patch 'a'
>
> real 35m49.529s
> user 34m33.986s
> sys 0m19.432s
> DARC yumyum:/mnt/sdb1/zooko/tmp$
|
| msg2951 (view) |
Author: markstos |
Date: 2008-01-31.04:03:24 |
|
I confirmed this issue with Darcs2 and the --darcs-2 format tonight, although
the memory usage reported was "only" twice the patch size. After 'record' had run
for about 2 minutes on a 1 GHz laptop, darcs bailed out with this error:
darcs: out of memory (requested 1074790400 bytes)
(That's about 1 GiB of RAM being requested.)
If the reason the files are being loaded into memory is to check for changes, it
seems like some special-case improvements are possible:
- If it's an "add", of course the whole file is new. Maybe we can avoid loading
as much of the file in this case?
- If it's a binary file, there's no need to look inside it, just to notice that
it changed, right?
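A minimal sketch of the second suggestion, noticing that a binary file changed without holding either copy in memory. It is an illustration only and does not describe how darcs compares files. The function name `streams_differ` is hypothetical, and the code assumes buffered binary streams (such as `io.BytesIO` or `open(path, "rb")`) whose `read(n)` returns fewer than `n` bytes only at end of file.

```python
import io


def streams_differ(a, b, chunk_size=64 * 1024):
    """Return True as soon as two binary streams are found to differ."""
    while True:
        chunk_a = a.read(chunk_size)
        chunk_b = b.read(chunk_size)
        if chunk_a != chunk_b:
            return True   # differing content, or one stream ended early
        if not chunk_a:
            return False  # both streams ended at the same point


old = io.BytesIO(bytes(200_000))
new = io.BytesIO(bytes(199_999) + b"\x01")
print(streams_differ(old, new))
```

Comparing file sizes first would allow an even cheaper early exit whenever the sizes differ.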
|
| msg2981 (view) |
Author: droundy |
Date: 2008-01-31.16:26:55 |
|
We have previously had special-case code to enable record to run with less
memory, but I'm not convinced this is a good idea. By design, some of darcs'
operations require that we hold parsed patches in memory, which will always
require more memory than the actual patch. And I don't care for the idea of
allowing users to record a patch that they cannot unrecord.
The real solution here is to revamp our handling of hunk patches so that we
don't store them in memory as a list of lines, but instead as a solid chunk of
memory with a stored number of lines.
David
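The difference between the two hunk representations described above can be illustrated roughly as follows. This is a sketch, not darcs code: darcs is written in Haskell and its actual data structures are not shown in this thread. `BlobHunk` and `make_blob_hunk` are hypothetical names, and the byte counts come from CPython's `sys.getsizeof`, so they only indicate the general per-line overhead of a list-of-lines layout.

```python
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class BlobHunk:
    """Hypothetical hunk stored as one contiguous blob plus a line count."""
    data: bytes
    line_count: int


def make_blob_hunk(data):
    # Count newline-terminated lines, plus a final unterminated line if any.
    lines = data.count(b"\n")
    if data and not data.endswith(b"\n"):
        lines += 1
    return BlobHunk(data, lines)


content = b"line\n" * 100_000

# Layout 1: a list holding one bytes object per line.
as_lines = content.splitlines(keepends=True)
line_list_bytes = sys.getsizeof(as_lines) + sum(sys.getsizeof(l) for l in as_lines)

# Layout 2: a single blob with a stored number of lines.
hunk = make_blob_hunk(content)
blob_bytes = sys.getsizeof(hunk.data)

print(hunk.line_count, line_list_bytes > blob_bytes)
```

The shorter the lines, the more the per-object overhead of the list layout dominates the size of the content itself.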
|
| msg3503 (view) |
Author: markstos |
Date: 2008-02-16.19:03:06 |
|
David has been working on a fix that replaces the line-based in-memory storage
of hunks with blob-based storage.
However, the work caused some regressions, so it is paused for now while we
work on a stable Darcs 2 release. I'm marking the Darcs 2 release as a superseder.
|
| msg4675 (view) |
Author: kowey |
Date: 2008-05-14.13:03:10 |
|
Ok, darcs-2 is released, so I'm reviving this performance bug.
|
| Date | User | Action | Args |
| 2006-01-04 17:14:56 | zooko | create | |
| 2006-01-04 17:18:20 | zooko | set | status: unread -> chatting; nosy: droundy, tommy, zooko; messages: + msg301 |
| 2006-01-13 14:43:07 | droundy | set | nosy: droundy, tommy, zooko |
| 2006-01-13 14:43:15 | droundy | set | priority: feature -> bug; nosy: droundy, tommy, zooko |
| 2007-07-16 09:27:11 | kowey | set | topic: + Performance; nosy: + kowey, beschmi |
| 2008-01-31 04:03:28 | markstos | set | topic: + Confirmed, Darcs2, IncludesExampleOrTest; nosy: + markstos; messages: + msg2951; title: memory usage is 3X patch size, and darcs record took at least 35 minutes -> record: memory usage is 2X patch size |
| 2008-01-31 16:27:00 | droundy | set | nosy: droundy, tommy, beschmi, kowey, markstos, zooko; messages: + msg2981 |
| 2008-02-16 18:59:22 | markstos | link | issue172 superseder |
| 2008-02-16 19:03:10 | markstos | set | status: chatting -> deferred; nosy: droundy, tommy, beschmi, kowey, markstos, zooko; superseder: + Release Darcs 2.0; messages: + msg3503 |
| 2008-05-14 13:03:16 | kowey | set | status: deferred -> chatting; nosy: + Serware, dagit; superseder: - Release Darcs 2.0; messages: + msg4675 |
| 2008-09-06 12:21:07 | gwern | set | nosy: + gwern |
| 2008-10-07 14:54:23 | droundy | set | priority: bug -> feature; nosy: + dmitry.kurochkin, simon, thorkilnaur |
| 2008-10-07 14:54:37 | droundy | set | nosy: droundy, tommy, beschmi, kowey, markstos, zooko, dagit, simon, thorkilnaur, gwern, dmitry.kurochkin, Serware |