darcs

Issue 2607 darcs record is too slow with medium/large numbers of files

Title: darcs record is too slow with medium/large numbers of files
Priority: bug
Status: resolved
Milestone: 2.18.1
Resolved in: 2.18.1
Superseder:
Nosy List: simon
Assigned To:
Topics:

Created on 2018-11-01.18:55:31 by simon, last changed 2023-03-26.06:49:28 by bfrk.

Messages
msg20434 Author: simon Date: 2018-11-01.18:55:29
darcs record seems to run slowly and scale badly when recording large
numbers of files. Here are some times for recording increasing numbers
of added small files (one word each), on a 2013 MacBook Air, with
darcs-2.14.1:

[11:41:56] recording 100 files: real 0m0.231s
[11:41:56] recording 200 files: real 0m0.443s
[11:41:57] recording 300 files: real 0m0.618s
[11:41:58] recording 400 files: real 0m1.416s
[11:42:00] recording 500 files: real 0m1.932s
[11:42:03] recording 600 files: real 0m3.429s
[11:42:07] recording 700 files: real 0m6.340s
[11:42:15] recording 800 files: real 0m8.051s
[11:42:24] recording 900 files: real 0m10.717s
[11:42:36] recording 1000 files: real 0m15.171s
[11:42:53] recording 2000 files: real 1m58.243s
[11:44:54] recording 3000 files: real 6m17.081s
[11:51:18] recording 4000 files: 
make[1]: *** [test4000] Error 130


darcs --exact-version output begins with:
```
darcs compiled on Sep  4 2018, at 08:40:45

Weak Hash: 7cad985e35ce5c2c37f0cc33e3ca042d567bca55
```

Reproduced on macOS Mojave and Ubuntu.
msg20435 Author: simon Date: 2018-11-02.00:14:56
Perhaps related: http://darcs.net/FAQ/Performance#darcs-record-takes-forever
msg20437 Author: bfrk Date: 2018-11-03.17:09:59
Interesting. Performance should not degrade like that; it should be
linear in the amount of data that is recorded. I have observed similar
degradation when importing large trees, down to out-of-memory crashes
(with appropriate ulimits on memory imposed so as not to bring down my
whole machine). But it never occurred to me that this could be related
to the number of files.

You seem to have a systematic test procedure. Can you send/attach that,
as a script, so I can more easily reproduce?
msg20439 Author: simon Date: 2018-11-03.17:33:12
Thanks for looking at it, Ben. Yes, I made one as I was having trouble
reproducing consistent results, and wanted to be sure before reporting.
Here are the scripts:


# Makefile.linux

TIME=/usr/bin/time -f%E
DARCS=darcs

test:
	for N in \
	 100 200 300 400 500 600 700 800 900 \
	 1000 2000 3000 4000 5000 6000 7000 8000 9000 \
	 ; do printf "[`date +%H:%M:%S`] recording $$N files: "; printf "`make -s test$$N`" ; done
	 # 10 20 30 40 50 60 70 80 90 \
	 # 100 200 300 400 500 600 700 800 900 \
	 # 1000 2000 3000 4000 5000 6000 7000 8000 9000 \

# time darcs operations in a darcs repo with this many files
test%:
	@make -s t$*
	-@cd t$*; \
		rm -rf _darcs; \
		$(DARCS) init -q; \
		$(DARCS) add -q *; \
		$(TIME) $(DARCS) record -am 'record' -q >/dev/null
#		( time ( $(DARCS) record -am 'record' ) ) 2>&1 | grep real

# make a test directory with N files. Must be the last make target beginning with t.
t%: dict
	mkdir -p $@; cd $@; for w in $$(head -$* ../dict); do echo $$w > $$w; done

# make a dictionary of simple unique ascii words
dict:
	strings /usr/share/dict/words | grep -v \' | sort | uniq > $@

clean:
	rm -rf t[0-9]*/_darcs


# Makefile.mac
DARCS=darcs

test:
	for N in \
	 100 200 300 400 500 600 700 800 900 \
	 1000 2000 3000 4000 5000 6000 7000 8000 9000 \
	 ; do printf "[`date +%H:%M:%S`] recording $$N files: "; echo `make -s test$$N` ; done
	 # 10 20 30 40 50 60 70 80 90 \
	 # 100 200 300 400 500 600 700 800 900 \
	 # 1000 2000 3000 4000 5000 6000 7000 8000 9000 \

# time darcs operations in a darcs repo with this many files
test%:
	@make -s t$*
	cd t$*; \
		rm -rf _darcs; \
		$(DARCS) init >/dev/null; \
		$(DARCS) add -q *; \
		( time ( $(DARCS) record -am 'record' ) ) 2>&1 | grep real

# make a test directory with N files. Must be the last make target beginning with t.
t%:
	mkdir -p $@; cd $@; for w in $$(head -$* /usr/share/dict/words); do echo $$w > $$w; done

clean:
	rm -rf t[0-9]*/_darcs


As you can see, it wasn't that easy to get clean output on two platforms. :)
msg20440 Author: simon Date: 2018-11-03.17:38:26
Non-line-wrapped:
https://gist.github.com/simonmichael/60838809ecf4fa2f209b5b978b5aae81
msg20441 Author: bfrk Date: 2018-11-03.20:39:00
I found that this happens only if we add the files to pending via 'darcs
add *' and then record. When I modify the script to remove the 'darcs
add' and instead use "darcs record -la", I get:

recording 100 files
0:00.11
recording 200 files
0:00.18
recording 300 files
0:00.16
recording 400 files
0:00.23
recording 500 files
0:00.28
recording 600 files
0:00.29
recording 700 files
0:00.37
recording 800 files
0:00.40
recording 900 files
0:00.47
recording 1000 files
0:00.52

which looks more like what I would expect.

So indeed this seems to be a problem with pending. And now that I think
about it, this is no great surprise. I recently tried to fix a few
situations where the pending patch can become corrupt. In the course of
doing so I analysed and modified the procedure that updates the pending
patch, so I know what I'm talking about.

If you use the latest Darcs from screened you can see the progress
messages that indicate what takes so long: every single prim that gets
recorded (i.e. one addfile/hunk pair per added file) must be removed
from the pending patch. That is not a big problem for the addfile
patches alone, since they are in the same sort order in the pending
patch as in what we have recorded, so removing them is fast.

The problematic ones are the hunks. We must first try to remove them
from pending (note that hunks can be in pending, e.g. if a pending
replace patch depends on them) and, if that fails, try to remove them
from the difference between pending and working. This is necessary in
order to correctly update the pending patch in situations where patches
from pending and working are coalesced.

A possible fix could be to use a different sort order for prim patches.
We currently sort first by affected file name, then by patch type, which
is the reason the addfiles and hunks are interleaved (you can see that
with darcs log -v in one of the test repos). Can we place all the
addfiles before all the hunks? That should be possible since an addfile
can never depend on a hunk. Then, by the time we are done with the
addfiles, pending is empty and we just have to check that the hunks are
all in the working tree without having to commute anything past pending. Hmm.
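
For example, to see this interleaving, run the following in one of the
test repositories created by the scripts above (t1000 is one of the
directories the Makefile's t% target creates; any of them will do):

```
# Show the prims of the last recorded patch; the addfile and the hunk
# for each file appear next to each other, sorted by file name.
cd t1000
darcs log -v --last=1
```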

A short-term work-around is to use 'darcs record -l' which circumvents
the pending patch.
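
Concretely, the two paths look like this (a minimal sketch; the patch
name is a placeholder):

```
# Slow path: 'darcs add' puts an addfile prim for every file into pending,
# and record must then remove each recorded prim from pending again.
darcs add -q *
darcs record -a -m 'add files'

# Workaround: let record detect the adds itself, bypassing pending.
darcs record -l -a -m 'add files'
```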
msg20442 Author: simon Date: 2018-11-05.19:02:42
Thanks! I confirm that record -l performs much better than add+record:

[10:33:53] recording 100 files: real 0m0.197s
[10:33:53] recording 200 files: real 0m0.230s
[10:33:54] recording 300 files: real 0m0.263s
[10:33:54] recording 400 files: real 0m0.457s
[10:33:55] recording 500 files: real 0m0.405s
[10:33:55] recording 600 files: real 0m0.490s
[10:33:56] recording 700 files: real 0m0.595s
[10:33:57] recording 800 files: real 0m0.649s
[10:33:58] recording 900 files: real 0m0.994s
[10:33:59] recording 1000 files: real 0m0.798s
[10:34:00] recording 2000 files: real 0m1.651s
[10:34:03] recording 3000 files: real 0m2.667s
[10:34:06] recording 4000 files: real 0m5.331s
[10:34:14] recording 5000 files: real 0m4.613s
[10:34:20] recording 6000 files: real 0m6.184s
[10:34:28] recording 7000 files: real 0m7.276s
[10:34:37] recording 8000 files: real 0m8.127s
[10:34:48] recording 9000 files: real 0m9.609s

This is a good workaround for adding many files, if you can ensure there
are no undesirable files lying around. 

I'm wondering what other operations are affected by this large-pending
slowdown. I guess recording changes to many existing files is not,
judging by this:

[10:59:46] recording 1000 files: real 0m1.020s
[10:59:49] recording 2000 files: real 0m2.000s
[10:59:54] recording 3000 files: real 0m3.272s
[11:00:02] recording 4000 files: real 0m4.871s
[11:00:15] recording 5000 files: real 0m5.793s
[11:00:30] recording 6000 files: real 0m7.312s
[11:00:48] recording 7000 files: real 0m8.669s
[11:01:07] recording 8000 files: real 0m10.428s
[11:01:31] recording 9000 files: real 0m13.912s
msg20443 Author: simon Date: 2018-11-05.19:44:35
whatsnew seems linear as well. Here's new output for all cases; I have
also updated the gist script (mac version only):

[11:42:45] recording 1000 added files: real	0m16.895s
[11:43:04] recording 2000 added files: real	2m6.411s
[11:45:14] recording 3000 added files:   C-c C-c

[11:34:06] recording 1000 added files with -l: real	0m1.047s
[11:34:07] recording 2000 added files with -l: real	0m1.851s
[11:34:09] recording 3000 added files with -l: real	0m3.903s
[11:34:13] recording 4000 added files with -l: real	0m4.342s
[11:34:18] recording 5000 added files with -l: real	0m4.977s
[11:34:25] recording 6000 added files with -l: real	0m7.481s
[11:34:35] recording 7000 added files with -l: real	0m8.055s
[11:34:45] recording 8000 added files with -l: real	0m8.326s
[11:34:55] recording 9000 added files with -l: real	0m10.361s

[11:35:08] recording 1000 edited files: real	0m1.076s
[11:35:11] recording 2000 edited files: real	0m2.002s
[11:35:17] recording 3000 edited files: real	0m2.851s
[11:35:24] recording 4000 edited files: real	0m4.631s
[11:35:35] recording 5000 edited files: real	0m6.444s
[11:35:50] recording 6000 edited files: real	0m8.446s
[11:36:15] recording 7000 edited files: real	0m9.121s
[11:36:36] recording 8000 edited files: real	0m11.184s
[11:37:03] recording 9000 edited files: real	0m10.906s

[11:37:30] whatsnew with 1000 unadded files: real	0m0.035s
[11:37:31] whatsnew with 2000 unadded files: real	0m0.036s
[11:37:32] whatsnew with 3000 unadded files: real	0m0.038s
[11:37:33] whatsnew with 4000 unadded files: real	0m0.036s
[11:37:35] whatsnew with 5000 unadded files: real	0m0.032s
[11:37:37] whatsnew with 6000 unadded files: real	0m0.033s
[11:37:40] whatsnew with 7000 unadded files: real	0m0.034s
[11:37:43] whatsnew with 8000 unadded files: real	0m0.034s
[11:37:47] whatsnew with 9000 unadded files: real	0m0.034s

[11:37:51] whatsnew with 1000 pending added files: real	0m0.408s
[11:37:52] whatsnew with 2000 pending added files: real	0m1.275s
[11:37:57] whatsnew with 3000 pending added files: real	0m1.944s
[11:38:04] whatsnew with 4000 pending added files: real	0m2.710s
[11:38:17] whatsnew with 5000 pending added files: real	0m3.518s
[11:38:37] whatsnew with 6000 pending added files: real	0m3.635s
[11:39:04] whatsnew with 7000 pending added files: real	0m4.721s
[11:39:36] whatsnew with 8000 pending added files: real	0m5.702s
[11:40:17] whatsnew with 9000 pending added files: real	0m5.682s
msg20444 Author: bfrk Date: 2018-11-06.08:48:43
> This is a good workaround for adding many files, if you can ensure there
> are no undesirable files lying around. 

If that is an issue, you can supply filepath arguments in addition to
-l, just like with add.
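
For example (the path arguments are placeholders):

```
# Look for adds only under the named paths, so stray untracked files
# elsewhere in the working tree are not picked up.
darcs record -l -a -m 'add files' src/ notes.txt
```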

> I'm wondering what other operations are affected by this large-pending
> slowdown. I guess recording changes to many existing files is not,
> judging by this:
> 
> [10:59:46] recording 1000 files: real 0m1.020s
> [10:59:49] recording 2000 files: real 0m2.000s
> [[...]

Right. Because in that case the pending patch is empty. Hunks are
(normally) not in the pending patch; we commute them out the back,
because they can be re-calculated from the difference between working
tree and pristine tree. This (and your other tests) confirms my analysis:
the problem only appears if

(1) we have many changes in pending, and then
(2) record or amend many changes that are not in pending.

Now to a possible fix.

My first idea was to change the standard sort order for prims, but this
would affect the UI negatively, so we shouldn't do that. My current plan
is to do with the newly recorded changes something similar to what we do
with pending when we "sift" it to get rid of hunks that can be commuted
out the back. Roughly,

  for_pend :> not_for_pend = partitionFL belongsInPending newly_recorded

where

  belongsInPending = False for hunks and binaries, True otherwise

Then first try to remove for_pend from pending. With the rest
(not_for_pend) we would first try to remove them from working, and from
pending only if that fails. This should avoid the quadratic trap in most
of the common cases, including your test case.

Cheers
Ben
msg23183 Author: bfrk Date: 2023-03-26.06:49:28
This has been fixed with

patch 6e1427815800d31bc880c730b6bac5edfeba300b
Author: Ben Franksen <ben.franksen@online.de>
Date:   Sat Jul 23 22:23:03 CEST 2022
  * drastically simplify updatePending, remove decoalescing

The new algorithm is not only simpler but also scales better.
History
Date                 User   Action  Args
2018-11-01 18:55:31  simon  create
2018-11-02 00:14:57  simon  set     messages: + msg20435
2018-11-03 17:10:01  bfrk   set     messages: + msg20437
2018-11-03 17:33:14  simon  set     messages: + msg20439
2018-11-03 17:38:28  simon  set     messages: + msg20440
2018-11-03 20:39:03  bfrk   set     messages: + msg20441
2018-11-05 19:02:48  simon  set     messages: + msg20442
2018-11-05 19:44:37  simon  set     messages: + msg20443
2018-11-06 08:48:46  bfrk   set     messages: + msg20444
2023-03-26 06:49:28  bfrk   set     status: unknown -> resolved;
                                    priority: bug; milestone: 2.18.1;
                                    messages: + msg23183; resolvedin: 2.18.1