darcs

Patch 1234 patch index creation memory improvements

Title patch index creation memory improvements
Superseder Nosy List bsrkaditya, ganesh
Related Issues patch index creation consumes too much memory
View: 2405
Status accepted Assigned To
Milestone

Created on 2014-11-24.18:49:37 by ganesh, last changed 2015-02-12.17:49:45 by gh.

Files
File name Status Uploaded Type Edit Remove
check-on_disk-version-of-patch-index-at-right-place.dpatch gh, 2014-12-12.01:52:48 application/x-darcs-patch
move-the-handling-of-individual-patchmods-into-applypatchmods.dpatch ganesh, 2014-11-24.18:49:37 application/x-darcs-patch
need-to-un_reverse-the-pids-for-the-patch_index_test.dpatch ganesh, 2014-12-10.23:06:15 application/x-darcs-patch
patch-index_-adapt-removepidsuffix-to-the-new-order-of-pids.dpatch ganesh, 2015-02-09.07:03:32 application/x-darcs-patch
patch-preview.txt ganesh, 2014-11-24.18:49:37 text/x-darcs-patch
patch-preview.txt ganesh, 2014-12-10.23:06:15 text/x-darcs-patch
patch-preview.txt gh, 2014-12-12.01:52:48 text/x-darcs-patch
patch-preview.txt ganesh, 2015-02-09.07:03:32 text/x-darcs-patch
unnamed ganesh, 2014-11-24.18:49:37
unnamed ganesh, 2014-12-10.23:06:15
unnamed gh, 2014-12-12.01:52:48
unnamed ganesh, 2015-02-09.07:03:32
See mailing list archives for discussion on individual patches.
Messages
msg17851 (view) Author: ganesh Date: 2014-11-24.18:49:37
Another attempt at improving memory usage for patch index creation.

I hope I'm getting there now:
 - All the tests pass
 - The memory profile is flat when creating the index on the screened repo
 - darcs annotate darcs.cabal output looks plausible

However, I haven't benchmarked this in general yet so it's possible
it would regress performance in some way.

4 patches for repository darcs-unstable@darcs.net:screened:

Wed Nov 19 17:45:06 GMT 2014  Ganesh Sittampalam <ganesh@earth.li>
  * Move the handling of individual PatchMods into applyPatchMods
  
  This also pushes the 'nubSeq' call to only operate on the [PatchMod].
  This should make no difference as the PatchIds would be unique anyway,
  and it makes the behaviour of applyPatchMods more compositional with
  respect to individual patches.
  

Mon Nov 24 17:59:58 GMT 2014  Ganesh Sittampalam <ganesh@earth.li>
  * Keep the PatchId list in reversed order in memory
  
  This is preparatory for a further patch that builds it up
  incrementally.
  

Mon Nov 24 18:07:39 GMT 2014  Ganesh Sittampalam <ganesh@earth.li>
  * maintain the list of PatchIds inside the PatchIndex structure
  
  This should make it easier to build the two together incrementally
  

Mon Nov 24 18:08:20 GMT 2014  Ganesh Sittampalam <ganesh@earth.li>
  * use the strict state monad for working with PatchIndex
  
  This allows applyPatchMods to consume a list of patches incrementally,
  otherwise the entire list of patches is kept around until it finishes.
  
  It also seems like a reasonable change for any other code that works
  on a PatchIndex, as there's no particular value in it being lazy.
Attachments
msg17852 (view) Author: gh Date: 2014-11-25.03:32:35
Output of annotate is correct on the few files I tried in the darcs.net
repo (darcs.cabal and NEWS), and there is no notable performance
regression on these files.

Memory consumption on PI creation is indeed flat (hurrah!) and time
performance is slightly better. Also each patch make minimal changes so
that's reassuring. I'm screening the bundle for now (not accepting it
yet in case there is a general performance regression).
msg17871 (view) Author: gh Date: 2014-12-03.19:58:18
The patch "Keep the PatchId list in reversed order in memory" makes
"darcs show patch-index-test" fail in the darcs.net repo.
msg17872 (view) Author: gh Date: 2014-12-03.20:42:00
Also we can safely change the on-disk order of patch-ids by bumping the
PI version number.
msg17873 (view) Author: ganesh Date: 2014-12-03.22:59:16
Agreed re fixing the on-disk format - I think it makes sense to do that 
after fixing the patch-index-test failure. I'll get to that once I 
reach a good stopping point with the encoding work, which I don't want 
to page out at present.
msg17893 (view) Author: ganesh Date: 2014-12-10.23:06:15
This should resolve the patch-index-test problem.

1 patch for repository darcs-unstable@darcs.net:screened:

Wed Dec 10 23:03:43 GMT 2014  Ganesh Sittampalam <ganesh@earth.li>
  * need to un-reverse the pids for the patch-index-test
Attachments
msg17895 (view) Author: gh Date: 2014-12-11.22:30:12
I'm having another issue when amending a patch:

```
darcs init x
cd x
touch x
darcs rec -lam x
echo content > x
darcs rec -am amend_me
echo content > x
echo ya | darcs amend

darcs failed:  bug at src/Darcs/Repository/PatchIndex.hs:264 compiled
Dec 11 2014 19:15:38
couldn't find ./x in patch index
```
msg17896 (view) Author: gh Date: 2014-12-11.22:31:15
Sorry the second echo should be different from the first one:

```
darcs init x
cd x
touch x
darcs rec -lam x
echo content > x
darcs rec -am amend_me
echo different_content > x
echo ya | darcs amend

darcs failed:  bug at src/Darcs/Repository/PatchIndex.hs:264 compiled
Dec 11 2014 19:15:38
couldn't find ./x in patch index
```
msg17898 (view) Author: gh Date: 2014-12-12.00:25:24
(Wrongly sent the bundle to a new patch)

A fix for the above bug:

removePidSuffix relies on patch ids order (see local functions before
and beforeM),
so each time we run updatePatchIndexOnDisk we wrongly remove a LOT of stuff.

Patch is here (already screened):

http://bugs.darcs.net/file5259/patch-index_-adapt-removepidsuffix-to-the-new-order-of-pids.dpatch
msg17899 (view) Author: gh Date: 2014-12-12.01:52:48
Already a patch index version change! This way we avoid some
computing when loading and storing it on disk.

2 patches for repository http://darcs.net:

patch e4ac703007a23ea345848839a67bddb5bbdf3095
Author: Guillaume Hoffmann <guillaumh@gmail.com>
Date:   Thu Dec 11 19:38:32 ART 2014
  * check on-disk version of patch index at right place

patch e4ad20d9b68738b2203944c3500a713cdef709e9
Author: Guillaume Hoffmann <guillaumh@gmail.com>
Date:   Thu Dec 11 21:26:47 ART 2014
  * patch index: change order of on-disk pids
  this way we don't have to reverse the pids list all the time
Attachments
msg17900 (view) Author: ganesh Date: 2014-12-12.06:38:57
Did you get the last problem you reported before trying to reverse 
the pids on disk? The pids2idx structure is built from the on-disk 
version, so would have been in the correct order before the on-disk 
version was reversed.

Anyway, I think the overall effect of the two patches is correct and 
an improvement.
msg17901 (view) Author: gh Date: 2014-12-12.17:32:16
Yes, it happened before my last two patches (it's easy to reproduce).

And no, pids2idx did not depend on the order of pids returned by the
function loadPatchIndex (ie, only the pids were reversed).
msg17903 (view) Author: ganesh Date: 2014-12-12.20:10:39
I checked again and I'm fairly sure your repro is fine without both 
of these patches, or with both of them, but fails if either one is 
on its own:

patch index: change order of on-disk pids
patch index: adapt removePidSuffix to the new order of pids

Anyway, we're both happy with the end result, so the intermediate 
details don't matter too much.
msg18051 (view) Author: ganesh Date: 2015-02-09.07:03:32
Reattaching Guillaume's fix to this patch so it's all in one place
1 patch for repository /home/ganesh/darcs/screened-temp2:

Fri Dec 12 00:01:03 GMT 2014  Guillaume Hoffmann <guillaumh@gmail.com>
  * patch index: adapt removePidSuffix to the new order of pids
Attachments
msg18091 (view) Author: gh Date: 2015-02-12.17:49:45
Accepting along with other patch index patches (goes to branch 2.10).
History
Date User Action Args
2014-11-24 18:49:37ganeshcreate
2014-11-24 18:51:18ganeshsetstatus: needs-screening -> in-discussion
issues: + patch index creation consumes too much memory
title: Move the handling of individual PatchMod... (and 3 more) -> patch index creation memory improvements
2014-11-25 03:32:36ghsetstatus: in-discussion -> needs-review
messages: + msg17852
2014-11-25 03:36:23ghsetnosy: + bsrkaditya
2014-12-03 19:58:19ghsetmessages: + msg17871
2014-12-03 20:42:00ghsetmessages: + msg17872
2014-12-03 22:59:17ganeshsetmessages: + msg17873
2014-12-10 23:06:16ganeshsetfiles: + patch-preview.txt, need-to-un_reverse-the-pids-for-the-patch_index_test.dpatch, unnamed
messages: + msg17893
2014-12-11 22:30:13ghsetmessages: + msg17895
2014-12-11 22:31:15ghsetmessages: + msg17896
2014-12-12 00:25:24ghsetmessages: + msg17898
2014-12-12 01:52:48ghsetfiles: + patch-preview.txt, check-on_disk-version-of-patch-index-at-right-place.dpatch, unnamed
messages: + msg17899
2014-12-12 06:39:00ganeshsetmessages: + msg17900
2014-12-12 17:32:16ghsetmessages: + msg17901
2014-12-12 20:10:39ganeshsetmessages: + msg17903
2015-02-09 07:03:32ganeshsetfiles: + patch-preview.txt, patch-index_-adapt-removepidsuffix-to-the-new-order-of-pids.dpatch, unnamed
messages: + msg18051
2015-02-12 17:49:45ghsetstatus: needs-review -> accepted
messages: + msg18091