Patch 2227: patch index: remove another kludge for b... (and 7 more) - Darcs bug tracker

Title	patch index: remove another kludge for b... (and 7 more)
Superseder		Nosy List	bfrk
Related Issues
Status	accepted	Assigned To
Milestone

Created on 2021-11-20.11:43:52 by bfrk, last changed 2023-06-24.16:16:10 by ganesh.

Files
File name	Status	Uploaded	Type	Edit	Remove
patch-index_-remove-another-kludge-for-broken-move-patches.dpatch		bfrk, 2021-11-20.11:43:38	application/x-darcs-patch
patch-preview.txt		bfrk, 2021-11-20.11:43:34	text/x-darcs-patch

See mailing list archives for discussion on individual patches.

Messages
msg22932 (view)	Author: bfrk	Date: 2021-11-20.11:43:38
8 patches for repository http://darcs.net/screened: patch 955d1155f0cd5063fd12f48217f37dfcb3f0a33e Author: Ben Franksen <ben.franksen@online.de> Date: Thu Jun 10 21:00:38 CEST 2021 * patch index: remove another kludge for broken move patches patch 3b5f95878bfbc46cbf3c439c7227dba1db165155 Author: Ben Franksen <ben.franksen@online.de> Date: Thu Jun 10 14:09:49 CEST 2021 * patch index: debug messages when loading patch 2f2cfa1bea65fa775f25dd9408e0faa22f1311fb Author: Ben Franksen <ben.franksen@online.de> Date: Thu Jun 10 14:06:19 CEST 2021 * patch index: import from Darcs.Patch.Index.Types explicitly patch b1ed0fbbca4a874be2905ae1c5b1a82a29b719aa Author: Ben Franksen <ben.franksen@online.de> Date: Thu Jun 10 14:05:01 CEST 2021 * patch index: reformat a doc comment patch 05ad2db3ec6da188b5390573154fae5c3aea6025 Author: Ben Franksen <ben.franksen@online.de> Date: Thu Oct 4 18:07:31 CEST 2018 * patch index: massive performance improvement By definition, a directory is regarded as touched whenever a file or directory beneath it is. This information was laboriously re-constructed at query time (in maybeFilterPatches), which could be very slow when a directory had many children. It is much more efficient to add this information immediately when we build or update the patch index. This requires no change to the data structures, just additional inserts. Nonetheless, this is a format change, so we have to increase the version number to 4. patch f61562ebbf31c75f4c7ccfe23751b7b98286a290 Author: Ben Franksen <ben.franksen@online.de> Date: Thu Jun 10 14:15:41 CEST 2021 * patch index: optimize Set Word32 -> IntSet This trades a bit of memory (at least on 64 bit archs) for the ability to use an optimized data structure. The version number gets increased to 5. patch ca8f233393233b0580a20eda7eafa8e379687f1a Author: Ben Franksen <ben.franksen@online.de> Date: Thu Jun 10 14:09:09 CEST 2021 * patch index: cleanup doc comments patch 77d58fea123dc04bf96aa646206381d2d5d280a7 Author: Ben Franksen <ben.franksen@online.de> Date: Mon Oct 4 10:09:39 CEST 2021 * patch index: disentangle Darcs.Patch.Index.* modules, rename PatchMod to FileMod Attachments patch-index_-remove-another-kludge-for-broken-move-patches.dpatch patch-preview.txt
msg23242 (view)	Author: ganesh	Date: 2023-04-01.09:01:53
reviewing incrementally > * patch index: remove another kludge for broken move patches Can you remind me why we don't need this workaround any more? > * patch index: debug messages when loading > * patch index: debug messages when loading > * patch index: reformat a doc comment All OK
msg23252 (view)	Author: bfrk	Date: 2023-04-01.16:30:47
>> * patch index: remove another kludge for broken move patches > > Can you remind me why we don't need this workaround any more? I believe it was this patch patch f28ec42284caa0477537bb5741762abf0ab31b02 Author: Ben Franksen <ben.franksen@online.de> Date: Tue Mar 9 17:40:02 CET 2021 * fail in Darcs.Util.Tree.Monad.rename if source does not exist Previously, when we applied a move with a non-existing source, this was not detected i.e. did not fail, but rather (effectively) ignored it. Such changes could then be recorded and indeed we had a very old patch in our own history that contained such a bad move. This prompted me to extend the basic repair machinery for Prim.V1 to include a new case which simply removes such a change from the patch.
msg23253 (view)	Author: ganesh	Date: 2023-04-01.18:19:35
> This prompted me to extend the basic repair machinery for Prim.V1 to > include a new case which simply removes such a change from the > patch. So just so I've got it clear in my head (and in the hope I remember next time...) - we used to silently accept these broken patches, now we fail and force the user to repair them first before doing anything else?
msg23254 (view)	Author: bfrk	Date: 2023-04-01.18:21:48
It is funny how such a small and innocent looking change led to a series of patches, some of them much less trivial. I can't remember exactly how I noticed the broken move in our repos, perhaps I just ran a check or repair in one of my clones and noticed that it now "crashes". After I extended Prim.V1.applyAndTryToFixFL and doing lots of tests and self-review to make sure this is really okay, I went ahead and repair'ed screened on darcs.net, and I think the other repos, too (reviewed, releases). Quite some time later I noticed that my clones had become slower on operations like push and pull or with commands that support --not-in-remote. Furthermore, a fresh clone with the new patches pulled of top did not show this behavior. This vexxed me enough to investigate. Comparing the two repos I found that their _darcs/inventories differed a lot, in particular the fresh clone had much fewer inventories. So I looked into the upstream repos and saw that screened has just a single inventory! This made me remember the repair operation and indeed a quick look at the code in Darcs.Repository.Repair revealed that repair throws away all Tagged sections (it worked on the result of a patchSet2FL). So I fixed that. But then I had already repaired the repos on darcs.net, so how to restore them to their old glory? This led me to add -- deep/--shallow options for optimize reorder, --deep having the effect of creating inventories for all clean tags in the repo and also trying to make all tags clean as long as that doesn't make an earlier tag unclean. Again, after lots of testing and self-review, I ran this on on the repos on darcs.net (this time I archived the old versions under /opt/darcs/old). What is still missing is a variant of `darcs optimize reorder` that adapts an existing repo to the inventory structure (and thus patch order) of its upstream. Like a fresh clone with patches added/removed as required to mirror the old clone, as I did manually when I first encountered the mysterious slow- down. This should make use of packs if present in upstream. We could perhaps try to detect situations which are suboptimal (during push and pull and such) and issue a warning with a hint how to improve the situation. This is related to my idea (still not realised) of a "safe mode" which does not trust patch identities and instead requires patch content equality (via content hashes, including inventory hashes), re-ordering patches locally if necessary. This would reliably detect violations of the ident-commute law, i.e. our assumption that re-ordering via commute is the only valid way that patches can keep their identity (and vice versa). Realising this idea is difficult because it means everything in Darcs.Patch.Depends must be refactored, if not completely re-written.
msg23255 (view)	Author: bfrk	Date: 2023-04-01.18:26:51
To answer your question (which I hadn't seen before sending the long story): yes, precisely.
msg23258 (view)	Author: bfrk	Date: 2023-04-01.20:30:37
Well, not quite. We do NOT force users to run repair. If the broken move is contained in an old enough patch, then most operations won't even notice any problem. Remember that we fail when applying the patch (which is our only way to check if a patch actually makes sense in a given context) and old patches are usually not applied. Pulling all patches into an empty repo will fail, for instance. Commuting such a patch may or may not have triggered an error even before this change.
msg23377 (view)	Author: ganesh	Date: 2023-06-24.14:54:57
> * patch index: massive performance improvement Looks good. I've read this a few times and while I don't think I understand the nuances well enough to be completely certain about it, the concept makes sense. Also it's not a disaster if there is a bug as it won't corrupt repo history or anything.
msg23378 (view)	Author: ganesh	Date: 2023-06-24.14:57:51
> * patch index: optimize Set Word32 -> IntSet > * patch index: cleanup doc comments > * patch index: disentangle Darcs.Patch.Index.* modules, rename PatchMod to FileMod All good.

History
Date	User	Action	Args
2021-11-20 11:43:52	bfrk	create
2021-11-21 07:13:06	bfrk	set	status: needs-screening -> needs-review
2023-04-01 09:01:53	ganesh	set	messages: + msg23242
2023-04-01 16:30:48	bfrk	set	messages: + msg23252
2023-04-01 18:19:36	ganesh	set	messages: + msg23253
2023-04-01 18:21:52	bfrk	set	messages: + msg23254
2023-04-01 18:26:54	bfrk	set	messages: + msg23255
2023-04-01 20:30:38	bfrk	set	messages: + msg23258
2023-04-02 16:33:25	bfrk	set	status: needs-review -> review-in-progress
2023-06-24 14:54:58	ganesh	set	messages: + msg23377
2023-06-24 14:57:51	ganesh	set	status: review-in-progress -> accepted-pending-tests messages: + msg23378
2023-06-24 16:16:10	ganesh	set	status: accepted-pending-tests -> accepted