darcs

Patch 1755 disentangle D.R.State.filteredWorking

Title disentangle D.R.State.filteredWorking
Superseder Nosy List bf, ganesh
Related Issues
Status accepted-pending-tests Assigned To ganesh
Milestone

Created on 2018-10-18.22:13:35 by bf, last changed 2018-12-10.13:20:26 by bf.

Files
File name Status Uploaded Type Edit Remove
disentangle-d_r_state_filteredworking.dpatch bf, 2018-10-18.22:13:35 application/x-darcs-patch
fix-docs_-following-up-on-_disentangle-d_r_state_filteredworking_.dpatch bf, 2018-12-03.19:10:28 application/x-darcs-patch
patch-preview.txt bf, 2018-10-18.22:13:35 text/x-darcs-patch
patch-preview.txt bf, 2018-12-03.19:10:28 text/x-darcs-patch
unnamed bf, 2018-10-18.22:13:35 text/plain
unnamed bf, 2018-12-03.19:10:28 text/plain
See mailing list archives for discussion on individual patches.
Messages
msg20426 (view) Author: bf Date: 2018-10-18.22:13:35
1 patch for repository http://darcs.net/screened:

patch cc12c0b275fc233beb579c6502fa47e65c292a98
Author: Ben Franksen <ben.franksen@online.de>
Date:   Tue Oct  2 23:26:43 CEST 2018
  * disentangle D.R.State.filteredWorking
  
  Swapping the order of the cases and then trying to pull out applyTreeFilter
  relevant made it apparent that
  (a) using the index in the IgnoreIndex case can be replaced by passing the
      pending_tree instead, and
  (b) applyTreeFilter relevant was missing in the UseIndex+ScanKnown case.
Attachments
msg20499 (view) Author: ganesh Date: 2018-11-18.09:22:04
> -- The @pending_tree@ is understood to have @relevant@ already applied and is
> -- used (only) if @useidx == 'IgnoreIndex'@ and @scan == 'ScanKnown'@ to act as
> -- a guide for filtering the working tree.

This comment is now wrong, as it's now also used in the IgnoreIndex/ScanAll case.

>  (a) using the index in the IgnoreIndex case can be replaced by passing the
>      pending_tree instead, and

This change is in the ScanAll, IgnoreIndex case specifically. Doesn't using
pending_tree mean that newly added files will be missed?

>  (b) applyTreeFilter relevant was missing in the UseIndex+ScanKnown case.

This seems quite fundamental in a common case, didn't this lead to obvious
misbehaviour with darcs ignoring an explicit list of paths?

I don't find this code/API very intuitive so I could be completely misunderstanding.
msg20509 (view) Author: bf Date: 2018-11-18.14:24:17
Ganesh, thanks for the comments.

As I found out by myself in the meantime, my claim that "(a) using the 
index in the IgnoreIndex case can be replaced by passing the pending_tree 
instead" is not fully correct and the resulting code changes may perhaps 
break things. More on that below. I think the second statement (b) is 
correct and the only possible explanation I have for why it doesn't lead 
to misbehavior is that we duplicate some of the work of filtering out 
irrelevant parts in other places e.g. command implementations.

Pushing this change to screened was premature and I regret having done 
so. (The same goes for the "resolve issue1959" patch which prompted this 
refactor.) Even though they pass the test suite that doesn't tell us much 
because when the test suite is run we set "ALL --ignore-times" which 
means we don't use the index... except that this option is not honored by 
all commands and not all the functions exported by Darcs.Repository.State 
give us the possibility to avoid access to the index.

In other words: this is a big and ugly mess and cleaning it up is slow 
going.

The main problem for me was to reverse engineer a clear specification of 
what "I.readIndex >>= updateIndex" is supposed to give back as a result 
i.e. an equivalent IO action that does not involve the index. I currently 
believe that this may capture it:

If pending_tree is the recorded state with the pending patch applied, 
then we have

  I.readIndex >>= updateIndex
  ===
  do
    guide <- expand pending_tree
    expand =<< restrict guide <$> readPlainTree (repoLocation repo)

We don't use it like that, though. Instead we have another TreeFilter in 
between in case we want to restrict the operation to a list of paths 
given by the user. So in practice it's rather

  I.readIndex >>= updateIndex . applyTreeFilter treeFilter
  ===
  do
    guide <- expand pending_tree
    expand =<< applyTreeFilter treeFilter . restrict guide <$>
      readPlainTree (repoLocation repo)

I found this out by modifying the code so that it always does /both/ 
versions, then asserts equality of the two resulting trees (using their 
set of paths to determine equality). This allowed me to use the test 
suite as a comprehensive source of inputs and still be independent of the 
--ignore-times option.

My problem now is that I am reluctant to throw this code away as it is 
very useful and also documents the behavior nicely. OTOH, we don't want 
to keep it in the main code path because it duplicates work and doesn't 
allow us to fix issue1959. Any idea? Perhaps I should disable the 
unwanted code with a global constant.

I am not sure what to do with this patch as it is. I am considering a 
rollback.
msg20510 (view) Author: ganesh Date: 2018-11-18.14:37:15
A rollback is fine, it's one of the risks we accepted by having
the screened/reviewed setup.
msg20567 (view) Author: bf Date: 2018-12-03.19:10:28
Instead of rolling this patch back, here are two fixes. After thinking about
this for quite some time I came to the conclusion that the refactor is
basically sound modulo the fix below.

2 patches for repository http://darcs.net/screened:

patch 6491f483e4051706669820a15e31bd6e0e7bbc5e
Author: Ben Franksen <ben.franksen@online.de>
Date:   Mon Dec  3 19:30:56 CET 2018
  * fix docs, following up on "disentangle D.R.State.filteredWorking"

patch 3d69bbfd68a8ac98ea1d1fa54141dd6eca5920f7
Author: Ben Franksen <ben.franksen@online.de>
Date:   Mon Dec  3 19:50:12 CET 2018
  * expand pending_tree in filteredWorking (IgnoreIndex+ScanAll case)
  
  We use the pending_tree for filtering which means it should be fully
  expanded, see the docs for Darcs.Util.Tree.restrict and my comment
  (http://bugs.darcs.net/msg20509). It would be more efficient and elegant to
  do the expanding on demand, but since this is an IO action it would mean
  tree filtering is no longer a pure function.
Attachments
msg20594 (view) Author: ganesh Date: 2018-12-07.07:12:50
I don't understand why using 'expand' on pending_tree means that
newly added files will now be found, but if you've tested it
thoroughly I'll go with that.

I also missed your previous question about how to keep code that
essentially forms a useful behaviour specification. I'd agree
with putting it in controlled by a global constant - maybe expose
the equivalence check as a test?
msg20595 (view) Author: ganesh Date: 2018-12-07.07:37:46
Oh, now I see that you have included the test code in patch1738 :-)
msg20602 (view) Author: bf Date: 2018-12-10.13:20:26
> I don't understand why using 'expand' on pending_tree means that
> newly added files will now be found, but if you've tested it
> thoroughly I'll go with that.

Nothing under a stub item will be reported as found by the pure
functions that operate on a tree, including filters.

> I also missed your previous question about how to keep code that
> essentially forms a useful behaviour specification. I'd agree
> with putting it in controlled by a global constant - maybe expose
> the equivalence check as a test?

I would have done if it were easily possible. The stated equivalence is
an invariant that must be maintained by any code that updates the index
and since these are all IO actions...
History
Date User Action Args
2018-10-18 22:13:35bfcreate
2018-10-28 16:40:57bfsetstatus: needs-screening -> needs-review
2018-11-18 09:22:05ganeshsetstatus: needs-review -> review-in-progress
assignedto: ganesh
messages: + msg20499
nosy: + ganesh
2018-11-18 14:24:18bfsetmessages: + msg20509
2018-11-18 14:37:15ganeshsetmessages: + msg20510
2018-12-03 19:10:28bfsetfiles: + patch-preview.txt, fix-docs_-following-up-on-_disentangle-d_r_state_filteredworking_.dpatch, unnamed
messages: + msg20567
2018-12-07 07:12:51ganeshsetstatus: review-in-progress -> accepted-pending-tests
messages: + msg20594
2018-12-07 07:37:46ganeshsetmessages: + msg20595
2018-12-10 13:20:26bfsetmessages: + msg20602