darcs

Patch 2147 eliminate Darcs.Repository.HashedIO

Title eliminate Darcs.Repository.HashedIO
Superseder Nosy List bfrk
Related Issues
Status accepted Assigned To
Milestone

Created on 2021-01-10.15:41:57 by bfrk, last changed 2021-04-03.19:31:25 by ganesh.

Files
File name Status Uploaded Type Edit Remove
fix-build-failure-due-to-missing-import-of-darcsaddmissinghashes.dpatch bfrk, 2021-03-29.09:02:01 application/x-darcs-patch
fix-regression-of-issue2365-and-its-test.dpatch bfrk, 2021-03-28.22:07:29 application/x-darcs-patch
patch-preview.txt bfrk, 2021-01-10.15:41:56 text/x-darcs-patch
patch-preview.txt bfrk, 2021-03-28.22:07:29 text/x-darcs-patch
patch-preview.txt bfrk, 2021-03-29.09:02:01 text/x-darcs-patch
simplify-createpristinedirectorytree-by-reading-and-writing-trees.dpatch bfrk, 2021-01-10.15:41:56 application/x-darcs-patch
unnamed bfrk, 2021-01-10.15:41:56 text/plain
unnamed bfrk, 2021-03-28.22:07:29 text/plain
unnamed bfrk, 2021-03-29.09:02:01 text/plain
See mailing list archives for discussion on individual patches.
Messages
msg22619 (view) Author: bfrk Date: 2021-01-10.15:41:56
As the subject says, this bundle gets us rid of the largely obsolete
Darcs.Repository.HashedIO module. Its functionality mostly duplicates what
we have in Darcs.Util.Tree.Hashed/Monad, so this saves lots of code to
maintain.

4 patches for repository http://darcs.net/screened:

patch df6118d74508b6b387916dcc7101507f0b07f376
Author: Ben Franksen <ben.franksen@online.de>
Date:   Tue Nov 10 15:40:42 CET 2020
  * simplify createPristineDirectoryTree by reading and writing trees
  
  This gets rid of yet another function exported from D.R.HashedIO, namely
  copyHashed.

patch 41fdaeb64e3000ac322072e7b90ebfdaaa2667ac
Author: Ben Franksen <ben.franksen@online.de>
Date:   Tue Nov 10 15:34:31 CET 2020
  * dist command: simplify getting paths and contents of pristine for --zip
  
  This gets us rid of yet another function exported from D.R.HashedIO, namely
  pathsAndContents. The re-implementation of that functionality using the
  existing tree functions is entirely straight-forward.

patch 3956dfcd4d7e7836757043fc472de98fa5b114ef
Author: Ben Franksen <ben.franksen@online.de>
Date:   Tue Nov 10 13:44:54 CET 2020
  * tone down excessive debug output from D.R.HashedIO.readHashFile

patch 97cd8cfd19b20d85608c025c9f957232178abe2e
Author: Ben Franksen <ben.franksen@online.de>
Date:   Tue Nov 10 16:16:58 CET 2020
  * move cleanPristineDir to D.R.Pristine
  
  This finally allows us to completely scrap D.R.HashedIO. Yay!
Attachments
msg22682 (view) Author: ganesh Date: 2021-03-27.22:41:20
Partial review:

>   * simplify createPristineDirectoryTree by reading and writing trees

I think the new code no longer checks 'isMaliciousSubPath'. I'm not 100%
certain if it's a problem but I think it might be the place that clone
is relying on to catch bad paths copied from remote repos.

The new version also now does nothing for the NoWorkingDir case. That
sort of makes sense but the old version deliberately evaluated the file
anyway and had a comment to that effect. I think that's because it's
ultimately calling 'fetchFileUsingCache' which can actually write into a
local cache as it works. But I haven't dug into the details enough to be
sure about this.
msg22687 (view) Author: bfrk Date: 2021-03-28.10:02:07
>>   * simplify createPristineDirectoryTree by reading and writing trees
> 
> I think the new code no longer checks 'isMaliciousSubPath'. I'm not 100%
> certain if it's a problem but I think it might be the place that clone
> is relying on to catch bad paths copied from remote repos.

I should have added a slightly longer comment to explain why all this is
no longer necessary.

The point is that nowadays all the validation is done when building
AnchoredPath, more specifically when create a Name. Thus, at the point
where we operate on a Tree, we can be sure that the paths in that tree
are non-malicious.

I think it was my latest index refactors that eliminated the last place
where we created a Name in an unsafe manner i.e. w/o validation. The
Name type is now abstract and there are no longer any unsafe functions
exported to build one.

The error handling when we encounter a bad name could still be improved.
For instance, decodeWhiteName throws a CorruptPatch exception (from pure
code). Apart from the naming (the thing at fault could also be a bad
pristine tree) it would be better to return an Either here and leave it
to the caller to fail with a more useful error message, including the
name of the corrupt file. I think I will send one or two patches to
clean that up, but not as a follow-up as it is pretty much unrelated.

> The new version also now does nothing for the NoWorkingDir case. That
> sort of makes sense but the old version deliberately evaluated the file
> anyway and had a comment to that effect. I think that's because it's
> ultimately calling 'fetchFileUsingCache' which can actually write into a
> local cache as it works. But I haven't dug into the details enough to be
> sure about this.

This is actually a bug, but not related to validation. The patch that
introduced that comment line was

patch 973cd2af4f30567c9f5b2e4f2a963358387daec3
Author: Guillaume Hoffmann <guillaumh@gmail.com>
Date:   Tue Apr 22 20:24:36 CEST 2014
  * resolve issue2365: correctly copy pristine in no-working-dir clones
  This patch makes Darcs.Repository.HashedIO.copyHashed take
  a flag that tells whether a clean working copy should be created
  while pristine is copied.

We do have a regression test for issue2365 and it succeeds, but this is
only because the whatsnew command in the test now /fetches/ the pristine
files from the repo, since it uses the cache now, and the source repo is
part of the cache (in the general sense i.e. set of hashed file
locations, which is independent of whether the /global/ cache is used or
not).

The test is easily fixed by renaming the source directory before the
whatsnew (you need to run the tests with -c=n to make it fail). And the
bug is fixed by reading and evaluating the tree, for the latter we can
use darcsAddMissingHashes.

Thanks for catching this! Will fix.
msg22688 (view) Author: ganesh Date: 2021-03-28.13:37:34
Thanks for the explanations about the first patch.

>   * dist command: simplify getting paths and contents of pristine for --zip

Looks like a nice simplification.

>   * tone down excessive debug output from D.R.HashedIO.readHashFile

OK

>   * move cleanPristineDir to D.R.Pristine
>   
>   This finally allows us to completely scrap D.R.HashedIO. Yay!

OK
msg22692 (view) Author: bfrk Date: 2021-03-28.22:07:29
Following up on review.

1 patch for repository http://darcs.net/screened:

patch d5d7468a80c32c2206e0d62f399108e806659d53
Author: Ben Franksen <ben.franksen@online.de>
Date:   Mon Mar 29 00:16:50 CEST 2021
  * fix regression of issue2365 and its test
  
  We have to read and evaluate pristine even if we have no working tree, in
  order for pristine files to be copied to our repo. This wasn't caught by the
  test for that issue since we now use the cache for pristine files and the
  immediate source repo of a clone is always used as a cache location,
  regardless of whether the global cache is used or not.
Attachments
msg22693 (view) Author: bfrk Date: 2021-03-29.08:05:45
> patch d5d7468a80c32c2206e0d62f399108e806659d53
> Author: Ben Franksen <ben.franksen@online.de>
> Date:   Mon Mar 29 00:16:50 CEST 2021
>   * fix regression of issue2365 and its test

Sorry, I botched this one: the import of darcsAddMissinghashes is
missing. Will send another patch to fix.
msg22694 (view) Author: bfrk Date: 2021-03-29.09:02:01
1 patch for repository http://darcs.net/screened:

patch ab541229c74def29718839a91f29f61a5de2a971
Author: Ben Franksen <ben.franksen@online.de>
Date:   Mon Mar 29 10:21:41 CEST 2021
  * fix build failure due to missing import of darcsAddMissingHashes
Attachments
msg22695 (view) Author: ganesh Date: 2021-04-03.15:52:16
Both followups fine
History
Date User Action Args
2021-01-10 15:42:00bfrkcreate
2021-01-10 15:42:57bfrksetstatus: needs-screening -> needs-review
2021-03-27 22:41:24ganeshsetmessages: + msg22682
2021-03-28 10:02:09bfrksetmessages: + msg22687
2021-03-28 13:37:35ganeshsetmessages: + msg22688
2021-03-28 13:37:51ganeshsetstatus: needs-review -> accepted-pending-tests
2021-03-28 14:03:45ganeshsetstatus: accepted-pending-tests -> followup-in-progress
2021-03-28 22:07:29bfrksetfiles: + patch-preview.txt, fix-regression-of-issue2365-and-its-test.dpatch, unnamed
messages: + msg22692
2021-03-29 08:05:47bfrksetmessages: + msg22693
2021-03-29 09:02:01bfrksetfiles: + patch-preview.txt, fix-build-failure-due-to-missing-import-of-darcsaddmissinghashes.dpatch, unnamed
messages: + msg22694
2021-04-03 15:40:20ganeshsetstatus: followup-in-progress -> needs-review
2021-04-03 15:52:17ganeshsetstatus: needs-review -> accepted-pending-tests
messages: + msg22695
2021-04-03 19:31:25ganeshsetstatus: accepted-pending-tests -> accepted