darcs

Patch 1745 use AnchoredPath instead of FileName for... (and 8 more)

Title use AnchoredPath instead of FileName for... (and 8 more)
Superseder Nosy List bfrk, ganesh
Related Issues
Status accepted Assigned To ganesh
Milestone

Created on 2018-10-14.00:20:11 by bfrk, last changed 2018-11-25.12:20:17 by ganesh.

Files
File name Status Uploaded Type Edit Remove
document-and-add-property-for-explodepath_s_.dpatch bfrk, 2018-11-18.16:30:04 application/x-darcs-patch
fix-in-patch-index_-avoid-generating-repeated-move-patch-mods.dpatch bfrk, 2018-10-18.20:42:42 application/x-darcs-patch
increase-format-version-of-patch-index-to-3.dpatch bfrk, 2018-11-18.22:05:02 application/x-darcs-patch
patch-preview.txt bfrk, 2018-10-14.00:20:10 text/x-darcs-patch
patch-preview.txt bfrk, 2018-10-18.20:42:42 text/x-darcs-patch
patch-preview.txt bfrk, 2018-11-18.16:30:04 text/x-darcs-patch
patch-preview.txt bfrk, 2018-11-18.22:05:02 text/x-darcs-patch
patch-preview.txt ganesh, 2018-11-18.22:45:56 text/x-darcs-patch
patch-preview.txt ganesh, 2018-11-24.23:05:48 text/x-darcs-patch
unnamed bfrk, 2018-10-14.00:20:10 text/plain
unnamed bfrk, 2018-10-18.20:42:42 text/plain
unnamed bfrk, 2018-11-18.16:30:04 text/plain
unnamed bfrk, 2018-11-18.22:05:02 text/plain
unnamed ganesh, 2018-11-18.22:45:56 text/plain
unnamed ganesh, 2018-11-24.23:05:48 text/plain
use-anchoredpath-instead-of-filename-for-internal-path-representation.dpatch bfrk, 2018-10-14.00:20:10 application/x-darcs-patch
use-the-posix-___-in-anchorpath.dpatch ganesh, 2018-11-24.23:05:48 application/x-darcs-patch
use-the-posix-___-in-displaypath.dpatch dead ganesh, 2018-11-18.22:45:56 application/x-darcs-patch
See mailing list archives for discussion on individual patches.
Messages
msg20403 (view) Author: bfrk Date: 2018-10-14.00:20:10
Note: it makes sense to review the first patch separately. Then pull/apply
the rest and run the tests. They all succeed for me. Some of the changes to
test scripts are not strictly needed for that but are included because they
make sense (IMO) and were motivated by earlier versions of the refactor that
made a few more changes to behavior. This latest incarnation, rebased more
often than is good for my sanity, tries very hard to preserve the existing
behavior.

9 patches for repository http://darcs.net/screened:

patch e03902aeb50727cb199584282ffd46f061b48793
Author: Ben Franksen <ben.franksen@online.de>
Date:   Thu Oct 11 23:18:34 CEST 2018
  * use AnchoredPath instead of FileName for internal path representation
  
  This is a large patch that touches many files. I have tried hard to make
  reviewing it as easy as possible by avoiding changes that aren't necessary
  to achieve the goal. For instance, in this patch, FileName is still used but
  is a synonym for AnchoredPath. I have resisted the temptation to include
  cleanup changes, except where it was necessary for me to make sense of
  existing code.
  
  AnchoredPath is now used throughout, from the UI down to the patches
  themselves, for all paths that potentially reference user content in a
  repository. This means paths under _darcs are /not/ included, and neither
  are paths /to/ a repository or a cache or other files not under darcs
  control.
  
  Arguments from the command line are sanitized and converted to AnchoredPath
  early on. This is now concentrated in two routines: pathsFromArgs, and
  pathSetFromArgs, see their documentation for details. We convert to FilePath
  only for IO operations or for display to the user.
  
  Parsing of prim V1 patches is a bit stricter now: it fails if paths are not
  explicitly relative i.e. start with "./". The standard constructor for the
  Name type (makeName) now checks that the invariants aren't violated, that
  is, a Name is never empty, ".", or "..". Unfortunately, the index code still
  needs to use unsafeMakeName because it violates these invariants, at least
  temporarily.

patch cd6e1a2035f86da573f6d9eef21a2d6b2c0e715a
Author: Ben Franksen <ben.franksen@online.de>
Date:   Thu Oct 11 23:25:36 CEST 2018
  * rename FileName to AnchoredPath and remove the type synonym

patch 49ce550e44938c3ce8cc05f2158500df038c259b
Author: Ben Franksen <ben.franksen@online.de>
Date:   Fri Oct 12 14:55:18 CEST 2018
  * make some test scripts more robust against change in wording of messages
  
  It shouldn't matter whether darcs says "We have conflicts..." or "There are
  conflicts..." etc.

patch 012ad33b2290e06df77dce3db94c1fd9aa559e9b
Author: Ben Franksen <ben.franksen@online.de>
Date:   Fri Oct 12 22:04:08 CEST 2018
  * make test script independent from the order in which prim patches are offered
  
  The specific sorting order for prim patches in a record or amend is
  something we never specified (and I guess we don't want to do that), so our
  tests should not rely on it.

patch e8d4e47572b175ea769524d5e09b86464eba7a4e
Author: Ben Franksen <ben.franksen@online.de>
Date:   Thu Oct 11 02:04:40 CEST 2018
  * fix: fail if pending patch cannot be parsed
  
  The previous behavior was to silently ignore pending. This is bad because we
  want to know if the pending patch is corrupt. Note that parsing is now
  stricter and requires the leading "./". This invalidates some of the tests
  for darcs repair since we cannot record a pending patch with an invalid
  file path.

patch 9253d72801cb849f1474184b4baa0ba1173d3624
Author: Ben Franksen <ben.franksen@online.de>
Date:   Sun Oct 14 00:57:23 CEST 2018
  * improve wording of warning message in maybeFixSubPaths

patch 92e94c8810ea48f5b848b537577ba41abf799166
Author: Ben Franksen <ben.franksen@online.de>
Date:   Tue Oct  9 13:01:01 CEST 2018
  * bugfix in whatsnew: must use chooseTouching, not choosePreTouching
  
  This bug was previously hidden because the old effectOnFilePaths did not
  always do the right thing. I trust the much simpler version that works with
  AnchoredPaths more that the old version that fiddles with the String
  representation. chooseTouching also makes more sense when you think about
  it: if we invert the patch that adds the new files then its effect on the
  paths is to remove them; thus choosePreTouching removes the interesting
  paths from its input before it calls chooseTouching, which gives us wrong
  results (now).

patch f867f1f5d3ffe8c466d1c9b219381ba649818bd3
Author: Ben Franksen <ben.franksen@online.de>
Date:   Sat Oct 13 21:23:49 CEST 2018
  * bugfix in remove command: don't allow removal of root
  
  This will otherwise create an invalid pending patch causing subsequent darcs
  invocations to crash.

patch 441b08715437be817ef7cf0a63907ed6d5fcff7f
Author: Ben Franksen <ben.franksen@online.de>
Date:   Fri Oct 12 22:08:29 CEST 2018
  * adapt a test script: darcs no longer fails to detect this move 
  
  This was fixed by the FileName->AnchoredPath cleanup. I have no idea how or
  why in particular, except that we save a lot of tricky Char/String fiddling
  some of which may have been buggy.
Attachments
msg20418 (view) Author: bfrk Date: 2018-10-18.20:42:42
a minor follow-up for the AnchoredPath refactor

1 patch for repository http://darcs.net/screened:

patch 8fc258192b76374fd570e310f45496ff81a6e6d4
Author: Ben Franksen <ben.franksen@online.de>
Date:   Mon Oct 15 09:05:30 CEST 2018
  * fix in patch index: avoid generating repeated move patch mods
  
  This regression was due to a mistake in the AnchoredPatch refactor.
Attachments
msg20484 (view) Author: ganesh Date: 2018-11-17.13:19:25
>  * use AnchoredPath instead of FileName for internal path representation

> -- | For displaying paths to the user. This adds the "./" for consistency
> -- with how repo paths are displayed by 'showPatch' and friends. 
> -- At some point point in the future we may decide to change this.
> displayPath :: AnchoredPath -> FilePath
> displayPath = anchorPath "."

I guess that displayPath is never to be used for the on-disk patch format?
It might be worth clarifying that explicitly in the comment if so (I can
do it if you confirm that's true).

> explodePaths :: Tree IO -> [AnchoredPath] -> [AnchoredPath]
> explodePaths tree paths = concatMap (explodePath tree) paths

> explodePath :: Tree m -> AnchoredPath -> [AnchoredPath]
> explodePath tree path =
 >  path : maybe [] (map (catPaths path . fst) . list) (findTree tree path)

Could you give these some haddocks? I think from the usage that it's
just about traversing a directory recursively, but it's not obvious
from the definition.

Overall the patch looks fine. Thanks for keeping it relatively small. I still 
didn't manage to check every detail, but the overall description of the
change in the patch comment sounds good to me, and I've confirmed that the
actual changes seem to match that description.
msg20485 (view) Author: ganesh Date: 2018-11-17.13:35:55
>   * rename FileName to AnchoredPath and remove the type synonym

Fine

>  * make some test scripts more robust against change in wording of 
messages
>  It shouldn't matter whether darcs says "We have conflicts..." or 
"There are
>  conflicts..." etc.

Fine (of course it would matter if darcs said "We have no 
conflicts", but I'm ok
to take that risk :-)

> * make test script independent from the order in which prim 
patches are offered

Fine - I expect this is a widespread problem in our scripts, but we 
can fix them
ad hoc as it comes up.

>  * fix: fail if pending patch cannot be parsed

Fine. And getting rid of a "catchall" is always good, some of them 
cause real
confusion.

Should we give a hint to the user that they can delete it if they 
run into this
error?

>  * improve wording of warning message in maybeFixSubPaths

Fine

>  * bugfix in whatsnew: must use chooseTouching, not 
choosePreTouching

> This bug was previously hidden because the old effectOnFilePaths 
did not
> always do the right thing. I trust the much simpler version that 
works with
> AnchoredPaths more that the old version that fiddles with the 
String
> representation. chooseTouching also makes more sense when you 
think about
> it: if we invert the patch that adds the new files then its effect 
on the
> paths is to remove them; thus choosePreTouching removes the 
interesting
> paths from its input before it calls chooseTouching, which gives 
us wrong
> results (now).

Isn't either choice sometimes wrong? If it works for patches that 
add something
then it won't work for patches that remove it, and vice versa. Also 
for moves
either the old or new names will work but not both.

I haven't played around with how darcs actually behaves before and 
after
this set of patches so I might be misunderstanding.

>   * bugfix in remove command: don't allow removal of root

Fine

>  * adapt a test script: darcs no longer fails to detect this move

Fine
msg20486 (view) Author: ganesh Date: 2018-11-17.14:56:23
Is it expected that these changes affect the format of the patch
index? I'm seeing errors from binary in any commands that use the
patch index when I mix-and-match darcs binaries from before and after 
these changes. I haven't dug into the details yet.
msg20507 (view) Author: ganesh Date: 2018-11-18.13:29:30
As a result of this code, I'm seeing paths like .\foo/bar being printed
by darcs on Windows whereas previously they were ./foo/bar. This breaks
a few tests (issue2136, patch-index-spans, rename_shouldnt_affect_prefixes)
as well as being quite ugly.

Changing it to use the </> from System.FilePath.Posix fixes this, shall
we just go with that?

The problem with incompatible patch indexes still exists even with that 
fix though.

> import System.FilePath( (</>), splitDirectories, normalise, dropTrailingPathSeparator )

> -- | For displaying paths to the user. This adds the "./" for consistency
> -- with how repo paths are displayed by 'showPatch' and friends.
> -- At some point point in the future we may decide to change this.
> displayPath :: AnchoredPath -> FilePath
> displayPath = anchorPath "."

> -- | Take a "root" directory and an anchored path and produce a full
> -- 'FilePath'. Moreover, you can use @anchorPath \"\"@ to get a relative
> -- 'FilePath'.
> anchorPath :: FilePath -> AnchoredPath -> FilePath
> anchorPath dir p = dir FilePath.</> decodeLocale (flatten p)
msg20513 (view) Author: bfrk Date: 2018-11-18.16:30:04
Following up on review.

1 patch for repository http://darcs.net/screened:

patch 74a4b7d369595e69668c18761836920b3f254c28
Author: Ben Franksen <ben.franksen@online.de>
Date:   Sun Nov 18 16:55:44 CET 2018
  * document and add property for explodePath(s)
Attachments
msg20514 (view) Author: bfrk Date: 2018-11-18.21:23:07
>>  It shouldn't matter whether darcs says "We have conflicts..." or
> "There are
>>  conflicts..." etc.
>
> Fine (of course it would matter if darcs said "We have no
> conflicts", but I'm ok to take that risk :-)

Right. Testing for correct messages is difficult to get right. We have
to strike a balance between being too lax and being too specific. Perhps
a better test would spell out both alternatives? Or perhaps we should
unify the message to be always the same?

>> * make test script independent from the order in which prim
> patches are offered
>
> Fine - I expect this is a widespread problem in our scripts, but we
> can fix them ad hoc as it comes up.

Yes. (I usually gather non-essential tests script fixes and send them in
a separate bundle, but this time the fix was needed to avoid a false
positive.)

>>  * fix: fail if pending patch cannot be parsed
>
> Fine. And getting rid of a "catchall" is always good, some of them
> cause real confusion.
>
> Should we give a hint to the user that they can delete it if they
> run into this error?

I would prefer to extend darcs repair with an option or subcommand for
fixing the pending patch. I don't think we should ever encourage users
to fiddle around with stuff inside _darcs (except prefs, but even that
is questionable, see issue2602).

>>  * bugfix in whatsnew: must use chooseTouching, not
> choosePreTouching
>
>> This bug was previously hidden because the old effectOnFilePaths
> did not
>> always do the right thing. I trust the much simpler version that
> works with
>> AnchoredPaths more that the old version that fiddles with the
> String
>> representation. chooseTouching also makes more sense when you
> think about
>> it: if we invert the patch that adds the new files then its effect
> on the
>> paths is to remove them; thus choosePreTouching removes the
> interesting
>> paths from its input before it calls chooseTouching, which gives
> us wrong
>> results (now).
>
> Isn't either choice sometimes wrong? If it works for patches that
> add something
> then it won't work for patches that remove it, and vice versa. Also
> for moves
> either the old or new names will work but not both.

You are absolutely right, in principle. But paths are only added or
removed or renamed in this (simple) way if we call effectOnPaths, and
this is done only by choosePreTouching, not by chooseTouching. This begs
the question how chooseTouching manages to give the correct result in
the face of adds, removes, and renames. This is the point at which I
usually give up because this part of the code is not easy to understand
as there are /many/ functions involved, none of which precisely document
their behavior and all work on list of paths or tuples of lists of
tuples of paths...

BTW, I just had the idea that neither a call to chooseTouching nor to
choosePreTouching should not be needed here, since unrecordedChanges
already gets the file paths as arguments and is supposed to use them to
filter the result. I just played with that and found that it doesn't do
so. Indeed Darcs.Repository.State only filters trees, but never patches.
This is an inconsistency in the API and not documented anywhere I think.
Perhaps we should change that and do the filtering of patches there,
too. This could lead to less code duplication in the command
implementations.

There is also some overlap here with the discussion we had about the
"refactor filteredWorking" patch. To recap, you observed that if my
patch is correct, then it must have been broken before in the
UseIndex/ScanKnown case, which happens to be the common case when no
options are given. I guess this has been "fixed" long ago in the command
implementations, similar to how the chooseTouching here filters the
patches that unrecordedChanges fails to. Or perhaps the tree filtering
that is done by D.R.State was only meant as an optimization, at least
originally.

> I haven't played around with how darcs actually behaves before and
> after this set of patches so I might be misunderstanding.

I have just tested manually that whatsnew works properly for newly
added, removed, and renamed files, no matter whether I specify the old
or the new names and with and without --ignore-times. With zsh even the
completion works as expected: removing a (tracked) file still completes
the (no longer existing) old file path.
msg20515 (view) Author: bfrk Date: 2018-11-18.21:30:11
> Is it expected that these changes affect the format of the patch
> index? I'm seeing errors from binary in any commands that use the
> patch index when I mix-and-match darcs binaries from before and after 
> these changes. I haven't dug into the details yet.

An interesting observation. I have seen similar errors but I thought
they were caused by my (not yet published) changes to the patch index.

To answer the question: no, these changes are not meant to affect the
binary format of the patch index. If I had planned for that i would have
increased the version number, causing the patch index to be re-created
instead of giving an error. This is a bug and I will try to fix it.
msg20516 (view) Author: bfrk Date: 2018-11-18.22:01:16
>> Is it expected that these changes affect the format of the patch
>> index? I'm seeing errors from binary in any commands that use the
>> patch index when I mix-and-match darcs binaries from before and after 
>> these changes. I haven't dug into the details yet.
> 
> An interesting observation. I have seen similar errors but I thought
> they were caused by my (not yet published) changes to the patch index.
> 
> To answer the question: no, these changes are not meant to affect the
> binary format of the patch index. If I had planned for that i would have
> increased the version number, causing the patch index to be re-created
> instead of giving an error. This is a bug and I will try to fix it.

I am an idiot. /Of course/ this change affects the binary format of the
patch index. Not only has AnchoredPath a different Binary instance, it
also has a different Ord instance so all the maps with paths as keys
change format etc. So while this wasn't intentional, it is clear that
the format is affected and thus we must increase the verson number. This
will cause the patch index to be re-created instead of crashing Darcs
with an unhandled exception. Will send and push a patch to that effect.
msg20517 (view) Author: bfrk Date: 2018-11-18.22:05:02
Following up on patch review. Many thanks for catching this one.

1 patch for repository http://darcs.net/screened:

patch 307ad115a790fb29c5688d228263c1e496dc4225
Author: Ben Franksen <ben.franksen@online.de>
Date:   Sun Nov 18 23:04:05 CET 2018
  * increase format version of patch index to 3
  
  This change reflects that the patch index now uses AnchoredPath instead of
  the old FileName for all paths, and that has different Binary and Ord
  instances.
Attachments
msg20518 (view) Author: ganesh Date: 2018-11-18.22:34:58
Maybe we should add a test for the "broken patch index" situation.
I only caught this by accident while trying to debug the other
problem with funny paths on Windows.

I guess we make a known good patch index on a released version of
darcs and then check we can still read it. Do you happen to know
if the index is compatible between different architectures?
msg20519 (view) Author: bfrk Date: 2018-11-18.22:43:48
> As a result of this code, I'm seeing paths like .\foo/bar being printed
> by darcs on Windows whereas previously they were ./foo/bar. This breaks
> a few tests (issue2136, patch-index-spans, rename_shouldnt_affect_prefixes)
> as well as being quite ugly.
> 
> Changing it to use the </> from System.FilePath.Posix fixes this, shall
> we just go with that?

Oh f***. This is a serious bug that doesn't only affect display of paths
but actually the on disk format of patches on Windows, since anchorPath
is now used for that, too. So this is a definite YES.

I am unsure how / why this changed from Posix to Native FilePath,
perhaps just a stupid mistake I made when cleaning up imports.

Note that e.g. realPath is meant to be used as input for IO actions that
take path arguments. This is a point that always confuses me: Is there a
general guide line for when we have to use the "native" path separator
(i.e. '\' on Windows) as opposed to the Posix one? I am pretty sure the
on-disk format of patches always uses the Posix separator, and similarly
the index, but that's about the end of my knowledge.
msg20520 (view) Author: ganesh Date: 2018-11-18.22:45:56
1 patch for repository darcs-unstable@darcs.net:screened:

patch ca7d6b60e8f7ac1166c336bc4d5c97916f609b14
Author: Ganesh Sittampalam <ganesh@earth.li>
Date:   Sun Nov 18 15:19:02 GMT 2018
  * use the Posix </> in displayPath
  
  Otherwise displayPath produces things like ".\foo/bar"
  on Windows.
Attachments
msg20521 (view) Author: ganesh Date: 2018-11-18.22:48:15
> Is there a general guide line for when we have to use the "native" 
> path separator (i.e. '\' on Windows) as opposed to the Posix one?
> I am pretty sure the on-disk format of patches always uses the 
> Posix separator, and similarly the index, but that's about the end
> of my knowledge.

I'm never very sure either. The Posix separator generally works
on Windows, so probably that's a safe bet if you're not sure.
I suspect that the Darcs code has just grown up organically, the
main times it really needs to deal with Windows separators are
when dealing with user input.
msg20522 (view) Author: bfrk Date: 2018-11-18.22:58:08
> Maybe we should add a test for the "broken patch index" situation.
> I only caught this by accident while trying to debug the other
> problem with funny paths on Windows.
> 
> I guess we make a known good patch index on a released version of
> darcs and then check we can still read it. Do you happen to know
> if the index is compatible between different architectures?

This is from the docs of the binary package:

"""
Values encoded using the Binary class are always encoded in network
order (big endian) form, and encoded data should be portable across
machine endianness, word size, or compiler version. For example, data
encoded using the Binary class could be written on any machine, and read
back on any another.
"""

That sounds pretty good and AFAIK the patch index uses only standard
Binary instances or instances that derive from one of the standard
instances.

Can we read a patch index when there may be patches missing? If not,
you'd have to store a full repo along with the patch index for the test.
msg20523 (view) Author: bfrk Date: 2018-11-18.23:11:26
>> Is there a general guide line for when we have to use the "native" 
>> path separator (i.e. '\' on Windows) as opposed to the Posix one?
>> I am pretty sure the on-disk format of patches always uses the 
>> Posix separator, and similarly the index, but that's about the end
>> of my knowledge.
> 
> I'm never very sure either. The Posix separator generally works
> on Windows, so probably that's a safe bet if you're not sure.
> I suspect that the Darcs code has just grown up organically, the
> main times it really needs to deal with Windows separators are
> when dealing with user input.

Yes, that makes sense. What if we get a path via a system call? I'd
guess that will be in native format, too. I wanted to cite listing a
directory as example but that doesn't contain path separators (unless
done resursively but we do that ourselves). What about making a path
absolute? Oops, we also do that ourselves. Hmm, I can't think of any
more native sys calls which return a full path name as part of the
result but of course that doesn't mean there aren't any.
msg20553 (view) Author: ganesh Date: 2018-11-24.23:05:48
The same patch as before, with a slightly more accurate
description in case it helps with later forensics.

1 patch for repository darcs-unstable@darcs.net:screened:

patch b53b697e82fdc1db10cfb17b42cdde6d44f5238d
Author: Ganesh Sittampalam <ganesh@earth.li>
Date:   Sat Nov 24 23:05:51 GMT 2018
  * use the Posix </> in anchorPath
  
  Otherwise we get things like ".\foo/bar" on Windows.
Attachments
History
Date User Action Args
2018-10-14 00:20:11bfrkcreate
2018-10-18 20:39:32bfrksetstatus: needs-screening -> needs-review
2018-10-18 20:42:42bfrksetfiles: + patch-preview.txt, fix-in-patch-index_-avoid-generating-repeated-move-patch-mods.dpatch, unnamed
messages: + msg20418
2018-11-17 13:19:26ganeshsetstatus: needs-review -> review-in-progress
messages: + msg20484
2018-11-17 13:35:56ganeshsetmessages: + msg20485
2018-11-17 14:56:23ganeshsetmessages: + msg20486
2018-11-18 13:29:30ganeshsetmessages: + msg20507
2018-11-18 16:30:04bfrksetfiles: + patch-preview.txt, document-and-add-property-for-explodepath_s_.dpatch, unnamed
messages: + msg20513
2018-11-18 21:23:08bfrksetmessages: + msg20514
2018-11-18 21:30:11bfrksetmessages: + msg20515
2018-11-18 22:01:16bfrksetmessages: + msg20516
2018-11-18 22:05:02bfrksetfiles: + patch-preview.txt, increase-format-version-of-patch-index-to-3.dpatch, unnamed
messages: + msg20517
2018-11-18 22:34:59ganeshsetassignedto: ganesh
messages: + msg20518
nosy: + ganesh
2018-11-18 22:43:49bfrksetmessages: + msg20519
2018-11-18 22:45:57ganeshsetfiles: + patch-preview.txt, use-the-posix-___-in-displaypath.dpatch, unnamed
messages: + msg20520
2018-11-18 22:48:16ganeshsetmessages: + msg20521
2018-11-18 22:58:10bfrksetmessages: + msg20522
2018-11-18 23:11:27bfrksetmessages: + msg20523
2018-11-24 23:05:48ganeshsetfiles: + patch-preview.txt, use-the-posix-___-in-anchorpath.dpatch, unnamed
messages: + msg20553
2018-11-25 12:20:17ganeshsetstatus: review-in-progress -> accepted