darcs

Issue 2673 clone: inconsistencies in destination dir (put vs. get)

Title: clone: inconsistencies in destination dir (put vs. get)
Status: unknown
Nosy List: gpiero

Created on 2021-01-23.08:54:34 by gpiero, last changed 2022-04-13.21:14:07 by bf.

Messages
msg22624 Author: gpiero Date: 2021-01-23.08:54:30
$ darcs --version 
2.16.3 (+ 180 patches)

$ darcs ini R
Finished initializing repository.
$ darcs clone R local-local
Copying patches, to get lazy repository hit ctrl-C...
Finished cloning.
$ darcs clone localhost.:$(pwd)/R remote-local
Copying patches, to get lazy repository hit ctrl-C...
Finished cloning.
$ ls -1d */_darcs
R/_darcs
local-local/_darcs
remote-local/_darcs

So far, so good...

$ darcs clone R localhost.:$(pwd)/local-remote
Creating local clone...
Transferring clone using scp...
Cloning and transferring successful.
$ darcs clone localhost.:$(pwd)/R localhost.:$(pwd)/remote-remote
Creating local clone...
Transferring clone using scp...
Cloning and transferring successful.
$ ls -1d */*/_darcs
local-remote/R/_darcs
remote-remote/R/_darcs

When creating the repo on a remote host, darcs appends the basename of 
the repodir to the provided path. This does not happen if the 
destination dir is on the local host. Different combinations of trailing 
slashes do not change the results.
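
For anyone reproducing this, the `ls -1d` checks above can be wrapped in a small layout test. This is only a sketch: `clone_layout` is a hypothetical helper, not part of darcs, and it only inspects local directories (for the "remote" cases here, the destination lives on the same machine).

```shell
# Hypothetical helper (not part of darcs): report where _darcs ended up
# below a clone destination directory.
#   flat    -> dest/_darcs exists (expected layout)
#   nested  -> dest/<something>/_darcs exists (the behavior reported here)
#   missing -> no _darcs found at either level
clone_layout() {
    dest=$1
    if [ -d "$dest/_darcs" ]; then
        echo "flat"
        return 0
    fi
    for sub in "$dest"/*/; do
        if [ -d "${sub}_darcs" ]; then
            echo "nested"
            return 0
        fi
    done
    echo "missing"
}
```

With the directories from the transcript, `clone_layout local-local` would report `flat`, while `clone_layout local-remote` would report `nested`.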
msg22625 Author: bf Date: 2021-01-23.18:21:07
Sorry, dumb question first:

> $ darcs clone localhost.:$(pwd)/R remote-local

What does the single "." after "localhost" mean?
msg22626 Author: gpiero Date: 2021-01-23.21:24:14
* [Sat, Jan 23, 2021 at 06:21:10PM +0000] Ben Franksen:
>> $ darcs clone localhost.:$(pwd)/R remote-local
>
>What does the single "." after "localhost" mean?

From a DNS resolver's point of view it denotes the root domain, so the 
resolver knows it is a fully qualified domain name (e.g. the real FQDN 
for 'www.darcs.net' is 'www.darcs.net.'; the trailing dot is just 
usually omitted and taken for granted).
From SSH's point of view, since I use a few Canonical* options in my 
configuration, in conjunction with CanonicalizeMaxDots=0 it means: do 
not canonicalize the provided hostname.
So you can just consider 'localhost.' equivalent to 'localhost' in the 
general case.
msg22628 Author: bf Date: 2021-01-24.10:52:55
>  From a DNS resolver's point of view it denotes the root domain, so the 
> resolver knows it is a fully qualified domain name

Thanks, I didn't know that.

> So you can just consider 'localhost.' equivalent to 'localhost' for the 
> general case.

Okay, got it.
msg22630 Author: bf Date: 2021-01-24.11:17:19
I think I caused this misbehavior when I tried to make clone-to-ssh 
("put") work with DARCS_SCP=rsync. rsync is often quite a lot faster 
than scp but has slightly different semantics when recursively copying 
directories. The relevant code is in src/Darcs/UI/Commands/Clone.hs, 
lines 210-214.
msg22958 Author: bf Date: 2022-04-10.11:19:53
I think this has been fixed with:

patch 102a6eac17ebf2c516714046c312a31dbfcc2c6b
Author: Ben Franksen <ben.franksen@online.de>
Date:   Wed Feb 12 16:37:18 CET 2020
  * fix clone to ssh: need to check remote target dir does not exist
msg22981 Author: gpiero Date: 2022-04-13.07:43:39
* [Sun, Apr 10, 2022 at 11:19:53AM +0000] Ben Franksen:
>I think this has been fixed with:
>
>patch 102a6eac17ebf2c516714046c312a31dbfcc2c6b

Not sure, but I think that patch could have introduced it. The solution 
should be in e71755510f765efe468b916c020c585dd34c7c6f from patch2232 
(not tested).
msg22984 Author: bf Date: 2022-04-13.08:46:52
You are mostly right. My original mistake was to rely on the special 
behavior of `rsync -r source/ dest`, i.e. when the source ends with a 
slash, wrongly assuming that scp behaved in the same way. This is what 
caused all the trouble to begin with! Instead of fixing this root 
cause, I added a fix (102a6eac17) to guard against overwriting an 
existing directory, which again worked only because of this special 
behavior of rsync, but resulted in the duplicate directory when using 
scp (the bug reported here). The correct fix, in e71755510, ensures that 
the destination directory does not exist on the remote host, so the 
difference in behavior no longer matters.

I will follow up with a patch that adds documentation (comments) to the 
code and removes the addition of the trailing slash when invoking the 
remote copy operation; the latter in order to avoid any difference in 
behavior creeping back in. I think I will also add explicit setting of 
'export DARCS_SCP=scp' to tests/network/ssh.sh to avoid future 
regressions.
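
The idea of refusing to copy into an existing destination can be illustrated locally. This is a sketch under stated assumptions, not the code from e71755510: `copy_repo` is a hypothetical name, and `cp -r` stands in for the remote copy because, like `scp -r`, it places the source inside a destination directory that already exists.

```shell
# Hypothetical local stand-in for the remote copy step (not darcs code).
# `cp -r src dest` creates dest as a copy of src when dest is absent, but
# creates dest/<basename of src> when dest already exists, so the guard
# below refuses to run in the second case.
copy_repo() {
    src=$1
    dest=$2
    if [ -e "$dest" ]; then
        echo "refusing to copy: $dest already exists" >&2
        return 1
    fi
    cp -r "$src" "$dest"
}
```

Calling `copy_repo R clone` twice in a row should succeed the first time and fail the second, leaving no `clone/R` directory behind.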
msg22985 Author: bf Date: 2022-04-13.11:12:03
Contrary to my last comment, I found the following strange behavior of rsync:

> rm -r empty repo
> mkdir empty repo; touch repo/file; rsync -r repo empty/repo; find empty
empty
empty/repo
empty/repo/repo
empty/repo/repo/file

> rm -r empty repo
> mkdir empty repo; touch repo/file; scp -r repo/ empty/repo; find empty
empty
empty/repo
empty/repo/file

Which means the trailing slash is essential for compatibility between scp and 
rsync, even if the destination directory does not yet exist.
msg22986 Author: bf Date: 2022-04-13.21:14:07
And finally

> rm -r empty repo; mkdir empty repo; touch repo/file; rsync -r repo/ empty/repo; find empty

empty
empty/repo
empty/repo/file

so same behavior as with scp.
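
Since these experiments show the trailing slash on the source is what makes rsync and scp agree, one way to keep it from getting lost is to normalize the source path before invoking the copy. A minimal sketch, assuming a hypothetical helper `with_trailing_slash` (not taken from darcs):

```shell
# Hypothetical helper: print the given path with exactly one trailing slash,
# regardless of how many (zero or more) it had to begin with.
with_trailing_slash() {
    path=$1
    # Strip all trailing slashes first...
    while [ "${path%/}" != "$path" ]; do
        path=${path%/}
    done
    # ...then add back a single one.
    printf '%s/\n' "$path"
}
```

It could then be used as `rsync -r "$(with_trailing_slash repo)" empty/repo`, so that `repo`, `repo/` and `repo//` are all passed to the copy program as `repo/`.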
History
Date                 User    Action  Args
2021-01-23 08:54:34  gpiero  create
2021-01-23 18:21:10  bf      set     messages: + msg22625
2021-01-23 21:24:18  gpiero  set     messages: + msg22626
2021-01-24 10:52:58  bf      set     messages: + msg22628
2021-01-24 11:17:22  bf      set     messages: + msg22630
2022-04-10 11:19:53  bf      set     status: unknown -> resolved; messages: + msg22958
2022-04-13 07:43:39  gpiero  set     status: resolved -> unknown; messages: + msg22981
2022-04-13 08:46:53  bf      set     messages: + msg22984
2022-04-13 11:12:03  bf      set     messages: + msg22985
2022-04-13 21:14:07  bf      set     messages: + msg22986