darcs

Issue 2275 Darcs follows symbolic links instead of properly ignoring them

Title Darcs follows symbolic links instead of properly ignoring them
Priority Status unknown
Milestone Resolved in
Superseder Nosy List gpiero
Assigned To
Topics

Created on 2012-12-11.08:11:39 by gpiero, last changed 2017-04-01.20:55:25 by bf.

Files
File name Uploaded Type Edit Remove
issuexxxx.sh gpiero, 2012-12-11.08:11:37 application/x-sh
Messages
msg16391 (view) Author: gpiero Date: 2012-12-11.08:11:37
It seems that sometimes darcs follows symbolic links instead of ignoring 
them. It surely happens when substituting a recorded file with a 
symbolic link.

A description of the problem is appended below and a minimal test is 
attached.

Currently using v2.9.6, but I've found the same problem in 2.8 and, 
IIRC, also in 2.5. Could not test older versions.

Thanks,
Gian Piero.

----------- darcs version

$ darcs --exact-version
darcs compiled on Dec  4 2012, at 22:31:54
...
$ darcs --version 
2.9.6 (+ 26 patches)


----------- steps to reproduce

       # -- initialize the repository
$ darcs ini ; echo ImmutableFile > file ; darcs rec -lam init
Finished recording patch 'init'

       # -- add a test file and record the patch
$ echo TemporaryFile > maybeFile ; darcs rec -lam 'Add maybeFile'
Finished recording patch 'Add maybeFile'

       # -- remove the just added file and check that darcs recognizes the 
       # removal
$ rm maybeFile
$ darcs wh -s
R ./maybeFile
$ darcs wh -ls
R ./maybeFile

       # -- create a symbolic link with the same name of the just removed 
       # file pointing to an existent file (does not need to be in the 
       # repodir)
$ ln -s file maybeFile

       # -- now you get different opinions about what changed depending on 
       # the use of the '--look-for-adds' option. In both case 'maybeFile' 
       # is wrongly reported as changed (the current content of 'maybeFile' 
       # is assumed to be the one of the file the link points to), anyway 
       # using the '-l' option you also get the right information that 
       # 'maybeFile' has been removed.
$ darcs wh -s
M ./maybeFile -1 +1
$ darcs wh -ls
M ./maybeFile -1 +1
R ./maybeFile

       # -- trying to record the changes without the use of '-l' leads to 
       # the wrong patch being recorded
$ darcs rec -am 'Maybe remove maybeFile'
Finished recording patch 'Maybe remove maybeFile'
$ darcs cha --last 1 -s
Mon Dec 10 23:15:44 CET 2012  "Gian Piero Carrubba" <gpiero@rm-rf.it>
     * Maybe remove maybeFile

       M ./maybeFile -1 +1

       # -- unrecord the wrong patch
$ darcs unrec --last 1 -a
Finished unrecording.

       # -- passing '-l' to the record command leads to the right patch 
       # being  recorded.
$ darcs rec -lam 'Remove maybeFile'
Finished recording patch 'Remove maybeFile'
$ darcs cha --last 1 -s
Mon Dec 10 23:24:22 CET 2012  "Gian Piero Carrubba" <gpiero@rm-rf.it>
     * Remove maybeFile

       R ./maybeFile

       # -- now if you unrecord the last patch the `-l' option does not 
       # make a difference anymore
$ darcs unrec --last 1 -a
Finished unrecording.
$ darcs wh -s
R ./maybeFile
$ darcs wh -ls
R ./maybeFile

       # -- recreate the issue
$ darcs rec -lam 'Remove maybeFile'
Finished recording patch 'Remove maybeFile'
$ echo AnotherTemporaryFile > maybeFileTwo
$ darcs rec -lam 'Add maybeFileTwo'
Finished recording patch 'Add maybeFileTwo'
$ rm maybeFileTwo ; ln -s file maybeFileTwo

       # -- giving a hint to darcs about the removal of the file is another 
       # way for solving
$ darcs wh -s
M ./maybeFileTwo -1 +1
$ darcs wh -ls
M ./maybeFileTwo -1 +1
R ./maybeFileTwo
$ darcs remove maybeFileTwo
$ darcs wh -s
R ./maybeFileTwo
$ darcs wh -ls
R ./maybeFileTwo
$ darcs rec -am 'Remove maybeFileTwo'
Finished recording patch 'Remove maybeFileTwo'
$ darcs cha --last 1 -s
Tue Dec 11 00:37:10 CET 2012  "Gian Piero Carrubba" <gpiero@rm-rf.it>
     * Remove maybeFileTwo

       R ./maybeFileTwo

       # -- create again the issue, this time giving the file the same 
       # content of the file referenced by the link so that 'wh' will not 
       # report a modified file
$ echo ImmutableFile > maybeFileThree
$ darcs rec -lam 'Add maybeFileThree'
Finished recording patch 'Add maybeFileThree'
$ rm maybeFileThree ; ln -s file maybeFileThree

       # -- in this case 'wh' without '-l' reports no changes
$ darcs wh -s
No changes!
$ darcs wh -ls
R ./maybeFileThree
$ darcs rec -lam 'Remove maybeFileThree'
Finished recording patch 'Remove maybeFileThree'

       # -- create again the problem, pointing the symbolic link to a 
       # not-existent file
$ echo JustAnotherTemporaryFile > maybeFileFour
$ darcs rec -lam 'Add maybeFileFour'
Finished recording patch 'Add maybeFileFour'
$ rm maybeFileFour ; ln -s not-existent maybeFileFour

       # -- this time, darcs croaks
$ darcs wh -s
darcs: ...: openBinaryFile: does not exist (No such file or directory)
$ darcs wh -ls
darcs: ...: openBinaryFile: does not exist (No such file or directory)

       # -- cannot use the same workarounds that fixed the other cases
$ darcs remove maybeFileFour
darcs: ...: openBinaryFile: does not exist (No such file or directory)
$ darcs rec -lam 'Remove maybeFileFour'
darcs: ...: does not exist (No such file or directory)

       # -- remove the link seems the only way to go
$ rm maybeFileFour
$ darcs wh -s
R ./maybeFileFour
$ darcs wh -ls
R ./maybeFileFour
Attachments
msg16392 (view) Author: markstos Date: 2012-12-11.13:58:26
Darcs Manual doesn't claim to support this use of symlinks one way or the 
other, but at a minimum, I think it would be good to make the behavior 
consistent with and without --look-for-adds.
msg16393 (view) Author: gpiero Date: 2012-12-11.18:41:22
Please let me stress that consistence with and without --look-for-adds 
is not enough and the current behaviour without --look-for-adds is 
self-inconsistent.

EDIT: I ended up with a long message beyond my initial intentions.  
Anyway the important part is the 'user experience' one (last two 
paragraphs).

-----

$ darcs ini --repo R
$ echo Sample Text > R/f
$ echo Ops, wrong file > R/g
$ darcs rec -lam init --repo R
Finished recording patch 'init'
$ darcs get R S
Copying patches, to get lazy repository hit ctrl-C...
Finished getting.
$ rm R/g ; ln -s f R/g
$ darcs rec -am WRONG --repo R
Finished recording patch 'WRONG'
$ darcs diff -u -p WRONG --repo R
Tue Dec 11 17:48:46 CET 2012  "Gian Piero Carrubba" <gpiero@rm-rf.it>
   * WRONG
diff -rN -u old-R/g new-R/g
--- old-R/g     2012-12-11 17:48:54.120648808 +0100
+++ new-R/g     2012-12-11 17:48:54.120648808 +0100
@@ -1 +1 @@
-Ops, wrong file
+Sample Text
$ darcs push -a --repo R S
Finished applying...
Push successful.

-----

Now we have a file and a symlink to it in R and two files with the same 
content in S. Assuming that
- we want to track the content and not the container
- the content of a symlink is defined as the content of the dereferenced 
   file
we can still consider the behaviour until now as correct.

Anyway now the things become really messy.

-----

$ echo A new line >> R/f
$ darcs rec -am FAIL --repo R
Finished recording patch 'FAIL'
$ darcs diff -u -p FAIL --repo R
Tue Dec 11 17:57:51 CET 2012  "Gian Piero Carrubba" <gpiero@rm-rf.it>
   * FAIL
diff -rN -u old-R/f new-R/f
--- old-R/f     2012-12-11 17:58:02.354956537 +0100
+++ new-R/f     2012-12-11 17:58:02.358956575 +0100
@@ -1 +1,2 @@
  Sample Text
+A new line
$ darcs push -a --repo R S
Finished applying...
Push successful.

-----

Now we still have a file and a symlink to it in R, but two files with 
different contents in S. Wait a moment, we still have an unrecorded 
change in R so maybe reverting it we can really have two repos 
containing the same patches and the same content.

-----

$ darcs wh -ls --repo R
R ./g
$ darcs rev -a --repo R
There are no changes to revert!
Finished reverting.

-----

Mmmh, maybe forcing ? I assume that reverting the removal of a file 
means re-adding it...

-----

$ darcs add --repo R R/g
The following file is already in the repository;
note that to ensure portability we don't allow
files that differ only in case. Use --case-ok to override this:
g

darcs failed:  No files were added

-----

No way.

At this point all sort of things can happen to your poor repo :)
Now, to be fair, just let me add a case in which darcs auto-magically 
fix the anomaly. 

-----

$ echo Another new line >> S/g
$ darcs rec -am PATCH --repo S
Finished recording patch 'PATCH'
$ darcs push -a --repo S R
Finished applying...
Push successful.

-----

Now magically the symlink R/g has been substituted by a file with the 
same name. The content still isn't the same as S/g, but we again have an 
unrecorded change in R (a different one, this time) and simply reverting 
it leads to the situation in which we have two repos containing the same 
patches and the same content.
It's great (sort of: not sure I like darcs removed my symlink w/o 
letting me know), but what if the normal flow is mono-directional from R 
to S ?

As a darcs user I know darcs doesn't support symlinks and I can cope 
with it. I assume it simply ignores them, as this is the behaviour I've 
observed almost all the times. This is a minor annoyance that can be 
fixed with a hook script when needed. Anyway I'm happy remembering the 
following rules about symlinks:
1. symlinks are ignored.

Instead we have that:
1. symlinks are ignored.
2. rule 1) is not true when they have the same name of a just deleted 
    file if the removal patch hasn't been recorded. In this case the 
    symlink is considered a sort of virtual file whose content is the 
    content of the file the link points to.
3. rule 2) is not true if the file has been deleted via 'darcs remove'.
4. when 2) is in effect, '--look-for-adds' also acts as a sort of 
    '--look-for-removals'.
5. when 2) is in effect, it applies only to some commands. I.e., pulling 
    a patch could lead to the removal of the link (w/o tracking of the 
    removal) if a file with the same name is affected (so effectively 
    acting as if the symlink is ignored by darcs).
6. ? (forgotten something?)

I don't think this scenario well into the overall usage simplicity of 
darcs.

Ciao,
Gian Piero.
msg19372 (view) Author: bf Date: 2017-03-15.18:13:33
I fully agree that the current behaviour is a mess, as you amply
illustrated. I am not so sure ignoring symlinks is the right idea,
however. I would say Darcs should guard against them: as soon as you try
to add one or replace a file or directory that Darcs knows about with a
symlink, or in just about any situation where Darcs accesses a file or
directory (that is or is about to become version controlled), Darcs
should fail with an error message, unless you pass an explicit switch
(--resolve-symbolic-links).
msg19395 (view) Author: gpiero Date: 2017-03-25.10:01:14
* [Wed, Mar 15, 2017 at 06:13:35PM +0000] Ben Franksen:
>I fully agree that the current behaviour is a mess, as you amply
>illustrated. I am not so sure ignoring symlinks is the right idea,
>however. I would say Darcs should guard against them: as soon as you try
>to add one or replace a file or directory that Darcs knows about with a
>symlink, or in just about any situation where Darcs accesses a file or
>directory (that is or is about to become version controlled), Darcs
>should fail with an error message, unless you pass an explicit switch
>(--resolve-symbolic-links).

I'm not sure. While at first it seems sensible to me, after some 
thoughts there are some points that led me to think this is not viable 
(or at least not the preferred option).

First of all: consistency. I don't expect darcs behaving differently wrt 
symlinks depending if they happen to be named as a just removed file or 
not. And what if the filesystem is both case insensitive and supports 
symlinks?[0]
Unless you suggest to always deal with symlinks this way. But at this 
point, why not properly support them as symlinks (so, both storing and 
recreating as such)?[1]

Another point is someway related to security concerns. The destination 
of the symlink could reside outside the repo, so at the moment this 
could lead to some unintentional info leaks. Supporting a 
resolve-symlink option put the developers in front of a choice:

a. supporting it "read-only" (only when storing a patch)
  symlinks should be checked for not linking to files outside the repo, 
  in order to prevent the unintentional info leaks.
  In general, this doesn't seem an improvement wrt what we have now: 
  repos cannot be reliably cloned, two synced repos could not be really 
  in sync, ...

b. supporting it "read-n-write" (also when cloning, apply patches,...)
  Maybe this could solve the problem regarding synced repos that go out 
  of sync, but other problems remain. Moreover, if darcs does not triple 
  check that the destination file is inside the repo, this could also 
  lead to remote code execution.

At last, I think I first noticed the current behaviour because using 
etckeeper (script for storing changes in /etc) with darcs as backend.  
Debian started to move some files out of /etc and replacing them with 
symlinks and this let me discover the issue. In this context, and I 
guess in many others, failing with an error or resolving the symlinks 
are both undesired behaviours and at least we need another option 
(ignore-symbolic-links).

Best,
Gian Piero.

[0] I didn't check for any real application of this use case, that I 
could imagine being pretty marginal, but nevertheless should be 
accounted too.

[1] The long-standing reason for not implementing support for symlinks, 
as I recall it, is that Windows does not support them. Does it still 
stand? I think I've read something during the last months about Windows 
starting implementing them. And even if this is not the case, couldn't 
darcs treat them similarly as the --case-ok option, so warning the user 
and let them decide if they want a portable repo or not?
msg19399 (view) Author: bf Date: 2017-03-26.08:10:29
I think it is not possible to support symlinks properly in Darcs without
changing the patch format. This is currently not practical with the
limited manpower and considering that there are other problems with the
current patch format that we might want to resolve first.

What else can we do? At the end of your last message (about similarity
to --case-ok) you seem to suggest that giving users the choice is the
best way. I am now thinking that --resolve-symlinks is a pretty bad idea
and instead we should rather have --ignore-symlinks (and still give a
warning if we see a link that targets a tracked file or directory), with
 --refuse-symlinks being the default (effectively turning the warning to
an error). The warning is important: consider the case where I remove a
file and replace it with a symlink to another file. With the regular
file system commands it looks as if I had changed the file, but Darcs
must report it as deleted (with --ignore-symlinks): this is too
surprising for my taste without a hint to the user.
msg19400 (view) Author: bf Date: 2017-03-26.08:19:52
Re-reading what this started with, it seems I (now) violently agree with
you :)

Except that I think we should issue a warning unconditionally.
msg19414 (view) Author: gpiero Date: 2017-04-01.15:37:47
* [Sun, Mar 26, 2017 at 08:10:30AM +0000] Ben Franksen:
>I think it is not possible to support symlinks properly in Darcs 
>without changing the patch format. This is currently not practical with 
>the limited manpower and considering that there are other problems with 
>the current patch format that we might want to resolve first.

Yeah, I guessed it. I meant it as a wishlist for a darcs-3 patch format 
(I intended to send a followup to the ml with my wishes for a darcs-3 
format, but stopped as I wanted to reconsider it more).

>I am now thinking that --resolve-symlinks is a pretty bad idea
>and instead we should rather have --ignore-symlinks (and still give a
>warning if we see a link that targets a tracked file or directory), with
> --refuse-symlinks being the default (effectively turning the warning to
>an error).

Seems good to me. I still prefer to have '--ignore-symlinks' as default, 
but that's what the defaults file is for.

Ciao,
Gian Piero.
msg19418 (view) Author: bf Date: 2017-04-01.20:55:24
Am 01.04.2017 um 17:37 schrieb Gian Piero Carrubba:
>> I am now thinking that --resolve-symlinks is a pretty bad idea
>> and instead we should rather have --ignore-symlinks (and still give a
>> warning if we see a link that targets a tracked file or directory), with
>> --refuse-symlinks being the default (effectively turning the warning to
>> an error).
> Seems good to me. I still prefer to have '--ignore-symlinks' as default, 
> but that's what the defaults file is for.

So now the problem is: where in the code do we insert such a check...
History
Date User Action Args
2012-12-11 08:11:39gpierocreate
2012-12-11 13:58:27markstossetmessages: + msg16392
2012-12-11 18:41:24gpierosetmessages: + msg16393
2017-03-15 18:13:35bfsetmessages: + msg19372
2017-03-18 19:51:18Beargallysetstatus: unknown -> needs-reproduction
priority: urgent
title: Darcs follows symbolic links instead of properly ignoring them -> Quickbooks Support Phone Number 1844 551 9757 Quickbooks Support Number
nosy: + droundy
messages: + msg19376
topic: + Community
assignedto: droundy
superseder: + (1.0.9) failure to pass tests when built with ghc 6.8.2 on ppc
files: + quickbooks payroll support number.jpg
2017-03-19 08:33:40ganeshsetfiles: - quickbooks payroll support number.jpg
2017-03-19 08:35:01ganeshsetstatus: needs-reproduction -> unknown
topic: - Community
title: Quickbooks Support Phone Number 1844 551 9757 Quickbooks Support Number -> Darcs follows symbolic links instead of properly ignoring them
nosy: - droundy
priority: urgent ->
assignedto: droundy ->
superseder: - (1.0.9) failure to pass tests when built with ghc 6.8.2 on ppc
2017-03-19 08:35:11ganeshsetmessages: - msg19376
2017-03-25 10:01:16gpierosetmessages: + msg19395
2017-03-26 08:10:30bfsetmessages: + msg19399
2017-03-26 08:19:53bfsetmessages: + msg19400
2017-04-01 15:37:49gpierosetmessages: + msg19414
2017-04-01 20:55:25bfsetmessages: + msg19418