darcs

Patch 362 Failing test case for issue1932

Title Failing test case for issue1932
Superseder Accept issue1932: darcs add -qr . should not fail on f...
View: 367
Nosy List dastapov
Related Issues isFile skips real files with colons, breaks "add" in darcs 2.4.98
View: 1932
Status obsoleted Assigned To
Milestone

Created on 2010-08-24.10:32:36 by dastapov, last changed 2010-08-26.15:45:20 by kowey.

Files
File name Status Uploaded Type Edit Remove
failing-issue1932-colon-breaks-add.sh.patch dastapov, 2010-08-24.10:32:36 application/octet-stream
unnamed dastapov, 2010-08-25.21:25:52 text/html
unnamed dastapov, 2010-08-26.03:55:16 text/html
See mailing list archives for discussion on individual patches.
Messages
msg12283 (view) Author: dastapov Date: 2010-08-24.10:32:36
Simplies possible test case
Attachments
msg12293 (view) Author: kowey Date: 2010-08-25.14:51:18
Hi Dmitry, many thanks for improving our test suite! It makes bugs 
easier to solve.  

About the test: is "0401:19d2" really a minimal example of a filename 
with a colon on it?  Maybe something simpler take makes fewer 
suggestions about what may be the actual problem would be good.  Adding 
to that, it might also be nice to have *two* examples, one showing the 
expected behaviour on colons (c:/foo should be recognised as a windows 
path) and the minimal difference which makes Darcs go wrong (cx:/foo 
probably should not).

Some admin stuff: may I ask for a little bit of extra help on your part? 
Could you re-submit this patch as a Darcs patch, following the 
conventions in http://wiki.darcs.net/Development/RegressionTests ? 
Little things like this help us (especially when multiplied by hopefully 
lots of users).

[NB: if Dmitry does not get a chance to do this by 2010-09-01, we should 
just make a patch on his behalf, probably using one of our names, citing 
Dmitry in the patch log, which is ultimately less confusing IMHO]

Also, it's perfectly fine (and admirable) to license your test as public 
domain, but I *think* [A] the "Public domain" and "(C)" are 
contradictory so you should probably choose one or the other and [B] 
what would really save effort on our part is to just use the MIT license 
from EXAMPLE.sh (because I think we'll just be tacking it on anyway).  
If you're at all curious about this sort of thing, see 
http://lists.osuosl.org/pipermail/darcs-users/2010-August/024770.html 
for my education on these matters.
msg12298 (view) Author: dastapov Date: 2010-08-25.16:31:28
On Wed, Aug 25, 2010 at 5:51 PM, Eric Kow <bugs@darcs.net> wrote:

>
> Eric Kow <kowey@darcs.net> added the comment:
>
> Hi Dmitry, many thanks for improving our test suite! It makes bugs
> easier to solve.
>
> About the test: is "0401:19d2" really a minimal example of a filename
> with a colon on it?  Maybe something simpler take makes fewer
> suggestions about what may be the actual problem would be good.  Adding
> to that, it might also be nice to have *two* examples, one showing the
> expected behaviour on colons (c:/foo should be recognised as a windows
> path) and the minimal difference which makes Darcs go wrong (cx:/foo
> probably should not).
>

Yeah, that would be nice.

Let's chat a bit first, though. The real issue at hand is with handling
"tricky" characters in file names. On Unix, I can have ':' and '?' or '<' in
my filename, while on Windows they are forbidden. It seems to me that no
attempt is made to handle this characters in some special way anywhere in
the darcs, so I guess that the official position is that users are allowed
to shoot themselves in the foot however they like.

I am fine with that and will assume this much for the rest of the text. So,
the correct behavior for darcs would be to add files like "aaa:bbb" and even
"c:\src" to the repository, as long as those files really exist and are
really files.

Now, colon has a special role in Darcs, since it is used in names of remote
scp-accessible repositories. It seems that a colon in repository name should
automatically mean that it is either a HTTP repo or an scp repo. Unless we
make a drastical change and start requiring "scp://" in front of the scp
repo names, all colons in repo dir names should be disallowed.

Then, functions in Darcs.URL clearly serve to distinguish between possible
forms of repo names.
* isUrl should be true for names like http://....
* isSsh should be true for all names that are not URLs and containin ':'
* isFile should be true whenever both isSSH and isURL are false.

I browsed through the source and isFile, isSsh, and isUrl are indeed used to
differentiate between possible forms of repository names everywhere, except
for the "isRelative" in URL.hs

My proposal is:
1)Replace isRelative from URL.hs by isRelative from System.FilePath, which
should take into account differences between platforms and handle colon
under unix in sensible way.

2)Submit the proposal to mark ssh-accesible repositories by "ssh://" and
replace a horrible ad-hoc implementations of isSsh and isFile with much
better ad-hoc implementation:

isSsh f = "ssh://" `isPrefixOf` f
isUrl f = not (isSsh f) && "://" `isInfixOf` f
isFile f = not (isSsh f || isUrl f)

Do you agree?

I'll be submitting the patch for (1) shortly (along with proper test-case).



> Some admin stuff: may I ask for a little bit of extra help on your part?
> Could you re-submit this patch as a Darcs patch, following the
> conventions in http://wiki.darcs.net/Development/RegressionTests ?
>
Will do. Likewise for licensing.

-- 
Dmitry Astapov
Attachments
msg12311 (view) Author: dastapov Date: 2010-08-25.21:25:52
On Wed, Aug 25, 2010 at 5:51 PM, Eric Kow <bugs@darcs.net> wrote:

>
> Eric Kow <kowey@darcs.net> added the comment:
>
> Hi Dmitry, many thanks for improving our test suite! It makes bugs
> easier to solve.
>
> About the test: is "0401:19d2" really a minimal example of a filename
> with a colon on it?  Maybe something simpler take makes fewer
> suggestions about what may be the actual problem would be good.  Adding
> to that, it might also be nice to have *two* examples, one showing the
> expected behaviour on colons (c:/foo should be recognised as a windows
> path) and the minimal difference which makes Darcs go wrong (cx:/foo
> probably should not).
>

Yeah, that would be nice.

Let's chat a bit first, though. The real issue at hand is with handling
"tricky" characters in file names. On Unix, I can have ':' and '?' or '<' in
my filename, while on Windows they are forbidden. It seems to me that no
attempt is made to handle this characters in some special way anywhere in
the darcs, so I guess that the official position is that users are allowed
to shoot themselves in the foot however they like.

I am fine with that and will assume this much for the rest of the text. So,
the correct behavior for darcs would be to add files like "aaa:bbb" and even
"c:\src" to the repository, as long as those files really exist and are
really files.

Now, colon has a special role in Darcs, since it is used in names of remote
scp-accessible repositories. It seems that a colon in repository name should
automatically mean that it is either a HTTP repo or an scp repo. Unless we
make a drastical change and start requiring "scp://" in front of the scp
repo names, all colons in repo dir names should be disallowed.

Then, functions in Darcs.URL clearly serve to distinguish between possible
forms of repo names.
* isUrl should be true for names like http://....
* isSsh should be true for all names that are not URLs and containin ':'
* isFile should be true whenever both isSSH and isURL are false.

I browsed through the source and isFile, isSsh, and isUrl are indeed used to
differentiate between possible forms of repository names everywhere, except
for the "isRelative" in URL.hs

My proposal is:
1)Replace isRelative from URL.hs by isRelative from System.FilePath, which
should take into account differences between platforms and handle colon
under unix in sensible way.

2)Submit the proposal to mark ssh-accesible repositories by "ssh://" and
replace a horrible ad-hoc implementations of isSsh and isFile with much
better ad-hoc implementation:

isSsh f = "ssh://" `isPrefixOf` f
isUrl f = not (isSsh f) && "://" `isInfixOf` f
isFile f = not (isSsh f || isUrl f)

Do you agree?

I'll be submitting the patch for (1) shortly (along with proper test-case).



> Some admin stuff: may I ask for a little bit of extra help on your part?
> Could you re-submit this patch as a Darcs patch, following the
> conventions in http://wiki.darcs.net/Development/RegressionTests ?
>
Will do. Likewise for licensing.

-- 
Dmitry Astapov
Attachments
msg12312 (view) Author: dastapov Date: 2010-08-26.03:55:16
On Wed, Aug 25, 2010 at 5:51 PM, Eric Kow <bugs@darcs.net> wrote:

>
> Eric Kow <kowey@darcs.net> added the comment:
>
> Hi Dmitry, many thanks for improving our test suite! It makes bugs
> easier to solve.
>
> About the test: is "0401:19d2" really a minimal example of a filename
> with a colon on it?  Maybe something simpler take makes fewer
> suggestions about what may be the actual problem would be good.  Adding
> to that, it might also be nice to have *two* examples, one showing the
> expected behaviour on colons (c:/foo should be recognised as a windows
> path) and the minimal difference which makes Darcs go wrong (cx:/foo
> probably should not).
>

Yeah, that would be nice.

Let's chat a bit first, though. The real issue at hand is with handling
"tricky" characters in file names. On Unix, I can have ':' and '?' or '<' in
my filename, while on Windows they are forbidden. It seems to me that no
attempt is made to handle this characters in some special way anywhere in
the darcs, so I guess that the official position is that users are allowed
to shoot themselves in the foot however they like.

I am fine with that and will assume this much for the rest of the text. So,
the correct behavior for darcs would be to add files like "aaa:bbb" and even
"c:\src" to the repository, as long as those files really exist and are
really files.

Now, colon has a special role in Darcs, since it is used in names of remote
scp-accessible repositories. It seems that a colon in repository name should
automatically mean that it is either a HTTP repo or an scp repo. Unless we
make a drastical change and start requiring "scp://" in front of the scp
repo names, all colons in repo dir names should be disallowed.

Then, functions in Darcs.URL clearly serve to distinguish between possible
forms of repo names.
* isUrl should be true for names like http://....
* isSsh should be true for all names that are not URLs and containin ':'
* isFile should be true whenever both isSSH and isURL are false.

I browsed through the source and isFile, isSsh, and isUrl are indeed used to
differentiate between possible forms of repository names everywhere, except
for the "isRelative" in URL.hs

My proposal is:
1)Replace isRelative from URL.hs by isRelative from System.FilePath, which
should take into account differences between platforms and handle colon
under unix in sensible way.

2)Submit the proposal to mark ssh-accesible repositories by "ssh://" and
replace a horrible ad-hoc implementations of isSsh and isFile with much
better ad-hoc implementation:

isSsh f = "ssh://" `isPrefixOf` f
isUrl f = not (isSsh f) && "://" `isInfixOf` f
isFile f = not (isSsh f || isUrl f)

Do you agree?

I'll be submitting the patch for (1) shortly (along with proper test-case).



> Some admin stuff: may I ask for a little bit of extra help on your part?
> Could you re-submit this patch as a Darcs patch, following the
> conventions in http://wiki.darcs.net/Development/RegressionTests ?
>
Will do. Likewise for licensing.

-- 
Dmitry Astapov
Attachments
msg12315 (view) Author: kowey Date: 2010-08-26.15:45:20
On Wed, Aug 25, 2010 at 19:34:05 +0300, Dmitry Astapov wrote:
> Let's chat a bit first, though. The real issue at hand is with handling
> "tricky" characters in file names. On Unix, I can have ':' and '?' or '<' in
> my filename, while on Windows they are forbidden. It seems to me that no
> attempt is made to handle this characters in some special way anywhere in
> the darcs, so I guess that the official position is that users are allowed
> to shoot themselves in the foot however they like.

Note that Darcs makes some effort to detect case sensitivity issues and
reserved filenames (eg. AUX or COM[1-9]), saying "hey, if you add this
file, Darcs is going to have a lot of trouble on Windows" (not an actual
UI message).

So we do seem to discourage shooting self in foot, but making provisions
for people to opt-in with flags like --case-ok and --reserved-ok. Is it
just the case that --reserved-ok isn't working hard enough to pick up
the ':', '?' and '<' characters?

See issue53 for more details

> I am fine with that and will assume this much for the rest of the text. So,
> the correct behavior for darcs would be to add files like "aaa:bbb" and even
> "c:\src" to the repository, as long as those files really exist and are
> really files.

So following the current Darcs UI, even if we did discourage the user
for adding these, it is conceivable for such filenames to exist

> Now, colon has a special role in Darcs, since it is used in names of remote
> scp-accessible repositories. It seems that a colon in repository name should
> automatically mean that it is either a HTTP repo or an scp repo.

Oh, of course!  Very sorry for forgetting about this; it didn't occur to
me as I was mechanically doing my BTS triage.

> Unless we make a drastical change and start requiring "scp://" in
> front of the scp repo names, all colons in repo dir names should be
> disallowed.

*Requiring* ssh repos to be URIs could be tricky and perhaps involve
a painful deprecation process (which isn't to say we shouldn't do it
just that it will take some dedication)

On the other hand, supporting such paths seems to be quite clearly
a good thing (see issue1575, and also issue1576 for a bonus feature)

> Then, functions in Darcs.URL clearly serve to distinguish between possible
> forms of repo names.
> * isUrl should be true for names like http://....
> * isSsh should be true for all names that are not URLs and containin ':'
> * isFile should be true whenever both isSSH and isURL are false.
> 
> I browsed through the source and isFile, isSsh, and isUrl are indeed used to
> differentiate between possible forms of repository names everywhere, except
> for the "isRelative" in URL.hs

By the way, isRelative seems to now have a redundant colon-checking case
(redundant with isFile).  Not sure if there's some sort of wise
robustness thing behind it (like code-change-proofing) or if it was just
an oversight.

> 1)Replace isRelative from URL.hs by isRelative from System.FilePath, which
> should take into account differences between platforms and handle colon
> under unix in sensible way.

Would isFile && System.FilePath.isRelative have the same benefits?

Note that in *general* Darcs code, we have a policy of using exclusively
System.FilePath.Posix out of conservatism (filepaths are a really
important part of darcs as they are part of many patch types, eg.
hunk patches, so we want to make sure that we're always handling them
*exactly* the same way)

> 2)Submit the proposal to mark ssh-accesible repositories by "ssh://" and
> replace a horrible ad-hoc implementations of isSsh and isFile with much
> better ad-hoc implementation:
> 
> isSsh f = "ssh://" `isPrefixOf` f
> isUrl f = not (isSsh f) && "://" `isInfixOf` f
> isFile f = not (isSsh f || isUrl f)

Part of this is issue1575.  Well, again here we're running into the
conservatism issue.  This time it's the
dont-break-what-users-are-used-to-unless-you-have-a-good-reason
conservatism (and not the
whoah-be-careful-not-to-accidentally-make-something-wrong
conservatism from above).

So I want to be very very very cautious about it and make sure that we
are thinking long and hard about what we're about to do.

Thanks for checking carefully!  The path handling in Darcs is in a
somewhat sorry state, and I would be thrilled to see it cleaned up.
You may want to use Search functionality on the tracker to look for
tickets with topic FilePath and status "not resolved".  Also,
patch343 could be of interest.

-- 
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
For a faster response, try +44 (0)1273 64 2905 or
xmpp:kowey@jabber.fr (Jabber or Google Talk only)
History
Date User Action Args
2010-08-24 10:32:36dastapovcreate
2010-08-25 14:51:18koweysetstatus: needs-review -> followup-requested
messages: + msg12293
2010-08-25 16:31:28dastapovsetfiles: + unnamed
messages: + msg12298
2010-08-25 17:34:53dastapovsetfiles: - unnamed
2010-08-25 17:37:48dastapovsetstatus: followup-requested -> obsoleted
superseder: + Accept issue1932: darcs add -qr . should not fail on f...
2010-08-25 21:25:53dastapovsetfiles: + unnamed
messages: + msg12311
2010-08-26 03:55:16dastapovsetfiles: + unnamed
messages: + msg12312
2010-08-26 15:45:20koweysetmessages: + msg12315