darcs

Issue 706 encode spaces in filenames for darcs-2 format.

Title: encode spaces in filenames for darcs-2 format.
Priority: bug
Status: resolved
Milestone: 2.0.x
Resolved in:
Superseder:
Nosy List: Serware, darcs-devel, dmitry.kurochkin, kowey, simonmar, thorkilnaur, tommy
Assigned To:
Topics: Darcs2

Created on 2008-02-26.10:05:43 by simonmarhaskell, last changed 2010-06-15.21:20:19 by admin.

Messages
msg3663 Author: simonmar Date: 2008-02-26.10:05:40
I mentioned there might be a problem with filenames containing spaces.  I 
just tried a few things and managed to reproduce some strange behaviour.

'darcs' is darcs 1.0.9
'darcs2' is darcs-unstable pulled yesterday

$ mkdir foo
$ cd foo
$ darcs init
$ touch 'A B'
$ darcs add 'A B'
$ darcs rec -a
What is the patch name? fo
Do you want to add a long comment? [yn]
$ ls
A B  _darcs/
$ darcs check
Applying patch 1 of 1... done.
The repository is consistent!
$ ~/darcs/darcs-unstable/darcs check
The repository is consistent!
$ cd ..
$ ~/darcs/darcs-unstable/darcs convert foo foo2
Finished converting.
$ cd foo2
$ ls
A B  _darcs/
$ ~/darcs/darcs-unstable/darcs check
Looks like we have a difference...
Difference:  rmfile ./A

Inconsistent repository!
zsh: 20274 exit 1     ~/darcs/darcs-unstable/darcs check

This behaviour is *not* fixed by recompiling with --disable-bytestring,
incidentally.

Also, darcs2 fails to check the GHC darcs2 repository, giving this error:

rmfile ./WindowsInstaller/Glasgow\92\32\92\Haskell\92\32\92\Compiler.ism

Inconsistent repository!

Cheers,
	Simon
msg3665 Author: kowey Date: 2008-02-26.15:34:33
So... the background behind this, if I understand correctly, is that
darcs uses an internal representation for filenames (FileName).

When going from PackedStrings to FileNames, it performs an encoding of
special characters (whitespace and backslashes) by wrapping their
character code in backslashes (for example, a space gets converted to
the string \32\).  I'm guessing that this is for parsing of darcs
patch files.
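
A minimal sketch of the encoding described above, written in Python rather
than darcs's Haskell. The name `encode_name` is hypothetical and is not a
darcs function; the only behaviour taken from this thread is that whitespace
and backslashes are replaced by their decimal character code wrapped in
backslashes.

```python
def encode_name(name: str) -> str:
    """Replace each whitespace character or backslash with \\<code>\\.

    Illustrative model only: <code> is the decimal character code,
    so a space (32) becomes \\32\\ and a backslash (92) becomes \\92\\.
    """
    out = []
    for ch in name:
        if ch.isspace() or ch == "\\":
            out.append("\\%d\\" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


print(encode_name("A B"))  # A\32\B
```

Backslashes have to be encoded along with whitespace, otherwise a file whose
real name contains the text \32\ could not be told apart from an encoded space.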

Anyway, when actually applying the patches, it does not seem to decode
these special characters; for example, it does not do the conversion of
\32\ back into a space.  In particular, Darcs.IO uses fn2fp to convert
from FileName to FilePath.
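
A sketch of the missing inverse step, again as a Python model and not as
darcs code (`decode_name` is a hypothetical name). It also shows that the
path in the GHC report is what you would get if an encoded name were encoded
a second time: decoding it once yields \32\, and only a second decode yields
a space. That is an observation about the string, not a confirmed diagnosis.

```python
import re


def decode_name(encoded: str) -> str:
    """Turn every \\<decimal code>\\ group back into the character it stands for."""
    return re.sub(r"\\(\d+)\\", lambda m: chr(int(m.group(1))), encoded)


# A singly encoded name decodes straight back to the on-disk name.
print(decode_name("A\\32\\B"))  # A B

# The fragment from the GHC report needs two decoding passes.
once = decode_name("Glasgow\\92\\32\\92\\Haskell")
print(once)               # Glasgow\32\Haskell
print(decode_name(once))  # Glasgow Haskell
```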

So far this sorta seems like a reasonable explanation about what's
going on.  But why are we only noticing this now?  It seems like the
kind of thing that would have come up a very long time ago... which
makes me wonder what I'm missing.

David?
msg3666 Author: droundy Date: 2008-02-26.16:20:12
On Tue, Feb 26, 2008 at 03:34:34PM -0000, Eric Kow wrote:
> So... the background behind this, if I understand correctly is that
> darcs uses an internal representation for filenames (FileName).
> 
> When going from PackedStrings to FileNames, it performs an encoding of
> special characters (whitespace and backslashes) by wrapping their
> character code in backslashes (for example, a space gets converted to
> the string \32\).  I'm guessing that this is for parsing of darcs
> patch files.
> 
> Anyway, when actually applying the patches, it does not seem to decode
> these special characters; for example, it does not do the conversion of
> \32\ back into a space.  In particular, Darcs.IO uses fn2fp to convert
> from FileName to FilePath.
> 
> So far this sorta seems like a reasonable explanation about what's
> going on.  But why are we only noticing this now?  It seems like the
> kind of thing that would have come up a very long time ago... which
> makes me wonder what I'm missing.
> 
> David?

The change is that with the darcs-2 repository format, I changed the
on-disk format for filenames to be simple binary encoding.  Apparently,
this broke some of the parsing behavior.  :(  We should add a test that
demonstrates this.  It shouldn't be too hard to fix.
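
One way a raw space could produce the `rmfile ./A` difference reported
above, sketched in Python. This assumes a patch line of the form
`<operation> <name>` in which the name ends at the first whitespace;
`parse_patch_line` is a hypothetical helper and is not darcs's parser.

```python
def parse_patch_line(line: str) -> tuple[str, str]:
    """Split '<operation> <name>', ending the name at the first whitespace.

    Assumed model of a whitespace-delimited patch line, for illustration only.
    """
    op, rest = line.split(" ", 1)
    name = rest.split()[0]
    return op, name


# A raw space truncates the name, so the parser sees a file called ./A.
print(parse_patch_line("rmfile ./A B"))       # ('rmfile', './A')

# With the space encoded as \32\, the whole name survives the parse.
print(parse_patch_line("rmfile ./A\\32\\B"))  # ('rmfile', './A\\32\\B')
```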

The reason for this change, by the way, was two-fold.  One was that it's
cheaper not to perform a conversion.  The other was that the conversion
existed largely because I thought that FilePath was an actual unicode
string (and thus would require conversion to utf8 for safe storage).  And
as long as we were performing a conversion, we decided we might as well try
to be safe with regard to weird characters at the same time.  Alas, I seem
to have forgotten that safety in my enthusiasm for removing the double
layer of utf8 encoding.
-- 
David Roundy
Department of Physics
Oregon State University
msg3745 Author: droundy Date: 2008-03-04.16:34:22
The following patch updated the status of issue706 to be resolved in the unstable branch:

* resolved issue706: encode spaces in filenames for darcs-2 format. 

You can view the patch details online here: 
http://darcs.net/cgi-bin/darcs.cgi/unstable/?c=annotate&p=20080304161615-72aca-92292d68d16f7d521b6d9ff8b3624723e07fa746.gz
History
Date                 User             Action  Args
2008-02-26 10:05:43  simonmarhaskell  create
2008-02-26 15:34:34  kowey            set     status: unread -> unknown; nosy: droundy, tommy, beschmi, kowey, simonmarhaskell; messages: + msg3665
2008-02-26 16:20:14  droundy          set     nosy: droundy, tommy, beschmi, kowey, simonmarhaskell; messages: + msg3666
2008-02-26 22:46:43  droundy          set     priority: bug; nosy: droundy, tommy, beschmi, kowey, simonmarhaskell; topic: + Darcs2, Target-2.0
2008-02-26 22:47:14  droundy          link    issue707 superseder
2008-03-04 16:34:23  droundy          set     status: unknown -> resolved-in-unstable; nosy: droundy, tommy, beschmi, kowey, simonmarhaskell; messages: + msg3745; title: Filenames with spaces issue -> encode spaces in filenames for darcs-2 format.
2008-03-27 17:52:45  droundy          link    issue765 superseder
2008-09-04 21:32:43  admin            set     status: resolved-in-unstable -> resolved; nosy: + dagit
2009-08-06 17:55:01  admin            set     nosy: + markstos, jast, Serware, dmitry.kurochkin, darcs-devel, zooko, mornfall, simon, thorkilnaur, - droundy, simonmarhaskell
2009-08-06 20:58:52  admin            set     nosy: - beschmi
2009-08-10 22:16:07  admin            set     nosy: + simonmarhaskell, - markstos, darcs-devel, zooko, jast, Serware, mornfall
2009-08-11 00:07:38  admin            set     nosy: - dagit
2009-08-25 18:06:16  admin            set     nosy: + darcs-devel, - simon
2009-08-27 14:09:11  admin            set     nosy: tommy, kowey, darcs-devel, simonmarhaskell, thorkilnaur, dmitry.kurochkin
2009-10-24 00:39:39  admin            set     nosy: + simonmar, - simonmarhaskell
2010-06-15 21:20:18  admin            set     milestone: 2.0.x
2010-06-15 21:20:19  admin            set     topic: - Target-2.0; nosy: + Serware