Created on 2009-04-16.11:04:52 by kowey, last changed 2020-07-29.17:04:47 by bfrk.
msg7709 (view) |
Author: kowey |
Date: 2009-04-16.11:04:48 |
|
% darcs changes 首頁.page -v
Changes to 首頁.page:
Wed Apr 15 19:35:06 BST 2009 Eric Kow <E.Y.Kow@brighton.ac.uk>
* Convert \e9\a6\96\e9\a0\81.page from MoinMoin syntax to RST.
hunk ./\e9\a6\96\e9\160\\81.page 1
-#language zh-tw
-#redirect FrontPage
Sun Dec 31 14:13:58 GMT 2006 Eric Kow <q15kxe002@sneakemail.com>
* \e9\a6\96\e9\a0\81: redirect to FrontPage
addfile ./\e9\a6\96\e9\160\\81.page
hunk ./\e9\a6\96\e9\160\\81.page 1
+#language zh-tw
+#redirect FrontPage
And also
% DARCS_DONT_ESCAPE_ANYTHING=1 darcs changes 首頁.page -v
Changes to 首頁.page:
Wed Apr 15 19:35:06 BST 2009 Eric Kow <E.Y.Kow@brighton.ac.uk>
* Convert 首頁.page from MoinMoin syntax to RST.
hunk ./首?\160\?.page 1
-#language zh-tw
-#redirect FrontPage
Sun Dec 31 14:13:58 GMT 2006 Eric Kow <q15kxe002@sneakemail.com>
* 首頁: redirect to FrontPage
addfile ./首?\160\?.page
hunk ./首?\160\?.page 1
+#language zh-tw
+#redirect FrontPage
What's going on with those filenames? What's with the \160 and the double-slash?
Nevermind the actual encoding of the filenames for now. Once we've got the
filename as a sequence of bytes, do we preserve that sequence?
Hopefully we'll find out that this is just a call for more testing and that
everything is OK.
I claim that the next action required is to study the darcs source code (maybe
Darcs.Patch.FileName) and provide an explanation for \160 and the double slash
that follows (see encode_white).
After that, if all is well, we should probably extend our test suite or make
sure that our test suite covers this stuff.
Marking this as a bug in the absence of full knowledge.
|
msg7711 (view) |
Author: twb |
Date: 2009-04-16.11:11:37 |
|
I will write some tests for this issue RSN.
|
msg7712 (view) |
Author: kowey |
Date: 2009-04-16.12:14:17 |
|
This made me curious so I spent some time on
http://irclog.perlgeek.de/darcs/2009-04-16#i_1068098
trying to track down the issue
> Wed Apr 15 19:35:06 BST 2009 Eric Kow <E.Y.Kow@brighton.ac.uk>
> * Convert \e9\a6\96\e9\a0\81.page from MoinMoin syntax to RST.
These escape sequences are generated by Darcs.ColorPrinter.
Notice \a0 in particular...
> hunk ./\e9\a6\96\e9\160\\81.page 1
We use a function encode_white in Darcs.Patch.FileName, which encodes
all whitespace characters like this: \xxx\
Filenames are treated as Unicode Chars but they are read and written in
the usual Haskell way (no encoding, makes it as if it were in latin-1).
So it's as if we were treating filenames as bytestrings except for the
occasions where we wanted to look for slashes, dots and spaces. In this
particular case, the encode_white function uses Data.Char.isSpace to
work out that the character corresponding to \a0 is a space character,
so it encodes it as \160\.
> * 首頁: redirect to FrontPage
> addfile ./首?\160\?.page
> hunk ./首?\160\?.page 1
> +#language zh-tw
> +#redirect FrontPage
So doesn't this display correctly?
Here we are telling Darcs.ColorPrinter not to escape characters, so this
is why it leaves the bytes corresponding to "首" as they are. But
before passing the filename to colorprinter, our patch display code
does this whitespace-escaping thing.
This means that we end up printing the byte
e9 (first question mark)
then the string
\160\ (escaped whitespace due to Darcs.Patch.FileName)
then the byte
81 (second question mark)
So phew!
The weird gobbledygook on your screen does not constitute a bug.
I'm not sure what to do about this, though. I guess one interpretation
is that we can say that when the user passes in DONT_ESCAPE_ANYTHING,
this includes darcs's own lower-level escapings and not just the higher
level ColorPrinter ones. But that sounds like potentially messy code.
Should we somehow extend the printer code to say something like "I want
to escape this thing X as Y"? Should we just treat this as a corner
case and document it on the wiki somewhere?
> After that, if all is well, we should probably extend our test suite or make
> sure that our test suite covers this stuff.
|
msg8194 (view) |
Author: kowey |
Date: 2009-08-17.06:15:31 |
|
Trent has kindly written up a test for this.
To be clear: there is no hanky-panky going on here. It's just a problem of us
using the same code to render filenames for patch bundles as we do for putting
them on the screen.
The darcs-specific convention of representing whitespace as \xxx\ is
inappropriate for printing on screen.
This just needs somebody to study the darcs code and work out a solution of
doing two kinds of printing without us having too much code duplication.
Perhaps an alternative show class?
|
msg8201 (view) |
Author: twb |
Date: 2009-08-17.08:15:45 |
|
On Mon, Aug 17, 2009 at 06:15:35AM +0000, Eric Kow wrote:
> This just needs somebody to study the darcs code and work out a
> solution of doing two kinds of printing without us having too much
> code duplication. Perhaps an alternative show class?
There was discussion on-list about replacing our in-house Printer
module with the HughesPJ one. I dunno if that's relevant.
|
msg8202 (view) |
Author: kowey |
Date: 2009-08-17.08:42:59 |
|
On Mon, Aug 17, 2009 at 08:15:47 +0000, Trent W. Buck wrote:
> On Mon, Aug 17, 2009 at 06:15:35AM +0000, Eric Kow wrote:
> > This just needs somebody to study the darcs code and work out a
> > solution of doing two kinds of printing without us having too much
> > code duplication. Perhaps an alternative show class?
>
> There was discussion on-list about replacing our in-house Printer
> module with the HughesPJ one. I dunno if that's relevant.
I don't think so.
The printer controls all these fancy layout functions like blueText
and ($$). Here the issue is really Darcs-specific, as in the function
that is at 'fault' is Darcs.Patch.FileName.encode_filename
|
msg10513 (view) |
Author: kowey |
Date: 2010-03-25.11:51:11 |
|
Reinier, I'm making you nosy on this bug because of your experience with
issue1739 (which is currently failing on buildbots BTW). Maybe you'd be
interested.
|
msg10523 (view) |
Author: kowey |
Date: 2010-03-26.00:48:39 |
|
On Thu, Mar 25, 2010 at 21:02:37 +0100, Reinier Lamers wrote:
> This /may/ also have something to do with the non-roundtrip-safe filepath
> munging that I presume causes http://bugs.darcs.net/issue1763 . I may look
> into this one after I have fixed 1763 and 1599, and also released some 3-ish
> darcs versions or so.
http://bugs.darcs.net/msg7712 has a more optimistic analysis (but maybe
it's a wrong one!)
--
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
PGP Key ID: 08AC04F9
|
msg21050 (view) |
Author: bfrk |
Date: 2019-08-08.12:43:07 |
|
Just noting that, according to Ganesh, this test still fails on Windows, while it does not on Linux.
|
msg21060 (view) |
Author: bfrk |
Date: 2019-08-09.09:59:21 |
|
> Just noting that, according to Ganesh, this test still fails on
> Windows, while it does not on Linux.
Or rather, if I understood correctly, the test script needs to be
re-written to be more portable, so at the moment it is unclear whether
there is an actual problem with this on Windows.
|
msg21062 (view) |
Author: ganesh |
Date: 2019-08-09.10:51:53 |
|
I think it was actually failing, but I can't remember for sure.
Every so often I get up the energy to really dig into an
encoding problem on Windows, but now is not one of those times.
|
msg22283 (view) |
Author: bfrk |
Date: 2020-07-29.17:04:45 |
|
Ganesh, I think this should work on Windows, in principle. It may be
simply a problem with the test script. Please attach the full output
of the test.
|
|
Date |
User |
Action |
Args |
2009-04-16 11:04:52 | kowey | create | |
2009-04-16 11:11:39 | twb | set | status: unread -> unknown assignedto: twb messages:
+ msg7711 nosy:
+ twb |
2009-04-16 12:14:20 | kowey | set | nosy:
kowey, simon, twb, thorkilnaur, dmitry.kurochkin messages:
+ msg7712 |
2009-04-23 00:56:25 | twb | set | assignedto: twb -> nosy:
kowey, simon, twb, thorkilnaur, dmitry.kurochkin |
2009-08-17 06:15:35 | kowey | set | status: unknown -> needs-reproduction nosy:
kowey, simon, twb, thorkilnaur, dmitry.kurochkin topic:
+ UI messages:
+ msg8194 title: does the darcs encoding/decoding of filenames round-trip? -> filenames printed to screen with junk in place of whitespace |
2009-08-17 08:15:47 | twb | set | nosy:
kowey, simon, twb, thorkilnaur, dmitry.kurochkin messages:
+ msg8201 |
2009-08-17 08:43:01 | kowey | set | nosy:
kowey, simon, twb, thorkilnaur, dmitry.kurochkin messages:
+ msg8202 |
2009-08-25 17:43:43 | admin | set | nosy:
+ darcs-devel, - simon |
2009-08-27 14:25:18 | admin | set | nosy:
kowey, darcs-devel, twb, thorkilnaur, dmitry.kurochkin |
2010-03-25 11:51:13 | kowey | set | nosy:
+ tux_rocker messages:
+ msg10513 |
2010-03-26 00:48:40 | kowey | set | messages:
+ msg10523 title: filenames printed to screen with junk in place of whitespace -> does the darcs encoding/decoding of filenames round-trip? |
2019-08-08 12:43:09 | bfrk | set | messages:
+ msg21050 |
2019-08-09 09:59:23 | bfrk | set | messages:
+ msg21060 |
2019-08-09 10:51:55 | ganesh | set | messages:
+ msg21062 |
2020-07-29 17:04:47 | bfrk | set | messages:
+ msg22283 |
|