Using `show` (from the `Show` instance) to escape non-ASCII characters in
patch comments when rendering output for graphviz makes the
`darcs show dependencies` command pointless if patch comments are written
in national languages.
I think such strict escaping is redundant. It is enough to apply the
escaping rules for quoted strings in the DOT language:
> In quoted strings in DOT, the only escaped character is
> double-quote `"`. That is, in quoted strings, the dyad `\"` is
> converted to `"`; all other characters are left unchanged.
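
To illustrate the difference, here is a minimal sketch in Haskell. The names `escapeDotQuoted` and `dotLabel` are hypothetical and are not taken from the darcs sources. The sketch only implements the rule quoted above; it does not treat backslashes specially, which a real implementation may need to consider.

```haskell
-- Hypothetical helpers for illustration; not the actual darcs code.

-- Escape only the double quote, as the DOT rule for quoted strings requires.
escapeDotQuoted :: String -> String
escapeDotQuoted = concatMap esc
  where
    esc '"' = "\\\""   -- a double quote becomes backslash + double quote
    esc c   = [c]      -- every other character is left unchanged

-- Wrap an escaped comment in double quotes to form a DOT quoted string.
dotLabel :: String -> String
dotLabel s = "\"" ++ escapeDotQuoted s ++ "\""

-- For comparison, `show` replaces every non-ASCII character with a
-- numeric escape:
--   show "Привет"  gives  "\1055\1088\1080\1074\1077\1090"
-- while `dotLabel "Привет"` keeps the Cyrillic letters as they are.
```

With `show`, graphviz renders the numeric escapes literally, so the node labels are unreadable; with the minimal escaping, the labels keep the original text.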
There remains one other potential problem that using `show` is guaranteed
to avoid: encoding. By default, graphviz assumes that the input with
instructions for it is encoded in UTF-8. If the input
contains invalid UTF-8 byte sequences, graphviz will be able to process
it only if we additionally set the document attribute `charset=latin1`
(this attribute can be set, for example, by specifying the command line
option `dot -Gcharset=latin1 ...`). In this case, messages with
non-ASCII characters in the final document created by graphviz will also
be distorted. This problem will be encountered by users who use national
languages but do not use UTF-8 for encoding. But for these users it will
not be worse: both with and without using `show`, comments will be
distorted.
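
Which comments fall into this category can be checked mechanically. The following is a minimal sketch, assuming the raw comment bytes are available as a strict `ByteString` and that the `text` package is installed. `isValidUtf8` is a hypothetical name, not a darcs function.

```haskell
import qualified Data.ByteString as B
import Data.Either (isRight)
import qualified Data.Text.Encoding as TE

-- Hypothetical helper: True when the bytes form a valid UTF-8 sequence,
-- i.e. when graphviz can read them with its default charset.
isValidUtf8 :: B.ByteString -> Bool
isValidUtf8 = isRight . TE.decodeUtf8'
```

For example, the bytes `0xD0 0x9F` (the UTF-8 encoding of a Cyrillic letter) pass the check, while a lone `0xCF` byte, as produced by a single-byte national encoding, does not.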
Those who use only ASCII characters or UTF-8 encoding for comments will
not notice any difference.
The only case where this change will make things worse is when only a
very small part of the comments is not encoded in UTF-8. With `show`,
only that small part of the patches was distorted, and graphviz always
accepted the output from darcs without errors. After applying this
patch, users will have to explicitly specify the `charset=latin1`
attribute in this case, which may be perceived as a degradation.
The positive effect of this patch will be felt by those who write
comments to patches in national languages and use a locale with UTF-8
encoding. It seems to me that this group of users is much larger than
those who will encounter the problems described above. With `show`,
there is no point in using the `darcs show dependencies` command for
these users.
It might be worth adding a command line option to the `show
dependencies` command that allows choosing whether to fully escape
non-ASCII characters as before or not. In my opinion, such an option is
redundant, but I am ready to add it if the core team deems it right.
I have been actively using darcs with this patch, with Cyrillic comments
encoded in UTF-8, for quite a long time without any problems. During
that time I used several versions of graphviz (from 2.43.0 to
12.2.0).