Issue 2018: should Printer drop ByteString support? (or alternatively drop String) - Darcs bug tracker

Title	should Printer drop ByteString support? (or alternatively drop String)
Priority	wishlist	Status	needs-reproduction
Milestone		Resolved in
Superseder		Nosy List	kowey
Assigned To		Topics	Devel

Created on 2010-12-16.16:37:39 by kowey, last changed 2010-12-16.16:42:00 by kowey.

Messages
msg13354	Author: kowey	Date: 2010-12-16.16:37:37
On IRC, Wolfgang Jeltsch observed:

  The problem with the Printable type is that it mixes values of type String (sequences of characters) and values of type ByteString (sequences of bytes), which are completely different things. :-( Mixing bytes and characters freely works only as long as you fix an encoding. Well, maybe you always use UTF-8 internally. But even then, this is not visible in the code, since a ByteString doesn’t carry encoding info.

  http://irclog.perlgeek.de/darcs/2010-12-16#i_3094602

Doing some research, I found the justification for this support in the following patch:

  Sun Jun 13 01:02:34 BST 2004 jch@pps.jussieu.fr
  * Avoid unpacking PackedStrings in the printer.

  Darcs reads file data into PackedStrings, but unpacks them when printing out a patch. The fix is to make the printer able to grok streams of arbitrary tokens, not just Haskell strings (streams of Char). See the type class Printer.Printable and the instance Printer.PChar. See also the type synonym PrintPatch.PrinterType, which is what gets actually used.

  The net effect is that darcs whatsnew is more than twice as fast, and darcs pull of large patches uses 10 (!) times less memory. On the other hand, darcs pull of many small patches uses up a few percent more CPU time, which I don't understand.

Wolfgang believes that since we now care about user locales, this approach may no longer be valid. It seems like it could be useful to research this question from a code quality perspective.
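
To make the concern concrete, here is a minimal sketch. It is not the actual Printer code: the Printable type below is a simplified stand-in, and the names S, PS and renderUtf8 are assumptions chosen for illustration. It assumes the bytestring and text packages are available. The point is that Strings get a known encoding at render time, while ByteStrings pass through with whatever encoding they happened to have.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical, simplified printer token (not the real Printer.Printable).
data Printable
  = S String          -- a sequence of characters
  | PS B.ByteString   -- a sequence of bytes with no recorded encoding

-- Render everything to bytes, assuming UTF-8 for the String parts.
-- The ByteString parts are copied through untouched, whatever they contain.
renderUtf8 :: [Printable] -> B.ByteString
renderUtf8 = B.concat . map toBytes
  where
    toBytes (S s)   = TE.encodeUtf8 (T.pack s)
    toBytes (PS bs) = bs

main :: IO ()
main = do
  -- "café" as Latin-1 bytes, e.g. read straight from a file on disk.
  let latin1Name = B.pack [0x63, 0x61, 0x66, 0xE9]
      doc = [S "file: ", PS latin1Name, S " (caf\233)"]
  -- The result contains é both as the single byte 0xE9 (from the
  -- ByteString) and as the UTF-8 pair 0xC3 0xA9 (from the String).
  print (B.unpack (renderUtf8 doc))
```

Nothing in the types stops the two encodings from ending up in the same output, which is the "not visible in the code" part of Wolfgang's remark.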
msg13355	Author: kowey	Date: 2010-12-16.16:41:59
Wolfgang adds that part of the problem is that ByteStrings are written to stdout directly, whereas Strings are written with hPutStr, which may be locale-sensitive from GHC 6.12 on. This may lead to two different behaviours depending on the internal type.
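
A rough sketch of the two output paths, again with a hypothetical Printable type and a made-up emit function rather than the real Printer code. It uses only System.IO and Data.ByteString. Forcing the handle encoding to latin1 stands in for a non-UTF-8 user locale.

```haskell
import qualified Data.ByteString as B
import System.IO

-- Hypothetical, simplified printer token (not the real Printer.Printable).
data Printable = S String | PS B.ByteString

-- Write one token to a handle.
emit :: Handle -> Printable -> IO ()
emit h (S s)   = hPutStr h s   -- encoded using the handle's text encoding
emit h (PS bs) = B.hPut h bs   -- raw bytes, the handle's encoding is ignored

main :: IO ()
main = do
  -- Stand-in for a Latin-1 locale; normally the locale picks the encoding.
  hSetEncoding stdout latin1
  -- String path: é is encoded according to the handle (one byte here).
  emit stdout (S "caf\233\n")
  -- ByteString path: these UTF-8 bytes are written unchanged.
  emit stdout (PS (B.pack [0x63, 0x61, 0x66, 0xC3, 0xA9, 0x0A]))
```

Piping the output through a hex dump should show the same word encoded two different ways, depending only on which constructor carried it.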

History
Date	User	Action	Args
2010-12-16 16:37:39	kowey	create
2010-12-16 16:42:00	kowey	set	messages: + msg13355