darcs

Issue 2160 Darcs whatsnew incorrectly reports line number for change that adds blank line to EOF

Title Darcs whatsnew incorrectly reports line number for change that adds blank line to EOF
Priority Status unknown
Milestone 2.12.0 Resolved in
Superseder Nosy List owst
Assigned To
Topics

Created on 2012-03-19.01:53:54 by owst, last changed 2016-04-04.14:49:43 by gh.

Messages
msg15354 (view) Author: owst Date: 2012-03-19.01:53:52
Some ideas to help bug reporting (1-4 may also help for feature requests):

1. What darcs version are you using? (Try: darcs --exact-version)

2.9.1 (+ 24 patches)

2. What operating system are you running?

Linux x86_64

3. What were you doing? (eg. what darcs commands)

$ darcs init --repo R
$ cd R
$ echo 'initline' > file
$ darcs rec -alm 'Add file'
$ echo '' >> file
$ darcs wh

4. What behaviour were you expecting?

The newline to be at line # 2:

hunk ./file 2
+

5. What went wrong?

The line number is reported as #3!  
hunk ./file 3
+

Ooh err...
msg15362 (view) Author: owst Date: 2012-03-19.21:23:06
I think the main cause of this bug (and its manifestations - e.g. trying
to annotate the repo after recording the empty line patch will give an
'unknown' line) is the interaction between several internal methods of
splitting a string into lines.

src/ByteStringUtils.hs has:

linesPS :: B.ByteString -> [B.ByteString]
linesPS ps
     | B.null ps = [B.empty]
     | otherwise = BC.split '\n' ps

and src/Annotate.hs has

breakLines :: BC.ByteString -> [BC.ByteString]
breakLines s = case BC.split '\n' s of
    [] -> []
    split | BC.null (last split) -> init split
          | otherwise -> split

src/Diff.hs has:

[...] (line_diff p (linesB a) (linesB b))
  where
    line_diff p a b = canonize (hunk p 1 a b)
    linesB = map strict . BLC.split '\n'

So in this case we get `line_diff "file" (linesB "initline\n") (linesB
"initline\n\n")` which falls through to `canonize (hunk "file" 1
["initline", ""] ["initline", "", ""])`

canonize calls getChanges from Lcs to get the diff by looking for the
longest common subsequence and drops the common prefix of ["initline",
""] giving a diff of a new line at line 3 containing "".
msg15366 (view) Author: ganesh Date: 2012-03-20.07:19:46
Could we change Lcs so that if both sides have a "" at the end, it removes 
them first?

I think the problem with annotate may need a different fix (perhaps change 
breakLines so it doesn't drop the last ""?)
msg15367 (view) Author: owst Date: 2012-03-20.09:49:05
> Could we change Lcs so that if both sides have a "" at the end, it
removes 
> them first?

We could, and it actually looks like the diff code tries to drop equal
leading/trailing elements before diffing (as you'd expect). I'm not sure
that
is the right fix though, since we're providing diff with two lists that
contain elements that aren't technically lines (the last elements, in
the case
of EOF \n's).

> I think the problem with annotate may need a different fix (perhaps
change 
> breakLines so it doesn't drop the last ""?)

My feeling is that annotate has the right implementation in terms of
splitting
into lines, so I'd rather change the other implementations first. But
I've not
thought through the consequences of that...
History
Date User Action Args
2012-03-19 01:53:54owstcreate
2012-03-19 21:23:07owstsetmessages: + msg15362
2012-03-20 07:19:47ganeshsetmessages: + msg15366
2012-03-20 09:49:06owstsetmessages: + msg15367
2016-04-04 14:49:43ghsetmilestone: 2.12.0