darcs

Issue 2160 Darcs whatsnew incorrectly reports line number for change that adds blank line to EOF

Title Darcs whatsnew incorrectly reports line number for change that adds blank line to EOF
Priority Status has-patch
Milestone 2.12.0 Resolved in
Superseder Nosy List owst
Assigned To
Topics

Created on 2012-03-19.01:53:54 by owst, last changed 2017-08-13.13:49:42 by bf.

Messages
msg15354 (view) Author: owst Date: 2012-03-19.01:53:52
Some ideas to help bug reporting (1-4 may also help for feature requests):

1. What darcs version are you using? (Try: darcs --exact-version)

2.9.1 (+ 24 patches)

2. What operating system are you running?

Linux x86_64

3. What were you doing? (eg. what darcs commands)

$ darcs init --repo R
$ cd R
$ echo 'initline' > file
$ darcs rec -alm 'Add file'
$ echo '' >> file
$ darcs wh

4. What behaviour were you expecting?

The newline to be at line # 2:

hunk ./file 2
+

5. What went wrong?

The line number is reported as #3!  
hunk ./file 3
+

Ooh err...
msg15362 (view) Author: owst Date: 2012-03-19.21:23:06
I think the main cause of this bug (and its manifestations - e.g. trying
to annotate the repo after recording the empty line patch will give an
'unknown' line) is the interaction between several internal methods of
splitting a string into lines.

src/ByteStringUtils.hs has:

linesPS :: B.ByteString -> [B.ByteString]
linesPS ps
     | B.null ps = [B.empty]
     | otherwise = BC.split '\n' ps

and src/Annotate.hs has

breakLines :: BC.ByteString -> [BC.ByteString]
breakLines s = case BC.split '\n' s of
    [] -> []
    split | BC.null (last split) -> init split
          | otherwise -> split

src/Diff.hs has:

[...] (line_diff p (linesB a) (linesB b))
  where
    line_diff p a b = canonize (hunk p 1 a b)
    linesB = map strict . BLC.split '\n'

So in this case we get `line_diff "file" (linesB "initline\n") (linesB
"initline\n\n")` which falls through to `canonize (hunk "file" 1
["initline", ""] ["initline", "", ""])`

canonize calls getChanges from Lcs to get the diff by looking for the
longest common subsequence and drops the common prefix of ["initline",
""] giving a diff of a new line at line 3 containing "".
msg15366 (view) Author: ganesh Date: 2012-03-20.07:19:46
Could we change Lcs so that if both sides have a "" at the end, it removes 
them first?

I think the problem with annotate may need a different fix (perhaps change 
breakLines so it doesn't drop the last ""?)
msg15367 (view) Author: owst Date: 2012-03-20.09:49:05
> Could we change Lcs so that if both sides have a "" at the end, it
removes 
> them first?

We could, and it actually looks like the diff code tries to drop equal
leading/trailing elements before diffing (as you'd expect). I'm not sure
that
is the right fix though, since we're providing diff with two lists that
contain elements that aren't technically lines (the last elements, in
the case
of EOF \n's).

> I think the problem with annotate may need a different fix (perhaps
change 
> breakLines so it doesn't drop the last ""?)

My feeling is that annotate has the right implementation in terms of
splitting
into lines, so I'd rather change the other implementations first. But
I've not
thought through the consequences of that...
msg19577 (view) Author: bf Date: 2017-08-13.12:54:07
This is an interesting bug. It is easy to fix: just add an extra case
when both files end with a newline, then remove both newlines before
splitting into lines. What is interesting is that this make an ancient
test fail, namely tests/merging_newlines.sh. As it turns out, this was
once a failing test, but David changed it (back in 2008) to one that
doesn't fail.The test tries to merge (push) appending a non-empty line
with appending an empty line.

Indeed, if one accepts that adding a newline to a file that contains a
single line of text is a change on *line 3*, then there is no conflict
with another change that adds some text to line 2.

However, reporting the addition of an empty line to a one-line file at
line 3 is highly un-intuitive, even though one could argue that it is
not incorrect in a strict sense, since one may have a different view of
how things like 'line' and 'appending a line' are defined. In any case,
darcs annotate assumes the intuitive view and would have to be changed
to conform to the current (un-intuitive) behavior of darcs.

The upshot is that IMO this is indeed a bug and
tests/merging_newlines.sh should fail (i.e. report a conflict).

I am going to send the fix for this issue. As stipulated by owst, this
indeed also fixes issue2154. Of course, already recorded patches are not
affected, so we will still see the 'unknown' annotations for old repos
e.g. src/Darcs/Util/Email.hs).
History
Date User Action Args
2012-03-19 01:53:54owstcreate
2012-03-19 21:23:07owstsetmessages: + msg15362
2012-03-20 07:19:47ganeshsetmessages: + msg15366
2012-03-20 09:49:06owstsetmessages: + msg15367
2016-04-04 14:49:43ghsetmilestone: 2.12.0
2017-08-13 12:54:08bfsetmessages: + msg19577
2017-08-13 13:49:42bfsetstatus: unknown -> has-patch