darcs

Patch 1829 use attoparsec to replace our own parser monad

Title use attoparsec to replace our own parser monad
Superseder Nosy List bf
Related Issues
Status accepted Assigned To
Milestone

Created on 2019-06-15.09:51:06 by bf, last changed 2019-07-28.09:31:05 by bf.

Files
File name Status Uploaded Type Edit Remove
patch-preview.txt bf, 2019-06-15.09:51:06 text/x-darcs-patch
restore-alphabetical-order-of-exposed_modules-in-darcs_cabal.dpatch bf, 2019-06-15.09:51:06 application/x-darcs-patch
unnamed bf, 2019-06-15.09:51:06 text/plain
See mailing list archives for discussion on individual patches.
Messages
msg20822 (view) Author: bf Date: 2019-06-15.09:51:06
The second to last patch "rename module Darcs.Patch.Parse to
Darcs.Util.Parser" wasn't contained in the original large bundle because it
conflicted with screened. I rebased and added it because I think it belongs
here.

6 patches for repository /home/ben/src/darcs/without-v3:

patch f7fc2f06d8e41aeeafb76ddf2457fbca1b3c04e2
Author: Ben Franksen <ben.franksen@online.de>
Date:   Fri Feb  8 17:46:49 CET 2019
  * restore alphabetical order of exposed-modules in darcs.cabal

patch 0cae5acc67078b6cc2b6b0cbe5137a19522ec553
Author: Ben Franksen <ben.franksen@online.de>
Date:   Sun Feb 17 12:12:56 CET 2019
  * replace our own patch parser with attoparsec
  
  We already depend on attoparsec for convert import and its functionality and
  implementation is quite similar to our self-written parser monad. I have
  checked that this does not impact performance negatively.

patch 27448d5c3ee98412c4331329bbcf874c8e87812f
Author: Ben Franksen <ben.franksen@online.de>
Date:   Sun Feb 17 12:22:58 CET 2019
  * take advantage of attoparsec error message for patches and inventories
  
  This changes the return type of readPatch and parseInventory from Maybe to
  Either String. Attoparsec's error messages are weak but that's still better
  than none at all.

patch bbe57ddc7d2380c0d2b2f568f32b1e1c7ab6d659
Author: Ben Franksen <ben.franksen@online.de>
Date:   Sun Feb 17 17:20:19 CET 2019
  * replace parseStrictly with parse everywhere
  
  This is not a simple rename but changes the output type from Maybe to Either
  String throughout.

patch 70fbba0a41bc042648dc8c9ed23b806b870e7078
Author: Ben Franksen <ben.franksen@online.de>
Date:   Sun Feb 17 21:25:39 CET 2019
  * rename module Darcs.Patch.Parse to Darcs.Util.Parser
  
  There is nothing specific to patches in this module, in fact we use it for
  inventories as well.

patch 3dc3501824692790cf4bcfa487416c76b77073c8
Author: Ben Franksen <ben.franksen@online.de>
Date:   Sun Feb 17 21:32:00 CET 2019
  * restore alphabetical order of modules in darcs.cabal
Attachments
msg20823 (view) Author: bf Date: 2019-06-15.09:53:44
This bundle isn't, semantically, part of the addition of v3 patch 
theory, it just happens to depend on it.
msg20970 (view) Author: ganesh Date: 2019-07-27.11:45:15
> patch f7fc2f06d8e41aeeafb76ddf2457fbca1b3c04e2
> Author: Ben Franksen <ben.franksen@online.de>
> Date:   Fri Feb  8 17:46:49 CET 2019
>   * restore alphabetical order of exposed-modules in darcs.cabal
OK
> patch 0cae5acc67078b6cc2b6b0cbe5137a19522ec553
> Author: Ben Franksen <ben.franksen@online.de>
> Date:   Sun Feb 17 12:12:56 CET 2019
>   * replace our own patch parser with attoparsec
>
>   We already depend on attoparsec for convert import and its 
functionality and
>   implementation is quite similar to our self-written parser 
monad. I have
>   checked that this does not impact performance negatively.

> + import Data.Attoparsec.ByteString.Char8

I was quite nervous about this until I compared the old code which 
also works on ByteString.Char8.

I suspect this change would have been slightly easier to review if 
ReadMonads.hs had been renamed to Parse.hs, but of course a new file 
for a new implementation also makes sense.

Code change looks fine, anyway. Standardising our code is great. I 
hope our test suite is good enough to rule out any edge case 
differences!

> patch 27448d5c3ee98412c4331329bbcf874c8e87812f
> Author: Ben Franksen <ben.franksen@online.de>
> Date:   Sun Feb 17 12:22:58 CET 2019
>   * take advantage of attoparsec error message for patches and 
inventories

Great. I happened to notice some commented out parsers in 
Darcs.Repository.Inventory that we should probably just delete.

> patch bbe57ddc7d2380c0d2b2f568f32b1e1c7ab6d659
> Author: Ben Franksen <ben.franksen@online.de>
> Date:   Sun Feb 17 17:20:19 CET 2019
>   * replace parseStrictly with parse everywhere
>   
>   This is not a simple rename but changes the output type from 
Maybe to Either
>   String throughout.

I'm a bit unclear on exactly what the parseStrictly/parse 
disctinction was achieving, but assuming we didn't need it this 
looks like a good simplification.

> patch 70fbba0a41bc042648dc8c9ed23b806b870e7078
> Author: Ben Franksen <ben.franksen@online.de>
> Date:   Sun Feb 17 21:25:39 CET 2019
>   * rename module Darcs.Patch.Parse to Darcs.Util.Parser
Fine

> patch 3dc3501824692790cf4bcfa487416c76b77073c8
> Author: Ben Franksen <ben.franksen@online.de>
> Date:   Sun Feb 17 21:32:00 CET 2019
>   * restore alphabetical order of modules in darcs.cabal

Fine
msg20980 (view) Author: bf Date: 2019-07-28.09:31:05
>>   * replace our own patch parser with attoparsec
> 
>> + import Data.Attoparsec.ByteString.Char8
> 
> I was quite nervous about this until I compared the old code which 
> also works on ByteString.Char8.

Yup. Our on-disk format is pretty much in terms of bytes, rather than
text/characters.

> I hope our test suite is good enough to rule out any edge case 
> differences!

I did look at all the parsing primitives we export and compared them
with their attoparsec counterpart to make sure they are semantically
equivalent to what we had before. I am quite confident that there aren't
any edge-cases differences.

>>   * take advantage of attoparsec error message for patches and 
> inventories
> 
> Great. I happened to notice some commented out parsers in 
> Darcs.Repository.Inventory that we should probably just delete.

I had left them in there, back when I added this module, as some sort of
TODO comment. The low-level peeking and poking of the pristine hash we
do instead may be needed as an optimization, but is unsatisfying.
History
Date User Action Args
2019-06-15 09:51:06bfcreate
2019-06-15 09:53:44bfsetmessages: + msg20823
title: v3: use attoparsec to replace our own parser monad -> use attoparsec to replace our own parser monad
2019-06-15 09:58:20bfsetstatus: needs-screening -> needs-review
2019-07-27 11:45:16ganeshsetstatus: needs-review -> accepted-pending-tests
messages: + msg20970
2019-07-27 17:59:44ganeshsetstatus: accepted-pending-tests -> accepted
2019-07-28 09:31:05bfsetmessages: + msg20980