darcs

Patch 1809 some optimizations in ByteString handling (and 1 more)

Title some optimizations in ByteString handling (and 1 more)
Superseder Nosy List bf
Related Issues
Status accepted Assigned To
Milestone

Created on 2019-06-11.21:39:02 by bf, last changed 2019-07-28.19:37:56 by ganesh.

Files
File name Status Uploaded Type Edit Remove
patch-preview.txt bf, 2019-06-11.21:39:02 text/x-darcs-patch
patch-preview.txt bf, 2019-06-12.14:46:57 text/x-darcs-patch
patch-preview.txt bf, 2019-06-14.11:32:16 text/x-darcs-patch
remove-unused-functions-readintps-and-breakspace.dpatch bf, 2019-06-14.11:32:16 application/x-darcs-patch
remove-unused-isspace-and-don_t-export-isspaceword8.dpatch bf, 2019-06-12.14:46:57 application/x-darcs-patch
some-optimizations-in-bytestring-handling.dpatch bf, 2019-06-11.21:39:02 application/x-darcs-patch
unnamed bf, 2019-06-11.21:39:02 text/plain
unnamed bf, 2019-06-12.14:46:57 text/plain
unnamed bf, 2019-06-14.11:32:16 text/plain
See mailing list archives for discussion on individual patches.
Messages
msg20732 (view) Author: bf Date: 2019-06-11.21:39:02
Two (unrelated) optimization patches.

2 patches for repository http://darcs.net/screened:

patch a8a24d311aad83c709ed3f0bc81f5b782f869c1a
Author: Ben Franksen <ben.franksen@online.de>
Date:   Thu Feb 14 20:33:07 CET 2019
  * some optimizations in ByteString handling

patch bac445f93153c0f5a2c2950ea054e772f4066e32
Author: Ben Franksen <ben.franksen@online.de>
Date:   Mon Feb 18 19:21:17 CET 2019
  * optimize fastRemoveFL/RL
  
  The point is to avoid structural equality (Eq2) tests completely.
Attachments
msg20741 (view) Author: ganesh Date: 2019-06-12.06:26:01
>   * some optimizations in ByteString handling

> -import Data.Char                ( ord, isSpace, toLower, toUpper 
)
> +import Data.Char                ( ord, toLower, toUpper )

> -readIntPS = BC.readInt . BC.dropWhile isSpace
> +readIntPS = BC.readInt . dropSpace

> +isSpace :: Char -> Bool
> +isSpace ' '  = True
> +isSpace '\t' = True
> +isSpace '\n' = True
> +isSpace '\r' = True
> +isSpace _    = False

This is a behaviour change from the old code: Data.Char.isSpace
would also match the rather obscure \v ("vertical tabulation"), \f 
("form feed"), and 0xa0 ("non-breaking space").

http://hackage.haskell.org/package/base-
4.12.0.0/docs/src/GHC.Unicode.html#isSpace

Regardless, I think it only flows to the parsing of the line numbers
in a hunk patch (via the 'int' parser in Darcs.Patch.ReadMonads),
so I doubt it matters.

[I thought I understood ASCII, but I'd never had cause to think 
about any of those three characters before!]
msg20742 (view) Author: ganesh Date: 2019-06-12.06:32:09
> optimize fastRemoveFL/RL

OK. We don't have any framework for assertions, but the removed
structural equality test might make a good one if we did.

Neither of these two patches are in screened so I'm not pushing
to reviewed in case you were holding off for some reason. But
go ahead and mark as accepted whenever you want. (I'm also happy
to review things and push to screened+reviewed when done if you
indicate that in the submission).
to push.
msg20754 (view) Author: bf Date: 2019-06-12.14:46:57
This should clear up the question about changing the meaning of isSpace...

1 patch for repository /home/ben/src/darcs/sent:

patch aa4c15933cc10708e8715957d2a303543b2985d3
Author: Ben Franksen <ben.franksen@online.de>
Date:   Wed Jun 12 16:43:49 CEST 2019
  * remove unused isSpace and don't export isSpaceWord8
Attachments
msg20755 (view) Author: bf Date: 2019-06-12.14:49:24
To clarify: the isSpace wasn't used anywhere so we can just scrap it. 
And isSpaceWord8 has always been as it is now, semantically. Screening 
the two bundles now.
msg20781 (view) Author: ganesh Date: 2019-06-14.10:32:21
I think readIntPS was used when I reviewed the original patches,
but isn't now because of the attoparsec switch?
msg20784 (view) Author: bf Date: 2019-06-14.11:32:16
Following up on my last comment.

1 patch for repository http://darcs.net/screened:

patch 78644584287e60f6a0a3e15faca8570663ad60a1
Author: Ben Franksen <ben.franksen@online.de>
Date:   Fri Jun 14 13:30:05 CEST 2019
  * remove unused functions readIntPS and breakSpace
Attachments
msg20864 (view) Author: bf Date: 2019-06-20.17:01:07
Regarding the bytestring optimization: I just found a version of this
patch that had the following long comment:

"""
Profiling a complicated V3 merge (the last pull of the loop in
tests/conflict-fight-failure.sh with numconflicts=80) showed that
breakSpace is near the top of the list and is responsible for about 10%
of the time and 5% of the memory allocation. This small change
practically eliminates this overhead and consistently reduces the
runtime of this particular darcs invocation from 0.44 seconds to 0.37.
"""

(This is about the change in isSpaceWord8.)
msg20867 (view) Author: ganesh Date: 2019-06-22.09:07:32
Why do we even need to call breakSpace while doing a merge? Or was
that part of the time just coming from reading the patches once
at the beginning, in which case I guess it must be hurting us in
lots of other scenarios too?
msg20868 (view) Author: bf Date: 2019-06-22.17:34:49
> Why do we even need to call breakSpace while doing a merge? Or was
> that part of the time just coming from reading the patches once
> at the beginning, in which case I guess it must be hurting us in
> lots of other scenarios too?

I'm pretty sure that was just reading the patches. You can bet that I
was quite astonished to see that in the profiling report. And one more
reason to delegate that stuff to attoparsec (breakSpace is gone by now
and dropSpace only used for email and bundles).
History
Date User Action Args
2019-06-11 21:39:02bfcreate
2019-06-12 06:26:01ganeshsetmessages: + msg20741
2019-06-12 06:32:09ganeshsetmessages: + msg20742
2019-06-12 14:46:57bfsetfiles: + patch-preview.txt, remove-unused-isspace-and-don_t-export-isspaceword8.dpatch, unnamed
messages: + msg20754
2019-06-12 14:49:24bfsetstatus: needs-screening -> needs-review
messages: + msg20755
2019-06-14 10:32:21ganeshsetstatus: needs-review -> accepted-pending-tests
messages: + msg20781
2019-06-14 11:32:16bfsetfiles: + patch-preview.txt, remove-unused-functions-readintps-and-breakspace.dpatch, unnamed
messages: + msg20784
2019-06-14 11:33:21ganeshsetstatus: accepted-pending-tests -> accepted
2019-06-14 12:08:57ganeshsetstatus: accepted -> accepted-pending-tests
2019-06-20 17:01:07bfsetmessages: + msg20864
2019-06-22 09:07:33ganeshsetmessages: + msg20867
2019-06-22 17:34:49bfsetmessages: + msg20868
2019-07-28 19:37:56ganeshsetstatus: accepted-pending-tests -> accepted