Patch 1809: some optimizations in ByteString handling (and 1 more)

Title	some optimizations in ByteString handling (and 1 more)
Superseder		Nosy List	bfrk
Related Issues
Status	accepted	Assigned To
Milestone

Created on 2019-06-11.21:39:02 by bfrk, last changed 2019-07-28.19:37:56 by ganesh.

Files
File name	Status	Uploaded	Type	Edit	Remove
patch-preview.txt		bfrk, 2019-06-11.21:39:02	text/x-darcs-patch
patch-preview.txt		bfrk, 2019-06-12.14:46:57	text/x-darcs-patch
patch-preview.txt		bfrk, 2019-06-14.11:32:16	text/x-darcs-patch
remove-unused-functions-readintps-and-breakspace.dpatch		bfrk, 2019-06-14.11:32:16	application/x-darcs-patch
remove-unused-isspace-and-don_t-export-isspaceword8.dpatch		bfrk, 2019-06-12.14:46:57	application/x-darcs-patch
some-optimizations-in-bytestring-handling.dpatch		bfrk, 2019-06-11.21:39:02	application/x-darcs-patch
unnamed		bfrk, 2019-06-11.21:39:02	text/plain
unnamed		bfrk, 2019-06-12.14:46:57	text/plain
unnamed		bfrk, 2019-06-14.11:32:16	text/plain

See mailing list archives for discussion on individual patches.

Messages
msg20732 (view)	Author: bfrk	Date: 2019-06-11.21:39:02
Two (unrelated) optimization patches. 2 patches for repository http://darcs.net/screened: patch a8a24d311aad83c709ed3f0bc81f5b782f869c1a Author: Ben Franksen <ben.franksen@online.de> Date: Thu Feb 14 20:33:07 CET 2019 * some optimizations in ByteString handling patch bac445f93153c0f5a2c2950ea054e772f4066e32 Author: Ben Franksen <ben.franksen@online.de> Date: Mon Feb 18 19:21:17 CET 2019 * optimize fastRemoveFL/RL The point is to avoid structural equality (Eq2) tests completely. Attachments patch-preview.txt some-optimizations-in-bytestring-handling.dpatch unnamed
msg20741 (view)	Author: ganesh	Date: 2019-06-12.06:26:01
> * some optimizations in ByteString handling > -import Data.Char ( ord, isSpace, toLower, toUpper ) > +import Data.Char ( ord, toLower, toUpper ) > -readIntPS = BC.readInt . BC.dropWhile isSpace > +readIntPS = BC.readInt . dropSpace > +isSpace :: Char -> Bool > +isSpace ' ' = True > +isSpace '\t' = True > +isSpace '\n' = True > +isSpace '\r' = True > +isSpace _ = False This is a behaviour change from the old code: Data.Char.isSpace would also match the rather obscure \v ("vertical tabulation"), \f ("form feed"), and 0xa0 ("non-breaking space"). http://hackage.haskell.org/package/base- 4.12.0.0/docs/src/GHC.Unicode.html#isSpace Regardless, I think it only flows to the parsing of the line numbers in a hunk patch (via the 'int' parser in Darcs.Patch.ReadMonads), so I doubt it matters. [I thought I understood ASCII, but I'd never had cause to think about any of those three characters before!]
msg20742 (view)	Author: ganesh	Date: 2019-06-12.06:32:09
> optimize fastRemoveFL/RL OK. We don't have any framework for assertions, but the removed structural equality test might make a good one if we did. Neither of these two patches are in screened so I'm not pushing to reviewed in case you were holding off for some reason. But go ahead and mark as accepted whenever you want. (I'm also happy to review things and push to screened+reviewed when done if you indicate that in the submission). to push.
msg20754 (view)	Author: bfrk	Date: 2019-06-12.14:46:57
This should clear up the question about changing the meaning of isSpace... 1 patch for repository /home/ben/src/darcs/sent: patch aa4c15933cc10708e8715957d2a303543b2985d3 Author: Ben Franksen <ben.franksen@online.de> Date: Wed Jun 12 16:43:49 CEST 2019 * remove unused isSpace and don't export isSpaceWord8 Attachments patch-preview.txt remove-unused-isspace-and-don_t-export-isspaceword8.dpatch unnamed
msg20755 (view)	Author: bfrk	Date: 2019-06-12.14:49:24
To clarify: the isSpace wasn't used anywhere so we can just scrap it. And isSpaceWord8 has always been as it is now, semantically. Screening the two bundles now.
msg20781 (view)	Author: ganesh	Date: 2019-06-14.10:32:21
I think readIntPS was used when I reviewed the original patches, but isn't now because of the attoparsec switch?
msg20784 (view)	Author: bfrk	Date: 2019-06-14.11:32:16
Following up on my last comment. 1 patch for repository http://darcs.net/screened: patch 78644584287e60f6a0a3e15faca8570663ad60a1 Author: Ben Franksen <ben.franksen@online.de> Date: Fri Jun 14 13:30:05 CEST 2019 * remove unused functions readIntPS and breakSpace Attachments patch-preview.txt remove-unused-functions-readintps-and-breakspace.dpatch unnamed
msg20864 (view)	Author: bfrk	Date: 2019-06-20.17:01:07
Regarding the bytestring optimization: I just found a version of this patch that had the following long comment: """ Profiling a complicated V3 merge (the last pull of the loop in tests/conflict-fight-failure.sh with numconflicts=80) showed that breakSpace is near the top of the list and is responsible for about 10% of the time and 5% of the memory allocation. This small change practically eliminates this overhead and consistently reduces the runtime of this particular darcs invocation from 0.44 seconds to 0.37. """ (This is about the change in isSpaceWord8.)
msg20867 (view)	Author: ganesh	Date: 2019-06-22.09:07:32
Why do we even need to call breakSpace while doing a merge? Or was that part of the time just coming from reading the patches once at the beginning, in which case I guess it must be hurting us in lots of other scenarios too?
msg20868 (view)	Author: bfrk	Date: 2019-06-22.17:34:49
> Why do we even need to call breakSpace while doing a merge? Or was > that part of the time just coming from reading the patches once > at the beginning, in which case I guess it must be hurting us in > lots of other scenarios too? I'm pretty sure that was just reading the patches. You can bet that I was quite astonished to see that in the profiling report. And one more reason to delegate that stuff to attoparsec (breakSpace is gone by now and dropSpace only used for email and bundles).

History
Date	User	Action	Args
2019-06-11 21:39:02	bfrk	create
2019-06-12 06:26:01	ganesh	set	messages: + msg20741
2019-06-12 06:32:09	ganesh	set	messages: + msg20742
2019-06-12 14:46:57	bfrk	set	files: + patch-preview.txt, remove-unused-isspace-and-don_t-export-isspaceword8.dpatch, unnamed messages: + msg20754
2019-06-12 14:49:24	bfrk	set	status: needs-screening -> needs-review messages: + msg20755
2019-06-14 10:32:21	ganesh	set	status: needs-review -> accepted-pending-tests messages: + msg20781
2019-06-14 11:32:16	bfrk	set	files: + patch-preview.txt, remove-unused-functions-readintps-and-breakspace.dpatch, unnamed messages: + msg20784
2019-06-14 11:33:21	ganesh	set	status: accepted-pending-tests -> accepted
2019-06-14 12:08:57	ganesh	set	status: accepted -> accepted-pending-tests
2019-06-20 17:01:07	bfrk	set	messages: + msg20864
2019-06-22 09:07:33	ganesh	set	messages: + msg20867
2019-06-22 17:34:49	bfrk	set	messages: + msg20868
2019-07-28 19:37:56	ganesh	set	status: accepted-pending-tests -> accepted

Patch 1809 some optimizations in ByteString handling (and 1 more)

Attachments

Attachments

Attachments