darcs

Patch 1214 make Darcs.Util.Path.normPath cheaper on well formed p...

Title make Darcs.Util.Path.normPath cheaper on well formed p...
Superseder Nosy List bfrk, gh
Related Issues
Status accepted Assigned To bfrk
Milestone

Created on 2014-11-10.21:36:07 by gh, last changed 2015-02-12.17:54:29 by gh.

Files
File name Status Uploaded Type Edit Remove
comment-for-normpath-with-examples.dpatch gh, 2015-02-05.18:51:49 application/x-darcs-patch
make-darcs_util_path_normpath-cheaper-on-well-formed-paths.dpatch gh, 2014-11-10.21:36:07 application/x-darcs-patch
make-darcs_util_path_normpath-cheaper-on-well-formed-paths.dpatch gh, 2014-11-11.04:18:07 application/x-darcs-patch
make-darcs_util_path_normpath-cheaper-on-well-formed-paths.dpatch gh, 2014-11-12.20:18:51 application/x-darcs-patch
make-darcs_util_path_normpath-cheaper-on-well-formed-paths.dpatch gh, 2014-11-18.15:06:11 application/x-darcs-patch
patch-preview.txt gh, 2014-11-10.21:36:07 text/x-darcs-patch
patch-preview.txt gh, 2014-11-11.04:18:07 text/x-darcs-patch
patch-preview.txt gh, 2014-11-12.20:18:51 text/x-darcs-patch
patch-preview.txt gh, 2014-11-18.15:06:11 text/x-darcs-patch
patch-preview.txt gh, 2014-11-19.20:19:09 text/x-darcs-patch
patch-preview.txt gh, 2015-02-05.18:51:49 text/x-darcs-patch
remove-unused-filename-related-function.dpatch gh, 2014-11-19.20:19:09 application/x-darcs-patch
unnamed gh, 2014-11-10.21:36:07
unnamed gh, 2014-11-11.04:18:07
unnamed gh, 2014-11-12.20:18:51
unnamed gh, 2014-11-18.15:06:11
unnamed gh, 2014-11-19.20:19:09
unnamed gh, 2015-02-05.18:51:49
See mailing list archives for discussion on individual patches.
Messages
msg17768 (view) Author: gh Date: 2014-11-10.21:36:07
Don't screen it yet.

Darcs is doing a lot of path normalization internally, and this appears
as a lot of memory consumption at patch index creation time, which is
an operation that requires to read and parse many patches at once.

This patch makes normPath avoid splitting and reconstructing paths
by first checking whether the path needs normalization (apart
from dropping initial ./'s and /'s, which can be done quickly).

I should measure how many strings are effectively normalized at
patch index creation before pushing this. In the meantime this makes
us go from:

        total time  =        5.87 secs   (5874 ticks @ 1000 us, 1 processor)
        total alloc = 4,112,933,336 bytes  (excludes profiling overheads)

to:

        total time  =        5.73 secs   (5734 ticks @ 1000 us, 1 processor)
        total alloc = 3,831,387,200 bytes  (excludes profiling overheads)


when creating PI on darcs.net.

1 patch for repository http://darcs.net:

patch 01d507421104a74a131148c12ac6868c4173ba8b
Author: Guillaume Hoffmann <guillaumh@gmail.com>
Date:   Mon Nov 10 18:30:03 ART 2014
  * make Darcs.Util.Path.normPath cheaper on well formed paths
Attachments
msg17769 (view) Author: gh Date: 2014-11-11.04:18:07
Still toying with some space-consuming functions.
There is a small gain in space, but I have to measure it
on another machine.
Don't screen yet.

2 patches for repository http://darcs.net:

patch 01d507421104a74a131148c12ac6868c4173ba8b
Author: Guillaume Hoffmann <guillaumh@gmail.com>
Date:   Mon Nov 10 18:30:03 ART 2014
  * make Darcs.Util.Path.normPath cheaper on well formed paths

patch bd0b7fbd929d64a20b2f41c7ba7e47888c4aa07c
Author: Guillaume Hoffmann <guillaumh@gmail.com>
Date:   Tue Nov 11 00:31:09 ART 2014
  * make Darcs.Util.Path.decodeWhite cheaper on paths that do not need it
Attachments
msg17773 (view) Author: gh Date: 2014-11-12.20:18:51
Fixing a bad copy-paste in the decodeWhite patch.

3 patches for repository http://darcs.net:

patch dfa9d4656b25ea5f5fc5f38d6109aaab2472a0aa
Author: Guillaume Hoffmann <guillaumh@gmail.com>
Date:   Wed Nov 12 16:29:26 ART 2014
  * make Darcs.Util.Path.normPath cheaper on well formed paths

patch 155bb3a94c77ba6fa2a73ed6d46c12d5bc44fc6a
Author: Guillaume Hoffmann <guillaumh@gmail.com>
Date:   Wed Nov 12 16:29:49 ART 2014
  * remove unused FileName related function

patch c664d746b09e0b8322ea2741d6f034f6da8e2190
Author: Guillaume Hoffmann <guillaumh@gmail.com>
Date:   Wed Nov 12 16:29:34 ART 2014
  * make Darcs.Util.Path.decodeWhite cheaper on paths that do not need it
Attachments
msg17842 (view) Author: gh Date: 2014-11-18.15:06:11
An new implementation of decodeWhite that always scans
the input string one.


3 patches for repository http://darcs.net:

patch 01d507421104a74a131148c12ac6868c4173ba8b
Author: Guillaume Hoffmann <guillaumh@gmail.com>
Date:   Mon Nov 10 18:30:03 ART 2014
  * make Darcs.Util.Path.normPath cheaper on well formed paths

patch 9dc33faeb25563b1b03bcad2914ab7d456d72540
Author: Guillaume Hoffmann <guillaumh@gmail.com>
Date:   Tue Nov 18 10:19:36 ART 2014
  * make Darcs.Util.Path.decodeWhite cheaper on paths that do not need it

patch 155bb3a94c77ba6fa2a73ed6d46c12d5bc44fc6a
Author: Guillaume Hoffmann <guillaumh@gmail.com>
Date:   Wed Nov 12 16:29:49 ART 2014
  * remove unused FileName related function
Attachments
msg17848 (view) Author: gh Date: 2014-11-19.20:19:09
This time both decodeWhite and normPath read only their input
string once.

3 patches for repository http://darcs.net:

patch 3d76de664747d339d95169118861e7182c47f17c
Author: Guillaume Hoffmann <guillaumh@gmail.com>
Date:   Wed Nov 19 14:09:14 ART 2014
  * remove unused FileName related function

patch 976df34e83abb11d44eba59de9fe04b45f9521e1
Author: Guillaume Hoffmann <guillaumh@gmail.com>
Date:   Wed Nov 19 14:40:03 ART 2014
  * make Darcs.Util.Path.decodeWhite cheaper on paths that do not need it

patch 9c8cbc36906b3b62c56d2ab585924d09d3174daa
Author: Guillaume Hoffmann <guillaumh@gmail.com>
Date:   Wed Nov 19 17:18:03 ART 2014
  * make Darcs.Util.Path.normPath cheaper on well formed paths
Attachments
msg18012 (view) Author: bfrk Date: 2015-02-05.12:57:34
These functions could use quickcheck properties (or perhaps even
proofs). I find low-level string/list operations like these quite hard
to verify by reading the code.

Could you at least write down the expected properties in comments?
msg18033 (view) Author: gh Date: 2015-02-05.18:51:49
Here are extra comments. I think that writing examples
is sufficient to explain what normPath does.

1 patch for repository http://darcs.net:

patch a8712e5ac4a9c3d112f9e03e10a9412f8563c99a
Author: Guillaume Hoffmann <guillaumh@gmail.com>
Date:   Thu Feb  5 15:50:07 ART 2015
  * comment for normPath with examples
Attachments
msg18036 (view) Author: bfrk Date: 2015-02-05.19:52:28
Thanks. This really makes the code easier to understand. And yes, for
norm/normPath examples are enough.

My remark about properties was aimed more at encode/decodeWhite: the
naming suggests that they are inverses, so it would be nice to have that
stated explicitly (if it's true, else state some weaker property).
msg18092 (view) Author: gh Date: 2015-02-12.17:53:50
I find that the existing comments of both encodeWhite and decodeWhite
are clear enough:

http://hub.darcs.net/darcs/darcs-screened/browse/src/Darcs/Util/Path.hs#151
History
Date User Action Args
2014-11-10 21:36:07ghcreate
2014-11-10 21:51:32ganeshsetstatus: needs-screening -> in-discussion
2014-11-11 04:18:07ghsetfiles: + patch-preview.txt, make-darcs_util_path_normpath-cheaper-on-well-formed-paths.dpatch, unnamed
messages: + msg17769
2014-11-12 20:18:51ghsetfiles: + patch-preview.txt, make-darcs_util_path_normpath-cheaper-on-well-formed-paths.dpatch, unnamed
messages: + msg17773
2014-11-18 15:06:11ghsetfiles: + patch-preview.txt, make-darcs_util_path_normpath-cheaper-on-well-formed-paths.dpatch, unnamed
messages: + msg17842
2014-11-19 20:19:10ghsetfiles: + patch-preview.txt, remove-unused-filename-related-function.dpatch, unnamed
messages: + msg17848
2014-11-19 20:20:57ghsetstatus: in-discussion -> needs-review
2015-02-05 12:57:34bfrksetmessages: + msg18012
2015-02-05 18:51:49ghsetfiles: + patch-preview.txt, comment-for-normpath-with-examples.dpatch, unnamed
messages: + msg18033
2015-02-05 19:52:28bfrksetmessages: + msg18036
2015-02-05 20:18:57bfrksetstatus: needs-review -> followup-requested
assignedto: bfrk
nosy: + bfrk
2015-02-12 17:53:51ghsetmessages: + msg18092
2015-02-12 17:54:29ghsetstatus: followup-requested -> accepted