darcs

Patch 1409 merge Darcs.Patch.ConflictMarking into Darcs.Patch.Con...

Title merge Darcs.Patch.ConflictMarking into Darcs.Patch.Con...
Superseder Nosy List gh
Related Issues
Status accepted Assigned To
Milestone

Created on 2015-11-18.18:27:30 by gh, last changed 2015-11-29.22:49:18 by gh.

Files
File name Status Uploaded Type Edit Remove
merge-darcs_patch_conflictmarking-into-darcs_patch_conflict.dpatch gh, 2015-11-18.18:27:29 application/x-darcs-patch
patch-preview.txt gh, 2015-11-18.18:27:29 text/x-darcs-patch
unnamed gh, 2015-11-18.18:27:29 text/plain
See mailing list archives for discussion on individual patches.
Messages
msg18846 (view) Author: gh Date: 2015-11-18.18:27:29
ConflictMarking was only imported from modules that also
imported Conflict, so let's reduce the number of modules in Darcs
a little.

I'm not screening it for a couple of days in case there is a good
reason to keep these modules separate.

1 patch for repository http://darcs.net:

patch 21a1966fed8f0bf19b4d63d2207b004c0d28b0c4
Author: Guillaume Hoffmann <guillaumh@gmail.com>
Date:   Wed Nov 18 15:27:45 ART 2015
  * merge Darcs.Patch.ConflictMarking into Darcs.Patch.Conflict
Attachments
msg18847 (view) Author: ganesh Date: 2015-11-18.20:39:01
I don't feel too strongly, but I think the two are logically 
distinct and would be slightly better in separate files.

Darcs.Patch.Conflict is the infrastructure for dealing with 
conflicts internally, whereas ConflictMarking is specifically about 
the marking in the working directory.
msg18848 (view) Author: gh Date: 2015-11-18.22:04:22
2015-11-18 17:39 GMT-03:00 Ganesh Sittampalam <bugs@darcs.net>:
>
> Ganesh Sittampalam <ganesh@earth.li> added the comment:
>
> I don't feel too strongly, but I think the two are logically
> distinct and would be slightly better in separate files.

I agree that they are logically different, but I'd like to slow down
"module inflation", especially with small modules (the new merged
module is 188 lines big); and also every module that imported
ConflictMarking already imported Conflicts.

The biggest problem I have with module inflation is the overall
compile time of Darcs. Also there is a cost of reading and
understanding two different modules that are related to the same
topic. (OTOH some modules like Darcs.Repository.Internals suffer the
opposite issue of being too big).

Similarly the current version of Darcs.Repository.Packs (which has
become quite small, 117 lines) could also be merged again into
Darcs.Repository.Clone (473)... In fact *this* merge is more obvious
than Conflict/ConflictMarking. I'm sending a patch soon.
msg18849 (view) Author: ganesh Date: 2015-11-18.23:05:37
On 18/11/2015 22:12, Guillaume Hoffmann wrote:

> The biggest problem I have with module inflation is the overall
> compile time of Darcs.

Good point - the number of modules has definitely got quite large
lately, partly with the hashed-storage import, and partly because I tend
to create new ones everytime I refactor something :-)

Do we know whether compilation time is more influenced by the number of
files or by code size?

I did a very dumb experiment that suggests that compile time actually
scales super-linearly with the size of a file (the counts were chosen
just to get realistic but not extreme runtimes):

---------------------
darcs-ghc710:[ganesh@paxton:~/temp/1]$ echo "module Foo where" > Foo.hs
; for n in `seq 1 3000` ; do echo "foo$n = $n" >> Foo.hs ; done

darcs-ghc710:[ganesh@paxton:~/temp/1]$ time ghc -c Foo.hs

real    0m6.901s
user    0m5.941s
sys     0m0.948s

darcs-ghc710:[ganesh@paxton:~/temp/1]$ echo "module Foo where" > Foo.hs
; for n in `seq 1 1500` ; do echo "foo$n = $n" >> Foo.hs ; done

darcs-ghc710:[ganesh@paxton:~/temp/1]$ time ghc -c Foo.hs

real    0m1.881s
user    0m1.476s
sys     0m0.395s
---------------------

Smaller files also means more chance of incremental compilation.

> Also there is a cost of reading and
> understanding two different modules that are related to the same
> topic.

True, though in this case I sort of feel that splitting them helps break
down what needs to be understood into digestible chunks. The fact that
neither imports the other is quite useful information in itself.

> also every module that imported ConflictMarking already imported
> Conflicts.

FWIW, quite a lot more modules import Conflicts than ConflictMarking. So
if ConflictMarking had some heavier dependencies than Conflicts, this
change would introduce more dependencies to things that import
Conflicts. (I don't think it actually does though)

> Similarly the current version of Darcs.Repository.Packs (which has
> become quite small, 117 lines) could also be merged again into
> Darcs.Repository.Clone (473)..

Again I kind of like the logical separation of the two but don't feel
too strongly.

FWIW in the longer term
[Backpack](https://ghc.haskell.org/trac/ghc/wiki/Backpack) should be
coming along and then I think we will be able to organise our code in a
more structured way inside a single file. That feels like the best of
both worlds to me, compilation times aside.

Cheers,

Ganesh
msg18850 (view) Author: gh Date: 2015-11-19.16:51:42
Ganesh, I asked on #ghc about your experiment, and the non-linear
slowdown is due to the lack of type signatures ("that is actually
non-linear behavior in the size of typechecker group"). Adding type
signatures to the definitions makes compile times go linear again.

(Chat:
<bg.....> rwbarton, I guess this the result of type inference?
<rw......> yes, I think type inference involves all definitions without
an explicit type signature simultaneously, in case they are mutually
recursive
<rw......> we could probably do better there, especially when the
definitions are actually not mutually recursive
<rw......> (or mutually interdependent at all)
)
msg18851 (view) Author: ganesh Date: 2015-11-19.22:40:32
I see, that's good. If compilation times are linear in the size of 
the code and not bumped up by the number of files, then 
merging/splitting is neutral from that perspective.

If (as I guess is likely) there's also a small fixed overhead per 
file, then merging will be a slight win there.

From my perspective go ahead and merge things where you still think 
it makes sense after this discussion.
msg18855 (view) Author: gh Date: 2015-11-20.15:42:24
Indeed there is a small win:

$ darcs diff -p merge |diffstat 
 darcs.cabal                        |    1 
 src/Darcs/Patch/Conflict.hs        |   97
++++++++++++++++++++++++++++++++++-
 src/Darcs/Patch/ConflictMarking.hs |  100
-------------------------------------
 src/Darcs/Patch/V1/Commute.hs      |    3 -
 src/Darcs/Patch/V2/Real.hs         |    4 -
 5 files changed, 96 insertions(+), 109 deletions(-)


I'm screening the patch. OTOH, I'm not going to merge Packs into Clone
because the resulting module seems quite big and complex.
msg18865 (view) Author: gh Date: 2015-11-29.22:49:18
Still no regrets, self-accept.
History
Date User Action Args
2015-11-18 18:27:30ghcreate
2015-11-18 20:39:02ganeshsetmessages: + msg18847
2015-11-18 22:04:23ghsetmessages: + msg18848
2015-11-18 23:05:38ganeshsetmessages: + msg18849
2015-11-19 16:51:43ghsetmessages: + msg18850
2015-11-19 22:40:32ganeshsetmessages: + msg18851
2015-11-20 15:42:25ghsetstatus: needs-screening -> needs-review
messages: + msg18855
2015-11-29 22:49:18ghsetstatus: needs-review -> accepted
messages: + msg18865