darcs

Patch 1849 move PrimPatchId into its own module (and 1 more)

Title move PrimPatchId into its own module (and 1 more)
Superseder Nosy List ganesh
Related Issues
Status accepted Assigned To
Milestone

Created on 2019-07-13.18:58:35 by ganesh, last changed 2019-08-06.09:45:16 by ganesh.

Files
File name Status Uploaded Type Edit Remove
automatically-specialize-merging-and-equality-for-sequences-of-patches-with-identity.dpatch dead bfrk, 2019-07-14.10:27:55 application/x-darcs-patch
move-primpatchid-into-its-own-module.dpatch dead ganesh, 2019-07-13.18:58:35 application/x-darcs-patch
patch-preview.txt ganesh, 2019-07-13.18:58:35 text/x-darcs-patch
patch-preview.txt bfrk, 2019-07-14.10:27:55 text/x-darcs-patch
patch-preview.txt bfrk, 2019-07-14.11:46:35 text/x-darcs-patch
refactor-the-interface-for-creating-named-patches-from-prims.dpatch bfrk, 2019-07-14.11:46:35 application/x-darcs-patch
unnamed ganesh, 2019-07-13.18:58:35 text/plain
unnamed bfrk, 2019-07-14.10:27:55 text/plain
unnamed bfrk, 2019-07-14.11:46:35 text/plain
See mailing list archives for discussion on individual patches.
Messages
msg20927 (view) Author: ganesh Date: 2019-07-13.18:58:35
This is to illustrate what making the Int in PrimPatchId
abstract would look like.

As well as the new module, I had to make the darcs library
depend directly on QuickCheck so I could write the Arbitrary
instance behind the encapsulation. I also had to make the
code for generating an anonymous id visible.

This is not the final version of the patches: I realised half-way
through that I'd been inconsistent about
"serial" versus "sequence" (the latter is the word the existing
comment uses), and I also haven't documented the new type.

If you (Ben) are happy with the concept I'll clean them up before
pushing to screened.

2 patches for repository darcs-unstable@darcs.net:screened:

patch 3c3a01f97f9b743436153056626096bfd188b13f
Author: Ganesh Sittampalam <ganesh@earth.li>
Date:   Sat Jul 13 18:17:37 BST 2019
  * move PrimPatchId into its own module
  
  This is a necessary precursor to making the serial numbers
  properly abstract, because fromPrim will need to refer to
  the serial number type creating a difficult dependency loop.

patch dad75d8cdbc0b75b9fe05acd8c89fd96f31c471c
Author: Ganesh Sittampalam <ganesh@earth.li>
Date:   Sat Jul 13 18:59:27 BST 2019
  * hide the encoding of PrimPatchId sequence numbers
Attachments
msg20929 (view) Author: bfrk Date: 2019-07-13.22:30:37
I have looked at your changes. While in general I appreciate the idea of
making types abstract, I think including QC as a dependency here goes a
bit over the top, given that we export data constructors almost
everywhere. I'm assuming you don't actually plan to invest the effort of
pulling all Arbitrary instances into darcslib and making all the patch
types abstract... and even if you did plan on doing that I am not sure I
would agree that this is a good move.

If you really believe we need an abstract interface to PrimPatchIds,
wouldn't it be better to make it powerful enough so that an instance
Arbitrary can be defined without accessing the internals?

You expose the type PrimPatchSerial and functions firstPrimPatchSerial
and nextPrimPatchSerial. Isn't this, more or less, encoding natural
numbers a la Peano? There is a reason we use natural numbers: they are
the canonical implementation of this interface, i.e. all others are
isomorphic to it. What your interface is telling me is: a PrimPatchId
behaves as if it consisted of a SHA1, a natural number, and a signedness
flag.

My immediate idea to improve the API was to provide serial numbers as an
infinite list (one symbol to export, instead of two):

positivePrimPatchSerials :: [PrimPatchSerial]

But, looking at how this is actually used, we can do much better:

positivePrimPatchIds :: SHA1 -> [PrimPatchId]

that is, we don't have to expose the serial numbers at all. And taking
this further, can we get rid of the SHA1 as well? I am thinking of a
"subordinate PatchId" abstraction here: given a parent PatchId, generate
subordinate PatchIds for children.

The fromPrim method is used only to construct Named patches. And it has
this particular type only because it should work for Prims with and
without PatchId. What if we pull the PatchId family out of Ident:

type family PatchId (p :: *->*->*)

class (SignedId (PatchId p), Eq2 p) => Ident p where
  ident :: p wX wY -> PatchId p

This would allow us to define

class FromPrim p where
  fromPrim :: PatchId p -> (PrimOf p) wX wY -> p wX wY

without the requirement of an instance Ident p. Patches without identity
could define

type instance PatchId MyPatchType = ()

Taking up your idea to define container classes, we could then abstract
subordinate PatchIds using something like

class SubPatchId f where
  subPatchId :: PatchId (f p) -> PatchId p

or perhaps make it return a list. This is unfinished and untested and
probably doesn't even compile. But I think you get the idea: fix the
ugly type of fromPrim and find a better abstraction to express that
PatchIds for RepoPatches are derived from that of their parent
(container). I am not even sure the name PrimPatchId is appropriate:
from the outside it appears only as the PatchId of a RepoPatchV3.
msg20931 (view) Author: bfrk Date: 2019-07-14.10:27:55
Here is a alternative patch that implements the ideas I sketched in my
comment, except the last one about abstracting subordinate PatchIds. This
means we use PatchInfo in the interface which I think is okay for now.

I am not sure why it depends on the equality stuff; it should be easy to
rebase it to avoid that dependency.

3 patches for repository http://darcs.net/screened:

patch de792b6ffc34f785e7f02463d6ab503fdd465207
Author: Ben Franksen <ben.franksen@online.de>
Date:   Fri Jun 28 20:25:32 CEST 2019
  * automatically specialize merging and equality for sequences of patches with identity
  
  The idea is to allow patch sequences to adapt their behavior depending on
  whether the patch type has an Ident instance or not. This is done for two
  reasons:
  (1) equality tests for, and removing elements from sequences of patches is
      much faster if we can use PatchIds;
  (2) merging sequences of patches with identity must first remove duplicate
      patches, otherwise these would be wrongly treated as conflicting.
  The new class Sequence abstracts over the basic operations we want to
  specialize for FL and RL: equality and removing elements. Similarly, class
  Merge gets a new method mergeFLs.
  By using the new DerivingVia extension we can derive the Sequence instances
  for patch types with an Ident instance from one template definition for a
  suitable newtype wrapper named HasIdent.
  The fastRemoveXXX functions are now gone from the Patch API and only used
  internally inside D.P.Ident, the ugly IdEq2 class could be fully removed.

patch 1362fe7224611f2b5593ce90406551a9b21cf3c6
Author: Ben Franksen <ben.franksen@online.de>
Date:   Sun Jun 30 21:26:07 CEST 2019
  * clean up patch equality
  
  We make a clear distinction between semantic and structural equality. The
  former is captured by Eq2, the latter by the new class StructuralEq2. We use
  nominal equality as optimization for semantic equality where available.
  Structural equality is understood to be local; it should not refer to
  structural equality of its components but rather to their semantic equality.
  Thus the new definitions for StructuralEq2 are quite similar to those which
  previously defined Eq2. Some of the equality definitions have been cleaned
  up and streamlined, for instance the two basic prim patch types now use
  deriving Eq instead of manually written definitions.
  There remain no /uses/ of unsafeCompare throughout the code except in
  RepoPatchV1 and in the default method definitions for Eq2. It still appears
  in many places to define Eq2, though, since this is quite convenient.

patch 9f4f344a0e0b1179793142179974b534f7775246
Author: Ben Franksen <ben.franksen@online.de>
Date:   Sun Jul 14 11:19:35 CEST 2019
  * refactor the interface for creating Named patches from prims
  
  This uncouples the type family PatchId from class Ident, so we can use it
  for patch types that have no instance Ident. This allows us to change the
  type of method fromPrim to receive a 'PatchId p' instead of the raw
  ingredients of a PrimPatchId. We add method fromPrims so we can delegate
  generating prim patch ids to the implementation of the RepoPatch type. The
  default method definitions for fromPrim and fromPrims ignore their input, so
  we don't even have to create dummy type instances for PatchId.
  
  We add the function positivePrimPatchIds to the interface of NamedPrim to
  support implementation of fromPrims for RepoPatchV3. This hides the serial
  numbers from client code which are now an implementation detail of
  NamedPrim. The function takes a PatchInfo as input, so the fact that we use
  the hash of the PatchInfo underneath is now also an implementation detail. A
  single exception is made with unsafePrimPatchId which we need (only) in the
  harness to define an instance Arbitrary.
Attachments
msg20932 (view) Author: ganesh Date: 2019-07-14.10:28:51
On 13/07/2019 23:30, Ben Franksen wrote:
> 
> Ben Franksen <ben.franksen@online.de> added the comment:
> 
> I have looked at your changes. While in general I appreciate the idea of
> making types abstract, I think including QC as a dependency here goes a
> bit over the top, given that we export data constructors almost
> everywhere. I'm assuming you don't actually plan to invest the effort of
> pulling all Arbitrary instances into darcslib and making all the patch
> types abstract... and even if you did plan on doing that I am not sure I
> would agree that this is a good move.

No - but it is an acknowledgement that sometimes test code best lives in
the modules under test and so darcslib might need to have some testing
dependencies.

I do actually have a bad reason for moving all our test code into
darcslib, which is that I use ghcid for my development workflow and
being able to typecheck all relevant code in a single cabal (new-)repl
invocation makes that a lot easier. But much of our test code is too
horrible to put in darcslib anyway :-)

If we can make the TestOnly nullary type class idea work then I think
that tilts the balance a bit towards putting test code in darcslib.


> If you really believe we need an abstract interface to PrimPatchIds,
> wouldn't it be better to make it powerful enough so that an instance
> Arbitrary can be defined without accessing the internals?

I'm not really sure how to make it powerful enough without just exposing
the ability to construct one from an arbitrary natural. Which we may
conclude we should do.

> You expose the type PrimPatchSerial and functions firstPrimPatchSerial
> and nextPrimPatchSerial. Isn't this, more or less, encoding natural
> numbers a la Peano? There is a reason we use natural numbers: they are
> the canonical implementation of this interface, i.e. all others are
> isomorphic to it. What your interface is telling me is: a PrimPatchId
> behaves as if it consisted of a SHA1, a natural number, and a signedness
> flag.

Agreed. The thing I dislike most about the current code is that callers
can see the negative numbers and might accidentally misuse them. If it
exposed (Natural, Bool) I probably wouldn't have been so motivated to
change it.

> My immediate idea to improve the API was to provide serial numbers as an
> infinite list (one symbol to export, instead of two):
> 
> positivePrimPatchSerials :: [PrimPatchSerial]

That was also my initial idea, until I remembered that lists can be
finite, so the consuming code would need an error case for hitting an
empty list. Or I'd have to write or import a stream type, which seemed
like a bit more overhead/hassle. But given your instinct was the same
way, using Data.Stream seems worthwhile.

> But, looking at how this is actually used, we can do much better:
> 
> positivePrimPatchIds :: SHA1 -> [PrimPatchId]
> 
> that is, we don't have to expose the serial numbers at all. And taking
> this further, can we get rid of the SHA1 as well? I am thinking of a
> "subordinate PatchId" abstraction here: given a parent PatchId, generate
> subordinate PatchIds for children.

This seems like a good idea. The interface could be

normalPrimPatchIds :: PatchInfo -> Stream PrimPatchId

I prefer 'normal'/'inverted' to 'positive'/'negative', but I don't feel
too strongly.


> The fromPrim method is used only to construct Named patches. And it has
> this particular type only because it should work for Prims with and
> without PatchId. What if we pull the PatchId family out of Ident:

> 
> type family PatchId (p :: *->*->*)
> 
> class (SignedId (PatchId p), Eq2 p) => Ident p where
>   ident :: p wX wY -> PatchId p
> 
> This would allow us to define
> 
> class FromPrim p where
>   fromPrim :: PatchId p -> (PrimOf p) wX wY -> p wX wY
> 
> without the requirement of an instance Ident p. Patches without identity
> could define
> > type instance PatchId MyPatchType = ()

This also seems like a good idea which might also help with
loop-breaking the imports for FromPrim.

> Taking up your idea to define container classes, we could then abstract
> subordinate PatchIds using something like
> 
> class SubPatchId f where
>   subPatchId :: PatchId (f p) -> PatchId p
> 
> or perhaps make it return a list. This is unfinished and untested and
> probably doesn't even compile. But I think you get the idea: fix the
> ugly type of fromPrim and find a better abstraction to express that
> PatchIds for RepoPatches are derived from that of their parent
> (container). I am not even sure the name PrimPatchId is appropriate:
> from the outside it appears only as the PatchId of a RepoPatchV3.

The SubPatchId idea seems plausible, though probably for a future
refactoring. And even though I suggested it for testing purposes, I'm
unsure about explicitly enforcing that our Repo patch types have a 'f p'
structure for general code.

I'll have a play with streams and type family PatchId and see if I can
get a reasonable incremental refactoring we can then build on.
msg20934 (view) Author: bfrk Date: 2019-07-14.11:31:27
>> If you really believe we need an abstract interface to PrimPatchIds,
>> wouldn't it be better to make it powerful enough so that an instance
>> Arbitrary can be defined without accessing the internals?
> 
> I'm not really sure how to make it powerful enough without just exposing
> the ability to construct one from an arbitrary natural. Which we may
> conclude we should do.

There may be a solution. All we require from the Arbitrary instance is
to create (hopefully) unique PrimPatchIds, since it is only used when
testing NamedPrim wrappers in isolation.

My "counter proposal" patch currently compromises by adding an
unsafePrimPatchId for testing only.  We could change this to

unsafeSinglePrimPatchId :: PatchInfo -> PrimPatchid
unsafeSinglePrimPatchId info = PrimPatchId 1 (makePatchname info)

and rely on an instance Arbitrary PatchInfo that generates "unique
enough" PatchInfos, which it would have to do anyway.

>> You expose the type PrimPatchSerial and functions firstPrimPatchSerial
>> and nextPrimPatchSerial. Isn't this, more or less, encoding natural
>> numbers a la Peano? There is a reason we use natural numbers: they are
>> the canonical implementation of this interface, i.e. all others are
>> isomorphic to it. What your interface is telling me is: a PrimPatchId
>> behaves as if it consisted of a SHA1, a natural number, and a signedness
>> flag.
> 
> Agreed. The thing I dislike most about the current code is that callers
> can see the negative numbers and might accidentally misuse them. If it
> exposed (Natural, Bool) I probably wouldn't have been so motivated to
> change it.
> 
>> My immediate idea to improve the API was to provide serial numbers as an
>> infinite list (one symbol to export, instead of two):
>>
>> positivePrimPatchSerials :: [PrimPatchSerial]
> 
> That was also my initial idea, until I remembered that lists can be
> finite, so the consuming code would need an error case for hitting an
> empty list. Or I'd have to write or import a stream type, which seemed
> like a bit more overhead/hassle. But given your instinct was the same
> way, using Data.Stream seems worthwhile.

I would be happy with an error case here (see my patch), but I am not
opposed to add a dependency on Data.Stream to expose the infinite list
requirement to the type checker.

>> But, looking at how this is actually used, we can do much better:
>>
>> positivePrimPatchIds :: SHA1 -> [PrimPatchId]
>>
>> that is, we don't have to expose the serial numbers at all. And taking
>> this further, can we get rid of the SHA1 as well? I am thinking of a
>> "subordinate PatchId" abstraction here: given a parent PatchId, generate
>> subordinate PatchIds for children.
> 
> This seems like a good idea. The interface could be
> 
> normalPrimPatchIds :: PatchInfo -> Stream PrimPatchId
> 
> I prefer 'normal'/'inverted' to 'positive'/'negative', but I don't feel
> too strongly.

Me neither. If you want to change that, you should also rename class
SignedId and its methods.


>> Taking up your idea to define container classes, we could then abstract
>> subordinate PatchIds using something like
>>
>> class SubPatchId f where
>>   subPatchId :: PatchId (f p) -> PatchId p
>>
>> or perhaps make it return a list. This is unfinished and untested and
>> probably doesn't even compile. But I think you get the idea: fix the
>> ugly type of fromPrim and find a better abstraction to express that
>> PatchIds for RepoPatches are derived from that of their parent
>> (container). I am not even sure the name PrimPatchId is appropriate:
>> from the outside it appears only as the PatchId of a RepoPatchV3.
> 
> The SubPatchId idea seems plausible, though probably for a future
> refactoring.

Agreed. I played a bit with that but couldn't get it to work. I came to
the conclusion that this abstraction is not needed.

> I'll have a play with streams and type family PatchId and see if I can
> get a reasonable incremental refactoring we can then build on.
After I wrote my previous message I couldn't resist and implemented all
of it in one go, see the "counter proposal" patch I sent. Feel free to
copy and paste from it...
msg20935 (view) Author: bfrk Date: 2019-07-14.11:46:35
Here is a rebased version of my "counter proposal" without the dependencies.

1 patch for repository http://darcs.net/screened:

patch 0c5bd263bbcb3e4298db80ece8b21443d3ceac43
Author: Ben Franksen <ben.franksen@online.de>
Date:   Sun Jul 14 11:19:35 CEST 2019
  * refactor the interface for creating Named patches from prims
  
  This uncouples the type family PatchId from class Ident, so we can use it
  for patch types that have no instance Ident. This allows us to change the
  type of method fromPrim to receive a 'PatchId p' instead of the raw
  ingredients of a PrimPatchId. We add method fromPrims so we can delegate
  generating prim patch ids to the implementation of the RepoPatch type. The
  default method definitions for fromPrim and fromPrims ignore their input, so
  we don't even have to create dummy type instances for PatchId.
  
  We add the function positivePrimPatchIds to the interface of NamedPrim to
  support implementation of fromPrims for RepoPatchV3. This hides the serial
  numbers from client code which are now an implementation detail of
  NamedPrim. The function takes a PatchInfo as input, so the fact that we use
  the hash of the PatchInfo underneath is now also an implementation detail. A
  single exception is made with unsafePrimPatchId which we need (only) in the
  harness to define an instance Arbitrary.
Attachments
msg20936 (view) Author: ganesh Date: 2019-07-14.17:32:26
On 14/07/2019 11:27, Ben Franksen wrote:

> Here is a alternative patch that implements the ideas I sketched in my
> comment, except the last one about abstracting subordinate PatchIds. This
> means we use PatchInfo in the interface which I think is okay for now.

The rebased version looks good to me. Please go ahead and push it to
screened if it's ready, and also consider it reviewed. I've marked the
other bundles in this patch as "dead" in the tracker.
History
Date User Action Args
2019-07-13 18:58:35ganeshcreate
2019-07-13 22:30:38bfrksetmessages: + msg20929
2019-07-14 10:27:55bfrksetfiles: + patch-preview.txt, automatically-specialize-merging-and-equality-for-sequences-of-patches-with-identity.dpatch, unnamed
messages: + msg20931
2019-07-14 10:28:51ganeshsetmessages: + msg20932
2019-07-14 11:31:27bfrksetmessages: + msg20934
2019-07-14 11:46:35bfrksetfiles: + patch-preview.txt, refactor-the-interface-for-creating-named-patches-from-prims.dpatch, unnamed
messages: + msg20935
2019-07-14 17:32:26ganeshsetmessages: + msg20936
2019-07-15 23:31:47bfrksetstatus: needs-screening -> needs-review
2019-07-28 14:10:14ganeshsetstatus: needs-review -> accepted-pending-tests
2019-08-06 09:45:16ganeshsetstatus: accepted-pending-tests -> accepted