darcs

Patch 332 Add --import and --export to available f... (and 1 more)

Title: Add --import and --export to available f... (and 1 more)
Superseder:
Nosy List: ganesh, kowey, mornfall
Related Issues:
Status: obsoleted
Assigned To:
Milestone:

Created on 2010-08-07.13:19:09 by mornfall, last changed 2011-05-10.22:07:42 by darcswatch. Tracked on DarcsWatch.

Files
File name Status Uploaded Type
add-__import-and-__export-to-available-flags_.dpatch mornfall, 2010-08-07.13:19:09 text/x-darcs-patch
add-__import-and-__export-to-available-flags_.dpatch mornfall, 2010-08-07.13:25:31 text/x-darcs-patch
add-__import-and-__export-to-available-flags_.dpatch mornfall, 2010-08-09.11:41:13 text/x-darcs-patch
add-__import-and-__export-to-available-flags_.dpatch mornfall, 2010-08-09.18:24:33 text/x-darcs-patch
unnamed mornfall, 2010-08-07.13:19:09
unnamed mornfall, 2010-08-07.13:25:31
unnamed mornfall, 2010-08-09.11:41:13
unnamed mornfall, 2010-08-09.18:24:33
See mailing list archives for discussion on individual patches.
Messages
msg12018 Author: mornfall Date: 2010-08-07.13:19:09
Hi,

this is a first stab at darcs fast-export (anyone care to add a hidden alias,
if that's simple enough?). I have managed to convert a couple of repositories
just fine with this code (ghc-testsuite, darcs darcs repo, darcs-benchmark
repo, GHC currently running, at ~10000 / 20942).

It's likely that I have missed some corner cases (or even something completely
obvious, since I have very little clue on how to use git), but nothing can be
perfect the first time. To test:

$ mkdir repo.git
$ cd repo.git
$ git init
$ darcs convert --export ../repo | git fast-import
$ git checkout

and you should end up with a reasonable approximation of the darcs repository.
No renames are emitted by darcs convert --export and any merges are flattened
out. I also sometimes produce no-op "M" commands in the stream; I am not sure
how git handles those -- whether the file is mentioned in the commit unmodified
or it's dropped (although presumably neither is too much of a problem). Another
caveat is that git does not support empty directories: these will simply vanish
during conversion.
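
For readers unfamiliar with the stream format, here is a minimal Python sketch of what one commit in a git fast-import stream looks like when every change is written as an inline "M" command, as described above. It is not the code from the bundle: the function names, the refs/heads/master branch and the 100644 mode are assumptions made for illustration. Only files appear in the stream, which is why empty directories have no representation.

```python
def data_block(payload: bytes) -> bytes:
    """Encode a payload in fast-import's exact-byte-count 'data' format."""
    return b"data %d\n" % len(payload) + payload + b"\n"


def commit_block(mark, author, when, message, files, parent=None):
    """Build one 'commit' command for a git fast-import stream.

    `files` maps a path to its full new content (bytes). Each entry becomes
    an inline 'M' (filemodify) command; renames are never emitted.
    The branch name and file mode are illustrative assumptions.
    """
    out = [b"commit refs/heads/master\n", b"mark :%d\n" % mark]
    # `when` uses git's raw date format: seconds since the epoch plus offset.
    out.append(b"committer %s %s\n" % (author.encode(), when.encode()))
    out.append(data_block(message.encode()))
    if parent is not None:
        # A single 'from' keeps history linear, i.e. merges are flattened.
        out.append(b"from :%d\n" % parent)
    for path, content in sorted(files.items()):
        out.append(b"M 100644 inline %s\n" % path.encode())
        out.append(data_block(content))
    out.append(b"\n")
    return b"".join(out)
```

Writing the concatenated blocks to the standard input of git fast-import should produce one commit per block, each one referring to its predecessor by mark.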

I think that's it. Feedback welcome.

Yours,
   Petr.

2 patches for repository http://darcs.net/:

Sat Aug  7 15:08:44 CEST 2010  Petr Rockai <me@mornfall.net>
  * Add --import and --export to available flags.

Sat Aug  7 15:12:39 CEST 2010  Petr Rockai <me@mornfall.net>
  * Implement convert --export to generate a git fast-import stream.
msg12020 Author: mornfall Date: 2010-08-07.13:25:31
Hi again,

just amended out a minor fluke with sanitisation of email addresses... (I was
generating double opening <)...
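
The kind of sanitisation mentioned here can be sketched as follows. This is a hypothetical Python helper, not the code from the bundle: the name committer_ident and the "unknown" placeholder for a missing address are assumptions. It turns a free-form darcs author string into the "Name <email>" shape that a fast-import committer line expects, with exactly one pair of angle brackets.

```python
import re


def committer_ident(author: str) -> str:
    """Turn a darcs author string into 'Name <email>' (hypothetical helper)."""
    match = re.match(r"(.*?)\s*<([^<>]*)>", author)
    if match:
        name, email = match.group(1), match.group(2)
    elif "@" in author:
        # A bare address with no name part.
        name, email = "", author.strip()
    else:
        # No address at all; the placeholder is an assumption of this sketch.
        name, email = author.strip(), "unknown"
    # Drop stray angle brackets so the result never contains '<<'.
    name = name.replace("<", "").replace(">", "").strip()
    return f"{name} <{email.strip()}>".strip()
```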

Yours,
   Petr.

2 patches for repository http://darcs.net/:

Sat Aug  7 15:08:44 CEST 2010  Petr Rockai <me@mornfall.net>
  * Add --import and --export to available flags.

Sat Aug  7 15:25:32 CEST 2010  Petr Rockai <me@mornfall.net>
  * Implement convert --export to generate a git fast-import stream.
msg12069 Author: mornfall Date: 2010-08-09.11:41:13
Hi,

here comes my current (slightly in-progress) set of convert --import/--export
patches. The import code needs some features in hashed-storage that I haven't
even recorded yet, let alone pushed or (gasp) released. I will record/push them
later today so that people can build with this patchset...

Also, the bundle adds a dependency on datetime (because the date/time handling
in haskell is a total mess) and maybe more importantly attoparsec, which I felt
like learning so I implemented the fast-import dump parser in that.

As for --export, that part looks fairly OK and reliable. The --import-er,
that's a somewhat different story, since we are facing a paradigm boundary --
git representation of repositories is quite different from ours.

I believe that the final state of the --imported repository will always match
the current state of the corresponding git repository (assuming that there is a
single HEAD in the git repository; I have NO idea what happens when the git
repo is multiheaded). However, I can't say that with much certainty about tags,
unfortunately, since they are fairly elusive. We can also skip tags from time
to time, since I didn't implement any logic to create them out of order. You
will see warnings about that. You may also get duplicate tags. The latest tag
is the right one. I may add a cleanup pass that removes the earlier dups. (This
happens if the git branch/tag refs get interleaved in the dump file... don't
ask).

Oh, the tags we create have unsightly names like refs/tags/bla. I'll fix that
later.
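
The two tag fixes described above (a cleanup pass for duplicates and nicer names) could look roughly like this. It is a Python sketch under assumptions: the function name and the (ref, mark) input shape are invented for illustration and say nothing about how the importer stores tags.

```python
def clean_tags(tag_events):
    """Collapse tag events from a stream into {short name: mark}.

    `tag_events` is a list of (ref, mark) pairs in stream order
    (a hypothetical input shape). The 'refs/tags/' prefix is dropped
    and, when a tag shows up more than once, the latest one wins.
    """
    prefix = "refs/tags/"
    latest = {}
    for ref, mark in tag_events:
        name = ref[len(prefix):] if ref.startswith(prefix) else ref
        latest[name] = mark  # a later duplicate overwrites the earlier one
    return latest
```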

And, we pick arbitrary linearisation of the history. You will see warnings
about that, too. (Arbitrary in this case means the order in which commits come
in the stream; each out of order commit in the stream will give you a warning).
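
One way to read "arbitrary linearisation" is sketched below in Python. The assumption here, which the message does not spell out, is that a commit counts as out of order when its parent is not the commit that came immediately before it in the stream. The function name and input shape are hypothetical.

```python
def linearise(commits):
    """Linearise commits in stream order and collect warnings.

    `commits` is a list of (mark, parent_mark) pairs in the order they
    appear in the stream; parent_mark is None for a root commit.
    Returns (order, warnings).
    """
    order, warnings = [], []
    previous = None
    for mark, parent in commits:
        # Assumed definition: out of order means the parent is not the
        # commit we applied just before this one.
        if previous is not None and parent != previous:
            warnings.append(
                f"commit :{mark} does not follow :{previous} (parent: {parent})"
            )
        order.append(mark)
        previous = mark
    return order, warnings
```

With a small branchy history such as [(1, None), (2, 1), (3, 1), (4, 3)], the order is simply the stream order and only commit 3 triggers a warning.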

On the up side, both the importer and the exporter are pretty fast. I'll
elaborate on that when the feature is a bit more stable.

I'll drop a note here when the hashed-storage stuff is checked in. At that
point, I'd welcome it if people could do some test conversions. The normal
procedure is

(cd git-repo; git fast-export --all --progress 500) | darcs convert --import darcs-repo

if that fails, a more fool-proof (and more limited) result can be obtained by
replacing --all with HEAD in the fast-export above. The latter won't get you
any tags, sadly (since they are not present in the dump git produces for that
case).

Yours,
   Petr.

12 patches for repository http://darcs.net/:

Sat Aug  7 15:08:44 CEST 2010  Petr Rockai <me@mornfall.net>
  * Add --import and --export to available flags.

Sat Aug  7 15:25:32 CEST 2010  Petr Rockai <me@mornfall.net>
  * Implement convert --export to generate a git fast-import stream.

Sat Aug  7 17:15:14 CEST 2010  Petr Rockai <me@mornfall.net>
  * Also export commit dates in convert --export.

Mon Aug  9 01:25:46 CEST 2010  Petr Rockai <me@mornfall.net>
  * First version of a fast-importer (convert --import).

Mon Aug  9 04:13:08 CEST 2010  Petr Rockai <me@mornfall.net>
  * Recognize tags in convert --import (but ignore them for now).

Mon Aug  9 05:19:19 CEST 2010  Petr Rockai <me@mornfall.net>
  * Optimize convert --import.

Mon Aug  9 04:51:28 CEST 2010  Petr Rockai <me@mornfall.net>
  * Add a forgotten import.

Sun Aug  8 20:35:50 CEST 2010  Petr Rockai <me@mornfall.net>
  * Also export (clean) tags in convert --export.

Mon Aug  9 05:18:28 CEST 2010  Petr Rockai <me@mornfall.net>
  * Import author names, mail addresses and commit dates, in convert --import.

Mon Aug  9 12:14:42 CEST 2010  Petr Rockai <me@mornfall.net>
  * More robust convert --import, support (in-order) tags.

Mon Aug  9 13:24:28 CEST 2010  Petr Rockai <me@mornfall.net>
  * (Approximately) import multi-branch dumps in convert --import. Support tagging.

Mon Aug  9 13:25:58 CEST 2010  Petr Rockai <me@mornfall.net>
  * Produce slightly cleaner dumps in convert --export.
msg12072 Author: mornfall Date: 2010-08-09.18:24:33
Hi again,

this time I have recorded and pushed the hashed-storage changes. To test this
bundle, you will need to fetch hashed-storage HEAD as well (note that it also
happens to contain some experimental changes, so be wary).

The instructions in my original bundle still hold and code-wise, nothing much
has changed. I would be interested in hearing about some test conversions...

There's a couple of TODO items:

- the code does not work at all with plain (darcs-1) repos... don't even try;
  we probably just need to catch this at the UI level; I am not interested in
  writing darcs-1-format support code for this
- handling of branch resets in the fast-import stream is a bit wobbly...
- there seems to be a minor space leak (or, well, accumulation of data) in
  hashed-storage, although this may be fairly hard to fix (without regressing
  speed)... as long as your project has only a couple of tens of thousands of
  patches, it shouldn't be an issue
- there are some UI bugs, since I don't know how to use
  amInRepository/amNotInRepository properly with --import/--export; it might be
  best to just make convert a supercommand instead? can we have a supercommand
  take action even if it gets no subcommand?
- I'd like to add some checkpointing, since currently the conversion is all or
  nothing; if anything fails after 2 hours, you get no result (well, you can
  manually tweak the result to get a working darcs repo, but it's a bit icky);
  also, optimizing inventory in one go will fail with stack overflow on big
  repos (happened after about 50k patches) -- another reason to checkpoint
- the tag names are still icky; need to fix that (drop refs/tags/)
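
The checkpointing item could be approached along these lines. This is a generic Python sketch, not a design for darcs: the function names, the JSON state file and the batch size are all assumptions. It also assumes that, after a failure, the target repository can be rolled back to the state at the last checkpoint, because items after that checkpoint are applied again on resume.

```python
import json
import os


def save_checkpoint(state_path, done):
    """Record how many items are fully converted (atomic rename)."""
    tmp = state_path + ".tmp"
    with open(tmp, "w") as fh:
        json.dump({"done": done}, fh)
    os.replace(tmp, state_path)


def convert_with_checkpoints(items, apply_one, state_path, every=1000):
    """Apply `items` in order, checkpointing after every `every` items.

    On restart, items covered by the last checkpoint are skipped.
    `items` must be a list; `apply_one` does the actual conversion work.
    """
    done = 0
    if os.path.exists(state_path):
        with open(state_path) as fh:
            done = json.load(fh)["done"]
    for index, item in enumerate(items):
        if index < done:
            continue  # already converted in an earlier run
        apply_one(item)
        if (index + 1) % every == 0:
            save_checkpoint(state_path, index + 1)
    save_checkpoint(state_path, len(items))
```

Doing the expensive per-batch work (such as optimizing the inventory) at each checkpoint rather than once at the end is the part that might help with the stack overflow mentioned above, though that is speculation.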

Please do not push, but do test if possible. Thanks!

Yours,
   Petr.

13 patches for repository http://darcs.net/:

Sat Aug  7 15:08:44 CEST 2010  Petr Rockai <me@mornfall.net>
  * Add --import and --export to available flags.

Sat Aug  7 15:25:32 CEST 2010  Petr Rockai <me@mornfall.net>
  * Implement convert --export to generate a git fast-import stream.

Sat Aug  7 17:15:14 CEST 2010  Petr Rockai <me@mornfall.net>
  * Also export commit dates in convert --export.

Mon Aug  9 01:25:46 CEST 2010  Petr Rockai <me@mornfall.net>
  * First version of a fast-importer (convert --import).

Mon Aug  9 04:13:08 CEST 2010  Petr Rockai <me@mornfall.net>
  * Recognize tags in convert --import (but ignore them for now).

Mon Aug  9 05:19:19 CEST 2010  Petr Rockai <me@mornfall.net>
  * Optimize convert --import.

Mon Aug  9 04:51:28 CEST 2010  Petr Rockai <me@mornfall.net>
  * Add a forgotten import.

Sun Aug  8 20:35:50 CEST 2010  Petr Rockai <me@mornfall.net>
  * Also export (clean) tags in convert --export.

Mon Aug  9 05:18:28 CEST 2010  Petr Rockai <me@mornfall.net>
  * Import author names, mail addresses and commit dates, in convert --import.

Mon Aug  9 12:14:42 CEST 2010  Petr Rockai <me@mornfall.net>
  * More robust convert --import, support (in-order) tags.

Mon Aug  9 13:24:28 CEST 2010  Petr Rockai <me@mornfall.net>
  * (Approximately) import multi-branch dumps in convert --import. Support tagging.

Mon Aug  9 13:25:58 CEST 2010  Petr Rockai <me@mornfall.net>
  * Produce slightly cleaner dumps in convert --export.

Mon Aug  9 19:55:11 CEST 2010  Petr Rockai <me@mornfall.net>
  * Bump hashed-storage dependency to 0.5.3.
msg12073 Author: mornfall Date: 2010-08-09.19:25:49
Oh, this patch screws up witnesses. That should have gone into that TODO. 
For now, compile with -f-library.
msg12074 Author: mornfall Date: 2010-08-09.19:30:56
Oh, another TODO: git fast-export supports -M and -C to perform move and 
copy detection during the export and generate appropriate commands in the 
stream. However, this is not yet supported by darcs convert --import, 
although it would be quite beneficial.
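
To make the missing piece concrete: with -M and -C, the stream contains "R" (rename) and "C" (copy) lines next to the usual "M" and "D" lines. The Python sketch below parses those four file commands. It is an illustration only, with a hypothetical function name, and it assumes unquoted paths; real streams quote paths that contain spaces or special characters, which this sketch does not handle.

```python
def parse_file_command(line: str):
    """Parse one file-change line of a fast-import stream into a tuple.

    Handles M (modify), D (delete), R (rename) and C (copy).
    Assumes unquoted paths (a simplification of the real format).
    """
    kind, _, rest = line.partition(" ")
    if kind == "M":
        mode, dataref, path = rest.split(" ", 2)
        return ("modify", path, mode, dataref)
    if kind == "D":
        return ("delete", rest)
    if kind in ("R", "C"):
        source, dest = rest.split(" ", 1)
        return ("rename" if kind == "R" else "copy", source, dest)
    raise ValueError(f"unsupported file command: {line!r}")
```

An importer that understood the "rename" tuple could record a darcs move instead of a remove plus an add, which is where the benefit would come from.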
msg12075 Author: mornfall Date: 2010-08-09.19:35:53
Another TODO: binary files are not handled, we always generate text 
hunks.
msg12076 Author: mornfall Date: 2010-08-09.19:49:57
Scratch that about text hunks. We generate binary patches based on 
isFunky. But we ignore the venerable darcs "binary" regexen.
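
A rough Python sketch of the two checks being contrasted here. Both halves rest on assumptions that should be verified against the darcs source: that isFunky flags content containing a NUL or ^Z byte, and that the "binary" regexen are filename patterns of the kind kept in the binaries preference file. The function name and signature are hypothetical.

```python
import re


def is_binary(path, content, binary_patterns=()):
    """Decide whether a file should become a binary patch (sketch only).

    `binary_patterns` stands in for the darcs "binary" regexen, matched
    against the file name. The content check mimics what isFunky is
    assumed to do: look for a NUL or ^Z (0x1a) byte.
    """
    if any(re.search(pattern, path) for pattern in binary_patterns):
        return True
    return b"\x00" in content or b"\x1a" in content
```

In terms of this sketch, the current importer behaves as if binary_patterns were always empty.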
msg12097 Author: ganesh Date: 2010-08-10.19:52:46
General question - does this belong in darcs or should it be a client of 
the library?

My main concern is that there are probably various design choices 
particularly on the import side, and we might be committing to one of 
them by putting this in darcs.
msg12111 Author: kowey Date: 2010-08-11.09:33:48
That's a good question and actually it's a very nice thing for Grumpy 
Old Men to point out (the possibility of creating an external 
application).  

Personally, I'm attracted to the idea of some sort of  darcs-convert 
application, although I can see why it would be particularly 
convenient/compelling to have this in Darcs proper.  Note also that this 
discussion may tie into the plugin system proposed in issue1504.
msg12112 Author: mornfall Date: 2010-08-11.10:40:43
Hi,

so there are two things to consider. While I don't have strong opinions
about whether this should be internal or external, I am a bit concerned
about costs of making these external. Most specifically, I am a bit
overburdened with maintenance tasks already, and I don't want another
package added to that. So I see two options:

- We create a team-maintained "darcs-contrib" hackage package, which
  will contain various darcs-foo programs. We can make darcs call those
  programs when darcs foo is specified, maybe. Not sure about how help
  would be handled etc. Maybe we can really require ~/.darcs/plugins to
  describe any plugins that the user wants to use, and they can run
  darcs --register foo or something. Nevertheless, I am not exactly
  volunteering to do all that work.

- I keep working on the conversion code, but someone else takes over
  packaging and UI. If there are interested parties, that is.

(The third option is to just accept this into darcs itself. I don't see
any strings attached. We aren't bound to keep this working the same way
forever, and it's not even part of the core functionality, whether it
lives in the darcs binary or some-other-binary.)

Yours,
   Petr.

msg12126 Author: ganesh Date: 2010-08-11.21:34:33
Hi,

Re putting it in darcs, I guess it does or will allow repeated syncs from 
a git repo to a darcs repo? If so, you can imagine people starting to rely 
on it for synchronisation and being upset if the behaviour changed in a 
non-backwards compatible way. I think we do have some responsibility to 
maintain backwards compatibility in darcs itself.

Regarding the ongoing maintenance, I'm quite attracted to the idea of 
accepting it for maintenance by the darcs team, but with a somehow lower 
status than darcs itself. Rather than having a single darcs-contrib bucket 
for such things, I'd suggest it just be a separate package on hackage of 
its own. I'm also in no rush to add the loose coupling to darcs itself 
that you suggest, because I'd prefer it to be clearly separate.

I don't feel very strongly about this, but if we do put it into darcs I am 
quite keen on spending some time making sure it works in a way we're happy 
with and that has a reasonable degree of future-proofing.

Ganesh

msg12139 Author: kowey Date: 2010-08-12.12:38:56
On Wed, Aug 11, 2010 at 22:37:36 +0100, Ganesh Sittampalam wrote:
> Regarding the ongoing maintenance, I'm quite attracted to the idea
> of accepting it for maintenance by the darcs team, but with a
> somehow lower status than darcs itself.

+1 on that

> Rather than having a single
> darcs-contrib bucket for such things, I'd suggest it just be a
> separate package on hackage of its own.

And that

> I'm also in no rush to add the loose coupling to darcs itself that you
> suggest, because I'd prefer it to be clearly separate.

We could also think of the separation as being a really good opportunity
for us to improve the Darcs library.

Or maybe to develop a style of working on new features where we let them
stand alone for a little while (incubation phase) and then fold them
into Darcs as it becomes clear how important they are and how to fit
them in.

-- 
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
For a faster response, please try +44 (0)1273 64 2905.
msg12275 Author: kowey Date: 2010-08-23.08:03:18
Appears to be taken over by a standalone darcs-fastconvert (or similarly
named) application.