darcs

Issue 1392 Use Parsec for .authorspellings

Title: Use Parsec for .authorspellings
Priority: wishlist
Status: resolved
Milestone:
Resolved in:
Superseder:
Nosy List: caitt, darcs-devel, dmitry.kurochkin, jaredj, kowey, thorkilnaur, twb
Assigned To: caitt
Topics: ProbablyEasy

Created on 2009-03-14.07:22:10 by twb, last changed 2010-03-15.11:00:49 by caitt.

Messages
msg7455 Author: twb Date: 2009-03-14.07:22:08
I think it'd be better to use Parsec to parse .authorspellings, rather
than an ad-hoc Perl-style use of primitives "dropWhile isSpace".

We already require Parsec for other parts of Darcs, so it would not
add a new dependency.
msg7456 Author: twb Date: 2009-03-14.07:33:28
On Sat, Mar 14, 2009 at 07:22:11AM -0000, Trent Buck wrote:
> I think it'd be better to use Parsec to parse .authorspellings,
> rather than an ad-hoc Perl-style use of primitives "dropWhile
> isSpace".

While I'm on the subject, I note that currently

- only Unix-style EOLs are supported;

- comments cannot start with whitespace;

- canonical names cannot contain a comma (e.g. "Chuck Jones,
  Jr. <chuck@pobox.com>");

- regexps cannot contain commas; and

- regexps cannot begin or end in [[:space:]].

The last point is not so bad, since you can just prefix a leading
space with a backslash (but not a trailing space).  The point about ",
Jr." worries me, though.

Since regex already has foo|bar I suggest only supporting one regexp.

I also don't see the point of having canonical names *without* a
regexp.
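
To make the ", Jr." concern concrete, here is a minimal sketch of ad-hoc
field splitting in the "dropWhile isSpace" style. It is not the actual Darcs
code; naiveFields is a hypothetical name, and the only assumption is that
fields are split at every comma with no escape handling.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Hypothetical helper: split a line at every comma, then trim each field.
-- There is no escape handling, so a comma inside a name ends the field.
naiveFields :: String -> [String]
naiveFields = map trim . splitOnComma
  where
    trim = dropWhileEnd isSpace . dropWhile isSpace
    splitOnComma s = case break (== ',') s of
      (field, [])       -> [field]
      (field, _ : rest) -> field : splitOnComma rest
```

With this approach, "Chuck Jones, Jr. <chuck@pobox.com>, ^chuck@" yields
three fields, so "Jr. <chuck@pobox.com>" would be read as a regexp rather
than as part of the canonical name.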
msg9453 Author: caitt Date: 2009-11-22.16:13:46
I've submitted a patch (#94) that addresses the majority of the issues raised here:
 1) I reimplemented the parser using Parsec.
 2) I added an escape mechanism, so commas can now be part of names and regexps.
 3) Because whitespace is stripped, I don't think there is any EOL issue.
 4) To start or end a regexp with a space, just use class syntax: [ ]
 5) A comma-separated list is, in my opinion, better than a single regexp because:
   - alternatives are clumsy;
   - many people just list multiple e-mail addresses, and it is more natural to
     write them down as a comma-separated list than to encode them in one big regexp;
   - changing the format is not backward compatible: all users would have to
     change their .authorspellings files.
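
The points above can be sketched with Parsec roughly as follows. This is an
illustration, not the code from patch #94: the names field and entry are
hypothetical, and a backslash followed by a comma is assumed to be the escape
sequence, which the message does not specify.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)
import Text.Parsec
import Text.Parsec.String (Parser)

-- One field: a run of characters in which "\," (assumed escape syntax)
-- stands for a literal comma. Surrounding whitespace is stripped.
field :: Parser String
field = trim <$> many (try (string "\\," >> return ',') <|> noneOf ",\n")
  where
    trim = dropWhileEnd isSpace . dropWhile isSpace

-- One line: a canonical name followed by zero or more comma-separated regexps.
entry :: Parser (String, [String])
entry = do
  name <- field
  pats <- many (char ',' >> field)
  eof
  return (name, pats)
```

Under these assumptions, parse entry "" applied to the line
Chuck Jones\, Jr. <chuck@pobox.com>, ^chuck@ gives the canonical name
"Chuck Jones, Jr. <chuck@pobox.com>" and the single regexp "^chuck@".
Because each field is trimmed, a regexp that must start or end with a space
needs the [ ] class syntax from point 4.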

There are a few issues left:
 - the canonical email address is now mandatory (the previous documentation
   suggested this, but the implementation did not require it);
 - malformed regexps are not handled nicely;
 - the matching process is somewhat error-prone: people can accidentally take
   over patches that don't belong to them. It is not unrealistic that two people
   will have regexps that match the same author string.
msg9656 Author: caitt Date: 2009-12-22.16:20:16
The following patch updated the status of issue1392 to be resolved:

* resolve issue1392: use parsec to parse .authorspelling 
Ignore-this: 491f823882e9e2c9a49ad303ee6fb4ba
.authorspelling file is now parsed using parsec.
Parser reports errors (only affected line is discarded).
Added escaping of commas.
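
The "only affected line is discarded" behaviour can be sketched by parsing
each line independently and separating failures from successes. This is an
assumption about the general shape, not the code from the patch: entry is a
simplified stand-in parser and parseLines is a hypothetical name.

```haskell
import Data.Either (partitionEithers)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Simplified stand-in for the real line parser: "name, pattern".
entry :: Parser (String, String)
entry = do
  name <- many1 (noneOf ",\n")
  _ <- char ','
  pat <- many1 (noneOf ",\n")
  eof
  return (name, pat)

-- Parse every line on its own; a malformed line yields a ParseError to
-- report, while the remaining lines are still accepted.
parseLines :: String -> ([ParseError], [(String, String)])
parseLines = partitionEithers . map (parse entry ".authorspellings") . lines
```

A caller could print the collected errors as warnings and continue with the
entries that parsed.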
msg10211 Author: caitt Date: 2010-03-15.11:00:41
The following patch updated the status of issue1392 to be resolved:

* resolve issue1392: use parsec to parse .authorspelling 
Ignore-this: 491f823882e9e2c9a49ad303ee6fb4ba
.authorspelling file is now parsed using parsec.
Parser reports errors (only affected line is discarded).
Added escaping of commas.
History
Date                 User    Action  Args
2009-03-14 07:22:10  twb     create
2009-03-14 07:33:30  twb     set     status: unread -> unknown
                                     nosy: kowey, simon, twb, thorkilnaur, dmitry.kurochkin
                                     messages: + msg7456
2009-04-09 11:01:23  kowey   set     priority: wishlist
                                     nosy: + jaredj
                                     topic: + ProbablyEasy
2009-08-11 18:07:23  kowey   set     status: unknown -> needs-implementation
                                     nosy: kowey, simon, twb, thorkilnaur, jaredj, dmitry.kurochkin
2009-08-25 17:42:06  admin   set     nosy: + darcs-devel, - simon
2009-08-27 14:23:14  admin   set     nosy: kowey, darcs-devel, twb, thorkilnaur, jaredj, dmitry.kurochkin
2009-11-22 16:13:49  caitt   set     status: needs-implementation -> has-patch
                                     nosy: + caitt
                                     messages: + msg9453
                                     assignedto: caitt
2009-12-05 22:57:04  ganesh  link    patch94 issues
2009-12-22 16:20:18  caitt   set     status: has-patch -> resolved
                                     messages: + msg9656
2010-03-15 11:00:49  caitt   set     messages: + msg10211