Patch 1623 OldDate belongs to Darcs.Util since it has no patch-related code

Title OldDate belongs to Darcs.Util since it has no patch-related code
Superseder Nosy List gh
Related Issues
Status accepted Assigned To

Created on 2017-10-17.16:33:56 by gh, last changed 2017-10-26.12:52:52 by gh.

File name Status Uploaded Type Edit Remove
OldDate_IsoDate.diff gh, 2017-10-18.14:40:54 text/x-patch
olddate-belongs-to-darcs_util-since-it-has-no-patch_related-code.dpatch gh, 2017-10-17.16:33:56 application/octet-stream
remove-darcs_util_olddate.dpatch bf, 2017-10-21.14:52:25 application/x-darcs-patch
remove-olddate-module-and-old-_2003_-darcs-date-format-parsing.dpatch gh, 2017-10-19.12:50:22 application/octet-stream
See mailing list archives for discussion on individual patches.
msg19762 (view) Author: gh Date: 2017-10-17.16:33:56
This patch's name should be self explanating.

The OldDate module was moved to Darcs.Patch in 2009 by patch "Move
OldDate module to Darcs.Patch.OldDate" before we had a clear separation
in Darcs.Util, Darcs.Patch, Darcs.Repository and Darcs.UI.
msg19763 (view) Author: gh Date: 2017-10-18.11:54:49
I'm currently checking whether we need OldDate at all. It seems that it
remained around so that Darcs can handle repositories created before
2004 (which IMO is not worth having a dedicated module in Darcs).  

I have successfully removed it and used the functions of IsoDate
instead, and at least all tests passed.
msg19764 (view) Author: gh Date: 2017-10-18.14:40:54
Indeed, comparing `darcs log` between the current code and the one that
uses module IsoDate instead of OldDate, the only differences are the
hours of patches before October of 2003 (see attached diff of darcs
log's output).
msg19765 (view) Author: bf Date: 2017-10-18.19:31:28
Is it possible to find the exact reason for the difference? Perhaps we
can add some special case handling in Darcs.Patch.Info, wrapping the
IsoDate calls, so we get the full compatibility w/o having to keep the
OldDate module forever. I took a quick look at the diffs between the two
modules and it seems to me that the relevant parts are pretty similar.

Also, while we're at it, I think src/DateTester.hs should be integrated
in the test suite.
msg19767 (view) Author: gh Date: 2017-10-19.12:50:22
Here is a followup patch that removes the OldDate module and any trace
of the old date format parser. I'm probably going to screen it soon.
msg19768 (view) Author: gh Date: 2017-10-19.14:12:36
Sorry Ben, I missed your mail before sending my message.

> Is it possible to find the exact reason for the difference?

With a little investigation on the aforementioned modules with `darcs
log -i`, we can find that IsoDate was created a long time ago by:

patch 130287974273597bd2af640e67569de45d1f7b32
Author: David Roundy <droundy@abridgegame.org>
Date:   Wed Oct 15 10:08:39 -03 2003
  * create repositories with new patch filename format.

Also take a look at the following interesting patch (last paragraph of
its description):

patch bebb7c18e241f77d3422a577f98d9127da959a76
Author: Eric Kow <eric.kow@gmail.com>
Date:   Sat Aug 12 07:20:34 -03 2006
  * Be explicit about timezone handling (issue220); assume local by default.
  Except for the local timezone in the user interface, this patch is not
  expected to change darcs's behaviour.  It merely makes current practice
  - Assume local timezone when parsing date strings from the user
    interface (previous behaviour was assuming UTC).
  - Assume UTC timezone when parsing date strings from PatchInfo.
    Newer patch date strings do *not* specify the timezone, so it
    would be prudent to treat these as UTC.
  - Disregard timezone information altogether when reading patch
    dates (issue220).  Note that this bug was not caused by assuming local
    timezone, because legacy patch date strings explicitly tell you what
    the timezone to use.  The bug was caused by a patch that fixed
    issue173 by using timezone information correctly.  To preserve
    backwards-compatability, we deliberatly replicate the incorrect
    behaviour of overriding the timezone with UTC.

Now, Darcs has always had a history of maintaining backwards
compatibility for a lot of thing, but IMO in 2017 we should not care
anymore about correctly reading the timezone of patches created in
2002/2003 (of which most probably the only surviving exemplars live in
Darcs's own repository!).

By this I mean we should just remove the related code and docs and not
even try replacing it.

I agree about DateTester, it should rather be replaced by unit tests or
shell tests.
msg19769 (view) Author: bf Date: 2017-10-19.18:06:20
gh: I have been reading about issue220. I must say I find it scary. What
happens there is that we read the wrong (non-existing) patch file
because of a difference in how dates are parsed. Apparently the date of
the patch is (was?) used to generate part of the patch file name. Are
you sure that throwing away the OldDate module will not re-introduce
issue220 or similar problems? It doesn't look like we have a regression
test for issue220, perhaps we should add one just to be sure?
msg19770 (view) Author: bf Date: 2017-10-19.18:46:56
Hm, it looks like darcs used a different format for the file names back
then. Current repos do not have this date string prepended to the hash.
Is this the old (non-hashed) repo format? If so, it may be sufficient to
check that we can still read (clone) old style repos and write them in
the hashed format without loosing any vital information. (And I agree
that the exact hour of the date is not vital.)
msg19771 (view) Author: gh Date: 2017-10-19.19:05:25
Indeed issue220 happens only in old-fashioned repos, since inside of
them, patch and inventories filenames contain the date of the patch
(these filenames are generated by Darcs.Patch.Info.makeFilename).

The test convert-darcs2.sh converts OF repositories to hashed and
compares bundles created by both versions of the repositories, so in
particular patch metadata (with the date) is comparted; this test pass
with the last patch I attached.

A concrete case in which issue220 would bite someone would be an OF
repository created by a Darcs version from 2003 (or 2002). The only case
close to this that we know of, is Darcs' repository itself. But it has
already been converted to the hashed format a few years ago.
msg19772 (view) Author: bf Date: 2017-10-20.08:59:40
Right. So suppose someone digs out an old repo, some work she abandoned
at some time before 2003. Things like that do happen from time to time,
and we have no idea how many of those old repos might exist. Can we
guarantee that we are able to convert it to hashed format, so the
history is not lost? Suppose this happens to someone who moved away from
darcs, reluctantly, because back then darcs couldn't handle things like
unicode or froze up due to nested conflicts. And now wants to give darcs
another try and discovers that it crashes when it tries to read the old
repo. Will such a person come back to re-discover a much improved darcs?

I am all for breaking compatibility with ancient formats but only if
there is a clear and reliable upgrade path.
msg19773 (view) Author: gh Date: 2017-10-20.14:05:51
No, we cannot guarantee anything for 2003 repos.

I *think* the patches can be parsed anyway but the timezone would be
off. As far as I care for older Darcs repos formats support in the
current Darcs, this is enough.

(I think it would not be difficult for such an archaeologist coder to
write their own converter for their 2003 repos, after all patches format
are just gz compressed plain text.)
msg19774 (view) Author: gh Date: 2017-10-20.14:24:22
Also, any programmer in their right mind would suspect that a piece of
software would have some format support changes in the span of 14 years :)

In this particular case it seems to me worth it making Darcs' code a
little less WTF-ish.
msg19775 (view) Author: bf Date: 2017-10-21.14:52:25
After digging a bit and comparing the OldDate with IsoDate I came to the
following conclusions.

OldDate exports two functions:

(1) showIsoDateTime: This is exactly the same in both versions. So we
can simply replace the one from OldDate with the one from IsoDate.

(2) readUTCDate: This one is different from the one in IsoDate. It can
be re-implemented inside the IsoDate module as:

-- | Similar to 'readUTCDate', except we /ignore/ timezone info
-- in the input string. This is incorrect and ugly. The only reason
-- it still exists is so we can generate file names for old-fashioned
-- repositories in the same way that old darcs versions expected them.
-- You should not use this function except for the above stated purpose.
readUTCDateOldFashioned :: String -> CalendarTime
readUTCDateOldFashioned d = 
             case parseDate 0 d of
             Left e -> error $ "bad date: "++d++" - "++show e
             Right ct -> (unsafeToCalendarTime ct) { ctTZ = 0 }

I propose we add this function to IsoDate and use it (only) in the
D.P.Info.makeFilename so as not to break reading old repos from before
2003-11. After this change OldDate is obsolete and can be removed.

The readUTCDate from IsoDate does the right thing for all purposes
except the above mentioned 'makeFilename': it correctly translates
localized date strings (i.e. which contain timezone info) to UTC. So the
change in the output of 'darcs log' that you noticed is actually a bug
fix. Let's look at the first patch ever in the darcs darcs repo.

Previously we got:

ben@yuiitsu[1]:.../darcs/screened>/usr/bin/darcs log -p'Initial version'
| grep Date
Date:   Sun Oct 20 22:01:05 CEST 2002

This is displaying the date/time in my timezone (CEST).

Now look at what's in the patch itself:

| head -3
[Initial version of darcs.
droundy@abridgegame.org**Sun Oct 20 20:01:05 EDT 2002] addfile ./Add.lhs
hunk ./Add.lhs 1

So let us display the patch using the original timezone:

ben@yuiitsu[1]:.../darcs/screened>TZ='America/New_York' /usr/bin/darcs
log -p'Initial version' | grep Date
Date:   Sun Oct 20 16:01:05 EDT 2002

This is plainly wrong. Whereas with the change:

ben@yuiitsu[1]:.../darcs/screened>TZ='America/New_York' darcs log
-p'Initial version' | grep Date
Date:   Sun Oct 20 20:01:05 EDT 2002

which is the correct date/time when the patch was recorded.

I have attached another bundle that makes the changes I propose above.
msg19782 (view) Author: gh Date: 2017-10-26.12:52:51
Hi Ben,

Your patch gets rid of the OldDate module but maintains the existing
behaviour in just a couple of lines of code, so it is probably the best
way of dealing with pre-2003 repos.

Thanks a lot, accepted :)
Date User Action Args
2017-10-17 16:33:56ghcreate
2017-10-18 11:54:51ghsetmessages: + msg19763
2017-10-18 14:40:54ghsetfiles: + OldDate_IsoDate.diff
messages: + msg19764
2017-10-18 19:31:28bfsetmessages: + msg19765
2017-10-19 12:50:22ghsetfiles: + remove-olddate-module-and-old-_2003_-darcs-date-format-parsing.dpatch
messages: + msg19767
2017-10-19 14:12:36ghsetmessages: + msg19768
2017-10-19 18:06:20bfsetmessages: + msg19769
2017-10-19 18:46:56bfsetmessages: + msg19770
2017-10-19 19:05:27ghsetmessages: + msg19771
2017-10-20 08:59:41bfsetmessages: + msg19772
2017-10-20 14:05:52ghsetmessages: + msg19773
2017-10-20 14:24:22ghsetmessages: + msg19774
2017-10-21 14:52:26bfsetfiles: + remove-darcs_util_olddate.dpatch
messages: + msg19775
2017-10-26 12:52:52ghsetstatus: needs-review -> accepted
messages: + msg19782