darcs

Patch 385 Resolve issue1942: Fix an IO interleaving bug in old-f...

Title Resolve issue1942: Fix an IO interleaving bug in old-f...
Superseder Nosy List mornfall
Related Issues
Status accepted Assigned To
Milestone

Created on 2010-09-03.20:35:04 by mornfall, last changed 2011-05-10.17:35:35 by darcswatch. Tracked on DarcsWatch.

Files
File name Status Uploaded Type Edit Remove
resolve-issue1942_-fix-an-io-interleaving-bug-in-old_fashioned-readrepo_.dpatch mornfall, 2010-09-03.20:35:04 text/x-darcs-patch
resolve-issue1942_-fix-an-io-interleaving-bug-in-old_fashioned-readrepo_.dpatch mornfall, 2010-09-04.08:50:10 text/x-darcs-patch
unnamed mornfall, 2010-09-03.20:35:04
unnamed mornfall, 2010-09-04.08:50:10
See mailing list archives for discussion on individual patches.
Messages
msg12447 (view) Author: mornfall Date: 2010-09-03.20:35:04
Seems that the patch cherrypicks cleanly from adventure. A point for darcs.

Yours,
   Petr.

1 patch for repository http://darcs.net:

Fri Sep  3 22:36:29 CEST 2010  Petr Rockai <me@mornfall.net>
  * Resolve issue1942: Fix an IO interleaving bug in old-fashioned readRepo.
Attachments
msg12458 (view) Author: mornfall Date: 2010-09-04.08:50:10
Sorry I like totally botched the type signatures. This should be better.

Yours,
   Petr.

PS: It could still need some testing &c.

1 patch for repository http://darcs.net:

Sat Sep  4 05:09:47 CEST 2010  Petr Rockai <me@mornfall.net>
  * Resolve issue1942: Fix an IO interleaving bug in old-fashioned readRepo.
Attachments
msg12459 (view) Author: kowey Date: 2010-09-04.10:07:04
So why don't hashed repositories suffer from issue1942?
Is it just by virtue of being able to treat the current
repository as a source for hashed files that we'd otherwise
have to fetch from the remote one.

Resolve issue1942: Fix an IO interleaving bug in old-fashioned readRepo.
------------------------------------------------------------------------
>            parse2 :: RepoPatch p => PatchInfo -> FilePath
>                                  -> IO (Sealed (PatchInfoAnd p C(x)))
> -          parse2 i fn = do ps <- gzFetchFilePS fn Cachable
> -                           return $ hopefullyNoParseError i (toPath fn) (readPatch ps)
> -          hopefullyNoParseError :: PatchInfo -> String
> -                                -> Maybe (Sealed (Named a1dr C(x)), b)
> -                                -> Sealed (PatchInfoAnd a1dr C(x))
> -          hopefullyNoParseError i _ (Just (Sealed x, _)) =
> -              seal $ patchInfoAndPatch i $ actually x
> -          hopefullyNoParseError i s Nothing =
> -              seal $ patchInfoAndPatch i $ unavailable $ "Couldn't parse file "++s

> +          parse2 i fn = do ps <- unsafeInterleaveIO $ gzFetchFilePS fn Cachable
> +                           Sealed p <- return $ hopefullyNoParseError (toPath fn) (readPatch ps)
> +                           return $ seal $ patchInfoAndPatch i p
> +          hopefullyNoParseError :: String -> Maybe (Sealed (Named a1dr C(x)), b)
> +                                -> Sealed (Hopefully (Named a1dr) C(x))
> +          hopefullyNoParseError _ (Just (Sealed x, _)) = seal $ actually x
> +          hopefullyNoParseError s Nothing = seal $ unavailable $ "Couldn't parse file "++s

OK, two changes:

1. Use of unsafeInterleaveIO to defer gzFetchFilePS until the
   ByteString it produces is actually demanded (by readPatch,
   and I guess walking up to whoever is really using the output of
   readPatch, and so on).  And if I understand correctly, in
   practice (darcs pull), we actually do have cases where we
   don't actually use the remote patch file in question
   (I guess where don't try to commute or present remote patches)

   What are the risks involved in this particular use of
   unsafeInterleaveIO, of not controlling when the file
   fetching happens? What if there are a lot of patches
   to potentially fetch?
   
2. Shuffling the patchInfoAndPatch out of hopefullyNoParseError
   Aside from a refactor, what does this accomplish?

Hmm, I still have an embarrassingly fuzzy grasp of laziness, but
I suspect I'm due for an ah-hah-nothing-to-it moment.  Or that's
maybe too optimistic.

-- 
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
For a faster response, try +44 (0)1273 64 2905 or
xmpp:kowey@jabber.fr (Jabber or Google Talk only)
msg12460 (view) Author: mornfall Date: 2010-09-04.13:18:20
Eric Kow <kowey@darcs.net> writes:

> So why don't hashed repositories suffer from issue1942?
> Is it just by virtue of being able to treat the current
> repository as a source for hashed files that we'd otherwise
> have to fetch from the remote one.

I believe the IO interleaving works better in the hashed version of this
code, although it could stand a double-check. If it didn't, we could
save some time and memory by not parsing patches when they aren't
needed.

> Resolve issue1942: Fix an IO interleaving bug in old-fashioned readRepo.
> ------------------------------------------------------------------------
>>            parse2 :: RepoPatch p => PatchInfo -> FilePath
>>                                  -> IO (Sealed (PatchInfoAnd p C(x)))
>> -          parse2 i fn = do ps <- gzFetchFilePS fn Cachable
>> -                           return $ hopefullyNoParseError i (toPath fn) (readPatch ps)
>> -          hopefullyNoParseError :: PatchInfo -> String
>> -                                -> Maybe (Sealed (Named a1dr C(x)), b)
>> -                                -> Sealed (PatchInfoAnd a1dr C(x))
>> -          hopefullyNoParseError i _ (Just (Sealed x, _)) =
>> -              seal $ patchInfoAndPatch i $ actually x
>> -          hopefullyNoParseError i s Nothing =
>> -              seal $ patchInfoAndPatch i $ unavailable $ "Couldn't parse file "++s
>
>> +          parse2 i fn = do ps <- unsafeInterleaveIO $ gzFetchFilePS fn Cachable
>> +                           Sealed p <- return $ hopefullyNoParseError (toPath fn) (readPatch ps)
>> +                           return $ seal $ patchInfoAndPatch i p
>> +          hopefullyNoParseError :: String -> Maybe (Sealed (Named a1dr C(x)), b)
>> +                                -> Sealed (Hopefully (Named a1dr) C(x))
>> +          hopefullyNoParseError _ (Just (Sealed x, _)) = seal $ actually x
>> +          hopefullyNoParseError s Nothing = seal $ unavailable $ "Couldn't parse file "++s
>
> OK, two changes:
>
> 1. Use of unsafeInterleaveIO to defer gzFetchFilePS until the
>    ByteString it produces is actually demanded (by readPatch,
>    and I guess walking up to whoever is really using the output of
>    readPatch, and so on).  And if I understand correctly, in
>    practice (darcs pull), we actually do have cases where we
>    don't actually use the remote patch file in question
>    (I guess where don't try to commute or present remote patches)
>
>    What are the risks involved in this particular use of
>    unsafeInterleaveIO, of not controlling when the file
>    fetching happens? What if there are a lot of patches
>    to potentially fetch?

It is as safe as the whole Hopefully concept is. Read that as "not very"
but it is baked so deep into darcs that we can't do anything about it
for now.

> 2. Shuffling the patchInfoAndPatch out of hopefullyNoParseError
>    Aside from a refactor, what does this accomplish?
>
> Hmm, I still have an embarrassingly fuzzy grasp of laziness, but
> I suspect I'm due for an ah-hah-nothing-to-it moment.  Or that's
> maybe too optimistic.

Because the "i" does not get shuffled out of the scope of the pattern
match on the result of the parsing by GHC (and I am not sure it could
be). So to get at "info p" (which comes from the inventory) you have to
download the patch to see if it parses correctly (to choose the branch
in hopefullyNoParseError). This was certainly not intentional.

Yours,
   Petr.
msg12470 (view) Author: kowey Date: 2010-09-06.10:01:36
On Sat, Sep 04, 2010 at 15:20:11 +0200, Petr Rockai wrote:
> > 2. Shuffling the patchInfoAndPatch out of hopefullyNoParseError
> >    Aside from a refactor, what does this accomplish?

> Because the "i" does not get shuffled out of the scope of the pattern
> match on the result of the parsing by GHC (and I am not sure it could
> be). So to get at "info p" (which comes from the inventory) you have to
> download the patch to see if it parses correctly (to choose the branch
> in hopefullyNoParseError). This was certainly not intentional.

I'm applying this patch

I think you're saying that old way,  we have to pattern-match the
readPatch p output in order to build a PatchInfoAnd.  This pattern match
would force the file to be fetched even though we're using lazy IO ,
because for readPatch to get a point where it can return a Just, it'd
have to have seen the file contents.  This would be wasteful if in
practice, we don't really care about the "And..." part of PatchInfoAnd
and we really just want the PatchInfo itself.  The new way builds the
PatchInfoAnd around hopefullyNoParseError so that you can consume the
patchinfo and optionally never touch the readParse output and if we use
lazy IO avoid downloading the patch.

If this is right, then hooray, I'm very slowly starting to understand
how to think about this...

-- 
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
For a faster response, try +44 (0)1273 64 2905 or
xmpp:kowey@jabber.fr (Jabber or Google Talk only)
msg12481 (view) Author: darcswatch Date: 2010-09-06.19:30:36
This patch bundle (with 1 patches) was just applied to the repository http://darcs.net/.
This message was brought to you by DarcsWatch
http://darcswatch.nomeata.de/repo_http:__darcs.net_.html#bundle-212b923b00d7f1f819af4bfe345a94f4a1d746c3
msg14050 (view) Author: darcswatch Date: 2011-05-10.17:35:34
This patch bundle (with 1 patches) was just applied to the repository http://darcs.net/reviewed.
This message was brought to you by DarcsWatch
http://darcswatch.nomeata.de/repo_http:__darcs.net_reviewed.html#bundle-212b923b00d7f1f819af4bfe345a94f4a1d746c3
History
Date User Action Args
2010-09-03 20:35:04mornfallcreate
2010-09-04 08:50:10mornfallsetfiles: + resolve-issue1942_-fix-an-io-interleaving-bug-in-old_fashioned-readrepo_.dpatch, unnamed
messages: + msg12458
2010-09-04 10:07:05koweysetmessages: + msg12459
2010-09-04 13:18:20mornfallsetmessages: + msg12460
2010-09-06 10:01:37koweysetmessages: + msg12470
2010-09-06 19:30:36darcswatchsetstatus: needs-review -> accepted
messages: + msg12481
2010-09-06 19:30:50darcswatchsetdarcswatchurl: http://darcswatch.nomeata.de/repo_http:__darcs.net_.html#bundle-a4d06b7e33ade8147cbd6d85a693e4a0e0420641
2011-05-10 17:35:34darcswatchsetmessages: + msg14050
2011-05-10 17:35:35darcswatchsetdarcswatchurl: http://darcswatch.nomeata.de/repo_http:__darcs.net_.html#bundle-a4d06b7e33ade8147cbd6d85a693e4a0e0420641 -> http://darcswatch.nomeata.de/repo_http:__darcs.net_reviewed.html#bundle-a4d06b7e33ade8147cbd6d85a693e4a0e0420641