darcs

Patch 1149 resolve issue2364: fix file corruption on double fetch

Title resolve issue2364: fix file corruption on double fetch
Superseder Nosy List ganesh, trofi
Related Issues
Status accepted Assigned To ganesh
Milestone

Created on 2014-04-29.14:57:46 by trofi, last changed 2014-06-18.19:52:06 by gh.

Files
File name Status Uploaded Type Edit Remove
patch-preview.txt trofi, 2014-04-29.14:57:45 text/x-darcs-patch
resolve-issue2364_-fix-file-corruption-on-double-fetch.dpatch trofi, 2014-04-29.14:57:45 application/x-darcs-patch
unnamed trofi, 2014-04-29.14:57:45
See mailing list archives for discussion on individual patches.
Messages
msg17408 (view) Author: trofi Date: 2014-04-29.14:57:45
1 patch for repository http://darcs.net:

Tue Apr 29 16:40:20 FET 2014  Sergei Trofimovich <slyfox@community.haskell.org>
  * resolve issue2364: fix file corruption on double fetch
  
  The bug is the result of attempt to fetch the same file
  (say F) by the same URL (U) multiple times concurrently.
  
  First time U gets fetched by speculative prefetch logic.
  Second time as an ordinary file (while first fetch is not finished).
  
  The function 'copyUrlWithPriority' sends download request
  to 'urlChan' both times (it's already not a nice situation,
  fixed by this patch).
  
  Later urlThread satisfies first request, notifies receiver,
  and starts downloading exactly the same U again.
  
  I don't know exact data corruption mechanics yet, but it has
  to do with non-random intermediate file names of downloaded
  files and 'truncate' call when temp file is opened for a new
  downlaod job.
  
  All temp names are completely non-random for a single darcs run:
  
    urlThread :: Chan UrlRequest -> IO ()
    urlThread ch = do
      junk <- flip showHex "" `fmap` randomRIO rrange
      evalStateT urlThread' (UrlState Map.empty emptyQ 0 junk)
  
    createDownloadFileName :: FilePath -> UrlState -> FilePath
    createDownloadFileName f st = f ++ "-new_" ++ randomJunk st
  
  My theory is next download manages to step on toes of previous job.
  
  I'll try to make file names truly random in other patch.
  That way such errors should manifest as read erros instead of data
  corruption.
  
  Thanks!
Attachments
msg17416 (view) Author: ganesh Date: 2014-05-01.19:56:19
Thanks again for working on this. Just to clarify if my understanding is correct:

The race appears to be caused at a low level by name clashes in the intermediate files, and 
triggered frequently because U is routinely fetched twice by the higher-level code.

Your patch here reduces or eliminates the trigger, and you hope to send another patch that also 
avoids the name clashes in the intermediate files.

So we may as well apply this patch immediately. I'll do that soon unless I hear anything to the 
contrary.
msg17418 (view) Author: trofi Date: 2014-05-01.21:06:06
> Thanks again for working on this. Just to clarify if my understanding
is correct:
>
> The race appears to be caused at a low level by name clashes in the
intermediate files, and 
> triggered frequently because U is routinely fetched twice by the
higher-level code.
> 
> Your patch here reduces or eliminates the trigger, and you hope to
send another patch that also 
> avoids the name clashes in the intermediate files.
>
> So we may as well apply this patch immediately. I'll do that soon
unless I hear anything to the 
> contrary.

Correct.

Moreover my second patch won't fix any broken logic (if the bug still
exists),
it only will show hidden bugs of that kind as a file access errors.
( Patch will look like this one:
https://ghc.haskell.org/trac/ghc/ticket/9058 )
msg17419 (view) Author: ganesh Date: 2014-05-01.22:12:31
Thanks, I've pushed this one to 'screened' (that's the mainline branch you 
get from http://darcs.net/, we sometimes have a second review pass at some 
point before pushing to 'reviewed' to mark a patch as fully reviewed; in 
this case I suspect the best review will be trying it out for a while.)
msg17553 (view) Author: gh Date: 2014-06-18.19:52:06
I guess we tried it long enough. Accepting it.
History
Date User Action Args
2014-04-29 14:57:46troficreate
2014-05-01 19:56:19ganeshsetassignedto: ganesh
messages: + msg17416
nosy: + ganesh
2014-05-01 21:06:06trofisetmessages: + msg17418
2014-05-01 22:12:31ganeshsetstatus: needs-screening -> needs-review
messages: + msg17419
2014-06-18 19:52:06ghsetstatus: needs-review -> accepted
messages: + msg17553