darcs

Patch 215 Resolve issue1159: smart caches union. (and 2 more)

Title Resolve issue1159: smart caches union. (and 2 more)
Superseder Nosy List darcs-users, dmitry.kurochkin, galbolle
Related Issues
Status accepted Assigned To galbolle
Milestone

Created on 2010-04-18.15:05:15 by dmitry.kurochkin, last changed 2011-05-10.19:36:31 by darcswatch. Tracked on DarcsWatch.

Files
File name Status Uploaded Type Edit Remove
resolve-issue1159_-smart-caches-union_.dpatch dmitry.kurochkin, 2010-04-18.15:05:15 text/x-darcs-patch
unnamed dmitry.kurochkin, 2010-04-18.15:05:15 text/plain
See mailing list archives for discussion on individual patches.
Messages
msg10759 (view) Author: dmitry.kurochkin Date: 2010-04-18.15:05:15
3 patches for repository http://darcs.net:

Sat Apr 17 03:56:44 MSD 2010  Dmitry Kurochkin <dmitry.kurochkin@gmail.com>
  * Resolve issue1159: smart caches union.
  Try to do better than just blindly copying remote cache entries:
  
  * If remote repository is accessed through network, do not copy any cache
    entries from it. Taking local entries does not make sense and using network
    entries can lead to darcs hang when it tries to get to unaccessible host.
  
  * If remote repositoty is local, copy all network cache entries. For local
    cache entries if the cache directory exists and is writable it is added as
    writable cache, if it exists but is not writable it is added as read-only
    cache.
  
  This approach should save us from bogus cache entries. One case it does not
  work very well is when you fetch from partial repository over network.
  Hopefully this is not a common case.

Sun Apr 18 18:01:29 MSD 2010  Dmitry Kurochkin <dmitry.kurochkin@gmail.com>
  * URL: add maxPipelineLength function, rename maxPipeLength to maxPipelineLengthRef.

Sun Apr 18 19:03:02 MSD 2010  Dmitry Kurochkin <dmitry.kurochkin@gmail.com>
  * Darcs.Repository: use pipelining when copying patches.
  Speculate on patches in fetch_patches_if_necessary. Improves darcs get time
  for the http://darcs.net repository from 37:24 to 21:25 for me.
Attachments
msg10777 (view) Author: galbolle Date: 2010-04-20.13:27:02
Resolve issue1159: smart caches union.
--------------------------------------
Dmitry Kurochkin <dmitry.kurochkin@gmail.com>**20100416235644

hunk ./src/Darcs/Repository/Cache.hs 74
>  unionCaches :: Cache -> Cache -> Cache
>  unionCaches (Ca a) (Ca b) = Ca (nub (a++b))
>  
> +unionRemoteCaches :: Cache -> Cache -> String -> IO (Cache)
> +unionRemoteCaches local (Ca remote) repourl
> +    | is_file repourl = do f <- filtered
> +                           return $ local `unionCaches` Ca f
> +    | otherwise = return local
> +  where filtered = mapM (\x -> fn x `catchall` return Nothing) remote >>=
> +                   return . catMaybes
> +        fn :: CacheLoc -> IO (Maybe CacheLoc)
> +        fn (Cache Repo Writable _) = return Nothing
> +        fn c@(Cache t _ url)
> +          | is_file url = do
> +              ex <- doesDirectoryExist url
> +              if ex then do p <- getPermissions url
> +                            return $ Just $
> +                              if writable c && SD.writable p
> +                              then c else Cache t NotWritable url
> +                    else return Nothing
> +          | otherwise = return $ Just c
> +
>  repo2cache :: String -> Cache
>  repo2cache r = Ca [Cache Repo NotWritable r]
>  
Can you add a haddock for this function, in a followup patch? Your
patch description would do all right.

This means that cache propagation is a bit slower now, but more
correct. This seems right to me.

URL: add maxPipelineLength function, rename maxPipeLength to
maxPipelineLengthRef.
----------------------------------------------------------------------------------
Dmitry Kurochkin <dmitry.kurochkin@gmail.com>**20100418140129

hunk ./src/URL.hs 4
>  {-# LANGUAGE CPP, ForeignFunctionInterface #-}
>  
>  module URL ( copyUrl, copyUrlFirst, pipeliningEnabledByDefault,
> -             setDebugHTTP, setHTTPPipelining, waitUrl,
> -             Cachable(Cachable, Uncachable, MaxAge),
> +             setDebugHTTP, setHTTPPipelining, maxPipelineLength,
> +             waitUrl, Cachable(Cachable, Uncachable, MaxAge),
>               environmentHelpProxy, environmentHelpProxyPassword
>             ) where
>  

hunk ./src/URL.hs 95
>  pipeliningEnabledByDefault = False
>  #endif
>  
> -{-# NOINLINE maxPipeLength #-}
> -maxPipeLength :: IORef Int
> -maxPipeLength = unsafePerformIO $ newIORef $
> +{-# NOINLINE maxPipelineLengthRef #-}
> +maxPipelineLengthRef :: IORef Int
> +maxPipelineLengthRef = unsafePerformIO $ newIORef $
>  #ifdef CURL_PIPELINING_DEFAULT
>                  pipeliningLimit
>  #else

hunk ./src/URL.hs 104
>                  1
>  #endif
>  
> +maxPipelineLength :: IO Int
> +maxPipelineLength = readIORef maxPipelineLengthRef
> +
>  {-# NOINLINE urlNotifications #-}
>  urlNotifications :: MVar (Map String (MVar String))
>  urlNotifications = unsafePerformIO $ newMVar Map.empty

hunk ./src/URL.hs 185
>  checkWaitToStart = do
>    st <- get
>    let l = pipeLength st
> -  mpl <- liftIO $ readIORef maxPipeLength
> +  mpl <- liftIO maxPipelineLength
>    when (l < mpl) $ do
>      let w = waitToStart st
>      case readQ w of

hunk ./src/URL.hs 284
>  #endif
>  
>  setHTTPPipelining :: Bool -> IO ()
> -setHTTPPipelining False = writeIORef maxPipeLength 1
> -setHTTPPipelining True = writeIORef maxPipeLength
> +setHTTPPipelining False = writeIORef maxPipelineLengthRef 1
> +setHTTPPipelining True = writeIORef maxPipelineLengthRef
>  #ifdef CURL_PIPELINING
>      pipeliningLimit
>  #else

Ok, just renaming things.


Darcs.Repository: use pipelining when copying patches.
------------------------------------------------------
Dmitry Kurochkin <dmitry.kurochkin@gmail.com>**20100418150302

hunk ./src/Darcs/Repository.hs 76
>       make_new_pending
>      )
>  import Darcs.Repository.Merge( tentativelyMergePatches,
considerMergeToWorking )
> -import Darcs.Repository.Cache ( unionRemoteCaches,
fetchFileUsingCache, HashedDir(..) )
> +import Darcs.Repository.Cache ( unionRemoteCaches, fetchFileUsingCache,
> +                                speculateFileUsingCache, HashedDir(..) )
>  import Darcs.Patch.Set ( PatchSet, SealedPatchSet )
>  #ifdef GADT_WITNESSES
>  import Darcs.Patch.Set ( Origin )

hunk ./src/Darcs/Repository.hs 82
>  #endif
> +import URL ( maxPipelineLength )
>  
>  import Control.Monad ( unless, when )

hunk ./src/Darcs/Repository.hs 85
> -import Data.Either(Either(..))
> +import Data.Either ( Either(..), rights )
>  import System.Directory ( createDirectory, renameDirectory )
>  import System.IO.Error ( isAlreadyExistsError )
>  

hunk ./src/Darcs/Repository.hs 308
>               do unless (Complete `elem` opts) $
>                         putInfo "Copying patches, to get lazy
repository hit ctrl-C..."
>                  r <- read_repo torepository
> -                let peekaboo :: PatchInfoAnd p C(x y) -> IO ()
> -                    peekaboo x = case extractHash x of
> -                                 Left _ -> return ()
> -                                 Right h -> fetchFileUsingCache c
HashedPatchesDir h >> return ()
> -                sequence_ $ mapRL peekaboo $ progressRLShowTags
"Copying patches" $ concatRL r
> +                pipelineLength <- maxPipelineLength
> +                let patches = concatRL r
> +                    ppatches = progressRLShowTags "Copying patches"
patches
> +                    (first, other) = splitAt (pipelineLength - 1) $
tail $ hashes patches
> +                    speculate | pipelineLength > 1 = [] : first : map
(:[]) other
> +                              | otherwise = []
> +                mapM_ fetchAndSpeculate $ zip (hashes ppatches)
(speculate ++ repeat [])
>    where putInfo = when (not $ Quiet `elem` opts) . putStrLn


hunk ./src/Darcs/Repository.hs 316
> +        hashes = rights . mapRL extractHash
> +        fetchAndSpeculate :: (String, [String]) -> IO ()
> +        fetchAndSpeculate (f, ss) = do
> +          fetchFileUsingCache c HashedPatchesDir f
> +          mapM_ (speculateFileUsingCache c HashedPatchesDir) ss
>  
>  add_to_pending :: RepoPatch p => Repository p C(r u t) -> FL Prim C(u
y) -> IO ()
>  add_to_pending (Repo _ opts _ _) _ | NoUpdateWorking `elem` opts =
return ()

This seems correct.
msg10782 (view) Author: darcswatch Date: 2010-04-21.11:25:48
This patch bundle (with 3 patches) was just applied to the repository http://darcs.net/.
This message was brought to you by DarcsWatch
http://darcswatch.nomeata.de/repo_http:__darcs.net_.html#bundle-9f59db8afcb344081a2b97846fbfabda43c3beee
msg14187 (view) Author: darcswatch Date: 2011-05-10.19:36:31
This patch bundle (with 3 patches) was just applied to the repository http://darcs.net/reviewed.
This message was brought to you by DarcsWatch
http://darcswatch.nomeata.de/repo_http:__darcs.net_reviewed.html#bundle-9f59db8afcb344081a2b97846fbfabda43c3beee
History
Date User Action Args
2010-04-18 15:05:15dmitry.kurochkincreate
2010-04-18 15:06:11darcswatchsetdarcswatchurl: http://darcswatch.nomeata.de/repo_http:__darcs.net_.html#bundle-9f59db8afcb344081a2b97846fbfabda43c3beee
2010-04-20 08:56:06galbollesetstatus: needs-review -> review-in-progress
assignedto: galbolle
nosy: + galbolle
2010-04-20 13:26:39galbollelinkpatch209 superseder
2010-04-20 13:27:02galbollesetmessages: + msg10777
2010-04-20 13:27:14galbollesetstatus: review-in-progress -> accepted-pending-tests
2010-04-21 11:25:48darcswatchsetstatus: accepted-pending-tests -> accepted
messages: + msg10782
2011-05-10 19:36:31darcswatchsetmessages: + msg14187