Created on 2010-08-11.23:43:13 by mornfall, last changed 2011-05-10.20:06:08 by darcswatch. Tracked on DarcsWatch.
See mailing list archives
for discussion on individual patches.
msg12127 (view) |
Author: mornfall |
Date: 2010-08-11.23:43:13 |
|
Hi,
this is a work-in-progress bundle. The major part of it is me ripping through
path handling code in darcs (there's lots). I have written a new Path module
for hashed-storage and this bundle mostly replaces all darcs (relative) path
handling by using that new module. The Path module is a work in progress as
well, especially implementation-wise. Nevertheless, I am sending this as a
heads-up because this could be somewhat disrputive.
I am also attaching the current version of the Path module from hashed-storage
(it's not on HEAD yet). Don't bother looking for implementation bugs, I will
need to write a testsuite for this anyway (and the Absolute paths didn't get
any testing. Just from looking at distance I can see a bug...). However, API
suggestions would be welcome. I have a vague plan of splitting the path code
from hashed-storage, bundling it with IO functions that take (Path p) => p type
paths for convenient usage.
Also, support for non-POSIX systems is nearly nonexistent (but to be honest,
the support in current darcs is not very coherent either). I will work on that
as well.
It also seems that removing a pack-unpack and other jiggling code, patch
application in hashed-storage monads has received a significant performance
boost, on the order of 10-15 %. Unfortunately, this does not seem to affect
pull as significantly (as check/repair), as it seems to be spending its
lifetime mostly elsewhere.
The current version seems to be passing the testsuite, too. There's still a lot
of TODO on the darcs side as well.
Sorry, I am becoming incoherent and need to go to bed. Hopefully we can have
sensible discussion later. I did want to get this off my disk though and gather
some eyeballs maybe.
Yours,
Petr.
-----------------------------------------------------------------
module Storage.Hashed.Path
( Path, Relative, Absolute, (</>), Split(..), isPrefix, root, isRoot
, absolute, relative, directory, singleton, parent, parents, file, Name, (+/+), suffix
, pathToBS, pathToString, unsafePathFromBS, unsafePathFromString
, parsePath, parsePathBS ) where
import qualified Data.ByteString.Char8 as BS
import Data.List( isPrefixOf, inits )
-------------------------------
-- Path utilities
--
type Name = BS.ByteString
class Path p where
root :: p
unpath :: p -> BS.ByteString
path :: BS.ByteString -> p
parsePathBS :: BS.ByteString -> p
pathToBS :: p -> BS.ByteString
newtype Relative = Relative BS.ByteString deriving (Show, Eq, Ord)
newtype Absolute = Absolute BS.ByteString deriving (Show, Eq, Ord)
isRoot :: (Path p) => p -> Bool
isRoot = BS.null . unpath
instance Path Relative where
root = Relative BS.empty
unpath (Relative x) = x
path = Relative
pathToBS (unpath -> p) | BS.null p = "."
| otherwise = p
parsePathBS path
| BS.null path || path == "." = root
| BS.head path == '/' = error $ "BUG: Not a relative path in Relative.parsePathBS " ++ shown
| BS.take 2 path == "./" = parsePathBS $ BS.drop 2 path
| otherwise = case BS.span (/= '/') path of
(head, tail) -> root </> head +/+ parsePathBS (BS.drop 1 tail)
where shown = BS.unpack path
instance Path Absolute where
root = Absolute $ BS.singleton '/'
unpath (Absolute x) = BS.drop 1 x
path = Absolute . BS.cons '/'
pathToBS (unpath -> p) | BS.null p = "/"
| otherwise = p
relative :: Relative
relative = root
absolute :: Absolute
absolute = root
-- | Check whether a path is a prefix of another path.
isPrefix :: Path p => p -> p -> Bool
isPrefix (unpath -> a) (unpath -> b)
| BS.null a = True
| BS.length a == BS.length b = a == b
| BS.isPrefixOf a b = BS.index b (BS.length a) == '/'
| otherwise = False
(</>) :: forall p. (Show p, Path p) => p -> Name -> p
(unpath -> p) </> n
| n == BS.pack "." = path p
| n == BS.pack ".." = parent (path p :: p)
| BS.null p = path $ BS.concat [unpath (root :: p), n]
| BS.null n = path p
| BS.elem '/' n = error $ "BUG: Path components may not contain slashes: "
++ show (BS.unpack p) ++ ", " ++ BS.unpack n
| otherwise = path $ BS.concat [p, BS.singleton '/', n]
data Split a b = a :/: b | Atomic
singleton :: Path p => p -> Maybe BS.ByteString
singleton (file -> dir :/: file)
| isRoot dir = Just file
singleton _ = Nothing
directory :: Path p => p -> Split BS.ByteString Relative
directory (unpath -> p) = case BS.break (=='/') p of
_ | BS.null p -> Atomic
(dir, p') -> dir :/: path (BS.drop 1 p')
file :: Path p => p -> Split p BS.ByteString
file (unpath -> p) = case BS.breakEnd (=='/') p of
_ | BS.null p -> Atomic
(p', file) | BS.null p' -> root :/: file
| otherwise -> (path $ BS.init p') :/: file
suffix :: (Show p, Path p) => p -> p -> Relative
suffix x@(unpath -> x') y@(unpath -> y')
| BS.null x' = path y'
| x `isPrefix` y = path $ BS.drop (BS.length x' + 1) y'
| otherwise = error $ "BUG: Path " ++ show x ++ " is not a prefix of " ++ show y ++ "!"
-- unsafe
toNames (unpath -> p) = BS.split '/' p
fromNames names = path $ BS.intercalate (BS.singleton '/') names
(+/+) :: Path p => p -> Relative -> p
(unpath -> p) +/+ (unpath -> q)
| BS.null q = path p
| BS.null p = path q
| otherwise = path $ BS.concat [ p, BS.singleton '/', q ]
parent :: (Show p, Path p) => p -> p
parent (file -> parent :/: _) = parent
parent path@(file -> Atomic) = error $ "BUG: " ++ show path ++ " has no parent."
parents :: forall p. Path p => p -> [p]
parents (file -> dir :/: _) = dir : parents dir
parents (file -> Atomic) = []
pathToString :: (Path p) => p -> String
pathToString = BS.unpack . pathToBS
unsafePathFromBS :: (Path p) => BS.ByteString -> p
unsafePathFromBS = path
unsafePathFromString :: (Path p) => String -> p
unsafePathFromString = path . BS.pack
parsePath :: Path a => String -> a
parsePath = parsePathBS . BS.pack
-----------------------------------------------------------------
15 patches for repository http://darcs.net:
Wed Aug 11 17:39:29 CEST 2010 Petr Rockai <me@mornfall.net>
* First stab at a hashed-storage 0.6 port.
Wed Aug 11 21:25:55 CEST 2010 Petr Rockai <me@mornfall.net>
* Move the preferences system into IO where it belongs.
Wed Aug 11 21:45:04 CEST 2010 Petr Rockai <me@mornfall.net>
* Make FileName an alias to Relative (from Hashed.Storage.Path).
Wed Aug 11 22:12:49 CEST 2010 Petr Rockai <me@mornfall.net>
* Fix annotate that got broken due to path format change.
Thu Aug 12 00:02:43 CEST 2010 Petr Rockai <me@mornfall.net>
* Replace FilePath with FileName in SelectChanges and ChooseTouching.
Thu Aug 12 00:09:46 CEST 2010 Petr Rockai <me@mornfall.net>
* Make SubPath just another alias for Relative.
Thu Aug 12 00:16:21 CEST 2010 Petr Rockai <me@mornfall.net>
* Introduce a new Darcs.Path module to centralise path handling.
Thu Aug 12 00:36:15 CEST 2010 Petr Rockai <me@mornfall.net>
* Merge Darcs.Patch.FileName into Darcs.Path.
Thu Aug 12 00:47:39 CEST 2010 Petr Rockai <me@mornfall.net>
* Remove the now-redundant sp2fn.
Thu Aug 12 01:05:34 CEST 2010 Petr Rockai <me@mornfall.net>
* Fix announceFiles in WhatsNew (abolish unsafePathFrom*).
Thu Aug 12 01:06:04 CEST 2010 Petr Rockai <me@mornfall.net>
* Restore the ".." check in isMaliciousPath.
Thu Aug 12 01:29:40 CEST 2010 Petr Rockai <me@mornfall.net>
* Fix a subtle bug in onlyHunks with rather curious side-effects.
Thu Aug 12 01:30:27 CEST 2010 Petr Rockai <me@mornfall.net>
* Merge Darcs.RepoPath into Darcs.Path.
Thu Aug 12 01:31:25 CEST 2010 Petr Rockai <me@mornfall.net>
* Drop unused and redundant pathFromFileName.
Thu Aug 12 01:32:08 CEST 2010 Petr Rockai <me@mornfall.net>
* Make the tests pass (mostly ./foo versus foo in user-visible output).
Attachments
|
msg12140 (view) |
Author: kowey |
Date: 2010-08-12.12:47:40 |
|
On Wed, Aug 11, 2010 at 23:43:13 +0000, Petr Ročkai wrote:
> I have a vague plan of splitting the path code
> from hashed-storage, bundling it with IO functions that take (Path p) => p type
> paths for convenient usage.
Minor admin note (sorry!) Perhaps Ben Moseley (who was at the Utrecht
hacking sprint in 2008) would be interested in you becoming the
maintainer for the pathtype package?
http://hackage.haskell.org/package/pathtype
By the way, Duncan (bcc'ed) has expressed an interest in pathtype
before, so if he has anything to say about
http://bugs.darcs.net/patch343 perhaps he'll chime in.
> It also seems that removing a pack-unpack and other jiggling code, patch
> application in hashed-storage monads has received a significant performance
> boost, on the order of 10-15 %. Unfortunately, this does not seem to affect
> pull as significantly (as check/repair), as it seems to be spending its
> lifetime mostly elsewhere.
Still seems nice.
--
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
For a faster response, please try +44 (0)1273 64 2905.
|
msg12151 (view) |
Author: kowey |
Date: 2010-08-13.08:54:07 |
|
This is mostly an admin-level reply...
On Wed, Aug 11, 2010 at 23:43:13 +0000, Petr Ročkai wrote:
> I have written a new Path module for hashed-storage and this bundle
> mostly replaces all darcs (relative) path handling by using that new
> module.
Nice! You may find it useful to search the BTS for issues tagged
FilePath and status not resolved <http://is.gd/efHaI>.
Could you please summarise what's wrong with the current path handling?
(Crappy design or lack of design could be a good answer, but it's nice
to be a bit clear about this).
Are you specifically targeting performance issues (pack/unpack) or code
safety issues (list of path components better than strings to avoid
silly trailing slash stuff, typed paths), and/or something else? I
think this discussion has taken place over some days on IRC and I
confess I have not really been following along. Sorry!
> Nevertheless, I am sending this as a heads-up
> because this could be somewhat disrputive.
Thanks! Somewhere in the past I think we'd agreed that major pieces of
work should be discussed first to reduce pressure. But it's easy to
forget to do and I'm glad somebody's still remembering. (I can't find
an archive for this, unfortunately; perhaps it should go on
http://wiki.darcs.net/Development/GettingStarted)
By the way, you asked about the feasibility of this work entering Darcs
2.8 (what comes after 2.5). I replied that if we agree in principle
that the work is important and (I'll add now) that the work is "done
enough" to start merging it in, then maybe getting it in early in the
life cycle is best so that we have time to wring out the bugs.
> However, API suggestions would be welcome. I have a vague plan of
> splitting the path code from hashed-storage, bundling it with IO
> functions that take (Path p) => p type paths for convenient usage.
When I mentioned the pathtype package, I think you said it had a huge
API and that you'd rather focus on a limited set of operations that we
can make sure we get right
<http://irclog.perlgeek.de/darcs/2010-08-12#i_2694933>
I'm CC'ing Salvatore because I remember that in Zurich (over lunch), he
was telling me that we really really need to work on the filepath stuff.
Maybe he'll have comments to make.
> Also, support for non-POSIX systems is nearly nonexistent (but to be honest,
> the support in current darcs is not very coherent either). I will work on that
> as well.
One minor detail to remember is funny Windows paths for network shares
(\\something_or_another/blah). I seem to remember we've never quite
gotten that stuff right.
--
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
For a faster response, please try +44 (0)1273 64 2905.
|
msg12161 (view) |
Author: kowey |
Date: 2010-08-14.11:12:10 |
|
Jason posted on IRC: ?tell kowey re:FilePath. I can't speak for Petr,
but the current stuff does not have a coherent design. The
packing/unpacking is inefficient. We could use more type safety in our
filepaths, but perhaps not in the way typepath works. We need it to
separate 'clean' paths from 'dirty' paths. Dirty paths are ones that
have not been transformed or checked to make sure they are sane.
|
msg12162 (view) |
Author: kowey |
Date: 2010-08-14.11:14:23 |
|
On Sat, Aug 14, 2010 at 11:12:11 +0000, Eric Kow wrote:
> We need it to separate 'clean' paths from 'dirty' paths. Dirty paths
> are ones that have not been transformed or checked to make sure they
> are sane.
Aren't the current Darcs.RepoPath paths implicitly marked clean,
with the dirty ones being String? The RepoPath work was never
really finished, just sort of inserted piecemeal into Darcs;
hopefully, Petr's attempt is a lot better.
--
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
For a faster response, try +44 (0)1273 64 2905 or
xmpp:kowey@jabber.fr (Jabber or Google Talk only)
|
msg12211 (view) |
Author: benmos |
Date: 2010-08-16.20:06:41 |
|
Just been pointed to this as I started doing a bit more work on
'pathtype' recently. FWIW I'm working on adding basic Windows path
support (UNC, drive letters etc). As I see it, the major differences
are: pathtype's API is based on System.FilePath, pathtype uses phantom
types (as opposed to newtypes and a typeclass), pathtype uses a File/Dir
distinction. (As an aside, following some comments from Duncan and Wren,
I'm wondering about expanding its Abs/Rel distinction to capture
'Partial' paths). Anyway, I'd be interested in any thoughts.
|
msg12222 (view) |
Author: mornfall |
Date: 2010-08-18.18:46:15 |
|
@benmos: What would partial paths be? I also need the code to be
efficient, and probably to store the paths in bytestrings. And I am not
very fond of the filepath API, since it's quite big and somewhat
unwieldy (although it could be that it can't be really improved much).
Anyway, most of the hashed-storage code is about manipulating trees, so
I need to traverse the tree using a path. The view patterns I introduced
in here seem to be very convenient for that.
|
msg12382 (view) |
Author: kowey |
Date: 2010-08-31.09:52:57 |
|
Obsoleted by patch374 if I understand correctly
|
|
Date |
User |
Action |
Args |
2010-08-11 23:43:13 | mornfall | create | |
2010-08-11 23:44:45 | darcswatch | set | darcswatchurl: http://darcswatch.nomeata.de/repo_http:__darcs.net_.html#bundle-1ed13311268a357282cf7ca3523df562a7b40478 |
2010-08-12 12:47:40 | kowey | set | nosy:
+ kowey messages:
+ msg12140 |
2010-08-13 08:54:08 | kowey | set | nosy:
+ kirby messages:
+ msg12151 |
2010-08-14 11:12:11 | kowey | set | messages:
+ msg12161 |
2010-08-14 11:14:23 | kowey | set | messages:
+ msg12162 |
2010-08-16 20:06:41 | benmos | set | nosy:
+ benmos messages:
+ msg12211 |
2010-08-18 18:46:16 | mornfall | set | messages:
+ msg12222 |
2010-08-31 09:52:57 | kowey | set | status: needs-review -> obsoleted messages:
+ msg12382 superseder:
+ adventure branch |
2011-05-10 20:06:08 | darcswatch | set | darcswatchurl: http://darcswatch.nomeata.de/repo_http:__darcs.net_.html#bundle-1ed13311268a357282cf7ca3523df562a7b40478 -> http://darcswatch.nomeata.de/repo_http:__darcs.net_reviewed.html#bundle-1ed13311268a357282cf7ca3523df562a7b40478 |
|