Issue 498: progress indicators - Darcs bug tracker

Title	progress indicators
Priority	wishlist	Status	resolved
Milestone		Resolved in
Superseder	option to show commute progress (e.g. pull --very-verbose) View: 72	Nosy List	darcs-devel, dmitry.kurochkin, kowey, markstos, quick, thorkilnaur, tommy, zooko
Assigned To		Topics

Created on 2007-07-20.20:08:35 by zooko, last changed 2009-08-27.14:07:07 by admin.

Messages
msg1881 (view)	Author: zooko	Date: 2007-07-20.20:08:34
As per this e-mail message:

http://lists.osuosl.org/pipermail/darcs-devel/2007-July/005927.html

If darcs is in the process of doing something (in this case fetching metadata from a remote repository), it doesn't do a good enough job of informing the user about its progress. (It should do this when the user has passed "-v -v -v" to indicate that she wants verbose output.)

Progress indicators might seem like a mere "decoration", but they are actually very important to the user experience, so this is not an unimportant feature request.

Regards,

Zooko
msg1882 (view)	Author: kowey	Date: 2007-07-20.20:48:15
Thanks Zooko,

> Progress indicators might seem like a mere "decoration", but they are
> actually very important to the user experience, so this is not an
> unimportant feature request.

I think we all agree that this is very important stuff. You might want to have a look at the related issue72.

Also, just to check, is it really both darcs get and darcs pull where Rob is experiencing the problems?

The reason I ask is that if it was just darcs pull, we have an idea why it's so slow (lots of commutation). And for this, we really ought to have some extra progress indicators.

But if darcs get was also slow and silent, then I am puzzled, because then the issue is likely something more mundane like fetching files over HTTP. And darcs get has plenty of verbosity there, so it wouldn't be both slow and silent.

So just to confirm, is it really just darcs pull that has this silent behaviour?
msg1889 (view)	Author: quick	Date: 2007-07-21.23:06:21
On 20 Jul 2007, at 1:47 PM, Eric Y. Kow wrote:

> Thanks Zooko,
>
>> Progress indicators might seem like a mere "decoration", but they are
>> actually very important to the user experience, so this is not an
>> unimportant feature request.
>
> I think we all agree that this is very important stuff. You might want to have
> a look at the related issue72.
>
> Also, just to check, is it really both darcs get and darcs pull where Rob is
> experiencing the problems?
>
> The reason I ask is that if it was just darcs pull, we have an idea why it's
> so slow (lots of commutation). And for this, we really ought to have some
> extra progress indicators.

I did some experiments on a local repository that I access remotely via SSH. It turns out that a lot of the time is consumed fetching patches via SSH and there are no -v progress indicators for when this is happening. The commutation was negligible in my case.

I then got the (at the time seemingly) bright idea that I could use David's new caching (_darcs/prefs/sources) to cache SSH-fetched patches. Thus the patch fetching time would only be paid for the first retrieval. After a little Haskelling, I had this up and running and it did indeed help significantly: my original "darcs pull --dry-run" (with ssh control-mastering) was 1:47 (one minute, 47 seconds)... the new version with caching dropped that time to under 6 seconds for subsequent pulls.

Alas, I realized I had leapt before I'd looked. A patch (for a DarcsRepo at least) is only valid in the context of its repository, but the caching doesn't store the repository information. Details:

* Imagine a repository (R1) containing file foo.txt.
* Create two new repositories from R1 (called R2 and R3, oddly enough).
* In R1, add some lines to the beginning of foo.txt and record that change.
* Next in R1, add some lines to the middle of foo.txt and record that change.
* In R2, pull only the second patch from R1.
This will cause the patch to be rewritten to change the hunk offsets to account for the lack of the first patch. However, the patch name and internal PatchInfo are the same for that patch.
* In R3, do a dry-run pull from R1, caching R1's patches locally. Then do a pull -a from R2: the problem occurs because darcs will use the locally cached patch, which is not the patch from R2 (wrong hunk offsets).

At this point, I have an alternative suggestion for people facing these types of performance problems, and some questions for David or more knowledgeable folks:

Suggestion: create a local "mirror" of the remote repository. Periodically just do "push -a" and "pull -a" to keep the mirror in sync with the remote repository. Then do all the local working repository pushes and pulls against the local mirror; it's local so there's no ssh (or http) access overhead, and it should be much faster.

Questions:

* I looked back over the reflector mail, but I'm afraid I'm still a little foggy over the new HashedRepo format and how it's different from the old-style DarcsRepo. And particularly, how is it valid to cache patches for a HashedRepo in a manner that avoids the above problem for DarcsRepo-style patches?
  [If David, Eric, or one of the other darcs illuminati would send me notes on HashedRepo details, I'd be happy to assist in converting that to a lengthier exposition for inclusion in darcs docs. If this is already there and I've just overlooked it, please just clue me in. :-) ]
* Does the caching sound interesting enough to warrant resolving the above issue in some manner, or is the suggested workaround above the simplest and recommended way to handle this?

If the caching is interesting, here are some possible thoughts for making it work:

* In the cache: writable directories, use a subdirectory that is named or has a special file that specifies the remote repository that the entries in that cache came from.
* To fully support the above, the repositories probably need to have an internal file that indicates "last modified date" for their repository. Anything which might modify a patch file after it was stored in the repository (unpull/unrecord/optimize+reorder/amend) would change "modified date". Any cache-based access from a remote repo would now pull this "modified date" information along with the inventory and compare it to the contents of the cache, wiping the cache if the source was modified more recently. Note that record and pull operations would not change the modified date, since they don't invalidate previous patches that might have been cached.
* Alternatively, add a 5th _darcs/prefs/sources type which is a "mirror-repo:". Currently, a repo: entry in sources is read-only. A mirror-repo: would be writeable. This would essentially be an automation of the above workaround suggestion, but it would allow the local mirror to be automatically updated when working in other local repos. Unfortunately, there are some issues to consider for this method with respect to disallowing other types of updates to the mirror repo and things like that.

Back to Zooko's original issue, I think feedback is definitely desirable. Although the user should be allowed to explicitly specify the verbosity of a particular operation, I'd like to suggest (1) having a timing threshold (obtained from a prefs file or environment variable) that defaults to something like 5 seconds. Any operation whose elapsed time exceeded this threshold would increase its verbosity, and (2) making any verbosity controls default their inputs from environment variables or user prefs files. Thus the normal stuff is fast and quiet, but at about the time the user starts wondering what's going on, darcs starts telling them.
And making it an environment variable/prefs file means that it would always be in place; requiring it on the command line means that a user starts a command, then after the pain threshold is crossed, they have to Ctrl-C abort it and start from the beginning with more verbosity flags.

OK, enough rambling. Feedback/info on the above would be appreciated.

-KQ

> But if darcs get was also slow and silent, then I am puzzled, because then
> the issue is likely something more mundane like fetching files over HTTP.
> And darcs get has plenty of verbosity there, so it wouldn't be both slow
> and silent.
>
> So just to confirm, is it really just darcs pull that has this silent
> behaviour?
>
> --
> Eric Kow http://www.loria.fr/~kow
> PGP Key ID: 08AC04F9 Merci de corriger mon français.
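
The cache problem described in msg1889 (R1, R2, R3) can be illustrated with a toy model. This is only a sketch under stated assumptions: `Hunk`, `PatchCache`, `key_includes_repo` and the offsets are invented for illustration and are not darcs code or darcs's on-disk format. It shows a cache keyed by patch name alone returning R1's copy of a patch when R2's commuted copy was wanted, and a cache that also records the source repository (the first of the "possible thoughts" above) avoiding that.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Hunk:
    """Toy patch body: insert `text` at line `offset` of foo.txt."""
    offset: int
    text: str


@dataclass
class PatchCache:
    """Hypothetical local patch cache (not darcs's real cache code)."""
    key_includes_repo: bool
    entries: dict = field(default_factory=dict)

    def fetch(self, repos, repo, name):
        """Return the cached patch, reading it from `repos` only on a miss."""
        key = (repo, name) if self.key_includes_repo else name
        if key not in self.entries:
            self.entries[key] = repos[repo][name]
        return self.entries[key]


# R1 holds both patches; R2 pulled only "middle", so its copy was
# commuted past "top" and carries a smaller offset (numbers are invented).
repos = {
    "R1": {"top": Hunk(0, "two new lines"), "middle": Hunk(12, "more lines")},
    "R2": {"middle": Hunk(10, "more lines")},
}

by_name = PatchCache(key_includes_repo=False)
by_name.fetch(repos, "R1", "middle")          # dry-run pull from R1 fills the cache
stale = by_name.fetch(repos, "R2", "middle")  # pull from R2 hits R1's copy

by_repo = PatchCache(key_includes_repo=True)
by_repo.fetch(repos, "R1", "middle")
fresh = by_repo.fetch(repos, "R2", "middle")  # separate entry per source repo
```

Here `stale` carries R1's offset even though the patch was requested from R2, which is the "wrong hunk offsets" failure; `fresh` carries R2's offset because the two repositories no longer share a cache entry.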
msg1890 (view)	Author: kowey	Date: 2007-07-22.04:20:29
On Sat, Jul 21, 2007 at 15:56:11 -0700, Kevin Quick wrote:
> I did some experiments on a local repository that I access remotely
> via SSH. It turns out that a lot of the time is consumed fetching
> patches via SSH and there's no -v progress indicators for when this
> is happening. The commutation was negligible in my case.

Ah, the dangers of making assumptions. Thanks for pointing out this possibility, Kevin! In fact, this sort of lines up nicely with Zooko saying that it was about synching with remote repos.

Zooko, have you tried the chatty ssh trick to log your ssh calls?

http://wiki.darcs.net/index.html/DeveloperTips

You might be able to watch (i.e. the 'watch' cmd) your ssh call log and see what is happening.

One thing I wonder is if it is literally fetching the patches over SSH or something even simpler. For example, when making SSH connections at work, I always get a little delay unless I force it to use IPv4 addresses only. Never really figured out why; all I remember was that without this setting, ssh would hang for like 30 seconds (timeout?) before successfully connecting. Zooko, are you sure it's not something ridiculously simple like that?

Haven't read the rest of Kevin's mail yet ( sorry :-) )

--
Eric Kow http://www.loria.fr/~kow
PGP Key ID: 08AC04F9 Merci de corriger mon français.
msg2887 (view)	Author: markstos	Date: 2008-01-30.02:33:23
There is now a progress reporting framework in the unstable branch.
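
The framework mentioned here is not shown in this thread. Purely as an illustration of the timing-threshold idea from msg1889 (stay quiet until an operation has run longer than a default of about 5 seconds, with the default taken from the environment), a minimal sketch might look like the following. The variable name `DARCS_PROGRESS_THRESHOLD`, the class `QuietThenVerbose` and the function `threshold_seconds` are all assumptions made up for this example, not part of darcs.

```python
import os
import time


def threshold_seconds(env=os.environ, default=5.0):
    """Read the quiet period from a hypothetical environment variable."""
    raw = env.get("DARCS_PROGRESS_THRESHOLD")
    try:
        return float(raw) if raw is not None else default
    except ValueError:
        # Unparseable setting: fall back to the built-in default.
        return default


class QuietThenVerbose:
    """Suppress progress messages until the operation exceeds a threshold."""

    def __init__(self, threshold, clock=time.monotonic, emit=print):
        self.threshold = threshold
        self.clock = clock
        self.emit = emit
        self.start = clock()  # time at which the operation began

    def report(self, message):
        """Emit `message` only once the threshold has passed.

        Returns True if the message was emitted, False if it was suppressed.
        """
        if self.clock() - self.start > self.threshold:
            self.emit(message)
            return True
        return False
```

A long-running command would create one `QuietThenVerbose(threshold_seconds())` when it starts and call `report(...)` at each step, for example once per fetched patch. Fast commands never print anything, and slow ones start reporting without the user having to abort and rerun with extra verbosity flags.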

History
Date	User	Action	Args
2007-07-20 20:08:36	zooko	create
2007-07-20 20:48:17	kowey	set	status: unread -> unknown nosy: + darcs-devel messages: + msg1882
2007-07-20 21:00:34	kowey	set	nosy: - darcs-devel superseder: + option to show commute progress (e.g. pull --very-verbose)
2007-07-21 23:06:22	quick	set	nosy: + darcs-devel, quick messages: + msg1889
2007-07-22 04:20:31	kowey	set	messages: + msg1890
2008-01-30 02:33:24	markstos	set	status: unknown -> resolved-in-unstable nosy: + markstos messages: + msg2887
2008-09-04 21:31:13	admin	set	status: resolved-in-unstable -> resolved nosy: + dagit
2009-08-06 17:33:54	admin	set	nosy: + jast, Serware, dmitry.kurochkin, mornfall, simon, thorkilnaur, - droundy, quick
2009-08-06 20:31:20	admin	set	nosy: - beschmi
2009-08-10 22:06:13	admin	set	nosy: + quick, - jast, Serware, mornfall
2009-08-11 00:01:35	admin	set	nosy: - dagit
2009-08-25 17:48:36	admin	set	nosy: - simon
2009-08-27 14:07:07	admin	set	nosy: tommy, kowey, markstos, darcs-devel, zooko, quick, thorkilnaur, dmitry.kurochkin