darcs

Issue 1773 web service for fast darcs get/pull over HTTP

Title: web service for fast darcs get/pull over HTTP
Priority: feature
Status: needs-implementation
Milestone:
Resolved in:
Superseder:
Nosy List: darcs-devel, dmitry.kurochkin, kowey, mornfall, tux_rocker
Assigned To:
Topics: HTTP, Performance

Created on 2010-03-21.07:04:56 by kowey, last changed 2010-03-22.13:12:02 by kowey.

Messages
msg10345 Author: kowey Date: 2010-03-21.07:04:50
We'd like to have some sort of smart server for Darcs.  When you darcs get
http://example.com/foo, the remote end should be intelligent about giving
you just the files you need, and in the optimal number of chunks (no
more zillions of little files).

This could make a good Summer of Code project.

Also it may be a good idea to think about how this fits into darcs
transfer-mode.  Could we aim for some sort of convergence?  Could darcs
transfer-mode offer similar intelligence over SSH?  Could the script
just be darcs transfer-mode --http?

Petr: I didn't understand everything in the conversation (during the
2010-03 sprint Friday lunch).  Do you think you could jot down a few
more details on how we'd like the design for this to work?  I dimly
recall some debate about how to arrange things so that it's all very
easy to debug, eg. with curl/wget
msg10349 Author: tux_rocker Date: 2010-03-21.10:50:40
On Sunday 21 March 2010 08:04, Eric Kow wrote:
> Petr: I didn't understand everything in the conversation (during the
> 2010-03 sprint Friday lunch).  Do you think you could jot down a few
> more details on how we'd like the design for this to work?  I dimly
> recall some debate about how to arrange things so that it's all very
> easy to debug, eg. with curl/wget

I suggested making this a web service, and making it a (simple, low-level)
end-user web interface and a darcs get/put/pull/push protocol at the same time.
That is, when you request http://darcs.net/reposervice with an 'Accept: 
application/x-darcs-repo' header, you get the binary blob that darcs needs; if 
you request the same URL with 'Accept: text/html', you get a human-readable 
index web page about the repo.

Petr said that this would be hard to debug with wget (because wget would just 
give you HTML). I said that curl and wget have flags to set headers such as 
'Accept'.
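
To make the content-negotiation idea above concrete, here is a minimal Python sketch. It is an illustration only, not anything darcs ships. The function names `choose_representation` and `build_repo_request` are hypothetical, the media type is the one proposed in this message (it is not a registered type), and the Accept parsing is deliberately simplified: it ignores q-values and wildcards.

```python
from urllib.request import Request

# Media type taken from the proposal in this thread; an assumption, not a registered type.
DARCS_MEDIA_TYPE = "application/x-darcs-repo"


def choose_representation(accept_header: str) -> str:
    """Return "repo" if the client asked for the darcs media type, else "html".

    Simplified on purpose: media-type parameters such as q-values are dropped
    and wildcards like */* are not treated as matching the darcs type.
    """
    media_types = [
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    ]
    if DARCS_MEDIA_TYPE in media_types:
        return "repo"
    return "html"


def build_repo_request(url: str) -> Request:
    """Build (but do not send) a request that asks for the repository data."""
    return Request(url, headers={"Accept": DARCS_MEDIA_TYPE})
```

For debugging by hand, the same header can be set with `curl -H 'Accept: application/x-darcs-repo' URL` or `wget --header='Accept: application/x-darcs-repo' URL`.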

However, building such a multipurpose web service seems to be against the Unix 
philosophy of building small tools that do one thing well. But making a simple 
web service (or "protocol over HTTP", if that's more appropriate), even 
without the human-readable interface, has the advantage that you get error 
reporting and proxying and such HTTP features for free. 

Besides, "CGI script" sounds like 1995. Wouldn't it be a better idea to use 
some Haskell web server package, or fcgi, to make stuff even faster?

Reinier
msg10359 Author: mornfall Date: 2010-03-21.11:53:13
Reinier Lamers <bugs@darcs.net> writes:

> Reinier Lamers <tux_rocker@reinier.de> added the comment:
>
> On Sunday 21 March 2010 08:04, Eric Kow wrote:
>> Petr: I didn't understand everything in the conversation (during the
>> 2010-03 sprint Friday lunch).  Do you think you could jot down a few
>> more details on how we'd like the design for this to work?  I dimly
>> recall some debate about how to arrange things so that it's all very
>> easy to debug, eg. with curl/wget
>
> I suggested making this a web service, and making it a (simple, low-level)
> end-user web interface and a darcs get/put/pull/push protocol at the same time.
> That is, when you request http://darcs.net/reposervice with an 'Accept: 
> application/x-darcs-repo' header, you get the binary blob that darcs needs; if 
> you request the same URL with 'Accept: text/html', you get a human-readable 
> index web page about the repo.
I still think that http://darcs.net/?repository is a nicer way of
putting it. For bigger chunks of data, we can use HTTP POST, presumably.

> Petr said that this would be hard to debug with wget (because wget would just 
> give you HTML). I said that curl and wget have flags to set headers such as 
> 'Accept'.
That was actually Joachim who said that, but I agree that as long as it
works well to have everything in the URL it would be better. Makes
things clearer and more explicit.
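
As a rough illustration of the "everything in the URL" variant, the Python sketch below checks for the `?repository` marker from the example URL above. The helper name `wants_repository` is hypothetical, and treating a bare `repository` query key as the trigger is an assumption about how such a service might dispatch.

```python
from urllib.parse import parse_qs, urlsplit


def wants_repository(url: str) -> bool:
    """Return True if the URL carries the bare ?repository marker."""
    query = urlsplit(url).query
    # keep_blank_values=True so that a key without a value ("?repository") is kept.
    return "repository" in parse_qs(query, keep_blank_values=True)
```

With this approach a plain `wget 'http://darcs.net/?repository'` would exercise the darcs side of the service with no extra headers.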

> However, building such a multipurpose web service seems to be against the Unix 
> philosophy of building small tools that do one thing well. But making a simple 
> web service (or "protocol over HTTP", if that's more appropriate), even 
> without the human-readable interface, has the advantage that you get error 
> reporting and proxying and such HTTP features for free. 
I don't see the multipurposeness in this. This is about having a custom
darcs server working over HTTP. The fact we can have a web UI on the
same URL is orthogonal to that (but has a nice property of being able to
use the same URL for get/pull and for browsing).

> Besides, "CGI script" sounds like 1995. Wouldn't it be a better idea to use 
> some Haskell web server package, or fcgi, to make stuff even faster?

Maybe, but CGI actually works. To get FCGI, you need C libraries
(there's a Haskell interface to FCGI, but it requires the FCGI C kit), so this
is not really an option. Having an embedded web server is an option of
course, but I'd make that strictly optional. It's usually much easier to
set up CGI than a reverse proxy.
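
For readers unfamiliar with why CGI is easy to set up, the Python sketch below shows the whole contract: the web server passes request details in environment variables, and the script writes a header block, a blank line, and a body to standard output. The function name `cgi_response`, the `repository` query marker, and the placeholder body are assumptions for illustration; the real payload format is not specified in this thread.

```python
import os
import sys


def cgi_response(environ) -> str:
    """Build a CGI response: header block, blank line, then the body."""
    if environ.get("QUERY_STRING", "") == "repository":
        # Placeholder body; the actual repository payload is not defined here.
        return (
            "Content-Type: application/x-darcs-repo\r\n"
            "\r\n"
            "<repository data placeholder>"
        )
    return (
        "Content-Type: text/html\r\n"
        "\r\n"
        "<html><body>Repository index</body></html>"
    )


if __name__ == "__main__":
    # When run by a web server as a CGI script, os.environ holds the request.
    sys.stdout.write(cgi_response(os.environ))
```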

Yours,
   Petr.
msg10360 Author: tux_rocker Date: 2010-03-21.12:26:37
On Sunday 21 March 2010 12:53, you wrote:
> Reinier Lamers <bugs@darcs.net> writes:
> > However, building such a multipurpose web service seems to be against the
> > Unix philosophy of building small tools that do one thing well. But making a
> > simple web service (or "protocol over HTTP", if that's more appropriate), even
> > without the human-readable interface, has the advantage that you get error
> > reporting and proxying and such HTTP features for free.
> I don't see the multipurposeness in this. This is about having a custom
> darcs server working over HTTP. The fact we can have a web UI on the
> same URL is orthogonal to that (but has a nice property of being able to
> use the same URL for get/pull and for browsing).

I was criticizing my own proposal there, not yours :-). So I say getting a 
protocol that darcs can use is more important than a human-usable web 
interface. And I think we agree on that.

Reinier
History
Date                 User        Action  Args
2010-03-21 07:04:57  kowey       create
2010-03-21 10:50:43  tux_rocker  set     nosy: + tux_rocker
                                         messages: + msg10349
2010-03-21 10:58:38  kowey       set     status: needs-reproduction -> needs-implementation
                                         assignedto: mornfall ->
                                         title: smart CGI script for fast darcs get/pull over HTTP -> web service for fast darcs get/pull over HTTP
2010-03-21 11:53:16  mornfall    set     messages: + msg10359
                                         title: web service for fast darcs get/pull over HTTP -> smart CGI script for fast darcs get/pull over HTTP
2010-03-21 12:26:40  tux_rocker  set     messages: + msg10360
2010-03-22 13:12:02  kowey       set     title: smart CGI script for fast darcs get/pull over HTTP -> web service for fast darcs get/pull over HTTP
2010-03-23 15:59:52  kowey       link    issue1691 superseder