darcs

Issue 757 darcs get doesn't validate inventory file

Title: darcs get doesn't validate inventory file
Priority: bug
Status: resolved
Milestone:
Resolved in:
Superseder:
Nosy List: Serware, darcs-devel, dmitry.kurochkin, ertai, jch, kowey, thorkilnaur, tommy
Assigned To:
Topics: HTTP

Created on 2008-03-22.17:43:27 by ertai, last changed 2010-03-18.13:56:26 by kowey.

Messages
msg3970 Author: ertai Date: 2008-03-22.17:43:24
With darcs2, running

  darcs get http://www.lri.fr/~sozeau/repos/coq/fingertrees

shows that the request for the _darcs/format file returns an HTML error page,
since the format file doesn't exist.

I would like to help, but I don't know what the right fix would be. There is a
trade-off between reporting parsing errors and concluding that the file is not
reliable, and so falling back to the darcs1 repo format.
msg3972 Author: droundy Date: 2008-03-22.18:00:23
On Sat, Mar 22, 2008 at 05:43:27PM -0000, Nicolas Pouillard wrote:
> With darcs2 doing
> 
>   darcs get http://www.lri.fr/~sozeau/repos/coq/fingertrees
> 
> Shows that asking for the _darcs/format file returns an HTML error since the
> format file doesn't exist.
>
> I would like to help but I don't know what would be the right fix. There is a
> trade-off between reporting parsing errors and conclude that the file is not
> reliable and so fall back to darcs1 repo format.

Darcs handles file-not-found errors just fine.  The problem is that this
server isn't returning a 404 error as it's supposed to, instead it's giving
a 302 response.  I'm not sure what's the best workaround for buggy web
servers.  The simplest fix is to just create a format file on the server,
or perhaps to use a less buggy file server.

If neither of these are reasonable options, I suppose we could try to hack
up some sort of workaround, maybe searching for the text "This page is not
available"... but that's a *seriously* ugly hack.  We can't just ignore
format files that we don't understand, because the entire point of the
format file is that when it's present and we don't understand it, we know
that it's a format we don't understand, and can thus fail with a nice error
message.
-- 
David Roundy
Department of Physics
Oregon State University
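
The rule described above (a missing format file is fine, while a present but
unrecognized one must be a hard error) can be sketched as follows. This is only
an illustration in Python, not darcs code: the function name, the exception and
the KNOWN_FORMATS set are hypothetical. It also shows why an HTML error page
served in place of a 404 ends up being reported as an unknown format.

```python
from typing import Optional, Set

# Hypothetical set of format names this client understands (illustrative only).
KNOWN_FORMATS = {"darcs-1.0", "hashed", "darcs-2"}


class UnknownFormat(Exception):
    """The format file exists but names something this client cannot read."""


def identify_format(format_file: Optional[str]) -> Set[str]:
    """Return the format properties of a repository.

    format_file is the body of _darcs/format, or None when the fetch
    reported that the file does not exist (for example an HTTP 404).
    """
    if format_file is None:
        # No format file at all: assume the original darcs1 layout.
        return {"darcs-1.0"}
    properties = {line.strip() for line in format_file.splitlines() if line.strip()}
    unknown = properties - KNOWN_FORMATS
    if unknown:
        # Present but not understood: refuse rather than guess.
        raise UnknownFormat(", ".join(sorted(unknown)))
    return properties
```

With this rule, a server that answers a request for a missing file with an HTML
page makes the client see a "format" it does not understand, so the get fails
instead of falling back to the darcs1 layout.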
msg3973 Author: ertai Date: 2008-03-22.18:44:06
Excerpts from David Roundy's message of Sat Mar 22 18:54:32 +0100 2008:
> On Sat, Mar 22, 2008 at 05:43:27PM -0000, Nicolas Pouillard wrote:
> > With darcs2 doing
> > 
> >   darcs get http://www.lri.fr/~sozeau/repos/coq/fingertrees
> > 
> > Shows that asking for the _darcs/format file returns an HTML error since the
> > format file doesn't exist.
> >
> > I would like to help but I don't know what would be the right fix. There is a
> > trade-off between reporting parsing errors and conclude that the file is not
> > reliable and so fall back to darcs1 repo format.
> 
> Darcs handles file-not-found errors just fine.  The problem is that this
> server isn't returning a 404 error as it's supposed to, instead it's giving
> a 302 response.  I'm not sure what's the best workaround for buggy web
> servers.  The simplest fix is to just create a format file on the server,
> or perhaps to use a less buggy file server.

I understand that's basically a "not our bug" case, but it's annoying and
perhaps some heuristic could help us.

> If neither of these are reasonable options, I suppose we could try to hack
> up some sort of workaround, maybe searching for the text "This page is not
> available"... but that's a *seriously* ugly hack.  We can't just ignore
> format files that we don't understand, because the entire point of the
> format file is that when it's present and we don't understand it, we know
> that it's a format we don't understand, and can thus fail with a nice error
> message.

Perhaps searching for any XML tag would be less ugly, with a warning printed
that this XML answer has been treated as a "File not found".
msg4003 Author: droundy Date: 2008-03-25.15:41:28
Fixed. I'm not sure why sometimes our automatic closing of bugs doesn't seem to
work... it may be an email problem.  :(

David
msg4006 Author: ertai Date: 2008-03-25.20:11:06
These buggy servers can bring even more problems, consider:

$ darcs get http://www.lri.fr/~sozeau/this_does_not_exists
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
[-- snip --]
</html>
**********************
Finished getting.

It successfully downloads an empty repository.
However, there are more important things to deal with...
msg4007 Author: droundy Date: 2008-03-25.20:20:53
So now the problem is that we apparently don't verify that the inventory file is
valid... yuck.
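
A minimal sketch of the kind of sanity check that is missing here, written in
Python for illustration only. The function name is hypothetical, and the list
of leading markers is an assumption recalled from the on-disk inventory layout
rather than something stated in this thread; the real grammar lives in the
darcs source. The idea is simply to refuse a fetched inventory that does not
start the way an inventory should, instead of silently reading it as zero
patches.

```python
# ASSUMPTION: leading markers of a darcs inventory, recalled from memory and
# not taken from this thread. Treat the tuple as illustrative, not complete.
ASSUMED_INVENTORY_PREFIXES = (
    "pristine:",
    "Starting with tag:",
    "Starting with inventory:",
    "[",
)


def inventory_looks_plausible(body: str) -> bool:
    """Cheap check that a fetched inventory is not arbitrary garbage."""
    text = body.lstrip()
    if not text:
        # An empty inventory is a legitimately empty repository.
        return True
    # str.startswith accepts a tuple of alternatives.
    return text.startswith(ASSUMED_INVENTORY_PREFIXES)
```

An HTML error page such as the one shown in msg4006 starts with "<!DOCTYPE",
so a check of this shape would reject it rather than report "Finished getting."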
msg4206 Author: jch Date: 2008-04-07.21:23:41
The server is definitely brain-damaged, but then, what do you expect from Orsay
University ;-)

On the other hand, the fact that we fail to verify that the inventory makes
sense to darcs is definitely a bug.  Retitling.

--Juliusz
msg6985 Author: thorkilnaur Date: 2009-01-05.14:52:44
Changing status to "need-volunteer" on this, as there definitely seems to be
some problem to be fixed (the validation of inventories). Although I am a great
fan of validating one's data thoroughly, I am uncertain how important this
particular lack of validation is in practice.

Best regards
Thorkil
msg10274 Author: droundy Date: 2010-03-18.13:44:43
The following patch updated the status of issue757 to be resolved:

* resolve issue757: ignore bogus formats including '<' 
This is an ugly hack, and means we can never use the '<' character in the
format description.  But it means that any xml or html garbage we might be
fed will be rejected.
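
The rule in this patch description can be sketched as below. This is a Python
illustration of the stated behaviour only, not the actual patch; the function
name is hypothetical. Any format text containing '<' is treated as bogus and
ignored, which is why '<' can never appear in a real format description.

```python
from typing import Optional


def usable_format_text(body: str) -> Optional[str]:
    """Return the format text, or None if it should be ignored as bogus."""
    if "<" in body:
        # XML or HTML garbage (e.g. a server error page): ignore it.
        return None
    return body
```

Returning None here stands for "act as if no format file was served", so an
error page from a misbehaving server no longer blocks the get.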
msg10276 Author: kowey Date: 2010-03-18.13:47:10
In case anybody is puzzling over this, I was just clearing out some bugs
that an older version of our update_roundup script had missed.  The
patch was from a couple of years ago.
History
Date                 User         Action  Args
2008-03-22 17:43:27  ertai        create
2008-03-22 18:00:25  droundy      set     status: unread -> unknown
                                          nosy: + darcs-devel
                                          messages: + msg3972
2008-03-22 18:44:08  ertai        set     nosy: droundy, tommy, beschmi, kowey, darcs-devel, ertai, Serware
                                          messages: + msg3973
2008-03-24 15:01:48  droundy      set     nosy: droundy, tommy, beschmi, kowey, darcs-devel, ertai, Serware
                                          title: darcs2 fails to identify the repository on some restrictive http servers. -> darcs2 fails to identify the repository on some buggy http servers.
2008-03-25 15:41:29  droundy      set     status: unknown -> resolved
                                          nosy: droundy, tommy, beschmi, kowey, darcs-devel, ertai, Serware
                                          messages: + msg4003
2008-03-25 20:11:07  ertai        set     status: resolved -> unknown
                                          nosy: droundy, tommy, beschmi, kowey, darcs-devel, ertai, Serware
                                          messages: + msg4006
2008-03-25 20:20:54  droundy      set     nosy: droundy, tommy, beschmi, kowey, darcs-devel, ertai, Serware
                                          messages: + msg4007
2008-03-25 20:21:59  droundy      set     topic: - Target-2.0
                                          nosy: droundy, tommy, beschmi, kowey, darcs-devel, ertai, Serware
                                          title: darcs2 fails to identify the repository on some buggy http servers. -> darcs2 fails to deal with some buggy http servers.
2008-03-26 14:39:54  droundy      set     priority: bug -> feature
                                          nosy: droundy, tommy, beschmi, kowey, darcs-devel, ertai, Serware
2008-04-07 21:23:42  jch          set     priority: feature -> bug
                                          nosy: + jch
                                          messages: + msg4206
                                          title: darcs2 fails to deal with some buggy http servers. -> darcs2 get doesn't validate inventory file
2009-01-05 14:52:46  thorkilnaur  set     status: unknown -> needs-reproduction
                                          nosy: + dmitry.kurochkin, simon, thorkilnaur
                                          messages: + msg6985
2009-01-05 15:04:33  droundy      set     nosy: - droundy
2009-08-06 21:00:54  admin        set     nosy: - beschmi
2009-08-11 18:02:50  kowey        set     status: needs-reproduction -> needs-implementation
                                          nosy: jch, tommy, kowey, darcs-devel, simon, thorkilnaur, ertai, dmitry.kurochkin, Serware
                                          topic: + HTTP, - Darcs2
                                          title: darcs2 get doesn't validate inventory file -> darcs get doesn't validate inventory file
2009-08-25 17:38:06  admin        set     nosy: - simon
2009-08-27 14:23:12  admin        set     nosy: jch, tommy, kowey, darcs-devel, thorkilnaur, ertai, dmitry.kurochkin, Serware
2009-10-23 22:39:43  admin        set     nosy: + nicolas.pouillard, - ertai
2009-10-23 22:42:32  admin        set     nosy: + serware, - Serware
2009-10-23 23:28:28  admin        set     nosy: + Serware, - serware
2009-10-24 00:04:45  admin        set     nosy: + ertai, - nicolas.pouillard
2010-03-18 13:44:46  droundy      set     status: needs-implementation -> resolved
                                          nosy: + droundy
                                          messages: + msg10274
2010-03-18 13:47:12  kowey        set     status: resolved -> unknown
                                          nosy: - droundy
                                          messages: + msg10276
2010-03-18 13:56:26  kowey        set     status: unknown -> resolved