Issue 2280: failure on NFS - Darcs bug tracker

Title	failure on NFS
Priority	bug	Status	needs-reproduction
Milestone		Resolved in
Superseder		Nosy List	fx
Assigned To		Topics

Created on 2012-12-17.11:07:08 by fx, last changed 2020-08-01.09:07:11 by bfrk.

Messages
msg16420 (view)	Author: fx	Date: 2012-12-17.11:07:07
I'm seeing problems with a repo on an NFSv4 mount (or an NFSv3 mount of the same export). The server is Solaris 10, the client is RHEL5, and darcs is 2.8.3, built with GHC 6.12.3, though it also happened with version 2.5 (I think). Doing an obliterate, I either see
  darcs: _darcs/tentative_hashed_inventory-0: rename: resource busy (Device or resource busy)
or
  darcs: _darcs/index: removeLink: resource busy (Device or resource busy)
Copying the repo to local disk works fine. I can't easily try other client/server combinations currently, in case it's something specific to that.
msg16421 (view)	Author: markstos	Date: 2012-12-17.15:02:54
What would you expect darcs to do in this case? If the file-system is reporting that it timed out on an operation, it seems reasonable to bubble that up to the user. Perhaps you'd like darcs to have an option for a higher timeout value for filesystem operations if the filesystem appears "busy-but-functional"?
msg16422 (view)	Author: fx	Date: 2012-12-17.17:27:19
Mark Stosberg <bugs@darcs.net> writes:
> Mark Stosberg <mark@summersault.com> added the comment:
>
> What would you expect darcs to do in this case?
>
> If the file-system is reporting that it timed out on an operation, it seems
> reasonable to bubble that up to the user.
I see no evidence for filesystem timeouts. Is that a known problem that produces those symptoms? I was hoping/expecting it was known, despite the lack of bug reports I could find. I was guessing it would be related to locking, if anything, but, as I understand it, that should work on NFS4.
msg16423 (view)	Author: markstos	Date: 2012-12-17.18:44:49
From what I can tell, "Device or resource busy" comes not from Darcs or GHC-land, but from Linux/NFS. See all the results here: https://encrypted.google.com/search?q=nfs++%22Device+or+resource+busy%22 I'm tentatively marking this as "not our bug". If the issue can be reproduced in a way that is clearly darcs' fault, please re-open it.
msg16444 (view)	Author: fx	Date: 2012-12-20.16:35:43
Mark Stosberg <bugs@darcs.net> writes:
> Mark Stosberg <mark@summersault.com> added the comment:
>
> From what I can tell, "Device or resource busy" comes not from Darcs or
> GHC-land, but from Linux/NFS.
I realize it's NFS-related. Perhaps I should have made clear originally that I expected it to be known whether it works on NFS generally, though I couldn't find anything to say so.
> See all the results here:
>
> https://encrypted.google.com/search?q=nfs++%22Device+or+resource+busy%22
>
> I'm tentatively marking this as "not our bug". If the issue can be
> reproduced in a way that is clearly darcs' fault, please re-open it.
Are you saying that darcs simply isn't supported on NFS (or other -- AFS, CIFS, FUSE?) filesystems? If so, it needs documenting. Looking more closely, and assuming the "rename" message is actually from rename(2), POSIX says:
  [EBUSY] [CX] [Option Start] The directory named by old or new is currently in use by the system or another process, and the implementation considers this an error. [Option End]
and the Debian rename(2) man page says:
  EBUSY The rename fails because oldpath or newpath is a directory that is in use by some process (perhaps as current working directory, or as root directory, or because it was open for reading) or is in use by the system (for example as mount point), while the system considers this an error. (Note that there is no requirement to return EBUSY in such cases -- there is nothing wrong with doing the rename anyway -- but it is allowed to return EBUSY if the system cannot otherwise handle such situations.)
which suggests it is darcs' "fault", but if it's not going to be fixed, it would be useful for people to know they need to use a local filesystem. (It seems as if that's wise for performance reasons, but that's a different issue.)
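For illustration only: the rename(2) text quoted above allows a system to return EBUSY even when nothing is wrong with doing the rename, so one conceivable client-side workaround is to retry the call a few times before giving up. The sketch below is in Python, not Haskell, and is not darcs code. The helper name retry_on_ebusy, its parameters, and the file names in the demo are all hypothetical, and whether retrying would actually help on the NFS setup described here is untested.

```python
import errno
import os
import time


def retry_on_ebusy(operation, *args, attempts=5, delay=0.2, sleep=time.sleep):
    """Call operation(*args), retrying only when it fails with EBUSY.

    Hypothetical helper: the name, defaults, and retry policy are
    assumptions made for this sketch, not anything darcs does.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation(*args)
        except OSError as exc:
            # Errors other than EBUSY, or EBUSY on the final attempt,
            # are passed on to the caller unchanged.
            if exc.errno != errno.EBUSY or attempt == attempts:
                raise
            sleep(delay)


if __name__ == "__main__":
    import tempfile

    # Demo in a scratch directory; the file names are placeholders.
    with tempfile.TemporaryDirectory() as scratch:
        old = os.path.join(scratch, "tentative")
        new = os.path.join(scratch, "final")
        with open(old, "w") as handle:
            handle.write("data\n")
        retry_on_ebusy(os.rename, old, new)
        print(os.path.exists(new))
```

The same wrapper would apply to os.unlink for the removeLink case. Retrying only on EBUSY keeps other failures (missing file, permission denied) visible immediately instead of hiding them behind a delay.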
msg16445 (view)	Author: fx	Date: 2012-12-20.17:47:35
It turns out there's a test case for failure on NFS (nfs-failure.sh), but I see it failing differently now:
  $ darcs init
  $ echo first > a
  $ darcs add a
  $ darcs record --pipe --all --name=first <<EOF
  > Thu Sep 18 22:37:06 MSD 2008
  > author
  > EOF
  darcs: _darcs/index: removeLink: resource busy (Device or resource busy)
msg22344 (view)	Author: bfrk	Date: 2020-08-01.09:07:09
I have been using darcs over NFS(-3) and never observed this failure.

History
Date	User	Action	Args
2012-12-17 11:07:08	fx	create
2012-12-17 15:02:56	markstos	set	status: unknown -> waiting-for messages: + msg16421
2012-12-17 17:27:20	fx	set	messages: + msg16422
2012-12-17 18:44:50	markstos	set	priority: not-our-bug messages: + msg16423
2012-12-20 16:35:45	fx	set	messages: + msg16444
2012-12-20 17:47:36	fx	set	messages: + msg16445
2020-08-01 09:07:11	bfrk	set	priority: not-our-bug -> bug status: waiting-for -> needs-reproduction messages: + msg22344