darcs

Issue 560 pull => resource exhausted (Too many open files)

Title pull => resource exhausted (Too many open files)
Priority bug Status resolved
Milestone Resolved in
Superseder Nosy List RichardG, alex1, cvs-ghc, darcs-devel, dmitry.kurochkin, jch, kowey, simonpj, thorkilnaur, tommy
Assigned To
Topics

Created on 2007-11-08.08:40:07 by simonpj, last changed 2009-10-23.23:43:12 by admin.

Files
File name Uploaded Type Edit Remove
darcs-1.0.9-build.txt RichardG, 2007-11-09.03:06:04 text/plain
darcs-1.0.9-run.txt RichardG, 2007-11-09.03:06:45 text/plain
darcs-1.1.0pre1-build.txt RichardG, 2007-11-09.03:08:09 text/plain
darcs-1.1.0pre1-run.txt RichardG, 2007-11-09.03:08:56 text/plain
darcs-unstable-default.txt RichardG, 2007-11-11.01:13:58 text/plain
darcs-unstable-no_mmap.txt RichardG, 2007-11-11.01:16:54 text/plain
Messages
msg2245 (view) Author: simonpj Date: 2007-11-08.08:40:07
| darcs: getCurrentDirectory: resource exhausted (Too many open files)

This sounds like a Darcs bug, but it's not one I have heard before. I'm copying the Darcs bug-tracker list, because they probably know what to do.

(Darcs guys: Richard's message appears in full below.)

Simon

| -----Original Message-----
| From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users-bounces@haskell.org] On Behalf Of
| richardg@richardg.name
| Sent: 07 November 2007 19:39
| To: glasgow-haskell-users@haskell.org
| Subject: Getting source code
|
| Hello
|
| I'm trying to get the source code for development purposes (helping add
| some Haddock documentation for TH).  I tried following the steps listed on
| http://hackage.haskell.org/trac/ghc/wiki/Building/GettingTheSources and
| ran into trouble.
|
| I downloaded ghc-HEAD-2007-08-29-ghc-corelibs-testsuite.tar.bz2 and
| continued through the steps.  On step 3, I received an error message from
| darcs:
| $ darcs pull -a
| Pulling from "http://darcs.haskell.org/ghc"...
| This is the GHC darcs repository (HEAD branch)
|
| For more information, visit the GHC developer wiki at
|   http://hackage.haskell.org/trac/ghc
| **********************
| darcs: getCurrentDirectory: resource exhausted (Too many open files)
|
|
|
| I tried using darcs get but I think I ran into the same case issue as on
| Windows:
| $ darcs get http://darcs.haskell.org/ghc
| This is the GHC darcs repository (HEAD branch)
|
| For more information, visit the GHC developer wiki at
|   http://hackage.haskell.org/trac/ghc
| **********************
| Copying patch 17349 of 17349... done.
| Applying patch 12 of 17349... Unapplicable patch:
| Thu Jan 11 07:26:13 MST 1996  partain
|   * [project @ 1996-01-11 14:06:51 by partain]
|
| darcs failed:  Error applying hunk to file ./ghc/includes/rtsTypes.lh
|
|
| I'm using GHC 6.6.1 (binary distribution) and darcs 1.0.9 (can't remember
| if it's binary distribution or if I compiled it myself) on Mac OS X
| 10.4.10 for Intel.
|
| Any help would be appreciated.
|
| Thanks,
|
| Richard
| _______________________________________________
| Glasgow-haskell-users mailing list
| Glasgow-haskell-users@haskell.org
| http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
msg2246 (view) Author: RichardG Date: 2007-11-08.19:44:15
After doing some investigation, this appears to be the result of OS X limiting 
the number of files that a process can have open.  Increasing this limit makes 
the issue disappear.

By default, user accounts are limited to 256 open files (per process?) but
this can be changed with ulimit.  To see the current limit, type in
`ulimit -n'.

To increase the allowed number of files to the maximum, type in `ulimit -n
unlimited'.  Sometimes an error message will be generated, sometimes not. 
If a message is generated, restart Terminal.app and try again.  After the
operation succeeds, it's probably a good idea to set the limit back to its
original value (`ulimit -n 256').

After doing some testing, it appears that 256 files is only slightly too
small; `darcs pull -a' succeeds (at the moment) when the file limit is set
to 320.
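
The workflow above can be collected into a short shell sketch (an illustration, assuming bash or zsh on OS X; the darcs invocation is shown commented out, and 1024 is just comfortable headroom over the 320 measured to suffice):

```shell
# Show the current soft limit on open file descriptors for this shell.
soft=$(ulimit -n)
echo "current open-file limit: $soft"

# Raise the limit for this session only; it affects this shell and its
# children, while other sessions keep the default.
if [ "$soft" != "unlimited" ] && [ "$soft" -lt 1024 ]; then
    ulimit -n 1024 || echo "raising the limit failed; restart Terminal.app and retry"
fi
echo "limit is now: $(ulimit -n)"

# Then run the failing operation from the same shell:
# darcs pull -a
```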

This may also be applicable to issue87, issue 135, issue478. 

----

I set the priority to bug because I couldn't add this info without doing so.
msg2247 (view) Author: droundy Date: 2007-11-08.21:30:57
On Thu, Nov 08, 2007 at 07:44:15PM -0000, Richard Giraud wrote:
> After doing some investigation, this appears to the be result of OS X limiting 
> the number of files that a process can have open.  Increasing this limit makes 
> the issue disappear.
> 
> By default, user accounts are limited to 256 open files (per process?) but
> this can be changed with ulimit.  To see the current limit, type in
> `ulimit -n'.

Yikes! On linux this value is 1024, which I'm sure is why this hasn't been
seen before.

> To increase the allowed number of files to the maximum, type in `ulimit -n
> unlimited'.  Sometimes an error message will be generated, sometimes not. 
> If a message is generated, restart Terminal.app and try again.  After the
> operation succeeds, it's probably a good idea to set the limit back to its
> original value (`ulimit -n 256').
> 
> After doing some testing, it appears that 256 files is only slightly too
> small; `darcs pull -a' succeeds (at the moment) when the file limit is set
> to 320.

This bug shouldn't manifest itself if you recompile darcs with
--disable-mmap.  Could you check this? If it's too much trouble, I'll
completely understand, but it'd be helpful for us.

Perhaps we should give up on using mmap by default?  Obviously we're
holding all the patch files open, and my guess is that this is because they
are mmapped.  There are two fixes for this: we could disable mmap, or we
could make the code not require that all those patches be in memory at the
same time.  The latter fix is trickier and more fragile (in the sense of
easier to break by making a small change in the code), but has the
advantage of also reducing memory usage and increasing speed.  Of course,
we don't want to hold mac users hostage in order to give ourselves greater
motivation for making even better improvements.

Using mmap *is* a win under memory pressure, and just in general it should
provide faster IO.  So it'd be nice to not disable it.  We could also
figure out how to query the number of available file handles and
dynamically check and stop using mmap when we approach the limit, but that
sounds both complicated and fragile.
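
For what it's worth, querying the limit itself is straightforward via the unix package; a minimal sketch (not darcs code, just an illustration of the getrlimit call such a dynamic check would need):

```haskell
import System.Posix.Resource
  ( Resource (ResourceOpenFiles)
  , ResourceLimit (..)
  , ResourceLimits (softLimit)
  , getResourceLimit
  )

-- Soft limit on open file descriptors for this process, or Nothing
-- when there is no effective cap.
openFileLimit :: IO (Maybe Integer)
openFileLimit = do
  limits <- getResourceLimit ResourceOpenFiles
  return $ case softLimit limits of
    ResourceLimit n       -> Just n
    ResourceLimitInfinity -> Nothing
    ResourceLimitUnknown  -> Nothing

main :: IO ()
main = openFileLimit >>= print
```

The fragile part is not reading the limit, of course, but deciding per file whether mmapping one more patch would cross it.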

> This may also be applicable to issue87, issue 135, issue478. 

Thanks for looking these up!

> I set the priority to bug because I couldn't add this info without doing so.

I think "bug" is reasonable.  If you were doing something insane, then this
wouldn't be a bug, but pulling three hundred patches at once is definitely
reasonable use of darcs.
-- 
David Roundy
Department of Physics
Oregon State University
msg2249 (view) Author: RichardG Date: 2007-11-09.03:05:03
I compiled darcs 1.0.9 and 1.1.0pre1 with the --disable-mmap flag, and the result is 
still the same.
msg2250 (view) Author: RichardG Date: 2007-11-09.03:06:04
Added script from building 1.0.9.
Attachments
msg2251 (view) Author: RichardG Date: 2007-11-09.03:06:45
Added script from running 1.0.9.
Attachments
msg2252 (view) Author: RichardG Date: 2007-11-09.03:08:09
Added script from building 1.1.0pre1.
Attachments
msg2253 (view) Author: RichardG Date: 2007-11-09.03:08:56
Added script from running 1.1.0pre1.
Attachments
msg2254 (view) Author: RichardG Date: 2007-11-09.03:31:08
Given how easy it is to increase the limit, I'm guessing that the limit is a 
way of detecting and stopping runaway processes.  If this is the case, then I 
don't consider this a bug; OS X is doing what it should and darcs is doing what 
it should.

A reasonable solution would be:
- The item captured in the FAQ.
- The exception caught and a useful error message displayed (a suggestion to 
try changing the limit, a link to the FAQ, etc.).

I've updated the FAQ to include information about the issue and how to change 
the limit.
msg2255 (view) Author: droundy Date: 2007-11-09.15:32:32
On Fri, Nov 09, 2007 at 03:31:09AM -0000, Richard Giraud wrote:
> Given how easy it is to increase the limit, I'm guessing that the limit is a 
> way of detecting and stopping runaway processes.  If this is the case, then I 
> don't consider this a bug; OS X is doing what it should and darcs is doing what 
> it should.
> 
> A reasonable solution would be:
> - The item captured in the FAQ.
> - The exception caught and a useful error message displayed (a suggestion to 
> try changing the limit, a link to the FAQ, etc.).

This is definitely a good idea.  But I still consider this at the least a
performance bug:  we are using more resources than we need to use, and
that should be fixed.  The fact that the problem persists when you disable
mmap means that this *should* be a simple fix: there's no reason we
shouldn't be closing these file handles if we aren't using mmap.

> I've updated the FAQ to include information about the issue and how to change 
> the limit.

Thank you!
-- 
David Roundy
Department of Physics
Oregon State University
msg2256 (view) Author: kowey Date: 2007-11-09.15:52:20
Is there any chance at all that Simon M's strict readFile would be
helpful here, adapted to FastPackedString?  Or are they irrelevant
here, for example, because we are already doing something like it?

http://www.haskell.org//pipermail/haskell/attachments/20050802/c4d01bec/readfile.obj
http://www.haskell.org/pipermail/haskell/2005-August/016207.html

Another question: maybe instead of failing to close handles, could
there be such a problem as opening handles long long before we
actually try to read from them?
msg2257 (view) Author: droundy Date: 2007-11-09.16:06:20
On Fri, Nov 09, 2007 at 03:52:21PM -0000, Eric Kow wrote:
> Is there any chance at all that Simon M's strict readFile would be
> helpful here, adapted to FastPackedString?  Or are they irrelevant
> here, for example, because we are already doing something like it?

Actually, we almost always read strictly (and then close the file handle).
The only time we don't is when we try to read patches lazily (or use mmap),
so somehow we must be reading these patches lazily.  Now that I think about
it, mmap probably isn't a problem, because we only mmap rather large files
that haven't been gzipped.

So perhaps just eliminating the lazy reading of patches will solve our
problem.

> Another question: maybe instead of failing to close handles, could
> there be such a problem as opening handles long long before we
> actually try to read from them?

This shouldn't happen, as long as we're reading the patches strictly.  I
think we ought to go ahead and do this: simply read all patches
strictly.  Ian put a lot of work into *removing* the strict reading of
patches, so that we wouldn't need to hold entire patch files in memory, but
in practice no one uses darcs with patches that don't fit into memory, and
many people use darcs with many small patches, so we should optimize for
the latter case.  If everything worked right, we could work in both cases,
but that's not currently the case, if it ever was.

The trick is that lazy reading of patch files means we don't close them
until we've "used" the entire patch, and then only if the parser reads the
entire file, to the bitter end.
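
To make the handle lifetime concrete, here is a hedged sketch of the difference (plain String rather than FastPackedString, and readFileStrict is an illustrative name, not a darcs function):

```haskell
import System.IO

-- Lazy readFile keeps its handle open until the consumer demands the
-- last character, so reading many patch files lazily accumulates open
-- handles.  Reading strictly forces the contents while the handle is
-- still open, then closes it at once.
readFileStrict :: FilePath -> IO String
readFileStrict path =
  withFile path ReadMode $ \h -> do
    s <- hGetContents h
    length s `seq` return s  -- force everything before withFile closes h

main :: IO ()
main = do
  writeFile "patch.tmp" "hunk ./foo 1\n+hello\n"
  s <- readFileStrict "patch.tmp"
  putStrLn ("read " ++ show (length s) ++ " bytes; handle already closed")
```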

We might also simplify the patch parsing code (ReadMonad and all that) by
removing the whole lazy parsing concept.  Simpler code is easier to verify
is bug-free, and also makes it easier to reason about space/time behavior.

Of course, it could be that this is just the bug that you already fixed in
the lazy reading code, in which case we could perhaps close this
ticket... although improving the error message (as suggested by reporter)
would also be good.
-- 
David Roundy
Department of Physics
Oregon State University
msg2258 (view) Author: droundy Date: 2007-11-09.16:08:31
On Fri, Nov 09, 2007 at 04:06:21PM -0000, David Roundy wrote:
> Of course, it could be that this is just the bug that you already fixed in
> the lazy reading code, in which case we could perhaps close this
> ticket... although improving the error message (as suggested by reporter)
> would also be good.

Richard (I hate to ask you to do more, and am very grateful for the effort
you've put into this so far, but...) would you mind testing the latest
darcs-unstable branch, to see if Eric's already fixed this?
-- 
David Roundy
Department of Physics
Oregon State University
msg2259 (view) Author: RichardG Date: 2007-11-11.01:13:58
darcs-unstable with default configuration

Issue still occurs.
Attachments
msg2260 (view) Author: RichardG Date: 2007-11-11.01:16:54
darcs-unstable with mmap disabled

Issue still occurs.
Attachments
msg2270 (view) Author: jch Date: 2007-11-24.16:08:59
> | darcs: getCurrentDirectory: resource exhausted (Too many open files)

> This sounds like a Darcs bug, but it's not one I have heard before.

I've definitely seen this issue, but I don't remember how it ended.
I seem to recall that it's due to Darcs opening a number of immutable
files using unsafePerformIO, and due to lazy evaluation, they don't
get closed until some later time.

I don't remember what the solution was.  In any case, Darcs should
leave your repository in a sane state (but run ``darcs check'' just in
case).

Sorry for not being more helpful,

                                        Juliusz
msg2271 (view) Author: droundy Date: 2007-11-24.20:29:07
On Sat, Nov 24, 2007 at 05:07:29PM +0100, Juliusz Chroboczek wrote:
> > | darcs: getCurrentDirectory: resource exhausted (Too many open files)
> 
> > This sounds like a Darcs bug, but it's not one I have heard before.
> 
> I've definitely seen this issue, but I don't remember how it ended.
> I seem to recall that it's due to Darcs opening a number of immutable
> files using unsafePerformIO, and due to lazy evaluation, they don't
> get closed until some later time.
> 
> I don't remember what the solution was.  In any case, Darcs should
> leave your repository in a sane state (but run ``darcs check'' just in
> case).
> 
> Sorry for not being more helpful,

This has been much more thoroughly tracked down by the original reporter
(the one Simon refers to).  It's fixable by raising the ulimit (or
something like that), but we haven't been able to figure out why darcs is
holding onto those files.  None of us developers have had time to track it
down, although Richard was very helpful in trying different
possibilities.
-- 
David Roundy
Department of Physics
Oregon State University
msg2272 (view) Author: alex1 Date: 2007-11-24.21:19:12
On 11/24/07, Juliusz Chroboczek <Juliusz.Chroboczek@pps.jussieu.fr> wrote:
> > | darcs: getCurrentDirectory: resource exhausted (Too many open files)
>
> > This sounds like a Darcs bug, but it's not one I have heard before.
>
> I've definitely seen this issue, but I don't remember how it ended.
> I seem to recall that it's due to Darcs opening a number of immutable
> files using unsafePerformIO, and due to lazy evaluation, they don't
> get closed until some later time.
>
> I don't remember what the solution was.  In any case, Darcs should
> leave your repository in a sane state (but run ``darcs check'' just in
> case).

I experienced this problem once -- during a pull, and apparently after
Darcs had written out the "the following files had conflicts" bit --
and although it left the *repo* in a sane state, the working directory
was all in pieces.

This is a general problem with Darcs: the working directory operations
are not atomic, so any repo operation that changes the working
directory risks applying changes that are subsequently not associated
with their patches.

Alexander.
msg2275 (view) Author: droundy Date: 2007-11-27.13:28:48
It just occurred to me, that this is almost precisely the bug that the
pull_many_files.pl test in our test suite was supposed to catch.  We should look
into that test, to see if it needs to be improved.

David
msg2506 (view) Author: droundy Date: 2008-01-14.23:13:25
I've verified (by pulling 4,918 patches) that this bug is fixed in
darcs-unstable, at least when using hashed repositories.
History
Date User Action Args
2007-11-08 08:40:08simonpjcreate
2007-11-08 19:44:18RichardGsetstatus: unread -> unknown
nosy: + RichardG
messages: + msg2246
2007-11-08 21:30:58droundysetnosy: cvs-ghc, RichardG, droundy, simonpj, tommy, kowey, beschmi
messages: + msg2247
2007-11-09 03:05:04RichardGsetmessages: + msg2249
2007-11-09 03:06:05RichardGsetfiles: + darcs-1.0.9-build.txt
messages: + msg2250
2007-11-09 03:06:46RichardGsetfiles: + darcs-1.0.9-run.txt
messages: + msg2251
2007-11-09 03:08:11RichardGsetfiles: + darcs-1.1.0pre1-build.txt
messages: + msg2252
2007-11-09 03:08:57RichardGsetfiles: + darcs-1.1.0pre1-run.txt
messages: + msg2253
2007-11-09 03:31:10RichardGsetmessages: + msg2254
2007-11-09 15:32:33droundysetmessages: + msg2255
2007-11-09 15:52:22koweysetmessages: + msg2256
2007-11-09 15:56:20koweysettitle: Getting source code -> pull => resource exhausted (Too many open files) (1.0.9)
2007-11-09 16:06:21droundysetmessages: + msg2257
title: pull => resource exhausted (Too many open files) (1.0.9) -> Getting source code
2007-11-09 16:08:32droundysetmessages: + msg2258
2007-11-11 01:14:09RichardGsetfiles: + darcs-unstable-default.txt
messages: + msg2259
2007-11-11 01:16:56RichardGsetfiles: + darcs-unstable-no_mmap.txt
messages: + msg2260
2007-11-24 16:09:01jchsetnosy: + jch, darcs-devel
messages: + msg2270
2007-11-24 20:29:08droundysetnosy: RichardG, droundy, jch, simonpj, tommy, kowey, beschmi, cvs-ghc, darcs-devel
messages: + msg2271
2007-11-24 21:19:14alex1setnosy: + alex1
messages: + msg2272
2007-11-27 13:28:49droundysetnosy: tommy, darcs-devel, RichardG, droundy, jch, simonpj, cvs-ghc, kowey, beschmi, alex1
messages: + msg2275
2007-11-27 13:29:35droundysettitle: Getting source code -> pull => resource exhausted (Too many open files
2007-11-27 13:29:47droundysettitle: pull => resource exhausted (Too many open files -> pull => resource exhausted (Too many open files)
2008-01-14 23:08:10droundylinkissue87 superseder
2008-01-14 23:10:07droundylinkissue478 superseder
2008-01-14 23:13:27droundysetstatus: unknown -> resolved-in-unstable
messages: + msg2506
2008-09-04 21:31:34adminsetstatus: resolved-in-unstable -> resolved
nosy: + dagit
2009-08-06 17:48:00adminsetnosy: + markstos, jast, Serware, dmitry.kurochkin, zooko, mornfall, simon, thorkilnaur, - droundy, jch, simonpj, alex1, cvs-ghc, RichardG
2009-08-06 20:43:46adminsetnosy: - beschmi
2009-08-10 22:09:35adminsetnosy: + cvs-ghc, RichardG, jch, simonpj, alex1, - markstos, zooko, jast, Serware, mornfall
2009-08-11 00:03:29adminsetnosy: - dagit
2009-08-25 17:59:46adminsetnosy: - simon
2009-08-27 14:07:34adminsetnosy: jch, tommy, kowey, darcs-devel, simonpj, alex1, cvs-ghc, RichardG, thorkilnaur, dmitry.kurochkin
2009-10-23 22:34:13adminsetnosy: + alexander, - alex1
2009-10-23 23:43:12adminsetnosy: + alex1, - alexander