Created on 2008-09-14.12:19:28 by thorkilnaur, last changed 2009-08-27.14:32:25 by admin.
msg6009 (view) |
Author: thorkilnaur |
Date: 2008-09-14.12:19:25 |
|
This is in response to http://bugs.darcs.net/msg5703:
> Author: simonpj Date: 2008-08-26.09:22:03
> | I tried to pull the GHC HEAD into this STABLE repository, both from http://darcs.haskell.org/ghc and
> | from a
> | local repository. In both cases, I interrupted the pull after it's having spent 100% CPU for more than
> | an
> | hour without apparent progress.
> |
> | So this seems to be a dead end.
>
> Not a dead end at all! It's another darcs bug! Having pull take 100% of the CPU for an hour is an absolute
disaster
when it comes to user confidence. If you have a reproducible test case, I urge you to examine it.
>
> Simon
To examine this case more closely, I have used the basic machinery provided by hpc (Haskell Program Coverage).
Briefly, I intrumented a recent darcs as indicated in the attached patch to print the hpc tix counters to a series
of files regularly, once every second. This modified darcs also needed compiling with -fhpc (the ./configure flag -
-enable-hpc). By subtracting (hpc combine --function=SUB) two such sets of counters, one gets a set of counters
representing the activity between the writing of the two files. Some additional postprocessing (hpc markup --
highlight-covered) then produces a set of .html files with the active parts of the source code highlighted.
The attached strobeHpc_darcs_3600_SUB_3599_20080914_130546.tar.gz file represents the result of examining darcs
pulling GHC HEAD from a local repository into GHC STABLE in the last second before having executed for one whole
hour. Only .html files representing darcs source files with actual activity have been included. Producing such a
set of files from two .tix files is done by a bit of shell'ery (the attached strobeHpcDiff1.sh).
I have tried to produce the same set of .html files for the same run, but representing the last 10 minutes before
the hour and they were identical to the ones of the last second. So I feel quite confident that the .html files
represent faithfully what actually goes on.
It seems encouraging that apparently only a small fraction of the darcs code is active, but I have, myself, not
investigated any further. It may be that the information in the .html files doesn't add to a case like this, but I
suggest that somebody with knowledge about darcs internals take a look and judge.
Best regards
Thorkil
Attachments
|
msg6012 (view) |
Author: kowey |
Date: 2008-09-15.07:16:04 |
|
This looks very interesting. Something to pursue for darcs 2.1
On Sun, Sep 14, 2008 at 12:19:28 -0000, Thorkil Naur wrote:
> The attached strobeHpc_darcs_3600_SUB_3599_20080914_130546.tar.gz file represents the result of examining darcs
> pulling GHC HEAD from a local repository into GHC STABLE in the last second before having executed for one whole
> hour. Only .html files representing darcs source files with actual activity have been included. Producing such a
> set of files from two .tix files is done by a bit of shell'ery (the attached strobeHpcDiff1.sh).
I think you may have forgotten the attachment.
Also, should your darcs patch go into darcs, or is it meant to be set
aside and only used with hpc? Perhaps it could be enabled via some
ifdefs and then go into mainline darcs?
|
msg6013 (view) |
Author: thorkilnaur |
Date: 2008-09-15.08:33:58 |
|
Replying to http://bugs.darcs.net/msg6012 (Author: kowey Date: 2008-09-15.07:16:04):
> I think you may have forgotten the attachment.
I attached one file when submitting the issue initially and two others in separate updates to the
issue. I presently see all three files attached to the issue. Are you unable to access these files?
>
> Also, should your darcs patch go into darcs, or is it meant to be set
> aside and only used with hpc? Perhaps it could be enabled via some
> ifdefs and then go into mainline darcs?
I used a patch to describe the changes that I made to darcs because it was most convenient. In its
present form, the patch is not intended to go into mainline darcs. A more polished version (allowing
conditional compilation and, perhaps, setting the time interval via some flag) could be worked out, if
useful.
Best regards
Thorkil
|
msg8069 (view) |
Author: kowey |
Date: 2009-08-10.15:35:20 |
|
Hi Thorkil,
I think it would be really useful for anybody studying this if we could have
tarballs of both repositories. By any chance do you have something like this on
hand, or at least some recollection of what repos you downloaded?
PS: For the interested, Thorkil's hpc-strobe is now on hackage at
http://hackage.haskell.org/package/hpc-strobe
|
msg8409 (view) |
Author: thorkilnaur |
Date: 2009-08-23.17:10:32 |
|
Hi Eric,
A while back, I did some additional experiments related to this issue, but since
the experiments didn't really seem to gain me any additional insight, I never
reported the results. Recall that the problem is a pull of a GHC HEAD repository
of about August 2008 into a GHC STABLE repository of about September 2007 that
takes a very long time, hours, perhaps forever. The experiments amounted to
closing the gap between the source repository being pulled from and the target
repository being pulled into, retrying the pull, to see if anything interesting
turned up.
Both repositories are darcs-1 repositories, so it seemed likely, even when the
issue was reported, that this was simply an instance of "the big conflicts bug",
that "The time needed to resolve or detect conflicts grows exponentially with
depth of conflict nesting" (http://wiki.darcs.net/ConflictsFAQDarcs1#what-the-
big-conflicts-bug). This seemed to be confirmed by the experiments, in that I
found several cases where the time to pull the source into the target increased
dramatically by just increasing their difference by a single patch.
To comply with your request, I have prepared a couple of repositories,
http://thorkilnaur.dk/~tn/darcs/darcsIssue1075/darcs_Issue_1075_source.tar.gz
and
http://thorkilnaur.dk/~tn/darcs/darcsIssue1075/darcs_Issue_1075_target.tar.gz,
to demonstrate this. Locally, these files look like this:
> tn@tn21$ ls -l
> total 154352
> -rw-r--r-- 1 tn tn 96415047 Aug 23 15:21 darcs_Issue_1075_source.tar.gz
> -rw-r--r-- 1 tn tn 61511823 Aug 23 15:24 darcs_Issue_1075_target.tar.gz
> tn@tn21$ md5 darcs_Issue_1075_*
> MD5 (darcs_Issue_1075_source.tar.gz) = 4cd96734907dc8e965caa779caeac4a5
> MD5 (darcs_Issue_1075_target.tar.gz) = 274d0f8dd628ea837abb9e664ed58b33
> tn@tn21$
Here is a transcript of a session that illustrates the issue:
> $ darcs --version
> 2.1.2 (release)
> $ wget
http://thorkilnaur.dk/~tn/darcs/darcsIssue1075/darcs_Issue_1075_source.tar.gz
> ...
> $ md5 darcs_Issue_1075_source.tar.gz
> MD5 (darcs_Issue_1075_source.tar.gz) = 4cd96734907dc8e965caa779caeac4a5
> $ tar xzf darcs_Issue_1075_source.tar.gz
> $ darcs changes --last=2 --repodir=darcs_Issue_1075_source/ghc
> Fri Dec 28 19:40:24 CET 2007 Clemens Fruhwirth <clemens@endorphin.org>
> * Install dynlibs correctly
>
> Add dynlibdir target to config.mk.in, setting it to @libdir@.
> Invoke installPackage with dynlibdir at libraries/Makefile
> Make installPackage.hs hand dynlibdir to Cabal.
>
> Fri Dec 28 18:57:27 CET 2007 Isaac Dupree <id@isaac.cedarswampstudios.org>
> * import ord that alex secretly imported
> $ wget
http://thorkilnaur.dk/~tn/darcs/darcsIssue1075/darcs_Issue_1075_target.tar.gz
> ...
> $ md5 darcs_Issue_1075_target.tar.gz
> MD5 (darcs_Issue_1075_target.tar.gz) = 274d0f8dd628ea837abb9e664ed58b33
> $ tar xzf darcs_Issue_1075_target.tar.gz
> $ darcs changes --last=2 --repodir=darcs_Issue_1075_target/ghc
> Sun Sep 16 02:53:28 CEST 2007 Ian Lynagh <igloo@earth.li>
> * Another attempt at getting bindists working everywhere
>
> Fri Sep 14 22:34:11 CEST 2007 Ian Lynagh <igloo@earth.li>
> * MERGED: Better modelling of newtypes in the Term datatype
> Pepe Iborra <mnislaih@gmail.com>**20070912165855
>
> This helps to get pretty printing right,
> nested newtypes were not being shown correctly by :print
> $
Now pull the source into the target. This takes a long time:
> $ time darcs pull -a --repodir=darcs_Issue_1075_target/ghc
darcs_Issue_1075_source/ghc
> This is the GHC darcs repository (HEAD branch)
>
> For more information, visit the GHC developer wiki at
> http://hackage.haskell.org/trac/ghc
> **********************
> Backing up ./compiler/Makefile(-darcs-backup0)
> Backing up ./compiler/main/InteractiveEval.hs(-darcs-backup0)
> Backing up ./compiler/ghci/Debugger.hs(-darcs-backup0)
> Backing up ./compiler/ghci/RtClosureInspect.hs(-darcs-backup0)
> Backing up ./configure.ac(-darcs-backup0)
> Backing up ./compiler/nativeGen/RegAllocColor.hs(-darcs-backup0)
> Backing up ./compiler/ghc-inplace.c(-darcs-backup0)
> Backing up ./libraries/Makefile(-darcs-backup0)
> Backing up ./Makefile(-darcs-backup0)
> Backing up ./libraries/installPackage.hs(-darcs-backup0)
> Backing up ./compiler/main/DynFlags.hs(-darcs-backup0)
> We have conflicts in the following files:
> ./Makefile ./compiler/Makefile ./compiler/ghc-inplace.c
./compiler/ghci/Debugger.hs ./compiler/ghci/RtClosureInspect.hs
./compiler/main/DynFlags.hs ./compiler/main/InteractiveEval.hs
./compiler/nativeGen/RegAllocColor.hs ./compiler/utils/GraphColor.hs
./configure.ac ./libraries/Makefile ./libraries/installPackage.hs
> Finished pulling and applying.
>
> real 44m19.336s
> user 43m14.772s
> sys 0m17.611s
> $
Now unpack the target again for a new attempt:
> $ rm -r darcs_Issue_1075_target
> $ tar xzf darcs_Issue_1075_target.tar.gz
Before pulling the source into the target, unpull a single patch from the
source:
> $ darcs unpull --last=1 --repodir=darcs_Issue_1075_source/ghc
> Fri Dec 28 19:40:24 CET 2007 Clemens Fruhwirth <clemens@endorphin.org>
> * Install dynlibs correctly
>
> Add dynlibdir target to config.mk.in, setting it to @libdir@.
> Invoke installPackage with dynlibdir at libraries/Makefile
> Make installPackage.hs hand dynlibdir to Cabal.
> Shall I unpull this patch? (1/1) [ynWvpxdaqjk], or ? for help: y
> Finished unpulling.
> $ darcs changes --last=2 --repodir=darcs_Issue_1075_source/ghc
> Fri Dec 28 18:57:27 CET 2007 Isaac Dupree <id@isaac.cedarswampstudios.org>
> * import ord that alex secretly imported
>
> Thu Dec 27 18:13:35 CET 2007 Isaac Dupree <id@isaac.cedarswampstudios.org>
> * add missing import that happy -agc secretly provided
> $ time darcs pull -a --repodir=darcs_Issue_1075_target/ghc
darcs_Issue_1075_source/ghc
> This is the GHC darcs repository (HEAD branch)
>
> For more information, visit the GHC developer wiki at
> http://hackage.haskell.org/trac/ghc
> **********************
> Backing up ./compiler/Makefile(-darcs-backup0)
> Backing up ./compiler/main/InteractiveEval.hs(-darcs-backup0)
> Backing up ./compiler/ghci/Debugger.hs(-darcs-backup0)
> Backing up ./compiler/ghci/RtClosureInspect.hs(-darcs-backup0)
> Backing up ./configure.ac(-darcs-backup0)
> Backing up ./compiler/nativeGen/RegAllocColor.hs(-darcs-backup0)
> Backing up ./compiler/ghc-inplace.c(-darcs-backup0)
> Backing up ./libraries/Makefile(-darcs-backup0)
> Backing up ./Makefile(-darcs-backup0)
> Backing up ./libraries/installPackage.hs(-darcs-backup0)
> Backing up ./compiler/main/DynFlags.hs(-darcs-backup0)
> We have conflicts in the following files:
> ./Makefile ./compiler/Makefile ./compiler/ghc-inplace.c
./compiler/ghci/Debugger.hs ./compiler/ghci/RtClosureInspect.hs
./compiler/main/DynFlags.hs ./compiler/main/InteractiveEval.hs
./compiler/nativeGen/RegAllocColor.hs ./compiler/utils/GraphColor.hs
./configure.ac ./libraries/Makefile ./libraries/installPackage.hs
> Finished pulling and applying.
>
> real 2m22.360s
> user 2m3.731s
> sys 0m9.721s
> $
So much faster.
But honestly, I believe that we should close this bug, either as wontfix
(because we are not fixing darcs-1 code) or as resolved (although this would
assume that the problem has been tested as gone for darcs-2, something that I
have not attempted).
Best regards
Thorkil
|
msg8410 (view) |
Author: thorkilnaur |
Date: 2009-08-23.17:18:35 |
|
Hi Eric,
Fiddle status and assigning to you, hopefully suitable.
Best regards
Thorkil
|
msg8412 (view) |
Author: kowey |
Date: 2009-08-23.17:35:01 |
|
Hi Thorkil: Just to make sure I understand this before we wont-fix this, do you
remember off-hand if GHC STABLE (target) was a subset of GHC HEAD (source)?
The conflicts explanation would be easiest to understand if this was not the case.
It's OK if you don't remember; we'll just have take some time to find out otherwise.
(re-assigning to you but just to help me keep track of my query)
|
msg8421 (view) |
Author: thorkilnaur |
Date: 2009-08-23.18:29:01 |
|
Hi Eric,
I don't think I ever knew whether the set of patches of the target was a subset of
the patches of the source (which is what I presume you would like to know). But a
simple diff of the _darcs/patches directories of the two repositories show that
there are 143 patches in the target that are not in the source, 1760 patches in
the source that are not in the target, and 377 patches that are in both, but
differ in one way or the other. I hope that this information helps, but feel free
to ask for other checks to be carried out to decide the matter.
Best regards
Thorkil
|
msg8423 (view) |
Author: kowey |
Date: 2009-08-23.18:35:15 |
|
On Sun, Aug 23, 2009 at 18:29:06 +0000, Thorkil Naur wrote:
> But a simple diff of the _darcs/patches directories of the two
> repositories show that there are 143 patches in the target that are
> not in the source, 1760 patches in the source that are not in the
> target, and 377 patches that are in both, but differ in one way or the
> other. I hope that this information helps, but feel free to ask for
> other checks to be carried out to decide the matter.
Hmm. Could you compare the inventories?
It might be good to sort them just in case they are in different order
|
msg8428 (view) |
Author: thorkilnaur |
Date: 2009-08-23.19:18:46 |
|
Hi Eric,
I am not sure what you are asking for here.
Best regards
Thorkil
|
msg8429 (view) |
Author: kowey |
Date: 2009-08-23.19:43:13 |
|
It might be good to diff _darcs/inventory from the two repositories.
Because the patches may be in different order it may be good to perform a sort
on them to avoid spurious differences.
|
msg8476 (view) |
Author: thorkilnaur |
Date: 2009-08-24.18:38:21 |
|
Hi Eric,
The two _darcs/inventory files do not have much in common:
> $ head darcs_Issue_1075_target/ghc/_darcs/inventory
Starting with tag:
> [TAG ghc-6.8 branched 2007-09-03
> Ian Lynagh <igloo@earth.li>**20070903155416]
> [Change the version number to 6.8 on the branch
> Ian Lynagh <igloo@earth.li>**20070903172054]
> [Fix building RTS with gcc 2.*; declare all variables at the top of a block
> Ian Lynagh <igloo@earth.li>**20070903165847> Patch from Audrey Tang.
> ]
> [remove debugging code
> $ head darcs_Issue_1075_source/ghc/_darcs/inventory
> Starting with tag:
> [TAG 2007-11-11
> Ian Lynagh <igloo@earth.li>**20071111161540]
> [Turn -fprint-bind-result off by default
> Ian Lynagh <igloo@earth.li>**20071111001126]
> [arrows is no longer an extralib
> Ian Lynagh <igloo@earth.li>**20071027123656] > [Remove the X11 and HGL
libraries from extralibs
> Ian Lynagh <igloo@earth.li>**20071111213447
> Don Stewart, X11 maintainer, requested we remove X11, and HGL depends on it
> $
I have sorted these files and their diff has almost as many lines as the files
combined:
> $ wc -l darcs_Issue_1075_target.inventory darcs_Issue_1075_source.inventory
> 670 darcs_Issue_1075_target.inventory
> 1057 darcs_Issue_1075_source.inventory
> 1727 total
> $ diff darcs_Issue_1075_target.inventory darcs_Issue_1075_source.inventory | wc
-l
> 1457
> $
Comparing the _darcs/inventories directories:
> $ diff -r darcs_Issue_1075_target/ghc/_darcs/inventories/
darcs_Issue_1075_source/ghc/_darcs/inventories/
> Only in darcs_Issue_1075_target/ghc/_darcs/inventories/: 20070903155416-3fd76-
19bef68d1fa1b2cf190a06ce878867dd9b08ec5b.gz
> Only in darcs_Issue_1075_source/ghc/_darcs/inventories/: 20070925164536-3fd76-
07330cc5e4fca0c0f60bac53a10981c10b0fbb5f.gz
> Only in darcs_Issue_1075_source/ghc/_darcs/inventories/: 20071111161540-3fd76-
6f45848e09339188332b3d539b92bf95219c73d0.gz
> Only in darcs_Issue_1075_target/ghc/_darcs/inventories/: 20071111164452-3fd76-
5b6394153966f91d01962e5b157a09d4f6295d3c.gz
> Only in darcs_Issue_1075_source/ghc/_darcs/inventories/: 20080528003600-3fd76-
b01e2ea33fd5e49a0a86d3dd1a43545dfb00e1d2.gz
> Only in darcs_Issue_1075_source/ghc/_darcs/inventories/: 20080601155241-3fd76-
13b3337cedf6323aab252f8035c39326411e3614.gz
> Only in darcs_Issue_1075_target/ghc/_darcs/inventories/: 20080605235309-3fd76-
3c341a5531d47a506ae36a87e1f72b88cefa5be7.gz
> $
Concatenating and sorting all the differing inventories (including the
_darcs/inventory) of the respective repositories and comparing shows that there
are certainly lines in the set of inventories of the target repository that are
not in the set of inventories of the source repository, although there is
significant overlap:
> $ wc -l darcs_Issue_1075_target.inventories darcs_Issue_1075_source.inventories
> 4444 darcs_Issue_1075_target.inventories
> 8656 darcs_Issue_1075_source.inventories
> 13100 total
> $ diff darcs_Issue_1075_target.inventories darcs_Issue_1075_source.inventories
| wc -l
> 8259
> $
Best regards
Thorkil
|
msg8477 (view) |
Author: kowey |
Date: 2009-08-24.21:14:48 |
|
From your analysis, I think I agree that this is merely darcs-1 conflict misery
and that the only way forward in this case is to upgrade the repository to
darcs-2 format. Many thanks!
Minor note: resetting to status 'unknown' was perfect, but I'd assign it to 'no
selection'; in principle you don't know who's doing the triage work. It doesn't
really matter though especially in a tight discussion loop.
|
|
Date |
User |
Action |
Args |
2008-09-14 12:19:28 | thorkilnaur | create | |
2008-09-14 12:21:56 | thorkilnaur | set | files:
+ strobeHpc_darcs_3600_SUB_3599_20080914_130546.tar.gz |
2008-09-14 12:22:28 | thorkilnaur | set | files:
+ strobeHpcDiff1.sh |
2008-09-15 07:16:07 | kowey | set | status: unread -> unknown messages:
+ msg6012 |
2008-09-15 08:34:00 | thorkilnaur | set | messages:
+ msg6013 |
2009-08-10 15:31:04 | kowey | set | topic:
+ Performance nosy:
+ dmitry.kurochkin, simon |
2009-08-10 15:35:23 | kowey | set | status: unknown -> waiting-for nosy:
kowey, dagit, simon, thorkilnaur, dmitry.kurochkin messages:
+ msg8069 assignedto: thorkilnaur |
2009-08-10 23:45:32 | admin | set | nosy:
- dagit |
2009-08-23 17:10:40 | thorkilnaur | set | nosy:
kowey, simon, thorkilnaur, dmitry.kurochkin messages:
+ msg8409 |
2009-08-23 17:18:37 | thorkilnaur | set | status: waiting-for -> unknown nosy:
kowey, simon, thorkilnaur, dmitry.kurochkin messages:
+ msg8410 assignedto: thorkilnaur -> kowey |
2009-08-23 17:35:09 | kowey | set | status: unknown -> waiting-for nosy:
kowey, simon, thorkilnaur, dmitry.kurochkin topic:
+ Conflicts messages:
+ msg8412 assignedto: kowey -> thorkilnaur |
2009-08-23 18:29:06 | thorkilnaur | set | status: waiting-for -> unknown nosy:
kowey, simon, thorkilnaur, dmitry.kurochkin messages:
+ msg8421 assignedto: thorkilnaur -> kowey |
2009-08-23 18:35:18 | kowey | set | nosy:
kowey, simon, thorkilnaur, dmitry.kurochkin messages:
+ msg8423 |
2009-08-23 19:18:51 | thorkilnaur | set | nosy:
kowey, simon, thorkilnaur, dmitry.kurochkin messages:
+ msg8428 |
2009-08-23 19:43:18 | kowey | set | status: unknown -> waiting-for nosy:
kowey, simon, thorkilnaur, dmitry.kurochkin messages:
+ msg8429 assignedto: kowey -> thorkilnaur |
2009-08-24 18:38:27 | thorkilnaur | set | status: waiting-for -> unknown nosy:
kowey, simon, thorkilnaur, dmitry.kurochkin messages:
+ msg8476 assignedto: thorkilnaur -> kowey |
2009-08-24 21:14:54 | kowey | set | status: unknown -> wont-fix nosy:
kowey, simon, thorkilnaur, dmitry.kurochkin messages:
+ msg8477 assignedto: kowey -> |
2009-08-25 18:14:26 | admin | set | nosy:
+ darcs-devel, - simon |
2009-08-27 14:32:25 | admin | set | nosy:
kowey, darcs-devel, thorkilnaur, dmitry.kurochkin |
|