Issue 2602: add command 'darcs config' - Darcs bug tracker

Title	add command 'darcs config'
Priority	feature	Status	unknown
Milestone		Resolved in
Superseder		Nosy List	bfrk
Assigned To		Topics

Created on 2018-10-01.07:07:06 by bfrk, last changed 2019-06-23.17:57:54 by bfrk.

Messages
msg20346 (view)	Author: bfrk	Date: 2018-10-01.07:07:03
We should unify the several ways of configuring darcs with a single command 'darcs config'. This should work on the current repo unless we pass --global, which means edit the global (user) configuration. Standard action is to add a setting, with option --remove to instead remove it. Syntax: darcs config COMMAND OPTION darcs config defaultrepo URL darcs config email EMAIL -- badly named, better idea anyone? darcs config author EMAIL ... any more? E.g. sources and more generally darcs config NAME VALUE to replace environment variables (see below for details) See also my comment on issue1898, part (b). In principle, I would like to store configuration in a unified file format, but that would cause lots of compatibility problems. With 'darcs config' we could at least hide the several file formats that we currently use from the casual user. Lastly, there are way too many environment variables that influence how darcs operates. Environment variables are a notoriously bad way to configure a user program. An exception to this rule are generic environment variables that the user can set to influence the behavior of many programs at once. This includes HOME, APPDATA, XDG_CACHE_HOME, EDITOR, PAGER, TMPDIR, EMAIL, and the various XYZ_PROXY variables. These should all be honored by default which is what we do now. Everyting else, i.e mostly the variables that start with DARCS_ should be replaced into a new preference file format with a simple key=value syntax, or else scrapped because there is already a better way to configure them. In particular: DARCS_EDITOR, DARCS_PAGER: preference, overrides standards env vars DARCS_DONT_COLOR, DARCS_ALWAYS_COLOR, DARCS_ALTERNATIVE_COLOR, DARCS_DO_COLOR_LINES: ditto, but try to refactor DARCS_DONT_ESCAPE_EXTRA, DARCS_ESCAPE_EXTRA: i don't think anyone wants to configure the behavior at this fine level of granularity so I'd say remove them DARCS_DONT_ESCAPE_TRAILING_CR: this should be decided depending on the OS we run on DARCS_DONT_ESCAPE_ISPRINT: obsolete, scrap DARCS_DONT_ESCAPE_TRAILING_SPACES, DARCS_DONT_ESCAPE_ANYTHING, DARCS_ESCAPE_8BIT: need to decide what to do about these, perhaps environment vars make a certain sense here for use in shell scripts DARCS_TMPDIR: preference, overrides standard env var DARCS_KEEP_TMPDIR: preference DARCS_EMAIL: scrap, superseeded by global authors file SENDMAIL: scrap, superseeded by defaults (i.e. use --sendmail-command CMD, we can give arguments to defaults since at least 2.12) DARCS_SLOPPY_LOCKS: check if this is still needed DARCS_SSH, DARCS_SCP, DARCS_SFTP: preferences DARCS_PROXYUSERPWD: putting a password in plain text into an env var is a very bad idea IMO, should be removed DARCS_CONNECTION_TIMEOUT: preferences (currently used only if built with -fcurl)
msg20851 (view)	Author: bfrk	Date: 2019-06-17.14:42:55
It has been a while since I opened this issue. Guess I'll have to send some code to get a reaction...
msg20852 (view)	Author: ganesh	Date: 2019-06-17.21:50:53
I'm happy with the description of darcs config. I'm also all in favour of removing env vars. Shouldn't we make them into options though (that could be given defaults in prefs files)?
msg20853 (view)	Author: bfrk	Date: 2019-06-18.08:54:27
The idea of using options as the single uniform mechanism for all configuration does have a certain charm to it. It is not without problems, though. As the number of options increases, especially those that are uniformly valid for all commands, this has the unfortunate tendency to clutter up the UI. Take for instance pre- and posthooks: I cannot think of a situation where I would want to specify them on the command line. Still their descriptions take up a lot of space in the help output (8 lines!) as well as when the shell displays options during command completion. If we add even more options like that to replace environment variables, we really need to do something about that. The obvious solution is to hide such options in the help output (at least by default) and the command completion (i.e. --list-options). We currently distinguish between "basic" and "advanced" options; if we add another axis for "command specific" vs. "generic" options, then options that are "advanced" and "generic" could be hidden by default. This should include: all hook options, --no-cache, --timings, --debug, --debug-http, --list- options, --disable; it would probably include most of the options we might add to replace env vars. (We could do that independent of whether we add more such options to replace env vars. Implementing this may not be as easy, though.) Then there is the technical problem of how to pass such options down to where they are needed. We can read env vars in any IO action but options are being explicitly passed. I would rather not add more global variables if it can be helped, but I guess we'd have to bite that bullet. Some people nowadays recommend to use a 'ReaderT Config IO' and I would use that pattern if I had to re-write Darcs from scratch. I have a hard time seeing us do this as a refactor, though. * * * * * * * Now for a completely different idea. Let us re-examine /why/ environment variables are a bad choice for configuring a tool like Darcs: (1) They cannot be specified per repository, only per user. (2) There is no uniform and predictable way to set them: it depends on the shell you are using, settings tend to become strewn over many different files, they may be set conditionally, etc etc. (3) AFAIR environment variables are quite awkward to use on Windows. On the plus side, environment variables can be set by the admin to a default value for all users, which can be quite useful in certain situations. We can solve all three problems with one stroke by adding two more configuration files for Darcs: _darcs/prefs/environment and ~/.darcs/environment. (The file name is up for grabs, could as well name it "settings" or "config", whatever.) These files contain simple 'KEY = VALUE' definitions. On startup we read these files (in order from per- user, then per-repo), and for each entry we /add/ a corresponding definition to our environment. (It just occurred to me that we may also need an 'unset KEY' command to remove an existing environment variable / setting.) We could do a bit of translation for keys to make the file format less awkward e.g. add the DARCS_ prefix or turn lower to upper case; details can be worked out. Besides solving the above problems, I see the following advantages: * fully compatible with existing configurations and darcs versions * doesn't complicate the UI any further * with 'unset key' there is no need to awkwardly add negated versions of every setting (as we currently need for options) * requires no deep refactorings in the source code, all settings are readily available to every IO action * easier to add new settings, compared to adding a new command line option If we agree on this proposal I can even imagine gradually phasing out some of the more obscure options and replacing them with such "environment" variables / settings. We may also want to add support for reading an additional per-system configuration file.
msg20859 (view)	Author: ganesh	Date: 2019-06-18.20:08:58
On 18/06/2019 10:54, Ben Franksen wrote: > > The obvious solution is to hide such options in the help output (at least > by default) and the command completion (i.e. --list-options). We currently > distinguish between "basic" and "advanced" options; if we add another axis > for "command specific" vs. "generic" options, then options that are > "advanced" and "generic" could be hidden by default. This should include: > all hook options, --no-cache, --timings, --debug, --debug-http, --list- > options, --disable; it would probably include most of the options we might > add to replace env vars. I agree with this. (I guess there ought to be some signpost to them somewhere in the help.) > Then there is the technical problem of how to pass such options down to > where they are needed. We can read env vars in any IO action but options > are being explicitly passed. I would rather not add more global variables > if it can be helped, but I guess we'd have to bite that bullet. Some > people nowadays recommend to use a 'ReaderT Config IO' and I would use > that pattern if I had to re-write Darcs from scratch. I have a hard time > seeing us do this as a refactor, though. I'm not keen on ReaderT Config IO because we'll be passing the whole config to places where only one value might be needed. My personal preference would be implicit parameters, but I know they aren't generally popular. > Now for a completely different idea. Let us re-examine /why/ environment > variables are a bad choice for configuring a tool like Darcs: > > (1) They cannot be specified per repository, only per user. > (2) There is no uniform and predictable way to set them: it depends > on the shell you are using, settings tend to become strewn over > many different files, they may be set conditionally, etc etc. > (3) AFAIR environment variables are quite awkward to use on Windows. For me the biggest problem with environment variables is that they are really problematic when using darcs as a library. The calling code doesn't know about them and even it it did it's not exactly a pleasant API to have to set an environment variable before calling a function. FWIW, they're not actually particularly hard to use on Windows. The UI for setting them permanently is a bit unpleasant but there's a 'setx' shell command that helps a bit. > We can solve all three problems with one stroke by adding two more > configuration files for Darcs: _darcs/prefs/environment and > ~/.darcs/environment. (The file name is up for grabs, could as well name > it "settings" or "config", whatever.) These files contain simple 'KEY = > VALUE' definitions. On startup we read these files (in order from per- > user, then per-repo), and for each entry we /add/ a corresponding > definition to our environment. (It just occurred to me that we may also > need an 'unset KEY' command to remove an existing environment variable / > setting.) We could do a bit of translation for keys to make the file > format less awkward e.g. add the DARCS_ prefix or turn lower to upper > case; details can be worked out. This makes it hard to set them on a per-invocation basis. I barely use them anyway, but when I do it's by doing something like DARCS_SOMETHING=yes darcs ... I may be atypical because I'd normally just be experimenting for development purposes rather than actually having decided I need them permanently. If we would still be using the environment (or other global state) to pass them around inside darcs then my complaint about the library still applies.
msg20862 (view)	Author: bfrk	Date: 2019-06-19.10:35:44
>> The obvious solution is to hide such options in the help output [...] > > I agree with this. (I guess there ought to be some signpost to them > somewhere in the help.) Of course. >> Then there is the technical problem of how to pass such options down to >> where they are needed. [...] > > I'm not keen on ReaderT Config IO because we'll be passing the whole > config to places where only one value might be needed In https://www.fpcomplete.com/blog/2017/06/readert-design-pattern under the heading "Has typeclass approach" he describes how to use type classes to avoid this. He also describes how we can let TH generate all the boilerplate using the lens library (I checked that the example works equally well with microlens). The way HasX constraints propagate upward from the point of use is actually pretty similar to implicit parameter constraints. The main difference is that with ReaderT we are forced to use monadic style, whereas with implicit parameters we aren't. But if we want to replace env vars we already are in the IO monad, so that wouldn't make a difference for us. > My personal > preference would be implicit parameters, but I know they aren't > generally popular. I did some research to find out why they aren't popular. It seems that there are some good reasons. This discussion: https://www.reddit.com/r/haskell/comments/6gz4w5/whats_wrong_with_implicitparams/ is an interesting read with some quite scary examples. Most disturbingly, the meaning of a local definition can change depending on whether it has a type signature or not. This is also mentioned in the ghc manual here: https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#implicit-parameters-and-polymorphic-recursion One reason I tend to like the ReaderT approach more is precisely /because/ it enforces monadic style. Allowing dynamically scoped variables in arbitrary (pure) code scares me a lot because they make reasoning (in particular: refactoring) much more difficult. On the practical side, I have read that you are not allowed to use implicit parameter constraints as superclass constraints for classes or instances (because that introduces incoherence). This is what E. Kmett says here: https://www.reddit.com/r/haskell/comments/5xqozf/implicit_parameters_vs_reflection/ I may be wrong but i think we would need to do exactly that in order to remove the environment variables that configure the behavior of our Printer type, since that functionality is used by generic code under Darcs.Patch. BTW, the reflection library is something I haven't been able to wrap my head around. I look at the API and I can't see how to use it to achieve what it claims. If you happen to understand it, I would like to see the ReaderT example from the link above transscribed to use reflection instead. > For me the biggest problem with environment variables is that they are > really problematic when using darcs as a library. The calling code > doesn't know about them and even it it did it's not exactly a pleasant > API to have to set an environment variable before calling a function. Good point. Global (environment or in-program) variables are always a bad choice for libraries. (BTW, this also applies to the current working directory which you have to initialize properly for most procedures from Darcs.Repository and Darcs.UI.Commands; unless we completely remove the assumption of "." being the current repository from them.) >> We can solve all three problems with one stroke by adding two more >> configuration files for Darcs: _darcs/prefs/environment and >> ~/.darcs/environment. [...] > > This makes it hard to set them on a per-invocation basis. I barely use > them anyway, but when I do it's by doing something like > > DARCS_SOMETHING=yes darcs ... > > I may be atypical because I'd normally just be experimenting for > development purposes rather than actually having decided I need them > permanently. True. I didn't think of that either. (BTW, I use that idiom, too, though for a different purpose. I have DARCS_ALWAYS_COLOR=1 by default because I like to pipe things through less (directly or let darcs do it) which understands color codes. So I need to disable that when redirecting the output to a file...) To circumvent this problem, we could add /just one/ additional option --config "VAR=value" that can be given multiple times to override any of the settings on a per-invocation basis. Your other objections still stand, of course. Cheers Ben
msg20869 (view)	Author: ganesh	Date: 2019-06-23.10:09:03
On 19/06/2019 11:35, Ben Franksen wrote: >> I'm not keen on ReaderT Config IO because we'll be passing the whole >> config to places where only one value might be needed > In https://www.fpcomplete.com/blog/2017/06/readert-design-pattern > under the heading "Has typeclass approach" he describes how to use type > classes to avoid this. He also describes how we can let TH generate all > the boilerplate using the lens library (I checked that the example works > equally well with microlens). I think the Has typeclass is a good approach. Less keen on TH unless we have other compelling needs for it in darcs as well, it tends to make everything more complicated. For example binding group analysis is messed up by top-level splices which in my experience can lead to a lot of head-scratching, and it probably makes things less portable. (For more detail on binding group analysis, see the comment "Top-level declaration splices break up a source file" in https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#template-haskell ) >> My personal >> preference would be implicit parameters, but I know they aren't >> generally popular. > > I did some research to find out why they aren't popular. It seems that > there are some good reasons. This discussion: > > https://www.reddit.com/r/haskell/comments/6gz4w5/whats_wrong_with_implicitparams/ > > is an interesting read with some quite scary examples. Most > disturbingly, the meaning of a local definition can change depending on > whether it has a type signature or not. This is also mentioned in the > ghc manual here: > > https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#implicit-parameters-and-polymorphic-recursion I'm 'hsenag' in that reddit thread. As I say there, I like using them in cases where no recursive use is contemplated. I think probably even expecting to need to rebind one (non-recursively) would be an indicator not to use them. > One reason I tend to like the ReaderT approach more is precisely > /because/ it enforces monadic style. Allowing dynamically scoped > variables in arbitrary (pure) code scares me a lot because they make > reasoning (in particular: refactoring) much more difficult. TBH I don't see the difference. The Reader monad is just as dynamically scoped as implicit params: if you rebind the reader value you can get into just as much of a pickle, I think. > On the practical side, I have read that you are not allowed to use > implicit parameter constraints as superclass constraints for classes or > instances (because that introduces incoherence). This is what E. Kmett > says here: > > https://www.reddit.com/r/haskell/comments/5xqozf/implicit_parameters_vs_reflection/ > > I may be wrong but i think we would need to do exactly that in order to > remove the environment variables that configure the behavior of our > Printer type, since that functionality is used by generic code under > Darcs.Patch. Well, we'd need to put them in the type signatures of the class methods. But the same is true of using ReaderT, or conversely if it's not true of using ReaderT, then the same is true of implicit params. I suspect it depends on whether we need the vars to generate a Doc or just to output it. > BTW, the reflection library is something I haven't been able to wrap my > head around. I look at the API and I can't see how to use it to achieve > what it claims. If you happen to understand it, I would like to see the > ReaderT example from the link above transscribed to use reflection instead. I've never fully understood the point of it, but it feels like another way of doing something implicit-parameter-like. >> For me the biggest problem with environment variables is that they are >> really problematic when using darcs as a library. The calling code >> doesn't know about them and even it it did it's not exactly a pleasant >> API to have to set an environment variable before calling a function. > > Good point. Global (environment or in-program) variables are always a > bad choice for libraries. > > (BTW, this also applies to the current working directory which you have > to initialize properly for most procedures from Darcs.Repository and > Darcs.UI.Commands; unless we completely remove the assumption of "." > being the current repository from them.) Agreed, and that's a far worse problem in that I run into it constantly when trying to code against darcs-the-library. Environment variables on the other hand are mostly a theoretical problem right now. Conventional wisdom has been that it (using CWD) is needed for performance reasons, but I don't know how true that is nowadays. >>> We can solve all three problems with one stroke by adding two more >>> configuration files for Darcs: _darcs/prefs/environment and >>> ~/.darcs/environment. [...] >> >> This makes it hard to set them on a per-invocation basis. I barely use >> them anyway, but when I do it's by doing something like >> >> DARCS_SOMETHING=yes darcs ... >> >> I may be atypical because I'd normally just be experimenting for >> development purposes rather than actually having decided I need them >> permanently. > > True. I didn't think of that either. > > (BTW, I use that idiom, too, though for a different purpose. I have > DARCS_ALWAYS_COLOR=1 by default because I like to pipe things through > less (directly or let darcs do it) which understands color codes. So I > need to disable that when redirecting the output to a file...) > > To circumvent this problem, we could add /just one/ additional option > --config "VAR=value" that can be given multiple times to override any of > the settings on a per-invocation basis. That sounds fine. My main argument for using the existing flags mechanism is avoiding having multiple overlapping mechanisms. But if we can clearly differentiate the two kinds of things and the overhead/boilerplate of adding flags is high, there's a strong argument for a new mechanism.
msg20870 (view)	Author: bfrk	Date: 2019-06-23.17:57:51
>> In https://www.fpcomplete.com/blog/2017/06/readert-design-pattern >> under the heading "Has typeclass approach" he describes how to use type >> classes to avoid this. He also describes how we can let TH generate all >> the boilerplate using the lens library (I checked that the example works >> equally well with microlens). > > I think the Has typeclass is a good approach. Less keen on TH unless we > have other compelling needs for it in darcs as well, it tends to make > everything more complicated. Lenses without TH mean /lots/ of boilerplate, though. > For example binding group analysis is > messed up by top-level splices which in my experience can lead to a lot > of head-scratching, and it probably makes things less portable. > > (For more detail on binding group analysis, see the comment "Top-level > declaration splices break up a source file" in > https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#template-haskell > ) Ha! I've been stumbling over that problem a few times but didn't investigate. I just found that moving all splices in a file together in one place fixed it. Now I know why. Yes, that's kinda ugly. But compared to all the boilerplate... we're talking maybe several dozens of option values... plus any number of configuration variables that we currently use the environment for. >>> [ about implicit parameters] >> I did some research to find out why they aren't popular. [...] >> the meaning of a local definition can change depending on >> whether it has a type signature or not. > [...] I like using them in > cases where no recursive use is contemplated. I think probably even > expecting to need to rebind one (non-recursively) would be an indicator > not to use them. If this is the only problematic scenario, perhaps we could live with that. >> One reason I tend to like the ReaderT approach more is precisely >> /because/ it enforces monadic style. Allowing dynamically scoped >> variables in arbitrary (pure) code scares me a lot because they make >> reasoning (in particular: refactoring) much more difficult. > > TBH I don't see the difference. The Reader monad is just as dynamically > scoped as implicit params: if you rebind the reader value you can get > into just as much of a pickle, I think. Yes, clearly. But with monadic code I am used to being careful about effects and their order. When refactoring pure code I can be much more ruthless and I don't want to loose that guarantee, i.e. the ability to pick things apart and put them together in different ways, not having to worry about effects and ordering. >> On the practical side, I have read that you are not allowed to use >> implicit parameter constraints as superclass constraints for classes or >> instances (because that introduces incoherence). This is what E. Kmett >> says here: >> >> https://www.reddit.com/r/haskell/comments/5xqozf/implicit_parameters_vs_reflection/ >> >> I may be wrong but i think we would need to do exactly that in order to >> remove the environment variables that configure the behavior of our >> Printer type, since that functionality is used by generic code under >> Darcs.Patch. > > Well, we'd need to put them in the type signatures of the class methods. Good point. So this restriction doesn't bite as hard as I thought. > But the same is true of using ReaderT True. >, or conversely if it's not true of > using ReaderT, then the same is true of implicit params. I suspect it > depends on whether we need the vars to generate a Doc or just to > output it. Just to output it. More precisely, to turn a Doc into a String or ByteString that can then be written to some handle. And it only concerns the interactive output; for the on-disk representation we always use the plain printer that cannot be configured. The env vars are all read inside getPolicy which is only called by fancyPrinters, which in turn gets passed directly to the output routines. This is so because getPolicy needs the handle to decide whether to use colors and escaping by default: if it's a terminal device than yes, else no. So the constraint would not infect things like showPatch. We'd have to see how to handle debug and error messages, though. >>> For me the biggest problem with environment variables is that they are >>> really problematic when using darcs as a library. [...]>> >> (BTW, this also applies to the current working directory which you have >> to initialize properly for most procedures from Darcs.Repository and >> Darcs.UI.Commands; unless we completely remove the assumption of "." >> being the current repository from them.) > > Agreed, and that's a far worse problem in that I run into it constantly > when trying to code against darcs-the-library. Environment variables on > the other hand are mostly a theoretical problem right now. Conventional > wisdom has been that it (using CWD) is needed for performance reasons, > but I don't know how true that is nowadays. While this is indeed a problem, it is only tangential to configuration, so I'd say we should discuss this one separately. > My main argument for using the existing flags > mechanism is avoiding having multiple overlapping mechanisms. But if we > can clearly differentiate the two kinds of things and the > overhead/boilerplate of adding flags is high, there's a strong argument > for a new mechanism. I remain undecided about this. And since I have lots of trouble ATM with more dire problems like failing QC tests even for V3 I will postpone work on this issue. Still, thanks for your input, if and when I take this up again I will surely come back to it.

History
Date	User	Action	Args
2018-10-01 07:07:06	bfrk	create
2019-06-17 14:42:57	bfrk	set	messages: + msg20851
2019-06-17 21:50:54	ganesh	set	messages: + msg20852
2019-06-18 08:54:30	bfrk	set	messages: + msg20853
2019-06-18 08:59:43	bfrk	set	title: add command 'darcs config' -> add command 'darcs config' / re-think how to configure darcs
2019-06-18 20:09:01	ganesh	set	messages: + msg20859 title: add command 'darcs config' / re-think how to configure darcs -> add command 'darcs config'
2019-06-19 10:35:47	bfrk	set	messages: + msg20862
2019-06-23 10:09:07	ganesh	set	messages: + msg20869
2019-06-23 17:57:54	bfrk	set	messages: + msg20870