Make a service abstraction layer #26067

Open
copumpkin opened this issue May 24, 2017 · 31 comments

Comments

@copumpkin
Member

copumpkin commented May 24, 2017

We've talked about this for years, usually in the context of people hating or wanting to replace systemd, but I think it's time to start talking about it in other less negative contexts as well.

Motivation

Nixpkgs now works pretty well on Darwin, and the nix-darwin effort by @LnL7 can manage services using launchd. Unfortunately, even though 99% of the logic could be shared with NixOS services, it can't currently be.

Furthermore, people are taking more and more infrastructure to distributed container scheduling systems like Kubernetes, ECS, or Nomad. Although in some cases people might run the host on NixOS, I think Nix can provide an equally compelling story for the container guests. @lethalman's dockerTools is a promising start to that, but just like nix-darwin, anyone using it has to reinvent the configuration for their services because NixOS service modules can't be reused.

Finally, assorted work like declarative user environments, replacing the init system (please don't turn this thread into another systemd flamewar; we have plenty of other venues for that), and service testing on virtualized platforms (I can't run any of the NixOS VM tests on EC2, Azure, Google Cloud, and so on) all could benefit heavily from such a layer.

Prior work

The major effort so far has been #5246. It's an impressive feat, but it has long since bitrotted, and @offlinehacker does not have the time to pick it up again.

Key goals

  1. Should be possible to evaluate/build on non-NixOS systems, and ideally non-Linux ones (with a suitable back-end)
  2. Shouldn't lose the nice functionality we enjoy from systemd. More generally, it shouldn't prevent us from using system-specific idiosyncrasies that don't exist in other systems. If we see some of them being used repeatedly all over the place, we eventually factor them out into the SAL.
  3. Shouldn't get overly complicated or generalized. Let's not try to plan for every eventuality and just tackle low-hanging fruit for now.
  4. Don't get mired in getting something perfect all at once. 1000-line+ PRs rarely get merged or even reviewed, and bitrot easily. I'm convinced we can do this incrementally and see if we like it on one or two simple services with small PRs, and then start working outwards from there.

Proposed approach

I'm not actually going to propose a technical approach here. Rather, I'd like to propose how we approach figuring out how to implement it.

  1. Go through all of our service modules and categorize how many of them use each key in systemd.services.<name>.<key> and how.
  2. Write the thinnest layer that covers the two or three most commonly used service keys and translates its config to the current systemd.services machinery
  3. Write a simple "dump to text file" backend for the SAL that lists enabled services in an arbitrary piece of config, and builds on a non-Linux platform
  4. In separate PRs, start switching over small numbers of our simplest services to use the new SAL machinery, possibly leaving some systemd.services config behind to merge with the config that SAL sets
  5. Start work on more interesting backends like a launchd one and a docker one, etc.
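
As a rough illustration of step 3, a "dump to text" backend could be little more than a fold over the enabled services. This is only a sketch in plain Nix: the `services` attribute set and its `enable`/`script` fields are assumed shapes for a future SAL, not existing options.

```nix
# Hypothetical "dump" backend: render the enabled services as plain text.
# `services` stands in for whatever a SAL option set would evaluate to.
let
  services = {
    foo = { enable = true;  script = "foo --serve"; };
    bar = { enable = false; script = "bar --daemon"; };
  };

  # Names of the services that are switched on.
  enabled = builtins.filter (name: services.${name}.enable)
    (builtins.attrNames services);

  # One line of text per service.
  renderService = name: "${name}: ${services.${name}.script}";
in
builtins.concatStringsSep "\n" (map renderService enabled)
```

Nothing here depends on Linux or on a particular init system, which is the property step 3 is after. Wrapping the resulting string in `builtins.toFile` would turn it into a file in the store.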

I expect most services don't do much beyond setting ExecStart/script and some environment, possibly with a default wantedBy = [ "multi-user.target" ] which basically means "please run my unit" in most cases. Many services probably want a way to express that they need networking, and some might want to run as a separate user. Eventually we'll start running into services that depend on other services, and I propose not trying to tackle that at first.
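
To make step 2 a little more concrete, here is a minimal sketch of what such a thin layer might look like as a NixOS module. The `sal.services` option name and its two keys are invented for illustration; only the `systemd.services` side is existing machinery.

```nix
# sal.nix: hypothetical module; the `sal.*` options do not exist in NixOS.
{ config, lib, ... }:

let
  cfg = config.sal.services;
in {
  options.sal.services = lib.mkOption {
    default = { };
    description = "Backend-agnostic service definitions (sketch).";
    type = lib.types.attrsOf (lib.types.submodule {
      options = {
        script = lib.mkOption {
          type = lib.types.lines;
          description = "Shell commands that start the service.";
        };
        environment = lib.mkOption {
          type = lib.types.attrsOf lib.types.str;
          default = { };
          description = "Environment variables passed to the service.";
        };
      };
    });
  };

  # systemd backend: translate each SAL service to the current machinery.
  config.systemd.services = lib.mapAttrs (name: svc: {
    inherit (svc) script environment;
    wantedBy = [ "multi-user.target" ];
  }) cfg;
}
```

Because module definitions merge, a service converted to this layer could still set extra `systemd.services.<name>` keys next to it, which is the "leaving some systemd.services config behind" case from step 4.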

cc @shlevy @offlinehacker @edolstra @globin @LnL7

P.S: this isn't an RFC because I'm not actually saying what to do, but just what I want and how I think we should go about making it happen

@edolstra
Member

I'm skeptical about having a service abstraction layer.

  • I suppose the goal is to move from systemd.services.* back to a generic jobs.* option. However, this only works for services defined in NixOS modules, not for upstream units. This would cause us to lose one of the biggest advantages of systemd, namely the ability to use upstream units directly. (In the Upstart days, IIRC, we had to define all Upstart services in NixOS modules, so there was no reuse of upstream job files. Not that there were many upstream job files...)

  • An abstraction layer would condemn us to the least common denominator of init systems, which is very small. For example, systemd and Upstart have completely different dependency systems, so having dependencies between services would be out. Startup notification, cgroups, etc. - i.e. all the features that make systemd so much better in keeping track of services - would be out. Likewise for socket activation, resource control, sensible logging, etc. etc. Some of these could be implemented on other systems, but cgroups in particular would probably be impossible to emulate on OS X. And without cgroups, you can't reliably keep track of processes so service scripts quickly become littered with killall hacks.

  • Even if most services are currently simple (i.e. use ExecStart etc.), moving to an abstraction layer imposes a cost in that people will be discouraged from improving a service by using systemd-specific features (since that would make the service no longer portable to other init systems). So for instance, if somebody creates a PR to add socket activation to some service S, that PR might be rejected because it would cause S to no longer work on OS X.

@copumpkin
Member Author

An abstraction layer would condemn us to the least common denominator of init systems, which is very small. For example, systemd and Upstart have completely different dependency systems,

@edolstra I explicitly addressed what I think we should do about idiosyncratic features in my proposal. Did you not see that? I'm very explicitly saying we should not do what you just said would be a problem.

I'm also curious what you think we should do about the problem. I know you use NixOS a lot but it feels like a pity to lose out on all the goodness if we're using other platforms.

@edolstra
Member

I did see that, but I don't think it will work in practice. For example:

  • Alice adds a service that uses the abstraction layer.
  • A while later, Bob improves the service by adding socket activation, but this breaks the service on systems that don't provide socket activation.
  • Bob's improvement gets reverted because people were relying on the service working on those systems.

Now, maybe, somebody at this point steps up to add socket activation to the abstraction layer (and to all supported backends). But I don't think we should count on that. (Also, it won't work for all the non-portable stuff like cgroups.)

@copumpkin
Member Author

copumpkin commented May 24, 2017

Your objections make sense; but how about turning the problem on its head a bit? The main thing I want to be able to share is the module config schema and how it plugs into service execution. I don't care as much if someone needs to write two or three different config sections per service to make it work on all backends. How ridiculous would this be?

{ config, lib, pkgs, ... }:

let
  cfg = config.myservice;

  # builtins.toFile takes a file name and the contents; "foo.json" is a placeholder name.
  configFile = builtins.toFile "foo.json" (builtins.toJSON { port = cfg.port; });

  startScript = ''
    ${pkgs.foo}/bin/foo -c ${configFile}
  '';
in {
  # Declared under `myservice` so that they match `cfg` above.
  options.myservice = {
    enable = lib.mkOption { ... };
    port = lib.mkOption { ... };
    # You get the idea
  };

  config = lib.mkIf cfg.enable {
    # Each backend gets its own section; only startScript is shared.
    systemd.services.foo = {
      serviceConfig.PrivateTmp = true;
      script = startScript;
    };
    docker.services.foo = {
      volumes = { "/data" = {}; };
      script = startScript;
    };
    launchd.services.foo = {
      launchdOptions.StartOnMount = true;
      script = startScript;
    };
  };
}

Obviously details would vary, but this would allow individual service modules to factor out common stuff (config generation, possibly start commands, environment variables, etc.) across the different launch systems, but we wouldn't make any attempt to abstract over common options between them. Laziness means we don't evaluate config we don't use, so e.g., asking for a docker container out of a particular service won't force the systemd config and vice versa. Eventually we might realize that many of those systemd/docker/launchd triplets look the same and might factor them into some common functions, but we wouldn't be forced to decide that ahead of time.
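
The laziness argument can be illustrated in plain Nix, outside the module system. In this sketch the `launchd` value stands in for a backend that cannot be evaluated on the current platform; the names and the command string are made up.

```nix
# Attribute values are only evaluated when something asks for them.
let
  backends = {
    systemd = { script = "foo -c /etc/foo.json"; };
    # Stands in for a backend that cannot be evaluated on this platform.
    launchd = throw "launchd backend is not available here";
  };
in
# Selecting the systemd script never forces the launchd attribute,
# so the `throw` above is not triggered.
backends.systemd.script
```

The module system adds its own merging on top of this, so how far the same guarantee carries over to real modules is an assumption that a proof of concept would have to confirm.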

I can sketch something like this out quickly in a PR if you'd like something more concrete to munch on.

@7c6f434c
Member

I will join as a user of NixOS services on a system with NixPkgs kernel but without systemd.

  1. In many situations nixosInstance.config.systemd.services.[name].runner is a script that can be used to start the service, and the target service management solution can just run it.
  2. In cases where the service can be convinced to put configs into /etc/, nixosInstance.config.environment.etc.[filename] gives you the config; the launcher script is usually simple anyway.
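
For readers who have not pulled config out of an evaluated NixOS instance before, the second point looks roughly like the sketch below. It assumes `<nixpkgs>` is on `NIX_PATH`; the file name `myservice.conf` and its contents are placeholders rather than anything a real module generates.

```nix
# Evaluate a NixOS configuration and read one generated /etc file from it,
# without building or booting the system itself.
let
  nixosInstance = import <nixpkgs/nixos/lib/eval-config.nix> {
    system = "x86_64-linux";
    modules = [
      # Placeholder for a service module that writes its config to /etc.
      { environment.etc."myservice.conf".text = "port = 8080\n"; }
    ];
  };
in
# The generated config file, which any launcher script can point at.
nixosInstance.config.environment.etc."myservice.conf".source
```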

Given that we often need to patch the build system to make sure it uses $PATH correctly, and that we try to move a lot of configs out of /etc/, I expect that upstream systemd configs will require an interesting amount of patching. And I think that adding alternative service definitions in this case (maintained by whoever needs them) seems no less reasonable than having multiple boost versions in NixPkgs.

I do hope that we could eventually have service.configs or something where the configs are kept, because a lot of service definitions do not leave an easy way to access their configs in a simple programmatic way, and I know of no guidelines on how to provide such access in NixOS. I really hope that not being able to list all the configs a service definition generates is not desirable for most services, so such export shouldn't limit desirable improvements, and experience shows that config generation is most of the work anyway unless the service definition is completely trivial.

@edolstra
Member

@copumpkin Yeah that sounds pretty good to me!

We'll probably want to have variants of module-list.nix for various environments, in order to filter out services that don't support particular environments.

@vcunat
Member

vcunat commented May 24, 2017

We added meta.maintainers to services, so why not meta.platforms? EDIT: I see the platforms here will be of a different kind; anyway that's all an unimportant nitpick from me ATM.

@edolstra
Member

@7c6f434c I don't think upstream systemd units typically need a lot of patching. Also, they're extensible without patching. E.g. you can set systemd.services.foo.path while still using the upstream foo unit. (Also Exec* directives require absolute paths, so systemd units tend to rely less on $PATH.)
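
A sketch of that pattern, assuming a package `pkgs.foo` that ships its own `foo.service` (the package and unit names are placeholders):

```nix
{ pkgs, ... }:
{
  # Pick up the unit files shipped by the package itself.
  systemd.packages = [ pkgs.foo ];

  # Extend the upstream unit from the NixOS side without patching it.
  systemd.services.foo.path = [ pkgs.hello ];
}
```

Depending on the unit, it may also need a `wantedBy` entry in the NixOS configuration before it is actually started at boot.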

Regarding "maintained by whoever needs them": it tends not to work that way in practice. Somebody will do a PR to add (say) OS X support to the PostgreSQL module, and then it becomes everybody else's responsibility to keep it working. (I.e. if somebody changes something in the module that breaks it on OS X, then that person will get blamed, even though that person might have no way to test on OS X.)

@vcunat Maybe, though that requires parsing/evaluating potentially a lot of files that are not usable on a particular platform.

@copumpkin
Member Author

copumpkin commented May 24, 2017

Yeah, perhaps a module-level meta would make sense (also for module-level documentation, maintainership, etc.). Then module-list could just enumerate and filter. That's complicated slightly by modules usually being functions and needing to be knotted before being able to query the meta, but it's not awful.
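
One way to sidestep the knotting problem, at least for a first experiment, would be to keep the metadata next to the module path instead of inside the module function. The sketch below assumes a hypothetical `backends` field and made-up file paths; neither exists in module-list.nix today.

```nix
let
  # Hypothetical per-module metadata kept outside the module itself.
  moduleList = [
    { path = ./services/foo.nix; backends = [ "systemd" "launchd" "docker" ]; }
    { path = ./services/bar.nix; backends = [ "systemd" ]; }
  ];

  # Enumerate and filter: keep only the modules that support `backend`.
  modulesFor = backend:
    map (m: m.path)
      (builtins.filter (m: builtins.elem backend m.backends) moduleList);
in
modulesFor "launchd"
```

With the data above, only the `foo.nix` path is selected for `"launchd"`, and none of the module files has to be imported to make that decision.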

@7c6f434c
Member

@edolstra Re: «maintained by whoever needs them»: it doesn't work that way in general, but specifically in the case where the split is caused by using an upstream unit, there is a better chance, because from the mainline NixOS side there is no reason to touch the forked version anyway… Well, if the config is shared, the systemd-specific part will be inside ifs.

But the only thing I actively want is config access.

@LnL7
Member

LnL7 commented May 24, 2017

@copumpkin I think that approach makes the most sense: it makes the difference very explicit, and Nix has primitives to define common options in a nice way. For example, my modules for launchd also support some options like script and path, so for simple cases the config could be reused.

{ config, lib, pkgs, ... }:
with lib;
let
  cfg = config.services.foo;
  service =
    { script = "${pkgs.foo}/bin/food";
      path = [ pkgs.hello ];
    };
in {
  options = {
    services.foo.enable = mkOption {};
  };

  config = mkIf cfg.enable {
    systemd.services.foo = service;
    launchd.services.foo = service;
    docker.services.foo.container = "foo";
  };
}

Something else that I've been wondering about is how this would work for darwin and regular linux systems without accidentally bringing in activation scripts from the NixOS core modules, etc.
What I've done for nix-darwin is to copy or reimplement common options instead of importing them.

@copumpkin
Member Author

copumpkin commented May 24, 2017

@LnL7 I think laziness mostly avoids bringing in any config we don't ask for. I'll sketch out a very simple proof of concept a bit later in a PR and see if I run into any issues.

In a sentence, ultimately all config reduces to one or two "entry points", like system.build; we'd probably just add separate entry points for Darwin/docker/etc.

@LnL7
Member

LnL7 commented May 24, 2017

Yeah, that last comment is probably a little bit out of the scope of this issue.

@offlinehacker
Contributor

offlinehacker commented May 24, 2017 via email

@shlevy
Member

shlevy commented May 25, 2017

So, since I was tagged, my 2 cents are that the NixOS module system is annoying and uncomposable, and the solution here is probably just plain functions, choosing different ones with the same high level interface for different backends. In the case of "just run this program with these args", there of course can be a common helper function interface. In more complex cases, we only have as many backends as can support the interface, or possibly have graceful degradation for cases where we can't.
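
A minimal sketch of the "plain functions" idea for the "just run this program with these args" case might look like this. The argument names, the result shapes, and the `org.example` label prefix are all assumptions made for illustration.

```nix
let
  # One function per backend, all accepting the same { name, command, args }.
  backends = {
    systemd = { name, command, args }: {
      unit = "${name}.service";
      serviceConfig.ExecStart =
        builtins.concatStringsSep " " ([ command ] ++ args);
    };
    launchd = { name, command, args }: {
      Label = "org.example.${name}";  # made-up label prefix
      ProgramArguments = [ command ] ++ args;
    };
  };

  # Common helper interface: pick the backend, then describe the program.
  runProgram = backend: backends.${backend};
in
runProgram "launchd" {
  name = "foo";
  command = "/bin/foo";
  args = [ "-c" "/etc/foo.json" ];
}
```

A backend that cannot support a richer interface would simply have no entry for it, which matches the "only as many backends as can support the interface" point.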

@copumpkin
Member Author

Here's a sketch of the idea I proposed: #26075

@shlevy I partially agree but the merging behavior is nice (feels a bit like AOP in some cases) and I don't want to reinvent everything just to get some basic reuse 😄

@7c6f434c
Member

@shlevy if there is config export, we can just use separate NixOS instances for every service, then it behaves as a pure function…

@mmahut
Member

mmahut commented Aug 25, 2019

Are there any updates on this issue, please?

@wmertens
Contributor

This issue predates the RFC process. I propose that a champion (not me, although I think it would be great to have for Darwin, Docker, WSL, etc.) pour the key points of this issue into an RFC so we can get traction on it that way.

@stale

stale bot commented Jun 1, 2020

Thank you for your contributions.
This has been automatically marked as stale because it has had no activity for 180 days.
If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.
Here are suggestions that might help resolve this more quickly:

  1. Search for maintainers and people that previously touched the
    related code and @ mention them in a comment.
  2. Ask on the NixOS Discourse.
  3. Ask on the #nixos channel on irc.freenode.net.

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 1, 2020
@rien
Contributor

rien commented Jun 7, 2020

This is still important to me. I would love to use a simpler init system with NixOS.

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 7, 2020
@wmertens
Contributor

wmertens commented Jun 8, 2020

I like the idea of taking the most capable system (Kubernetes, apparently) and adding wrappers or ignoring features in other systems.

So keep the systemd config layer, but add a more general one that mostly translates directly to systemd configurations.

Then, if you want to have an init.d type system, you would basically run all the scripts once, and output a bunch of warnings about unsupported features. The system would not restart crashed services or handle log rotation until someone comes up with a wrapper for those. On the plus side, the closure would be tiny and it would work in containers.

In fact, those general settings could get their defaults from the current systemd settings, so most modules would work as-is.
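
As a rough sketch of the "ignore and warn" part, an init.d-style backend could take a systemd-shaped service definition, keep the keys it understands, and report the rest. The function name `toInitd` and the list of supported keys are hypothetical.

```nix
let
  # Keys the hypothetical init.d-style backend understands.
  supported = [ "script" "environment" ];

  toInitd = name: svc:
    let
      # Everything else is ignored, but reported.
      unsupported = builtins.filter (k: !(builtins.elem k supported))
        (builtins.attrNames svc);
    in {
      inherit name;
      script = svc.script;
      warnings = map
        (k: "service ${name}: ignoring unsupported setting '${k}'")
        unsupported;
    };
in
toInitd "foo" {
  script = "foo --serve";
  serviceConfig = { Restart = "always"; };
}
```

Here the result keeps the start script and carries a single warning about `serviceConfig`, so a crashed service would not be restarted until someone writes a wrapper for that feature.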

@lpolzer-enercity

+1 here, this is one thing that still keeps me from considering adopting NixOS.

@wmertens
Contributor

wmertens commented Jun 9, 2020 via email

@abathur
Member

abathur commented Sep 28, 2020

@wmertens I don't know enough about different init systems to evaluate how well it grapples with some of the challenges Eelco mentioned, but IIRC a Nix newsletter early this year featured a blog post (https://sandervanderburg.blogspot.com/2020/02/a-declarative-process-manager-agnostic.html) by @svanderburg about the project https://github.com/svanderburg/nix-processmgmt, which seems to have made substantive progress on the core concepts, here?

@wmertens
Contributor

@abathur I hadn't seen that, it looks impressive! Seems like @svanderburg found a nice middle ground between service-specific and service-agnostic.

I wonder how far along this is to be able to boot e.g. nixos-minimal with a different service manager.

@wmertens
Contributor

Don't forget to watch Sander present nix-processmgmt https://cfp.nixcon.org/nixcon2020/talk/TW79FU/ :)

@martin-braun

I'd also enjoy using NixOS without systemd; I'm rather interested in building a minimal system that uses OpenRC instead.

@ThisNekoGuy

I've recently been curious about NixOS but, similarly, I'd rather have s6 and/or OpenRC as well :/

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/is-nixbsd-a-posibility/29612/16

@abathur
Member

abathur commented Oct 6, 2023

:)
