I've had a passing curiosity about Guix, so it was good to read this report.
One thing I didn't find is Guix on servers. I am all-in on NixOS for both my daily driver desktop and a couple of servers, and adding more of either will be simple modifications to my flake repository. I really appreciate that simplicity and consistency. Does Guix offer that?
The other thing is package availability: it's amazing on Nix. Plus, they remain relatively fresh on the unstable channel. How's that on Guix?
The Scheme syntax takes a while to get used to if you are not familiar with it, but in the end having a real programming language is quite awesome: you can do lots of fun stuff, like programmatically creating a file or directory for every user in a certain group, etc.
The vast majority of what you’d want in a Guix server can be found in the services section and parts of the documentation that lay out how to build services. But it doesn’t have as many services available as nix.
I struggled with remote deployment + secret management, too. Like a lot of folks, my nix-config grew over the years as I added secrets management, user management etc ad hoc.
I recently found clan.nix [1] and am quite pleased. It's kind of a framework for writing nixos configurations with a focus on multiple devices. It bundles secrets management and remote deployment into a convenient CLI.
It has the concept of "services", which are e.g. used for user management and VPNs. Services define roles, which can be assigned to machines, e.g. the wireguard service has a controller and a peer role. That feels like the right abstraction and it was very easy to set up a VPN with zerotier like that, something I struggled doing myself in the past.
It's a rather young project, but after a short evaluation phase I converted my nix-config repo to use clan. It's worth taking a look for sure.
I would strongly recommend sops-nix[0]. Pair this with ssh-to-age/ssh-to-gpg for the keys for each server. We are using this at $work for multiple servers; one notable advantage is that it works for teams (multiple people) and with git (and also GitOps).
For remote installations nixos-anywhere is great. deploy-rs or colmena are fine; however, nixos-rebuild with `--target-host` also works well for us.
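To give a feel for it, the key setup is only a couple of commands; a rough sketch, assuming the sops and ssh-to-age CLIs, with host and file names as examples only:

    # derive an age recipient from a host's SSH host key (two equivalent ways)
    ssh-keyscan -t ed25519 server1 | ssh-to-age
    ssh-to-age < /etc/ssh/ssh_host_ed25519_key.pub
    # add that key (plus your own admin keys) as recipients in .sops.yaml, then encrypt
    sops --encrypt --in-place secrets/server1.yaml

The host can then decrypt those secrets at activation time with its own SSH key, which is the whole trick.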
There are plenty of options: nix-sops, or nix-age, or whatever you would like - past the overall idea the implementation details are purely a matter of taste how you fancy things to be. Key idea is to have encrypted secrets in the store, decrypted at runtime using machine-specific credentials (host SSH keys are a typical option to repurpose, or you can set up something else). For local management you also encrypt to the “developer” keys (under the hood data is symmetrically encrypted with a random key and that key is encrypted to every key you have - machines and humans).
Alternatively, you can set up a secrets service (like a password manager/vault) and source from that. Difference is where the secrets live (encrypted store or networked service, with all the consequences of every approach), commonality is that they’re fetched at runtime, before your programs start.
I’m currently using deploy-rs, but if I’d redo my stuff (and the only reason I don’t is that I’m pretty much overwhelmed by life) I’d probably go with plain vanilla nixos-rebuild --target-host and skip any additional layers (that introduce extra complexity and fragility).
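For reference, the plain vanilla version really is a one-liner; host name and flake attribute below are placeholders, and --use-remote-sudo only matters if you don't deploy as root:

    # build locally, copy the closure over SSH, and activate on the remote machine
    nixos-rebuild switch --flake .#myserver --target-host admin@myserver --use-remote-sudo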
Just a crude and somewhat haphazard summary but hope it helps.
I have been using nixos-rebuild with target host and it has been totally fine.
The only thing I have not solved is password-protected sudo on the target host. I deploy using a dedicated user, which has passwordless sudo set up to work. Seems like a necessary evil.
> I deploy using a dedicated user, which has passwordless sudo set up to work.
IMO there is no point in doing that over just using root, unless maybe you have multiple administrators and do it for audit purposes.
Anyway, what you can do is have a dedicated deployment key that is only allowed to execute a subset of commands (via the command= option in authorized_keys). I've used it to only allow starting the nixos-upgrade.service (and some other not necessarily required things), which then pulls updates from a predefined location.
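Concretely, the authorized_keys entry looks something like this (the key material and service name are just an example):

    # ~/.ssh/authorized_keys of the deployment user on the target
    command="sudo systemctl start nixos-upgrade.service",restrict ssh-ed25519 AAAA... deploy-key
    # for a small allow-list instead of a single command, point command= at a
    # wrapper script that checks $SSH_ORIGINAL_COMMAND and rejects everything else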
I went with what I consider simple; others might consider it barebones: I have a Makefile that defines all the hosts, uses scp to copy the files over to the right hosts, and then runs `nixos-rebuild switch` on each of them once the copy is done. Some hosts have an extra "restart-service-X" target, since some services have their configuration managed outside NixOS (but still in my source repository); those files are also scp'd into the right place, and the service is restarted after the `nixos-rebuild switch`. Otherwise it's all .nix files and service-specific configuration files.
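Each host target boils down to a couple of commands, roughly like this (host and service names here are made up):

    # roughly what a per-host target runs
    scp hosts/web1/*.nix root@web1:/etc/nixos/
    ssh root@web1 nixos-rebuild switch
    # restart-service-X style target, for config managed outside NixOS
    scp services/web1/Caddyfile root@web1:/etc/caddy/Caddyfile
    ssh root@web1 systemctl restart caddy.service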
> merely pulling in Nixpkgs is an effort, due to the repository being massive.
I've embraced daily shallow clone/fetches and the burden is now mostly just the 2GB of disk space.
It's a bit annoying, though, that git doesn't make it easier. No one wants to shallow clone and then later screw up and download every commit anyway; I feel shallow-cloned repos should be set up with a different configuration that fully embraces shallow history (not that the configuration options even exist today, AFAIK).
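What the daily routine looks like for me, roughly (the branch name is just an example):

    # one-time shallow clone of nixpkgs
    git clone --depth 1 --branch nixos-unstable https://github.com/NixOS/nixpkgs.git
    # daily refresh that stays shallow instead of quietly deepening the history
    git -C nixpkgs fetch --depth 1 origin nixos-unstable
    git -C nixpkgs reset --hard FETCH_HEAD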
Interesting, I have taken a stab at maintaining a nixpkgs repo using a --sparse approach, i.e.:

    git clone --filter=blob:none --sparse --branch nixos-25.11 https://github.com/NixOS/nixpkgs.git nixpkgs-dorion
    cd nixpkgs-dorion
Oh, maybe I had a full clone on my laptop before I started doing shallow fetches, but since fetching takes quite a while I've been using a shallow clone on my workstation.
I've only been doing this for a few weeks, so too early to tell if it's a good setup, but I added a GitHub Action that rebases my personal fork atop nixpkgs-weekly. I'm hoping that will help keep me from having a stale-by-default personal nixpkgs. (I use a personal nixpkgs to stage PRs waiting to be merged upstream.)
I remember thinking about doing that, but I found value in having my fork tell me when I last synced with upstream, which was useful for deciding whether to rebase my patch once again. Eventually my change got upstreamed and I stopped tracking my own fork, though.
I started doing it this way (auto-rebase) when I started sharing one home-manager config across different devices.
This lets me run `nix flake update` on my home-manager config to get all my in-flight patches plus the canonical nixpkgs from any device, trusting that they can all sync from the shared GitHub repo. Hopefully that will make updating less of a chore.
The only time it's bitten me so far was when godot3 was broken in nixpkgs-weekly but worked in 25.11. That forced me to write a PR and get it upstreamed to get my build working again, but that was more of a nixpkgs-weekly problem than a personal-fork one.
One of the wrinkles of getting home-manager going on a bunch of different devices is that it liked to copy my local git checkout of nixpkgs to /nix/store a lot. That's why I prefer to have flake.lock point at my github.com branch, and then I can test uncommitted changes as needed by passing --local to my home-manager switch incantation.
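Roughly, with hypothetical names, and using nix's stock --override-input in place of my wrapper's --local flag:

    # build the home-manager generation from the flake on GitHub, but override
    # its nixpkgs input with the local checkout, then activate it
    nix build ~/home-config#homeConfigurations.me.activationPackage \
        --override-input nixpkgs path:$HOME/src/nixpkgs
    ./result/activate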
> it didn't take long for me to realize the initial problem that caused my previous install to be unbootable was of course found between the chair and keyboard.
I think we’ve all been there! It still happens to me, exactly 5 seconds after calling Claude/Codex/Gemini names and dismissing their ability to follow instructions.
>"But NixOS isn't the only declarative distro out there. In fact GNU forked Nix fairly early and made their own spin called Guix, whose big innovation is that, instead of using the unwieldy Nix-language, it uses Scheme. Specifically Guile Scheme..."
I'd be curious if a list exists of all declarative Linux distros out there, along with the configuration language (Nix, Scheme, etc.)
I'd also be curious as to how easy it would be to convert Scheme to the Nix language or vice-versa, in other words, it seems to me that there might be a "parent language" (for lack of a better term) out there for all lisplike and functional programming language (a subset of Haskell, F#, or some other functional programming language perhaps) that sort of might act as an intermediary conversion step (again, for lack of a better term!) between one functional or lisplike programming language and another...
Probably unrelated (but maybe somewhat related!) -- consider Pandoc... Pandoc is a Haskell program that basically uses a document tree structure to convert between one type of document format and another... maybe in terms of programming languages you'd call that an AST, an Abstract Syntax Tree... so maybe there's some kind of simplified AST (or something like that) out there that works as the base tree for all functional and lisp-like programming language (yes, lisp/lisplikes sort of preserve its/their own tree; their own AST -- via their intrinsic data structure, and that would seem to be true about functional programming languages too... so what is the base tree/AST of all of these, that all languages in this family can "map on to" (for lack of better terminology), that could be used (with AI / LLM's) as an "Intermediary Language" or "Intermediary Data Structure" (choose your terminology) to allow easily converting between one and the other?
Anyway, if we had that or something like that, then Nix configurations could (in theory) be easily converted to Guix, and vice-versa, automatically, as could any other Linux configured by a functional and/or lisplike language...
I was thinking the same thing. Since Scheme is in the Lisp family, it should be straightforward to modernize it to something like Clojure, which is similar to Haskell, as you mentioned. Being functional but coming from a Java/Lisp ecosystem, it might be more viable in the typical modern software environment.
Not necessarily harder; just add 'jdk25' to your home packages. If you really don't want to use the JVM, you can use Babashka to start Clojure and use it like you would Bash.
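For example, without touching the system config at all (assuming the stock nixpkgs attribute name for Babashka):

    # run a Clojure one-liner via Babashka straight from nixpkgs, no JVM involved
    nix shell nixpkgs#babashka --command bb -e '(println (+ 1 2))'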
Modern GPU drivers are a nightmare for open source. Wi-Fi is no better, but slightly less critical. Power management too. Forget Linux; this should be the year of the NetBSD desktop, but we can't have nice architectures because of the economics of computing. The whole scenario makes sense, but the emergent result sucks.
> Modern GPU drivers are a nightmare for open source.
Modern NVIDIA drivers. Let me fix that for you.
Intel and AMD have their full stacks in mainline already, and AMD has made a great effort to enable their cards fully under open-source drivers, as far as their agreements and the law allow. You can even use HDCP without exposing sensitive parts, if you want.
Intel also works completely fine.
However, NVIDIA's shenanigans and the HDMI Forum's HDMI 2.1 protectionism are something else entirely.
NVIDIA drivers work perfectly fine, when you decide from the beginning to use them, instead of attempting to first use alternatives like nouveau.
The only problems appear when your Linux distribution decides, for some reason, to make it difficult for its users to choose the NVIDIA drivers, like Guix. On distributions where users can choose NVIDIA freely, like Gentoo, which I am using, there are no problems whatsoever with NVIDIA. I have used the NVIDIA drivers on Linux and FreeBSD for more than 20 years, on many kinds of desktops and laptops, with no problems. All this time I have also used Intel GPUs and AMD GPUs, and especially with the latter there have been problems more frequently.
Unfortunately, some of the Linux kernel developers actively try to sabotage the NVIDIA kernel driver from time to time, e.g. by imposing restrictions on the kernel symbols that it may import, which is certain to complicate the work of the maintainers of the NVIDIA drivers, and which can cause problems for the Linux users who are not aware that for this reason they must have a version of the NVIDIA driver that is matched to their kernel version.
I much prefer to have only open-source privileged code, but it cannot be said that NVIDIA has not done a good job with their Linux (and FreeBSD) support, which has been much better than that of almost any other hardware vendor. Only Intel had even better support for their hardware, including not only CPUs and GPUs, but also networking, WiFi etc. However, even with Intel, their Linux GPU drivers have been frequently worse than their Windows GPU drivers (like also with AMD), unlike NVIDIA, where their Linux drivers always had the same performance as the Windows drivers.
While non-professional users may not encounter many problems with the Intel or AMD GPUs, any user that has needed more complex OpenGL applications has frequently encountered problems in the past when the Intel or AMD OpenGL implementation in their Linux drivers was incomplete or buggy in comparison with NVIDIA.
NVIDIA's driver package consists of:
- An open-source kernel module which talks to the card.
- A set of closed source GLX libraries for acceleration support.
- A signed and encrypted firmware which only works with this closed source driver package to enable the card.
Nouveau drivers are intentionally crippled by a special firmware which only enables the card enough to show a desktop, with an abysmal performance and feature set.
Well, AMD drivers sucked a whole lot (fglrx, anyone?) before AMD made them open source. And it's the same on every other front, with basically every other manufacturer. There is no such thing as open hardware.
I have used fglrx for a very long time and had some adventures with it. I even knew people from the development team, actually.
Well, having a driver agnostic closed source firmware is pretty different from an end-to-end closed chain with a driver-authenticating firmware.
Also, while fglrx had some serious problems, they didn't wait two years to fix DVI DPMS issues like the green company.
Yes, neither are open hardware at the end of the day, but we have almost infinite number of colors and infinite shades of gray. Like everything else, this is a spectrum.
As I mentioned, I'd love to have completely free hardware, but the world's reality works differently, for many right and many wrong reasons. I'd prefer to use the most open option I can get, in this case.
But at the same time (adding more shades of color), part of the reason NVIDIA remained closed source for longer was precisely that they supported all the same features on both Windows and Linux, while AMD's Linux support was (is?) always lagging behind. For ML use cases, basically the only choice was NVIDIA.
(Nonetheless, I was very happy with my amd card, and now I'm very happy with a semi-modern Nvidia card)
Yes, that's a problem if you want a fully free-software powered system. However, considering how we had firmware since forever, this is a compromise I can personally accept, for now.
Having a completely Free Software firmware would be great, but I'm not sure the barrier to that is as low as it is for Free Software in general, since there's involvement of IP blocks, regulation, misuse of general-purpose hardware (like radios), and whatnot.
I really support an end-to-end Free Software system, but we have some road to go, and not all problems are technical in that regard.
I went with AMD for compatibility playing games, but AFAICT AMD ROCm is not in a great state for computation. Why can't I have both?
That's something like what they're describing as "a nightmare," isn't it? "As agreements and law allows," is part of the nightmare. Under a modern OS, it should not be difficult to have the full capability of the hundreds or thousands of dollars worth of hardware you paid for.
There is ZLUDA [1] for CUDA, Ollama works on AMD [2], and there's OptiScaler [3] for DLSS. For older AMD GPUs there's also FSR 4 INT8; see the howto at [4].
> With Nix, however, it was a matter of just describing a few packages in a shell and boom, Ruby in one folder, no Ruby (and thus no mess) everywhere else.
This approach was already taken by GoboLinux in 2005. And even GoboLinux was far from the first - versioned AppDirs existed for a long time before; even Perl's stow enabled that. NixOS just uses a modified variant, e.g. via hashed directory names. But I had already adopted a similar scheme to GoboLinux's soon after I switched to Linux in 2005 (well, 2004, but mostly 2005, as I was still a big noob in 2004, really).
> I started adding shell.nix files to all my little projects
I appreciate that NixOS brought good ideas to Linux here; having reliable snapshots is good. If a user has a problem, someone else might have solved it already, so you could "jump" from snapshot to snapshot. No more need for StackOverflow. The HiveMind took over.
But with all its pros, the thing I hate by far the most in NixOS is .. nix. I think the language is ugly beyond comparison; only shell scripts are uglier. I instead opted for a less sophisticated solution in that ruby acts as the ultimate glue to whatever underlying operating system is used. What I would like is a NixOS variant that is simpler to use - and doesn't come with nix. Why can't I use ruby instead? Or simple config files? I am very used to simple YAML files; all of my system description has been stored in simple YAML files for 20+ years. That approach works very well (Ruby expands these to any target destination; for instance, I have aliases for e.g. bash, but they are stored in YAML files, and from those Ruby generates any desired target format, such as cmder on Windows and so forth).
> In fact GNU forked Nix fairly early and made their own spin called Guix, whose big innovation is that, instead of using the unwieldy Nix-language, it uses Scheme.
I am glad to not be the only one to dislike nix, but boy ... scheme? Aka Lisp? Seriously???
Why would users know what cons* does, anyway? That's stupid.
YAML files exist for a reason. Keep. Things. Simple. (I know, I know, many use YAML files in a complex manner with a gazillion levels of nested indentation. Well, they're using it the wrong way, and then they complain about how bad YAML is.)
> Since the code is pretty much just Scheme and the different mechanisms available are fairly well documented (see caveat below), the barrier to entry is much lower than with Nix in my opinion.
Can't evaluate this. To me it seems as if NixOS may have changed, but Nix was always a big barrier. I decided I didn't want to overcome it, since I did not want to be stuck with a horrible language I don't want to use.
> This approach was already taken by GoboLinux in 2005. And even GoboLinux was far from the first - versioned AppDirs existed for a long time before; even Perl's stow enabled that. NixOS just uses a modified variant, e.g. via hashed directory names. But I had already adopted a similar scheme to GoboLinux's soon after I switched to Linux in 2005 (well, 2004, but mostly 2005, as I was still a big noob in 2004, really).
Nix already existed in 2003. Besides, Nix store directories are more ingenious than versioned application directories (or merely hashed directories): the hash in the output path is the hash of the normalized derivation used to build that output path (well, in most cases; let's keep it simple). Derivations work similarly (also using hashes). Moreover, since a derivation can contain other derivations as inputs, the Nix store forms hash/Merkle trees.
This makes it very powerful, because you can see which parts of the tree need to be rebuilt as a result of one derivation changing.
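You can see both the hashed paths and the tree structure directly on a NixOS machine, for example:

    # every node is /nix/store/<hash>-<name>; edges are the inputs/references
    nix-store --query --tree /run/current-system | head -n 15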
> But with all its pros, the thing I hate by far the most in NixOS is .. nix.
I think it depends on your background. I did some Haskell at some point in my life and I like Nix. It is a very simple, clean, lazy, functional programming language. The primary thing I'm missing is static typing.
> I instead opted for a less sophisticated solution in that ruby acts as the ultimate glue to whatever underlying operating system is used. What I would like is a NixOS variant that is simpler to use - and doesn't come with nix. Why can't I use ruby instead?
Because what nixpkgs does is not easily expressible/doable in Ruby. First, the package set is one huge expression in the end. That might seem weird, but it allows for a lot of powerful things like overlays. However, for performance reasons this requires lazy evaluation. Other powerful abstractions also require lazy evaluation (e.g. because there are some infinite recursions in nixpkgs).
Second, the Nix packaging model requires purity (though this gap was only properly closed with flakes). You have to be able to rely on the fact that evaluating an expression always gives the same result. Otherwise a lot of things would break (like substitution from binary caches).
Third, things like overlays rely on fixed points, which can be done easily in a lazy functional language.
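A toy version of that fixed point (the same shape that overlays and the final/self argument in nixpkgs rely on), which only terminates because evaluation is lazy:

    # prints 2: `self` refers to the finished attribute set being defined,
    # which only works because attribute values are evaluated lazily
    nix-instantiate --eval --expr '
      let fix = f: let x = f x; in x;
      in (fix (self: { a = 1; b = self.a + 1; })).b
    '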
---
Having used Nix for 8 years now, I have a long list of criticisms as well though :).
Bikeshedding over the language is a huge waste of time, too.
I haven’t written a line of Nix since I started using it, yet it defines three of my systems. I just read diffs that an LLM created when editing my config.
Making a big deal about the language substrate feels like someone still trying to argue over vim vs emacs. It’s trivial and uninteresting.
I am young (29). I like computers because they are essentially something hacky. There’s nothing more hacky than Lisp IMO. I will never complain about Lisp. I just like it.
Same here, I'm 22 and Uncle Bob sold me on the Lisp language. Reading SICP and Land of Lisp gave me a newfound appreciation for programming and design.
Emacs has Geiser for editing Scheme/Guile code with ease. For sure Emacs users know how to handle Scheme, which is just another Lisp (even MIT Scheme bundles an old Emacs release), and have at least partial knowledge of Common Lisp (and can grasp it in minutes).
My issue with Guix, coming from NixOS, is the missing first-class ZFS support for root (crypto included), RustDesk, and a few other common services that are hard to package.
Guix's potential target, IMVHO, should be desktop power users, not HPC. NixOS, while mostly developed for embedded systems (Anduril) or servers in general, still takes care of desktops; Guix apparently doesn't, and that's a big issue... Nowadays, outside academia, I doubt there are many GNU/Linux users who deploy on plain ext4...
For desktop usage, I would be absolutely shocked if ext4 isn't the most common filesystem by a pretty wide margin. It's the default on Ubuntu, Debian, and Mint. Those are the 3 leading desktop distros.
No one is going to write a blog post titled "Why I just used the default filesystem in the installer", but that is what most people do. Things like btrfs and zfs are useful, complicated technologies that are fun to write about, fun to read about, and fun to experiment with. I'd be careful about assuming that leads to more general use, though. It's a lot like Guix and NixOS, in fact. They get all the attention in a forum like this. Ubuntu is what gets all the people, though.
You have a statistical point of view that doesn't go into enough detail: yes, Debian, Ubuntu, and Mint are mainstream distros and use ext4 by default. The vast majority of their users are also mainstream users who would never approach declarative distros, which are alien to them.
Those who choose going declarative instead are people with operations knowledge, who understand the value of a system ready to be built, modified, and rebuilt with minimal effort thanks to the IaC built into the OS, who understand the value of their data and therefore babysit them properly. The average user of Debian, Ubuntu, Mint today doesn't even have a backup, uses someone else's cloud. If they run experiments, they waste storage with Docker, or use manually managed VPSs; they don't own a complete infrastructure, let alone a modern one.
So thinking about them for Guix means never letting it take off, because those users will never be Guix users. ZFS is the opposite of complicated; it's what you need to live comfortably when you know how to use it, which unfortunately isn't mainstream, and declarative distros do the same.
NixOS succeeds despite the indigestible Nix language because it offers what's needed to be comfortable to those who know. Guix remains niche not because of GNU philosophy but because it doesn't do the same: it doesn't offer what those coming from operations are looking for, and they are the most realistic potential target users.
Ext4 is still very popular as a solid, no-frills filesystem. Btrfs is the primary alternative and still suffers from a poor reputation after years of filesystem corruption bugs and hard-to-diagnose errors. ZFS and XFS only make sense for beefier servers, and all other filesystems have niche use cases or are still under development.
I don't consider myself a "believer" in anything, but as a sysadmin, if I see a deploy with ext4, I classify it as a newbie's choice or someone stuck in the 80s. It's not a matter of conviction; it's simply about managing your data:
- Transferable snapshots (zfs send) mean very low-cost backups and restores, and serious desktop users don't want to be down for half a day because a disk failed.
- A pool means effective low-cost RAID, and anyone in 2026 who isn't looking for at least a mirror for their desktop either doesn't care about their data or lacks the expertise to understand its purpose.
ZFS is the first real progress in storage since the 80s. It's the most natural choice for anyone who wants to manage their digital information. Unfortunately, many in the GNU/Linux world are stuck in another era and don't understand it. They are mostly developers whose data is on someone else's cloud, not on their own hardware. If they do personal backups, they do them halfway, without a proven restore strategy. They are average users, even if more skilled than average, who don't believe in disk failures or bit rot because they haven't experienced it personally, or if they have, they haven't stopped to think about the incident.
If you want to try out services and keep your desktop clean, you need a small, backup-able volume that can be sent to other machines, e.g. a home server, and discarded once testing is done. If you want to efficiently manage storage, because when something breaks you don't want to spend a day manually reinstalling the OS and copying files by hand, you'll want ZFS with appropriate snapshots; whether they're managed with ZnapZend or something else doesn't really matter.
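For context, that "low-cost backup" amounts to something like this (pool, dataset, host names, and dates are invented):

    # take today's snapshot and ship only the delta since yesterday to the home server
    zfs snapshot tank/home@2026-02-01
    zfs send -i tank/home@2026-01-31 tank/home@2026-02-01 | \
        ssh homeserver zfs receive -u backup/desktop/home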
Unfortunately, those without operations experience don't care and don't understand. The possibility of their computer breaking isn't something they consider, because in their experience it hasn't happened yet, or it's an event so exceptional that it doesn't need automation. The idea of having an OS installed for 10 years, always clean, because every rebuild is a fresh install and storage is managed complementarily, is alien to them. But the reality is that it's possible, and those who still understand operations really value it.
Those who don't understand it will hardly choose Guix or NixOS; they are people who play with Docker, sticking to "mainstream" distros like Fedora, Ubuntu, Mint, Arch. Those who choose declarative distros truly want to configure their infrastructure in text, IaC built-in into the OS, and truly have resilience, so their infrastructure must be able to resurrect from its configuration plus backups quickly and with minimal effort, because when something goes wrong, I have other things to think about than playing with the FLOSS toy of the moment.
I'll bite. I use NixOS as a daily driver, and IMO it makes the underlying FS type even less important. If my main drive goes, I can bootstrap a new one by cloning my repo and running some commands. For my data, I just have some rsync scripts that sling the bits to various locations.
I suppose if I really wanted to I could put the data on different partitions and disks and use the native FS tools, but it's a level of detail that doesn't seem to matter that much relative to what I currently have. I could see thinking about FS details much more for a dedicated storage server.
FS-level backups for an OS sound more relevant when the OS setup is not reproducible and would be a pain to recreate.
Yes and no. ZFS is for managing your data with simplicity and efficiency that isn't possible with other "storage systems" on GNU/Linux. Setting up a desktop with mdraid+LUKS+LVM+the chosen filesystem is a way longer job than creating a pool with the configuration you want and the volumes you want. Managing backups without snapshots that can be sent over a LAN is a major hassle.
Can it be done? Yes. Formally. But it's unlikely that anyone does it at home, because between the long setup and maintaining it there's simply too much work to do. Backing up the OS itself isn't very useful with declarative distros, but sometimes a rebuild fails because, for example, there's a broken package/derivation at that moment, so having a recent OS ready, a simple volume to send over the LAN or pull from USB storage, is definitely convenient. It has already happened to me a few times that I had to give up a rebuild for an update because something was broken upstream; a few days later it was fixed, but without an OS backup, if I had had to do a restore at that moment, I would have been stuck.
There's actually no substantial difference: it's the very concept of a mere filesystem that's obsolete. What's needed to manage your data is:
- Lightweight/instant/accessible and transmittable (at block-level) snapshots, not just logical access
- Integrated management of the underlying hardware, meaning support for various RAID types and dynamic volumes
- Simplicity of management
ZFS offers this, btrfs doesn't (even with LUKS + LVM, nor stratis); it has cumbersome snapshots, not transmittable at the block level, and has embryonic RAID support that's definitely not simple or useful in practice. Ext? Doesn't even have snapshots nor embryonic RAID nor dynamic volumes.
Let me give a simple example: I have a home server and 2 desktops. The home server acts as a backup for the desktops (and more) and is itself backed up primarily locally on cold storage. A deployment like this with NixOS, via org-mode, on a ZFS root is something that can be done at the home level. ZnapZend sends daily snapshots to the home server, which is backed up manually every day simply by physically connecting the cold storage and disconnecting it when it's done (script). That's the foundation.
What happens if I accidentally deleted a file I want back? Well, locally I recover it on the fly by going to $volRoot/.zfs/snapshots/... I can even diff them with Meld if needed. What happens if the single NVMe in the laptop dies?
- I physically change the NVMe on my desk, connect and boot the laptop with a live system on a USB NVMe that boots with sshd active, a known user with authorized keys saved (creating it with NixOS is one config and one command; I update it monthly on the home server, but everything needed is anyway in an org-mode file)
- From there, via SSH from the desktop, with one command (script) I create an empty pool and have mbuffer+zfs recv listening; the server, via mbuffer+zfs send, sends the latest snapshots of everything (data and OS)
- When it's done, chroot, rebuild the OS to update the bootloader, reboot by disconnecting the USB NVMe, and I'm operational as before
- What if one of the two mirrored NVMes in my desktop dies? I swap out the faulted one and simply wait for the resilvering.
Human restore time: ~5 minutes. Machine time: ~40 minutes. EVERYTHING is exactly as before the disk failed; I have nothing to do manually. Same for every other machine in my infra. Cost of all this? Maintaining some org-mode notes with the Nix code inside + machine time for automated ISO creation, backup incremental updates etc.
Doing this with mainstream distros or legacy filesystems? Unfeasible. Just the mere logical backup without snapshots or via LVM snaps takes a huge amount of time; backing up the OS becomes unthinkable, and so on. That's the point.
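The send/recv step of that restore, sketched with invented pool names, hostnames, and port (the real thing is wrapped in a script):

    # on the rescue-booted laptop: create an empty pool, then listen with mbuffer + zfs recv
    zpool create -f rpool /dev/nvme0n1
    mbuffer -I 9090 -s 128k -m 1G | zfs receive -F rpool
    # on the home server: stream a full recursive replication of the latest snapshots
    zfs send -R backup/laptop@latest | mbuffer -O laptop:9090 -s 128k -m 1G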
Most people have never built an infra like this; they've spent HOURS working in the shell to build their fragile home infra, and when something breaks they spend hours manually fixing it. They think this is normal because they have nothing else to compare it to. They think a setup like the one described is beyond home reach, but it's not. That's why classic filesystems, from ext to xfs (which does have snapshots), passing through reiserfs, btrfs, bcachefs and so on, make no sense in 2026 and didn't in 2016 either.
They are software written even in recent times, but born and stuck in a past era.
Or you just fully embrace the thin client life and offload everything to the server. pxe boot with remotely mounted filesystems. local hard drives? who needs those?
And the server is handled how? We're always there: complexity can be managed or hidden.
Why do you think some people asked Sun to un-free ZFS back in the day? Because unlike most, they understood its potential. Why do you think PC components today (graphics cards first, then RAM, and NVMe drives after that) cost so much? Because those who understand realize that today a GNU/Linux home server and desktop are ready for the masses, and it's only a matter of time before an umbrel.com, start9.com, or even frigghome.ai succeeds and sweeps away increasingly ban-happy, and therefore unreliable and expensive, cloud providers. Most still haven't grasped this, but those who live above the masses have.
Why are snaps, flatpaks, Docker, etc. pushed so hard even though they have insane attack surfaces, give you minimal control over your own infrastructure, and are a huge waste of resources? Because they allow selling support to people who don't know. With NixOS or Guix, you only sell a text config. It's not the same business model, and after a while, with an LLM, people learn to do it themselves.
The scenarios you mentioned are indeed nice use cases of ZFS, but other tools can do this too.
I can make snapshots and recover files with SnapRAID or Kopia. In the case of a laptop system drive failure, I have scripts to quickly setup a new system, and restore data from backups. Sure, the new system won't be a bit-for-bit replica of the old one, and I'll have to manually tinker to get everything back in order, but these scenarios are so uncommon that I'm fine with this taking a bit more time and effort. I'd rather have that over relying on a complex filesystem whose performance degrades over time, and is difficult to work with and understand.
You speak about ZFS as if it's a silver bullet, and everything else is inferior. The reality is that every technical decision has tradeoffs, and the right solution will depend on which tradeoffs make the most sense for any given situation.
How often do you test your OS replication script? I used to do that too, and every time there was always something broken, outdated, or needing modification, often right when I desperately needed a restore because I was about to leave on a business trip and had a flight to catch with a broken laptop disk.
How much time do you spend setting up a desktop and maintaining it with mdraid+LUKS+LVM+your choice of filesystem, replacing a disk and doing the resilvering, or making backups with SnapRAID/Kopia etc? Again, I used to do that. I stopped after finding better solutions, also because I always had issues during restores, maybe small ones, but they were there, and when it's not a test but a real restore, the last thing you want is problems.
Have you actually tested your backup by doing a sudden, unplanned restore without thinking about it for three days before? Do you do it at least once a year to make sure everything works, or do you just hope that since computers rarely fail and restores take a long time, everything will work when you need it? When I did things like you and others I know who still do it, practically no one ever tested their restore, and the recovery script was always one distro major release behind. You had to modify it every few releases when doing a fresh install. In the meantime, it's "hope everything goes well or spend a whole day scrambling to fix things."
Maybe a student is okay with that risk and enjoys fixing things, but generally, it's definitely not best practice and that's why most are on someone else's computer, called the cloud, as protection from their IT choices...
> How often do you test your OS replication script?
Not often. It's mostly outdated, and I spend a lot of time bringing it up to date when I have to rely on it.
BUT I can easily understand what it does, and the tools it uses. In practice I use it rarely, so spending a few hours a year updating it is not a huge problem. I don't have the sense of urgency you describe, and when things do fail, it's an extraordinary event where everything else can wait for me to be productive again. I'm not running a critical business, these are my personal machines. Besides, I have plenty of spare machines I can use while one is out of service.
This is the tradeoff I have decided to make, which works for me. I'm sure that using ZFS and a reproducible system has its benefits, and I'm trying to adopt better practices at my own pace, but all of those have significant drawbacks as well.
> Have you actually tested your backup by doing a sudden, unplanned restore without thinking about it for three days before?
No, but again, I'm not running a critical business. Things can wait. I would argue that even in most corporate environments the obsession over HA comes at the expense of operational complexity, which has a greater negative impact than using boring tools and technology. Few companies need Kubernetes clusters and IaC tools, and even fewer people need ZFS and NixOS for personal use. It would be great if the benefits of these tools were accessible to more people with less drawbacks, but the technology is not there yet. You shouldn't gloss over these issues because they're not issues for you.
Most companies have terrible infrastructure; they're hardly ever examples to follow. But they also have it because there's a certain widespread mentality among those who work there, which originates on the average student's desktop, where they play with Docker instead of understanding what they're using. This is the origin of many modern software problems: the lack of proper IT training in universities.
MIT came up with "The Missing Semester of Your CS Education" to compensate, but it's nothing compared to what's actually needed. It's assumed that students will figure it out on their own, but that almost never happens, at least not in recent decades. It's also assumed that it's something easy to do on your own, that it can be done quickly, which is certainly not the case and I don't think it ever has been. But the teacher who doesn't know is the first to have that bias.
The exceptional event, even if it doesn't require such a rapid response, still reveals a fundamental problem in your setup. So the question should be: why maintain this complex script when you can do less work with something else? NixOS and Guix are tough nuts to crack at first: NixOS because of its language and poor/outdated/not exactly well-done documentation; Guix because its development is centered away from the desktop and it lacks some elements common in modern distros, etc. But once you learn them, there's much less overhead to solve problems and keep everything updated, much less than maintaining custom scripts.
I'm currently troubleshooting an issue on my Proxmox server with very slow read speeds from a ZFS volume on an NVMe disk. The disk shows ~7GBps reads outside of ZFS, but ~10MBps in a VM using the ZFS volume.
I've read other reports of this issue. It might be due to fragmentation, or misconfiguration, or who knows, really... The general consensus seems to be that performance degrades after ~80% utilization, and there are no sane defragmentation tools(!).
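For reference, this is roughly what I've been looking at to check fill level and fragmentation (just indicators, since as noted there's no defrag tool):

    # pool fill level and fragmentation of the free space
    zpool list -o name,size,alloc,free,frag,cap
    # live per-vdev throughput while reproducing the slow reads
    zpool iostat -v 5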
On my NAS, I've been using ext4 with SnapRAID and mergerfs for years without issues. Being able to use disparate drives and easily expand the array is flexible and cost effective, whereas ZFS makes this very difficult and expensive.
So, thanks, but no thanks. For personal use I'll keep using systems that are not black boxes, are reliable, and performant for anything I'd ever need. What ZFS offers is powerful, but it also has significant downsides that are not worth it to me.
Honestly, pre-made containers are usually black boxes and also a huge waste of resources. If anything, your problem is not using NixOS or Guix, which means you have no reason to waste resources with Proxmox and maintain a massive attack surface thanks to ready-made containers from who knows who, maybe even with their forgotten SSH keys left inside, with dependencies that haven't been updated in ages because whoever made them works in Silicon Valley mode, etc.
First of all, I don't see how containers are inherently black boxes or a waste of resources. They're a tool to containerize applications, which can be misused as anything else. If you build your own images, they can certainly be lightweight and transparent. They're based on well known and stable Linux primitives.
Secondly, I'm not using containers at all, but VMs. I build my own images, mainly based on Debian. We can argue whether Linux distros are black boxes, but I would posit that NixOS and Guix are even more so due to their esoteric primitives.
Thirdly, I do use NixOS on several machines, and have been trying to setup a Guix system for years now. I have a love/hate relationship with NixOS because when things go wrong—and they do very frequently—the troubleshooting experience is a nightmare, due to the user hostile error messages and poor/misleading/outdated/nonexistent documentation.
By "black box" I was referring to the black magic that powers ZFS. This is partly due to my own lack of familiarity with it, but whenever I've tried to learn more or troubleshoot an issue like the performance degradation I'm experiencing now, I'm met with confusing viewpoints and documentation. So given this, I'm inclined to use simpler tools that I can reasonably understand which have given me less problems over the years.
Ugh, containers/VMs are black boxes because in common practice you just pull the image as-is without bothering to study what's inside, without checking things like outdated dependencies left behind, some dev's forgotten SSH keys, and so on. There are companies that throw the first image they find from who-knows-who into production just because "it should have what I'm looking for"...
Are they knowable? Yes, but in practice they're unknown.
They waste resources because they duplicate storage, consume extra RAM, and so on to keep n common elements separate, without adding any real security, and with plenty of holes punched here and there to make the whole system/infra work.
This is also a terrible thing in human terms: it leads to a false sense of security. Using full-stack virtualization increases the overhead on x86 even more, again with no substantial benefit.
ZFS has a codebase that's not easy, sure, but using it is dramatically simple. On GNU/Linux the main problem is that it's not a first-class citizen, due to the license and to being a port from another OS rather than something truly native, even though a lot has been done to integrate it. But `zpool create mypool mirror /dev/... /dev/...` is definitely simple, as is `zfs create mypool/myvol` and so on... Compared to mdadm+luks+{pv,vg,lv}* etc. there's no comparison; it's so much easier and clearer.
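To make that comparison concrete (device names and sizes are placeholders):

    # ZFS: an encrypted mirror plus a dataset, two commands
    zpool create -O encryption=on -O keyformat=passphrase tank mirror /dev/nvme0n1 /dev/nvme1n1
    zfs create tank/home
    # the classic stack for roughly the same result
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
    cryptsetup luksFormat /dev/md0 && cryptsetup open /dev/md0 cryptroot
    pvcreate /dev/mapper/cryptroot && vgcreate vg0 /dev/mapper/cryptroot
    lvcreate -L 500G -n home vg0 && mkfs.ext4 /dev/vg0/home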
Most people in the enterprise will use LVM+ext4, for backups and easy resizing and the like. Also, Guix has rollbacks for the whole package manager and system, so whole-system backups aren't mandatory; the data would be handled and backed up externally on some LVM volume, and there's surely a Guix service for backups.
Well, yeah, but in the enterprise world, the infrastructure is often really pathetic, layers of legacy garbage with "do not touch, if it breaks everything is lost" labels. I wouldn't take them as a model, which is one of the common and self‑harming behaviors many people practice: trying to imitate others just because they're big, not because it makes technical sense.
The system can be regenerated, sure, but it might fail to rebuild at some point because of a broken upstream package; that's what backups are for, and it's handy to have them for everything. Adding, say, thirty gigabytes for the system costs little.
LVM does allow you to operate, yes, but with much heavier limitations and far more inconvenience than ZFS... And here we come back to operations vs. even top developers, like Andrew Morton's infamous quote about ZFS, which shows the difference between those who genuinely manage complex systems and those who only see their own workstation and don't understand broader needs to the point of denying they exist even for themselves.
TLDR: ... I'm getting a comparable experience to NixOS, with all the usual pros a declarative environment brings and without having to put up with Nixlang.
How do the errors/debugging compare? From what I've read this is the main pain point with Nix, where a more mature language like Guile should offer a much better experience. The article touches on this, but I'd be curious about a more extensive comparison of this aspect.
Just a personal anecdote, but the errors from Guix are terrible. I had to reinstall because I couldn't figure out the scheme errors for my system config
And an LLM/AI model can apply its huge Lisp/Scheme training set to help solve your problem.
Nixlang is so infuriatingly obtuse that I generally have to fire up Discord and bug the local Nix acolyte when something goes wrong. I've bounced Nixlang off of the LLM/AIs, but I have learned that if the AI doesn't give you the correct answer immediately for Nixlang then you need to stop; everything forward will be increasingly incorrect hallucinations.
I suspect LLM/AIs will hallucinate far, far less with the Scheme from Guix.
Seems like they have pretty clear goals in mind. If they were changing distributions haphazardly I'd agree, but to me it reads like they're refining their taste.