
Docker's reliance on overlay filesystems is one of the biggest problems I have with Docker. Stacking opaque disk images on top of each other just isn't a great design, and it makes for a cache strategy that is invalidated all too often (because a Dockerfile is linear, there is no dependency graph). The file system is just the wrong layer of abstraction for solving such an issue.

If you change paradigms from the imperative sequence of mutations in a Dockerfile to a declarative specification that produces immutable results and a package dependency graph structure, you get a much better cache, no need for disk image layering, a real GC, etc. For example, the GNU Guix project (purely functional package manager and GNU/Linux distro) has a container implementation in the works named call-with-container. Rather than using overlay file systems, it can just bind mount packages, files, etc. inside the container file system as read-only from the host. Not only is this a much simpler design, but it allows for trivial deduplication of dependencies. Multiple containers running software with overlapping dependency graphs will find that the software is on disk exactly once. Since the results of builds are "pure" and immutable, the software running inside the container cannot damage those shared components. It's nice how some problems can simply disappear when an alternative programming paradigm is used.
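The deduplication story can be sketched in a few lines of Python. The hashing scheme below is a loose simplification of what Guix/Nix actually do (they hash the complete build recipe, not just a name and version), and the path format is only illustrative:

```python
import hashlib

def store_path(name, version, deps):
    """Derive a content-addressed store path from a package plus the
    store paths of its dependencies, so the whole dependency graph is
    folded into the result. Simplified sketch, not the real Guix hash."""
    h = hashlib.sha256()
    h.update(f"{name}-{version}".encode())
    for dep in sorted(deps):  # deps are themselves store paths
        h.update(dep.encode())
    return f"/gnu/store/{h.hexdigest()[:12]}-{name}-{version}"

# Shared subgraph: both applications depend on the same glibc build...
glibc = store_path("glibc", "2.22", [])
app_a = store_path("app-a", "1.0", [glibc])
app_b = store_path("app-b", "1.0", [glibc])

# ...so the store holds exactly one copy of glibc; each container just
# bind mounts the same read-only path from the host.
assert store_path("glibc", "2.22", []) == glibc
assert app_a != app_b
```

Because the path is a pure function of the package and its inputs, two containers with overlapping dependency graphs necessarily resolve to the same on-disk paths, and deduplication falls out for free.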

https://www.gnu.org/software/guix/news/container-provisionin...



As an LXC user since very early days, I wrote something similar to Docker but architecturally far more generic, begun earlier (~201-2015). It's called cims. It was explicitly designed to be portable to *BSD as well as across arbitrary storage drivers (Docker added this later), cloud provider interfaces (VM-only, AWS/GCE-style full APIs, an existing server with LXC or Docker in place, etc.), and between arbitrary logical machine and service topologies. See docs @ http://stani.sh/walter/cims/ and early architectural forethought @ http://stani.sh/walter/pfcts/ Its scope was arguably broader (ease of system administration, explicit support for integrating with existing CI/CD processes), and to date I still believe the architectural paradigm is superior. Unfortunately my former employer decided to ask me to pay them to release the code, so I can't release it as open source. But the high-level docs have been open since early on, so knock yourself out on a clone. I may even write one myself in future.

Storage-wise, it had LVM2, ZFS, and loopback drivers. I never bothered with overlay; it just used a generic clone API on the storage driver to spin up identical VMs before modification. That's very easy with snapshot/thin-provisioning-capable backends like LVM2 and ZFS. Loopback just used cp, but because the images could live in memory they could also be very fast (if memory-hungry). Cloud-wise, we'd implemented a few providers, and I was also iterating an orchestration system based on internal requirements (high security/availability) and existing, proven solutions (pacemaker/corosync). It was designed for fully repeatable builds, something Docker only began to add at a later date.
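A rough sketch of what such a generic clone API can look like (illustrative Python under my own assumptions, not the actual cims code): snapshot-capable backends clone via copy-on-write, while loopback falls back to a plain copy.

```python
import shutil

class StorageDriver:
    """Generic clone interface: give me a source volume, get back an
    identical, independently writable copy."""
    def clone(self, src: str, dst: str) -> str:
        raise NotImplementedError

class ZfsDriver(StorageDriver):
    def clone(self, src, dst):
        # A real driver would shell out to ZFS; cloning a snapshot is
        # near-instant and thin-provisioned (copy-on-write).
        return f"zfs snapshot {src}@base && zfs clone {src}@base {dst}"

class LoopbackDriver(StorageDriver):
    def clone(self, src, dst):
        # No snapshot support: plain copy. Slow on disk, but fast when
        # the loopback image lives on tmpfs, at the cost of RAM.
        shutil.copyfile(src, dst)
        return dst
```

The point of the abstraction is that the orchestration layer only ever asks for a clone; whether that is a CoW snapshot or a cp is the driver's business.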


One can get a very similar experience with Docker. The idea is to build a single Docker image containing all the software one needs on the server, and to start every container from it with the read-only flag. Persistent data then lives in volumes.


That's not really the same thing. What about different applications that share dependencies? With Docker, they would have to share an exact prefix of a linear Dockerfile to get any deduplication, so you're back to image layers. It's telling that the proposed solutions involve completely avoiding one of Docker's primary features; maybe Docker isn't very well designed to begin with.

With Guix, any sub-graph that is shared is naturally deduplicated, because we have a complete and precise dependency graph of the software, all the way down to libc. I find myself playing lots of games with Docker to take the most advantage of its brittle cache in order to reduce build times and share as much as possible. Furthermore, Docker's cache has no knowledge of temporal changes and therefore the cache becomes stale. Guix builds aren't subject to change with time because builds are isolated from the network. Docker needs the network, otherwise nothing would work because it's just a layer on top of an imperative distro's package manager. Docker will happily cache the image resulting from 'RUN apt-get upgrade' forever, but what happens when a security update to a package is released? You won't know about it unless you purge the cache and rebuild. Docker is completely disconnected from the real dependencies of an application, and is therefore fundamentally broken.


Docker needs the network only when one uses a Dockerfile for deployment, which is a rather bad idea. Instead, prebuilt Docker images should be used. That allows verifying them on a development/testing machine before deployment. With this setup, all the "bad pieces" of Docker stay on the developer's notebook; in production everything is read-only and shared among all containers.



