BorgBackup: Deduplicating Archiver (borgbackup.org)
258 points by colinprince on Nov 26, 2019 | 103 comments


We[1] built borg into our environment[2] as soon as it was stable, released software. In the years since, it has (ironically) supplanted rsync as the de facto standard that our users back up to us with.

As one of our users has said[3], borg is "the holy grail" of backups: it does everything rsync always did, and produces remotely encrypted backups that the provider has zero insight into.

It also avoids the inefficiencies of the older duplicity software.

If you are willing to go without (borg specific) technical support and do your retention with borg instead of our zfs snapshots, there is a special, discounted rate available.[4]

[1] rsync.net

[2] https://news.ycombinator.com/item?id=17408624

[3] https://www.stavros.io/posts/holy-grail-backups/

[4] https://www.rsync.net/products/borg.html


Borg is definitely not the holy grail of backups, unless one has very loose requirements.

Desktop backup solutions are typically compared against a standard set of capabilities (block-based, safe on untrusted repositories, real-time backup, compression-efficient, and optionally a functional GUI), and open source solutions are strangely lacking against that set, each in one way or another.

> produces remotely encrypted backups that the provider has zero insight into.

If you're implying an untrusted repository, this is not entirely correct. From https://borgbackup.readthedocs.io/en/stable/internals/securi...:

> in a multiple-client scenario a repository can trick a client into reusing counter values by ignoring counter reservations and replaying the manifest

The last discussion on HN was actually significantly more critical: https://news.ycombinator.com/item?id=18952839.

Borg also has no efficient compression, as it doesn't use multithreading. There are multiple GitHub issues on this subject; I've just checked, and they're closed.


> Borg also has no efficient compression, as it doesn't use multithreading. There are multiple GitHub issues on this subject; I've just checked, and they're closed.

Depending on your use model, this may or may not affect you. Whether you use multiple threads to compress a single stream or not probably matters a lot less if you have 15-30 backups going on at the same time.


Nice to see someone offering public borg hosting! I have two security questions:

(1) Are you doing any hardening of the borg remote side? The server is pretty complex and it has lots of code paths which can be exercised by a potentially untrusted client.

(2) Is there a way to protect against compromised clients (like a cryptolocker)? borg offers "append only" mode, but that does not allow pruning (obviously). And trying to run "prune" over a separate trusted connection will commit the untrusted changes as well.


   inefficiencies that the older, duplicity software has.
As much as I love duplicity, it does sort of suck once you get into larger numbers of files and gigabytes of data. It's so darn slow that sometimes it's almost unusable. Good to know borg is better; I've been meaning to check it out!


It's incredibly slow (backing up 1.5TB took more than a day), but the built-in support for PAR2 makes it worth it (for me).

Sadly, it seems like borg has no built-in support for any kind of redundancy [1]

[1] https://github.com/borgbackup/borg/issues/225


Duplicity and Borg use two very distinct technologies under the hood. Duplicity produces differential partial backup archives (restoring requires the full backup AND the full chain of previous differential archives), while Borg uses deduplication. So yes, Borg is much more efficient in many ways, not because it is newer or somehow more efficiently coded, but because it uses a different technology and backup model.
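The dedup model is easy to sketch with a toy content-addressed store: chunk the input, hash each chunk, and store a chunk only if its hash is new. Fixed-size chunks are used here for simplicity (borg actually uses variable-size, content-defined chunking), and all paths are temporary:

```shell
# Toy content-addressed chunk store. A second backup of unchanged data
# adds zero new chunks -- the property that makes every backup
# effectively a "full" backup at incremental cost.
set -e
store=$(mktemp -d); work=$(mktemp -d)
head -c 4194304 /dev/urandom > "$work/data"    # 4 MiB of test data

backup() {
    split -b 1048576 -d "$1" "$work/chunk."    # fixed 1 MiB chunks
    for c in "$work"/chunk.*; do
        h=$(sha256sum "$c" | cut -d' ' -f1)
        [ -e "$store/$h" ] || cp "$c" "$store/$h"   # store only new chunks
    done
    rm -f "$work"/chunk.*
}

backup "$work/data"; first=$(ls "$store" | wc -l)
backup "$work/data"; second=$(ls "$store" | wc -l)
echo "unique chunks after first backup: $first, after second: $second"
```

Running it should print the same count both times, since the second backup finds every chunk already present and stores nothing.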


I've been using borg to back up small things, like schoolwork, config files, mail, and code, to rsync.net for ~1 year now, and can vouch that it's pretty simple to set up. If you've got a simple use case like I do, the example config in the borg tutorial is pretty much all you need. I've got it set to run automatically as a cron job, and I haven't had to touch it except to occasionally do sanity checks that everything is being backed up correctly.


If your customers back up multiple servers, are they still under the same rsync.net account and could they see each other's data?

This is one of the issues I aimed to solve with BorgBase.com[1]: every single repo has its own backup user and can't see other repos. This separation allowed me to add some Borg-specific features, like append-only mode, monitoring for stale backups, or pinning a specific Borg version.

1: https://www.borgbase.com


rsync.net has "subaccounts" which allow you to control side-by-side backup directories using standard UNIX permissions and ownerships. That is, if the default configuration is not exactly what you'd like.

Subaccounts even get their own .ssh/authorized_keys file.

It's all very unix centered and command-line focused. There's no web interface for features like this.


I see that you offer "append only mode" as a core feature. How do you deal with pruning of old backups in this case?

Regular borg only offers two options: either pruning does not decrease the space used at all, or pruning permanently commits changes, negating any security advantages. Both of those seem bad enough to prevent offering "append-only" commercially?


I use borg with both rsync.net and borgbase.com. Both are great.


Why aren't you price-competitive with Google Drive, Amazon Drive, Wasabi, etc.? Their rate is ~$0.005/GB/month once you start buying 1TB+.


$5/TB is getting pretty common now (Backblaze B2 comes to mind) and if you roll your own you can get ~1.6EUR/TB from providers like Hetzner (https://www.hetzner.com/dedicated-rootserver/matrix-sx).


How do you arrive at 1.6€/TB/month?

In your link the cheapest option is 76€, with an 82€ setup fee.

EDIT - I see it now; I had read 4 TB for the cheapest option, but it is actually 40 TB !!

Too much for my typical home user needs, though :-)


How reliable is Hetzner? Is there major downtime? Is there enough bandwidth (around 100Mbps minimum during congested times)?


In my experience pretty reliable. I've not seen major downtime and bandwidth will depend entirely on routing to you in particular (FWIW my ISP peers with Hetzner directly and congestion hasn't been an issue, I can typically saturate my 1Gbps connection at any time of day).


> borg is "the holy grail" of backups as it does everything rsync always did

I tried borg for some time for full backups from / and there were 3 things I didn't like about it:

1. Mounting and navigating a snapshot was extremely slow. Like waiting a few seconds for any `ls` to finish.

2. It needed the encryption password for many things I think it shouldn't.

3. The resulting backups are very opaque.

I've since switched to making my backups by rsync'ing to btrfs subvolumes. I snapshot the previous backup and rsync the differences into the new one. I get to navigate each snapshot with the same snappiness as any other directory in the filesystem. btrfs can also instantly tell me which files differ between 2 snapshots, no matter how huge they are. I don't remember all the options that borg offers on syncing, but I doubt it has the same breadth of options as rsync. So for me, rsync + COW filesystem > borg.

I am a little concerned with my use of btrfs, but I'm planning to do the same thing using XFS's reflinks for some disk and filesystem redundancy.
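The scheme described above can be sketched as a short script, assuming a btrfs filesystem mounted at /backup with an existing /backup/latest subvolume (all paths illustrative; run as root):

```shell
#!/bin/bash
# Snapshot-then-rsync: each backup starts as a writable snapshot of the
# previous one, so unchanged data is shared copy-on-write and every
# snapshot can be browsed at native filesystem speed.
set -e
today=$(date +%F)

# Start today's backup as a snapshot of the most recent one.
btrfs subvolume snapshot /backup/latest "/backup/$today"

# Overwrite only what changed since the last run.
rsync -aHAX --delete --numeric-ids \
    --exclude=/dev --exclude=/proc --exclude=/sys \
    --exclude=/tmp --exclude=/run --exclude=/backup \
    / "/backup/$today/"

# Repoint "latest" at the snapshot just made.
btrfs subvolume delete /backup/latest
btrfs subvolume snapshot "/backup/$today" /backup/latest
```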


How do I get rotation schemas to work with borg? Say I want to keep the last 3 daily snapshots, one per week for the last 4 weeks, and one per month for the last 6 months...

I understand this is kinda hard with "zero knowledge" encryption, but it is possible, and not a wild feature to request from the self-proclaimed holy grail of backup software.


The prune option is what you’re after there

https://borgbackup.readthedocs.io/en/stable/usage/prune.html
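For the retention schedule asked about upthread (3 daily, 4 weekly, 6 monthly), the invocation would look roughly like this (the repository URL is a placeholder):

```shell
# Keep the newest 3 daily, 4 weekly and 6 monthly archives;
# anything not covered by one of those windows is deleted.
borg prune --list \
    --keep-daily 3 --keep-weekly 4 --keep-monthly 6 \
    ssh://user@backup.example.com/./repo
```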


Yes, but which machine will run the prune command? It cannot be the same server as the one you are backing up, since one of the goals of the backup is to protect the server if it is compromised.


You can run the command from inside your backup script, on the client. The client needs the encryption key anyway to add backups, so this is just one extra command that does all the pruning (quite quickly as well). And you don't have to worry about full/incremental backups, nor really about any "rotation" or managing archives manually; it's all deduplicated, and Borg just keeps X number of daily/weekly/whatever snapshots (like duplicacy, like restic, like bup, etc.).


Yes, I got this. But if your server is compromised, then the attacker can erase your backups, which ruins the point of doing backups in the first place. This is why there is an append-only mode.


What I do is to have two different SSH keys. The one that is used by the backup script is --append-only. The one that I keep locally on my notebook has full access, so I `borg prune` from there when I see the disk usage reaching critical levels.
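With borg, this split is usually enforced with per-key `command=` restrictions in the server-side authorized_keys file, so the key on the backed-up machine can only run an append-only `borg serve` (`--append-only` and `--restrict-to-repository` are real `borg serve` options; the keys and paths here are illustrative):

```
# Key used by the backup cron job: append-only, confined to one repo.
command="borg serve --append-only --restrict-to-repository /backups/repo1",restrict ssh-ed25519 AAAA... backup@host

# Key kept on the notebook: full access to the same repo, used for prune.
command="borg serve --restrict-to-repository /backups/repo1",restrict ssh-ed25519 AAAA... admin@notebook
```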


Don't rely on append-only mode in Borg; it does not really work as advertised. Every transaction in that mode is recorded in a log, which you can roll back to if you are very careful (but even reading or monitoring it is a PITA; the "interface" requires you to carefully track file ids and check them against dates, etc.).

And being careful is difficult, because once you run any other backup command from a full-access account (for example, to do pruning), Borg will automatically, with no warning, go through the log and apply it. You should really, really read up on that functionality before relying on it; the way Borg has implemented it is close to an anti-feature.

Regarding compromise of strictly the server itself, I believe there are commands to check the state of the repository? Isn't that enough?


I don’t know the best method, but my method is to have a locked down backup box in the same DC as the machine(s) it’s backing up.

It mounts the machine to be backed up and runs Borg on that.

The backup machine does need locking down in and of itself, but that's a lot easier to do than locking down something public-facing.


If we are still talking about rsync.net, and an attacker gains access to your account and deletes your borg archive, you can still restore your backup from the rsync.net ZFS snapshots.

The snapshots are not deletable even with your full credentials.


It says 1.5c per GB per month but then starts at a 100GB minimum. Any options for users who want to use less storage than that?


Generally it's not worth using a credit card for something less than $1.5 per month. Various cloud services will give you 100GB for free.


There are so many backup systems out there these days, but none that are quite perfect. The closest I've found is `rdedup`[1], which may no longer be maintained. What I'd like in a backup solution is:

- compressed

- deduplicated

- encrypted using asymmetric encryption, so that the encrypting machine doesn't need to know a password

- durable (using par2 or directly supporting repository repair from an independent backup)/signed

- composed of standard open source tools so that if the backup software goes away, everything can still be retrieved

- supports differing upstream data hosting providers

- open source

There's lots of things that fill most of those requirements, but none that tick all the boxes.

- rdiff-backup does all of the above except deduplication; it's full + incremental which requires a management strategy

- restic doesn't compress, doesn't use asymmetric encryption and doesn't use par2 or similar tools for durability

- borg doesn't use asymmetric encryption and doesn't use par2/durability stuff; it's also pretty slow

I know people call borg the "holy grail", but I think we're still a wee way off that emerging, personally -- even though I do really like (and use!) borg.

[1]: https://github.com/dpc/rdedup


Yeah, I've been searching for a great general archiver for a long time. I tried borgbackup a couple of years ago and immediately hit a handful of issues that put me off, though I can't really recall the specific technical details clearly (I'm vaguely recollecting something about incompatible block sizes).

Whatever the details were, I came away with the distinct impression that the borgbackup developers didn't really respect the gravity of archived records and long-term storage. While that's probably not a totally fair conclusion from a one-shot use, in my defense, I'll offer up the latest version of their changelog [0], where:

* the first thing listed is an apparently serious data corruption bug that lived through several stable releases

* the second thing listed is an apparently-very-serious security vulnerability ("a flaw in the cryptographic authentication scheme used in Borg allowed an attacker to spoof the manifest ...")

* the third thing listed is another data corruption bug titled "Pre-1.0.9 data loss"

* the fourth thing listed is another data corruption bug titled "Pre-1.0.4 potential repo corruption"

Note these are all post-1.0 versions. To be frank, I dare not scroll further.

Users are depending on this software to safeguard important files with the assumption that it will be able to reproduce them bit-for-bit-intact some years down the road. Any long-term storage software requires developers with a fanatical devotion to compatibility, longevity, and integrity if it's ever going to be more than a toy.

I hate to pile on open-source devs who're trying their hand at making software that a lot of people appreciate, but overall, it's hard for me to say I regret the choice to pass it up and stick to combinations of ZFS, rsync, and the trusty old tarball.

[0] https://github.com/borgbackup/borg/commit/75dcf9356334188276...


I don't know how to really judge software from its changelog; I think it's genuinely hard to do, and this post isn't a great example of it either.

This is the changelog of backup software whose main job is data storage and encryption. If it ever has bugs worth talking about, they're going to be in... data storage and encryption. For comparison, look at rsync, which is much older, simpler, and more mature: https://download.samba.org/pub/rsync/src/rsync-3.1.1-NEWS

- fixing .. traversal in V3 (how did that survive so long?)

- issues with copying attributes

(There won't be much data corruption since rsync does 1:1 copies)

Time-to-fix could give us a better idea maybe? Either way, software doing X has bugs in X is "normal".


I mean, the issue isn't so much that bugs exist -- like you say, that's par for the course, even among the best developers. It's about a large incidence of severe eat-your-data bugs in such a short span of time (roughly a year) in something that's quote-unquote stable and whose entire purpose is long-term data storage and retrieval.

It's even worse because unlike rsync, which synchronizes filetrees from point A to B and can be immediately confirmed to have either worked correctly or not, borg uses a bespoke storage format that's not easily verified or operated upon by standard utilities. You just have to trust it to pull the correct data out when you need it. That's a heavy burden to put on a tool, and the borg devs seem to be struggling under its weight.

Consider the standards expected of filesystem or database maintainers. borg repos are not really that different: they're big, opaque chunks of bytes that you expect to produce specific data on demand with perfect reliability.

These types of systems don't get marked stable when they're still in primordial, eat-your-data development phases, and on the small handful of unfortunate occasions when data loss bugs sneak in, they're a) usually limited to some bizarre corner case; and b) taken extremely seriously, almost to the point of solemnity, and often result in major overhauls to a project's validation and QA routine.

If any stable filesystem or database had 4 widely-applicable database corruption bugs within a year, it'd be lights out for that project. All trust lost, reputation irreparably ruined, angry letters to a variety of mailing lists, permanently-increased scrutiny on any new projects or maintainers among the same class of projects, etc.

All I'm saying is that based on my admittedly-murky prior experience, the project's observed track record, and the high standard of care that must be met to qualify a project for storing authoritative copies of data, I'm personally not comfortable trusting borg with anything important any time soon. No one is obliged to share that evaluation, of course.



What about restic? I'm pretty happy with it


You're happy with backup software that does not have compression? Really? You don't pay for storage?


If compression is purely a matter of price, things are not so obvious.

I think Backblaze B2 is the cheapest, but it's not supported by Borg. So a backup system without compression support, but with B2 support, may ultimately be cheaper than a system with opposite features.

It's also important to consider the use case. If the bulk of a user's data is photos/videos/audio files, there's virtually no use for compression; but of course, the proportion will vary per user.


I’m using restic too for part of my backups, having moved from duplicity. My experience (on a dataset which is relatively compressible, i.e. not audio/video files) has been that deduplication saved me quite a lot more space vs. duplicity than I lost through not having compression.


Too late to edit now, but I meant "duplicity" rather than "rdiff-backup" in the above post. Woops.


Comparing duplicity to a deduplicating software suite is not really fair; they don't even provide the same functionality (mounting snapshots, handling files moved around on disk, space-efficiently backing up files from different machines, smart and efficient pruning, speed of access (impossible for duplicity to achieve), etc.).


What about rclone? Not advocating, just want your opinion since you seem like you researched the topic


rclone is rsync-on-steroids, and you can use it for backups, but it's not the same shape of tool as the above tools. It's effectively "copy these files from $here to $there" where $here and $there are any cloud provider or network filesystem. It's got a built-in encryption layer and some other bits, but you'd still be responsible for ensuring $there is actually $there/backups/(date +%F) or whatever.

I'd treat it as a tool in a bigger backup system (which is roughly what rdedup does with it).


Rclone is not a backup system, it is a sync system. It does not do any archiving, deduplication, pruning, etc. It just syncs two directories (with possible encryption added on top). It can be efficiently utilized as an extra final step in a backup plan.


Don't you want to dedup first, then compress?


Well, with `gzip --rsyncable` and similar modes (most compression tools have one), it's technically possible to compress first, then dedup! To be precise, it's "chunk first, then compress", which still admits a second "chunk, then dedup" pass to reuse data, though not as efficiently as a combined "chunk + dedup + compress each chunk"...

`gzip --rsyncable` and friends are mostly useful when you already have compressed files on your filesystem and want to sync/back them up byte-for-byte. Of course, the benefit is lost unless most compressed files in the world are produced with such a mode, and sadly most aren't :-(

The author of `zsync` ran various experiments confirming that gzip --rsyncable followed by sync is sub-optimal, and implemented a somewhat crazy "look inside" approach that can sync compressed files byte-for-byte AND very efficiently:

> gzip --rsync does fairly well, with both rsync and zsync transferring about 410kB at the optimum point. zsync with the look-inside method does much better than either of these, with as little as 140K transferred.

-- http://zsync.moria.org.uk/paper/ch03s03.html

IIUC, though, it's more of a "2-file sync" scenario like rsync, not applicable to the "chunk everything, then dedup" approach of borg and similar tools.


Probably, yes. Sorry, I should’ve said “in no particular order” — this is part of the reason why I haven’t tried to tackle building something myself :).


I offer dedicated Borg hosting at BorgBase.com[1] from $5/TB. As opposed to normal SFTP-based backup services, every backup repo is fully isolated. This allowed us to add features like append-only mode and monitoring for stale backups.

We have also developed a Qt-based desktop client[2] that runs in the system tray and makes it easier to browse archives or do restores.

For headless deployments, I highly recommend Dan's Borgmatic[3]. You could deploy it all together with our Ansible role[4].

1: https://www.borgbase.com

2: https://vorta.borgbase.com

3: https://torsion.org/borgmatic/

4: https://github.com/borgbase/ansible-role-borgbackup


I switched from bup backup to borg backup, because it generally works better for my use cases; however, there are two things bup does do better:

1. Remote backup over ssh, especially deduplicating between different clients, and especially concurrent client backups.

borg needs a compatible version installed on the other side (and compatibility has been broken between versions in the past); bup uses bare-bones ssh+sftp, so the other side can basically be anything.

borg will download a lot from the server to the client if multiple clients back up to the same repository (essentially every time, in the most common use cases); bup will download minimally.

borg maintains a repo lock, so multiple clients backing up to the same repo will be serialized; bup does not, so it can be concurrent.

2. Storage format

borg's format is its own; bup's underlying data format is basically a git repo (which you can treat as such; you may need to manually apply "cat" to rebuild files, but "git" and "cat" are all you need).

I have heard good things about restic, but did not have a chance to evaluate it myself.


restic doesn't perform compression. Depending on your use case (I use it for photography backups, which don't compress well), that might be OK -- but it's something to be aware of.


In my testing (early 2019), NONE of restic, rclone, borg, duplicity, or tarsnap was able to handle anything other than hobbyist workloads.

restic fell over hard at around 100 terabytes; the others at around 500.

I'm backing up a little over 3 petabytes. I use Bacula. It's awful, but I haven't found anything else that can deal with that kind of volume.


When I was backing up several 100 tb, I tried every open source option at the time and found them all lacking.

What ultimately worked? Plain old rsync over ssh, to a zfs pool with snapshots and compression.

As far as I could figure, the only notable downsides are that the storage device must be trusted, since it has access to all of the data, and that you effectively need root permissions on the storage you're copying to in order to preserve filesystem permissions, which makes a multi-tenant backup server cumbersome (you could chroot or use containers or something, but these solutions get fiddly, e.g. running multiple instances of ssh on nonstandard ports to enable multitenancy).


I've been very tempted to dump bacula and do that, especially since my main disk-based storage is an Oracle ZFS appliance.

But I also do backups to tape (LTO-8, in a Storagetek library), and I do like that Bacula handles that for me.


Doesn't rsync also slow to a crawl if there are too many files, due to memory requirements? Is there a way around that?

EDIT: Ah I see rsync 3.0.0 (released in 2008) fixed this issue. Maybe I was on an old version.


> I use Bacula.

Did you test 'Burp'[1]?

[1] https://burp.grke.org/why.html


So restic has at least gotten somewhat better with several upstream PRs getting us past 100TB usability.


Borg backup is amazing. I've been using it for years now, backing up a few terabytes of data from a few dozen VMs.

The dedupe is downright magical. We've been able to remain super frugal on the storage allocation for backups solely because of how wonderful its dedupe is.

Restores are also really nice and easy since they are just a FUSE mount.

If you have a Linux host or hosts, I can wholeheartedly recommend borg. It's elegant, robust, and fast. And open source.


I benchmarked borg's and restic's deduplication on a few datasets and restic was the winner (1-2 years ago). Did you do any comparisons with other deduplicating backup solutions? (I benchmarked efficiency, not speed; the test data set was a few 100GB.)


We were previously using Bacula, which was way too heavyweight for what we wanted. I found Borg after looking into Attic, and wasn't aware of Restic at that point.

Compared to Bacula and BackupExec, Borg was lightning fast and its disk usage was very frugal.


Use, love, and support[1] borg.

[1] https://liberapay.com/borgbackup/donate


I'm still not totally happy with most backup solutions; maybe I want too many features all at once (something like Perkeep). If I instead combine simple tools to cover all the features, there will be lots of overlap, which maybe isn't too much of a problem, actually.

I collected a list here: https://github.com/albertz/wiki/blob/master/backup-software....


I use BorgBackup to make a local backup on linux and mac machines and rclone to upload that backup to the cloud.

It took me some time to figure it all out, so I wrote a script to avoid manually repeating the configuration steps next time. Although it's quite opinionated, it still grew to 600 lines.

Feel free to check it out, maybe it can help someone https://github.com/senotrusov/backup-script


I'm using borg and I honestly like it, though I'm not ready to rely on it alone; I follow the 3-2-1 principle of backup strategy, and borg is a real space and bandwidth saver! My collection grew from a few GB to some TB. The repo is already more than a year old, and I was wondering whether it's better to start a fresh new repo every year or to let the original one grow until it reaches a certain size. Any recommendations from people who have used it for 2+ years?


BorgBackup is great.

One limitation that isn't mentioned often is that when doing anything with your borg repo, RAM use increases with the number of files backed up. In my case I had 15 million files, and mounting the repo took quite a bit of time (minutes) and used 11GB of RAM.

Restic also has the same issue.


Has anyone done a pricing comparison between borgbase, rsync.net, AWS S3, and Wasabi for offsite borgbackup storage?


We try to stay current and competitive with pricing - currently at 1.5 cents/GB/month for the discounted, "no support"[1] "borg accounts":

https://www.rsync.net/products/borg.html

What is S3 these days? 2.x cents? Plus traffic? We don't charge for traffic/usage/bandwidth in any way...

[1] You do get technical support, just not specific support for setting up your borg backups, which can be fairly complicated ...


I (sadly) moved from you guys to Backblaze, as they're down at $0.005 (ie. 0.5 cents) per GB. They will charge me (1 cent/GB) if I have to restore the backup, but this is my extreme off-site everything-blew-up-including-the-backups-closer-to-me backup, so, hopefully that won't eventuate.


We will always be more expensive than B2, Wasabi, et al.

You're not ever going to get immediate, personal technical support from a UNIX engineer at those services like you do at rsync.net.


Fair enough. I still use and like your service for a second off-site backup of ~40GB of very very important stuff, but was unable to ignore the cost savings for bulk data.


I've been using Borg for a couple of years now (and before that, Attic[0]) and I have been very happy with it. It's seen me through three laptops and it has been reliable and easy-to-use.

[0]: https://attic-backup.org/


I use borg extensively as well. One thing I haven't found a good solution for yet is monitoring.

How do you all keep track of whether your backups succeeded? I'd like to receive an email if a scheduled backup didn't run. The only thing I have for now is rsync.net's feature where they warn you if your data hasn't changed by X kB in the last Y hours/days, but I find this lacking because multiple machines write to my rsync.net account. It will only warn me if none of them made any change; I'd never know if only a few failed. Likewise, if there is nothing new to back up, I get an email from rsync.net but can't tell whether every backup job failed or there were simply no changes.
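One common approach is a dead-man's-switch: have the backup script itself ping a monitoring endpoint, and have that service email you when a success ping fails to arrive on schedule (which also catches jobs that never ran at all). A minimal sketch, with placeholder URLs for whatever monitoring service you use:

```shell
#!/bin/sh
# Run the backup, then report the outcome to a monitoring endpoint.
if borg create --stats \
    ssh://user@backup.example.com/./repo::'{hostname}-{now}' /home /etc
then
    curl -fsS -m 10 "https://monitor.example.com/ping/my-backup-id"
else
    curl -fsS -m 10 "https://monitor.example.com/ping/my-backup-id/fail"
fi
```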


How does this compare against S3QL? Does it do multithreaded compression when backing up?

So far S3QL is the least worst of all the deduplicating/compressing/encrypting backup solutions I have tried, but I haven't tried borgbackup. Despite its stated focus on cloud object storage, S3QL also works great with NFS, local filesystems, and sshfs as a target.


I see that BorgBackup is available in Chocolatey[0] and it looks like the current version. It's not clear to me if this is an official port or not, but I am interested. :-)

[0]: https://chocolatey.org/packages/borgbackup#testingResults


How does it compare against tarsnap?

https://www.tarsnap.com


Is that $1/4GB/month? Plus bandwidth at the same cost on top of that? That seems wildly expensive.


I had never heard of borg backup, seems like a great backup tool.

Now, I just installed it on my mac and running `borg --help` takes about 5 seconds before outputting the help.

Same for any other `borg ...` command.

I'm not quite sure why, but that's the only command I've noticed to be running slow on my system.

I'm running borg 1.1.10.


That's the Python interpreter and all the dependencies starting up. It shouldn't really matter if you are running it as a cron job.


I see. The screencast on their website made it seem very fast (which it may have been on a beefier machine than mine?), so I was surprised.


It's edited.


Pretty sure it's not.

I've never seen borg take 5+ seconds for anything except creating backups. The same `borg --help` command finishes in 0.4s on my machine. That's not blazing fast either but I'm okay with it.


They have been updated; the old screencasts were edited.


Is it based on Python?



Most of the C code is vendored third-party libraries; there are only 2000-3000 lines of C that are actually part of the project.


3% magic?


Does it care about the underlying file systems? I have one Linux machine I need to backup a directory on, and dedupe would be a huge help on this one in particular, but I'm backing up to an SMB share that's mounted to a folder location.


Not generally, but I wonder if the locking will still work in your case.


Curious if this name is going to draw the ire of Paramount's copyright lawyers.


It is named in honour of Jonas Borgström [0], who wrote Attic, from which Borg was forked.

[0] https://github.com/jborg


Borg? Hardly; there is already a Google product with the same name.

But more importantly, it is the name for a castle in several languages. So unless Paramount is planning to sue a lot of very old places in Norway and other countries...


> there is already a Google product with the same name

Borg at Google is not a product per se; it's an internal tool/service. Pretty sure you can name internal tools whatever you want.


Google has a system called Borg, and they are fine.

Also, BorgBackup is not new; they have been around for a while.


My understanding was that the internal Borg was fine because it wasn't publicly available, versus this which is.


The word Borg itself I do not believe is trademarked.

I didn't think you could trademark someone's last name?

https://en.wikipedia.org/wiki/Bj%C3%B6rn_Borg

A search of borg in the patent office shows the name used a lot: http://tmsearch.uspto.gov/bin/showfield?f=toc&state=4806%3Av...


Why not?

If you look at https://en.wikipedia.org/wiki/Apple_(name) , the surname 'Apple' is used quite a bit.



Wow... that sucks. If I ever bought anything from Nissan Motor, I would have to reconsider.

I hate how easily corruptible trademark/copyright/patent rules are


I wonder if I'm the only one with this phobia: I see a tool written in a dynamically typed language and automatically trust it less. Especially something that is responsible for all my data. I wonder if this is a good candidate project to be rewritten in Rust.


It actually would be, for several reasons, which I could outline if people are interested (but I suspect they are not, so I will not go to the effort pre-emptively). Also yes, there have been a number of bugs specifically due to both the dynamic typing of the code and the dynamic data structures (msgpack, basically JSON) used.


I used borg for a while but the bugs you mentioned and some of the discussions in the Github issues gave me pause. For my taste, there is still too much development going on for me to rely on it as my main backup solution.


Software written in dynamically typed languages can be robust, just as software written in the most strongly typed language can still be buggy.

As a user of such a critical piece of software, I would ask the following questions:

* Is the software popular?

* For how many years has it been battle-tested?

* Are there many reports of data corruption?

* Is it well maintained?

* Do the maintainers provide support?

Borg/Attic has been in wide use for many years now and checks all the boxes above. I don't think I've ever read about data corruption that was caused by a bug in it.

Which language it is written in is nothing more than a mere curiosity for me.


Does it have a test suite? Does it use analysis tools?


I feel that if someone wrote a backup solution from scratch in Rust, that could be a good selling point.

Emphasis on scratch not on RIR ("rewrite in Rust") ;-)

It would need to respect heterogeneous networks (Microsoft Volume Shadow Copy, for example), and networks at all, meaning global deduplication, full volume imaging, and file-level backups.


And what tool do you use currently? Active development and large user base seem way more important than what language they used.



