2

Conventional wisdom has it that creating backups by using dd if=/dev/mmcblk0 ... from a live filesystem is prone to error. Changing a filesystem that's being copied as an image will result in a corrupted image that at best will fail immediately on fsck or mount, and at worst will appear to be safe but will have left important files silently corrupted. This is what I've understood since first using UNIX way back when.

Pi documentation, repeated many times by others, seems to state that this dd approach is not only acceptable but indeed recommended. Here are two of the many posts I've seen:

I'm ignoring snapshot filesystems, since most Pi systems default to ext4.

I am also not looking for a backup solution, or suggestions of alternatives such as using LVM or rsync.

Has the Pi kernel got a feature that pauses writes to a block device while it's open for reading from user space? Or does the advice rest on the assumption that "not much" generally writes to the filesystem?

If not, how are these two positions reconciled?

roaima

4 Answers

4

Pi documentation, repeated many times by others, seems to state that this dd approach is not only acceptable but indeed recommended.

This is because many of the people who got in earlier with the Pi and were eager to blog about their expertise were completely new to both the device and GNU/Linux and often regurgitated information that they were in no position to evaluate the veracity of.

I've been a Linux user since the late '90s, and before I became a moderator here I was very active at Unix & Linux, making it into the top 5 members there (I've since slipped out). My point is to emphasise that I've been an observer and participant in the online Linux user community for longer than the Pi has existed, and in the online Pi community for almost its entire lifetime.

Particularly early on, the quality of material about Linux that targeted Pi users was often very poor and fairly obviously the product of the phenomenon described in my first paragraph. The reason this stuff became widespread is that Pi newbies, when searching for information online, use "Raspberry Pi" instead of "Linux" as a search term when the information they want is really about the latter and has no specificity to the former: the blind leading the blind. Going with the dd example, it's often things that may work well enough for some people some of the time, and are easier to cargo cult than the alternatives.[1]

It has improved but those memes remain pernicious. TBH, I would a little bit fault the Raspberry Pi organisation; their own documentation and blogs early on sometimes had a tone that to me implied a sort of resentment that Raspbian (which was initially independent of them) had quickly become the operating system of choice for the device (I'm guessing they were hoping for RISC OS, but again all of this is my own conjecture).


  1. Another example is the use of /home/pi/.profile to start programs at boot. This proliferated because autologin is commonly enabled for the pi user, so it "works" in that context, but it can have unintended side effects if you log in via ssh etc., and is certainly not a good idea.
goldilocks
  • 2
    I'm also a longtime user of UNIX systems long before even Linux existed, let alone the Pi. Looking to see if I'd missed something or my expectation was still valid (it seems so) – roaima Oct 26 '21 at 15:47
4

Your "conventional wisdom" is right. The fact that someone managed to copy an image from a mounted device and it worked doesn't mean it will work every single time. Actually, it doesn't even mean that the backup was not corrupt - it was in a good enough shape to be bootable, but there might have been data loss nevertheless.

Speaking of good solutions, I'm using image-utils which is a collection of scripts for backup creation and handling. It makes complete backup images preserving partition UUIDs, just as dd would, but actually uses rsync behind the scenes to copy data.

You should still keep disk activity during a backup to a minimum, otherwise you will end up with a backup which is technically consistent but broken in practice. E.g. if you start installing a package while the backup is being made, some of the files could make it to the backup while others would be missing, so the software you were installing may not work when you flash your backup image on an SD card.

Also note that dd is a tool designed to let you pick parts of the data that you need, using somewhat complex syntax and default settings which make sense for this task (like the block size of 512 bytes). If you need to copy the complete SD card, doing sudo cp /dev/mmcblk0 backup.img will be faster and less prone to typos. Of course, cp will fail to produce consistent backups when used with a mounted device, just like dd.
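To illustrate what "picking parts of the data" looks like, here is a minimal sketch. The helper name copy_sectors is hypothetical, and the device name in the comment is only an assumed example; bs, skip and count are standard dd operands.

```shell
#!/bin/sh
# copy_sectors SRC DST SKIP COUNT  (hypothetical helper, for illustration only)
# Copies COUNT 512-byte sectors from SRC into DST, starting SKIP sectors in.
copy_sectors() {
    # bs=512 matches dd's default block size; skip and count are in units of bs.
    dd if="$1" of="$2" bs=512 skip="$3" count="$4" 2>/dev/null
}

# Example (assumed device name, needs root): save only the first sector.
#   copy_sectors /dev/mmcblk0 first-sector.img 0 1
```

This kind of partial copy is where dd earns its syntax; for a whole-device copy none of these operands are needed.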

Dmitry Grigoryev
  • 1
    Thanks. I'm familiar enough with `dd` to prefer `cat` in many situations where `dd` is often misused. – roaima Oct 29 '21 at 14:21
  • cat doesn't let you limit how much is copied and doesn't handle block devices correctly, just to mention a few features of dd. Not saying I don't like cat, I just don't want anyone to get illusions that it is a good tool to copy block devices or give you any control at all of how something is copied. – user10489 Oct 29 '21 at 22:01
  • cat is my favorite editor. – user10489 Oct 29 '21 at 22:03
  • 1
    @user10489 For someone who understands how `dd` works, more control is good. But many `dd` users actually just copy-paste commands they find online without much understanding. Those people would have less trouble if they just used `cp` or whatever command they *do* understand. – Dmitry Grigoryev Nov 01 '21 at 22:05
  • *shudder* copy paste dd commands from google searches is a horrifying thought. Not that cat is better in that respect, just as much damage can be done. – user10489 Nov 01 '21 at 22:53
3

The first link is of questionable reputation.

The Tom's Hardware link seems to be describing an offline backup with the SD card in another computer, not in the Pi, let alone mounted.

Using dd to take a snapshot of a live writable filesystem is not reliable, especially for ext4fs and vfat. Using dd for an offline filesystem makes a small amount of sense for what is typically a small SD card, but is still not a great way to make backups. For a larger SD card, this may be painful, and other alternatives may be better.
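For the offline case, a rough sketch of imaging a card and checking the copy might look like the following. The helper name image_and_verify is made up for this example, and bs=4M assumes GNU dd's size suffixes. The comparison is only meaningful if nothing writes to the source between the copy and the check, which is exactly why the card should not be mounted.

```shell
#!/bin/sh
# image_and_verify SRC DST  (hypothetical helper, for illustration only)
# Copies SRC to DST with dd, then compares the two byte for byte.
# Intended for an unmounted source, e.g. an SD card in a card reader.
image_and_verify() {
    src=$1
    dst=$2
    # bs=4M uses a GNU dd size suffix; a larger block size than the
    # 512-byte default usually speeds up whole-device copies.
    dd if="$src" of="$dst" bs=4M 2>/dev/null || return 1
    # cmp -s is silent and exits non-zero if the copy differs from the source.
    cmp -s "$src" "$dst"
}

# Example (assumed device name, needs root):
#   image_and_verify /dev/sdX pi-backup.img && echo "image matches card"
```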

I can think of other reasons for making an SD card snapshot for a Pi (for instance, to prepare an image for mass distribution to more Pis), but it isn't a great way to make backups.

In general, there are many tools that are designed to back up a live filesystem on a file by file basis which is much more effective (not to mention safer) than a filesystem snapshot, especially for large or sparse filesystems. This is a well solved problem with many good solutions.

user10489
  • Can you give example in your answer of what you consider to be good solutions? – YCN- Oct 26 '21 at 12:36
  • 1
    Recommendations like that are opinion based and out of scope of this site. For small systems, you could use git to back up (and even push to another server) edited config file changes, etc. ; this would be appropriate to make reinstalling from a fresh image easy. For larger systems, you might want to use tar, rsync, or maybe even something like bacula. For a better answer, ask in one of the unix forums, not on the raspberry pi forum, as backups are not pi specific and a larger audience might give a better answer. – user10489 Oct 26 '21 at 12:40
  • Not only is the question of backups out of scope for this forum, but entire books have been written on the topic. – user10489 Oct 26 '21 at 12:44
  • 1
    @YCN- https://raspberrypi.stackexchange.com/q/5427/5538 – goldilocks Oct 26 '21 at 14:38
  • The first half of the Tom's Hardware post does indeed talk about offline. But as you head further down it also succumbs to the `dd` approach – roaima Oct 26 '21 at 21:02
  • Once you pull the SD card out of the pi and put it in another computer, dd is a fine way to make an image of it, especially if your intent is to write it back to another SD card later. It's not so great if you want to recover a file off of it. – user10489 Oct 26 '21 at 23:53
  • @user10489 yes I know, but that's not the question I'm asking – roaima Oct 29 '21 at 14:19
2

QUESTION: Are backups using dd if=/dev/mmcblk0 safe and consistent?

There are two cases that must be considered:

  1. If /dev/mmcblk0 is mounted, the answer is "Absolutely and unequivocally, no". It's been said of one who follows this practice: "If you're lucky, the filesystem corruption will be detected as soon as you try to mount the copy. If you're unlucky, it won't be detected until later."

  2. If the device is not mounted, the answer is "it can be safe & consistent".
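Telling the two cases apart can be scripted. The sketch below uses a hypothetical helper, is_mounted, that scans a mount table in the /proc/mounts format for the device or any of its partitions. It is a simple prefix match, so treat it as a rough guard rather than a guarantee: on some systems the root filesystem is listed under a different name (for example /dev/root), and the check would then miss it.

```shell
#!/bin/sh
# is_mounted DEVICE [MOUNTS_FILE]  (hypothetical helper, for illustration only)
# Succeeds if DEVICE, or any source whose name starts with DEVICE (such as
# its partitions), appears in the first column of the mount table.
# MOUNTS_FILE defaults to /proc/mounts; the parameter exists for testing.
is_mounted() {
    dev=$1
    mounts_file=${2:-/proc/mounts}
    awk -v dev="$dev" '
        index($1, dev) == 1 { found = 1 }   # source name begins with dev
        END { exit !found }                 # exit 0 if found, 1 otherwise
    ' "$mounts_file"
}

# Example (assumed device name): refuse to image a mounted card.
#   if is_mounted /dev/mmcblk0; then
#       echo "device is mounted; an image taken now may be inconsistent" >&2
#   fi
```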

"Pi documentation, repeated many times by others, seems to state that this dd approach is not only acceptable but indeed recommended."

Anyone who claims that is in error - it's just that simple. Or perhaps you've missed the fine print about the file system needing to be unmounted before using dd.

QUESTION: Has the Pi kernel got a feature that pauses writes to a block device while it's open for reading from user space? Or is it on the assumption that "not much" generally writes to the filesystem? .... If not, how are these two positions reconciled?

The "RPi kernel" (which is the Linux kernel) has no feature that makes it safe to use dd with a mounted file system [1, 2]. If that's not the answer you were seeking, please try to clarify this question as I'm unclear on exactly what you're after. And as you're well aware, at least one apparently knowledgeable person on U&L SE feels strongly that there is almost no good use-case for dd.

Seamus