Recently my SD card died, becoming permanently locked in a read-only state. I only noticed a few weeks after the fact, when I saw that files I created disappeared after a reboot. Of course, all the changes I had made during those weeks are gone as well.

I wonder if there's a way I could get an early warning on the issue: a test which I could run in a cron job or similar, which ideally wouldn't require a reboot and wouldn't trash the SD card too much.

Any ideas?

Dmitry Grigoryev
  • If you create a new file and try to save it on a read-only file system, you immediately get an error message. Don't you see that? Did you have a look at the journal? – Ingo Aug 20 '18 at 08:12
  • @Ingo The filesystem is not read-only, the medium is. SD cards don't seem to report their end-of-life state to the OS (or the OS ignores those errors). Try running [`sudo sdtool /dev/mmcblk0 lock`](https://github.com/BertoldVdb/sdtool) and see how long it takes for Raspbian to report I/O errors. Unless you run a torrent client, it will take days. – Dmitry Grigoryev Aug 20 '18 at 08:32
  • what is `sdtool` @DmitryGrigoryev – Jaromanda X Aug 21 '18 at 05:12
  • @JaromandaX click on the link. – Dmitry Grigoryev Aug 21 '18 at 07:16

1 Answer

You could do a simple write/read test with a test pattern. A good choice is the byte 0x55 (binary 01010101) followed by the byte 0xaa (binary 10101010). Writing one over the other toggles every bit.
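As a small illustration of why these two values are chosen, the sketch below XORs them to show which bits change when one pattern overwrites the other. The constant names are only illustrative and do not come from the script further down.

```python
# The two test patterns are bitwise complements of each other.
PATTERN_55 = 0x55  # binary 01010101
PATTERN_AA = 0xAA  # binary 10101010

# XOR marks every bit position where the two patterns differ.
changed_bits = PATTERN_55 ^ PATTERN_AA

print(format(PATTERN_55, "08b"))    # 01010101
print(format(PATTERN_AA, "08b"))    # 10101010
print(format(changed_bits, "08b"))  # 11111111
```

Since the XOR is 0xff, each of the eight bits in a byte is written once as 0 and once as 1.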

As I realized from the comments, the real problem is that the storage hardware does not report an error when it does not accept data, so even sync does not help. On sync, the operating system writes its buffered data to the storage and, if no error occurs, assumes that the buffer and the storage contents are identical. It then keeps serving reads from the buffer it has just written, so the failed write is never noticed.

So we have to use unbuffered I/O to check this. Python can do this with raw I/O, which its documentation describes as follows:

Raw I/O (also called unbuffered I/O) is generally used as a low-level building-block for binary and text streams; it is rarely useful to directly manipulate a raw stream from user code. Nevertheless, you can create a raw stream by opening a file in binary mode with buffering disabled.

Here is my example code. It writes a 64-bit (8-byte) test pattern, reads it back and compares it. I open the test file only once and seek to the beginning before each write and each read. Because we read and write directly to the storage, I think there is no need to close and reopen the file each time.

#!/usr/bin/python3
import sys

TEST_FILE = "/tmp/rwtst.raw"

# 0x55 = 01010101 and 0xaa = 10101010: writing both toggles every bit.
PATTERNS = (bytes([0x55] * 8), bytes([0xaa] * 8))

# buffering=0 opens the file as a raw (unbuffered) binary stream.
with open(TEST_FILE, "w+b", buffering=0) as f:
    for pattern in PATTERNS:
        # Write the pattern at the start of the file ...
        f.seek(0)
        f.write(pattern)
        # ... then read the same bytes back and compare.
        f.seek(0)
        if f.read(len(pattern)) != pattern:
            print("ERROR writing test pattern!")
            sys.exit(1)

You can call it in a cron job to check automatically from time to time.
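One caveat raised in the comments is that a read may still be served from the kernel's page cache: `buffering=0` only disables Python's own buffer. The following is a hedged sketch, not a verified fix, of a cron-friendly variant that calls `os.fsync()` after each write and then uses `os.posix_fadvise()` with `POSIX_FADV_DONTNEED` to ask the kernel to drop the cached pages of the test file before reading back. The function name `check_storage` and its `size` parameter are hypothetical, and the `posix_fadvise` call is only a hint that the kernel is free to ignore, so this has not been confirmed to detect a locked card.

```python
#!/usr/bin/python3
import os
import sys

TEST_FILE = "/tmp/rwtst.raw"  # same path as in the script above


def check_storage(path, size=8):
    """Write 0x55 and 0xaa patterns to `path`; return True if both read back intact."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for value in (0x55, 0xAA):
            pattern = bytes([value]) * size
            os.pwrite(fd, pattern, 0)  # write at offset 0
            os.fsync(fd)               # push the data to the device
            if hasattr(os, "posix_fadvise"):
                # Hint only: ask the kernel to drop cached pages of this file
                # (length 0 means "to the end of the file").
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
            if os.pread(fd, size, 0) != pattern:
                return False
        return True
    finally:
        os.close(fd)


if __name__ == "__main__":
    if not check_storage(TEST_FILE):
        print("ERROR writing test pattern!", file=sys.stderr)
        sys.exit(1)
```

The script exits with status 1 when a pattern does not read back, and an I/O error from the write or `fsync` surfaces as an uncaught `OSError`, which also gives a non-zero exit status. Either way a cron job can report the failure.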

Ingo
  • I'm pretty certain `cat /tmp/rwtst.bin` will read from cache instead of SD card. – Dmitry Grigoryev Aug 20 '18 at 11:42
  • Hmm... haven't thought of this. Just looking for a flush or sync possibility. Or reading after a delay longer than the sync period? – Ingo Aug 20 '18 at 11:56
  • @DmitryGrigoryev Seems to be simple. Just added `sync` to the script. – Ingo Aug 20 '18 at 12:01
  • `sync` does not clear caches, it just writes cached content to the disk. I'm looking at `/proc/sys/vm/drop_caches`. – Dmitry Grigoryev Aug 20 '18 at 12:10
  • @DmitryGrigoryev OK, I have realized the real problem now and have changed my answer. – Ingo Aug 20 '18 at 15:54
  • Nice find, I will check it out! – Dmitry Grigoryev Aug 21 '18 at 07:18
  • @DmitryGrigoryev I have just answered [Does this message mean my microSD card is worn out?](https://raspberrypi.stackexchange.com/q/90076/79866), so I came back to this question. Did you check it out? Does it work? – Ingo Oct 18 '18 at 07:59
  • @kaptan You asked at https://raspberrypi.stackexchange.com/a/100860/79866 if this code is doing almost the same thing as `badblocks` but at filesystem level as opposed to block device level. I don't know what badblocks is really doing. And no, I do not use filesystem level. I use low level unbuffered raw data I/O so there is no caching of data that may hide read errors. – Ingo Jul 19 '19 at 21:39
  • Does your technique reveal a flaw at only one specific location/address - or are you writing to the entire SD card? – Seamus Jan 28 '22 at 20:25