I'm going to return and exchange this Raspberry Pi 2, but I wanted to know if anyone has seen these errors before. I have one RPi2 (out of the 13 I use for my after-school program) which acts up during boot, often giving this output (it looks like a kernel panic):

kernel panic during boot

Sometimes it does not give this kernel panic, but instead gives garbled display output, like in the below image (note - the image looks like it is out of focus, but that's actually the video output from the RPi - it seems to superimpose text output with a slight horizontal offset). Note the green lines on the right side:

Fuzzy garbled output

On other occasions (typically the first boot after being switched off for a while) it seemingly boots into X, but never renders graphical output - my monitor suddenly says "No Input" and only the red light on the RPi is on. Anyone seen this before, or have an idea of what is happening here? My best bet is that the VideoCore IV GPU has an error.

PS: I'm booting Raspbian Wheezy, but get the same issue when booting a fresh Raspbian Jessie. I have tried several different SD cards, all of which boot fine on my other RPis.

UPDATE: Ran the faulty RPi using Jessie in a different setup and managed to get into the desktop. Then ran it back in my original setup (still with Jessie) and also got into desktop, repeatedly. Booting into my trusted Wheezy image (tried and true for hundreds of boots on all my other RPi's): ERROR. Put Jessie back in, boots. Only difference I can notice is that this RPi has a small label "Made in P.R.C." on it, whereas most of my other RPi's (haven't checked all of them yet) are "Made in U.K.". Is there perhaps a firmware difference?

Phil B.
  • Have you tried removing ALL peripherals (mice, keyboards, WiFi, ethernet) and use a different cable to connect to the monitor? – Kachamenus Oct 12 '15 at 16:20
  • @Shojan - yes, have changed monitor cables, SD cards, with/without WiFi dongle ... always the same result(s) on this one RPi - on any of my other RPi's (exactly the same setup) all works fine with the same SD card. – Phil B. Oct 12 '15 at 17:46
  • I'll be darned - ran the faulty RPi using Jessie in a different setup and managed to get into the desktop. Then ran it back in my original setup (still with Jessie) and also got into desktop, repeatedly. Booting into my trusted Wheezy image (tried and true for hundreds of boots on all my other RPi's): ERROR. Put Jessie back in, boots. – Phil B. Oct 12 '15 at 19:06
  • fantastic! Type that out in an answer for yourself if it fixed it, so we can make this solved. :-) – Kachamenus Oct 12 '15 at 19:32
  • Not yet there - want to know why Jessie worked but Wheezy still fails - is there something on "Made in P.R.C." boards that does not work when "Made in U.K." boards do work? Both are Raspberry Pi 2 model B v1.1. – Phil B. Oct 12 '15 at 20:14
  • Dunno, man... one of life's great mysteries... – Kachamenus Oct 12 '15 at 20:18

2 Answers

2

Only difference I can notice is that this RPi has a small label "Made in P.R.C."

You will probably find this article interesting. In relation to that, my Pi 2, made under license by RS Components (one of two official distributors, the other is Element 14/Farnell) has some of the characteristics of the P.R.C. made board pictured there (namely the HDMI jack), and a similar PCB, but the RS box does say "Made in the U.K.". The RAM is dated 1447 (week 47, 2014).

If the explanation here is bad RAM (that would be my bet), where/when the board was made is probably not super relevant; it happens to every system in time. However, it might be worth noting the rumor in that article that some of the (earlier?) RAM "needed an update to the firmware to work". So you might compare those dates (it's on the bottom Elpida labelled chip). Your first error screen does look like a memory error [1].

In any case, if you have the latest firmware and kernel installed, I'd call it a defect. If not, since you've said the difference is between a Wheezy and a Jessie image, compare the firmware files and kernel version on the two, and try updating the Wheezy firmware and kernel (with `rpi-update`) to see if that corrects the problem.

Finally, there is apparently a way to do a memtest on the Pi -- since this runs in userspace it cannot prove the memory isn't defective, but it could prove that it is. You should try this with the memory split set as low as possible for the GPU (see footnote).
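To illustrate what a userspace memory test does, and why it can show a defect but never rule one out, here is a minimal Python sketch. It is not `memtester`; the function name `pattern_test`, the patterns, and the buffer size are all illustrative assumptions. The process only ever sees the pages the kernel hands it, so bad blocks outside that allocation go untested.

```python
def pattern_test(size_bytes, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Fill a buffer with each byte pattern and read it back.

    Returns a list of (offset, expected, actual) tuples for every
    byte that did not read back as written. An empty list only means
    that *this* buffer behaved, not that all RAM is good.
    """
    buf = bytearray(size_bytes)
    failures = []
    for p in patterns:
        expected = bytes([p]) * size_bytes
        buf[:] = expected              # write the pattern
        if buf != expected:            # read back and compare
            for off in range(size_bytes):
                if buf[off] != p:
                    failures.append((off, p, buf[off]))
    return failures


# Hypothetical usage: test a 1 MiB buffer.
result = pattern_test(1 << 20)
print("mismatches:", len(result))
```

A real tool such as `memtester` additionally locks the buffer in physical memory and runs many more pattern types, which is why it is the better choice on the Pi itself.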


1. Which tends to cause mysterious problems that are not always the same...the fact that the GPU shares the RAM using an arbitrary split might explain that crazy blurred screen.

goldilocks
  • Thanks for the many tips in your answer! Actually, out of my 13+ RPi 2's, most of them turn out to be "Made in P.R.C." (Note that I was lucky enough to buy 8 of them on the day the Pi2 came out - you can blame me for any shortages that caused :) ). All my "Made in P.R.C." boards are easily recognizable, because they have black CSI/DSI connectors - the "Made in U.K." ones are part black part white. What I think happened is that I might have done an `rpi-update` on this specific board and it installed a firmware with a Wheezy incompatibility (assuming firmware gets stored on board, not on SD). – Phil B. Oct 13 '15 at 01:54
  • The Elpida chip is indeed labeled 1443 (Week 43, 2014) on the malfunctioning board. Will read the article you linked. – Phil B. Oct 13 '15 at 01:56
  • Ok so I have *EXACTLY* the same board as this "Colleague" mentioned at the bottom of the article - same 4-lined barcode sticker with same date codes, Elpida RAM with same codes. The comparison board I used is the "other" UK board with the 2-fingered HDMI port. I will check my other RPi's tomorrow as much as I can. Thanks a lot @Goldilocks for these great resources. – Phil B. Oct 13 '15 at 02:05
  • The firmware updates do get stored on the SD card -- it's [this stuff](https://github.com/raspberrypi/firmware/tree/master/boot) in the first partition and loaded at boot; [here's an explanation](http://raspberrypi.stackexchange.com/a/8965/5538) of that. `rpi-update` updates the firmware and kernel (w/ modules) at the same time. To the extent that there can be a mismatch, it would be between the kernel and firmware. I.e., there can't be "a Wheezy incompatibility" caused by that. I really think the most likely answer is dud RAM... – goldilocks Oct 13 '15 at 12:19
  • ...Unlike non-volatile storage, the system generally can't isolate and blacklist bad mem blocks; if even one byte of it is defunct, everything becomes a crap shoot. However, on boot the system will tend to do exactly the same thing every time, including physical RAM mappings. This is why you may have one kernel/userspace combo fail at boot more or less the same way every time, and another one not fail at all -- in the latter case, the issue has just been deferred until something gets allocated the bad blocks. – goldilocks Oct 13 '15 at 12:19
  • BTW, where/when the board was made, etc., may be a red herring. Bad RAM is something that just happens eventually to everything; the normal way to check an x86 system is [Memtest86](https://en.wikipedia.org/wiki/Memtest86) which is free and comes standard on a lot of install or rescue disks. There's no correlate for ARM I'm aware of though. This is why I'd try an image with an updated firmware and kernel; if that boots, set your memory split with the video as low as possible (I think 16 MB) and try that userspace mem test from the last paragraph as vigorously as possible. – goldilocks Oct 13 '15 at 12:29
  • Add this last comment to your answer and I'll mark it as accepted. Here is what I think happens: I have bad memory on the RPi2, in a spot that does not get used with the default RAM/GPU-RAM split. On my custom image, I have a different split, which forces the GPU to use a piece of RAM that is faulty - which is why the error typically happens when X is starting (the green-line blurry picture). I'll run `memtester` shortly. – Phil B. Oct 13 '15 at 12:29
  • Actually the fact that RAM is shared with the GPU could explain the wacky blurred screen scenario (will edit that into the footnote). – goldilocks Oct 13 '15 at 12:29
  • We just came to the same conclusion :) – Phil B. Oct 13 '15 at 12:30
  • Memtester output (and my RPI2 shuts down at some time during the memtester run - likely because of the issue) shows this: Solid Bits : testing 50 FAILURE: 0x00000000 != 0xffff0000 at offset 0x138057ec. FAILURE: 0xffffffff != 0x0000ffff at offset 0x138057f0. FAILURE: 0x00000000 != 0xffff0000 at offset 0x138057f4. FAILURE: 0xffffffff != 0x0000ffff at offset 0x138057f8. – Phil B. Oct 13 '15 at 16:09
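
One way to read the `FAILURE` lines quoted in the comment above is to XOR the two values on each line, which shows which bits disagree. The sketch below does that for the four lines as posted; the regular expression and the helper name `differing_bits` are illustrative assumptions, not part of `memtester`.

```python
import re

# Matches lines of the form quoted in the comment:
#   FAILURE: 0x00000000 != 0xffff0000 at offset 0x138057ec.
FAILURE_RE = re.compile(
    r"FAILURE: (0x[0-9a-fA-F]+) != (0x[0-9a-fA-F]+) at offset (0x[0-9a-fA-F]+)"
)


def differing_bits(log_text):
    """Return (offset, xor_mask) for every FAILURE line in log_text.

    xor_mask has a 1 in each bit position where the two compared
    values disagree.
    """
    results = []
    for left, right, offset in FAILURE_RE.findall(log_text):
        mask = int(left, 16) ^ int(right, 16)
        results.append((int(offset, 16), mask))
    return results


log = """\
FAILURE: 0x00000000 != 0xffff0000 at offset 0x138057ec.
FAILURE: 0xffffffff != 0x0000ffff at offset 0x138057f0.
FAILURE: 0x00000000 != 0xffff0000 at offset 0x138057f4.
FAILURE: 0xffffffff != 0x0000ffff at offset 0x138057f8.
"""

for offset, mask in differing_bits(log):
    print(f"offset 0x{offset:08x}: differing bits 0x{mask:08x}")
```

For these four lines the mask is `0xffff0000` every time, and the offsets are consecutive 32-bit words, so the disagreement is always in the upper 16 bits of each word. That pattern is consistent with a hardware fault in that region rather than random corruption, though it does not by itself identify the cause.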
0

Almost certainly bad RAM. My Model A had the same fault until I reworked the soldering on its power input connector, and then it worked fine. Evidently the RAM was "power sensitive" and brownouts caused single-bit errors.