Linux kernel oops
In computing, an oops is a serious but non-fatal error in the Linux kernel. An oops may precede a kernel panic, but it may also allow continued operation with compromised reliability. The term is not an acronym; it is simply the exclamation "oops", as made after a mistake.
Functioning
When the kernel detects a problem, it kills any offending processes and prints an oops message, which Linux kernel engineers can use to debug the condition that caused the oops and to fix the underlying programming error. After a system has experienced an oops, some internal resources may no longer be operational. Thus, even if the system appears to work correctly, killing the active task may have had undesirable side effects. A kernel oops often leads to a kernel panic when the system attempts to use resources that have been lost. The kernel can also be configured to panic once a set number of oopses (10,000 by default) has occurred.[1][2] This limit exists because, for example, an attacker could repeatedly trigger an oops and an associated resource leak until an integer overflows, which allows further exploitation.[3][4]
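The oops limit and related counters can be inspected from user space. The following is a minimal Python sketch, assuming a recent kernel that exposes `/proc/sys/kernel/oops_limit`, `/proc/sys/kernel/panic_on_oops` and `/sys/kernel/oops_count`. The helper names `read_int` and `oops_status` are illustrative only, and any file that is missing (for example on an older kernel) is reported as `None`.

```python
from pathlib import Path


def read_int(path):
    """Return the integer stored in a sysctl-style file, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing, unreadable, or not a plain integer.
        return None


def oops_status(root="/"):
    """Collect oops-related kernel values found below `root` (normally "/")."""
    base = Path(root)
    return {
        # Number of oopses after which the kernel panics (assumed path).
        "oops_limit": read_int(base / "proc/sys/kernel/oops_limit"),
        # Non-zero means the kernel panics on the first oops (assumed path).
        "panic_on_oops": read_int(base / "proc/sys/kernel/panic_on_oops"),
        # Number of oopses since boot (assumed path, newer kernels only).
        "oops_count": read_int(base / "sys/kernel/oops_count"),
    }


if __name__ == "__main__":
    for name, value in oops_status().items():
        print(f"{name}: {value if value is not None else 'not available'}")
```

The `root` parameter exists only so that the function can be pointed at a copy of the files for testing; on a live system the default is sufficient.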
The official Linux kernel documentation regarding oops messages resides in the file Documentation/admin-guide/bug-hunting.rst[5] of the kernel sources. Some logger configurations may affect the ability to collect oops messages.[6] The kerneloops software can collect and submit kernel oopses to a repository such as the www.kerneloops.org website,[7] which provides statistics and public access to reported oopses.
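As a rough illustration of picking oops reports out of a kernel log, the Python sketch below scans log text for lines containing an `Oops:` marker followed by an error code and a counter such as `[#1]`. The function name `find_oops_lines` and the sample log lines are hypothetical, the exact marker format varies between architectures and kernel versions, and this is not a description of how the kerneloops software itself works.

```python
import re

# Assumption: an oops report contains a line with "Oops:" followed by a
# hexadecimal error code and a counter such as "[#1]". Real formats vary.
OOPS_MARKER = re.compile(
    r"\bOops:\s*(?P<code>[0-9a-fA-F]+)\s*\[#(?P<count>\d+)\]"
)


def find_oops_lines(log_text):
    """Return (line_number, error_code, oops_counter) for each marker line."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        match = OOPS_MARKER.search(line)
        if match:
            hits.append((number, match.group("code"), int(match.group("count"))))
    return hits


if __name__ == "__main__":
    # Hypothetical, heavily shortened log excerpt for illustration only.
    sample = (
        "[   12.000000] usb 1-1: new high-speed USB device\n"
        "[   42.000000] Oops: 0002 [#1] SMP\n"
        "[   42.000100] CPU: 0 PID: 1234 Comm: example\n"
    )
    for number, code, count in find_oops_lines(sample):
        print(f"line {number}: error code {code}, oops #{count}")
```

On a live system the text could come from the output of `dmesg` or from a saved log file, subject to the logger configuration mentioned above.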
To a person unfamiliar with the technical details of computers and operating systems, an oops message may look confusing. Unlike other operating systems such as Windows or macOS, Linux presents details explaining the kernel crash rather than a simplified, user-friendly message such as the BSoD on Windows. A simplified crash screen has been proposed a few times, but none is currently in development.[8]
See also
- kdump (Linux) – Linux kernel's crash dump mechanism, which internally uses kexec
- System.map – contains mappings between symbol names and their addresses in memory, used to interpret oopses
References
- Horn, Jann (7 November 2022). "[PATCH] exit: Put an upper limit on how often we can oops". lore.kernel.org. Retrieved 31 January 2023.
- "Documentation for /proc/sys/kernel/". docs.kernel.org. Retrieved 31 January 2023.
- Corbet, Jonathan (18 November 2022). "Averting excessive oopses". LWN.net.
- Jenkins, Seth (19 January 2023). "Exploiting null-dereferences in the Linux kernel". Google Project Zero.
- "bug-hunting". kernel.org.
- "DevDocs/KernelOops". madwifi-project.org. Archived from the original on 2020-08-03. Retrieved 2010-08-21.
- "kerneloops(8) - Linux man page". Retrieved 31 January 2023.
- Larabel, Michael (10 March 2019). "A DRM-Based Linux Oops Viewer Is Being Proposed Again - Similar To Blue Screen of Death". Phoronix.
Further reading
- Linux Device Drivers, 3rd edition, Chapter 4.
- John Bradford (2003-03-08). "Re: what's an OOPS". LKML (Mailing list). Archived from the original on 2007-03-10. Retrieved 2006-05-22.
- Szakacsits Szabolcs (2003-03-08). "Re: what's an OOPS". LKML (Mailing list). Archived from the original on 2007-03-13. Retrieved 2006-05-22.
- Al Viro (2008-01-14). "OOPS report analysis". LKML (Mailing list). Archived from the original on 2008-04-21. Retrieved 2008-01-14.
- Kernel Oops Howto (the madwifi project). Archived 2020-08-03 at the Wayback Machine. Useful information on configuration files and tools to help display oops messages, along with many other links.