Introduction

The term “kernel panic” is nothing short of terrifying for any system administrator. A kernel panic is a safety measure taken by the operating system’s kernel when it detects an internal fatal error from which it either cannot safely recover or cannot continue running without a much higher risk of major data loss. The kernel routines that handle panics, known as panic() in AT&T-derived and BSD Unix source code, are generally designed to output an error message to the console, dump an image of kernel memory to disk for post-mortem debugging, and then either wait for the system to be manually rebooted or initiate an automatic reboot. The information provided is highly technical and is meant to help a system administrator, or more specifically a kernel engineer, diagnose the problem. Although it is not very common, kernel panics can also be caused by errors originating outside kernel space.

A kernel panic can have many different causes. In this article, we demonstrate the steps we followed to troubleshoot a kernel panic that occurred when we rebooted a system after a patching/update activity.

Background

The system on which the kernel panic occurred was running RHEL 6.9. The panic occurred when we restarted the system to boot into the new kernel that had been installed by a recent ‘yum update’.

Given below is a screenshot of the kernel panic message displayed on the console

Diagnostics steps and troubleshooting

The kernel panic message itself was not very descriptive, as you may have gathered from the screenshot in the previous section. After some investigation, however, we were able to identify the cause and fix it. We’ll now outline the sequence of steps we followed.

Identify at which stage of the boot process the system panics

I restarted the system from the new kernel a couple of times and found that it panicked as soon as the kernel started to load after the grub prompt.

Try to boot from the old kernel

When I booted from the old kernel, the system came up to run level 3 without any glitches.

Compare the kernel lines for the old and new kernel

I checked the /boot/grub/grub.conf file and examined the kernel lines at the grub prompt, and found that the initrd entry pointing to the initramfs was missing for the new kernel. I then booted the system from the old kernel again and observed that no initramfs file had been created for the new kernel in the /boot directory.

[ssuri@linuxnix:/boot] $ ls -ltr | grep -i initramfs
-rw-------  1 root root 26654700 Dec 11  2016 initramfs-2.6.32-642.6.2.el6.x86_64.img
-rw-------  1 root root 26088799 Jan  8  2017 initramfs-2.6.32-573.1.1.el6.x86_64.tmp
-rw-------  1 root root 26781018 Jun 11  2017 initramfs-2.6.32-642.13.1.el6.x86_64.img
-rw-------  1 root root 26088606 Dec 17 03:11 initramfs-2.6.32-573.18.1.el6.x86_64.tmp
-rw-------  1 root root 26785659 Dec 17 03:12 initramfs-2.6.32-642.6.2.el6.x86_64.tmp

As you can observe from the above output, we have some initramfs files for older kernels but the file for the latest kernel 2.6.32-696.6.3.el6.x86_64 is missing.
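
The same comparison can be scripted so that it is easy to repeat after every update. The function below is a minimal sketch, not part of the original troubleshooting session: the name kernels_missing_initramfs is made up for illustration, and it assumes the naming seen in this article, where kernel images are called vmlinuz-<version> and their images initramfs-<version>.img in the same directory.

```shell
#!/bin/bash
# Hypothetical helper: print the version of every vmlinuz-* file in a boot
# directory that has no non-empty initramfs-<version>.img next to it.
kernels_missing_initramfs() {
    local boot_dir="${1:-/boot}" kernel version
    for kernel in "$boot_dir"/vmlinuz-*; do
        [ -e "$kernel" ] || continue        # the glob matched nothing
        version="${kernel##*/vmlinuz-}"     # strip the path and the prefix
        # -s is true only if the file exists and is larger than zero bytes
        [ -s "$boot_dir/initramfs-$version.img" ] || echo "$version"
    done
}

# Example: kernels_missing_initramfs /boot
```

Any version it prints is a kernel that would most likely fail to boot in the same way, so it is worth running before a reboot rather than after.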

Attempt to create initramfs file manually

When I tried to create the initramfs file for the new kernel manually, the dracut command failed.

[ssuri@linuxnix:/boot] $ sudo dracut -f /boot/initramfs-2.6.32-696.6.3.el6.x86_64.img 2.6.32-696.6.3.el6.x86_64

mktemp: failed to create directory via template `/tmp/initramfs.XXXXXX': No space left on device
chmod: cannot access `': No such file or directory
usage: plymouth [ --verbose | -v ] { --targetdir | -t } <initrd_directory>

cp: `/etc/ld.so.conf' and `/etc/ld.so.conf' are the same file
cp: `/etc/ld.so.conf.d' and `/etc/ld.so.conf.d' are the same file
find: cannot search `': No such file or directory
cpio: File ./initramfs-2.6.32-696.6.3.el6.x86_64.img grew, 19316736 new bytes not copied

Find cause of missing initramfs file

The mktemp error in the dracut output above prompted me to check the state of the /tmp file system. "No space left on device" is reported not only when a file system runs out of blocks but also when it runs out of inodes. Although the file system had plenty of storage space available, it was out of inodes, so dracut could not create the temporary files it needed.

[ssuri@linuxnix:/boot] $ df -h .
Filesystem           Size  Used Avail Use% Mounted on
/dev/mapper/os_pvp1  493M  159M  309M  34% /boot
[ssuri@linuxnix:/boot] $
[ssuri@linuxnix:~] $ df -hi /tmp
Filesystem           Inodes IUsed IFree IUse% Mounted on
/dev/mapper/os_vg-tmp_lv
128K  128K     0  100% /tmp
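
Inode exhaustion is easy to miss because a plain df -h looks healthy. The sketch below shows one way to flag it automatically; the function name inode_hogs and the 90 percent default are assumptions chosen for illustration. It reads df -Pi output on standard input (the -P option keeps each file system on a single line, unlike the wrapped output above) and assumes mount points do not contain spaces.

```shell
#!/bin/bash
# Hypothetical helper: read `df -Pi` output on stdin and print the mount
# point and IUse% of every file system at or above a threshold (default 90).
inode_hogs() {
    local threshold="${1:-90}"
    awk -v limit="$threshold" '
        # Skip the header, and skip file systems that report "-" for IUse%.
        NR > 1 && $5 ~ /%$/ {
            pct = $5
            sub(/%$/, "", pct)
            if (pct + 0 >= limit + 0) print $6, $5
        }
    '
}

# Example: df -Pi | inode_hogs 90
```

Run from cron or a monitoring agent, a check like this would have reported /tmp at 100% before the kernel update was attempted.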

Free up inodes in /tmp and create initramfs file

As soon as I saw the inode usage for the /tmp file system, I cleaned it up. The inode usage afterwards looked like this:

[ssuri@linuxnix:~] $ df -hi /tmp
Filesystem           Inodes IUsed IFree IUse% Mounted on
/dev/mapper/os_vg-tmp_lv
128K   232  128K    1% /tmp
[ssuri@linuxnix:~] $
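
The article does not record which files were removed, so the following is only a sketch of how the directories holding the most inodes might be located before deciding what to delete. The function name inode_usage_by_dir is hypothetical. It counts the entries under each immediate subdirectory and sorts the result, and it will miscount file names that contain newlines.

```shell
#!/bin/bash
# Hypothetical helper: for each immediate subdirectory of a directory, print
# the number of entries beneath it (the subdirectory itself included),
# largest first. -xdev keeps find on a single file system.
inode_usage_by_dir() {
    local top="${1:-/tmp}" dir
    find "$top" -xdev -mindepth 1 -maxdepth 1 -type d |
    while IFS= read -r dir; do
        printf '%s\t%s\n' "$(find "$dir" -xdev | wc -l | tr -d ' ')" "$dir"
    done | sort -rn
}

# Example: sudo bash -c "$(declare -f inode_usage_by_dir); inode_usage_by_dir /tmp" | head
```

The first few lines of output usually point at a single application that has been leaving session or lock files behind, which is a better target for cleanup than deleting everything in /tmp.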

Once the inodes were freed, I created the initramfs file with the command shown below.

[ssuri@linuxnix:/boot] $ sudo dracut -f /boot/initramfs-2.6.32-696.6.3.el6.x86_64.img 2.6.32-696.6.3.el6.x86_64
[ssuri@linuxnix:/boot] $ ls -ltr | grep -i initramfs
-rw-------  1 root root 26654700 Dec 11  2016 initramfs-2.6.32-642.6.2.el6.x86_64.img
-rw-------  1 root root 26088799 Jan  8  2017 initramfs-2.6.32-573.1.1.el6.x86_64.tmp
-rw-------  1 root root 26781018 Jun 11  2017 initramfs-2.6.32-642.13.1.el6.x86_64.img
-rw-------  1 root root 26088606 Dec 17 03:11 initramfs-2.6.32-573.18.1.el6.x86_64.tmp
-rw-------  1 root root 26785659 Dec 17 03:12 initramfs-2.6.32-642.6.2.el6.x86_64.tmp
-rw-------  1 root root 25606929 Jan 14 06:51 initramfs-2.6.32-696.6.3.el6.x86_64.img
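
Since the earlier failed run ended with a cpio error about bytes not being copied, it may be worth confirming that the new image is complete before rebooting. The check below is a rough sketch under one assumption that should be verified on your system: that the image is a gzip-compressed archive. The function name initramfs_looks_valid is made up for illustration, and the file is readable only by root, so it needs to run with sudo.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the image exists, is not empty, and
# passes gzip's integrity test. Assumes a gzip-compressed initramfs; an
# image built with another compressor needs a different test.
initramfs_looks_valid() {
    local image="$1"
    [ -s "$image" ] && gzip -t "$image" 2>/dev/null
}

# Example (as root):
#   initramfs_looks_valid /boot/initramfs-2.6.32-696.6.3.el6.x86_64.img && echo OK
```

The lsinitrd tool shipped with dracut can additionally list the contents of the image if a closer look is needed.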

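With the initramfs file in place, the entry for the new kernel in /boot/grub/grub.conf (to which /etc/grub.conf is a symbolic link on RHEL 6) also had its initrd line, which was missing earlier. Note that dracut is only expected to build the image; if the initrd line is still absent after the initramfs has been regenerated, it needs to be added to the stanza by hand or with grubby.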

[ssuri@linuxnix:~] $ sudo cat /etc/grub.conf

# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd3,0)
#          kernel /vmlinuz-version ro root=/dev/mapper/os_vg-root_lv
#          initrd /initrd-[generic-]version.img
#boot=/dev/sda
default=0
timeout=50
#splashimage=(hd0,0)/grub/splash.xpm.gz
#hiddenmenu

title Red Hat Enterprise Linux Server (2.6.32-696.6.3.el6.x86_64)
root (hd0,0)
kernel /vmlinuz-2.6.32-696.6.3.el6.x86_64 ro root=/dev/mapper/os_vg-root_lv rd_NO_LUKS LANG=en_US.UTF-8 rd_LVM_LV=os_vg/swap_01_lv rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=512M rd_LVM_LV=os_vg/root_lv  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM elevator=deadline transparent_hugepage=never debug KEYTABLE=us rd_NO_DM elevator=deadline transparent_hugepage=never debug

initrd /initramfs-2.6.32-696.6.3.el6.x86_64.img

title Red Hat Enterprise Linux Server (2.6.32-642.13.1.el6.x86_64)
root (hd0,0)
kernel /vmlinuz-2.6.32-642.13.1.el6.x86_64 ro root=/dev/mapper/os_vg-root_lv rd_NO_LUKS LANG=en_US.UTF-8 rd_LVM_LV=os_vg/swap_01_lv rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=512M rd_LVM_LV=os_vg/root_lv  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM elevator=deadline transparent_hugepage=never debug KEYTABLE=us rd_NO_DM elevator=deadline transparent_hugepage=never debug

initrd /initramfs-2.6.32-642.13.1.el6.x86_64.img

title Red Hat Enterprise Linux Server (2.6.32-642.6.2.el6.x86_64)
root (hd0,0)
kernel /vmlinuz-2.6.32-642.6.2.el6.x86_64 ro root=/dev/mapper/os_vg-root_lv rd_NO_LUKS LANG=en_US.UTF-8 rd_LVM_LV=os_vg/swap_01_lv rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=512M rd_LVM_LV=os_vg/root_lv  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM elevator=noop transparent_hugepage=never debug

initrd /initramfs-2.6.32-642.6.2.el6.x86_64.img

title Red Hat Enterprise Linux Server (2.6.32-573.1.1.el6.x86_64)
root (hd0,0)
kernel /vmlinuz-2.6.32-573.1.1.el6.x86_64 ro root=/dev/mapper/os_vg-root_lv rd_NO_LUKS LANG=en_US.UTF-8 rd_LVM_LV=os_vg/swap_01_lv rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=512M rd_LVM_LV=os_vg/root_lv  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM elevator=noop transparent_hugepage=never debug

initrd /initramfs-2.6.32-573.1.1.el6.x86_64.img
[ssuri@linuxnix:~] $
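
Reading a long grub.conf by eye is error-prone, so a check along the following lines can confirm that every stanza has an initrd line. This is a minimal sketch for the GRUB legacy format shown above; the function name stanzas_missing_initrd is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the title of every grub.conf (GRUB legacy)
# stanza that has a kernel line but no initrd line.
stanzas_missing_initrd() {
    awk '
        function report() {
            if (title != "" && has_kernel && !has_initrd) print title
        }
        /^[[:space:]]*title[[:space:]]/ {
            report()                         # close the previous stanza
            title = $0
            sub(/^[[:space:]]*title[[:space:]]+/, "", title)
            has_kernel = 0; has_initrd = 0
            next
        }
        /^[[:space:]]*kernel[[:space:]]/ { has_kernel = 1 }
        /^[[:space:]]*initrd[[:space:]]/ { has_initrd = 1 }
        END { report() }                     # close the last stanza
    ' "$1"
}

# Example: sudo bash -c "$(declare -f stanzas_missing_initrd); stanzas_missing_initrd /boot/grub/grub.conf"
```

Commented lines start with # and are ignored, so the sample initrd line in the file header does not produce a false match. No output means every stanza has both lines.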

Boot system from new kernel

I now restarted the system again and this time the server was able to boot successfully from the new kernel.

Source : https://www.linuxnix.com/troubleshooting-linux-kernel-panic-patching/
