I’m writing this because the fallback behavior of grub seems broken, or at least poorly documented.
Here is the scenario. I have a server in a remote location, and I have no access to the local console. I need to update the kernel, but there is a chance that the new kernel could crash and leave my server completely disabled. In my scenario that is not an acceptable risk.
Fortunately, grub can be manipulated so that a kernel is booted once and, if it crashes, the system will reset and boot a previously known good kernel.
BTW, this is a RHEL5 system. Here is my grub.conf file.
default=0
timeout=10

title RHEL known *good* kernel
        root (hd0,0)
        kernel /vmlinuz-2.6.18-164.el5 ro root=/dev/OS/vol1 rhgb quiet
        initrd /initrd-2.6.18-164.el5.img
        savedefault

title RHEL *new* kernel
        root (hd0,0)
        kernel /vmlinuz-2.6.18-194.8.1.el5 ro root=/dev/OS/vol1 rhgb quiet panic=5
        initrd /initrd-2.6.18-194.8.1.el5.img
My “untested” kernel is the second item listed. The kernel parameter panic=5 causes the system to reboot automatically 5 seconds after a kernel panic. Without it, the server would be stuck in a crashed state until someone could physically intervene and power cycle it.
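Once the new kernel is up, it is worth confirming that the panic timeout was actually picked up. The following is a minimal sketch, not part of my original procedure: panic_timeout is a hypothetical helper name, and it simply looks for a panic= word in a kernel command line string such as the contents of /proc/cmdline.

```shell
#!/bin/sh
# Hypothetical helper: print the value of panic= found in a kernel
# command line string. Returns 1 if no panic= parameter is present.
panic_timeout() {
    for word in $1; do
        case "$word" in
            panic=*)
                echo "${word#panic=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Typical use on the running system:
#   panic_timeout "$(cat /proc/cmdline)"
```

The kernel also exposes the active value in /proc/sys/kernel/panic, so reading that file is a quick cross-check.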
Now comes the interesting part. We need to tell grub to set the “untested” kernel as the default and to boot it only once. We can do this from the grub shell.
# grub
grub> savedefault --default=1 --once
savedefault --default=1 --once
grub> quit
As you can see, I tell grub to save the “untested” kernel (--default=1) as the default for one boot only (--once). This makes the second menu item the default for the next boot. After that, the default goes back to the one designated in the grub.conf file (default=0).
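If you would rather not type into the grub shell by hand, the same command can probably be scripted. This is only a sketch under assumptions: it relies on the --batch option of the legacy grub shell, the function names grub_once_commands and arm_one_time_boot and the DRY_RUN variable are my own inventions, and I have not verified it on every RHEL5 or CentOS 5 build.

```shell
#!/bin/sh
# Print the grub shell commands that arm a one-time boot of a menu entry.
# The entry index is zero-based and follows the order of the "title"
# stanzas in grub.conf.
grub_once_commands() {
    printf 'savedefault --default=%s --once\nquit\n' "$1"
}

# Hypothetical wrapper: feed the commands to the legacy grub shell in
# batch mode. With DRY_RUN=1 the commands are only printed, which is a
# safe way to review them first.
arm_one_time_boot() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        grub_once_commands "$1"
    else
        grub_once_commands "$1" | grub --batch
    fi
}

# Example (run as root): arm the second menu entry for the next boot only.
#   arm_one_time_boot 1
```

Running it once with DRY_RUN=1 shows exactly what would be sent to grub before anything is changed.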
Now I can reboot the server with extra confidence. If my kernel panics and crashes, the server will soon reboot to the previous “good” kernel. If the kernel does not panic and eventually proves itself to my satisfaction, I can log back in and edit my grub.conf file so that the new kernel becomes my known “good” kernel.
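After logging back in, the first thing to find out is which kernel the server actually came up on. Here is a small sketch for that check; check_booted_kernel is a hypothetical helper, and it assumes that uname -r reports the same version string that appears in the vmlinuz file name.

```shell
#!/bin/sh
# Hypothetical helper: compare the running kernel release with the one
# that was supposed to boot. Returns 1 when they differ, which suggests
# the server fell back to another menu entry.
check_booted_kernel() {
    running="$1"
    expected="$2"
    if [ "$running" = "$expected" ]; then
        echo "new kernel is running: $running"
        return 0
    fi
    echo "fell back: running $running, expected $expected"
    return 1
}

# Typical use after the reboot:
#   check_booted_kernel "$(uname -r)" "2.6.18-194.8.1.el5"
```

If the check reports a fallback, the new kernel most likely panicked on its one-time boot and the default entry took over.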
The official grub documentation gives a more elaborate example, which doesn’t work, at least not under RHEL5 or CentOS 5. After googling around, it seems that others have the same issue with the documented features not working as expected. If you have been struggling to get this to work, I hope you will enjoy my example. It is easy to follow, and hopefully it will give you the functionality that you need.
Is the “official grub documentation” way the use of the “fallback” option, or something else?
This should be default functionality. Thanks for sharing this. I messed up the boot config after a migration and I’m waiting for a reboot. I wish I had read this first!