A possible fix for random kernel panics when booting Arch Linux

This post shows how to "almost" fix boot-time errors on Arch Linux, like the ones in the following image:

[Image: IMG_20140707_210559]

As you can see, this is one of the many "combinations" of errors that appear at random when booting a system with this problem. The message itself suggests a hardware problem; however, as we all know, on this operating system even the nasty quirks that come from outside the OS can usually be worked around.

So, let me describe my experience. As far as I could tell, the problem only occurred with Arch Linux and one other distro that I tested from external media, since every Ubuntu I had installed or tested booted without problems. But whenever I tried to boot the Arch Linux installed on the hard drive, I had to reboot about 50 times before the OS would start normally and be usable.

This was already bothering me, because I could only use the Ubuntu that I had installed for testing, and with it I could not do even half of the things I could do with Arch Linux. So I decided to solve the problem and began to investigate. The forum threads I found about the same problem also said it was a hardware error, specifically the CPU, which started to worry me. I opened up the PC to check what was going on, but that did not help.

What convinced me not to give up was this: if Ubuntu could boot, why couldn't Arch Linux (is Ubuntu perhaps better than Arch…?). So I started passing boot parameters to the Arch Linux kernel, things like: lapic, nomce, intel_idle.max_cstate=0, disable_cpu_apic, acpi_skip_timer_override, acpi=strict, clk, apm, noapic, acpi=oldboot, acpi-cpufreq, intel_pstate=disable, i8042.noacpi=1, apm=copyds, acdtpi=0, pci=nocrs, rhgb, acpi=force, pnpacpi=off and others. All of these were recommended in the forums I read.

Eventually I turned to the kernel parameters documentation, which I recommend, by the way: https://www.kernel.org/doc/Documentation/kernel-parameters.txt

There I found a rather interesting parameter that, for the moment, let me boot Arch Linux without problems:

linux /boot/vmlinuz-linux root=UUID=fbefe36c-1712-4f3b-b3e3-3eac759d71c9 notsc nomce maxcpus=0

As documented there, this parameter limits the kernel to a single CPU without enabling symmetric multiprocessing (SMP). At first it worked quite well, until I ran pacman -Syyu and it failed with a core dump or a segmentation fault.
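
One detail worth checking with lines like the one above: the kernel splits its command line on whitespace, so a parameter only takes effect when written as name=value with no spaces around the equals sign. The following is a minimal sketch, assuming a POSIX shell, for checking which value (if any) a command line really carries. The helper name param_value is hypothetical, not a standard tool.

```shell
# Print the value of a kernel parameter found in a command line string.
# Usage: param_value NAME [CMDLINE]   (CMDLINE defaults to /proc/cmdline)
# Returns 0 if the parameter is present, 1 otherwise.
param_value() {
    name=$1
    cmdline=${2-$(cat /proc/cmdline)}
    # Unquoted on purpose: split the command line on whitespace.
    for token in $cmdline; do
        case $token in
            "$name")   echo ""; return 0 ;;             # bare flag, e.g. notsc
            "$name"=*) echo "${token#*=}"; return 0 ;;  # name=value
        esac
    done
    return 1
}

# Example with a literal command line (prints 0):
param_value maxcpus "root=/dev/sda1 notsc nomce maxcpus=0"
```

Called with only a name, for example param_value maxcpus, it reads /proc/cmdline of the running system. If the parameter was typed with spaces around the equals sign, the value comes back empty, because the name, the sign and the number arrive as three separate words.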

I immediately noticed that something strange was going on, so I started running other processes until the system suddenly froze completely and stayed that way until I rebooted it. I repeated the same steps, but this time I managed to run htop, and it showed me the following:

[Image: IMG-20140729-WA0001]

As expected, it showed only one CPU, since the other was disabled. It still seemed very strange to me that programs were segfaulting and that the graphical environment could not even start. Even so, it gave me more hope that with the right combination of kernel parameters my Arch Linux would boot as usual.
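
The number of CPUs the kernel brought up can also be confirmed without htop, by reading the CPU list the kernel exposes in sysfs. This is a minimal sketch, assuming a POSIX shell and the usual list format ("0", "0-3", "0,2-3"); count_cpus is a hypothetical helper name.

```shell
# Count the CPUs described by a kernel CPU list such as "0", "0-3" or "0,2-3".
count_cpus() {
    total=0
    old_ifs=$IFS
    IFS=,
    # Unquoted on purpose: split the list on commas.
    for range in $1; do
        first=${range%-*}   # "2-3" -> 2, "0" -> 0
        last=${range#*-}    # "2-3" -> 3, "0" -> 0
        total=$((total + last - first + 1))
    done
    IFS=$old_ifs
    echo "$total"
}

# Example with a literal list (prints 2):
count_cpus "0-1"
```

On a running system the list would come from count_cpus "$(cat /sys/devices/system/cpu/online)". With a single CPU online, as in the htop screenshot, the result should be 1.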

So I kept trying the other parameters on my list until I came across this one, which is the best solution so far:

linux /boot/vmlinuz-linux root=UUID=fbefe36c-1712-4f3b-b3e3-3eac759d71c9 notsc nomce isolcpus=1

This parameter does something as simple as isolating (not deactivating) the second core of the CPU (CPU 1) from the kernel's general scheduling; that is, the processing load goes to a single core while the other only plays a complementary role. Although it may seem counterintuitive, this does not hurt performance much, since this great OS was still able to run applications like this:
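
Whether a core really ended up isolated can be checked against the CPU list the kernel reports. The following is a minimal sketch, assuming a POSIX shell and the usual list format ("1", "0,2-3"); cpu_in_list is a hypothetical helper name, and the sysfs file mentioned after the block is an assumption that holds only on newer kernels.

```shell
# Return 0 if CPU number $1 appears in the kernel CPU list $2, 1 otherwise.
cpu_in_list() {
    cpu=$1
    old_ifs=$IFS
    IFS=,
    # Unquoted on purpose: split the list on commas.
    for range in $2; do
        first=${range%-*}
        last=${range#*-}
        if [ "$cpu" -ge "$first" ] && [ "$cpu" -le "$last" ]; then
            IFS=$old_ifs
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Example with literal lists:
cpu_in_list 1 "1" && echo "CPU 1 is isolated"
cpu_in_list 0 "1" || echo "CPU 0 is not isolated"
```

On kernels that provide it, the list can be read with cpu_in_list 1 "$(cat /sys/devices/system/cpu/isolated)". An isolated core stays online, so a program can still be placed on it explicitly, for example with taskset -c 1 from util-linux.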

[Image: test]

[Image: linux_rlz_compiz]

With this, the only problem I still see at boot time is one or two kernel panics or oopses; but compared to the 50 reboots I needed before, I can consider it a "workaround". Otherwise, so far it has let me use the OS and write the post you are reading right now :-).

I hope this helps you, and don't leave GNU/Linux, which is the best operating system ever invented. I say that with complete confidence.



  1.   Gregory Swords said

    Very interesting info. I've never had those kernel panics in ArchLinux in the years I've been using it, but it's good to know what to do if at any point the problem occurs. Thank you!

    1.    kik1n said

      Same here, I've been using Arch for a long time (with a gap of about a year without Arch) and never had a kernel panic.
      Thanks for the tip.

    2.    c4explosive said

      Most likely, as I mentioned in the post, the problem is caused by the hardware, because in all the time I have used Arch it had never given me any problem of this kind either.

    3.    elav said

      Another one with excellent results in Arch. I have never had a Kernel Panic

    4.    rawBasic said

      More than 2 years with GNU / Linux ... 2 years already with ArchLinux, never a kernel panic .. 😉

    5.    Manual of the Source said

      I think kernel panics are due more to hardware than to the distro itself. I've never seen a kernel panic on the laptop I use now, except once when I put an Ubuntu alpha on it (and Arch Linux was on it for two years too). On the other hand, on another laptop of mine, any distro I install always gives kernel panics and a wide variety of errors for all tastes.

  2.   eliotime3000 said

    With kernel 3.14 on Debian, I have run into the kernel panic problem, besides that whenever I turn on my PC, I get a "connect / disconnect timeout" message (and also when I turn it off).

    1.    Amaury said

      It has happened to me in Fedora as well as in Arch, but I don't know why, and since I see no difference in use I have not spent time investigating or solving it (if it is a problem at all).

  3.   Tony said

    Thank you very much for the info. One of the many things we can boast about is this kind of forum.

  4.   manu said

    Why does this happen to Arch Linux? As if the frequent problems with slowness or system hangs, to the point of wrecking the whole system, were not enough.

    1.    elav said

      Hey? What are you talking about? o_O

    2.    Amaury said

      Arch is a KISS distribution, configurable from the very base of the operating system. In short: if the system is heavy, it is because you built it that way; if the system has errors, it is because you caused them or did not configure something correctly. The Arch wiki is quite complete. A few years ago many important topics were not available in Spanish, and the installation process was much rougher and somewhat difficult; now everything is a bit more automated.
      Blaming the distro for user errors is so… Windows (?).

      1.    dayara said

        Blaming the distro for errors is being consistent, simply because it is the truth. After having a similar problem with Manjaro, I tried Arch, Antergos and another little-known distribution (I can't remember the name now, sorry) that someone recommended, assuring me it gave no trouble, but no luck; they all have the problem. In OpenSuse, Fedora, Mint, Mageia and every other distro I have tried since, it does not happen. So as far as I'm concerned, I have no choice but to think it's the distro's fault. But hey, I'm not demonizing it; in fact, it annoys me that I can't use anything based on Arch, because I like it a lot, but that damn problem prevents me. Nor do I think it is about the hardware, because for many of us affected it did not happen before on the very same damn machine. Well, actually it must be something related to the hardware, but, coming back to the same point: if I have made no changes and I now have problems on the same machine where I had none before, obviously it is due to some change in Arch that screwed things up for me.

      2.    johnfgs said

        "Blaming the distro for user errors is so ... Windows (?)."

        I would tell you that blaming users for product errors is so Apple. I've honestly thought about it a thousand times, but I don't see the advantage of using something whose maintainers basically wash their hands, for any serious purpose. And I say that considering that the GPL software comes without warranty.

        Say what you like, but this is the same case as the iPhone signal-loss reports several years ago and Apple's response, "you're holding it wrong". If you make a distro, you usually want to provide some quality and minimal support, and the truth is that Arch is basically a hobbyist system, whose developers clearly have fun packaging new things but have little interest in offering real support. Every time I see this kind of post, I value even more the work behind the distro I use.

        And yes, it is a software problem if it does not work, if it stops working after an update, or if some piece of hardware breaks. That one distro gives kernel panics while another does not… well yes, clearly one distro is doing things right and the other wrong. Now, if you enjoy using Linux 90s-style, when we had to recompile the kernel every time we plugged in a new printer… suit yourself.

  5.   mario said

    Is the kernel the one compiled by the developers, or your own?
    Kernel panics happen when certain components were not selected (Y) at compile time, or when some modules needed to support certain hardware were not enabled. With practice and knowledge of your hardware (you have to open the PC and see which brands of chips it has), you can build a custom kernel (from a chroot). If Ubuntu and the Arch installation CD ran fine on your computer, then something is not enabled in the build.

    1.    c4explosive said

      It was the stock kernel from Arch Linux itself, from the repositories.

  6.   anonymous said

    There is something extra in the kernel you are using that your hardware does not like; you must have a rare revision of some chip on your motherboard, or even a bug in a chip (it happens often).
    It may be a corrupt ACPI table in your BIOS; it is common for a board vendor not to even compute each table's checksum correctly. These messages usually appear with $ dmesg --human near the start of the boot log.
    You should also try another power supply; when the filtering fails, the ripple tends to cause exactly this kind of failure.
    First, try changing the power supply and see what happens. If it stays the same, try configuring a kernel tailored to your hardware; along the way you will get to know your PC better.

    1.    c4explosive said

      Thanks for the tips. By the way, it is a laptop, so I think I should change the battery. But I can see that what you've told me may help.

  7.   yukiteru said

    The one kernel panic that still drives me crazy is partly the fault of the nouveau guys and of my old, outdated, and very dusty nVidia 6150 SE integrated card. I say "partly" because they've done an excellent job supporting a universe of graphics chips like nVidia's, using nothing but reverse engineering; besides, the problem only occurs on some cards with the NV4E chipset.

    All you have to do is start Openbox + Firefox and disaster strikes (nothing more beautiful than a completely random black-and-white mosaic on your screen). I have been running into it since kernel 3.6 on Debian, Fedora, Archlinux and Slackware, and have now verified it again on Gentoo (freshly installed with kernel 3.12). I no longer bother to grab a log, since the kernel doesn't even get time to write anything other than a pile of nonsense characters.

    1.    anonymous said

      Here is the solution: a PC of mine with Gentoo and integrated nvidia video behaves the same with the nouveau driver, so I had no choice but to use the closed nvidia driver; my chip needs the 304.123 driver.

      00:0d.0 VGA compatible controller [0300]: NVIDIA Corporation C61 [GeForce 7025 / nForce 630a] [10de:03d6] (rev a2) (prog-if 00 [VGA controller])

      You have to patch a kernel file before compiling it, if it is not patched the graphics mode will refuse to start.

      The steps are:
      # nano -w /usr/src/linux-3.15.7-gentoo/drivers/acpi/osl.c
      Search with ctrl + w within nano for this text, acpi_os_wait_events_complete and nano takes you to this part:

      void acpi_os_wait_events_complete(void)
      {
              flush_workqueue(kacpid_wq);
              flush_workqueue(kacpi_notify_wq);
      }
      EXPORT_SYMBOL(acpi_os_wait_events_complete);

      The patch you have to add is that last line, the one starting with EXPORT; then save with ctrl+o and exit with ctrl+x.
      Then you compile the kernel, install the modules, install the kernel, generate the initramfs if you need it, add the splash to the initramfs if you use splash, regenerate the GRUB entries and finally, and very importantly, rebuild the modules that are not part of the kernel, such as the proprietary nvidia module. Without this last step the graphical mode will not work.

      # eselect kernel list
      # eselect kernel set x
      # cd /usr/src/linux
      # make
      # make modules_install
      # mount /boot
      # make install
      # dracut --hostonly '' 3.15.7-gentoo --force
      # splash_geninitramfs --verbose --res 1400x1050 --append /boot/initramfs-3.15.7-gentoo.img emerge-world
      # grub-mkconfig -o /boot/grub/grub.cfg
      # emerge @module-rebuild
      # umount /boot
      # shutdown -r now

      If you use genkernel, you just patch that file and, as I understand it, genkernel takes care of the rest by itself.
      In addition, you must remove DRM support and the nvidia and other video chip drivers from the kernel, so that they do not collide head-on with the closed nvidia driver, which is installed as the nvidia module.
      If you use bootsplash, you must include the uvesafb driver in the kernel to support high screen resolutions, since the closed nvidia driver (if I remember correctly) does not support more than 800x600 on the boot terminal tty1 «F1».
      I don't know about other distros, but I suppose this should work on any of them if these steps are followed, replacing emerge with whatever the distro uses.

      These are the guidelines you must follow, for nvidia and uvesa:
      http://wiki.gentoo.org/wiki/NVidia/nvidia-drivers/es
      http://wiki.gentoo.org/wiki/Uvesafb

      1.    yukiteru said

        Thanks for the info, but I solved the problem precisely by switching to the proprietary drivers. I remember that the previous nVidia driver (304.121) also had to be patched when moving to 3.13, because the module had a build problem (there were no errors, but the module refused to work), again because of the ACPI event handler. I hit the problem in Debian and found the solution there too.

        https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=740097

    2.    dayara said

      I used Manjaro as an example, but I already mentioned that the same thing happened to me with Arch and other derivatives. So I believe the problem lies more with them than with those affected.

      PS: I could not reply directly to the relevant message because the reply option does not appear ...

  8.   dayara said

    I just went from Manjaro to Linux Mint because Manjaro would freeze when booting after I updated to some version after 0.8.9 (I can't remember which one). From what I read, this usually happens on laptops. My problem was not the same as the one in this post; I came to the conclusion that it could be related to power management. Some people's laptops did not freeze if they were started while unplugged. Right now I don't remember whether that let me boot without problems every time, but it certainly worked more often, at the cost of taking longer to start.
    Anyway, in the end I gave up and switched to Fedora and Linux Mint.

    1.    c4explosive said

      Coincidentally, yesterday I tried to suspend it without the charger and when resuming it it hung and I had to restart.

  9.   Amaury said

    It's quite funny, I've been with Arch for a few months and I haven't had a single kernel panic! It has happened to me with Antergos (Arch with an added repository) from the live environment, but there I find it more understandable. Could it be a problem with the motherboard or a faulty RAM module? I remember that about 2 years ago a RAM module caused me several blue screens in Windows and also several kernel panics on Mandriva. I had to test the memory modules one at a time, rebooting in between.

    1.    dayara said

      It is an Arch problem (which all its derivatives inherit), because other distros have no problems of that kind. What I find embarrassing is that they still have not solved it after all this time. It has been going on for years! I have read about similar problems from 2011. I am sure it is something that comes and goes as they update, because with versions 0.8.7, 0.8.8 and 0.8.9, without updating them, nothing happens. From then on everything goes to shit, and surely it also happened in older versions. Why does it happen to only a few of us? I don't know, but I don't think the problem is ours; it is Arch's, because, as already said, other distributions work perfectly. I racked my brains back then trying to find a solution, but I got tired. So, as sorry as I am, I'm not going to use Arch.

      1.    yukiteru said

        Arch 0.8.7, 0.8.8 and 0.8.9? News to me that Arch uses that version numbering.

        Could it be that you are using Manjaro?

      2.    yukiteru said

        OK, I'll answer myself after reading your previous comment: Manjaro is one thing and Arch is another.

        Blaming a distro for a particular problem is not really consistent either. In my case, at least, I can't blame every distro I try for the problem with nouveau and my nVidia 6150SE card, because the problem lies in the MMIO handling between the driver and the card (only nVidia knows what fixes and hacks are needed to sort out that detail). Hardware can also be the problem, and you will see that in whatever OS you use (Windows, Linux, BSD). In my experience repairing computers I have seen very strange hardware problems (such as a PC that refuses to boot unless you move the memory to a different slot, and after shutting down you have to repeat the process), and I can't blame Windows or Debian for that.

  10.   raalso7 said

    I had a kernel panic with a live ubuntu 12.04

  11.   Ulysses Bernal Perez said

    My HP Pavilion dm4 Notebook PC is going haywire: 8 GB of RAM, a 500 GB hard drive, more than 5 years of use. I do not remember the speed of the microprocessor, an Intel Core i5; I think more than 2 GHz.
    I can't write anything on the terminal screen. I will keep looking for more information, to solve this problem.