On Thu, Nov 06, 2014 at 18:34, Jakub Skrzypnik wrote:
> Can you describe in English that can be understand for people other than
> kernel developers, what's were wrong and how you fixed that? I'm a biut
> curious ;)

Sure. Some time ago I changed free() in the kernel to take an extra argument, the size. Eventually this will help us make free() better/faster/etc., but in the meantime the correct size to pass is unknown (free() figures it out on its own, but we're trying to avoid that in the future), so all the calls to free() were changed to specify a size of 0. Since then we've slowly been fixing up the free() calls to provide the correct size.

To make sure we get the size right, free() has some checks that the internally calculated size and the provided size are consistent. Unfortunately, one of those checks was itself incorrect for certain large allocations, and it triggered the panic. It didn't show up before because the size argument was added to that particular free() only a few days ago.

In this case, it was a free() in the code path that handles core dumps, and it had to be a large core to trigger the panic. So basically, you had to run firefox and have it crash. I never saw the panic because I typically run with 'ulimit -c 0', precisely because I don't like firefox littering my home directory with core files.
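
If it helps to see the idea in code, here is a rough userland sketch of a free() that takes a size and checks it against what the allocator recorded. This is not the actual kernel code: toy_malloc(), toy_free() and the hidden header are names and a layout I made up for illustration, and where the kernel would panic this version just returns -1. It also does not try to reproduce the specific incorrect check that caused the panic.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical header stored in front of every allocation.
 * The union pads the header so the memory handed to the caller
 * stays suitably aligned for any type.
 */
union toy_hdr {
	size_t size;		/* size the caller originally asked for */
	max_align_t align;	/* padding only, never read */
};

/* Allocate size bytes and remember the size in the hidden header. */
void *
toy_malloc(size_t size)
{
	union toy_hdr *h;

	if (size > SIZE_MAX - sizeof(*h))
		return NULL;	/* header + size would overflow */
	h = malloc(sizeof(*h) + size);
	if (h == NULL)
		return NULL;
	h->size = size;
	return h + 1;		/* caller sees the memory after the header */
}

/*
 * Free p. A size of 0 means "caller does not know the size yet",
 * so the consistency check is skipped. Returns 0 on success and
 * -1 if the provided size disagrees with the recorded one; in the
 * mismatch case nothing is freed.
 */
int
toy_free(void *p, size_t size)
{
	union toy_hdr *h;

	if (p == NULL)
		return 0;
	h = (union toy_hdr *)p - 1;
	if (size != 0 && size != h->size)
		return -1;	/* stand-in for the kernel panic */
	free(h);
	return 0;
}
```

With this sketch, toy_free(p, 0) always succeeds, which mirrors the transition period where every call passed 0. Once a caller is converted to pass a real size, a wrong value (or a wrong check inside toy_free() itself) shows up as a failure right away.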