Operational Dynamics

Operations and other mysteries

A blog by Andrew Cowie

Thursday, 26 Nov 2009

Getting a core dump

Sometimes things crash. This is the normal order of things, even if we like to pretend that Linux is so much better than its proprietary competitors. When a native library crashes underneath a java-gnome program, however, this isn’t so much fun, because the actual process which crashed is a Java Virtual Machine.

Usually I see crashes because of something I’ve done wrong in binding an underlying GNOME library from java-gnome. So I bisect & printf() my way down until I can find the thing that causes it, read the docs, and hopefully figure it out.

Recently, however, I’ve been getting crashes somewhere deep in libpangoft when the app is first loading or, worse, when it is just sitting there and I’m not doing anything more onerous than moving the cursor around a TextView. This is still likely due to something I’ve done wrong in my code or in the bindings layer, but when it’s not happening deterministically or on demand it’s hard to even begin to analyze the problem.

The OpenJDK HotSpot VM (formerly the Sun HotSpot VM) has a pretty good SIGSEGV handler; it does its best to show you what the C library call was that died, and what the Java and C call stacks were leading up to the crash. You may have seen them around as hs_err_pid10733.log and such [I wish it would just spew that out to stderr instead of troubling to write a file, but anyway].

Strangely, the only line I’m getting when reading the crash report is:

C  [libpangoft2-1.0.so.0+0x17687]

no other stack frames. Which is a bit strange, and decidedly unhelpful. So I’m going to need to try a bit harder to get a backtrace, it seems.

Getting a native C stack trace of a Java program with GDB

Much as I otherwise feel GDB is the most horrendous user interface in the history of civilization (ok, maybe second only to GPG’s command line interface), it does do one thing extraordinarily well and that’s stack backtraces of crashed programs. These days one normally runs one’s program in gdb, induces it to crash, and then runs bt:

$ gdb ./program
(gdb) run
SIGSEGV caught
(gdb) bt

And you get your stack trace.

That’s a bit of a pain with a Java “program” because as mentioned the process running is a Java Virtual Machine (and because invoking java is … almost as bad as GPG’s command line interface):

$ java ... gobbledygook ... package.Class arguments

Usually people put their invocation line in a shell script along with various environment setup and so on:¹

$ ./script arguments

and off they go. The complication is that such a script is not easy to use with GDB, since you need to invoke gdb on the binary executable (the JVM) and then, once GDB is finally up, tell it to “run” that executable with a bunch of arguments. Which means you’re back to the gory mess:

$ gdb java
(gdb) run ... gobbledygook ... package.Class arguments
SIGSEGV caught
(gdb) thread apply all bt

which is incredibly tedious for casual use.
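One way to take some of the tedium out is GDB's `--args` option, which lets you hand the program's arguments to gdb on its own command line. What follows is a minimal sketch of a launcher function built on that, not the script used here: `JAVA_GDB` and `DRY_RUN` are variable names invented for the example, and the caller is assumed to supply the classpath and class.

```shell
#!/bin/sh
# Sketch of a launcher function. JAVA_GDB and DRY_RUN are hypothetical
# variable names made up for this example.

launch() {
    if [ -n "${JAVA_GDB:-}" ]; then
        # With --args, gdb treats everything after the program name as that
        # program's arguments; you still type "run" at the (gdb) prompt.
        set -- gdb --args java "$@"
    else
        set -- java "$@"
    fi

    if [ -n "${DRY_RUN:-}" ]; then
        # Show the command line instead of running it.
        echo "$*"
    else
        exec "$@"
    fi
}
```

If a wrapper script ends by calling `launch` with its usual Java options, then setting `JAVA_GDB=1` when invoking that script should land you at the (gdb) prompt with the arguments already loaded, and setting `DRY_RUN=1` prints the full command line without running anything.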

But the old fashioned 1980s way of using a debugger is to get a “core dump” of memory into a file called core and to run GDB on that. Just set your shell to core dump, then go back to running your program as normal:

$ ulimit -c unlimited
$ ./script arguments
Aborted (core dumped)
$ gdb java core
(gdb) thread apply all bt

and our stack traces will spill out in great gory detail. Hooray!
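If no core file turns up, two things are worth checking on Linux: the core size limit in the shell that launched the program, and the kernel's core_pattern setting, which decides where dumps go. The sketch below assumes a Linux system with /proc mounted; `describe_core_pattern` is a helper name invented for this example.

```shell
#!/bin/sh
# Linux-specific sketch. describe_core_pattern is a hypothetical helper name.

# Classify a value read from /proc/sys/kernel/core_pattern.
describe_core_pattern() {
    case "$1" in
        '|'*) echo "piped to a handler program" ;;
        /*)   echo "written to an absolute path" ;;
        *)    echo "written relative to the working directory of the process" ;;
    esac
}

# Report the current settings when the kernel exposes them.
if [ -r /proc/sys/kernel/core_pattern ]; then
    echo "core size limit: $(ulimit -c)"
    echo "core files are $(describe_core_pattern "$(cat /proc/sys/kernel/core_pattern)")"
fi
```

A limit of 0 means no dump is written at all. A pattern starting with `|` means the kernel hands the dump to another program, so a file called core will not appear next to your script.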

Incidentally, if you want to play with GDB and see what a HotSpot JVM is up to, then you need to induce it to crash; one way to do this is to send it a signal, say SIGSEGV or SIGBUS:²

$ kill -11 10733
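The signal can also be sent by name rather than number, which is easier to read and avoids relying on numbers that differ between platforms (SIGBUS, for one, does not have the same number on Linux and the BSDs). The sketch below uses a background `sleep` as a stand-in for the JVM, purely as an assumption for illustration; a real HotSpot process installs its own handler and so reacts differently from a plain process.

```shell
#!/bin/sh
# Stand-in for a JVM: any long-running process will do for the demonstration.
sleep 30 &
pid=$!

# Same effect as "kill -11", but with the signal given by name.
kill -s SEGV "$pid"

# When a child is killed by a signal, wait reports 128 plus the signal number.
status=0
wait "$pid" || status=$?
echo "exit status: $status"
```

Substituting `BUS` for `SEGV` sends SIGBUS instead, as the footnote suggests.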

Yeay, Open

The real point here is that with Sun having open sourced Java and its HotSpot VM implementation, we can now build Java ourselves and include debugging symbols [on Debian Linux, for example, install package openjdk-6-dbg along with the symbols for the various libraries in the GNOME stack, libgtk2.0-0-dbg and so on]. This means, at long last, we can actually run Java under GDB (something we weren’t able to do when Java was proprietary) and get lovely backtraces when it thunders in.

Yeay for crashes.

AfC

¹ Which is why people put this invocation into a shell script, which makes it even harder to debug because you’ve got to run ps axww or whatever to try and get the full command line used to run the program. {sigh}, but fixing this will have to wait for another day.

² Bernd Eckenfels suggests avoiding SIGSEGV as apparently he believes this is caught in some places and rethrown as NullPointerException. I’ve never observed that, but I thought I’d mention his advice to use SIGBUS instead.

Material on this site copyright © 2002-2009 Operational Dynamics Consulting Pty Ltd, unless otherwise noted. All rights reserved. Not for redistribution or attribution without permission in writing.

We make this service available to our staff in order to promote the discourse of ideas especially as relates to the development of Open Source worldwide. Blog entries on this site, however, are the musings of the authors as individuals and do not represent the views of Operational Dynamics. All times UTC.