This chapter provides information and guidance on some specific procedures for troubleshooting system crashes.
A crash, or fatal error, causes a process to terminate abnormally. There are various possible reasons for a crash. For example, a crash can occur due to a bug in the Java HotSpot VM, in a system library, in a Java SE library or an API, in application native code, or even in the operating system (OS). External factors, such as resource exhaustion in the OS, can also cause a crash.
Crashes caused by bugs in the Java HotSpot VM or in the Java SE library code are rare. This chapter provides suggestions on how to examine a crash and work around some of the issues (if possible) until the cause of the bug is diagnosed and fixed.
In general the first step with any crash is to locate the fatal error log. This is a text file that the Java HotSpot VM generates in the event of a crash. See "Fatal Error Log" for an explanation of how to locate this file, as well as a detailed description of the file.
This section provides a number of examples which demonstrate how the error log can be used to find the cause of the crash, and suggests some tips for troubleshooting the problem depending on the cause.
The error log header indicates the type of error and the problematic frame, while the thread stack indicates the current thread and stack trace. See "Header Format".
If the fatal error log indicates the problematic frame to be a native library, there might be a bug in native code or the Java Native Interface (JNI) library code. The crash could of course be caused by something else, but analysis of the library and any core file or crash dump is a good starting place. For example, consider the following extract from the header of a fatal error log:
# An unexpected error has been detected by HotSpot Virtual Machine:
#
# SIGSEGV (0xb) at pc=0x417789d7, pid=21139, tid=1024
#
# Java VM: Java HotSpot(TM) Server VM (6-beta2-b63 mixed mode)
# Problematic frame:
# C [libApplication.so+0x9d7]
In this case, a SIGSEGV occurred in a thread executing in the library libApplication.so.
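As a rough illustration of reading the header, the sketch below pulls the error type and the problematic frame out of header text such as the extract above. The class name CrashHeader and its regular expressions are assumptions made for this example. They are based only on the header layout shown in this chapter, not on a documented log grammar, so other VM releases might need different patterns.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch (hypothetical helper): read the error type and problematic frame from a log header. */
final class CrashHeader {
    // e.g. "# SIGSEGV (0xb) at pc=0x417789d7, pid=21139, tid=1024"
    private static final Pattern ERROR_LINE = Pattern.compile(
        "^#[ \\t]+(\\S+)[ \\t]+\\(0x[0-9a-fA-F]+\\)[ \\t]+at pc=", Pattern.MULTILINE);
    // e.g. "# C [libApplication.so+0x9d7]": a one-letter frame type, then the frame itself
    private static final Pattern FRAME_LINE = Pattern.compile(
        "^#[ \\t]+([CVvJj])[ \\t]+(\\S.*)$", Pattern.MULTILINE);

    static String errorType(String header) {
        Matcher m = ERROR_LINE.matcher(header);
        return m.find() ? m.group(1) : null;
    }

    static String frameType(String header) {
        Matcher m = FRAME_LINE.matcher(header);
        return m.find() ? m.group(1) : null;
    }

    static String frame(String header) {
        Matcher m = FRAME_LINE.matcher(header);
        return m.find() ? m.group(2).trim() : null;
    }

    /** Frame type legend, as printed on the "Native frames" line of the log. */
    static String describe(String frameType) {
        switch (frameType) {
            case "C": return "native code";
            case "V": case "v": return "VM code";
            case "J": return "compiled Java code";
            case "j": return "interpreted";
            default:  return "unknown";
        }
    }

    public static void main(String[] args) {
        String header = String.join("\n",
            "# SIGSEGV (0xb) at pc=0x417789d7, pid=21139, tid=1024",
            "#",
            "# Java VM: Java HotSpot(TM) Server VM (6-beta2-b63 mixed mode)",
            "# Problematic frame:",
            "# C [libApplication.so+0x9d7]");
        System.out.println(errorType(header) + " in " + describe(frameType(header))
            + ": " + frame(header));
    }
}
```

A frame type of C points toward the native library named in the frame, while V or J suggests following the VM or compiled-code guidance later in this chapter.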
In some cases a bug in a native library manifests itself as a crash in Java VM code. Consider the following crash where a JavaThread fails while in the _thread_in_vm state (meaning that it is executing in Java VM code):
# An unexpected error has been detected by HotSpot Virtual Machine:
#
# EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x08083d77, pid=3700, tid=2896
#
# Java VM: Java HotSpot(TM) Client VM (1.5-internal mixed mode)
# Problematic frame:
# V [jvm.dll+0x83d77]
--------------- T H R E A D ---------------
Current thread (0x00036960): JavaThread "main" [_thread_in_vm, id=2896]
:
Stack: [0x00040000,0x00080000), sp=0x0007f9f8, free space=254k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [jvm.dll+0x83d77]
C [App.dll+0x1047] <========= C/native frame
j Test.foo()V+0
j Test.main([Ljava/lang/String;)V+0
v ~StubRoutines::call_stub
V [jvm.dll+0x80f13]
V [jvm.dll+0xd3842]
V [jvm.dll+0x80de4]
V [jvm.dll+0x87cd2]
C [java.exe+0x14c0]
C [java.exe+0x64cd]
C [kernel32.dll+0x214c7]
:
In this case, although the problematic frame is a VM frame, the thread stack shows that a native routine in App.dll has called into the VM (probably with JNI).
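The same inspection can be sketched in code: walk the native frames from the top of the stack and report the first frame marked C. The class and method names below are hypothetical, and the frame syntax is assumed to match the listing above. The result is only a starting point, because the first native frame is not necessarily the faulty one.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch (hypothetical helper): find the first C/native frame beneath a VM frame. */
final class NativeFrameFinder {
    /** Returns the first frame marked C (native code), or null if there is none. */
    static String firstNativeFrame(List<String> frames) {
        for (String frame : frames) {
            String f = frame.trim();
            if (f.startsWith("C ") || f.startsWith("C\t")) {
                return f;
            }
        }
        return null;
    }

    /** Extracts "App.dll" from "C [App.dll+0x1047]"; returns null if the shape differs. */
    static String libraryOf(String frame) {
        int open = frame.indexOf('[');
        int plus = frame.indexOf('+', open + 1);
        return (open >= 0 && plus > open) ? frame.substring(open + 1, plus) : null;
    }

    public static void main(String[] args) {
        List<String> frames = Arrays.asList(
            "V [jvm.dll+0x83d77]",
            "C [App.dll+0x1047]",
            "j Test.foo()V+0",
            "j Test.main([Ljava/lang/String;)V+0");
        String nativeFrame = firstNativeFrame(frames);
        System.out.println("First native frame is in " + libraryOf(nativeFrame));
    }
}
```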
The first step to solving a crash in a native library is to investigate the source of the native library where the crash occurred.
If the native library is provided by your application, then investigate the source code of your native library. A significant number of issues with JNI code can be identified by running the application with the -Xcheck:jni option added to the command line. See "The -Xcheck:jni Option".
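When the command line is assembled by scripts or a launcher, it can be unclear whether the option actually reached the VM. The following sketch checks the arguments of the running VM through the standard RuntimeMXBean.getInputArguments() method. The class name JniCheckStatus and the helper hasOption are illustrative names only.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Sketch (hypothetical helper): report whether this VM was started with -Xcheck:jni. */
final class JniCheckStatus {
    /** True if the exact option string appears among the given VM arguments. */
    static boolean hasOption(List<String> jvmArgs, String option) {
        return jvmArgs.contains(option);
    }

    static boolean isJniCheckingEnabled() {
        // Input arguments are the options passed to the VM, not the arguments of main().
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        return hasOption(jvmArgs, "-Xcheck:jni");
    }

    public static void main(String[] args) {
        System.out.println("-Xcheck:jni present: " + isJniCheckingEnabled());
    }
}
```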
If the native library has been provided by another vendor and is used by your application, then file a bug report with that vendor against the third-party library and provide the fatal error log information.
If the native library where the crash occurred is part of the Java Runtime Environment (JRE) (for example awt.dll, net.dll, and so forth), then it is possible that you have encountered a library or API bug. If so, gather as much data as possible and submit a bug report, indicating the library name. You can find JRE libraries in the jre/lib or jre/bin directories of the JRE distribution. See "Submitting Bug Reports".
You can troubleshoot a crash in a native application library by attaching the native debugger to the core file or crash dump, if it is available. Depending on the OS, the native debugger is dbx, gdb, or windbg. See "Native Operating System Tools".
If the fatal error log indicates that the crash occurred in compiled code, then it is possible that you have encountered a compiler bug that has resulted in incorrect code generation. You can recognize a crash in compiled code if the type of the problematic frame is J (meaning a compiled Java frame). Below is an example of such a crash:
# An unexpected error has been detected by HotSpot Virtual Machine:
#
# SIGSEGV (0xb) at pc=0x0000002a99eb0c10, pid=6106, tid=278546
#
# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.6.0-beta-b51 mixed mode)
# Problematic frame:
# J org.foobar.Scanner.body()V
#
:
Stack: [0x0000002aea560000,0x0000002aea660000), sp=0x0000002aea65ddf0, free space=1015k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
J org.foobar.Scanner.body()V
[error occurred during error reporting, step 120, id 0xb]
Note that a complete thread stack is not available. The output line "error occurred during error reporting" means that a problem arose trying to obtain the stack trace (this might indicate stack corruption).
It might be possible to temporarily work around the issue by switching the compiler (for example, by using the HotSpot Client VM instead of the HotSpot Server VM, or vice versa) or by excluding from compilation the method that provoked the crash. In this specific example it might not be possible to switch the compiler, because the log was taken from a 64-bit Server VM and it might not be feasible to switch to a 32-bit Client VM.
For more information on possible workarounds, see "Working Around Crashes in the HotSpot Compiler Thread or Compiled Code".
If the fatal error log output shows that the current thread is a JavaThread named CompilerThread0, CompilerThread1, or AdapterCompiler, then it is possible that you have encountered a compiler bug. In this case it might be necessary to temporarily work around the issue by switching the compiler (for example, by using the HotSpot Client VM instead of the HotSpot Server VM, or vice versa), or by excluding from compilation the method that provoked the crash.
For more information on possible workarounds, see "Working Around Crashes in the HotSpot Compiler Thread or Compiled Code".
If the fatal error log output shows that the current thread is VMThread, then look for the line containing VM_Operation in the THREAD section. VMThread is a special thread in the HotSpot VM. It performs special tasks in the VM such as garbage collection (GC). If the VM_Operation suggests that the operation is a GC, then it is possible that you have encountered an issue such as heap corruption.
Besides a GC issue, it could equally be something else (such as a compiler or runtime bug) that leaves object references in the heap in an inconsistent or incorrect state. In this case, collect as much information as possible about the environment and try possible workarounds. If the issue is related to GC, you might be able to temporarily work around the issue by changing the GC configuration.
For more information on possible workarounds, see "Working Around Crashes During Garbage Collection".
A stack overflow in Java language code will normally result in the offending thread throwing the java.lang.StackOverflowError exception. On the other hand, C and C++ code can write past the end of the stack and provoke a stack overflow. This is a fatal error that causes the process to terminate.
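The Java half of this contrast is easy to observe. The minimal sketch below recurses without bound, catches the resulting java.lang.StackOverflowError, and continues running. The class name is illustrative, and the depth reached depends on the platform, the stack size, and the VM. An equivalent unbounded recursion in native code is not caught this way and would instead produce the fatal error described above.

```java
/** Sketch: a stack overflow in Java code surfaces as a catchable StackOverflowError. */
final class JavaStackOverflowDemo {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse(); // unbounded recursion, overflows the thread stack eventually
    }

    /** Returns how many calls were made before the VM threw StackOverflowError. */
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // The thread survives: the VM detected the overflow and unwound the stack.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Recursion depth before StackOverflowError: " + measureDepth());
        System.out.println("The process is still running.");
    }
}
```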
In the HotSpot implementation, Java methods share stack frames with C/C++ native code, namely user native code and the virtual machine itself. Java methods generate code that checks whether stack space is available a fixed distance towards the end of the stack so that the native code can be called without exceeding the stack space. This distance towards the end of the stack is called "Shadow Pages". The size of the shadow pages is between 3 and 20 pages, depending on the platform. This distance is tunable, so that applications with native code needing more than the default distance can increase the shadow page size. The option to increase shadow pages is -XX:StackShadowPages=n, where n is greater than the default stack shadow pages for the platform.
If your application gets a segmentation fault without a core file or fatal error log file (see Appendix C, Fatal Error Log), or a STACK_OVERFLOW_ERROR on Windows, or the message "An irrecoverable stack overflow has occurred", this indicates that the value of StackShadowPages was exceeded and more space is needed.
If you increase the value of StackShadowPages, you might also need to increase the default thread stack size using the -Xss parameter. Increasing the default thread stack size might decrease the number of threads that can be created, so be careful in choosing a value for the thread stack size. The thread stack size varies by platform from 256 KB to 1024 KB.
The following is a fragment from a fatal error log on a Windows system, where a thread has provoked a stack overflow in native code:
# An unexpected error has been detected by HotSpot Virtual Machine:
#
# EXCEPTION_STACK_OVERFLOW (0xc00000fd) at pc=0x10001011, pid=296, tid=2940
#
# Java VM: Java HotSpot(TM) Client VM (1.6-internal mixed mode, sharing)
# Problematic frame:
# C [App.dll+0x1011]
#
--------------- T H R E A D ---------------
Current thread (0x000367c0): JavaThread "main" [_thread_in_native, id=2940]
:
Stack: [0x00040000,0x00080000), sp=0x00041000, free space=4k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [App.dll+0x1011]
C [App.dll+0x1020]
C [App.dll+0x1020]
:
C [App.dll+0x1020]
C [App.dll+0x1020]
...<more frames>...
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j Test.foo()V+0
j Test.main([Ljava/lang/String;)V+0
v ~StubRoutines::call_stub
Note the following information in the above output:
The exception is EXCEPTION_STACK_OVERFLOW.
The thread state is _thread_in_native, which means that the thread is executing native or JNI code.
In the stack information, the free space is only 4 KB (a single page on a Windows system). In addition, the stack pointer (sp) is at 0x00041000, which is close to the end of the stack at 0x00040000.
The printout of the native frames shows that a recursive native function is the issue in this case. The output notation ...<more frames>... indicates that additional frames exist but were not printed. The output is limited to 100 frames.
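The free space figure can be checked by hand from the Stack line: the stack grows down toward the lower bound, so the free space is the stack pointer minus the lower bound, here 0x00041000 - 0x00040000 = 0x1000 = 4096 bytes, or 4 KB. The following sketch performs the same arithmetic. The class name and regular expression are assumptions based on the Stack line format shown in this chapter.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch (hypothetical helper): recompute free stack space from a "Stack:" log line. */
final class StackUsage {
    // e.g. "Stack: [0x00040000,0x00080000), sp=0x00041000, free space=4k"
    private static final Pattern STACK_LINE = Pattern.compile(
        "Stack: \\[(0x[0-9a-fA-F]+),(0x[0-9a-fA-F]+)\\),\\s*sp=(0x[0-9a-fA-F]+)");

    private static Matcher match(String line) {
        Matcher m = STACK_LINE.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("not a Stack line: " + line);
        }
        return m;
    }

    /** Bytes between the stack pointer and the low end of the stack. */
    static long freeBytes(String line) {
        Matcher m = match(line);
        return Long.decode(m.group(3)) - Long.decode(m.group(1));
    }

    /** Total size of the thread stack in bytes. */
    static long totalBytes(String line) {
        Matcher m = match(line);
        return Long.decode(m.group(2)) - Long.decode(m.group(1));
    }

    public static void main(String[] args) {
        String line = "Stack: [0x00040000,0x00080000), sp=0x00041000, free space=4k";
        System.out.println(freeBytes(line) / 1024 + " KB free of "
            + totalBytes(line) / 1024 + " KB");
    }
}
```

A free space value of only one or two pages, combined with a long run of identical native frames, is the pattern to look for.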
If a crash occurs with a critical application, and the crash appears to be caused by a bug in the HotSpot VM, then it might be desirable to quickly find a temporary workaround. The purpose of this section is to suggest some possible workarounds. If the crash occurs with an application that is deployed with the most recent release of the JDK, then the crash should always be reported to Oracle.
WARNING: Even if a workaround in this section successfully eliminates a crash, the workaround is not a fix for the problem, but merely a temporary solution. Submit a support call or bug report with the original configuration that demonstrated the issue.
If the fatal error log indicates that the crash occurred in a compiler thread, then it is possible (but not always the case) that you have encountered a compiler bug. Similarly, if the crash is in compiled code then it is possible that the compiler has generated incorrect code.
In the case of the HotSpot Client VM (-client option), the compiler thread appears in the error log as CompilerThread0. With the HotSpot Server VM there are multiple compiler threads and these appear in the error log file as CompilerThread0, CompilerThread1, and AdapterThread.
Below is a fragment of an error log for a compiler bug that was encountered and fixed during the development of J2SE 5.0. The log file shows that the HotSpot Server VM is used and the crash occurred in CompilerThread1. In addition, the log file shows that the current CompileTask was the compilation of the java.lang.Thread.setPriority method.
# An unexpected error has been detected by HotSpot Virtual Machine:
#
:
# Java VM: Java HotSpot(TM) Server VM (1.5-internal-debug mixed mode)
:
--------------- T H R E A D ---------------
Current thread (0x001e9350): JavaThread "CompilerThread1" daemon [_thread_in_vm, id=20]
Stack: [0xb2500000,0xb2580000), sp=0xb257e500, free space=505k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0xc3b13c]
:
Current CompileTask:
opto: 11 java.lang.Thread.setPriority(I)V (53 bytes)
--------------- P R O C E S S ---------------
Java Threads: ( => current thread )
0x00229930 JavaThread "Low Memory Detector" daemon [_thread_blocked, id=21]
=>0x001e9350 JavaThread "CompilerThread1" daemon [_thread_in_vm, id=20]
:
In this case there are two potential workarounds:
The brute force approach: change the configuration so that the application is run with the -client option to specify the HotSpot Client VM.
The subtle approach: assume that the bug only occurs during the compilation of the java.lang.Thread.setPriority method and exclude this method from compilation.
The first approach (to use the -client option) might be trivial to configure in some environments. In others, it might be more difficult if the configuration is complex or if the command line to configure the VM is not readily accessible. In general, switching from the HotSpot Server VM to the HotSpot Client VM also reduces the peak performance of an application. Depending on the environment, this might be acceptable until the actual issue is diagnosed and fixed.
The second approach (exclude the method from compilation) requires creating the file .hotspot_compiler in the working directory of the application. Below is an example of this file:
exclude java/lang/Thread setPriority
In general, the format of this file is exclude class method, where class is the class (fully qualified with the package name) and method is the name of the method. Constructor methods are specified as <init> and static initializers are specified as <clinit>.
Note: The .hotspot_compiler file is an unsupported interface. It is documented here solely for the purposes of troubleshooting and finding a temporary workaround.
Once the application is restarted, the compiler will not attempt to compile any of the methods excluded in the .hotspot_compiler file. In some cases this can provide temporary relief until the root cause of the crash is diagnosed and the bug is fixed.
To verify that the HotSpot VM correctly located and processed the .hotspot_compiler file that is shown in the example above, look for the following log information at runtime. Note that the package name separator in this output is a dot, not a slash.
### Excluding compile: java.lang.Thread::setPriority
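The two notations are easy to confuse: the .hotspot_compiler entry uses slashes in the class name, while the confirmation message uses dots and a double colon. The sketch below derives both strings from a class name and a method name. It is a hypothetical helper written for this example and reflects only the formats shown in this section.

```java
/** Sketch (hypothetical helper): build a .hotspot_compiler entry and the log line that confirms it. */
final class CompilerExclude {
    /** Entry for the .hotspot_compiler file, for example "exclude java/lang/Thread setPriority". */
    static String excludeLine(String className, String methodName) {
        return "exclude " + className.replace('.', '/') + " " + methodName;
    }

    /** Message to look for at runtime once the VM has processed the entry. */
    static String expectedLogLine(String className, String methodName) {
        return "### Excluding compile: " + className + "::" + methodName;
    }

    public static void main(String[] args) {
        System.out.println(excludeLine("java.lang.Thread", "setPriority"));
        System.out.println(expectedLogLine("java.lang.Thread", "setPriority"));
    }
}
```

For a constructor or static initializer, pass <init> or <clinit> as the method name, following the file format described earlier.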
If a crash occurs during garbage collection (GC), then the fatal error log reports that a VM_Operation is in progress. For the purposes of this discussion, assume that the mostly concurrent GC (-XX:+UseConcMarkSweepGC) is not in use. The VM_Operation is shown in the THREAD section of the log and indicates one of the following situations:
Generation collection for allocation
Full generation collection
Parallel gc failed allocation
Parallel gc failed permanent allocation
Parallel gc system gc
Most likely the current thread reported in the log is the VMThread. This is the special thread used to execute special tasks in the HotSpot VM. The following fragment of the fatal error log shows an example of a crash in the serial garbage collector:
--------------- T H R E A D ---------------
Current thread (0x002cb720): VMThread [id=3252]
siginfo: ExceptionCode=0xc0000005, reading address 0x00000000
Registers:
EAX=0x0000000a, EBX=0x00000001, ECX=0x00289530, EDX=0x00000000
ESP=0x02aefc2c, EBP=0x02aefc44, ESI=0x00289530, EDI=0x00289530
EIP=0x0806d17a, EFLAGS=0x00010246
Top of Stack: (sp=0x02aefc2c)
0x02aefc2c: 00289530 081641e8 00000001 0806e4b8
0x02aefc3c: 00000001 00000000 02aefc9c 0806e4c5
0x02aefc4c: 081641e8 081641c8 00000001 00289530
0x02aefc5c: 00000000 00000000 00000001 00000001
0x02aefc6c: 00000000 00000000 00000000 08072a9e
0x02aefc7c: 00000000 00000000 00000000 00035378
0x02aefc8c: 00035378 00280d88 00280d88 147fee00
0x02aefc9c: 02aefce8 0806e0f5 00000001 00289530
Instructions: (pc=0x0806d17a)
0x0806d16a: 15 08 83 3d c0 be 15 08 05 53 56 57 8b f1 75 0f
0x0806d17a: 0f be 05 00 00 00 00 83 c0 05 a3 c0 be 15 08 8b
Stack: [0x02ab0000,0x02af0000), sp=0x02aefc2c, free space=255k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [jvm.dll+0x6d17a]
V [jvm.dll+0x6e4c5]
V [jvm.dll+0x6e0f5]
V [jvm.dll+0x71771]
V [jvm.dll+0xfd1d3]
V [jvm.dll+0x6cd99]
V [jvm.dll+0x504bf]
V [jvm.dll+0x6cf4b]
V [jvm.dll+0x1175d5]
V [jvm.dll+0x1170a0]
V [jvm.dll+0x11728f]
V [jvm.dll+0x116fd5]
C [MSVCRT.dll+0x27fb8]
C [kernel32.dll+0x1d33b]
VM_Operation (0x0373f71c): generation collection for allocation, mode: safepoint, requested by thread 0x02db7108
Note: A crash during garbage collection does not imply a bug in the garbage collection implementation. It could also indicate a compiler or runtime bug, or some other issue.
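When triaging many logs, the check for a GC-related VM_Operation can be scripted. The sketch below extracts the operation description from a VM_Operation line and compares it, ignoring case, with the five descriptions listed in this section. The class name and the regular expression are assumptions based on the single sample line shown above, and a match indicates only that a collection was in progress, not that the collector is at fault.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch (hypothetical helper): decide whether a VM_Operation line names a listed GC operation. */
final class VmOperationLine {
    // Descriptions taken from the list in this section, lower-cased for comparison.
    private static final List<String> GC_OPERATIONS = Arrays.asList(
        "generation collection for allocation",
        "full generation collection",
        "parallel gc failed allocation",
        "parallel gc failed permanent allocation",
        "parallel gc system gc");

    // e.g. "VM_Operation (0x0373f71c): generation collection for allocation, mode: safepoint, ..."
    private static final Pattern LINE = Pattern.compile(
        "VM_Operation \\(0x[0-9a-fA-F]+\\):\\s*([^,]+),");

    /** Returns the operation description, or null if the line has a different shape. */
    static String operationName(String line) {
        Matcher m = LINE.matcher(line);
        return m.find() ? m.group(1).trim() : null;
    }

    static boolean isListedGcOperation(String line) {
        String name = operationName(line);
        return name != null && GC_OPERATIONS.contains(name.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String line = "VM_Operation (0x0373f71c): generation collection for allocation, "
            + "mode: safepoint, requested by thread 0x02db7108";
        System.out.println(operationName(line) + " -> GC operation: " + isListedGcOperation(line));
    }
}
```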
You can try the following workarounds if you get a repeated crash during garbage collection:
Switch GC configuration. For example, if you are using the serial collector, try the throughput collector, or vice versa.
If you are using the HotSpot Server VM, try the HotSpot Client VM.
If you are not sure which garbage collector is in use, you can use the jmap utility on Solaris OS and Linux (see "The jmap Utility") to obtain the heap information from the core file, if the core file is available. In general, if the GC configuration is not specified on the command line, then the serial collector will be used on Windows. On Solaris OS and Linux it depends on the machine configuration. If the machine has at least 2 GB of memory and has at least 2 CPUs, then the throughput collector (Parallel GC) will be used. For smaller machines the serial collector is the default. The option to select the serial collector is -XX:+UseSerialGC and the option to select the throughput collector is -XX:+UseParallelGC. If, as a workaround, you switch from the throughput collector to the serial collector, then you might experience some performance degradation on multi-processor systems. This might be acceptable until the root issue is diagnosed and resolved.
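If no core file is available but the application can be restarted with the same options, the collectors in use can also be listed from inside the process. The sketch below uses the standard java.lang.management API. The class name is illustrative, and the collector names returned vary by VM release and GC configuration, so treat the output as a hint and not as a replacement for jmap on a core file.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Sketch (hypothetical helper): list the garbage collectors active in the running VM. */
final class GcInfo {
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // The CPU count is one of the inputs to the default collector choice described above.
        System.out.println("Available processors: " + Runtime.getRuntime().availableProcessors());
        System.out.println("Collectors: " + collectorNames());
    }
}
```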
Class data sharing (CDS) was a new feature in J2SE 5.0. When the JRE is installed on 32-bit platforms using the Sun-provided installer, the installer loads a set of classes from the system JAR file into a private internal representation and dumps that representation to a file called a shared archive. When the VM is started, the shared archive is memory-mapped in. This saves on class loading and allows much of the metadata associated with the classes to be shared across multiple VM instances. In J2SE 5.0, CDS is enabled only when the HotSpot Client VM is used. In addition, sharing is supported only with the serial garbage collector.
The fatal error log prints the version string in the header of the log. If sharing is enabled, it is indicated by the text "sharing", as shown in the following example:
# An unexpected error has been detected by HotSpot Virtual Machine: # # EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x08083d77, pid=3572, tid=784 # # Java VM: Java HotSpot(TM) Client VM (1.5-internal mixed mode, sharing) # Problematic frame: # V [jvm.dll+0x83d77]
CDS can be disabled by providing the -Xshare:off option on the command line. If the crash only occurs with sharing enabled, then it is possible that you have encountered a bug in this feature. In that case gather as much information as possible and submit a bug report.
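To confirm whether a test run actually had sharing enabled or disabled, the mode text can be read from the running VM. The sketch below assumes that the java.vm.info system property carries the same mode text that appears in the log's version string (for example "mixed mode, sharing"). The class name is illustrative.

```java
/** Sketch (hypothetical helper): check whether the VM mode text reports class data sharing. */
final class SharingCheck {
    /** True if the mode text, such as "mixed mode, sharing", mentions sharing. */
    static boolean reportsSharing(String vmInfo) {
        return vmInfo != null && vmInfo.contains("sharing");
    }

    public static void main(String[] args) {
        String vmInfo = System.getProperty("java.vm.info");
        System.out.println("java.vm.info = " + vmInfo);
        System.out.println("sharing reported: " + reportsSharing(vmInfo));
    }
}
```

Running the application once with and once without -Xshare:off and comparing the reported mode helps to establish whether the crash depends on sharing.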
The JDK 7 software is built on Windows using Microsoft Visual Studio 2010 Professional for both 32-bit and 64-bit platforms. If you experience a crash with a Java application and if you have native or JNI libraries that are compiled with a different release of the compiler, then you must consider compatibility issues between the runtimes. Specifically, your environment is supported only if you follow the Microsoft guidelines when dealing with multiple runtimes. For example, if you allocate memory using one runtime, then you must release it using the same runtime. Unpredictable behavior or crashes can arise if you release a resource using a different library than the one that allocated the resource.