If gdb is installed on the cluster, a stack trace can be obtained from a core file that a Trafodion process produces on unexpected exit.

Core files can be found by using the Linux find command or the cfindcore utility. In many exit situations the core file is placed in the $MY_SQROOT/sql/scripts directory on the node where the failure occurred.

$ ls -ltr core.*
-rw------- 1 trafodion trafodion 1338802176 Sep 16 12:09 core.49256
-rw------- 1 trafodion trafodion 1124069376 Sep 16 12:09 core.48633
-rw------- 1 trafodion trafodion 1069666304 Sep 16 12:09 core.49290
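
The find approach mentioned above could be wrapped as follows. This is a minimal sketch, not a Trafodion utility: the function name find_cores is hypothetical, and it assumes GNU find (for -maxdepth) and that core files follow the core.<pid> naming shown in the listing.

```shell
# Hypothetical helper: list core files in one directory, oldest first.
# With no argument it looks in $MY_SQROOT/sql/scripts, the location named above.
find_cores() {
    dir="${1:-$MY_SQROOT/sql/scripts}"
    # -maxdepth 1 limits the search to that directory (GNU find assumed);
    # -type f skips directories; ls -ltr sorts the matches by modification time.
    find "$dir" -maxdepth 1 -type f -name 'core.*' -exec ls -ltr {} +
}
```

Run find_cores on the node where the failure occurred, or pass another directory as the first argument if the process was started elsewhere.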

Once a core file has been found, the following two commands are sufficient to get a backtrace. The first command, file, identifies the name of the failing Trafodion process. That process name is then passed to gdb together with the core file, and the bt command prints the stack trace. The stack trace helps determine the cause of the failure.

$ file core.49256
core.49256: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'tdm_udrserv SQMON1.1 00004 00004 049256 $Z04157B 00004 000'

$ gdb tdm_udrserv core.49256
#0 0x00007f71d7c73625 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install apr-1.3.9-5.el6_2.x86_64 apr-util-1.3.9-3.el6_0.1.x86_64 cyrus-sasl-lib-2.1.23-15.el6_6.2.x86_64 db4-4.7.25-19.el6_6.x86_64 expat-2.0.1-11.el6_2.x86_64 glibc-2.12-1.166.el6_7.1.x86_64 keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-42.el6.x86_64 libcom_err-1.41.12-22.el6.x86_64 libcurl-7.19.7-46.el6.x86_64 libgcc-4.4.7-16.el6.x86_64 libidn-1.18-2.el6.x86_64 libselinux-2.0.94-5.8.el6.x86_64 libssh2-1.4.2-1.el6_6.1.x86_64 libstdc++-4.4.7-16.el6.x86_64 libuuid-2.17.2-12.18.el6.x86_64 log4cxx-0.10.0-13.el6.x86_64 nspr-4.10.8-1.el6_6.x86_64 nss-3.19.1-3.el6_6.x86_64 nss-softokn-freebl-3.14.3-22.el6_6.x86_64 nss-util-3.19.1-1.el6_6.x86_64 openldap-2.4.40-5.el6.x86_64 openssl-1.0.1e-42.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0 0x00007f71d7c73625 in raise () from /lib64/libc.so.6
#1 0x00007f71d7c74e05 in abort () from /lib64/libc.so.6
#2 0x0000000000424369 in comTFDS (
msg1=0x43c070 "Trafodion UDR Server Internal Error",
msg2=<value optimized out>,
msg3=0x7fff7f5b06b0 "Source file information unavailable",
msg4=0x7fff7f5afeb0 "User routine being processed : TRAFODION.TPCC.PAYMENT, Routine Type : Stored Procedure, Language Type : JAVA, Error occurred outside the user routine code", msg5=0x43ddc3 "", dialOut=<value optimized out>,
writeToSeaLog=1) at ../udrserv/UdrFFDC.cpp:191
#3 0x00000000004245d7 in makeTFDSCall (
msg=0x7f71da31e310 "The Java virtual machine aborted",
file=<value optimized out>, line=<value optimized out>, dialOut=1,
writeToSeaLog=1) at ../udrserv/UdrFFDC.cpp:219
#4 0x00007f71da3046b8 in LmJavaHooks::abortHookJVM ()
at ../langman/LmJavaHooks.cpp:54
#5 0x00007f71d9a9ebc6 in ParallelScavengeHeap::initialize() ()
from /usr/jdk64/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
#6 0x00007f71d9bd1dba in Universe::initialize_heap() ()
from /usr/jdk64/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
#7 0x00007f71d9bd2f89 in universe_init() ()
from /usr/jdk64/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
#8 0x00007f71d98109f5 in init_globals() ()
from /usr/jdk64/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
#9 0x00007f71d9bba8ed in Threads::create_vm(JavaVMInitArgs*, bool*) ()
from /usr/jdk64/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
#10 0x00007f71d9898a34 in JNI_CreateJavaVM ()
from /usr/jdk64/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
#11 0x00007f71da300e51 in LmLanguageManagerJava::initialize (
this=<value optimized out>, result=<value optimized out>,
maxLMJava=<value optimized out>, userOptions=0x7f71c6692418,
diagsArea=<value optimized out>) at ../langman/LmLangManagerJava.cpp:379
#12 0x00007f71da302564 in LmLanguageManagerJava::LmLanguageManagerJava (
this=0x7f71c66975b0, result=@0x7fff7f5b605c,
commandLineMode=<value optimized out>, maxLMJava=1,
userOptions=0x7f71c6692418, diagsArea=0x7f71c666aa48)
at ../langman/LmLangManagerJava.cpp:155
#13 0x0000000000425619 in UdrGlobals::getOrCreateJavaLM (this=0x7f71c6692040,
result=@0x7fff7f5b605c, diags=<value optimized out>)
at ../udrserv/udrglobals.cpp:322
#14 0x0000000000427328 in processALoadMessage (UdrGlob=0x7f71c6692040,
msgStream=..., request=..., env=<value optimized out>)
at ../udrserv/udrload.cpp:163
#15 0x000000000042fbfd in processARequest (UdrGlob=0x7f71c6692040,
msgStream=..., env=...) at ../udrserv/udrserv.cpp:660
#16 0x000000000043269c in runServer (argc=2, argv=0x7fff7f5b63e8)
at ../udrserv/udrserv.cpp:520
#17 0x000000000043294e in main (argc=2, argv=0x7fff7f5b63e8)
at ../udrserv/udrserv.cpp:356
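
The two steps above can also be run without an interactive gdb session, which is convenient when several core files need to be examined. The sketch below is an illustration under stated assumptions: the function names core_program and core_backtrace and the .bt.txt output suffix are hypothetical, the program name is assumed to be the first word inside from '...' in the output of file (as in the example above; other versions of file may format this differently), and the program is assumed to be findable by gdb in the same way as in the interactive command.

```shell
# Hypothetical helper: read the output of `file <core>` on stdin and print
# the program name, taken as the first word inside from '...'.
core_program() {
    sed -n "s/.*from '\([^ ']*\).*/\1/p"
}

# Hypothetical helper: write the backtrace of one core file to <core>.bt.txt.
core_backtrace() {
    core="$1"
    prog=$(file "$core" | core_program)
    if [ -z "$prog" ]; then
        echo "could not determine program name for $core" >&2
        return 1
    fi
    # -batch makes gdb exit after running its commands and turns off paging;
    # -ex bt runs the same bt command that was typed interactively above.
    gdb -batch -ex bt "$prog" "$core" > "$core.bt.txt" 2>&1
}
```

For example, core_backtrace core.49256 would leave the stack trace in core.49256.bt.txt, with no pager prompts in the output.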