

What are Minidumps

Minidumps are small, concise crash dumps written by the Google Breakpad library. They contain the crash reason, up to 8 KB of stack memory per thread, environment information (OS, CPU, ...), and process information (loaded modules). The resulting files are much smaller than traditional core dumps and can be collected over a network much more quickly. Symbol resolution happens on the developer side, so there is no need to ship debug symbols together with the Impala binaries. Breakpad is developed as an open source project and is widely used, e.g. in Chrome, Firefox, and Android. Minidump files written by Impala have names like 7dd118b5-fd6a-9098-79215292-715fcff9.dmp. A more detailed introduction can be found in the Breakpad project documentation.

Enabled by default

Google Breakpad has been integrated into Impala since version 2.6.0 (IMPALA-2686). During a crash, the Impala daemons (catalogd, impalad, statestored) automatically write minidump files; there is nothing to enable. Starting with Impala 2.7.0 (IMPALA-3677), the daemons also write a minidump when they receive a SIGUSR1 signal.

If you have a local cluster running, you can try this by using the kill command:

$ pidof impalad
26547 26466 26401
$ kill -SIGUSR1 26547
$ ls logs/cluster/minidumps/impalad
5282c9d1-892e-3f43-6867e5f3-6134bc8f.dmp
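
The same kill-then-look sequence can be scripted. The following is a minimal sketch, not part of Impala: the helper names list_minidumps and request_minidump, the timeout, and the polling interval are all hypothetical, and the minidump directory must be passed in by the caller.

```python
import os
import signal
import time
from pathlib import Path


def list_minidumps(dump_dir):
    """Return the set of *.dmp file names currently in dump_dir."""
    return {p.name for p in Path(dump_dir).glob("*.dmp")}


def request_minidump(pid, dump_dir, timeout_s=10.0, poll_s=0.2):
    """Send SIGUSR1 to pid and wait for a new .dmp file to appear.

    Returns the sorted list of new file names, or [] if none appeared
    before the (hypothetical) timeout expired.
    """
    before = list_minidumps(dump_dir)
    os.kill(pid, signal.SIGUSR1)  # equivalent to: kill -SIGUSR1 <pid>
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        new = list_minidumps(dump_dir) - before
        if new:
            return sorted(new)
        time.sleep(poll_s)
    return []


# Example call, using the PID and directory from the transcript above:
#   request_minidump(26547, "logs/cluster/minidumps/impalad")
```

Only send the signal to a daemon that handles SIGUSR1 (Impala 2.7.0 or later); for a process without a handler, the default action of SIGUSR1 is to terminate it.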

I found a Minidump, what now?

Minidump files do not contain debug symbols, so you have to resolve the symbols yourself. This takes two steps:

  1. Dump the debug symbols for the binary that wrote the minidump and convert them into Breakpad's internal symbol format.
  2. Resolve the symbols by combining the minidump file and the extracted symbols.
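
As a rough sketch, the two steps map onto two command invocations. The helper names below are hypothetical; the -b and -d flags and the argument order of minidump_stackwalk are the ones used in the examples later on this page, and both tools are assumed to be on the PATH.

```python
import subprocess


def symbol_dump_cmd(build_dir, sym_dir, script="dump_breakpad_symbols.py"):
    """Step 1: argv that converts debug symbols into Breakpad's format."""
    return [script, "-b", str(build_dir), "-d", str(sym_dir)]


def stackwalk_cmd(minidump, sym_dir, tool="minidump_stackwalk"):
    """Step 2: argv that resolves a minidump against the dumped symbols."""
    return [tool, str(minidump), str(sym_dir)]


def resolve(minidump, build_dir, sym_dir, out_path):
    """Run both steps and write the resolved stacks to out_path."""
    subprocess.run(symbol_dump_cmd(build_dir, sym_dir), check=True)
    with open(out_path, "w") as out:
        # The tool's diagnostics go to stderr and are discarded here.
        subprocess.run(stackwalk_cmd(minidump, sym_dir), stdout=out,
                       stderr=subprocess.DEVNULL, check=True)
```

Building the argument lists separately from running them makes it easy to print or log the exact commands before executing anything.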

How to dump symbols in Breakpad's format

The Impala repository contains a script called dump_breakpad_symbols.py, which can be used to convert debug symbols from various sources into Breakpad's own format. The script will write a directory of files, which can then be used by Breakpad's minidump_stackwalk tool to resolve the symbols. Breakpad symbol files are identified by hashes, so you can dump symbols for multiple binaries into the same destination folder.
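
To illustrate why several binaries can share one destination folder: a Breakpad symbol file starts with a line of the form "MODULE <os> <arch> <id> <name>", and minidump_stackwalk conventionally looks for it under <root>/<name>/<id>/<name>.sym. The sketch below computes that path. The function names are hypothetical, and it is an assumption that dump_breakpad_symbols.py produces exactly this layout.

```python
from pathlib import PurePosixPath


def parse_module_line(line):
    """Split the first line of a Breakpad .sym file into its fields."""
    parts = line.strip().split(" ", 4)
    if len(parts) != 5 or parts[0] != "MODULE":
        raise ValueError("not a Breakpad MODULE line: %r" % line)
    _, os_name, arch, module_id, name = parts
    return {"os": os_name, "arch": arch, "id": module_id, "name": name}


def symbol_path(sym_root, module_line):
    """Return where the symbol file is expected below sym_root."""
    info = parse_module_line(module_line)
    return (PurePosixPath(sym_root) / info["name"] / info["id"]
            / (info["name"] + ".sym"))
```

Because the identifier is part of the path, two different builds of impalad end up in different subdirectories and do not overwrite each other.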

Extracting from a private build

If you compiled Impala locally and have access to the build results, you can tell the script where to search for binaries. /tmp/syms is the target directory for the converted symbol files.

$ dump_breakpad_symbols.py -b be/build/latest -d /tmp/syms
INFO:root:Processing binary file: be/build/latest/service/libfesupport.so
INFO:root:Processing binary file: be/build/latest/service/impalad
INFO:root:Processing binary file: be/build/latest/util/libloggingsupport.so
INFO:root:Processing binary file: be/build/latest/scheduling/simple-scheduler-test
INFO:root:Processing binary file: be/build/latest/scheduling/backend-config-test

As you can see, the script picks up all ELF files it can find, including some unit tests. Dumping symbols from the tests is not necessary and can take several minutes. To avoid this, you can list the files to extract explicitly.

$ dump_breakpad_symbols.py -f be/build/latest/service/libfesupport.so be/build/latest/service/impalad be/build/latest/util/libloggingsupport.so -d /tmp/syms
INFO:root:Processing binary file: be/build/latest/service/libfesupport.so
INFO:root:Processing binary file: be/build/latest/service/impalad
INFO:root:Processing binary file: be/build/latest/util/libloggingsupport.so
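
If the list of files is long, it can be generated instead of typed. The following is a minimal sketch under two assumptions: ELF files are recognized by their 4-byte magic number, and unit tests are recognized by a "-test" name suffix, as in the output above. The helper names are hypothetical.

```python
import os

ELF_MAGIC = b"\x7fELF"


def is_elf(path):
    """Return True if the file starts with the ELF magic number."""
    try:
        with open(path, "rb") as f:
            return f.read(4) == ELF_MAGIC
    except OSError:
        return False


def non_test_binaries(build_dir, test_suffix="-test"):
    """Collect ELF files below build_dir, skipping symlinks and tests."""
    found = []
    for root, _dirs, files in os.walk(build_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path) or name.endswith(test_suffix):
                continue
            if is_elf(path):
                found.append(path)
    return sorted(found)
```

The returned paths can then be passed to the script after -f, followed by -d and the target directory.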

Extracting from RPMs

If you obtained a minidump file from a host that had been installed using RPM packages, then you will have to retrieve matching versions of the RPM package that is installed on the host and its corresponding debuginfo package. For example, for release versions of Cloudera's distribution of Impala, you can find these files here: http://archive.cloudera.com/cdh5/redhat/

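If your system was running CentOS 6 and Cloudera's CDH 5.8.0, you would need these two files: impala-2.6.0+cdh5.8.0+0-1.cdh5.8.0.p0.111.el6.x86_64.rpm and impala-debuginfo-2.6.0+cdh5.8.0+0-1.cdh5.8.0.p0.111.el6.x86_64.rpm.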

If you don't know where to start looking, then running strings logs/cluster/minidumps/impalad/5282c9d1-892e-3f43-6867e5f3-6134bc8f.dmp may provide clues.
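
When the strings output is too long to skim, it can be filtered. This sketch imitates strings by extracting runs of printable ASCII characters and keeping only those that contain a keyword. The function names and the default keywords are hypothetical; adjust them to whatever you are looking for.

```python
import re


def printable_strings(data, min_len=4):
    """Return runs of at least min_len printable ASCII bytes, like strings(1)."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, data)]


def version_clues(path, keywords=("impala", ".so")):
    """Return the strings in a minidump that mention any keyword."""
    with open(path, "rb") as f:
        data = f.read()
    return [s for s in printable_strings(data)
            if any(k in s.lower() for k in keywords)]
```

Paths of loaded modules often reveal the install location and, indirectly, the package version.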


Then you can use the script to dump the symbols like so:

$ dump_breakpad_symbols.py -r impala-2.6.0+cdh5.8.0+0-1.cdh5.8.0.p0.111.el6.x86_64.rpm -s impala-debuginfo-2.6.0+cdh5.8.0+0-1.cdh5.8.0.p0.111.el6.x86_64.rpm -d /tmp/syms
INFO:root:Extracting: impala-2.6.0+cdh5.8.0+0-1.cdh5.8.0.p0.111.el6.x86_64.rpm
573966 blocks
INFO:root:Extracting: impala-debuginfo-2.6.0+cdh5.8.0+0-1.cdh5.8.0.p0.111.el6.x86_64.rpm
2590102 blocks
INFO:root:Processing binary file: /tmp/tmpL5sAoP/usr/lib/impala/lib/libstdc++.so.6.0.20
INFO:root:Processing binary file: /tmp/tmpL5sAoP/usr/lib/impala/lib/libkudu_client.so.0.1.0
INFO:root:Processing binary file: /tmp/tmpL5sAoP/usr/lib/impala/lib/libstdc++.so.6
INFO:root:Processing binary file: /tmp/tmpL5sAoP/usr/lib/impala/lib/libkudu_client.so.0
INFO:root:Processing binary file: /tmp/tmpL5sAoP/usr/lib/impala/lib/libgcc_s.so.1
INFO:root:Processing binary file: /tmp/tmpL5sAoP/usr/lib/impala/sbin-retail/libfesupport.so
INFO:root:Processing binary file: /tmp/tmpL5sAoP/usr/lib/impala/sbin-retail/impalad
INFO:root:Processing binary file: /tmp/tmpL5sAoP/usr/lib/impala/sbin-debug/libfesupport.so
INFO:root:Processing binary file: /tmp/tmpL5sAoP/usr/lib/impala/sbin-debug/impalad

As you can see, the script dumps symbols for both the release and the debug binaries, so you can resolve minidump files written by either one.

Extracting from Jenkins build artifacts

If you built Impala on a Jenkins server and deployed that build to a cluster of machines, you can extract Breakpad symbols from the build artifacts of the Jenkins job. On the job page, click "Build Artifacts" → "all files in zip". Then unpack the resulting archive.zip and dump the symbols like so:

$ unzip archive.zip
Archive:  archive.zip
  inflating: archive/repos/Impala/be/build/debug/service/impalad
  inflating: archive/repos/Impala/be/build/debug/service/libService.a
  inflating: archive/repos/Impala/be/build/debug/service/libfesupport.so
  inflating: archive/repos/Impala/be/build/release/service/impalad
  inflating: archive/repos/Impala/be/build/release/service/libService.a
  inflating: archive/repos/Impala/be/build/release/service/libfesupport.so
  inflating: archive/repos/Impala/ext-data-source/api/target/impala-data-source-api-1.0-SNAPSHOT-sources.jar
  inflating: archive/repos/Impala/ext-data-source/api/target/impala-data-source-api-1.0-SNAPSHOT.jar
  inflating: archive/repos/Impala/fe/target/impala-frontend-0.1-SNAPSHOT-tests.jar
  inflating: archive/repos/Impala/fe/target/impala-frontend-0.1-SNAPSHOT.jar
  inflating: archive/repos/Impala/libstdc++.so.6.0.20
  inflating: archive/repos/Impala/llvm-ir/impala-no-sse.bc
  inflating: archive/repos/Impala/llvm-ir/impala-sse.bc
$ dump_breakpad_symbols.py -b archive/repos/Impala/be/build/ -d /tmp/syms
INFO:root:Processing binary file: archive/repos/Impala/be/build/debug/service/libfesupport.so
INFO:root:Processing binary file: archive/repos/Impala/be/build/debug/service/impalad
INFO:root:Processing binary file: archive/repos/Impala/be/build/release/service/libfesupport.so
INFO:root:Processing binary file: archive/repos/Impala/be/build/release/service/impalad

If you know which binary wrote the minidump, you can also limit the selection to either archive/repos/Impala/be/build/debug or archive/repos/Impala/be/build/release.

How to process the minidump file

Once you have extracted the debug symbols into a folder, you can use Breakpad's minidump_stackwalk tool to resolve the symbols. The binary is located in $IMPALA_TOOLCHAIN/breakpad-$IMPALA_BREAKPAD_VERSION/bin, for example /opt/Impala-Toolchain/breakpad-20150612-p1/bin.

$ $IMPALA_TOOLCHAIN/breakpad-$IMPALA_BREAKPAD_VERSION/bin/minidump_stackwalk logs/cluster/minidumps/impalad/5282c9d1-892e-3f43-6867e5f3-6134bc8f.dmp /tmp/syms > /tmp/resolved.txt

This command prints a lot of warnings, which are generally safe to ignore.

The result should look like this:

Operating system: Linux
                  0.0.0 Linux 4.2.0-35-generic #40~14.04.1-Ubuntu SMP Fri Mar 18 16:37:35 UTC 2016 x86_64
CPU: amd64
     family 6 model 60 stepping 3
     1 CPU
GPU: UNKNOWN
Crash reason:  DUMP_REQUESTED
Crash address: 0x2845873
Process uptime: not available
Thread 0 (crashed)
 0  impalad!google_breakpad::ExceptionHandler::WriteMinidump [exception_handler.cc : 650 + 0xd]
    rax = 0x00007f499f5bbd38   rdx = 0x0000000000000000
    rcx = 0x0000000002845852   rbx = 0x0000000000000000
    rsi = 0x0000000000000001   rdi = 0x00007fff6bfe4488
    rbp = 0x00007fff6bfe4aa0   rsp = 0x00007fff6bfe43f0
     r8 = 0x0000000000000000    r9 = 0x00007fff6bfe4238
    r10 = 0x00007fff6bfe46c0   r11 = 0x00000000040f1f20
    r12 = 0x00007fff6bfe4a60   r13 = 0x00000000015bf605
    r14 = 0x0000000000000000   r15 = 0x0000000000000000
    rip = 0x0000000002845873
    Found by: given as instruction pointer in context
 1  impalad!google_breakpad::ExceptionHandler::WriteMinidump [exception_handler.cc : 621 + 0x8]
    rbx = 0x00007f499ef7e660   rbp = 0x00007fff6bfe4aa0
    rsp = 0x00007fff6bfe4a40   r12 = 0x00007fff6bfe4a60
    r13 = 0x00000000015bf605   r14 = 0x0000000000000000
    r15 = 0x0000000000000000   rip = 0x00000000028460cc
    Found by: call frame info
 2  impalad!impala::HandleSignal [minidump.cc : 93 + 0x1e]
    rbx = 0x0000000000000000   rbp = 0x00007fff6bfe4b70
    rsp = 0x00007fff6bfe4b60   r12 = 0x00007fff6bfe51e0
    r13 = 0x000000000b466170   r14 = 0x0000000000000000
    r15 = 0x0000000000000000   rip = 0x00000000015bf725
    Found by: call frame info
 3  libpthread-2.19.so + 0x10340
    rbx = 0x0000000000000000   rbp = 0x00007fff6bfe51d0
    rsp = 0x00007fff6bfe4b80   r12 = 0x00007fff6bfe51e0
    r13 = 0x000000000b466170   r14 = 0x0000000000000000
    r15 = 0x0000000000000000   rip = 0x00007f499f5ca340
    Found by: call frame info
 4  impalad!boost::thread::join_noexcept() + 0x5c
    rbp = 0x00007fff6bfe51f0   rsp = 0x00007fff6bfe51e0
    rip = 0x0000000001a5031c
    Found by: previous frame's frame pointer
 5  impalad!boost::thread::join() [thread.hpp : 767 + 0xc]
    rbx = 0x0000000000000000   rbp = 0x00007fff6bfe52b0
    rsp = 0x00007fff6bfe5240   r12 = 0x0000000000000000
    r13 = 0x00007fff6bfe57a0   r14 = 0x0000000000000000
    r15 = 0x0000000000000000   rip = 0x0000000001329f4f
    Found by: call frame info
 6  impalad!impala::Thread::Join() const [thread.h : 105 + 0x14]
    rbx = 0x0000000000000000   rbp = 0x00007fff6bfe52d0
    rsp = 0x00007fff6bfe52c0   r12 = 0x0000000000000000
    r13 = 0x00007fff6bfe57a0   r14 = 0x0000000000000000
    r15 = 0x0000000000000000   rip = 0x000000000132a5b0
    Found by: call frame info
 7  impalad!impala::ThriftServer::Join() [thrift-server.cc : 443 + 0x18]
    rbx = 0x0000000000000000   rbp = 0x00007fff6bfe5350
    rsp = 0x00007fff6bfe52e0   r12 = 0x0000000000000000
    r13 = 0x00007fff6bfe57a0   r14 = 0x0000000000000000
    r15 = 0x0000000000000000   rip = 0x000000000132945d
    Found by: call frame info
 8  impalad!ImpaladMain(int, char**) [impalad-main.cc : 95 + 0xf]
    rbx = 0x0000000009fd7c20   rbp = 0x00007fff6bfe5660
    rsp = 0x00007fff6bfe5360   r12 = 0x0000000001164ab3
    r13 = 0x00007fff6bfe57a0   r14 = 0x0000000000000000
    r15 = 0x0000000000000000   rip = 0x000000000153d9a1
    Found by: call frame info
 9  impalad!main [daemon-main.cc : 37 + 0x11]
    rbx = 0x0000000000000000   rbp = 0x00007fff6bfe56c0
    rsp = 0x00007fff6bfe5670   r12 = 0x0000000001164ab3
    r13 = 0x00007fff6bfe57a0   r14 = 0x0000000000000000
    r15 = 0x0000000000000000   rip = 0x0000000001164c60
    Found by: call frame info
10  libc-2.19.so + 0x21ec5
    rbx = 0x0000000000000000   rbp = 0x0000000000000000
    rsp = 0x00007fff6bfe56d0   r12 = 0x0000000001164ab3
    r13 = 0x00007fff6bfe57a0   r14 = 0x0000000000000000
    r15 = 0x0000000000000000   rip = 0x00007f499e3b5ec5
    Found by: call frame info
11  impalad!frame_dummy + 0x26
    rsp = 0x00007fff6bfe56f0   rip = 0x0000000001164ba6
    Found by: stack scanning
12  impalad!_GLOBAL__sub_I_symbolize.cc + 0x83
    rsp = 0x00007fff6bfe5708   rip = 0x0000000001164ab3
    Found by: stack scanning
13  0x7fff6bfe57a0
    rbx = 0x0000000000000000   rbp = 0x0000000001164ab3
    rsp = 0x00007fff6bfe5710   r12 = 0x2b795cb845c21408
    rip = 0x00007fff6bfe57a0
    Found by: call frame info
14  impalad!SignalHandler + 0x1d0
    rsp = 0x00007fff6bfe5750   rip = 0x000000000289d190
    Found by: stack scanning
15  impalad!_GLOBAL__sub_I_symbolize.cc + 0x83
    rsp = 0x00007fff6bfe5778   rip = 0x0000000001164ab3
    Found by: stack scanning
16  0x7fff6bfe57a0
    rbx = 0x0000000000000000   rbp = 0x0000000001164ab3
    rsp = 0x00007fff6bfe5780   r12 = 0x0000000000000000
    rip = 0x00007fff6bfe57a0
    Found by: call frame info
17  impalad!_start + 0x29
    rsp = 0x00007fff6bfe5790   rip = 0x0000000001164adc
    Found by: stack scanning
18  0x7fff6bfe5798
    rsp = 0x00007fff6bfe5798   rip = 0x00007fff6bfe5798
    Found by: call frame info
Thread 1
 0  libc-2.19.so + 0xc0f3d
    rax = 0xfffffffffffffdfc   rdx = 0x0000000000000000
    rcx = 0x00007f499e454f3d   rbx = 0x000000000410d240
    rsi = 0x0000000000000000   rdi = 0x00007f499e392b70
    rbp = 0x00007f499e392ba0   rsp = 0x00007f499e392b60
     r8 = 0x0000000000000000    r9 = 0x00000000000007d0
    r10 = 0xffffffffffffbe5f   r11 = 0x0000000000000293
    r12 = 0x0000000000000000   r13 = 0x0000000000000000
    r14 = 0x00007f499e3939c0   r15 = 0x00007f499e393700
    rip = 0x00007f499e454f3d
    Found by: given as instruction pointer in context
 1  impalad!impala::PeriodicCounterUpdater::UpdateLoop() [periodic-counter-updater.cc : 130 + 0x13]
    rbp = 0x00007f499e392d40   rsp = 0x00007f499e392bb0
    rip = 0x00000000015c5665
    Found by: previous frame's frame pointer
 2  impalad!boost::_mfi::mf0<void, impala::PeriodicCounterUpdater>::operator()(impala::PeriodicCounterUpdater*) const [mem_fn_template.hpp : 49 + 0x17]
    rbx = 0x000000000410d240   rbp = 0x00007f499e392d60
    rsp = 0x00007f499e392d50   r12 = 0x0000000000000000
    r13 = 0x0000000000000000   rip = 0x00000000015cff55
    Found by: call frame info
 3  impalad!void boost::_bi::list1<boost::_bi::value<impala::PeriodicCounterUpdater*> >::operator()<boost::_mfi::mf0<void, impala::PeriodicCounterUpdater>, boost::_bi::list0>(boost::_bi::type<void>, boost::_mfi::mf0<void, impala::PeriodicCounterUpdater>&, boost::_bi::list0&, int) [bind.hpp : 253 + 0x32]
    rbx = 0x000000000410d240   rbp = 0x00007f499e392da0
    rsp = 0x00007f499e392d70   r12 = 0x0000000000000000
    r13 = 0x0000000000000000   rip = 0x00000000015cfeb8
    Found by: call frame info
 4  impalad!boost::_bi::bind_t<void, boost::_mfi::mf0<void, impala::PeriodicCounterUpdater>, boost::_bi::list1<boost::_bi::value<impala::PeriodicCounterUpdater*> > >::operator()() [bind_template.hpp : 20 + 0x22]
    rbx = 0x00007f499e393700   rbp = 0x00007f499e392df0
    rsp = 0x00007f499e392db0   r12 = 0x0000000000000000
    r13 = 0x0000000000000000   rip = 0x00000000015cfe63
    Found by: call frame info
 5  impalad!boost::detail::thread_data<boost::_bi::bind_t<void, boost::_mfi::mf0<void, impala::PeriodicCounterUpdater>, boost::_bi::list1<boost::_bi::value<impala::PeriodicCounterUpdater*> > > >::run() [thread.hpp : 116 + 0x12]
    rbx = 0x00007f499e393700   rbp = 0x00007f499e392e10
    rsp = 0x00007f499e392e00   r12 = 0x0000000000000000
    r13 = 0x0000000000000000   rip = 0x00000000015cfe26
    Found by: call frame info
 6  impalad!thread_proxy + 0xda
    rbx = 0x00007f499e393700   rbp = 0x0000000000000000
    rsp = 0x00007f499e392e20   r12 = 0x0000000000000000
    r13 = 0x0000000000000000   rip = 0x0000000001a4f8ea
    Found by: call frame info
 7  libpthread-2.19.so + 0x8182
    rbx = 0x00007f499e393700   rbp = 0x0000000000000000
    rsp = 0x00007f499e392e60   r12 = 0x0000000000000000
    r13 = 0x0000000000000000   rip = 0x00007f499f5c2182
    Found by: call frame info
 8  libc-2.19.so + 0xfa47d
    rsp = 0x00007f499e392f00   rip = 0x00007f499e48e47d
    Found by: stack scanning

...
More thread stacks
...

Loaded modules:
0x00400000 - 0x03d89fff  impalad  ???  (main)
0x7f4908f12000 - 0x7f4908f18fff  gconv-modules.cache  ???
0x7f4962bcf000 - 0x7f4962bdffff  libnio.so  ???
0x7f4975f1e000 - 0x7f4975f33fff  libnet.so  ???
0x7f4976141000 - 0x7f4976141fff  libjaas_unix.so  ???
0x7f497667c000 - 0x7f4976699fff  libhadoop.so.1.0.0  ???  (WARNING: No symbols, libhadoop.so.1.0.0, FFB3438DCC629D98CD5D96D111B144D80)
0x7f497689a000 - 0x7f49768a2fff  libmanagement.so  ???
0x7f4976e36000 - 0x7f4976e36fff  javax.inject-1.jar  ???
0x7f497815f000 - 0x7f4978162fff  libdt_socket.so  ???  (WARNING: No symbols, libdt_socket.so, 170F2B9D9E6B340D72F3C17E79C1F31F0)
0x7f499b06d000 - 0x7f499b086fff  libzip.so  ???
0x7f499b288000 - 0x7f499b292fff  libnss_files-2.19.so  ???
0x7f499b494000 - 0x7f499b49efff  libnss_nis-2.19.so  ???
0x7f499b6a0000 - 0x7f499b6b6fff  libnsl-2.19.so  ???
0x7f499b8ba000 - 0x7f499b8c2fff  libnss_compat-2.19.so  ???
0x7f499bac4000 - 0x7f499c318fff  locale-archive  ???
0x7f499c319000 - 0x7f499c31afff  libnpt.so  ???
0x7f499c51c000 - 0x7f499c554fff  libjdwp.so  ???  (WARNING: No symbols, libjdwp.so, F70DA77F5ABFFBFE75D87684678C59330)
0x7f499c756000 - 0x7f499c77ffff  libjava.so  ???
0x7f499c982000 - 0x7f499c98efff  libverify.so  ???
0x7f499e394000 - 0x7f499e54efff  libc-2.19.so  ???  (WARNING: No symbols, libc-2.19.so, 8B0582530369AA1C9B9996C11061CD160)
0x7f499e759000 - 0x7f499e76efff  libgcc_s.so.1  ???
0x7f499e96f000 - 0x7f499ea73fff  libm-2.19.so  ???
0x7f499ec75000 - 0x7f499ed5ffff  libstdc++.so.6.0.20  ???  (WARNING: No symbols, libstdc++.so.6.0.20, F77273E1036410C5C64984F088D75D6D0)
0x7f499ef7f000 - 0x7f499f131fff  libcrypto.so.1.0.0  ???
0x7f499f35b000 - 0x7f499f3affff  libssl.so.1.0.0  ???
0x7f499f5ba000 - 0x7f499f5d2fff  libpthread-2.19.so  ???  (WARNING: No symbols, libpthread-2.19.so, 860606AF75813607E0524DF17D5A86CB0)
0x7f499f7d8000 - 0x7f499f7dafff  libdl-2.19.so  ???
0x7f499f9dc000 - 0x7f499f9e2fff  librt-2.19.so  ???
0x7f499fbe4000 - 0x7f49a01bffff  libkudu_client.so.0.1.0  ???
0x7f49a03f7000 - 0x7f49a10b5fff  libjvm.so  ???  (WARNING: No symbols, libjvm.so, 3B87355D8E735F001A2469734BAF3A400)
0x7f49a13d7000 - 0x7f49a13effff  libsasl2.so.2.0.25  ???
0x7f49a15f2000 - 0x7f49a15f3fff  libjsig.so  ???
0x7f49a17f5000 - 0x7f49a1817fff  ld-2.19.so  ???  (WARNING: No symbols, ld-2.19.so, 3E6C812AA4BB28E113FBD34B06FBD25B0)
0x7f49a1823000 - 0x7f49a1824fff  aopalliance-1.0.jar  ???
0x7f49a1a05000 - 0x7f49a1a0cfff  26547  ???
0x7fff6bff0000 - 0x7fff6bff1fff  linux-gate.so  ???

Some things to observe:

  • The crashed stack is always the first one. Here the crash reason is DUMP_REQUESTED since we asked for the minidump file by sending a SIGUSR1 to the process. Other reasons could be SIGSEGV, etc.
  • Only the Impala symbols have been resolved. To resolve symbols of system libraries you have to dump them first.
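
Both observations can be read out of the resolved text mechanically. The sketch below assumes the output format shown above; the function names are hypothetical. It extracts the crash reason, the frames of the crashed thread, and the modules for which minidump_stackwalk reported missing symbols.

```python
import re

FRAME_RE = re.compile(r"^\s*(\d+)\s+(\S.*)$")
NO_SYMBOLS_RE = re.compile(r"\(WARNING: No symbols, (\S+), ([0-9A-F]+)\)")


def crash_summary(text):
    """Return crash reason, crash address and the crashed thread's frames."""
    reason = re.search(r"^Crash reason:\s*(.+)$", text, re.M)
    address = re.search(r"^Crash address:\s*(.+)$", text, re.M)
    frames = []
    in_crashed = False
    for line in text.splitlines():
        if line.startswith("Thread ") or line.startswith("Loaded modules:"):
            if in_crashed:
                break  # the crashed thread's section has ended
            in_crashed = line.rstrip().endswith("(crashed)")
            continue
        match = FRAME_RE.match(line) if in_crashed else None
        if match:  # register and "Found by" lines do not start with a digit
            frames.append(match.group(2))
    return {
        "reason": reason.group(1).strip() if reason else None,
        "address": address.group(1).strip() if address else None,
        "frames": frames,
    }


def modules_without_symbols(text):
    """Return (module name, identifier) pairs flagged with 'No symbols'."""
    return NO_SYMBOLS_RE.findall(text)
```

The list returned by modules_without_symbols shows which system libraries would have to be dumped before their frames can be resolved.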

Other tools to use with minidumps

minidump-2-core

Minidumps can be converted into core files, which can be inspected with gdb. To do so, call the minidump-2-core tool; you should then be able to open the core as usual. (TODO: It doesn't seem to work like this.)

$ $IMPALA_TOOLCHAIN/breakpad-$IMPALA_BREAKPAD_VERSION/bin/minidump-2-core logs/cluster/minidumps/impalad/5282c9d1-892e-3f43-6867e5f3-6134bc8f.dmp > core
$ gdb be/build/latest/service/impalad core

 

TODO

  • Describe how to extract symbols for system libraries
  • Figure out how to use minidump-2-core

 
