Bug #4865

some osmo projects don't generate coredump upon SIGABRT

Added by pespin 6 months ago. Updated 5 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Target version:
-
Start date:
11/20/2020
Due date:
% Done:

100%

Spec Reference:

Description

See for instance osmo-bsc:

static void signal_handler(int signal)
{
    fprintf(stdout, "signal %u received\n", signal);

    switch (signal) {
    case SIGINT:
    case SIGTERM:
        bsc_shutdown_net(bsc_gsmnet);
        osmo_signal_dispatch(SS_L_GLOBAL, S_L_GLOBAL_SHUTDOWN, NULL);
        sleep(3);
        exit(0);
        break;
    case SIGABRT:
        /* in case of abort, we want to obtain a talloc report
         * and then return to the caller, who will abort the process */
    case SIGUSR1:
        talloc_report(tall_vty_ctx, stderr);
        talloc_report_full(tall_bsc_ctx, stderr);
        break;
    default:
        break;
    }
}

    signal(SIGINT, &signal_handler);
    signal(SIGTERM, &signal_handler);
    signal(SIGABRT, &signal_handler);
    signal(SIGUSR1, &signal_handler);
    signal(SIGUSR2, &signal_handler);

So when another process sends SIGABRT to it (killall -ABRT osmo-bsc), it simply prints the talloc contexts and continues executing normally, and no coredump is generated. That's definitely not what we want here. Upon SIGABRT, we want the process to produce a coredump and exit.

In order to do so, after calling talloc_report we have to call the default SIGABRT handler.

That can be done by keeping a reference to the previous signal handler at startup, when we override the signal:

sighandler_t default_sigabrt; /* global variable */
default_sigabrt = signal(SIGABRT, &signal_handler);

Then in our signal_handler:


static void signal_handler(int signal)
{
...
    case SIGABRT:
        talloc_report(tall_vty_ctx, stderr);
        talloc_report_full(tall_bsc_ctx, stderr);
        default_sigabrt(signal);
        break;
...
}
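
A caveat worth checking before relying on the saved handler: signal() returns the previous disposition, and for a signal that was never overridden that is the sentinel value SIG_DFL, not the address of a callable function. The sketch below shows one way to inspect the saved value. The helper names are hypothetical, and SIGUSR2 is used only to keep the demonstration harmless.

```c
#include <signal.h>
#include <stdbool.h>

/* Hypothetical no-op handler, used only so that signal() has something to install. */
static void noop_handler(int signum)
{
    (void)signum;
}

/* Installs a handler over a default disposition and reports whether the
 * value handed back by signal() is the SIG_DFL sentinel. */
static bool saved_disposition_is_sig_dfl(void)
{
    void (*original)(int) = signal(SIGUSR2, SIG_DFL);   /* force the default state */
    void (*saved)(int) = signal(SIGUSR2, noop_handler); /* what startup code would save */
    bool is_default = (saved == SIG_DFL);

    if (original != SIG_ERR)
        signal(SIGUSR2, original); /* put back whatever was there before */
    return is_default;
}
```

When this returns true, invoking the saved value as a function is undefined behavior, so the saved pointer cannot simply be called from the handler.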

Associated revisions

Revision 57db77f1 (diff)
Added by pespin 5 months ago

main: generate coredump and exit upon SIGABRT received

Previous code relied on abort() switching sigaction to SIG_DFL +
retriggering SIGABRT in case the signal handler returns, which would
then generate the coredump + terminate the process.
However, if a SIGABRT is received from somewhere else (kill -SIGABRT),
then the process would print the talloc report and continue running,
which is not desired.

Change-Id: I3a3ff56cb2d740a33731ecfdf76aa32606872883
Fixes: OS#4865

Revision 821da08a (diff)
Added by pespin 5 months ago

gbproxy: generate coredump and exit upon SIGABRT received

Previous code relied on abort() switching sigaction to SIG_DFL +
retriggering SIGABRT in case the signal handler returns, which would
then generate the coredump + terminate the process.
However, if a SIGABRT is received from somewhere else (kill -SIGABRT),
then the process would print the talloc report and continue running,
which is not desired.

Change-Id: I97559b29328101c7cf340aaf1052c0c406634065
Fixes: OS#4865

Revision fd1614c4
Added by pespin 5 months ago

gtphub: generate coredump and exit upon SIGABRT received

Previous code relied on abort() switching sigaction to SIG_DFL +
retriggering SIGABRT in case the signal handler returns, which would
then generate the coredump + terminate the process.
However, if a SIGABRT is received from somewhere else (kill -SIGABRT),
then the process would print the talloc report and continue running,
which is not desired.

Change-Id: I1cab4a716cf2fda6353f698888edbcec6228d78b
Fixes: OS#4865

Revision 3fdd720d
Added by pespin 5 months ago

sgsn: generate coredump and exit upon SIGABRT received

Previous code relied on abort() switching sigaction to SIG_DFL +
retriggering SIGABRT in case the signal handler returns, which would
then generate the coredump + terminate the process.
However, if a SIGABRT is received from somewhere else (kill -SIGABRT),
then the process would print the talloc report and continue running,
which is not desired.

Change-Id: I65f70a53b6982bff9ea4bd6ff786d8a2f8181eac
Fixes: OS#4865

Revision 48ffeb81 (diff)
Added by pespin 5 months ago

common: generate coredump and exit upon SIGABRT received

Previous code relied on abort() switching sigaction to SIG_DFL +
retriggering SIGABRT in case the signal handler returns, which would
then generate the coredump + terminate the process.
However, if a SIGABRT is received from somewhere else (kill -SIGABRT),
then the process would print the talloc report and continue running,
which is not desired.

Change-Id: Ic3b7c223046a80b51f0bd70ef1b15e12e6487ad0
Fixes: OS#4865

Revision 28d0a1db (diff)
Added by pespin 5 months ago

sysmobts-mgr: generate coredump and exit upon SIGABRT received

Previous code relied on abort() switching sigaction to SIG_DFL +
retriggering SIGABRT in case the signal handler returns, which would
then generate the coredump + terminate the process.
However, if a SIGABRT is received from somewhere else (kill -SIGABRT),
then the process would print the talloc report and continue running,
which is not desired.

Change-Id: I35ae930b59c48892e5ad9a2826e05d6c5d415abc
Fixes: OS#4865

Revision 44d72333 (diff)
Added by pespin 5 months ago

oc2g-mgr: generate coredump and exit upon SIGABRT received

Previous code relied on abort() switching sigaction to SIG_DFL +
retriggering SIGABRT in case the signal handler returns, which would
then generate the coredump + terminate the process.
However, if a SIGABRT is received from somewhere else (kill -SIGABRT),
then the process would print the talloc report and continue running,
which is not desired.

Change-Id: I7a5756e106ac1061d37b42c22cc127fdacd87ce7
Fixes: OS#4865

Revision 090f1ceb (diff)
Added by pespin 5 months ago

lc15-mgr: generate coredump and exit upon SIGABRT received

Previous code relied on abort() switching sigaction to SIG_DFL +
retriggering SIGABRT in case the signal handler returns, which would
then generate the coredump + terminate the process.
However, if a SIGABRT is received from somewhere else (kill -SIGABRT),
then the process would print the talloc report and continue running,
which is not desired.

Change-Id: I6ddc04c5815858c7dfab04ffdac52bce2e7940a1
Fixes: OS#4865

Revision 55d7ee57 (diff)
Added by pespin 5 months ago

main: generate coredump and exit upon SIGABRT received

Previous code relied on abort() switching sigaction to SIG_DFL +
retriggering SIGABRT in case the signal handler returns, which would
then generate the coredump + terminate the process.
However, if a SIGABRT is received from somewhere else (killall -SIGABRT
osmo-bsc), then the process would print the talloc report and continue
running, which is not desired.

Change-Id: I125288283af630efa20d64505e319636964a0982
Fixes: OS#4865

Revision 9402d1ee (diff)
Added by pespin 5 months ago

ipaccess-proxy: generate coredump and exit upon SIGABRT received

Previous code relied on abort() switching sigaction to SIG_DFL +
retriggering SIGABRT in case the signal handler returns, which would
then generate the coredump + terminate the process.
However, if a SIGABRT is received from somewhere else (kill -SIGABRT),
then the process would print the talloc report and continue running,
which is not desired.

Change-Id: Iff920ff3dbeb48bd871b7578470f27fe9d0f9516
Fixes: OS#4865

Revision 685816f9 (diff)
Added by pespin 2 months ago

stp: generate coredump and exit upon SIGABRT received

Previous code relied on abort() switching sigaction to SIG_DFL +
retriggering SIGABRT in case the signal handler returns, which would
then generate the coredump + terminate the process.
However, if a SIGABRT is received from somewhere else (kill -SIGABRT),
then the process would print the talloc report and continue running,
which is not desired.

Change-Id: Idca8e360968cb6998591737348ce520954e251b2
Fixes: OS#4865

History

#1 Updated by laforge 6 months ago

Nice catch!

#2 Updated by pespin 5 months ago

So this issue only appears when a plain SIGABRT is sent to the process; abort() itself handles this gracefully. From "man abort":

DESCRIPTION
       The abort() function first unblocks the SIGABRT signal, and then raises that signal for the calling process (as though raise(3) was called). This
       results in the abnormal termination of the process unless the SIGABRT signal is caught and the signal handler does not return (see longjmp(3)).

       If the SIGABRT signal is ignored, or caught by a handler that returns, the abort() function will still terminate the process. It does this by
       restoring the default disposition for SIGABRT and then raising the signal for a second time.

So abort() takes care of restoring SIG_DFL and re-raising the signal if our handler returns. But if a plain SIGABRT is sent to the process, the program will continue without generating a coredump (because the coredump is generated by the default SIGABRT disposition, SIG_DFL).
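
The difference between the two cases can be reproduced with a small sketch. It assumes a POSIX system. All function names are hypothetical, raise() stands in for a signal sent by another process (kill -ABRT), and core files are disabled in the child only to keep the demo from littering the working directory.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for a handler that prints a report and then returns. */
static void returning_handler(int signum)
{
    (void)signum;
}

static void trigger_abort(void)        { abort(); }
static void trigger_plain_signal(void) { raise(SIGABRT); }

/* Runs `trigger` in a child that has returning_handler installed for
 * SIGABRT and returns the child's wait status, or -1 on error. */
static int run_child(void (*trigger)(void))
{
    int status = -1;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct rlimit no_core = { 0, 0 };
        setrlimit(RLIMIT_CORE, &no_core); /* demo only: suppress core files */
        signal(SIGABRT, returning_handler);
        trigger();
        _exit(0); /* reached only if the process survived the signal */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

With run_child(trigger_abort) the child is terminated by SIGABRT even though the handler returns, while with run_child(trigger_plain_signal) the child survives and exits normally with status 0.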

So the best is probably to do this in the SIGABRT signal handler:

signal(SIGABRT, SIG_DFL);
raise (SIGABRT);
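
As a minimal, self-contained sketch of that handler (not the merged osmo-bsc code): write_report() is a hypothetical placeholder for the talloc_report() calls, run_fixed_child() is a hypothetical test helper, and core files are disabled in the child only so the demo leaves nothing behind.

```c
#include <signal.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical placeholder for the talloc report printed by the real handler. */
static void write_report(void)
{
    static const char msg[] = "report placeholder\n";
    ssize_t rc = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)rc;
}

/* Print the report, then restore the default disposition and re-raise,
 * so that the default action (terminate, dump core) takes place. */
static void sigabrt_handler(int signum)
{
    (void)signum;
    write_report();
    signal(SIGABRT, SIG_DFL);
    raise(SIGABRT);
}

/* Sends a plain SIGABRT to a child using sigabrt_handler and returns the
 * child's wait status, or -1 on error. */
static int run_fixed_child(void)
{
    int status = -1;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct rlimit no_core = { 0, 0 };
        setrlimit(RLIMIT_CORE, &no_core); /* demo only: suppress core files */
        signal(SIGABRT, sigabrt_handler);
        raise(SIGABRT); /* stands in for kill -ABRT from another process */
        _exit(0);       /* not reached if the handler works as intended */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

The status returned by run_fixed_child() should report termination by SIGABRT rather than a normal exit. Whether a core file is actually written in a real deployment still depends on the core size limit (ulimit -c) and the system's core pattern.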

#3 Updated by pespin 5 months ago

Fixed (tested) here:
https://gerrit.osmocom.org/c/osmo-bsc/+/21336 main: generate coredump and exit upon SIGABRT received

I will now look at similar issue in other osmocom projects.

#4 Updated by pespin 5 months ago

  • Status changed from New to Feedback
  • % Done changed from 0 to 90

I submitted a bunch of patches to gerrit fixing the same issue in the other osmocom projects.
Once they are merged, the ticket can be closed.

#5 Updated by pespin 5 months ago

  • Status changed from Feedback to Resolved
  • % Done changed from 90 to 100

Merged, closing.
