Bug #5325

closed

ttcn3-bts-test[-latest] provokes a segfault

Added by fixeria 13 days ago. Updated 11 days ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
Start date: 11/24/2021
Due date:
% Done: 100%
Spec Reference:

Description

Build artifacts of the recent ttcn3-bts-test[-latest] runs contain core dumps:

https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bts-test/1479/artifact/logs/bts/
https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bts-test-latest/1153/artifact/logs/bts/

Looks like a double free to me:

(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f60624f642a in __GI_abort () at abort.c:89
#2  0x00007f606334bd7c in ?? () from /usr/lib/x86_64-linux-gnu/libtalloc.so.2
#3  0x00007f606334b949 in _talloc_free () from /usr/lib/x86_64-linux-gnu/libtalloc.so.2
#4  0x0000557d09d77752 in bts_smscb_state_reset (bts_ss=bts_ss@entry=0x557d0c068ef8) at cbch.c:336
#5  0x0000557d09d77d90 in bts_cbch_reset (bts=bts@entry=0x557d0c065bc0) at cbch.c:341
#6  0x0000557d09d7acfa in st_op_disabled_notinstalled_on_enter (fi=<optimized out>, prev_state=<optimized out>) at nm_bts_fsm.c:64
#7  0x00007f6062c9491f in ?? () from /usr/lib/x86_64-linux-gnu/libosmocore.so.18
#8  0x00007f6062c94c3d in _osmo_fsm_inst_state_chg () from /usr/lib/x86_64-linux-gnu/libosmocore.so.18
#9  0x00007f6062c94e14 in _osmo_fsm_inst_dispatch () from /usr/lib/x86_64-linux-gnu/libosmocore.so.18
#10 0x0000557d09d7aa54 in ev_dispatch_children (event=6, site_mgr=0x557d0c065d30) at nm_bts_sm_fsm.c:48
#11 nm_bts_sm_allstate (fi=0x557d0c069250, event=<optimized out>, data=<optimized out>) at nm_bts_sm_fsm.c:135
#12 0x00007f6062c94e14 in _osmo_fsm_inst_dispatch () from /usr/lib/x86_64-linux-gnu/libosmocore.so.18
#13 0x0000557d09d72f1d in st_exit_on_enter (fi=0x557d0c069020, prev_state=<optimized out>) at bts_shutdown_fsm.c:164
#14 0x00007f6062c9491f in ?? () from /usr/lib/x86_64-linux-gnu/libosmocore.so.18
#15 0x00007f6062c94c3d in _osmo_fsm_inst_state_chg () from /usr/lib/x86_64-linux-gnu/libosmocore.so.18
#16 0x0000557d09d72b57 in st_wait_trx_closed (fi=0x557d0c069020, event=<optimized out>, data=<optimized out>) at bts_shutdown_fsm.c:155
#17 0x00007f6062c94de4 in _osmo_fsm_inst_dispatch () from /usr/lib/x86_64-linux-gnu/libosmocore.so.18
#18 0x0000557d09d7318c in bts_model_trx_close_cb (trx=<optimized out>, rc=<optimized out>) at bts_shutdown_fsm.c:277
#19 0x0000557d09d546cb in trx_prov_fsm_apply_close (plink=0x557d0c06fec0, rc=0) at trx_provision_fsm.c:316
#20 0x00007f6062c94de4 in _osmo_fsm_inst_dispatch () from /usr/lib/x86_64-linux-gnu/libosmocore.so.18
#21 0x0000557d09d49fc2 in trx_ctrl_rx_rsp_poweroff (rsp=0x7ffdbf202540, rsp=0x7ffdbf202540, l1h=0x557d0c07d790) at trx_if.c:518
#22 trx_ctrl_rx_rsp (tcm=0x557d0c404bc0, rsp=0x7ffdbf202540, l1h=0x557d0c07d790) at trx_if.c:637
#23 trx_ctrl_read_cb (ofd=<optimized out>, what=<optimized out>) at trx_if.c:733
#24 0x00007f6062c906fc in ?? () from /usr/lib/x86_64-linux-gnu/libosmocore.so.18
#25 0x00007f6062c907a6 in osmo_select_main () from /usr/lib/x86_64-linux-gnu/libosmocore.so.18
#26 0x0000557d09d79284 in bts_main (argc=3, argv=0x7ffdbf202db8) at main.c:437
#27 0x00007f60624e22e1 in __libc_start_main (main=0x557d09d48aa0 <main>, argc=3, argv=0x7ffdbf202db8, init=<optimized out>, fini=<optimized out>, 
    rtld_fini=<optimized out>, stack_end=0x7ffdbf202da8) at ../csu/libc-start.c:291
#28 0x0000557d09d48d0a in _start ()
Actions #1

Updated by fixeria 13 days ago

Here is a more detailed backtrace (with libosmocore-dbg installed):

(gdb) frame bt
No symbol "bt" in current context.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f60624f642a in __GI_abort () at abort.c:89
#2  0x00007f606334bd7c in ?? () from /usr/lib/x86_64-linux-gnu/libtalloc.so.2
#3  0x00007f606334b949 in _talloc_free () from /usr/lib/x86_64-linux-gnu/libtalloc.so.2
#4  0x0000557d09d77752 in bts_smscb_state_reset (bts_ss=bts_ss@entry=0x557d0c068ef8) at cbch.c:336
#5  0x0000557d09d77d90 in bts_cbch_reset (bts=bts@entry=0x557d0c065bc0) at cbch.c:341
#6  0x0000557d09d7acfa in st_op_disabled_notinstalled_on_enter (fi=<optimized out>, prev_state=<optimized out>) at nm_bts_fsm.c:64
#7  0x00007f6062c9491f in state_chg (fi=0x557d0c069480, new_state=<optimized out>, keep_timer=keep_timer@entry=false, timeout_ms=0, T=<optimized out>, 
    file=<optimized out>, line=159) at fsm.c:699
#8  0x00007f6062c94c3d in _osmo_fsm_inst_state_chg (fi=<optimized out>, new_state=<optimized out>, timeout_secs=<optimized out>, T=<optimized out>, 
    file=<optimized out>, line=<optimized out>) at fsm.c:748
#9  0x00007f6062c94e14 in _osmo_fsm_inst_dispatch (fi=0x557d0c069480, event=event@entry=6, data=data@entry=0x0, file=file@entry=0x557d09d968e0 "nm_bts_sm_fsm.c", 
    line=line@entry=48) at fsm.c:865
#10 0x0000557d09d7aa54 in ev_dispatch_children (event=6, site_mgr=0x557d0c065d30) at nm_bts_sm_fsm.c:48
#11 nm_bts_sm_allstate (fi=0x557d0c069250, event=<optimized out>, data=<optimized out>) at nm_bts_sm_fsm.c:135
#12 0x00007f6062c94e14 in _osmo_fsm_inst_dispatch (fi=0x557d0c069250, event=event@entry=6, data=data@entry=0x0, file=file@entry=0x557d09d8d48f "bts_shutdown_fsm.c", 
    line=line@entry=164) at fsm.c:865
#13 0x0000557d09d72f1d in st_exit_on_enter (fi=0x557d0c069020, prev_state=<optimized out>) at bts_shutdown_fsm.c:164
#14 0x00007f6062c9491f in state_chg (fi=fi@entry=0x557d0c069020, new_state=new_state@entry=3, keep_timer=keep_timer@entry=false, timeout_ms=0, T=<optimized out>, 
    file=<optimized out>, line=155) at fsm.c:699
#15 0x00007f6062c94c3d in _osmo_fsm_inst_state_chg (fi=fi@entry=0x557d0c069020, new_state=new_state@entry=3, timeout_secs=<optimized out>, T=<optimized out>, 
    file=<optimized out>, line=<optimized out>) at fsm.c:748
#16 0x00007f6062ca5a32 in _osmo_tdef_fsm_inst_state_chg (fi=fi@entry=0x557d0c069020, state=state@entry=3, 
    timeouts_array=timeouts_array@entry=0x557d09d8d520 <bts_shutdown_fsm_timeouts>, tdefs=<optimized out>, default_timeout=93995561029664, default_timeout@entry=-1, 
    file=file@entry=0x557d09d8d48f "bts_shutdown_fsm.c", line=155) at tdef.c:357
#17 0x0000557d09d72b57 in st_wait_trx_closed (fi=0x557d0c069020, event=<optimized out>, data=<optimized out>) at bts_shutdown_fsm.c:155
#18 0x00007f6062c94de4 in _osmo_fsm_inst_dispatch (fi=0x557d0c069020, event=event@entry=2, data=0x7f6064067070, file=file@entry=0x557d09d8d48f "bts_shutdown_fsm.c", 
    line=line@entry=277) at fsm.c:877
#19 0x0000557d09d7318c in bts_model_trx_close_cb (trx=<optimized out>, rc=rc@entry=0) at bts_shutdown_fsm.c:277
#20 0x0000557d09d546cb in trx_prov_fsm_apply_close (plink=0x557d0c06fec0, rc=0) at trx_provision_fsm.c:316
#21 0x00007f6062c94de4 in _osmo_fsm_inst_dispatch (fi=0x557d0c07d930, event=16, data=0x0, file=0x557d09d84d37 "trx_provision_fsm.c", line=59) at fsm.c:877
#22 0x0000557d09d49fc2 in trx_ctrl_rx_rsp_poweroff (rsp=0x7ffdbf202540, rsp=0x7ffdbf202540, l1h=0x557d0c07d790) at trx_if.c:518
#23 trx_ctrl_rx_rsp (tcm=0x557d0c404bc0, rsp=0x7ffdbf202540, l1h=0x557d0c07d790) at trx_if.c:637
#24 trx_ctrl_read_cb (ofd=<optimized out>, what=<optimized out>) at trx_if.c:733
#25 0x00007f6062c906fc in poll_disp_fds (n_fd=<optimized out>) at select.c:361
#26 _osmo_select_main (polling=polling@entry=0) at select.c:393
#27 0x00007f6062c907a6 in osmo_select_main (polling=polling@entry=0) at select.c:432
#28 0x0000557d09d79284 in bts_main (argc=3, argv=0x7ffdbf202db8) at main.c:437
#29 0x00007f60624e22e1 in __libc_start_main (main=0x557d09d48aa0 <main>, argc=3, argv=0x7ffdbf202db8, init=<optimized out>, fini=<optimized out>, 
    rtld_fini=<optimized out>, stack_end=0x7ffdbf202da8) at ../csu/libc-start.c:291
#30 0x0000557d09d48d0a in _start ()
Actions #2

Updated by laforge 12 days ago

I think the problem is that the list heads of bts->smscb_basic.queue and bts->smscb_extended.queue are not initialized anywhere?

Actions #3

Updated by laforge 12 days ago

laforge wrote in #note-2:

I think the problem is that the list heads of bts->smscb_basic.queue and bts->smscb_extended.queue are not initialized anywhere?

Never mind, that was a typo in my grep. bts_main() calls bts_init(), which initializes those fields.

I think the problem is likely related to

commit ae606d69a46a59cab1415502ffee24020bce515b
Author: Pau Espin Pedrol <pespin@sysmocom.de>
Date:   Wed Oct 20 16:11:54 2021 +0200

    Reset CBCH state after BTS shutdown

    Related: OS#5273
    Change-Id: Ib01d38c59ba9fa083fcc0682009c13d2db3664fe
Actions #4

Updated by laforge 12 days ago

  • Status changed from New to Feedback
  • Assignee set to fixeria
  • % Done changed from 0 to 60

Expected to be fixed by https://gerrit.osmocom.org/c/osmo-bts/+/26351

commit 7ea684723ca9ac8acf47302921e1af988c567322
Author: Harald Welte <laforge@osmocom.org>
Date:   Wed Nov 24 14:47:23 2021 +0100

    cbch: Fix dangling cur_msg leading to double-free in bts_cbch_reset()

    If a new default message is installed via RSL, and the old default
    message is currently being transmitted, we must set cur_msg to NULL.

    The old default message must be talloc_free()d unconditionally whenever
    a new default message is being set.

    We can do that by using the TALLOC_FREE macro.

    Change-Id: Id32c2074b61cd1f09957b9d1558ffb3a7691a8e0
    Closes: OS#5325

diff --git a/src/common/cbch.c b/src/common/cbch.c
index addd68c9..46774803 100644
--- a/src/common/cbch.c
+++ b/src/common/cbch.c
@@ -233,10 +233,10 @@ int bts_process_smscb_cmd(struct gsm_bts *bts, struct rsl_ie_cb_cmd_type cmd_typ
                rate_ctr_inc2(bts_ss->ctrs, CBCH_CTR_RCVD_QUEUED);
                break;
        case RSL_CB_CMD_TYPE_DEFAULT:
-               /* old default msg will be free'd in get_smscb_block() if it is currently in transit
-                * and we set a new default_msg here */
+               /* clear the cur_msg pointer if it is the old default message */
                if (bts_ss->cur_msg && bts_ss->cur_msg == bts_ss->default_msg)
-                       talloc_free(bts_ss->cur_msg);
+                       bts_ss->cur_msg = NULL;
+               talloc_free(bts_ss->default_msg);
                if (cmd_type.def_bcast == RSL_CB_CMD_DEFBCAST_NORMAL)
                        /* def_bcast == 0: normal message */
                        bts_ss->default_msg = scm;
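
To illustrate the aliasing the patch above deals with, here is a minimal sketch in plain C. It is not the osmo-bts code: `struct msg`, `struct smscb_state`, `msg_alloc()` and `set_default_msg()` are hypothetical stand-ins, and `malloc()`/`free()` replace talloc so the snippet builds without libtalloc. It only shows the ordering: drop the `cur_msg` alias first, then release the old default message unconditionally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct smscb_msg */
struct msg {
	int id;
};

/* Hypothetical stand-in for struct bts_smscb_state (only the two pointers) */
struct smscb_state {
	struct msg *cur_msg;     /* message in transit, may alias default_msg */
	struct msg *default_msg; /* message sent when nothing else is queued */
};

/* Allocate a message; abort on allocation failure to keep the sketch short */
static struct msg *msg_alloc(int id)
{
	struct msg *m = malloc(sizeof(*m));
	if (!m)
		abort();
	m->id = id;
	return m;
}

/* Install a new default message, following the order used in the patch */
static void set_default_msg(struct smscb_state *ss, struct msg *new_msg)
{
	/* clear the alias so cur_msg never points at released memory */
	if (ss->cur_msg && ss->cur_msg == ss->default_msg)
		ss->cur_msg = NULL;
	/* release the old default unconditionally; free(NULL) is a no-op */
	free(ss->default_msg);
	ss->default_msg = new_msg;
}
```

If the old default message is in transit when `set_default_msg()` is called, `cur_msg` ends up NULL and `default_msg` points at the new message, so a later reset has only one owner left to free.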
Actions #5

Updated by laforge 12 days ago

  • Status changed from Feedback to In Progress
  • Assignee changed from fixeria to laforge

The problem persists even with the patch applied:

(gdb) frame 7
#7  0x000055555589e445 in bts_smscb_state_reset (bts_ss=0x627000003498) at cbch.c:336
336             TALLOC_FREE(bts_ss->default_msg);
(gdb) p bts_ss->default_msg
$3 = (struct smscb_msg *) 0x6110000092e0
(gdb) p *bts_ss->default_msg
$4 = {list = {next = 0x0, prev = 0x0}, is_schedule = false, msg = "\001\002\003\004\005\006\a\a\b\t\n\v\f\r\016\017\020\021\022\023\024\025", '+' <repeats 66 times>, num_segs = 1 '\001'}
(gdb) p *bts_ss
$5 = {queue = {next = 0x627000003498, prev = 0x627000003498}, queue_len = 0, ctrs = 0x6160000051e0, cur_msg = 0x0, default_msg = 0x6110000092e0}
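
One possible reading of the state above (cur_msg already 0x0, default_msg still set at the point of the abort) is that both pointers referred to the same message: TALLOC_FREE() releases the object and sets only the pointer it was given to NULL, so the second call releases it again. The sketch below models that under stated assumptions. `fake_free()` and `FREE_AND_NULL()` are hypothetical stand-ins for talloc_free() and TALLOC_FREE(); `fake_free()` only counts releases instead of freeing, so the second release can be observed without undefined behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message type that records how often it was released */
struct msg {
	int release_count;
};

/* Stand-in for talloc_free(): counts the release, frees nothing */
static void fake_free(struct msg *m)
{
	m->release_count++;
}

/* Stand-in modelled on TALLOC_FREE(): release, then NULL this one pointer */
#define FREE_AND_NULL(p) do { \
		if ((p) != NULL) { \
			fake_free(p); \
			(p) = NULL; \
		} \
	} while (0)

struct smscb_state {
	struct msg *cur_msg;
	struct msg *default_msg;
};

/* Reset in the order used before the fix: both pointers released blindly */
static void reset_unpatched(struct smscb_state *ss)
{
	FREE_AND_NULL(ss->cur_msg);     /* releases the object, NULLs cur_msg */
	FREE_AND_NULL(ss->default_msg); /* alias is still set: second release */
}
```

With `cur_msg` and `default_msg` pointing at the same `struct msg`, `reset_unpatched()` leaves `release_count` at 2, which is the double free that talloc aborts on.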

Actions #6

Updated by laforge 12 days ago

There was another problem with the recently introduced cbch-state-clearing code. It is now fixed in https://gerrit.osmocom.org/c/osmo-bts/+/26355

commit 79f21c4ed172eadf1e3b046446cdec48ccce6a99
Author: Harald Welte <laforge@osmocom.org>
Date:   Wed Nov 24 20:00:29 2021 +0100

    cbch: Fix bts_smscb_state_reset() to avoid double-free

    If the currently transmitted message is the default message,
    bts_ss->cur_msg == bts_ss->default_msg.  In this case we cannot
    simply talloc_free() both of them, as it would result in a double-free.

    Change-Id: I2d3645e34d31507b012a53ffe12d14223682f808
    Closes: OS#5325
    Fixes: Ib01d38c59ba9fa083fcc0682009c13d2db3664fe

diff --git a/src/common/cbch.c b/src/common/cbch.c
index addd68c9..a3e12961 100644
--- a/src/common/cbch.c
+++ b/src/common/cbch.c
@@ -332,7 +332,10 @@ static void bts_smscb_state_reset(struct bts_smscb_state *bts_ss)
        }
        bts_ss->queue_len = 0;
        rate_ctr_group_reset(bts_ss->ctrs);
-       TALLOC_FREE(bts_ss->cur_msg);
+       /* avoid double-free of default_msg in case cur_msg == default_msg */
+       if (bts_ss->cur_msg && bts_ss->cur_msg != bts_ss->default_msg)
+               talloc_free(bts_ss->cur_msg);
+       bts_ss->cur_msg = NULL;
        TALLOC_FREE(bts_ss->default_msg);
 }
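
The guard in the diff above can be sketched in isolation as follows. This is a simplified model, not the osmo-bts implementation: `struct msg`, `struct smscb_state`, `msg_alloc()`, `msg_free()`, `state_reset()` and the `live_msgs` counter are hypothetical, and `malloc()`/`free()` stand in for talloc. The counter exists only so the result of a reset can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct msg {
	int id;
};

struct smscb_state {
	struct msg *cur_msg;     /* may alias default_msg */
	struct msg *default_msg;
};

/* Number of messages allocated and not yet released (for checking only) */
static int live_msgs;

static struct msg *msg_alloc(int id)
{
	struct msg *m = malloc(sizeof(*m));
	if (!m)
		abort();
	m->id = id;
	live_msgs++;
	return m;
}

static void msg_free(struct msg *m)
{
	if (!m)
		return;
	live_msgs--;
	free(m);
}

/* Reset following the patched logic: release each object exactly once */
static void state_reset(struct smscb_state *ss)
{
	/* cur_msg is released here only if it is not the default message */
	if (ss->cur_msg && ss->cur_msg != ss->default_msg)
		msg_free(ss->cur_msg);
	ss->cur_msg = NULL;
	/* the default message is released once, aliased or not */
	msg_free(ss->default_msg);
	ss->default_msg = NULL;
}
```

Both cases end with `live_msgs == 0`: when `cur_msg` aliases `default_msg` the object is released once through `default_msg`, and when they differ each is released through its own pointer.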

Actions #7

Updated by laforge 11 days ago

  • Status changed from In Progress to Resolved
  • % Done changed from 60 to 100