Bug #3182

OSMO-BSC: Intermittent Segmentation fault (core dumped)

Added by ron.menez@entropysolution.com about 1 month ago. Updated 27 days ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date: 04/18/2018
Due date:
% Done: 0%
Spec Reference:

Description

A segmentation fault (core dumped) occurs intermittently while running OSMO-BSC.

The following hardware is used:

  • Ettus B210
  • UPBOARD

UPBOARD information:

  1. lscpu
    Architecture: x86_64
    CPU op-mode(s): 32-bit, 64-bit
    Byte Order: Little Endian
    CPU: 4
    On-line CPU list: 0-3
    Thread(s) per core: 1
    Core(s) per socket: 4
    Socket(s): 1
    NUMA node(s): 1
    Vendor ID: GenuineIntel
    CPU family: 6
    Model: 92
    Model name: Intel(R) Pentium(R) CPU N4200 @ 1.10GHz
    Stepping: 9
    CPU MHz: 800.000
    CPU max MHz: 1101.0000
    CPU min MHz: 800.0000
    BogoMIPS: 2188.79
    Virtualization: VT-x
    L1d cache: 24K
    L1i cache: 32K
    L2 cache: 1024K
    NUMA node0 CPU: 0-3
    Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 ds_cpl vmx est tm2 ssse3 sdbg cx16 xtpr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave rdrand lahf_lm 3dnowprefetch intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust smep erms mpx rdseed smap clflushopt sha_ni xsaveopt xsavec xgetbv1 dtherm ida arat pln pts
  2. lshw (kindly see attached file)

OS used:

  1. lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description: Ubuntu 16.04.4 LTS
    Release: 16.04
    Codename: xenial

The following osmocom elements and libraries were installed from git on April 14, 2018:

  • libosmocore
  • libosmo-abis
  • libosmo-crypt-a53
  • libosmo-dsp
  • libosmo-netif
  • libosmo-sccp
  • libsmpp34
  • libgtpnl
  • osmo-mgw
  • openbsc
  • osmo-bsc
  • osmo-bts
  • osmo-hlr
  • osmo-msc
  • osmo-trx

We also ran a backtrace after the segfault occurred. Logs, a pcap network trace and the core dump file are provided as well. Kindly see the attachments.

(gdb) bt
#0 0x000000000041aa51 in rsl_rx_conn_fail (msg=msg@entry=0x8a4090) at abis_rsl.c:1380
#1 0x000000000042172b in abis_rsl_rx_dchan (msg=0x8a4090) at abis_rsl.c:1646
#2 abis_rsl_rcvmsg (msg=0x8a4090) at abis_rsl.c:2853
#3 0x00007ffff710f7e8 in handle_ts1_read (bfd=0x895760) at input/ipaccess.c:282
#4 ipaccess_fd_cb (bfd=0x895760, what=1) at input/ipaccess.c:397
#5 0x00007ffff7329352 in osmo_fd_disp_fds (_eset=0x7fffffffe3b0, _wset=0x7fffffffe330, _rset=0x7fffffffe2b0) at select.c:216
#6 osmo_select_main (polling=polling@entry=0) at select.c:256
#7 0x00000000004075ef in main (argc=<optimized out>, argv=<optimized out>) at osmo_bsc_main.c:532

(gdb) bt full
#0 0x000000000041aa51 in rsl_rx_conn_fail (msg=msg@entry=0x8a4090) at abis_rsl.c:1380
dh = 0x8a411e
lchan = 0x7ffff7fb3290
tp = {lv = {{len = 0, val = 0x0} <repeats 26 times>, {len = 1, val = 0x8a4124 "\001"}, {len = 0,
val = 0x0} <repeats 229 times>}}
cause = 1 '\001'
#1 0x000000000042172b in abis_rsl_rx_dchan (msg=0x8a4090) at abis_rsl.c:1646
rslh = 0x8a411e
rc = 0
sign_link = 0x8941e0
#2 abis_rsl_rcvmsg (msg=0x8a4090) at abis_rsl.c:2853
sign_link = 0x8941e0
rslh = 0x8a411e
rc = 0
#3 0x00007ffff710f7e8 in handle_ts1_read (bfd=0x895760) at input/ipaccess.c:282
line = 0x894ba0
ts_nr = <optimized out>
link = <optimized out>
e1i_ts = <optimized out>
hh = 0x8a411b
msg = 0x8a4090
ret = <optimized out>
rc = <optimized out>
#4 ipaccess_fd_cb (bfd=0x895760, what=1) at input/ipaccess.c:397
rc = 0
#5 0x00007ffff7329352 in osmo_fd_disp_fds (_eset=0x7fffffffe3b0, _wset=0x7fffffffe330, _rset=0x7fffffffe2b0) at select.c:216
flags = 1
ufd = 0x895760
tmp = 0x895d38
work = 1
#6 osmo_select_main (polling=polling@entry=0) at select.c:256
readset = {__fds_bits = {0 <repeats 16 times>}}
writeset = {__fds_bits = {0 <repeats 16 times>}}
exceptset = {__fds_bits = {0 <repeats 16 times>}}
rc = <optimized out>
no_time = {tv_sec = 0, tv_usec = 0}
#7 0x00000000004075ef in main (argc=<optimized out>, argv=<optimized out>) at osmo_bsc_main.c:532
msc = 0x6bdab8
data = <optimized out>
rc = <optimized out>

segfault_osmo-bsc.pcap (50.7 KB) ron.menez@entropysolution.com, 04/18/2018 04:40 AM

segfault_osmo-bsc.log (128 KB) ron.menez@entropysolution.com, 04/18/2018 04:40 AM

osmo-bts_standalone_demo.cfg (2.26 KB) ron.menez@entropysolution.com, 04/18/2018 04:40 AM

osmo-bsc_standalone_demo.cfg (4.57 KB) ron.menez@entropysolution.com, 04/18/2018 04:40 AM

osmo-mgw_standalone_demo.cfg (1.16 KB) ron.menez@entropysolution.com, 04/18/2018 04:40 AM

osmo-stp_standalone_demo.cfg (424 Bytes) ron.menez@entropysolution.com, 04/18/2018 04:40 AM

osmo-trx_standalone_demo.cfg (903 Bytes) ron.menez@entropysolution.com, 04/18/2018 04:40 AM

osmo-hlr_standalone_demo.cfg (305 Bytes) ron.menez@entropysolution.com, 04/18/2018 04:40 AM

osmo-msc_standalone_demo.cfg (1.78 KB) ron.menez@entropysolution.com, 04/18/2018 04:40 AM

core (3.3 MB) ron.menez@entropysolution.com, 04/18/2018 04:40 AM

gdb_segfault_04202018.log (88.4 KB) ron.menez@entropysolution.com, 04/20/2018 02:59 AM

History

#1 Updated by pespin about 1 month ago

I'd say lchan pointer (0x7ffff7fb3290) is not correct in rsl_rx_conn_fail, and it is obtained in caller abis_rsl_rx_dchan(). It is also the only thing which I think can fail in the line causing the segfault.

Interestingly, though, lchan is accessed once in that code path without any issue when calling gsm_lchan_name, where the pointer is dereferenced:

    msg->lchan = lchan_lookup(sign_link->trx, rslh->chan_nr,
                  "Abis RSL rx DCHAN: ");
    if (!msg->lchan)
        return -1;
    ts_name = gsm_lchan_name(msg->lchan);

static inline char *gsm_lchan_name(const struct gsm_lchan *lchan)
{
    return lchan->name;
}

It is also used in rsl_rx_conn_fail prior to the crash without any issue:

    LOGP(DRSL, LOGL_NOTICE, "%s CONNECTION FAIL in state %s ",
         gsm_lchan_name(msg->lchan),
         gsm_lchans_name(msg->lchan->state));

This outputs the following in the log:

<0004> abis_rsl.c:1367 (bts=0,trx=0,ts=0,ss=0) CONNECTION FAIL in state ACTIVE CAUSE=0x01(Radio Link Failure)

So it seems what is wrong is not lchan pointer, but lchan->conn, which is used first in that code path.

It would be interesting to know the value of lchan->conn when getting the segfault, to see if it's NULL or it contains garbage. If you run again into the crash with gdb, can you print the value of the pointer? In gdb cmd line: "print lchan->conn". You can also print the full lchan info: "print *lchan".

#2 Updated by neels about 1 month ago

Just to mention it -- you have tried completely uninstalling all osmo libraries, cleaning all source trees and rebuilding everything from scratch?
If you e.g. install a newer version of a library (with an ABI change) and a dependent program is not rebuilt subsequently, that may cause stack corruption issues.
I hope that's not it and we can uncover a bug here.
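
To illustrate the ABI concern above, here is a minimal sketch. It does not use any real osmocom structure; struct lib_obj_old and struct lib_obj_new are hypothetical layouts invented for the example. It shows that inserting a field into a struct moves the fields after it, so a program compiled against the old header reads the wrong bytes from an object created by the newer library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical layout the program was compiled against (old headers). */
struct lib_obj_old {
    int id;
    void *conn;
};

/* Hypothetical layout in the newer installed library: a field was inserted. */
struct lib_obj_new {
    int id;
    char name[32];
    void *conn;
};

/* The new struct is strictly larger, so a caller that allocates the old
 * size hands the library too little memory. */
static_assert(sizeof(struct lib_obj_new) > sizeof(struct lib_obj_old),
              "inserting a field must grow the struct in this sketch");

/* Returns 1 if 'conn' sits at a different offset in the two layouts. */
static int conn_offset_changed(void)
{
    return offsetof(struct lib_obj_old, conn) !=
           offsetof(struct lib_obj_new, conn);
}

/* Prints both offsets so the mismatch is visible. */
static void print_conn_offsets(void)
{
    printf("old conn offset: %zu, new conn offset: %zu\n",
           offsetof(struct lib_obj_old, conn),
           offsetof(struct lib_obj_new, conn));
}
```

Rebuilding every dependent program after updating a library rules this class of problem out.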

#3 Updated by neels about 1 month ago

BTW, unrelated: note http://git.osmocom.org/libosmo-crypt-a53/tree/README.md
i.e. you shouldn't need libosmo-crypt-a53

#4 Updated by ron.menez@entropysolution.com about 1 month ago

pespin wrote:

I'd say lchan pointer (0x7ffff7fb3290) is not correct in rsl_rx_conn_fail, and it is obtained in caller abis_rsl_rx_dchan(). It is also the only thing which I think can fail in the line causing the segfault.

Interestingly, though, lchan is accessed once in that code path without any issue when calling gsm_lchan_name, where the pointer is dereferenced:
[...]
[...]

It is also used in rsl_rx_conn_fail prior to the crash without any issue:
[...]

This outputs the following in the log:
[...]

So it seems what is wrong is not lchan pointer, but lchan->conn, which is used first in that code path.

It would be interesting to know the value of lchan->conn when getting the segfault, to see if it's NULL or it contains garbage. If you run again into the crash with gdb, can you print the value of the pointer? In gdb cmd line: "print lchan->conn". You can also print the full lchan info: "print *lchan".

I ran the requested commands:

(gdb) print lchan->conn
$1 = (struct gsm_subscriber_connection *) 0x0
(gdb) print *lchan
$2 = {ts = 0x7ffff7fb2168, nr = 1 '\001', type = GSM_LCHAN_SDCCH, rsl_cmode = RSL_CMOD_SPD_SIGN, 
  tch_mode = GSM48_CMODE_SIGN, csd_mode = LCHAN_CSD_M_NT, state = LCHAN_S_ACTIVE, broken_reason = 0x45a5a5 "", 
  bs_power = 0 '\000', ms_power = 14 '\016', encr = {alg_id = 1 '\001', key_len = 0 '\000', 
    key = '\000' <repeats 15 times>}, mr_ms_lv = "\000\000\000\000\000\000", 
  mr_bts_lv = "\000\000\000\000\000\000", sapis = "\000\000\000\000\000\000\000", abis_ip = {bound_ip = 0, 
    connect_ip = 0, bound_port = 0, connect_port = 0, conn_id = 0, rtp_payload = 0 '\000', 
    rtp_payload2 = 0 '\000', speech_mode = 0 '\000', rtp_socket = 0x0, ass_compl = {rr_cause = 0 '\000', 
      valid = false}}, rqd_ta = 0 '\000', name = 0x87ab00 "(bts=0,trx=0,ts=0,ss=1)", T3101 = {node = {
      rb_parent_color = 7067681, rb_right = 0x0, rb_left = 0x0}, list = {next = 0x7ffff7fb3b38, 
      prev = 0x7ffff7fb3b38}, timeout = {tv_sec = 1524195320, tv_usec = 830425}, active = 1, 
    cb = 0x4205d0 <t3101_expired>, data = 0x7ffff7fb3a90}, T3109 = {node = {rb_parent_color = 0, rb_right = 0x0, 
      rb_left = 0x0}, list = {next = 0x0, prev = 0x0}, timeout = {tv_sec = 0, tv_usec = 0}, active = 0, cb = 0x0, 
    data = 0x0}, T3111 = {node = {rb_parent_color = 0, rb_right = 0x0, rb_left = 0x0}, list = {next = 0x0, 
      prev = 0x0}, timeout = {tv_sec = 0, tv_usec = 0}, active = 0, cb = 0x0, data = 0x0}, error_timer = {node = {
      rb_parent_color = 0, rb_right = 0x0, rb_left = 0x0}, list = {next = 0x0, prev = 0x0}, timeout = {tv_sec = 0, 
      tv_usec = 0}, active = 0, cb = 0x0, data = 0x0}, act_timer = {node = {rb_parent_color = 7068001, 
      rb_right = 0x0, rb_left = 0x0}, list = {next = 0x7ffff7fb3c78, prev = 0x7ffff7fb3c78}, timeout = {
      tv_sec = 1524192324, tv_usec = 829670}, active = 0, cb = 0x41e010 <lchan_act_tmr_cb>, 
    data = 0x7ffff7fb3a90}, rel_work = {node = {rb_parent_color = 0, rb_right = 0x0, rb_left = 0x0}, list = {
      next = 0x0, prev = 0x0}, timeout = {tv_sec = 0, tv_usec = 0}, active = 0, cb = 0x0, data = 0x0}, 
  error_cause = 0 '\000', neigh_meas = {{arfcn = 0, bsic = 0 '\000', 
      rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0, last_seen_nr = 0 '\000'}, {arfcn = 0, 
      bsic = 0 '\000', rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0, last_seen_nr = 0 '\000'}, {
      arfcn = 0, bsic = 0 '\000', rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0, 
      last_seen_nr = 0 '\000'}, {arfcn = 0, bsic = 0 '\000', rxlev = "\000\000\000\000\000\000\000\000\000", 
      rxlev_cnt = 0, last_seen_nr = 0 '\000'}, {arfcn = 0, bsic = 0 '\000', 
      rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0, last_seen_nr = 0 '\000'}, {arfcn = 0, 
      bsic = 0 '\000', rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0, last_seen_nr = 0 '\000'}, {
      arfcn = 0, bsic = 0 '\000', rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0, 
      last_seen_nr = 0 '\000'}, {arfcn = 0, bsic = 0 '\000', rxlev = "\000\000\000\000\000\000\000\000\000", 
      rxlev_cnt = 0, last_seen_nr = 0 '\000'}, {arfcn = 0, bsic = 0 '\000', 
      rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0, last_seen_nr = 0 '\000'}, {arfcn = 0, 
      bsic = 0 '\000', rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0, last_seen_nr = 0 '\000'}}, 
  meas_rep = {{lchan = 0x0, nr = 0 '\000', flags = 0, ul = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {
          rx_lev = 0 '\000', rx_qual = 0 '\000'}}, dl = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {
          rx_lev = 0 '\000', rx_qual = 0 '\000'}}, bs_power = 0 '\000', ms_timing_offset = 0, ms_l1 = {
        pwr = 0 '\000', ta = 0 '\000'}, num_cell = 0, cell = {{rxlev = 0 '\000', bsic = 0 '\000', 
          neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', 
          arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {
          rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', 
          bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', 
          neigh_idx = 0 '\000', arfcn = 0, flags = 0}}}, {lchan = 0x0, nr = 0 '\000', flags = 0, ul = {full = {
          rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, dl = {full = {
          rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, 
      bs_power = 0 '\000', ms_timing_offset = 0, ms_l1 = {pwr = 0 '\000', ta = 0 '\000'}, num_cell = 0, cell = {{
          rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', 
          bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', 
          neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', 
          arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {
          rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}}}, {lchan = 0x0, 
      nr = 0 '\000', flags = 0, ul = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', 
          rx_qual = 0 '\000'}}, dl = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', 
          rx_qual = 0 '\000'}}, bs_power = 0 '\000', ms_timing_offset = 0, ms_l1 = {pwr = 0 '\000', 
        ta = 0 '\000'}, num_cell = 0, cell = {{rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, 
          flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {
          rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', 
          bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', 
          neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', 
          arfcn = 0, flags = 0}}}, {lchan = 0x0, nr = 0 '\000', flags = 0, ul = {full = {rx_lev = 0 '\000', 
          rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, dl = {full = {rx_lev = 0 '\000', 
          rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, bs_power = 0 '\000', 
      ms_timing_offset = 0, ms_l1 = {pwr = 0 '\000', ta = 0 '\000'}, num_cell = 0, cell = {{rxlev = 0 '\000', 
          bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', 
          neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', 
          arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {
          rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', 
          bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}}}, {lchan = 0x0, nr = 0 '\000', flags = 0, 
      ul = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, dl = {
        full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, 
      bs_power = 0 '\000', ms_timing_offset = 0, ms_l1 = {pwr = 0 '\000', ta = 0 '\000'}, num_cell = 0, cell = {{
          rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', 
          bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', 
          neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', 
          arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {
          rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}}}, {lchan = 0x0, 
      nr = 0 '\000', flags = 0, ul = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', 
          rx_qual = 0 '\000'}}, dl = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', 
          rx_qual = 0 '\000'}}, bs_power = 0 '\000', ms_timing_offset = 0, ms_l1 = {pwr = 0 '\000', 
        ta = 0 '\000'}, num_cell = 0, cell = {{rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, 
          flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {
          rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', 
          bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', 
          neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', 
          arfcn = 0, flags = 0}}}, {lchan = 0x0, nr = 0 '\000', flags = 0, ul = {full = {rx_lev = 0 '\000', 
          rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, dl = {full = {rx_lev = 0 '\000', 
          rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, bs_power = 0 '\000', 
      ms_timing_offset = 0, ms_l1 = {pwr = 0 '\000', ta = 0 '\000'}, num_cell = 0, cell = {{rxlev = 0 '\000', 
          bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', 
          neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', 
          arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {
          rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', 
          bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}}}, {lchan = 0x0, nr = 0 '\000', flags = 0, 
      ul = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, dl = {
        full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, 
      bs_power = 0 '\000', ms_timing_offset = 0, ms_l1 = {pwr = 0 '\000', ta = 0 '\000'}, num_cell = 0, cell = {{
          rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', 
          bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', 
          neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', 
          arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {
          rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}}}, {lchan = 0x0, 
      nr = 0 '\000', flags = 0, ul = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', 
          rx_qual = 0 '\000'}}, dl = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', 
          rx_qual = 0 '\000'}}, bs_power = 0 '\000', ms_timing_offset = 0, ms_l1 = {pwr = 0 '\000', 
        ta = 0 '\000'}, num_cell = 0, cell = {{rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, 
          flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {
          rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', 
          bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', 
          neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', 
          arfcn = 0, flags = 0}}}, {lchan = 0x0, nr = 0 '\000', flags = 0, ul = {full = {rx_lev = 0 '\000', 
          rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, dl = {full = {rx_lev = 0 '\000', 
          rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, bs_power = 0 '\000', 
      ms_timing_offset = 0, ms_l1 = {pwr = 0 '\000', ta = 0 '\000'}, num_cell = 0, cell = {{rxlev = 0 '\000', 
          bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', 
          neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', 
          arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {
          rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', 
          bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}}}}, meas_rep_idx = 0, meas_rep_count = 0, 
  meas_rep_last_seen_nr = 255 '\377', rqd_ref = 0x0, conn = 0x0, dyn = {act_type = 0 '\000', ho_ref = 0 '\000', 
    rqd_ref = 0x0, rqd_ta = 0 '\000'}}

The complete gdb log is also attached.
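
Since gdb reports conn = 0x0 for an lchan in state LCHAN_S_ACTIVE, one possible shape of a defensive check is sketched below. This is not the actual osmo-bsc code or fix; struct lchan_sketch, struct conn_sketch and handle_conn_fail_sketch are hypothetical names made up for illustration. The point is only that the handler tests the connection pointer before dereferencing it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the real structures. */
struct conn_sketch {
    int id;
};

struct lchan_sketch {
    const char *name;
    struct conn_sketch *conn; /* may be NULL, as seen in the gdb output */
};

enum conn_fail_result {
    CONN_FAIL_NO_CONN = 0,  /* no connection attached: nothing to notify */
    CONN_FAIL_NOTIFIED = 1  /* connection present: upper layer informed */
};

/* Handles a connection failure without ever dereferencing a NULL conn. */
static enum conn_fail_result
handle_conn_fail_sketch(const struct lchan_sketch *lchan)
{
    assert(lchan != NULL);

    if (lchan->conn == NULL) {
        fprintf(stderr, "%s: CONNECTION FAIL without conn, skipping notify\n",
                lchan->name);
        return CONN_FAIL_NO_CONN;
    }

    fprintf(stderr, "%s: CONNECTION FAIL on conn %d\n",
            lchan->name, lchan->conn->id);
    return CONN_FAIL_NOTIFIED;
}
```

Whether skipping the notification is the right behaviour when no connection is attached is a question for the developers; the sketch only avoids the crash.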

#5 Updated by ron.menez@entropysolution.com about 1 month ago

neels wrote:

Just to mention it -- you have tried completely uninstalling all osmo libraries, cleaning all source trees and rebuilding everything from scratch?
If you e.g. install a newer version of a library (with an ABI change) and a dependent program is not rebuilt subsequently, that may cause stack corruption issues.
I hope that's not it and we can uncover a bug here.

Hi Neels,

We installed all the osmo elements from scratch on a freshly installed and updated Ubuntu 16.04 system on April 14, 2018, using the latest git versions at that time.

We will reinstall all of the osmo elements again today using the latest versions from git, leaving "libosmo-crypt-a53" out of the installation.

We'll let you know if we experience the segfault again.

#6 Updated by ron.menez@entropysolution.com about 1 month ago

Hi Neels,

We installed the latest versions today and we still experience the segfault.

It seems that the segfault is triggered every time a "Radio Link Failure" occurs. Kindly see the log below for reference:

<0004> abis_rsl.c:1367 (bts=0,trx=0,ts=0,ss=0) CONNECTION FAIL in state ACTIVE CAUSE=0x01(Radio Link Failure) 
Segmentation fault (core dumped)

#7 Updated by neels 27 days ago

@Ron, thanks for the clarification.

The next thing to do to fix this issue is to try reproducing the failure with ttcn3 tests:
Trigger the CONNECTION FAIL with cause Radio Link Failure as seen in the logs and ensure graceful handling.

So far we haven't assigned or prioritized this issue.
A segfault is inherently important, but we are currently not sure when we'll get a chance to investigate in detail.
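
For anyone preparing such a test, here is a rough C sketch of the bytes of the triggering message. It is not a TTCN-3 test and does not use libosmocore. The constants are my reading of 3GPP TS 48.058 and should be verified. build_conn_fail_sketch is a hypothetical helper, and the channel number 0x20 in the usage note assumes an SDCCH/4 sub-slot 0 on timeslot 0. The layout agrees with the backtrace, where the cause IE has index 26 (0x1a) and its value sits 6 bytes after dh:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values from 3GPP TS 48.058; verify before relying on them. */
#define SKETCH_MDISC_DED_CHAN 0x08 /* dedicated channel management */
#define SKETCH_MT_CONN_FAIL   0x24 /* CONNECTION FAILURE INDICATION */
#define SKETCH_IE_CHAN_NR     0x01 /* channel number IE tag */
#define SKETCH_IE_CAUSE       0x1a /* cause IE tag */
#define SKETCH_CAUSE_RLF      0x01 /* radio link failure */

/* Writes the message into buf and returns its length in bytes,
 * or 0 if buf is too small to hold it. */
static size_t build_conn_fail_sketch(uint8_t *buf, size_t buf_len,
                                     uint8_t chan_nr)
{
    const uint8_t msg[] = {
        SKETCH_MDISC_DED_CHAN,
        SKETCH_MT_CONN_FAIL,
        SKETCH_IE_CHAN_NR, chan_nr,           /* tag, value */
        SKETCH_IE_CAUSE, 1, SKETCH_CAUSE_RLF  /* tag, length, value */
    };

    assert(buf != NULL);
    if (buf_len < sizeof(msg))
        return 0;

    memcpy(buf, msg, sizeof(msg));
    return sizeof(msg);
}
```

Calling build_conn_fail_sketch(buf, sizeof(buf), 0x20) yields a 7-byte message. A test would send the equivalent message for an active lchan that has no connection attached and check that osmo-bsc keeps running.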
