https://osmocom.org/https://osmocom.org/favicon.ico?16647414092021-10-13T16:48:43ZOpen Source Mobile CommunicationsOsmoBTS - Bug #5262: BTS_Tests_LAPDm: TC_rr_response_frame_loss fails if run after TC_t200_n200https://osmocom.org/issues/5262?journal_id=226862021-10-13T16:48:43Zpespin
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/22686/diff?detail_id=38117">diff</a>)</li></ul> OsmoBTS - Bug #5262: BTS_Tests_LAPDm: TC_rr_response_frame_loss fails if run after TC_t200_n200https://osmocom.org/issues/5262?journal_id=226872021-10-13T17:11:11Zpespin
<ul></ul><p>In f_establish_mo() we send a SABM, we properly receive the EstInd in RSL but then we fail to match:<br /><pre>
/* UA: SAPI = 0, R = 0, F = 1, M = 0, L = L of SABM. */
LAPDM.receive(t_PH_DATA(0, is_sacch, tr_LAPDm_UA(sapi, cr_MT_RSP, f:=true, l3:=l3_mo)));
</pre></p>
<p>TTCN3 logs:<br /><pre>
16:45:07.828382 11 BTS_Tests_LAPDm.ttcn:277 Message enqueued on LAPDM from 12 @LAPDm_RAW_PT.LAPDm_ph_data : { sacch := false, sapi := 0, lapdm := { ab := { addr := { spare := '0'B, lpd := 0, sapi := 0, c_r := true, ea := true }, ctrl := { u := { u := '000'B, p_f := false, u2 := '00'B, spare := '11'B } }, len := 0, m := false, el := 1, payload := ''O } } } id 5
16:45:07.828396 12 LAPDm_RAW_PT.ttcn:514 Matching on port L1CTL succeeded.
16:45:07.828422 12 LAPDm_RAW_PT.ttcn:514 Receive operation on port L1CTL succeeded, message from system(): @L1CTL_Types.L1ctlDlMessage: { header := { msg_type := L1CTL_DATA_IND (3), flags := { padding := '0000000'B, f_done := false }, padding := '0000'O }, dl_info := { chan_nr := { u := { ch0 := RSL_CHAN_NR_Bm_ACCH (1) }, tn := 1 }, link_id := { c := FACCH_SDCCH (0), na := false, prio := SAPI0_PRIO_NORMAL (0), sapi := 0 }, arfcn := { pcs := false, uplink := false, spare := '0000'B, arfcn := 871 }, frame_nr := 160, rx_level := 0, snr := 0, num_biterr := 0, fire_crc := 2 }, payload := omit } id 18
16:45:07.828445 11 BTS_Tests_LAPDm.ttcn:277 Matching on port LAPDM .lapdm.ab.addr.c_r := true with false unmatched.lapdm.ab.ctrl.u.u := '000'B with '011'B unmatched.lapdm.ab.ctrl.u.p_f := false with true unmatched.lapdm.ab.payload := ''O with '77D30B2993'O unmatched: First message in the queue does not match the template:
</pre></p> OsmoBTS - Bug #5262: BTS_Tests_LAPDm: TC_rr_response_frame_loss fails if run after TC_t200_n200https://osmocom.org/issues/5262?journal_id=226882021-10-13T17:15:43Zpespin
<ul></ul><p>It seems osmo-bts is crashing when receiving that Lapdm message (I was also seeing a crash in nightly ttcn3-bts-tests but I didn't figure out yet where did occured):</p>
<pre>
20211013164507661 DRSL <0000> rsl.c:3636 (bts=0,trx=0,ts=0,ss=0) Rx RSL BCCH_INFO
20211013164507661 DRSL <0000> rsl.c:521 Rx RSL BCCH INFO (SI4, 23 bytes)
20211013164507661 DPAG <0005> paging.c:720 Paging SI update
20211013164507677 DRSL <0000> rsl.c:3702 (bts=0,trx=0,ts=1,pchan=TCH/F) ss=0 Rx RSL CHAN_ACTIV
20211013164507677 DRSL <0000> rsl.c:1854 (bts=0,trx=0,ts=1,ss=0) chan_nr=TCH/F on TS1 type=0x00=INITIAL mode=SIGNALLING
20211013164507677 DL1C <0006> l1sap.c:2038 (bts=0,trx=0,ts=1,ss=0) Activating channel TCH/F on TS1
20211013164507677 DL1C <0006> scheduler.c:1107 (bts=0,trx=0,ts=1,ss=0) Activating TCH/F
20211013164507677 DL1C <0006> scheduler.c:1107 (bts=0,trx=0,ts=1,ss=0) Activating SACCH/TF
20211013164507677 DL1C <0006> scheduler.c:1170 (bts=0,trx=0,ts=1) Set mode for TCH/F (rsl_cmode=3, tch_mode=0, handover=0)
20211013164507677 DTRX <000b> trx_if.c:256 phy0.0: Enqueuing TRX control command 'CMD NOHANDOVER 1 0'
20211013164507677 DL1C <0006> scheduler.c:1236 (bts=0,trx=0,ts=1,ss=0) Set A5/0 uplink for TCH/F
20211013164507677 DL1C <0006> scheduler.c:1236 (bts=0,trx=0,ts=1,ss=0) Set A5/0 uplink for SACCH/TF
20211013164507677 DL1C <0006> scheduler.c:1236 (bts=0,trx=0,ts=1,ss=0) Set A5/0 downlink for TCH/F
20211013164507677 DL1C <0006> scheduler.c:1236 (bts=0,trx=0,ts=1,ss=0) Set A5/0 downlink for SACCH/TF
20211013164507677 DL1C <0006> l1sap.c:839 (bts=0,trx=0,ts=1,ss=0) activate confirm chan_nr=TCH/F on TS1 trx=0
20211013164507677 DRSL <0000> rsl.c:1327 (bts=0,trx=0,ts=1,pchan=TCH/F) (ss=0) TCH_F Tx CHAN ACT ACK
20211013164507677 DTRX <000b> trx_if.c:672 phy0.0: Response message: 'RSP NOHANDOVER 0 1 0'
20211013164507777 DL1P <0007> sched_lchan_tchf.c:216 000159/00/03/06/03 (bts=0,trx=0,ts=1) TCH/F: Received bad data (55/104)
20211013164507796 DLLAPD <0011> lapd_core.c:949 ((bts=0,trx=0,ts=1,ss=0)[DCCH][0]) Store content res.
20211013164507796 DRSL <0000> rsl.c:3603 (bts=0,trx=0,ts=1,ss=0) Fwd RLL msg EST_IND from LAPDm to A-bis
0: stopped pid 8 with status 139
sleeping 1 seconds...
</pre> OsmoBTS - Bug #5262: BTS_Tests_LAPDm: TC_rr_response_frame_loss fails if run after TC_t200_n200https://osmocom.org/issues/5262?journal_id=226892021-10-13T17:17:27Zpespin
<ul></ul><p>If test TC_rr_response_frame_loss is run alone, osmo-bts doesn't crash.</p> OsmoBTS - Bug #5262: BTS_Tests_LAPDm: TC_rr_response_frame_loss fails if run after TC_t200_n200https://osmocom.org/issues/5262?journal_id=226912021-10-13T17:24:35Zpespin
<ul></ul><p>I enabled ASan:<br /><pre>
20211013172315371 DRSL <0000> rsl.c:3702 (bts=0,trx=0,ts=1,pchan=TCH/F) ss=0 Rx RSL CHAN_ACTIV
20211013172315371 DRSL <0000> rsl.c:1854 (bts=0,trx=0,ts=1,ss=0) chan_nr=TCH/F on TS1 type=0x00=INITIAL mode=SIGNALLING
20211013172315371 DL1C <0006> l1sap.c:2038 (bts=0,trx=0,ts=1,ss=0) Activating channel TCH/F on TS1
20211013172315371 DL1C <0006> scheduler.c:1107 (bts=0,trx=0,ts=1,ss=0) Activating TCH/F
20211013172315371 DL1C <0006> scheduler.c:1107 (bts=0,trx=0,ts=1,ss=0) Activating SACCH/TF
20211013172315371 DL1C <0006> scheduler.c:1170 (bts=0,trx=0,ts=1) Set mode for TCH/F (rsl_cmode=3, tch_mode=0, handover=0)
20211013172315371 DTRX <000b> trx_if.c:256 phy0.0: Enqueuing TRX control command 'CMD NOHANDOVER 1 0'
20211013172315371 DL1C <0006> scheduler.c:1236 (bts=0,trx=0,ts=1,ss=0) Set A5/0 uplink for TCH/F
20211013172315371 DL1C <0006> scheduler.c:1236 (bts=0,trx=0,ts=1,ss=0) Set A5/0 uplink for SACCH/TF
20211013172315371 DL1C <0006> scheduler.c:1236 (bts=0,trx=0,ts=1,ss=0) Set A5/0 downlink for TCH/F
20211013172315371 DL1C <0006> scheduler.c:1236 (bts=0,trx=0,ts=1,ss=0) Set A5/0 downlink for SACCH/TF
20211013172315371 DL1C <0006> l1sap.c:839 (bts=0,trx=0,ts=1,ss=0) activate confirm chan_nr=TCH/F on TS1 trx=0
20211013172315372 DRSL <0000> rsl.c:1327 (bts=0,trx=0,ts=1,pchan=TCH/F) (ss=0) TCH_F Tx CHAN ACT ACK
20211013172315372 DTRX <000b> trx_if.c:672 phy0.0: Response message: 'RSP NOHANDOVER 0 1 0'
20211013172315442 DL1P <0007> sched_lchan_tchf.c:216 000159/00/03/06/03 (bts=0,trx=0,ts=1) TCH/F: Received bad data (55/104)
20211013172315461 DLLAPD <0011> lapd_core.c:949 ((bts=0,trx=0,ts=1,ss=0)[DCCH][0]) Store content res.
20211013172315461 DRSL <0000> rsl.c:3603 (bts=0,trx=0,ts=1,ss=0) Fwd RLL msg EST_IND from LAPDm to A-bis
rbtree.c:82:24: runtime error: member access within null pointer of type 'struct rb_node'
ASAN:DEADLYSIGNAL
=================================================================
==8==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000010 (pc 0x7ff53be4cc96 bp 0x7ffd03c21950 sp 0x7ffd03c21920 T0)
#0 0x7ff53be4cc95 in rb_insert_color (/usr/local/lib/libosmocore.so.17+0x100c95)
#1 0x7ff53be04256 in __add_timer (/usr/local/lib/libosmocore.so.17+0xb8256)
#2 0x7ff53be04460 in osmo_timer_add (/usr/local/lib/libosmocore.so.17+0xb8460)
#3 0x7ff53be04856 in osmo_timer_schedule (/usr/local/lib/libosmocore.so.17+0xb8856)
#4 0x7ff53be42ee5 in rate_ctr_timer_cb (/usr/local/lib/libosmocore.so.17+0xf6ee5)
#5 0x7ff53be05b10 in osmo_timers_update (/usr/local/lib/libosmocore.so.17+0xb9b10)
#6 0x7ff53be09395 in _osmo_select_main (/usr/local/lib/libosmocore.so.17+0xbd395)
#7 0x7ff53be09446 in osmo_select_main (/usr/local/lib/libosmocore.so.17+0xbd446)
#8 0x55a27d03205e in bts_main (/usr/local/bin/osmo-bts-trx+0x23205e)
#9 0x55a27cf598e9 in main (/usr/local/bin/osmo-bts-trx+0x1598e9)
#10 0x7ff53a8952e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
#11 0x55a27cf58729 in _start (/usr/local/bin/osmo-bts-trx+0x158729)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/usr/local/lib/libosmocore.so.17+0x100c95) in rb_insert_color
==8==ABORTING
0: stopped pid 8 with status 1
</pre></p> OsmoBTS - Bug #5262: BTS_Tests_LAPDm: TC_rr_response_frame_loss fails if run after TC_t200_n200https://osmocom.org/issues/5262?journal_id=226922021-10-13T17:26:44Zpespin
<ul><li><strong>File</strong> <a href="/attachments/4718">osmo-bts.log</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/4718/osmo-bts.log">osmo-bts.log</a> added</li></ul><p>I attach full osmo-bts.log file while running both tests.</p> OsmoBTS - Bug #5262: BTS_Tests_LAPDm: TC_rr_response_frame_loss fails if run after TC_t200_n200https://osmocom.org/issues/5262?journal_id=226932021-10-13T17:28:20Zpespin
<ul></ul><p>So to me it looks like some struct is freed at some point which still holds a timer set and which is not removed when freed.</p> OsmoBTS - Bug #5262: BTS_Tests_LAPDm: TC_rr_response_frame_loss fails if run after TC_t200_n200https://osmocom.org/issues/5262?journal_id=226942021-10-13T17:33:56Zpespin
<ul></ul><p>Since the crash happens in rate_ctr_timer_cb(), it's probably some rate_ctr_group which is somehow freed but not removed properly. So probably automatically freed by talloc_free() but no rate_ctr_group_free() is called on it.</p> OsmoBTS - Bug #5262: BTS_Tests_LAPDm: TC_rr_response_frame_loss fails if run after TC_t200_n200https://osmocom.org/issues/5262?journal_id=227072021-10-14T12:21:25Zpespin
<ul></ul><p>The most plausible ctr_group having issues is:<br /><pre>
src/common/scheduler.c
619: l1ts->ctrs = rate_ctr_group_alloc(ts->trx,
src/common/scheduler.c
663: rate_ctr_group_free(l1ts->ctrs);
</pre></p>
<p>This is called twice per TS (non-shadow and shadow TS). trx_sched_init_ts() does the alloc and trx_sched_clean_ts() does the free.</p> OsmoBTS - Bug #5262: BTS_Tests_LAPDm: TC_rr_response_frame_loss fails if run after TC_t200_n200https://osmocom.org/issues/5262?journal_id=227182021-10-14T16:24:39Zpespin
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Feedback</i></li><li><strong>% Done</strong> changed from <i>0</i> to <i>90</i></li></ul><p>The issue was lchan->lapdm_ch not being released under some conditions. Fixed by:<br /><a class="external" href="https://gerrit.osmocom.org/c/osmo-bts/+/25792">https://gerrit.osmocom.org/c/osmo-bts/+/25792</a></p> OsmoBTS - Bug #5262: BTS_Tests_LAPDm: TC_rr_response_frame_loss fails if run after TC_t200_n200https://osmocom.org/issues/5262?journal_id=227362021-10-15T12:29:17Zpespin
<ul></ul><p>Patch was merged, waiting for complete ttcn3-bts-tests nightly to run before closing.</p> OsmoBTS - Bug #5262: BTS_Tests_LAPDm: TC_rr_response_frame_loss fails if run after TC_t200_n200https://osmocom.org/issues/5262?journal_id=227702021-10-19T09:56:08Zpespin
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Resolved</i></li><li><strong>% Done</strong> changed from <i>90</i> to <i>100</i></li></ul><p>tests are fixed now, closing.</p>