Project

General

Profile

Bug #4864

osmo-bts-trx: Assert failed bfd->data == line upon IPA reset

Added by pespin 5 days ago. Updated 1 day ago.

Status:
Feedback
Priority:
Normal
Assignee:
Category:
-
Target version:
-
Start date:
11/19/2020
Due date:
% Done:

90%

Spec Reference:

Description

While setting up networking for osmo-bts + osmo-bsc inside a docker container and osmo-trx outside. Due to connection failure, osmo-bts is crashing:

Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: ((*))
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]:   |
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]:  / \ OsmoBTS
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: <0017> control_if.c:911 CTRL at 127.0.0.1 4238
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: <0010> telnet_interface.c:104 Available via telnet 0.0.0.0 4241
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: <0012> input/ipaccess.c:1060 enabling ipaccess BTS mode, OML connecting to 127.0.0.1:3002
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: <000b> trx_if.c:1216 phy0.0: Open transceiver
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: <0012> input/ipa.c:128 127.0.0.1:3002 connection done
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: <0012> input/ipaccess.c:882 received ID_GET for unit ID 6969/0/0
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: <0012> input/ipaccess.c:505 failed to send A-bis IPA signalling message. Reason: Connection reset by peer
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: <0012> input/ipaccess.c:92 Forcing socket shutdown with no signal link set
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: Assert failed bfd->data == line input/ipaccess.c:99
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: backtrace() returned 10 addresses
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /usr/lib/x86_64-linux-gnu/libosmocore.so.16(osmo_panic+0xcb) [0x7fad876641eb]
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /usr/lib/x86_64-linux-gnu/libosmoabis.so.9(+0xc47f) [0x7fad87f2647f]
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /usr/lib/x86_64-linux-gnu/libosmoabis.so.9(+0xc5b4) [0x7fad87f265b4]
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /usr/lib/x86_64-linux-gnu/libosmoabis.so.9(+0xa7c4) [0x7fad87f247c4]
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /usr/lib/x86_64-linux-gnu/libosmocore.so.16(+0xd258) [0x7fad87656258]
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /usr/lib/x86_64-linux-gnu/libosmocore.so.16(osmo_select_main+0x6) [0x7fad876562d6]
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /usr/bin/osmo-bts-trx(+0x3738c) [0x55c16aee038c]
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1) [0x7fad86ea92e1]
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /usr/bin/osmo-bts-trx(+0xb19a) [0x55c16aeb419a]
Nov 19 13:42:32 3debd210bf35 systemd[1]: osmo-bts-trx.service: Main process exited, code=dumped, status=6/ABRT
Nov 19 13:42:32 3debd210bf35 systemd[1]: osmo-bts-trx.service: Unit entered failed state.
Nov 19 13:42:32 3debd210bf35 systemd[1]: osmo-bts-trx.service: Failed with result 'core-dump'

# osmo-bts-trx --version
OsmoBTS version 1.2.0.331-70d7

History

#1 Updated by pespin 5 days ago

way to reproduce: it seems it's resetting due to wrong unit-id being passed (incorrect config).

#2 Updated by pespin 1 day ago

It can be reproduced with following patch to emulate the conn lost (in the real case ipacces_drop() is called during send() failure in __handle_ts1_write()):

diff --git a/src/input/ipaccess.c b/src/input/ipaccess.c
index 593f9f0..c22dd09 100644
--- a/src/input/ipaccess.c
+++ b/src/input/ipaccess.c
@@ -899,6 +899,7 @@ int ipaccess_bts_handle_ccm(struct ipa_client_conn *link,
                                goto err_rmsg;
                        }
                        msgb_free(rmsg);
+                       ipaccess_drop(link->ofd, line);
                }
                return 1;
        }

((*))
  |
 / \ OsmoBTS
20201123134944765 DL1C <0006> fsm.c:461 TRX_PROV[0x6120000042a0]{CLOSED}: Allocated
20201123134944765 DL1C <0006> trx_provision_fsm.c:102 phy0.0: Delaying provision, TRX not yet assigned to phy instance
20201123134944765 DLGLOBAL <0010> cpu_sched_vty.c:470 Setting SCHED_RR priority 1
20201123134944766 DLCTRL <0017> control_if.c:911 CTRL at 127.0.0.1 4238
20201123134944766 DLGLOBAL <0010> telnet_interface.c:104 Available via telnet 127.0.0.1 4241
20201123134944766 DPCU <0009> pcu_sock.c:1131 Started listening on PCU socket: /tmp/pcu_bts
20201123134944766 DLINP <0012> e1_input.c:466 E1L(0) Line (0x62f000000460) reference count ctor changed 0 -> 1 [1 (ctor)]
20201123134944766 DLINP <0012> ipaccess.c:1060 enabling ipaccess BTS mode, OML connecting to 127.0.0.1:3002
20201123134944766 DL1C <0006> phy_link.c:58 phy0: PHY link state change shutdown -> connecting
20201123134944766 DOML <0001> phy_link.c:70 NM_RCARRIER_OP(bts0-trx0)[0x6120000033a0]{DISABLED_NOTINSTALLED}: Received Event PHYLINK_DOWN
20201123134944766 DOML <0001> phy_link.c:74 NM_BBTRANSC_OP(bts0-trx0)[0x612000003520]{DISABLED_NOTINSTALLED}: Received Event PHYLINK_DOWN
20201123134944766 DL1C <0006> scheduler.c:589 Init scheduler for trx=0
20201123134944766 DTRX <000b> trx_if.c:1216 phy0.0: Open transceiver
20201123134944766 DL1C <0006> trx_if.c:1304 TRX_PROV(phy0-0)[0x6120000042a0]{CLOSED}: Received Event TRX_PROV_EV_OPEN
20201123134944766 DTRX <000b> trx_if.c:248 phy0.0: Enqueuing TRX control command 'CMD POWEROFF'
20201123134944766 DL1C <0006> trx_provision_fsm.c:267 TRX_PROV(phy0-0)[0x6120000042a0]{CLOSED}: state_chg to OPEN_POWEROFF
20201123134944766 DTRX <000b> trx_if.c:248 phy0.0: Enqueuing TRX control command 'CMD RFMUTE 1'
20201123134944766 DLINP <0012> ipa.c:128 127.0.0.1:3002 connection done
20201123134944766 DLINP <0012> ipaccess.c:882 received ID_GET for unit ID 6969/0/0
20201123134944766 DLINP <0012> ipaccess.c:802  tag 8: 6969/0/0
20201123134944766 DLINP <0012> ipaccess.c:802  tag 7: 00:00:00:00:00:00
20201123134944766 DLINP <0012> ipaccess.c:802  tag 2:
20201123134944766 DLINP <0012> ipaccess.c:802  tag 3: sysmoBTS
20201123134944766 DLINP <0012> ipaccess.c:802  tag 4:
20201123134944766 DLINP <0012> ipaccess.c:802  tag 5: 1.2.0.332-2edf
20201123134944766 DLINP <0012> ipaccess.c:802  tag 1: sysmoBTS-00-00-00-00-00-00
20201123134944766 DLINP <0012> ipaccess.c:802  tag 0:
20201123134944766 DLINP <0012> ipaccess.c:85 E1L(0) Line (0x62f000000460) reference count ipaccess_drop changed 0 -> 1 [2 (ctor,ipaccess_drop)]
20201123134944766 DLINP <0012> ipaccess.c:91 Forcing socket shutdown with no signal link set
Assert failed bfd->data == line /git/libosmo-abis/src/input/ipaccess.c:99
backtrace() returned 17 addresses
/build/new/out/lib/libosmocore.so.16(+0x1508f5) [0x7ffff66878f5]
/build/new/out/lib/libosmocore.so.16(osmo_generate_backtrace+0x18) [0x7ffff6687cc4]
/build/new/out/lib/libosmocore.so.16(+0x15067f) [0x7ffff668767f]
/build/new/out/lib/libosmocore.so.16(osmo_panic+0x13c) [0x7ffff66877c5]
/build/new/out/lib/libosmoabis.so.9(+0xbb4fe) [0x7ffff72fe4fe]
/build/new/out/lib/libosmoabis.so.9(ipaccess_bts_handle_ccm+0xf0b) [0x7ffff7307d9e]
/build/new/out/lib/libosmoabis.so.9(+0xc5272) [0x7ffff7308272]
/build/new/out/lib/libosmoabis.so.9(+0xb10cc) [0x7ffff72f40cc]
/build/new/out/lib/libosmoabis.so.9(+0xb24cf) [0x7ffff72f54cf]
/build/new/out/lib/libosmocore.so.16(+0x105366) [0x7ffff663c366]
/build/new/out/lib/libosmocore.so.16(+0x1054ac) [0x7ffff663c4ac]
/build/new/out/lib/libosmocore.so.16(osmo_select_main+0x16) [0x7ffff663c4cb]
/build/new/out/bin/osmo-bts-trx(+0x2ba1cd) [0x55555580e1cd]
/build/new/out/bin/osmo-bts-trx(+0x1c172a) [0x55555571572a]
/usr/lib/libc.so.6(__libc_start_main+0xf2) [0x7ffff59ea152]
/build/new/out/bin/osmo-bts-trx(+0x1c031e) [0x55555571431e]

Program received signal SIGABRT, Aborted.
0x00007ffff59ff615 in raise () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007ffff59ff615 in raise () from /usr/lib/libc.so.6
#1  0x00007ffff59e8862 in abort () from /usr/lib/libc.so.6
#2  0x00007ffff6687689 in osmo_panic_default (fmt=0x7ffff733f5e0 "Assert failed %s %s:%d\n", args=0x7fffffffdb50)
    at /git/libosmocore/src/panic.c:49
#3  0x00007ffff66877c5 in osmo_panic (fmt=0x7ffff733f5e0 "Assert failed %s %s:%d\n") at /git/libosmocore/src/panic.c:84
#4  0x00007ffff72fe4fe in ipaccess_drop (bfd=0x62f000000a50, line=0x62f000000460) at /git/libosmo-abis/src/input/ipaccess.c:99
#5  0x00007ffff7307d9e in ipaccess_bts_handle_ccm (link=0x612000004420, dev=0x555555a85580 <bts_dev_info>, msg=0x6330000008e0)
    at /git/libosmo-abis/src/input/ipaccess.c:902
#6  0x00007ffff7308272 in ipaccess_bts_read_cb (link=0x612000004420, msg=0x6330000008e0)
    at /git/libosmo-abis/src/input/ipaccess.c:938
#7  0x00007ffff72f40cc in ipa_client_read (link=0x612000004420) at /git/libosmo-abis/src/input/ipa.c:72
#8  0x00007ffff72f54cf in ipa_client_fd_cb (ofd=0x62f000000a50, what=1) at /git/libosmo-abis/src/input/ipa.c:136
#9  0x00007ffff663c366 in poll_disp_fds (n_fd=9) at /git/libosmocore/src/select.c:350
#10 0x00007ffff663c4ac in _osmo_select_main (polling=0) at /git/libosmocore/src/select.c:378
#11 0x00007ffff663c4cb in osmo_select_main (polling=0) at /git/libosmocore/src/select.c:417
#12 0x000055555580e1cd in bts_main (argc=7, argv=0x7fffffffe1e8) at /git/osmo-bts/src/common/main.c:398
#13 0x000055555571572a in main (argc=7, argv=0x7fffffffe1e8) at /git/osmo-bts/src/osmo-bts-trx/main.c:185

#3 Updated by pespin 1 day ago

  • Status changed from New to Feedback
  • % Done changed from 0 to 90

Related fixes:
remote: https://gerrit.osmocom.org/c/libosmo-abis/+/21281 ipaccess: Fix log error printed on wrong conditional branch [NEW]
remote: https://gerrit.osmocom.org/c/libosmo-abis/+/21282 ipaccess: Fix wrong assertion in ipaccess_drop() when used by BTS code [NEW]

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)