Actions
Bug #4864
closedosmo-bts-trx: Assert failed bfd->data == line upon IPA reset
Start date:
11/19/2020
Due date:
% Done:
100%
Spec Reference:
Description
While setting up networking for osmo-bts + osmo-bsc inside a docker container and osmo-trx outside. Due to connection failure, osmo-bts is crashing:
Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: ((*)) Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: | Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: / \ OsmoBTS Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: <0017> control_if.c:911 CTRL at 127.0.0.1 4238 Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: <0010> telnet_interface.c:104 Available via telnet 0.0.0.0 4241 Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: <0012> input/ipaccess.c:1060 enabling ipaccess BTS mode, OML connecting to 127.0.0.1:3002 Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: <000b> trx_if.c:1216 phy0.0: Open transceiver Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: <0012> input/ipa.c:128 127.0.0.1:3002 connection done Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: <0012> input/ipaccess.c:882 received ID_GET for unit ID 6969/0/0 Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: <0012> input/ipaccess.c:505 failed to send A-bis IPA signalling message. Reason: Connection reset by peer Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: <0012> input/ipaccess.c:92 Forcing socket shutdown with no signal link set Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: Assert failed bfd->data == line input/ipaccess.c:99 Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: backtrace() returned 10 addresses Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /usr/lib/x86_64-linux-gnu/libosmocore.so.16(osmo_panic+0xcb) [0x7fad876641eb] Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /usr/lib/x86_64-linux-gnu/libosmoabis.so.9(+0xc47f) [0x7fad87f2647f] Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /usr/lib/x86_64-linux-gnu/libosmoabis.so.9(+0xc5b4) [0x7fad87f265b4] Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /usr/lib/x86_64-linux-gnu/libosmoabis.so.9(+0xa7c4) [0x7fad87f247c4] Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /usr/lib/x86_64-linux-gnu/libosmocore.so.16(+0xd258) [0x7fad87656258] Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /usr/lib/x86_64-linux-gnu/libosmocore.so.16(osmo_select_main+0x6) [0x7fad876562d6] Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /usr/bin/osmo-bts-trx(+0x3738c) [0x55c16aee038c] Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1) [0x7fad86ea92e1] Nov 19 13:42:32 3debd210bf35 osmo-bts-trx[34]: /usr/bin/osmo-bts-trx(+0xb19a) [0x55c16aeb419a] Nov 19 13:42:32 3debd210bf35 systemd[1]: osmo-bts-trx.service: Main process exited, code=dumped, status=6/ABRT Nov 19 13:42:32 3debd210bf35 systemd[1]: osmo-bts-trx.service: Unit entered failed state. Nov 19 13:42:32 3debd210bf35 systemd[1]: osmo-bts-trx.service: Failed with result 'core-dump'
# osmo-bts-trx --version OsmoBTS version 1.2.0.331-70d7
Updated by pespin over 3 years ago
way to reproduce: it seems it's resetting due to wrong unit-id being passed (incorrect config).
Updated by pespin over 3 years ago
It can be reproduced with following patch to emulate the conn lost (in the real case ipacces_drop() is called during send() failure in __handle_ts1_write()):
diff --git a/src/input/ipaccess.c b/src/input/ipaccess.c index 593f9f0..c22dd09 100644 --- a/src/input/ipaccess.c +++ b/src/input/ipaccess.c @@ -899,6 +899,7 @@ int ipaccess_bts_handle_ccm(struct ipa_client_conn *link, goto err_rmsg; } msgb_free(rmsg); + ipaccess_drop(link->ofd, line); } return 1; }
((*)) | / \ OsmoBTS 20201123134944765 DL1C <0006> fsm.c:461 TRX_PROV[0x6120000042a0]{CLOSED}: Allocated 20201123134944765 DL1C <0006> trx_provision_fsm.c:102 phy0.0: Delaying provision, TRX not yet assigned to phy instance 20201123134944765 DLGLOBAL <0010> cpu_sched_vty.c:470 Setting SCHED_RR priority 1 20201123134944766 DLCTRL <0017> control_if.c:911 CTRL at 127.0.0.1 4238 20201123134944766 DLGLOBAL <0010> telnet_interface.c:104 Available via telnet 127.0.0.1 4241 20201123134944766 DPCU <0009> pcu_sock.c:1131 Started listening on PCU socket: /tmp/pcu_bts 20201123134944766 DLINP <0012> e1_input.c:466 E1L(0) Line (0x62f000000460) reference count ctor changed 0 -> 1 [1 (ctor)] 20201123134944766 DLINP <0012> ipaccess.c:1060 enabling ipaccess BTS mode, OML connecting to 127.0.0.1:3002 20201123134944766 DL1C <0006> phy_link.c:58 phy0: PHY link state change shutdown -> connecting 20201123134944766 DOML <0001> phy_link.c:70 NM_RCARRIER_OP(bts0-trx0)[0x6120000033a0]{DISABLED_NOTINSTALLED}: Received Event PHYLINK_DOWN 20201123134944766 DOML <0001> phy_link.c:74 NM_BBTRANSC_OP(bts0-trx0)[0x612000003520]{DISABLED_NOTINSTALLED}: Received Event PHYLINK_DOWN 20201123134944766 DL1C <0006> scheduler.c:589 Init scheduler for trx=0 20201123134944766 DTRX <000b> trx_if.c:1216 phy0.0: Open transceiver 20201123134944766 DL1C <0006> trx_if.c:1304 TRX_PROV(phy0-0)[0x6120000042a0]{CLOSED}: Received Event TRX_PROV_EV_OPEN 20201123134944766 DTRX <000b> trx_if.c:248 phy0.0: Enqueuing TRX control command 'CMD POWEROFF' 20201123134944766 DL1C <0006> trx_provision_fsm.c:267 TRX_PROV(phy0-0)[0x6120000042a0]{CLOSED}: state_chg to OPEN_POWEROFF 20201123134944766 DTRX <000b> trx_if.c:248 phy0.0: Enqueuing TRX control command 'CMD RFMUTE 1' 20201123134944766 DLINP <0012> ipa.c:128 127.0.0.1:3002 connection done 20201123134944766 DLINP <0012> ipaccess.c:882 received ID_GET for unit ID 6969/0/0 20201123134944766 DLINP <0012> ipaccess.c:802 tag 8: 6969/0/0 20201123134944766 DLINP <0012> ipaccess.c:802 tag 7: 00:00:00:00:00:00 20201123134944766 DLINP <0012> ipaccess.c:802 tag 2: 20201123134944766 DLINP <0012> ipaccess.c:802 tag 3: sysmoBTS 20201123134944766 DLINP <0012> ipaccess.c:802 tag 4: 20201123134944766 DLINP <0012> ipaccess.c:802 tag 5: 1.2.0.332-2edf 20201123134944766 DLINP <0012> ipaccess.c:802 tag 1: sysmoBTS-00-00-00-00-00-00 20201123134944766 DLINP <0012> ipaccess.c:802 tag 0: 20201123134944766 DLINP <0012> ipaccess.c:85 E1L(0) Line (0x62f000000460) reference count ipaccess_drop changed 0 -> 1 [2 (ctor,ipaccess_drop)] 20201123134944766 DLINP <0012> ipaccess.c:91 Forcing socket shutdown with no signal link set Assert failed bfd->data == line /git/libosmo-abis/src/input/ipaccess.c:99 backtrace() returned 17 addresses /build/new/out/lib/libosmocore.so.16(+0x1508f5) [0x7ffff66878f5] /build/new/out/lib/libosmocore.so.16(osmo_generate_backtrace+0x18) [0x7ffff6687cc4] /build/new/out/lib/libosmocore.so.16(+0x15067f) [0x7ffff668767f] /build/new/out/lib/libosmocore.so.16(osmo_panic+0x13c) [0x7ffff66877c5] /build/new/out/lib/libosmoabis.so.9(+0xbb4fe) [0x7ffff72fe4fe] /build/new/out/lib/libosmoabis.so.9(ipaccess_bts_handle_ccm+0xf0b) [0x7ffff7307d9e] /build/new/out/lib/libosmoabis.so.9(+0xc5272) [0x7ffff7308272] /build/new/out/lib/libosmoabis.so.9(+0xb10cc) [0x7ffff72f40cc] /build/new/out/lib/libosmoabis.so.9(+0xb24cf) [0x7ffff72f54cf] /build/new/out/lib/libosmocore.so.16(+0x105366) [0x7ffff663c366] /build/new/out/lib/libosmocore.so.16(+0x1054ac) [0x7ffff663c4ac] /build/new/out/lib/libosmocore.so.16(osmo_select_main+0x16) [0x7ffff663c4cb] /build/new/out/bin/osmo-bts-trx(+0x2ba1cd) [0x55555580e1cd] /build/new/out/bin/osmo-bts-trx(+0x1c172a) [0x55555571572a] /usr/lib/libc.so.6(__libc_start_main+0xf2) [0x7ffff59ea152] /build/new/out/bin/osmo-bts-trx(+0x1c031e) [0x55555571431e] Program received signal SIGABRT, Aborted. 0x00007ffff59ff615 in raise () from /usr/lib/libc.so.6 (gdb) bt #0 0x00007ffff59ff615 in raise () from /usr/lib/libc.so.6 #1 0x00007ffff59e8862 in abort () from /usr/lib/libc.so.6 #2 0x00007ffff6687689 in osmo_panic_default (fmt=0x7ffff733f5e0 "Assert failed %s %s:%d\n", args=0x7fffffffdb50) at /git/libosmocore/src/panic.c:49 #3 0x00007ffff66877c5 in osmo_panic (fmt=0x7ffff733f5e0 "Assert failed %s %s:%d\n") at /git/libosmocore/src/panic.c:84 #4 0x00007ffff72fe4fe in ipaccess_drop (bfd=0x62f000000a50, line=0x62f000000460) at /git/libosmo-abis/src/input/ipaccess.c:99 #5 0x00007ffff7307d9e in ipaccess_bts_handle_ccm (link=0x612000004420, dev=0x555555a85580 <bts_dev_info>, msg=0x6330000008e0) at /git/libosmo-abis/src/input/ipaccess.c:902 #6 0x00007ffff7308272 in ipaccess_bts_read_cb (link=0x612000004420, msg=0x6330000008e0) at /git/libosmo-abis/src/input/ipaccess.c:938 #7 0x00007ffff72f40cc in ipa_client_read (link=0x612000004420) at /git/libosmo-abis/src/input/ipa.c:72 #8 0x00007ffff72f54cf in ipa_client_fd_cb (ofd=0x62f000000a50, what=1) at /git/libosmo-abis/src/input/ipa.c:136 #9 0x00007ffff663c366 in poll_disp_fds (n_fd=9) at /git/libosmocore/src/select.c:350 #10 0x00007ffff663c4ac in _osmo_select_main (polling=0) at /git/libosmocore/src/select.c:378 #11 0x00007ffff663c4cb in osmo_select_main (polling=0) at /git/libosmocore/src/select.c:417 #12 0x000055555580e1cd in bts_main (argc=7, argv=0x7fffffffe1e8) at /git/osmo-bts/src/common/main.c:398 #13 0x000055555571572a in main (argc=7, argv=0x7fffffffe1e8) at /git/osmo-bts/src/osmo-bts-trx/main.c:185
Updated by pespin over 3 years ago
- Status changed from New to Feedback
- % Done changed from 0 to 90
Related fixes:
remote: https://gerrit.osmocom.org/c/libosmo-abis/+/21281 ipaccess: Fix log error printed on wrong conditional branch [NEW]
remote: https://gerrit.osmocom.org/c/libosmo-abis/+/21282 ipaccess: Fix wrong assertion in ipaccess_drop() when used by BTS code [NEW]
Updated by pespin over 3 years ago
- Status changed from Feedback to Resolved
- % Done changed from 90 to 100
Merged, closing.
Actions