Bug #5645
closedosmo-bts-trx crashes when using gsmtap option -i
100%
Description
When osmo-bts-trx is started with option -i (osmo-bts-trx -c ./osmo-bts.cfg -i 127.0.0.1), then it fails with SIGABRT (or Segmentation Fault without GDB). The problem is most likely related to an msgb that is too small.
Mon Aug 15 14:43:21 2022 <0001> nm_channel_fsm.c:152 NM_CHAN_OP(bts0-trx0-ts5)[0x55555576f660]{DISABLED_OFFLINE}: state_chg to ENABLED Mon Aug 15 14:43:21 2022 <0001> oml.c:354 OC=CHANNEL INST=(00,00,05) AVAIL STATE Off line -> OK Mon Aug 15 14:43:21 2022 <0001> oml.c:362 OC=CHANNEL INST=(00,00,05) OPER STATE Disabled -> Enabled Mon Aug 15 14:43:21 2022 <0009> pcu_sock.c:242 Sending info Mon Aug 15 14:43:21 2022 <0009> pcu_sock.c:257 BTS is up Mon Aug 15 14:43:21 2022 <0001> oml.c:144 OC=CHANNEL(03) INST=(00,00,05): Tx State Changed Event Report Mon Aug 15 14:43:21 2022 <0011> input/ipa.c:139 127.0.0.1:3002 connected write Mon Aug 15 14:43:21 2022 <0011> input/ipa.c:139 127.0.0.1:3002 connected write Mon Aug 15 14:43:21 2022 <0011> input/ipa.c:135 127.0.0.1:3002 connected read Mon Aug 15 14:43:21 2022 <0011> input/ipa.c:56 127.0.0.1:3002 message received Mon Aug 15 14:43:21 2022 <0001> oml.c:1017 OC=CHANNEL(03) INST=(00,00,06): Rx OPSTART Mon Aug 15 14:43:21 2022 <0001> l1_if.c:588 NM_CHAN_OP(bts0-trx0-ts6)[0x55555576fcb0]{DISABLED_OFFLINE}: Received Event OPSTART_ACK Mon Aug 15 14:43:21 2022 <0001> oml.c:144 OC=CHANNEL(03) INST=(00,00,06): Tx Opstart Ack Mon Aug 15 14:43:21 2022 <0001> nm_channel_fsm.c:152 NM_CHAN_OP(bts0-trx0-ts6)[0x55555576fcb0]{DISABLED_OFFLINE}: state_chg to ENABLED Mon Aug 15 14:43:21 2022 <0001> oml.c:354 OC=CHANNEL INST=(00,00,06) AVAIL STATE Off line -> OK Mon Aug 15 14:43:21 2022 <0001> oml.c:362 OC=CHANNEL INST=(00,00,06) OPER STATE Disabled -> Enabled Mon Aug 15 14:43:21 2022 <0009> pcu_sock.c:242 Sending info Mon Aug 15 14:43:21 2022 <0009> pcu_sock.c:257 BTS is up Mon Aug 15 14:43:21 2022 <0001> oml.c:144 OC=CHANNEL(03) INST=(00,00,06): Tx State Changed Event Report Mon Aug 15 14:43:21 2022 <0011> input/ipa.c:139 127.0.0.1:3002 connected write Mon Aug 15 14:43:21 2022 <0011> input/ipa.c:139 127.0.0.1:3002 connected write Mon Aug 15 14:43:21 2022 <0011> input/ipa.c:139 127.0.0.1:3002 connected write Mon Aug 15 14:43:21 2022 <0011> input/ipa.c:135 127.0.0.1:3002 connected read Mon Aug 15 14:43:21 2022 <0011> input/ipa.c:56 127.0.0.1:3002 message received Mon Aug 15 14:43:21 2022 <0001> oml.c:1017 OC=CHANNEL(03) INST=(00,00,07): Rx OPSTART Mon Aug 15 14:43:21 2022 <0001> l1_if.c:588 NM_CHAN_OP(bts0-trx0-ts7)[0x555555770300]{DISABLED_OFFLINE}: Received Event OPSTART_ACK Mon Aug 15 14:43:21 2022 <0001> oml.c:144 OC=CHANNEL(03) INST=(00,00,07): Tx Opstart Ack Mon Aug 15 14:43:21 2022 <0001> nm_channel_fsm.c:152 NM_CHAN_OP(bts0-trx0-ts7)[0x555555770300]{DISABLED_OFFLINE}: state_chg to ENABLED Mon Aug 15 14:43:21 2022 <0001> oml.c:354 OC=CHANNEL INST=(00,00,07) AVAIL STATE Off line -> OK Mon Aug 15 14:43:21 2022 <0001> oml.c:362 OC=CHANNEL INST=(00,00,07) OPER STATE Disabled -> Enabled Mon Aug 15 14:43:21 2022 <0009> pcu_sock.c:242 Sending info Mon Aug 15 14:43:21 2022 <0009> pcu_sock.c:257 BTS is up Mon Aug 15 14:43:21 2022 <0009> pcu_sock.c:221 (bts=0,trx=0) PDCH on ts=7 is available (tsc=6 hopping=no arfcn=868) Mon Aug 15 14:43:21 2022 <0001> oml.c:144 OC=CHANNEL(03) INST=(00,00,07): Tx State Changed Event Report Mon Aug 15 14:43:21 2022 <0011> input/ipa.c:139 127.0.0.1:3002 connected write Mon Aug 15 14:43:21 2022 <0009> pcu_sock.c:863 Activate request received: TRX=0 TS=7 Mon Aug 15 14:43:21 2022 <0006> l1sap.c:1942 (bts=0,trx=0,ts=7,ss=0) Activating channel PDCH on TS7 Mon Aug 15 14:43:21 2022 <0006> scheduler.c:1099 (bts=0,trx=0,ts=7,ss=0) Activating PDTCH Mon Aug 15 14:43:21 2022 <0006> scheduler.c:1099 (bts=0,trx=0,ts=7,ss=0) Activating PTCCH Mon Aug 15 14:43:21 2022 <0006> lchan.c:271 (bts=0,trx=0,ts=7,ss=0) state NONE -> ACTIVE Mon Aug 15 14:43:21 2022 <0006> l1sap.c:837 (bts=0,trx=0,ts=7,ss=0) activate confirm chan_nr=PDCH on TS7 trx=0 Mon Aug 15 14:43:21 2022 <0000> rsl.c:1389 (bts=0,trx=0,ts=7,ss=0) not sending CHAN ACT ACK Mon Aug 15 14:43:21 2022 <0011> input/ipa.c:139 127.0.0.1:3002 connected write Mon Aug 15 14:43:21 2022 <0009> pcu_sock.c:448 Sending rts request: is_ptcch=0 arfcn=868 block=11 Mon Aug 15 14:43:21 2022 <0009> pcu_sock.c:680 Data request received: sapi=PDTCH arfcn=868 block=11 data= msgb(0x5555558a3a30): Not enough tailroom msgb_put (allocated 52904, head at 0, len 16, tailroom 53024 < want tailroom 1432604448) backtrace() returned 22 addresses /usr/local/lib/libosmocore.so.19(osmo_generate_backtrace+0x18) [0x7ffff7e15e23] /usr/local/lib/libosmocore.so.19(+0x29b01) [0x7ffff7e15b01] /usr/local/lib/libosmocore.so.19(osmo_panic+0xca) [0x7ffff7e15bd0] /usr/local/lib/libosmocore.so.19(+0x28fd0) [0x7ffff7e14fd0] /usr/local/lib/libosmocore.so.19(gsmtap_makemsg_ex+0x110) [0x7ffff7e1533f] /usr/local/lib/libosmocore.so.19(gsmtap_send_ex+0x88) [0x7ffff7e1563e] /usr/local/lib/libosmocore.so.19(gsmtap_send+0x79) [0x7ffff7e156fa] /usr/local/bin/osmo-bts-trx(+0x51806) [0x5555555a5806] /usr/local/bin/osmo-bts-trx(+0x56493) [0x5555555aa493] /usr/local/bin/osmo-bts-trx(+0x566c4) [0x5555555aa6c4] /usr/local/bin/osmo-bts-trx(+0x4998e) [0x55555559d98e] /usr/local/bin/osmo-bts-trx(+0x4a740) [0x55555559e740] /usr/local/bin/osmo-bts-trx(+0x4b174) [0x55555559f174] /usr/local/bin/osmo-bts-trx(+0x4b3b9) [0x55555559f3b9] /usr/local/lib/libosmocore.so.19(+0x1082c) [0x7ffff7dfc82c] /usr/local/lib/libosmocore.so.19(+0x10939) [0x7ffff7dfc939] /usr/local/lib/libosmocore.so.19(osmo_select_main+0x15) [0x7ffff7dfc955] /usr/local/bin/osmo-bts-trx(+0x5a5f1) [0x5555555ae5f1] /usr/local/bin/osmo-bts-trx(+0xd5ed) [0x5555555615ed] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7ffff7c1309b] /usr/local/bin/osmo-bts-trx(+0xd14a) [0x55555556114a] Program received signal SIGABRT, Aborted. __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory. (gdb)
Related issues
Updated by pespin over 1 year ago
Mon Aug 15 14:43:21 2022 <0009> pcu_sock.c:680 Data request received: sapi=PDTCH arfcn=868 block=11 data= msgb(0x5555558a3a30): Not enough tailroom msgb_put (allocated 52904, head at 0, len 16, tailroom 53024 < want tailroom 1432604448)
Looks like something is wrong with that size 1432604448. This size should be rejected even before trying to allocate a msgb for it.
So the problem is not the msgb being too small, it's a size totally unrealistic being passed. Maybe a negative number converted to unsigned?
Updated by pespin over 1 year ago
dexter can you provide more information on the versions, setup, etc. you were using when having this crash?
Updated by dexter over 1 year ago
- Status changed from New to In Progress
- % Done changed from 0 to 60
I have traced the problem down to a problem with msgb_l2hlen(). This function returns a value, even when l2h is not populated. This is the case sometimes. When I fix the problem it no longer segfaults. As far as I know, the reason why l2h is not populated in some cases is simply because of idle blocks that (see also l1sap_pdch_req()) and those won't have l2h populated.
Updated by dexter over 1 year ago
- % Done changed from 60 to 80
We have decied to put asserts in the msgb_lXhlen() functions, so if we call it with an msgb that does not have the related lXh pointer populated the program will stop. I have now fixed the location that caused the problems with gsmtap. I also found one more.
https://gerrit.osmocom.org/c/osmo-bts/+/29650 l1sap: do not call msgb_l2hlen without checking
https://gerrit.osmocom.org/c/osmo-bts/+/29651 measurement: do not call msgb_l3len without checking
Updated by fixeria over 1 year ago
- Priority changed from Normal to High
The libosmocore patch has been merged earlier than the respective osmo-bts patches. I assume this is why we have +35 failures in https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bts-test/1799/. Bumping priority to high, let's get the remaining patches merged ASAP.
Updated by dexter over 1 year ago
I am on this now. I think this is in particular related to the problem in measurement.c, which I unfortunately noticed late. Patch https://gerrit.osmocom.org/c/osmo-bts/+/29651 should fix it. So far I haven't seen any problems with other osmocom programs.
Updated by dexter over 1 year ago
- Status changed from In Progress to Resolved
- % Done changed from 80 to 100
This is now fixed and all patches are merged.
Updated by fixeria over 1 year ago
- Related to Bug #5765: osmo-bts-trx crashes when using gsmtap-sapi pdtch added