Bug #4463
closed
osmo-pcu crash after re-enabling MS RA capabilities parsing from SGSN messages
100%
Description
Today I was running a network setup with osmo-pcu on my laptop, with 2 mobile phones registering, and osmo-pcu crashed.
It seems related to the RA Cap messages coming from osmo-sgsn, whose parsing we recently re-enabled in osmo-pcu.
<000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:321 NSVCI=65534 Creating NS-VC with Signal weight 1, Data weight 1
20200320204116517 DLGLOBAL <000e> /home/pespin/dev/sysmocom/git/libosmocore/src/vty/telnet_interface.c:104 Available via telnet 127.0.0.1 4240
20200320204116517 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/osmobts_sock.cpp:211 Opening OsmoPCU L1 interface to OsmoBTS
20200320204116517 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/osmobts_sock.cpp:229 osmo-bts PCU socket /tmp/pcu_bts has been connected
20200320204116517 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/pcu_l1_if.cpp:136 Sending 0.8.0.81-570f TXT as PCU_VERSION to BTS
20200320204116517 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/pcu_l1_if.cpp:501 BTS available
20200320204116517 DNS <000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:2070 Listening for nsip packets from 192.168.30.1:23000 on 0.0.0.0:23020
20200320204116517 DNS <000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:2094 NS UDP socket at 0.0.0.0:23020
20200320204116517 DNS <000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:321 NSVCI=1800 Creating NS-VC with Signal weight 1, Data weight 1
20200320204116517 DNS <000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:2113 NSEI=1800 RESET procedure based on API request
20200320204116517 DNS <000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:559 NSEI=1800 Tx NS RESET (NSVCI=1800, cause=O&M intervention)
20200320204116517 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/pcu_l1_if.cpp:148 Sending activate request: trx=0 ts=6
20200320204116517 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/pcu_l1_if.cpp:627 PDCH: trx=0 ts=6
20200320204116517 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/pcu_l1_if.cpp:148 Sending activate request: trx=0 ts=7
20200320204116517 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/pcu_l1_if.cpp:627 PDCH: trx=0 ts=7
20200320204116518 DNS <000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:1354 NSVCI=1800 Rx NS RESET ACK (NSEI=1800, NSVCI=1800)
20200320204116518 DNS <000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:704 NSEI=1800 Tx NS UNBLOCK (NSVCI=1800)
20200320204116518 DNS <000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:1806 NSEI=1800 Rx NS UNBLOCK ACK
20200320204116518 DPCU <000d> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:576 NS-VC 1800 is unblocked.
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:857 Sending reset on BVCI 0
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_bssgp_bss.c:300 BSSGP (BVCI=0) Tx BVC-RESET CAUSE=O&M intervention
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:323 Rx BSSGP BVCI=0 (SIGN) BVC_RESET_ACK
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:865 Sending reset on BVCI 1800
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_bssgp_bss.c:300 BSSGP (BVCI=1800) Tx BVC-RESET CAUSE=O&M intervention
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:323 Rx BSSGP BVCI=0 (SIGN) BVC_RESET_ACK
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:874 Sending unblock on BVCI 1800
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_bssgp_bss.c:281 BSSGP (BVCI=1800) Tx BVC-UNBLOCK
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:337 Rx BSSGP BVCI=0 (SIGN) BVC_UNBLOCK_ACK
20200320204531628 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/pcu_l1_if.cpp:442 RACH request received: sapi=1 qta=-1, ra=118, fn=1307419, cur_fn=1307423, is_11bit=0
20200320204532025 DCSN1 <0000> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gsm_rlcmac.cpp:5026 csnStreamDecoder (type=5):
20200320204532025 DRLCMAC <0002> /home/pespin/dev/sysmocom/git/osmo-pcu/src/pdch.cpp:609 MS supports EGPRS multislot class 12.
20200320204532025 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:992 Allocating UL TBF: MS_CLASS=12/12
20200320204532026 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:541 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=NULL) Setting Control TS 6
20200320204532026 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:948 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=NULL) Allocated: trx = 0, ul_slots = 40, dl_slots = 00
20200320204532048 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:1374 TBF(TFI=0 TLLI=0x8faaadbd DIR=UL STATE=ASSIGN) start Packet Uplink Assignment (PACCH)
20200320204532048 DCSN1 <0000> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gsm_rlcmac.cpp:5185 csnStreamDecoder (type=10):
20200320204532048 DTBFDL <0009> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:782 TBF(TFI=0 TLLI=0x8faaadbd DIR=UL STATE=ASSIGN) Scheduled UL Assignment polling on PACCH (FN=1307553, TS=7)
20200320204532264 DCSN1 <0000> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gsm_rlcmac.cpp:5026 csnStreamDecoder (type=1):
20200320204532264 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:544 TBF(TFI=0 TLLI=0x8faaadbd DIR=UL STATE=FLOW) Changing Control TS 6
20200320204532481 DBSSGP <000c> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf_ul.cpp:404 LLC [PCU -> SGSN] TBF(TFI=0 TLLI=0x8faaadbd DIR=UL STATE=FLOW) len=52
20200320204532482 DCSN1 <0000> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gsm_rlcmac.cpp:5792 csnStreamDecoder (RAcap):
20200320204532482 DRLCMACDATA <0003> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gsm_rlcmac.cpp:5800 Got 7 remaining bits unhandled by decoder at the end of bitvec
20200320204532482 DBSSGP <000c> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:163 LLC [SGSN -> PCU] = TLLI: 0x8faaadbd IMSI: 000 len: 9
20200320204532482 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:1071 Allocating DL TBF: MS_CLASS=12/12
20200320204532482 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:541 TBF(TFI=0 TLLI=0x00000000 DIR=DL STATE=NULL) Setting Control TS 6
20200320204532482 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:948 TBF(TFI=0 TLLI=0x8faaadbd DIR=DL STATE=NULL) Allocated: trx = 0, ul_slots = 40, dl_slots = 40
20200320204532482 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/bts.cpp:898 TBF(TFI=0 TLLI=0x8faaadbd DIR=DL STATE=ASSIGN) TX: START Immediate Assignment Downlink (PCH)
*** stack smashing detected ***: terminated
Program received signal SIGABRT, Aborted.
0x00007ffff77b7ce5 in raise () from /usr/lib/libc.so.6
(gdb) bt
#0 0x00007ffff77b7ce5 in raise () from /usr/lib/libc.so.6
#1 0x00007ffff77a1857 in abort () from /usr/lib/libc.so.6
#2 0x00007ffff77fb2b0 in __libc_message () from /usr/lib/libc.so.6
#3 0x00007ffff788b06a in __fortify_fail () from /usr/lib/libc.so.6
#4 0x00007ffff788b034 in __stack_chk_fail () from /usr/lib/libc.so.6
#5 0x0000555555581e4f in gprs_bssgp_pcu_rx_dl_ud (msg=0x55555572fce0, tp=0x7fffffffbc80) at /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:167
#6 0x0000555500000000 in ?? ()
#7 0x00007ffff7f6cf40 in ?? () from /home/pespin/dev/sysmocom/build/new/out/lib/libosmogsm.so.13
#8 0x000055555572e6d0 in ?? ()
#9 0x00007fffffffbc80 in ?? ()
#10 0x000055555572fce0 in ?? ()
#11 0x00000000ffffbc30 in ?? ()
#12 0x0000070800000000 in ?? ()
#13 0x000055555572fd80 in ?? ()
#14 0x460dab82121f6200 in ?? ()
#15 0x000055555565d380 in ?? ()
#16 0x00005555556aced0 in ?? ()
#17 0x00007fffffffcca0 in ?? ()
#18 0x000055555558303c in gprs_bssgp_pcu_rcvmsg (
    msg=<error reading variable: Cannot access memory at address 0xabd8>)
--Type <RET> for more, q to quit, c to continue without paging--
    at /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:465
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) l
173         quit = 1;
174         break;
175     case SIGABRT:
176         /* in case of abort, we want to obtain a talloc report
177          * and then return to the caller, who will abort the process
178          */
179     case SIGUSR1:
180     case SIGUSR2:
181         talloc_report_full(tall_pcu_ctx, stderr);
182         break;
(gdb) frame 5
#5 0x0000555555581e4f in gprs_bssgp_pcu_rx_dl_ud (msg=0x55555572fce0, tp=0x7fffffffbc80) at /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:167
167 }
(gdb) l
162
163     LOGP(DBSSGP, LOGL_INFO, "LLC [SGSN -> PCU] = TLLI: 0x%08x IMSI: %s len: %d\n", tlli, imsi, len);
164
165     return gprs_rlcmac_dl_tbf::handle(the_pcu.bts, tlli, tlli_old, imsi,
166             ms_class, egprs_ms_class, delay_csec, data, len);
167 }
168
169 static int gprs_bssgp_pcu_rx_paging_cs(struct msgb *msg, struct tlv_parsed *tp)
170 {
171     const uint8_t *mi;
Updated by pespin about 4 years ago
- Subject changed from osmo-pcu to osmo-pcu crash after re-enabling MS RA capabilities parsing from SGSN messages
Updated by pespin about 4 years ago
I copied the content of the RA Cap field into a unit test, and I can reproduce there the same stack smashing seen in osmo-pcu:
https://gerrit.osmocom.org/c/osmo-pcu/+/17548 RLCMACTest: Reproduce stack smashing bug
Updated by fixeria about 4 years ago
==908769==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffdccbd3654 at pc 0x55acc4386ee3 bp 0x7ffdccbd2e60 sp 0x7ffdccbd2e50
WRITE of size 1 at 0x7ffdccbd3654 thread T0
    #0 0x55acc4386ee2 in csnStreamDecoder /home/wmn/wmn/osmocom/osmo-pcu/src/csn1.c:511
    #1 0x55acc4390264 in csnStreamDecoder /home/wmn/wmn/osmocom/osmo-pcu/src/csn1.c:1361
    #2 0x55acc43679f5 in decode_gsm_ra_cap(bitvec*, MS_Radio_Access_capability_t*) /home/wmn/wmn/osmocom/osmo-pcu/src/gsm_rlcmac.cpp:5793
    #3 0x55acc435da46 in testRAcap2(void*) rlcmac/RLCMACTest.cpp:409
    #4 0x55acc435dd8b in main rlcmac/RLCMACTest.cpp:439
    #5 0x7f80c0a88022 in __libc_start_main (/usr/lib/libc.so.6+0x27022)
    #6 0x55acc43535ed in _start (/home/wmn/wmn/osmocom/osmo-pcu/tests/rlcmac/RLCMACTest+0xa45ed)
Address 0x7ffdccbd3654 is located in stack of thread T0 at offset 180 in frame
    #0 0x55acc435d928 in testRAcap2(void*) rlcmac/RLCMACTest.cpp:284
  This frame has 1 object(s):
    [32, 180) 'data' (line 286) <== Memory access at offset 180 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /home/wmn/wmn/osmocom/osmo-pcu/src/csn1.c:511 in csnStreamDecoder
Shadow bytes around the buggy address:
  0x100039972670: f2 f2 00 04 f2 f2 00 04 f2 f2 00 00 00 00 00 00
  0x100039972680: 00 00 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00
  0x100039972690: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1
  0x1000399726a0: 04 f2 00 04 f3 f3 00 00 00 00 00 00 00 00 00 00
  0x1000399726b0: 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 00 00 00
=>0x1000399726c0: 00 00 00 00 00 00 00 00 00 00[04]f3 f3 f3 f3 f3
  0x1000399726d0: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000399726e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000399726f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100039972700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100039972710: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable: 00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone: fa
  Freed heap region: fd
  Stack left redzone: f1
  Stack mid redzone: f2
  Stack right redzone: f3
  Stack after return: f5
  Stack use after scope: f8
  Global redzone: f9
  Global init order: f6
  Poisoned by user: f7
  Container overflow: fc
  Array cookie: ac
  Intra object redzone: bb
  ASan internal: fe
  Left alloca redzone: ca
  Right alloca redzone: cb
  Shadow gap: cc
==908769==ABORTING
Updated by pespin about 4 years ago
I updated the patch with the fix for it.
TODO:
- test it with osmo-pcu and the same real phone
- send a similar patch to the wireshark rlcmac part.
Updated by pespin about 4 years ago
fixeria, it's now fixed, but I'm wondering why you get clear output from ASan while I don't. Perhaps because you use clang?
Updated by fixeria about 4 years ago
> fixeria, it's now fixed
I came up with a similar fix, but you were faster :D
> I'm wondering why you get clear output from ASan while I don't. Perhaps because you use clang?
Nope, I am using GCC. Here is my build configuration:
$ gcc -v
gcc version 9.3.0 (Arch Linux 9.3.0-1)
$ ./configure --enable-sanitize CFLAGS="-O0 -g" CXXFLAGS="-O0 -g"
Updated by pespin about 4 years ago
- Status changed from New to Resolved
- % Done changed from 0 to 100
Fixed by commits 81b40cbaf3070f70954663f68375100128bdc77e..e50ce6e45c4509805807d599cadf1a1b23d37f63.
Updated by pespin about 4 years ago
- Status changed from Resolved to In Progress
- % Done changed from 100 to 90
Actually, keeping it open since I need to port those patches to wireshark.
Updated by pespin about 4 years ago
Ports to wireshark.git submitted here:
remote: https://code.wireshark.org/review/36571 rlcmac: Don't pass array element to CSN1 descriptors
remote: https://code.wireshark.org/review/36572 csn1: Validate recursive array max size during decoding
remote: https://code.wireshark.org/review/36573 rlcmac: Fix bug receiving RA cap
remote: https://code.wireshark.org/review/36574 rlcmac: Introduce MS Radio Access Capabilities 2 to fix related spare bits
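For readers unfamiliar with the bug class, here is a minimal sketch of what "validate recursive array max size during decoding" can look like. It is not the osmo-pcu or wireshark code: the bit reader, the 1-bit "more" flag followed by a 4-bit value, MAX_ENTRIES and all function names are hypothetical and chosen only for illustration. The point is the single capacity check made before each write, so a message carrying more repetitions than the destination array can hold is rejected instead of overrunning the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 4 /* hypothetical capacity of the destination array */

/* Hypothetical MSB-first bit reader over a byte buffer. */
struct bit_reader {
	const uint8_t *buf;
	size_t nbits; /* total number of bits available */
	size_t pos;   /* next bit to read */
};

/* Read n bits (n <= 32) into *out; returns 0 on success, -1 if input runs out. */
static int read_bits(struct bit_reader *br, unsigned n, unsigned *out)
{
	unsigned v = 0;

	if (n > br->nbits - br->pos)
		return -1;
	while (n--) {
		uint8_t byte = br->buf[br->pos / 8];
		v = (v << 1) | ((byte >> (7 - br->pos % 8)) & 1u);
		br->pos++;
	}
	*out = v;
	return 0;
}

/* Decode a "{ 1 <4-bit value> } ** 0" style recursive array into out[].
 * Returns the number of decoded entries, or -1 on truncated input or when
 * the message repeats more often than out[] can hold. */
static int decode_recursive_array(struct bit_reader *br, uint8_t *out, size_t max)
{
	size_t count = 0;

	for (;;) {
		unsigned more, val;

		if (read_bits(br, 1, &more) < 0)
			return -1;
		if (!more)
			return (int)count;
		/* The validation step: refuse to write past the end of out[]. */
		if (count >= max)
			return -1;
		if (read_bits(br, 4, &val) < 0)
			return -1;
		out[count++] = (uint8_t)val;
	}
}

/* Convenience wrapper: decode from a whole byte buffer. */
static int decode_bytes(const uint8_t *buf, size_t len, uint8_t *out, size_t max)
{
	struct bit_reader br = { buf, len * 8, 0 };

	return decode_recursive_array(&br, out, max);
}
```

With this sketch, the two bytes 0x8C 0x80 decode to two entries (1 and 2), while 0x8C 0x63 0x18 0x80 encodes five entries and is rejected when the capacity is four. Without the `count >= max` check the fifth write would land one element past the array, which is the kind of one-past-the-end write the ASan report above shows.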
Updated by pespin about 4 years ago
- Status changed from In Progress to Resolved
- % Done changed from 90 to 100
Wireshark commits merged, closing the ticket.