Project

General

Profile

Actions

Bug #4838

closed

osmo-pcu: ASan stack-buffer-underflow in csn1 decoder (PktPagingRequest)

Added by pespin over 3 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Target version:
-
Start date:
10/30/2020
Due date:
% Done:

100%

Spec Reference:

Description

While doing some changes (non-related changed to CSN1) and running TTCN3 PCU tests I got into this (I had --enable-sanitize on libosmocore and osmo-pcu):

20201030181602024 DBSSGP DEBUG Sending FLOW CONTROL BVC, Bmax = 200000, R = 20000, Bmax_MS = 100000, R_MS = 10000, avg_dly = 0 (gprs_bssgp_pcu.cpp:895)
20201030181602027 DLNS DEBUG GPRS-NS2-VC[0x6120000078a0]{UNBLOCKED}: Received Event UNITDATA (gprs_ns2_vc_fsm.c:655)
20201030181602027 DBSSGP DEBUG rx BVCI_PTP=1234 gprs_bssgp_rx_ptp (gprs_bssgp_pcu.cpp:483)
20201030181602027 DRLCMAC INFO Add RR paging: chan-needed=0 MI=IMSI-262420000000042 (bts.cpp:375)
20201030181602027 DTBF DEBUG TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=FLOW) uses TRX=0 TS=7, so we mark (bts.cpp:401)
20201030181602027 DRLCMAC INFO Paging on PACCH of TRX=0 TS=7 (bts.cpp:423)
20201030181602027 DLNS DEBUG GPRS-NS2-VC[0x6120000078a0]{UNBLOCKED}: Received Event UNITDATA (gprs_ns2_vc_fsm.c:655)
20201030181602028 DBSSGP DEBUG rx BVCI_PTP=1234 gprs_bssgp_rx_ptp (gprs_bssgp_pcu.cpp:483)
20201030181602028 DBSSGP DEBUG Rx BSSGP BVCI=1234 (PTP) FLOW-CONTROL-BVC-ACK (gprs_bssgp_pcu.cpp:280)
20201030181602044 DL1IF DEBUG RTS request received: trx=0 ts=7 sapi=5 arfcn=871 fn=13 cur_fn=8 block=0 (pcu_l1_if.cpp:420)
20201030181602044 DRLCMACSCHED DEBUG Received RTS for PDCH: TRX=0 TS=7 FN=13 block_nr=0 scheduling USF=0 for required uplink resource of UL TFI=0 (gprs_rlcmac_
sched.cpp:118)
20201030181602044 DRLCMAC DEBUG Scheduling paging (pdch.cpp:190)
20201030181602044 DRLCMAC DEBUG Paging MI - IMSI-262420000000042 (pdch.cpp:213)
20201030181602044 DRLCMAC DEBUG +++++++++++++++++++++++++ TX : Packet Paging Request +++++++++++++++++++++++++ (pdch.cpp:251)
20201030181602044 DCSN1 INFO csnStreamDecoder (type: Pkt Paging Request (34): MESSAGE_TYPE = 34 | PAGE_MODE = 0 | Exist_PERSISTENCE_LEVEL = 0 | Exist_NLN = 0 | Repeated_Page_info = Exist | u.Page_req_RR = 1 |  : u.Page_req_RR | u.Mobile_Identity = 1 |  : u.Mobile_Identity | Length_of_Mobile_Identity_contents = 8 | =================================================================
==81==ERROR: AddressSanitizer: stack-buffer-underflow on address 0x7ffc51180350 at pc 0x7fa695fe6305 bp 0x7ffc5117f9c0 sp 0x7ffc5117f9b8
READ of size 4 at 0x7ffc51180350 thread T0
    #0 0x7fa695fe6304 in bitvec_read_field /tmp/libosmocore/src/bitvec.c:481
    #1 0x55ba180dab86 in csnStreamDecoder (/usr/local/bin/osmo-pcu+0x29db86)
    #2 0x55ba180d8aa4 in csnStreamDecoder (/usr/local/bin/osmo-pcu+0x29baa4)
    #3 0x55ba180d8aa4 in csnStreamDecoder (/usr/local/bin/osmo-pcu+0x29baa4)
    #4 0x55ba180db89d in csnStreamDecoder (/usr/local/bin/osmo-pcu+0x29e89d)
    #5 0x55ba180c6fb0 in decode_gsm_rlcmac_downlink (/usr/local/bin/osmo-pcu+0x289fb0)
    #6 0x55ba180875ac in gprs_rlcmac_pdch::packet_paging_request() (/usr/local/bin/osmo-pcu+0x24a5ac)
    #7 0x55ba180c1d87 in sched_select_ctrl_msg(unsigned char, unsigned char, unsigned int, unsigned char, gprs_rlcmac_pdch*, gprs_rlcmac_tbf*, gprs_rlcmac_tbf*, gprs_rlcmac_ul_tbf*) (/usr/local/bin/osmo-pcu+0x284d87)
    #8 0x55ba180c4054 in gprs_rlcmac_rcv_rts_block(gprs_rlcmac_bts*, unsigned char, unsigned char, unsigned int, unsigned char) (/usr/local/bin/osmo-pcu+0x287054)
    #9 0x55ba1801607f in pcu_rx_rts_req_pdtch (/usr/local/bin/osmo-pcu+0x1d907f)
    #10 0x55ba1801679b in pcu_rx_rts_req(gsm_pcu_if_rts_req*) (/usr/local/bin/osmo-pcu+0x1d979b)
    #11 0x55ba1801e66b in pcu_rx(unsigned char, gsm_pcu_if*) (/usr/local/bin/osmo-pcu+0x1e166b)
    #12 0x55ba180b5da3 in pcu_sock_read(osmo_fd*) (/usr/local/bin/osmo-pcu+0x278da3)
    #13 0x55ba180b6229 in pcu_sock_cb(osmo_fd*, unsigned int) (/usr/local/bin/osmo-pcu+0x279229)
    #14 0x7fa695fd8816 in osmo_fd_disp_fds /tmp/libosmocore/src/select.c:265
    #15 0x7fa695fd8816 in _osmo_select_main /tmp/libosmocore/src/select.c:407
    #16 0x7fa695fdbbda in osmo_select_main /tmp/libosmocore/src/select.c:417
    #17 0x55ba17fee167 in main (/usr/local/bin/osmo-pcu+0x1b1167)
    #18 0x7fa6943da2e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
    #19 0x55ba17feb579 in _start (/usr/local/bin/osmo-pcu+0x1ae579)

Address 0x7ffc51180350 is located in stack of thread T0 at offset 0 in frame
    #0 0x55ba180c5e16 in decode_gsm_rlcmac_downlink (/usr/local/bin/osmo-pcu+0x288e16)

  This frame has 2 object(s):
    [32, 36) 'readIndex'
    [96, 108) 'ar'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-underflow /tmp/libosmocore/src/bitvec.c:481 in bitvec_read_field
Shadow bytes around the buggy address:
  0x10000a228010: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x10000a228020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10000a228030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10000a228040: f1 f1 f1 f1 00 04 f4 f4 f2 f2 f2 f2 00 00 00 00
  0x10000a228050: 00 00 00 00 00 00 00 00 f3 f3 f3 f3 00 00 00 00
=>0x10000a228060: 00 00 00 00 00 00 00 00 00 00[f1]f1 f1 f1 04 f4
  0x10000a228070: f4 f4 f2 f2 f2 f2 00 04 f4 f4 f3 f3 f3 f3 00 00
  0x10000a228080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
  0x10000a228090: f1 f1 04 f4 f4 f4 f2 f2 f2 f2 00 00 00 f4 f2 f2
  0x10000a2280a0: f2 f2 00 00 00 00 00 00 00 00 f3 f3 f3 f3 00 00
  0x10000a2280b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==81==ABORTING
25: stopped pid 81 with status 1

Probably something is broken in our TTCN3 (not yet finished) and I also saw lots of TTCN3 RLCMAC encoder output around the same time, but still, that packet should crash osmo-pcu.


Files

Actions #1

Updated by fixeria over 3 years ago

Can you share the MAC block provoking osmo-pcu crash here as hex? I would be interested to have a look.

Actions #2

Updated by fixeria over 3 years ago

As far as I can see, in gprs_rlcmac_pdch::packet_paging_request() we first encode the paging message using the low-level bitvec API, and then decode the encoding result in the same function, I guess to make sure that the message is correct. Using the bitvec API for encoding kind of makes sense, because we cannot know in advance how many unique identities would fit.

Actions #3

Updated by pespin over 3 years ago

I was quite busy at the time also fixing several other stuff.
I think it should be easy to reproduce by running the entire PCU_Tests.ttcn.control with libosmocore+osmo-pcu with "--enable-sanitize". I can do it whenever I have more time and check which test exactly throws that issue, and provide pcap file.

But I saw this similar ASan report around 2-3 times in osmo-pcu.log for one PCU_Tests.control run, that's why I'm saying it should be easy to reproduce.

Actions #4

Updated by fixeria over 3 years ago

  • Status changed from New to Feedback

Hi Pau,

I think it should be easy to reproduce by running the entire PCU_Tests.ttcn.control with libosmocore+osmo-pcu with "--enable-sanitize".

I could not reproduce this segfault. I even hacked a new TTCN-3 test case verifying handling of multiple MIs:

https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/20997 library: fix Repeated Page Info IE in PacketPagingReq
https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/20998 PCU_Tests: WIP: add f_TC_paging_cs_multi_tuwat

and everything works like a charm:

04:09 < fixeria_> pespin: re OS#4838: I came up with a TTCN-3 test case that sends several PS PAGING requests
04:09 < fixeria_> pespin: I am running sanitized osmo-pcu, and so far sending 4 or even 6 requests works fine
04:16 < fixeria_> pespin_: 56 (7 unique TFIs * 8 TRXes) concurrent TBFs, and 56 CS PAGING requests also work fine ;)

Actions #5

Updated by pespin over 3 years ago

I can reproduce this by running TC_paging_cs_from_bts or TC_paging_cs_from_sgsn_sign (but I think other tests running TC_paging_* also trigger it).

So for TC_paging_cs_from_sgsn_sign I attach pcap file (but ofc the rlcmac resulting message cannot be seen because it crashes while generating it. Still the incoming BSSGP one can be seen).

Interestingly, the TTCN3 test also seems to be crashing somehow, probably when receiving the immediateAssignment and trying to decode it in RLCMAC (form backtrace it seems it crashes in the Unix Domain socket handler?):

MTC@f50471d3be1b: BSSGP successfully initialized
MTC@f50471d3be1b: Sending RACH.ind on fn=1337 with RA=120, TA=0
MTC@f50471d3be1b: Start timer T: 2 s
MTC@f50471d3be1b: Rx Immediate Assignment: { header := { l2_plen := { l2_plen := 11, zero_one := '01'B }, skip_indicator := 0, rr_protocol_discriminator := 6, message_type := IMMEDIATE_ASSIGNMENT (63) }, payload := { imm_ass := { ded_or_tbf := { spare := '0'B, tma := false, downlink := false, tbf := true }, page_mode := PAGE_MODE_NORMAL (0), chan_desc := omit, pkt_chan_desc := { channel_Type_spare := 1, tn := 7, tsc := 7, presence := '0'B, zero := { hopping := '0'B, spare := '0'B, arfcn := 871, indirect := omit }, one := omit }, req_ref := { ra := '01111000'B, t1p := 1, t3 := 11, t2 := 11 }, timing_advance := 0, mobile_allocation := { len := 0, ma := ''B }, rest_octets := { presence := '11'B, ll := omit, lh := omit, hl := omit, hh := { pa_disc := '0'B, pa := { uldl := { ass_disc := '0'B, ass := { ul := { presence := '1'B, dynamic := { tfi_assignment := 0, polling := '0'B, spare := '0'B, usf := 0, usf_granularity := '0'B, p0_present := '0'B, p0 := omit, pr_mode := omit, ch_coding_cmd := CH_CODING_CS2 (1), tlli_block_chan_coding := '1'B, alpha_present := '0'B, alpha := omit, gamma := 0, ta_index_present := '0'B, ta_index := omit, tbf_starting_time_present := '0'B, tbf_starting_time := omit }, single := omit } } } } } } } } }
MTC@f50471d3be1b: Rx Uplink TBF GPRS assignment: { presence := '1'B, dynamic := { tfi_assignment := 0, polling := '0'B, spare := '0'B, usf := 0, usf_granularity := '0'B, p0_present := '0'B, p0 := omit, pr_mode := omit, ch_coding_cmd := CH_CODING_CS2 (1), tlli_block_chan_coding := '1'B, alpha_present := '0'B, alpha := omit, gamma := 0, ta_index_present := '0'B, ta_index := omit, tbf_starting_time_present := '0'B, tbf_starting_time := omit }, single := omit }
close_fd10
 /osmo-ttcn3-hacks/pcu/PCU_Tests: Segmentation fault occurred
/usr/lib/titan/libttcn3-parallel-dynamic.so(_Z14signal_handleri+0xa3)[0x7f9582b39c03]
/lib/x86_64-linux-gnu/libc.so.6(+0x33060)[0x7f9580f8f060]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x14)[0x7f9580fd7524]
/usr/lib/titan/libttcn3-parallel-dynamic.so(Free+0xe)[0x7f9582b297fe]
/osmo-ttcn3-hacks/pcu/UD_PT.so(_ZN12UD__PortType15UD__PT_PROVIDER24Handle_Fd_Event_ReadableEi+0x1f8)[0x7f95839df7d8]
/usr/lib/titan/libttcn3-parallel-dynamic.so(_ZN19Fd_And_Timeout_User13call_handlersEi+0x56d)[0x7f9582afdebd]
/usr/lib/titan/libttcn3-parallel-dynamic.so(_ZN13TTCN_Snapshot8take_newEb+0x387)[0x7f9582afe5f7]
/osmo-ttcn3-hacks/pcu/PCUIF_Components.so(_ZN17PCUIF__Components21f__PCUIF__CT__handlerERK10CHARSTRINGRK7BOOLEAN+0xd89)[0x7f95a6bdb26b]
/osmo-ttcn3-hacks/pcu/PCUIF_Components.so(_ZN17PCUIF__Components18start_ptc_functionEPKcR8Text_Buf+0x254)[0x7f95a6bdbf32]
/usr/lib/titan/libttcn3-parallel-dynamic.so(_ZN11Module_List14start_functionEPKcS1_R8Text_Buf+0x2b)[0x7f9582ae477b]
/usr/lib/titan/libttcn3-parallel-dynamic.so(_ZN12TTCN_Runtime14start_functionEPKcS1_R8Text_Buf+0x25)[0x7f9582af68e5]
/usr/lib/titan/libttcn3-parallel-dynamic.so(_ZN18TTCN_Communication13process_startEv+0x42)[0x7f9582abac32]
/usr/lib/titan/libttcn3-parallel-dynamic.so(_ZN18TTCN_Communication23process_all_messages_tcEv+0x2f5)[0x7f9582abb485]
/usr/lib/titan/libttcn3-parallel-dynamic.so(_ZN12TTCN_Runtime8ptc_mainEv+0xdf)[0x7f9582afa4ff]
/usr/lib/titan/libttcn3-parallel-dynamic.so(main+0x365)[0x7f95827f3665]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f9580f7c2e1]
/osmo-ttcn3-hacks/pcu/PCU_Tests(+0x229a)[0x559c36bba29a]
MC@f50471d3be1b: Unexpected end of PTC connection (8) from f50471d3be1b [172.18.13.10].
BTS(9)@f50471d3be1b: Warning: The last outgoing messages on port PCUIF may be lost.
BTS(9)@f50471d3be1b: Dynamic test case error: Port PCUIF has neither connections nor mappings. Message cannot be sent on it.
ClckGen-0(11)@f50471d3be1b: Dynamic test case error: Port CLCK has neither connections nor mappings. Message cannot be sent on it.

Actions #6

Updated by pespin over 3 years ago

Offending message crashing decoder:

40 88 3c 14 93 12 00 00 00 00 12 00 00 00 00 00 00 00 00 00 00 00 00

Actions #7

Updated by pespin over 3 years ago

I can reproduce the ASan issue in a unit test with this patch:

diff --git a/tests/rlcmac/RLCMACTest.cpp b/tests/rlcmac/RLCMACTest.cpp
index f1b6508..2c7d1da 100644
--- a/tests/rlcmac/RLCMACTest.cpp
+++ b/tests/rlcmac/RLCMACTest.cpp
@@ -110,6 +110,7 @@ void testRlcMacDownlink(void *test_ctx)
        "4913e00850884013a8048b2b2b2b2b2b2b2b2b2b2b2b2b", // Polling Request (malformed)
        "412430007fffffffffffffffefd19c7ba12b2b2b2b2b2b", // Packet Uplink Ack Nack?
        "41942b2b2b2b2b2b2b2b2b2b2b2b2b2b2b2b2b2b2b2b2b", // System Info 13?
+       "40883c1493120000000012000000000000000000000000", // Paging request (OS#4838)
        };

        int testDataSize = sizeof(testData)/sizeof(testData[0]);
Actions #8

Updated by pespin over 3 years ago

This change seems to fix the bug, but I still need to see if it's the proper fix:

diff --git a/src/csn1.c b/src/csn1.c
index fa29e27..881a32a 100644
--- a/src/csn1.c
+++ b/src/csn1.c
@@ -1216,7 +1216,7 @@ csnStreamDecoder(csnStream_t* ar, const CSN_DESCR* pDescr, struct bitvec *vector

           while (count > 0)
           {
-            readIndex -= 8;
+            *readIndex -= 8;
            *pui8 = bitvec_read_field(vector, readIndex, 8);
             LOGPC(DCSN1, LOGL_DEBUG, "%s = %u | ", pDescr->sz , (unsigned)*pui8);
             pui8++;

The problem was readIndex pointer being changed instead of its content, so when bitvec_read_field tried reading from it, it was reading from somewherse else, because address changed from 0x7ffffffdfb80 to 0x7ffffffdfb60

Actions #9

Updated by pespin over 3 years ago

  • % Done changed from 0 to 90

Proper fix should be here, I still need to adjust the unit test expectancies:
https://gerrit.osmocom.org/c/osmo-pcu/+/21305 csn1: Fix readIndex pointer change in CSN_VARIABLE_ARRAY

Actions #10

Updated by pespin over 3 years ago

  • Status changed from Feedback to Resolved
  • % Done changed from 90 to 100

Merged, closing.

Actions

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)