Project

General

Profile

Bug #3612

osmo-bts-trx: heap-use-after-free in e1inp_sign_link_destroy

Added by pespin about 1 year ago. Updated about 1 year ago.

Status:
Stalled
Priority:
Normal
Assignee:
Category:
osmo-bts-trx
Target version:
-
Start date:
10/02/2018
Due date:
% Done:

0%

Spec Reference:

Description

20181002160543055 DTRX <000b> trx_if.c:124 Clock indication: fn=387162
20181002160543055 DL1C <0006> scheduler_trx.c:1640 TRX Clock Ind: elapsed_us= 986522, elapsed_fn=217, error_us=-14933
20181002160543055 DL1C <0006> scheduler_trx.c:1658 GSM clock jitter: 14247us (elapsed_fn=4)
20181002160543056 DL1C <0006> scheduler_trx.c:1681 We were 4 FN slower than TRX, compensated
20181002160543133 DLINP <0012> input/ipa.c:63 10.42.42.7:3003 lost connection with server
20181002160543133 DABIS <000d> abis.c:144 Signalling link down
20181002160543133 DABIS <000d> abis.c:229 Input Signal 2 received
20181002160543134 DABIS <000d> abis.c:229 Input Signal 2 received
=================================================================
==29326==ERROR: AddressSanitizer: heap-use-after-free on address 0x62e000001078 at pc 0x7f82e8521b02 bp 0x7ffcf0cd7620 sp 0x7ffcf0cd7618
WRITE of size 8 at 0x62e000001078 thread T0
    #0 0x7f82e8521b01 in __llist_del /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/inst-osmo-bts/include/osmocom/core/linuxlist.h:114
    #1 0x7f82e8521b01 in llist_del /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/inst-osmo-bts/include/osmocom/core/linuxlist.h:126
    #2 0x7f82e8521b01 in e1inp_sign_link_destroy /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/libosmo-abis/src/e1_input.c:551
    #3 0x557e14465b1c in sign_link_down /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/osmo-bts/src/common/abis.c:167
    #4 0x7f82e8531abf in ipa_client_read input/ipa.c:68
    #5 0x7f82e8531abf in ipa_client_fd_cb input/ipa.c:136
    #6 0x7f82e77e4878 in osmo_fd_disp_fds /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/libosmocore/src/select.c:217
    #7 0x7f82e77e4878 in osmo_select_main /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/libosmocore/src/select.c:257
    #8 0x557e144617eb in bts_main /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/osmo-bts/src/common/main.c:364
    #9 0x7f82e64b32e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
    #10 0x557e143c6fd9 in _start (/home/jenkins/workspace/osmo-gsm-tester_manual-run/trial-155/inst/osmo-bts/bin/osmo-bts-trx+0x103fd9)

0x62e000001078 is located 3192 bytes inside of 48072-byte region [0x62e000000400,0x62e00000bfc8)
freed by thread T0 here:
    #0 0x7f82e9221a10 in free (/usr/lib/x86_64-linux-gnu/libasan.so.3+0xc1a10)
    #1 0x7f82e82be86a in _talloc_free (/usr/lib/x86_64-linux-gnu/libtalloc.so.2+0x486a)

previously allocated by thread T0 here:
    #0 0x7f82e9221d28 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.3+0xc1d28)
    #1 0x7f82e82c0acd in _talloc_zero (/usr/lib/x86_64-linux-gnu/libtalloc.so.2+0x6acd)

SUMMARY: AddressSanitizer: heap-use-after-free /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/inst-osmo-bts/include/osmocom/core/linuxlist.h:114 in __llist_del
Shadow bytes around the buggy address:
  0x0c5c7fff81b0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5c7fff81c0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5c7fff81d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5c7fff81e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5c7fff81f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
=>0x0c5c7fff8200: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd[fd]
  0x0c5c7fff8210: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5c7fff8220: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5c7fff8230: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5c7fff8240: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5c7fff8250: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==29326==ABORTING

trial-155-run.tgz trial-155-run.tgz 207 KB pespin, 10/02/2018 03:42 PM

Related issues

Related to OpenBSC - Bug #4094: multiple crashes due to connection failures / dropsNew07/09/2019

History

#1 Updated by pespin about 1 year ago

According to test case, it happens when osmo-bsc is killed and osmo-bts-trx detects the connection is broken (timestamp 20181002160543133 in osmo-bts log):

16:05:40.004725 run   osmo-bsc_10.42.42.7(pid=29315): Terminating (SIGINT)  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bsc_10.42.42.7↪osmo-bsc_10.42.42.7(pid=29315)]  [process.py:111]
16:05:43.411228 run   osmo-bsc_10.42.42.7(pid=29315): DBG: Cleanup  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bsc_10.42.42.7↪osmo-bsc_10.42.42.7(pid=29315)]  [process.py:134]
16:05:43.412340 run   osmo-bsc_10.42.42.7(pid=29315): Terminated: ok {rc=0}  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bsc_10.42.42.7↪osmo-bsc_10.42.42.7(pid=29315)]  [process.py:137]
16:05:43.413307 run          osmo-trx-uhd(pid=29325): Terminating (SIGINT)  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-trx-uhd↪osmo-trx-uhd(pid=29325)]  [process.py:111]
16:05:43.582363 run          osmo-trx-uhd(pid=29325): DBG: Cleanup  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-trx-uhd↪osmo-trx-uhd(pid=29325)]  [process.py:134]
16:05:43.583486 run          osmo-trx-uhd(pid=29325): Terminated {rc=130}  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-trx-uhd↪osmo-trx-uhd(pid=29325)]  [process.py:139]
16:05:43.584962 run          osmo-bts-trx(pid=29326): Terminating (SIGINT)  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-bts-trx(pid=29326)]  [process.py:111]
16:05:43.586014 run          osmo-bts-trx(pid=29326): DBG: Cleanup  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-bts-trx(pid=29326)]  [process.py:134]
16:05:43.586974 run          osmo-bts-trx(pid=29326): Terminated {rc=1}  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-bts-trx(pid=29326)]  [process.py:139]

#2 Updated by pespin about 1 year ago

This is the code path map:

ipa_client_read                                                         "lost connection with server" 
    ipa_client_conn_close                                               FD unregister && close, free pending_msg queue.
    link->updown_cb(link)
        ipaccess_bts_updown_cb(link)
            sign_link_down(link->line) [osmo-bts]                        "abis.c:144 Signalling link down".   Freees g_bts->oml_link and each trx->rsl_link: e1inp_sign_link_destroy
                e1inp_sign_link_destroy x3
                    link->ts->line->driver->close(link)
                        ipaccess_close
                            e1inp_close_socket                            unregisters and closes bfd
                                e1inp_int_snd_event(S_L_INP_TEI_DN);
                                    &inp_s_cbfn [osmo-bts]                "Input Signal %u received\n"  prints and does nothing
                    e1inp_line_put(link->ts->line)
                    talloc_free(link)
                bts_model_abis_close                                     "Abis close" 

The issue seems to appear when trying to call e1inp_sign_link_destroy(trx->rsl_link); on the 2nd TRX to close its RSL connection.

As we get the message "Input Signal 2 received" twice and we get the crash before seeing "Abis close", it must necessarily fail that way.

So it seems the 2nd trx was at some point freed but was not removed from g_bts->trx_list.

#3 Updated by pespin about 1 year ago

  • Status changed from New to Stalled

I was unable so far to find the cause of the issue.

I mark this issue as stalled until I have an ettus B200 I can easily try to reproduce it with.

I submitted several patches in libosmo-abis and osmo-bts to improve logging and I enabled the test to be run by osmo-gsm-tester prod setup. Hopefully we'll get more input on this meanwhile.

#4 Updated by Hoernchen 3 months ago

  • Related to Bug #4094: multiple crashes due to connection failures / drops added

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)