Project

General

Profile

Actions

Bug #3612

closed

osmo-bts-trx: heap-use-after-free in e1inp_sign_link_destroy

Added by pespin over 5 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
osmo-bts-trx
Target version:
-
Start date:
10/02/2018
Due date:
% Done:

100%

Spec Reference:

Description

20181002160543055 DTRX <000b> trx_if.c:124 Clock indication: fn=387162
20181002160543055 DL1C <0006> scheduler_trx.c:1640 TRX Clock Ind: elapsed_us= 986522, elapsed_fn=217, error_us=-14933
20181002160543055 DL1C <0006> scheduler_trx.c:1658 GSM clock jitter: 14247us (elapsed_fn=4)
20181002160543056 DL1C <0006> scheduler_trx.c:1681 We were 4 FN slower than TRX, compensated
20181002160543133 DLINP <0012> input/ipa.c:63 10.42.42.7:3003 lost connection with server
20181002160543133 DABIS <000d> abis.c:144 Signalling link down
20181002160543133 DABIS <000d> abis.c:229 Input Signal 2 received
20181002160543134 DABIS <000d> abis.c:229 Input Signal 2 received
=================================================================
==29326==ERROR: AddressSanitizer: heap-use-after-free on address 0x62e000001078 at pc 0x7f82e8521b02 bp 0x7ffcf0cd7620 sp 0x7ffcf0cd7618
WRITE of size 8 at 0x62e000001078 thread T0
    #0 0x7f82e8521b01 in __llist_del /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/inst-osmo-bts/include/osmocom/core/linuxlist.h:114
    #1 0x7f82e8521b01 in llist_del /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/inst-osmo-bts/include/osmocom/core/linuxlist.h:126
    #2 0x7f82e8521b01 in e1inp_sign_link_destroy /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/libosmo-abis/src/e1_input.c:551
    #3 0x557e14465b1c in sign_link_down /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/osmo-bts/src/common/abis.c:167
    #4 0x7f82e8531abf in ipa_client_read input/ipa.c:68
    #5 0x7f82e8531abf in ipa_client_fd_cb input/ipa.c:136
    #6 0x7f82e77e4878 in osmo_fd_disp_fds /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/libosmocore/src/select.c:217
    #7 0x7f82e77e4878 in osmo_select_main /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/libosmocore/src/select.c:257
    #8 0x557e144617eb in bts_main /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/osmo-bts/src/common/main.c:364
    #9 0x7f82e64b32e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
    #10 0x557e143c6fd9 in _start (/home/jenkins/workspace/osmo-gsm-tester_manual-run/trial-155/inst/osmo-bts/bin/osmo-bts-trx+0x103fd9)

0x62e000001078 is located 3192 bytes inside of 48072-byte region [0x62e000000400,0x62e00000bfc8)
freed by thread T0 here:
    #0 0x7f82e9221a10 in free (/usr/lib/x86_64-linux-gnu/libasan.so.3+0xc1a10)
    #1 0x7f82e82be86a in _talloc_free (/usr/lib/x86_64-linux-gnu/libtalloc.so.2+0x486a)

previously allocated by thread T0 here:
    #0 0x7f82e9221d28 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.3+0xc1d28)
    #1 0x7f82e82c0acd in _talloc_zero (/usr/lib/x86_64-linux-gnu/libtalloc.so.2+0x6acd)

SUMMARY: AddressSanitizer: heap-use-after-free /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/inst-osmo-bts/include/osmocom/core/linuxlist.h:114 in __llist_del
Shadow bytes around the buggy address:
  0x0c5c7fff81b0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5c7fff81c0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5c7fff81d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5c7fff81e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5c7fff81f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
=>0x0c5c7fff8200: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd[fd]
  0x0c5c7fff8210: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5c7fff8220: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5c7fff8230: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5c7fff8240: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5c7fff8250: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==29326==ABORTING


Files

trial-155-run.tgz trial-155-run.tgz 207 KB pespin, 10/02/2018 03:42 PM

Related issues

Related to OpenBSC - Bug #4094: multiple crashes due to connection failures / dropsClosed07/09/2019

Actions
Related to OsmoBSC - Bug #4624: osmo-bsc leaks memoryResolvedpespin06/20/2020

Actions
Related to OsmoBTS - Bug #4709: osmo-bts-trx (latest version 1.2.1) crashes in ttcn3-bts-test-latestResolved08/13/2020

Actions
Actions #1

Updated by pespin over 5 years ago

According to test case, it happens when osmo-bsc is killed and osmo-bts-trx detects the connection is broken (timestamp 20181002160543133 in osmo-bts log):

16:05:40.004725 run   osmo-bsc_10.42.42.7(pid=29315): Terminating (SIGINT)  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bsc_10.42.42.7↪osmo-bsc_10.42.42.7(pid=29315)]  [process.py:111]
16:05:43.411228 run   osmo-bsc_10.42.42.7(pid=29315): DBG: Cleanup  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bsc_10.42.42.7↪osmo-bsc_10.42.42.7(pid=29315)]  [process.py:134]
16:05:43.412340 run   osmo-bsc_10.42.42.7(pid=29315): Terminated: ok {rc=0}  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bsc_10.42.42.7↪osmo-bsc_10.42.42.7(pid=29315)]  [process.py:137]
16:05:43.413307 run          osmo-trx-uhd(pid=29325): Terminating (SIGINT)  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-trx-uhd↪osmo-trx-uhd(pid=29325)]  [process.py:111]
16:05:43.582363 run          osmo-trx-uhd(pid=29325): DBG: Cleanup  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-trx-uhd↪osmo-trx-uhd(pid=29325)]  [process.py:134]
16:05:43.583486 run          osmo-trx-uhd(pid=29325): Terminated {rc=130}  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-trx-uhd↪osmo-trx-uhd(pid=29325)]  [process.py:139]
16:05:43.584962 run          osmo-bts-trx(pid=29326): Terminating (SIGINT)  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-bts-trx(pid=29326)]  [process.py:111]
16:05:43.586014 run          osmo-bts-trx(pid=29326): DBG: Cleanup  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-bts-trx(pid=29326)]  [process.py:134]
16:05:43.586974 run          osmo-bts-trx(pid=29326): Terminated {rc=1}  [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-bts-trx(pid=29326)]  [process.py:139]
Actions #2

Updated by pespin over 5 years ago

This is the code path map:

ipa_client_read                                                         "lost connection with server" 
    ipa_client_conn_close                                               FD unregister && close, free pending_msg queue.
    link->updown_cb(link)
        ipaccess_bts_updown_cb(link)
            sign_link_down(link->line) [osmo-bts]                        "abis.c:144 Signalling link down".   Freees g_bts->oml_link and each trx->rsl_link: e1inp_sign_link_destroy
                e1inp_sign_link_destroy x3
                    link->ts->line->driver->close(link)
                        ipaccess_close
                            e1inp_close_socket                            unregisters and closes bfd
                                e1inp_int_snd_event(S_L_INP_TEI_DN);
                                    &inp_s_cbfn [osmo-bts]                "Input Signal %u received\n"  prints and does nothing
                    e1inp_line_put(link->ts->line)
                    talloc_free(link)
                bts_model_abis_close                                     "Abis close" 

The issue seems to appear when trying to call e1inp_sign_link_destroy(trx->rsl_link); on the 2nd TRX to close its RSL connection.

As we get the message "Input Signal 2 received" twice and we get the crash before seeing "Abis close", it must necessarily fail that way.

So it seems the 2nd trx was at some point freed but was not removed from g_bts->trx_list.

Actions #3

Updated by pespin over 5 years ago

  • Status changed from New to Stalled

I was unable so far to find the cause of the issue.

I mark this issue as stalled until I have an ettus B200 I can easily try to reproduce it with.

I submitted several patches in libosmo-abis and osmo-bts to improve logging and I enabled the test to be run by osmo-gsm-tester prod setup. Hopefully we'll get more input on this meanwhile.

Actions #4

Updated by Hoernchen over 4 years ago

  • Related to Bug #4094: multiple crashes due to connection failures / drops added
Actions #5

Updated by pespin almost 4 years ago

I got this again today, by pressing restarting osmo-bsc quicly while osmo-bts-trx was connected to it. using current master + some changes non-related to this in l1_if.c (tx power ramping):

So that's basically current osmo-bts.git master de2ef31f4c85c307455f6df37bfcc88534d35cc7.

20200528210313338 DTRX <000b> trx_if.c:122 phy0.0: Clock indication: fn=1753825
20200528210313338 DL1C <0006> scheduler_trx.c:1871 TRX Clock Ind: elapsed_us= 997947, elapsed_fn=216, error_us=+1107
20200528210313338 DL1C <0006> scheduler_trx.c:1890 GSM clock jitter: -1915us (elapsed_fn=0)
20200528210314336 DTRX <000b> trx_if.c:122 phy0.0: Clock indication: fn=1754041
20200528210314336 DL1C <0006> scheduler_trx.c:1871 TRX Clock Ind: elapsed_us= 997604, elapsed_fn=216, error_us= +764
20200528210314336 DL1C <0006> scheduler_trx.c:1890 GSM clock jitter: -2697us (elapsed_fn=0)
20200528210315106 DLINP <0012> ipa.c:65 192.168.30.1:3003 connection closed with server
20200528210315106 DABIS <000d> abis.c:143 Signalling link down
=================================================================
==441979==ERROR: AddressSanitizer: heap-use-after-free on address 0x62f000039090 at pc 0x7ffff73a809d bp 0x7fffffffd980 sp 0x7fffffffd970
WRITE of size 8 at 0x62f000039090 thread T0
    #0 0x7ffff73a809c in __llist_del /build/new/out/include/osmocom/core/linuxlist.h:117
    #1 0x7ffff73a81d4 in llist_del /build/new/out/include/osmocom/core/linuxlist.h:129
    #2 0x7ffff73ad785 in e1inp_sign_link_destroy /git/libosmo-abis/src/e1_input.c:590
    #3 0x5555557ceda6 in sign_link_down /git/osmo-bts/src/common/abis.c:166
    #4 0x7ffff73d84d2 in ipaccess_bts_updown_cb /git/libosmo-abis/src/input/ipaccess.c:811
    #5 0x7ffff73c64ef in ipa_client_read /git/libosmo-abis/src/input/ipa.c:68
    #6 0x7ffff73c7a22 in ipa_client_fd_cb /git/libosmo-abis/src/input/ipa.c:136
    #7 0x7ffff685cf76 in osmo_fd_disp_fds /git/libosmocore/src/select.c:227
    #8 0x7ffff685d35b in _osmo_select_main /git/libosmocore/src/select.c:265
    #9 0x7ffff685d43a in osmo_select_main /git/libosmocore/src/select.c:274
    #10 0x5555557c6389 in bts_main /git/osmo-bts/src/common/main.c:354
    #11 0x5555556e40bf in main /git/osmo-bts/src/osmo-bts-trx/main.c:155
    #12 0x7ffff5c16001 in __libc_start_main (/usr/lib/libc.so.6+0x27001)
    #13 0x5555556e31fd in _start (/build/new/out/bin/osmo-bts-trx+0x18f1fd)

0x62f000039090 is located 3216 bytes inside of 48336-byte region [0x62f000038400,0x62f0000440d0)
freed by thread T0 here:
    #0 0x7ffff76a40e9 in __interceptor_free /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:123
    #1 0x7ffff675741b  (/usr/lib/libtalloc.so.2+0x441b)

previously allocated by thread T0 here:
    #0 0x7ffff76a4459 in __interceptor_malloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x7ffff6759b8c  (/usr/lib/libtalloc.so.2+0x6b8c)

SUMMARY: AddressSanitizer: heap-use-after-free /build/new/out/include/osmocom/core/linuxlist.h:117 in __llist_del
Shadow bytes around the buggy address:
  0x0c5e7ffff1c0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5e7ffff1d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5e7ffff1e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5e7ffff1f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5e7ffff200: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
=>0x0c5e7ffff210: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5e7ffff220: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5e7ffff230: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5e7ffff240: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5e7ffff250: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c5e7ffff260: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==441979==ABORTING

Program received signal SIGABRT, Aborted.
--Type <RET> for more, q to quit, c to continue without paging--
0x00007ffff5c2b355 in raise () from /usr/lib/libc.so.6

Actions #6

Updated by pespin almost 4 years ago

  • Status changed from Stalled to Feedback
  • % Done changed from 0 to 90

Fixed by:
https://gerrit.osmocom.org/c/libosmo-abis/+/18730 e1_input: refcount inc line during e1_sign_link_create, not during line update

Actions #7

Updated by pespin over 3 years ago

  • Related to Bug #4624: osmo-bsc leaks memory added
Actions #8

Updated by pespin over 3 years ago

  • Status changed from Feedback to Resolved
  • % Done changed from 90 to 100

Possible memory leak as result being tracked in #3612, closing this ticket since the crash is fixed.

Actions #9

Updated by laforge over 3 years ago

  • Related to Bug #4709: osmo-bts-trx (latest version 1.2.1) crashes in ttcn3-bts-test-latest added
Actions

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)