Bug #3612
closedosmo-bts-trx: heap-use-after-free in e1inp_sign_link_destroy
100%
Description
20181002160543055 DTRX <000b> trx_if.c:124 Clock indication: fn=387162 20181002160543055 DL1C <0006> scheduler_trx.c:1640 TRX Clock Ind: elapsed_us= 986522, elapsed_fn=217, error_us=-14933 20181002160543055 DL1C <0006> scheduler_trx.c:1658 GSM clock jitter: 14247us (elapsed_fn=4) 20181002160543056 DL1C <0006> scheduler_trx.c:1681 We were 4 FN slower than TRX, compensated 20181002160543133 DLINP <0012> input/ipa.c:63 10.42.42.7:3003 lost connection with server 20181002160543133 DABIS <000d> abis.c:144 Signalling link down 20181002160543133 DABIS <000d> abis.c:229 Input Signal 2 received 20181002160543134 DABIS <000d> abis.c:229 Input Signal 2 received ================================================================= ==29326==ERROR: AddressSanitizer: heap-use-after-free on address 0x62e000001078 at pc 0x7f82e8521b02 bp 0x7ffcf0cd7620 sp 0x7ffcf0cd7618 WRITE of size 8 at 0x62e000001078 thread T0 #0 0x7f82e8521b01 in __llist_del /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/inst-osmo-bts/include/osmocom/core/linuxlist.h:114 #1 0x7f82e8521b01 in llist_del /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/inst-osmo-bts/include/osmocom/core/linuxlist.h:126 #2 0x7f82e8521b01 in e1inp_sign_link_destroy /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/libosmo-abis/src/e1_input.c:551 #3 0x557e14465b1c in sign_link_down /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/osmo-bts/src/common/abis.c:167 #4 0x7f82e8531abf in ipa_client_read input/ipa.c:68 #5 0x7f82e8531abf in ipa_client_fd_cb input/ipa.c:136 #6 0x7f82e77e4878 in osmo_fd_disp_fds /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/libosmocore/src/select.c:217 #7 0x7f82e77e4878 in osmo_select_main /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/libosmocore/src/select.c:257 #8 0x557e144617eb in bts_main /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/osmo-bts/src/common/main.c:364 #9 0x7f82e64b32e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) #10 0x557e143c6fd9 in _start (/home/jenkins/workspace/osmo-gsm-tester_manual-run/trial-155/inst/osmo-bts/bin/osmo-bts-trx+0x103fd9) 0x62e000001078 is located 3192 bytes inside of 48072-byte region [0x62e000000400,0x62e00000bfc8) freed by thread T0 here: #0 0x7f82e9221a10 in free (/usr/lib/x86_64-linux-gnu/libasan.so.3+0xc1a10) #1 0x7f82e82be86a in _talloc_free (/usr/lib/x86_64-linux-gnu/libtalloc.so.2+0x486a) previously allocated by thread T0 here: #0 0x7f82e9221d28 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.3+0xc1d28) #1 0x7f82e82c0acd in _talloc_zero (/usr/lib/x86_64-linux-gnu/libtalloc.so.2+0x6acd) SUMMARY: AddressSanitizer: heap-use-after-free /home/osmocom-build/jenkins/workspace/osmo-gsm-tester_build-osmo-bts/inst-osmo-bts/include/osmocom/core/linuxlist.h:114 in __llist_del Shadow bytes around the buggy address: 0x0c5c7fff81b0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c5c7fff81c0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c5c7fff81d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c5c7fff81e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c5c7fff81f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd =>0x0c5c7fff8200: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd[fd] 0x0c5c7fff8210: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c5c7fff8220: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c5c7fff8230: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c5c7fff8240: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c5c7fff8250: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Heap right redzone: fb Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack partial redzone: f4 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb ==29326==ABORTING
Files
Related issues
Updated by pespin almost 6 years ago
According to test case, it happens when osmo-bsc is killed and osmo-bts-trx detects the connection is broken (timestamp 20181002160543133 in osmo-bts log):
16:05:40.004725 run osmo-bsc_10.42.42.7(pid=29315): Terminating (SIGINT) [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bsc_10.42.42.7↪osmo-bsc_10.42.42.7(pid=29315)] [process.py:111] 16:05:43.411228 run osmo-bsc_10.42.42.7(pid=29315): DBG: Cleanup [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bsc_10.42.42.7↪osmo-bsc_10.42.42.7(pid=29315)] [process.py:134] 16:05:43.412340 run osmo-bsc_10.42.42.7(pid=29315): Terminated: ok {rc=0} [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bsc_10.42.42.7↪osmo-bsc_10.42.42.7(pid=29315)] [process.py:137] 16:05:43.413307 run osmo-trx-uhd(pid=29325): Terminating (SIGINT) [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-trx-uhd↪osmo-trx-uhd(pid=29325)] [process.py:111] 16:05:43.582363 run osmo-trx-uhd(pid=29325): DBG: Cleanup [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-trx-uhd↪osmo-trx-uhd(pid=29325)] [process.py:134] 16:05:43.583486 run osmo-trx-uhd(pid=29325): Terminated {rc=130} [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-trx-uhd↪osmo-trx-uhd(pid=29325)] [process.py:139] 16:05:43.584962 run osmo-bts-trx(pid=29326): Terminating (SIGINT) [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-bts-trx(pid=29326)] [process.py:111] 16:05:43.586014 run osmo-bts-trx(pid=29326): DBG: Cleanup [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-bts-trx(pid=29326)] [process.py:134] 16:05:43.586974 run osmo-bts-trx(pid=29326): Terminated {rc=1} [trial-155↪ussd:trx-b200+mod-bts0-numtrx2+mod-bts0-chanallocdescend↪osmo-bts-trx↪osmo-bts-trx(pid=29326)] [process.py:139]
Updated by pespin almost 6 years ago
This is the code path map:
ipa_client_read "lost connection with server" ipa_client_conn_close FD unregister && close, free pending_msg queue. link->updown_cb(link) ipaccess_bts_updown_cb(link) sign_link_down(link->line) [osmo-bts] "abis.c:144 Signalling link down". Freees g_bts->oml_link and each trx->rsl_link: e1inp_sign_link_destroy e1inp_sign_link_destroy x3 link->ts->line->driver->close(link) ipaccess_close e1inp_close_socket unregisters and closes bfd e1inp_int_snd_event(S_L_INP_TEI_DN); &inp_s_cbfn [osmo-bts] "Input Signal %u received\n" prints and does nothing e1inp_line_put(link->ts->line) talloc_free(link) bts_model_abis_close "Abis close"
The issue seems to appear when trying to call e1inp_sign_link_destroy(trx->rsl_link);
on the 2nd TRX to close its RSL connection.
As we get the message "Input Signal 2 received" twice and we get the crash before seeing "Abis close", it must necessarily fail that way.
So it seems the 2nd trx was at some point freed but was not removed from g_bts->trx_list.
Updated by pespin almost 6 years ago
- Status changed from New to Stalled
I was unable so far to find the cause of the issue.
I mark this issue as stalled until I have an ettus B200 I can easily try to reproduce it with.
I submitted several patches in libosmo-abis and osmo-bts to improve logging and I enabled the test to be run by osmo-gsm-tester prod setup. Hopefully we'll get more input on this meanwhile.
Updated by Hoernchen about 5 years ago
- Related to Bug #4094: multiple crashes due to connection failures / drops added
Updated by pespin about 4 years ago
I got this again today, by pressing restarting osmo-bsc quicly while osmo-bts-trx was connected to it. using current master + some changes non-related to this in l1_if.c (tx power ramping):
So that's basically current osmo-bts.git master de2ef31f4c85c307455f6df37bfcc88534d35cc7.
20200528210313338 DTRX <000b> trx_if.c:122 phy0.0: Clock indication: fn=1753825 20200528210313338 DL1C <0006> scheduler_trx.c:1871 TRX Clock Ind: elapsed_us= 997947, elapsed_fn=216, error_us=+1107 20200528210313338 DL1C <0006> scheduler_trx.c:1890 GSM clock jitter: -1915us (elapsed_fn=0) 20200528210314336 DTRX <000b> trx_if.c:122 phy0.0: Clock indication: fn=1754041 20200528210314336 DL1C <0006> scheduler_trx.c:1871 TRX Clock Ind: elapsed_us= 997604, elapsed_fn=216, error_us= +764 20200528210314336 DL1C <0006> scheduler_trx.c:1890 GSM clock jitter: -2697us (elapsed_fn=0) 20200528210315106 DLINP <0012> ipa.c:65 192.168.30.1:3003 connection closed with server 20200528210315106 DABIS <000d> abis.c:143 Signalling link down ================================================================= ==441979==ERROR: AddressSanitizer: heap-use-after-free on address 0x62f000039090 at pc 0x7ffff73a809d bp 0x7fffffffd980 sp 0x7fffffffd970 WRITE of size 8 at 0x62f000039090 thread T0 #0 0x7ffff73a809c in __llist_del /build/new/out/include/osmocom/core/linuxlist.h:117 #1 0x7ffff73a81d4 in llist_del /build/new/out/include/osmocom/core/linuxlist.h:129 #2 0x7ffff73ad785 in e1inp_sign_link_destroy /git/libosmo-abis/src/e1_input.c:590 #3 0x5555557ceda6 in sign_link_down /git/osmo-bts/src/common/abis.c:166 #4 0x7ffff73d84d2 in ipaccess_bts_updown_cb /git/libosmo-abis/src/input/ipaccess.c:811 #5 0x7ffff73c64ef in ipa_client_read /git/libosmo-abis/src/input/ipa.c:68 #6 0x7ffff73c7a22 in ipa_client_fd_cb /git/libosmo-abis/src/input/ipa.c:136 #7 0x7ffff685cf76 in osmo_fd_disp_fds /git/libosmocore/src/select.c:227 #8 0x7ffff685d35b in _osmo_select_main /git/libosmocore/src/select.c:265 #9 0x7ffff685d43a in osmo_select_main /git/libosmocore/src/select.c:274 #10 0x5555557c6389 in bts_main /git/osmo-bts/src/common/main.c:354 #11 0x5555556e40bf in main /git/osmo-bts/src/osmo-bts-trx/main.c:155 #12 0x7ffff5c16001 in __libc_start_main (/usr/lib/libc.so.6+0x27001) #13 0x5555556e31fd in _start (/build/new/out/bin/osmo-bts-trx+0x18f1fd) 0x62f000039090 is located 3216 bytes inside of 48336-byte region [0x62f000038400,0x62f0000440d0) freed by thread T0 here: #0 0x7ffff76a40e9 in __interceptor_free /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:123 #1 0x7ffff675741b (/usr/lib/libtalloc.so.2+0x441b) previously allocated by thread T0 here: #0 0x7ffff76a4459 in __interceptor_malloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:145 #1 0x7ffff6759b8c (/usr/lib/libtalloc.so.2+0x6b8c) SUMMARY: AddressSanitizer: heap-use-after-free /build/new/out/include/osmocom/core/linuxlist.h:117 in __llist_del Shadow bytes around the buggy address: 0x0c5e7ffff1c0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c5e7ffff1d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c5e7ffff1e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c5e7ffff1f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c5e7ffff200: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd =>0x0c5e7ffff210: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c5e7ffff220: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c5e7ffff230: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c5e7ffff240: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c5e7ffff250: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c5e7ffff260: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc ==441979==ABORTING Program received signal SIGABRT, Aborted. --Type <RET> for more, q to quit, c to continue without paging-- 0x00007ffff5c2b355 in raise () from /usr/lib/libc.so.6
Updated by pespin about 4 years ago
- Status changed from Stalled to Feedback
- % Done changed from 0 to 90
Fixed by:
https://gerrit.osmocom.org/c/libosmo-abis/+/18730 e1_input: refcount inc line during e1_sign_link_create, not during line update
Updated by pespin about 4 years ago
- Related to Bug #4624: osmo-bsc leaks memory added
Updated by pespin about 4 years ago
- Status changed from Feedback to Resolved
- % Done changed from 90 to 100
Possible memory leak as result being tracked in #3612, closing this ticket since the crash is fixed.
Updated by laforge almost 4 years ago
- Related to Bug #4709: osmo-bts-trx (latest version 1.2.1) crashes in ttcn3-bts-test-latest added