Bug #4625
closedosmo-stp crashes in ttcn3-bsc-tests on first M3UA message
90%
Description
I noticed some unexplained test fallout in ttcn3-bsc-tests,
simply re-ran the same test without anything additionally merged (AFAICT),
and suddenly all ttcn3-bsc-tests fail across the board.
Logs show that osmo-stp segfaults.
Unfortunately there is no segfault backtrace in the logs.
Trying to reproduce the same issue does not work.
Running 1:1 the same suite (i.e. by launching ./jenkins.sh in docker-playground/ttcn3-bsc-test) does not crash osmo-stp.
Files
Related issues
Updated by neels about 4 years ago
20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:368 0: Creating SS7 Instance [0;m20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:668 0: Creating Route Table system [0;m20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1384 0: asp-virt-msc0-0: Restarting ASP virt-msc0-0, r=172.18.2.203:23905<->l=172.18.2.200:2905 [0;m20200620150121545 [1;33mDLSS7[0;m <000c> osmo_ss7.c:1432 0: asp-virt-msc0-0: ASP Restart for server not implemented yet! [0;m20200620150121545 [1;34mDLSS7[0;m <000c> fsm.c:461 XUA_ASP(virt-msc0-0){ASP_DOWN}: Allocated [0;m20200620150121545 [1;34mDLSS7[0;m <000c> fsm.c:461 XUA_AS(mahlzeit){AS_DOWN}: Allocated [0;m20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:970 0: as-mahlzeit: Created AS [0;m20200620150121545 [1;34mDLSS7[0;m <000c> fsm.c:461 XUA_AS(virt-msc0){AS_DOWN}: Allocated [0;m20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:970 0: as-virt-msc0: Created AS [0;m20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:990 0: as-virt-msc0: Adding ASP virt-msc0-0 to AS [0;m20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1384 0: asp-virt-msc1-0: Restarting ASP virt-msc1-0, r=172.18.2.203:23906<->l=172.18.2.200:2905 [0;m20200620150121545 [1;33mDLSS7[0;m <000c> osmo_ss7.c:1432 0: asp-virt-msc1-0: ASP Restart for server not implemented yet! [0;m20200620150121545 [1;34mDLSS7[0;m <000c> fsm.c:461 XUA_ASP(virt-msc1-0){ASP_DOWN}: Allocated [0;m20200620150121545 [1;34mDLSS7[0;m <000c> fsm.c:461 XUA_AS(virt-msc1){AS_DOWN}: Allocated [0;m20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:970 0: as-virt-msc1: Created AS [0;m20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:990 0: as-virt-msc1: Adding ASP virt-msc1-0 to AS [0;m20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1384 0: asp-virt-msc2-0: Restarting ASP virt-msc2-0, r=172.18.2.203:23907<->l=172.18.2.200:2905 [0;m20200620150121545 [1;33mDLSS7[0;m <000c> osmo_ss7.c:1432 0: asp-virt-msc2-0: ASP Restart for server not implemented yet! [0;m20200620150121545 [1;34mDLSS7[0;m <000c> fsm.c:461 XUA_ASP(virt-msc2-0){ASP_DOWN}: Allocated [0;m20200620150121545 [1;34mDLSS7[0;m <000c> fsm.c:461 XUA_AS(virt-msc2){AS_DOWN}: Allocated [0;m20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:970 0: as-virt-msc2: Created AS [0;m20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:990 0: as-virt-msc2: Adding ASP virt-msc2-0 to AS [0;m20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:792 0: Creating route: pc=185=0.23.1 mask=0x3fff via AS 'virt-msc0' [0;m20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:792 0: Creating route: pc=2=0.0.2 mask=0x3fff via AS 'virt-msc1' [0;m20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:792 0: Creating route: pc=3=0.0.3 mask=0x3fff via AS 'virt-msc2' [0;m20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1988 Creating m3ua Server (null):2905 [0;m20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:2006 Created m3ua server on (null):2905 [0;m20200620150121545 [1;32mDLSS7[0;m <000c> osmo_ss7.c:2034 (Re)binding m3ua Server to 0.0.0.0:2905 [0;m20200620150121546 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1988 Creating ipa Server (null):5000 [0;m20200620150121546 [1;32mDLSS7[0;m <000c> osmo_ss7.c:2006 Created ipa server on (null):5000 [0;m20200620150121546 [1;32mDLSS7[0;m <000c> osmo_ss7.c:2034 (Re)binding ipa Server to 0.0.0.0:5000 [0;m20200620150121546 [1;33mDLGLOBAL[0;m <0000> telnet_interface.c:104 Available via telnet 127.0.0.1 4239 [0;m20200620150122909 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1812 (r=172.18.2.20:2905<->l=172.18.2.200:2905): New m3ua connection accepted [0;m20200620150122909 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1853 (r=172.18.2.20:2905<->l=172.18.2.200:2905): created dynamic ASP asp-dyn-0 [0;m20200620150122909 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1384 0: asp-asp-dyn-0: Restarting ASP asp-dyn-0, r=172.18.2.20:2905<->l=0.0.0.0:2905 [0;m20200620150122909 [1;33mDLSS7[0;m <000c> osmo_ss7.c:1432 0: asp-asp-dyn-0: ASP Restart for server not implemented yet! [0;m20200620150122909 [1;34mDLSS7[0;m <000c> fsm.c:461 XUA_ASP(asp-dyn-0){ASP_DOWN}: Allocated [0;m20200620150122909 [1;34mDLSS7[0;m <000c> osmo_ss7.c:1889 XUA_ASP(asp-dyn-0){ASP_DOWN}: Received Event SCTP-EST.ind [0;m20200620150122909 [1;34mDLSS7[0;m <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-0: No Layer Manager, dropping M-SCTP_ESTABLISH.indication [0;m20200620150122909 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1812 (r=172.18.2.20:2905<->l=172.18.2.200:2905): New m3ua connection accepted [0;m20200620150122909 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1834 (r=172.18.2.20:2905<->l=172.18.2.200:2905): matched connection to ASP asp-dyn-0 [0;m20200620150122909 [1;34mDLSS7[0;m <000c> osmo_ss7.c:1889 XUA_ASP(asp-dyn-0){ASP_DOWN}: Received Event SCTP-EST.ind [0;m20200620150122909 [1;34mDLSS7[0;m <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-0: No Layer Manager, dropping M-SCTP_ESTABLISH.indication [0;m20200620150122909 [1;34mDLSS7[0;m <000c> osmo_ss7.c:1579 0: asp-asp-dyn-0: xua_srv_conn_cb(): sctp_recvmsg() returned 12 (flags=0x8080) [0;m20200620150122909 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1512 0: asp-asp-dyn-0: xUA SRV SCTP NOTIFICATION 32773 flags=0x0 [0;m20200620150122909 [1;33mDLSS7[0;m <000c> osmo_ss7.c:1525 0: asp-asp-dyn-0: xUA SRV SHUTDOWN_EVENT [0;m20200620150122910 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1775 asp-dyn-0: SCTP connection closed [0;m20200620150122910 [1;34mDLSS7[0;m <000c> osmo_ss7.c:1781 XUA_ASP(asp-dyn-0){ASP_DOWN}: Received Event SCTP-COMM_DOWN.ind [0;m20200620150122910 [1;34mDLSS7[0;m <000c> xua_asp_fsm.c:668 XUA_ASP(asp-dyn-0){ASP_DOWN}: state_chg to ASP_DOWN [0;m20200620150122910 [1;34mDLSS7[0;m <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-0: No Layer Manager, dropping M-ASP_DOWN.indication [0;m20200620150122910 [1;34mDLSS7[0;m <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-0: No Layer Manager, dropping M-SCTP_RELEASE.indication [0;m20200620150122910 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1344 0: asp-asp-dyn-0: Destroying ASP [0;m20200620150122910 [1;34mDLSS7[0;m <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-0){ASP_DOWN}: Terminating (cause = OSMO_FSM_TERM_REQUEST) [0;m20200620150122910 [1;34mDLSS7[0;m <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-0){ASP_DOWN}: Freeing instance [0;m20200620150122910 [1;34mDLSS7[0;m <000c> fsm.c:573 XUA_ASP(asp-dyn-0){ASP_DOWN}: Deallocated [0;m20200620150122910 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1812 (r=172.18.2.20:2905<->l=172.18.2.200:2905): New m3ua connection accepted [0;m20200620150122910 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1853 (r=172.18.2.20:2905<->l=172.18.2.200:2905): created dynamic ASP asp-dyn-1 [0;m20200620150122910 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1384 0: asp-asp-dyn-1: Restarting ASP asp-dyn-1, r=172.18.2.20:2905<->l=0.0.0.0:2905 [0;m20200620150122910 [1;33mDLSS7[0;m <000c> osmo_ss7.c:1432 0: asp-asp-dyn-1: ASP Restart for server not implemented yet! [0;m20200620150122910 [1;34mDLSS7[0;m <000c> fsm.c:461 XUA_ASP(asp-dyn-1){ASP_DOWN}: Allocated [0;m20200620150122910 [1;34mDLSS7[0;m <000c> osmo_ss7.c:1889 XUA_ASP(asp-dyn-1){ASP_DOWN}: Received Event SCTP-EST.ind [0;m20200620150122910 [1;34mDLSS7[0;m <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-1: No Layer Manager, dropping M-SCTP_ESTABLISH.indication [0;m20200620150122910 [1;34mDLSS7[0;m <000c> osmo_ss7.c:1579 0: asp-asp-dyn-1: xua_srv_conn_cb(): sctp_recvmsg() returned 12 (flags=0x8080) [0;m20200620150122910 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1512 0: asp-asp-dyn-1: xUA SRV SCTP NOTIFICATION 32773 flags=0x0 [0;m20200620150122910 [1;33mDLSS7[0;m <000c> osmo_ss7.c:1525 0: asp-asp-dyn-1: xUA SRV SHUTDOWN_EVENT [0;m20200620150122910 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1775 asp-dyn-1: SCTP connection closed [0;m20200620150122910 [1;34mDLSS7[0;m <000c> osmo_ss7.c:1781 XUA_ASP(asp-dyn-1){ASP_DOWN}: Received Event SCTP-COMM_DOWN.ind [0;m20200620150122910 [1;34mDLSS7[0;m <000c> xua_asp_fsm.c:668 XUA_ASP(asp-dyn-1){ASP_DOWN}: state_chg to ASP_DOWN [0;m20200620150122910 [1;34mDLSS7[0;m <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-1: No Layer Manager, dropping M-ASP_DOWN.indication [0;m20200620150122910 [1;34mDLSS7[0;m <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-1: No Layer Manager, dropping M-SCTP_RELEASE.indication [0;m20200620150122910 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1344 0: asp-asp-dyn-1: Destroying ASP [0;m20200620150122910 [1;34mDLSS7[0;m <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-1){ASP_DOWN}: Terminating (cause = OSMO_FSM_TERM_REQUEST) [0;m20200620150122910 [1;34mDLSS7[0;m <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-1){ASP_DOWN}: Freeing instance [0;m20200620150122910 [1;34mDLSS7[0;m <000c> fsm.c:573 XUA_ASP(asp-dyn-1){ASP_DOWN}: Deallocated [0;m20200620150122910 [1;34mDLSS7[0;m <000c> osmo_ss7.c:1579 0: asp-asp-dyn-1: xua_srv_conn_cb(): sctp_recvmsg() returned 8 (flags=0x80) [0;m20200620150122910 [1;34mDLM3UA[0;m <000f> m3ua.c:722 0: asp-asp-dyn-1: Received M3UA Message (ASPSM:UP) [0;m20200620150122910 [1;34mDLSS7[0;m <000c> m3ua.c:686 XUA_ASP(asp-dyn-1){ASP_DOWN}: FSM instance already terminating, not dispatching event ASPSM-ASP_UP [0;m20200620150122910 [1;31mDLSS7[0;m <000c> osmo_ss7.c:1918 0: asp-asp-dyn-1: Cannot transmit, no asp->server [0;m20200620150125953 [1;34mDLSS7[0;m <000c> osmo_ss7.c:1579 0: asp-asp-dyn-1: xua_srv_conn_cb(): sctp_recvmsg() returned 8 (flags=0x80) [0;m20200620150125953 [1;34mDLM3UA[0;m <000f> m3ua.c:722 0: asp-asp-dyn-1: Received M3UA Message (ASPSM:UP) [0;mSegmentation fault (core dumped)
Updated by neels about 4 years ago
- File artifacts_ttcn3-bsc-test_1012.zip artifacts_ttcn3-bsc-test_1012.zip added
- File artifacts_ttcn3-bsc-test_1013.zip artifacts_ttcn3-bsc-test_1013.zip added
- Status changed from New to Stalled
the spooky test failure happened twice in a row, and now isn't happening any longer.
I applied a change to try and collect the osmo-stp core dump, but the error vanished before a core could be generated.
(Could it have to do with another test or docker container running simultaneously?)
attached are the build artifacts for the two failing runs.
Updated by fixeria about 4 years ago
Hi Neels,
please see:
https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bsc-test/1016/artifact/logs/stp/
(gdb) bt #0 0x00007ff0cc0713ab in _osmo_fsm_inst_dispatch (fi=0x55d0dcc7a050, event=7, data=0x55d0dcc76ce0, file=0x7ff0cc6eb807 <error: Cannot access memory at address 0x7ff0cc6eb807>, line=686) at fsm.c:852 #1 0x00007ff0cc6d34bb in ?? () #2 0x12a2026400000000 in ?? () #3 0x000055d0dcc75540 in ?? () #4 0x0000000000000008 in ?? () #5 0x00007ff0cc6e3075 in ?? () #6 0x000055d0dcc75540 in ?? () #7 0x0000000000000008 in ?? () #8 0x000055d0dcc75470 in ?? () #9 0x000055d0dcc75010 in ?? () #10 0x000055d0dcc75478 in ?? () #11 0x00007ff0cc6e319b in ?? () #12 0x0000000000000000 in ?? () (gdb) p *fi $3 = { list = { next = 0x55d0dcc7a0e0, prev = 0x55d0dcc792c8 }, fsm = 0x4000c, id = 0x55d0dcc76f20 "", name = 0x55d0dcc75340 "XUA_ASP(asp-dyn-1)", priv = 0x91, log_level = 1784082000, state = 32752, T = -590897168, timer = { node = { rb_parent_color = 0, rb_right = 0x55d0dcc79250, rb_left = 0x55d0dcc78470 }, list = { next = 0x0, prev = 0x0 }, timeout = { tv_sec = 140672198686739, tv_usec = 32 }, active = 0, cb = 0x0, data = 0x55d0dcc7a050 }, proc = { parent = 0x55d0dcc792c8, parent_term_event = 3704070224, children = { next = 0x80007, prev = 0x55d0dcc784d0 }, child = { next = 0x55d0dcc7a100, prev = 0x101 }, terminating = 72 } }
Updated by fixeria about 4 years ago
Looks like use-after-free:
[0;m20200621060103377 [1;32mDLSS7[0;m <000c> osmo_ss7.c:1344 0: asp-asp-dyn-1: Destroying ASP [0;m20200621060103377 [1;34mDLSS7[0;m <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-1){ASP_DOWN}: Terminating (cause = OSMO_FSM_TERM_REQUEST) [0;m20200621060103377 [1;34mDLSS7[0;m <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-1){ASP_DOWN}: Freeing instance [0;m20200621060103377 [1;34mDLSS7[0;m <000c> fsm.c:573 XUA_ASP(asp-dyn-1){ASP_DOWN}: Deallocated // <--- !!! ... [0;m20200621060103377 [1;34mDLSS7[0;m <000c> osmo_ss7.c:1579 0: asp-asp-dyn-1: xua_srv_conn_cb(): sctp_recvmsg() returned 8 (flags=0x80) [0;m20200621060103377 [1;34mDLM3UA[0;m <000f> m3ua.c:722 0: asp-asp-dyn-1: Received M3UA Message (ASPSM:UP) [0;m20200621060103377 [1;34mDLSS7[0;m <000c> m3ua.c:686 XUA_ASP(asp-dyn-1){ASP_DOWN}: FSM instance already terminating, not dispatching event ASPSM-ASP_UP [0;m20200621060103377 [1;31mDLSS7[0;m <000c> osmo_ss7.c:1918 0: asp-asp-dyn-1: Cannot transmit, no asp->server [0;m20200621060105414 [1;34mDLSS7[0;m <000c> osmo_ss7.c:1579 0: asp-asp-dyn-1: xua_srv_conn_cb(): sctp_recvmsg() returned 8 (flags=0x80) [0;m20200621060105414 [1;34mDLM3UA[0;m <000f> m3ua.c:722 0: asp-asp-dyn-1: Received M3UA Message (ASPSM:UP)
Updated by laforge about 4 years ago
- Priority changed from Normal to Immediate
I saw this again in ttcn3-bsc-test build #1016
20200621060103376 DLSS7 <000c> osmo_ss7.c:1812 (r=172.18.2.20:2905<->l=172.18.2.200:2905): New m3ua connection accepted 20200621060103377 DLSS7 <000c> osmo_ss7.c:1853 (r=172.18.2.20:2905<->l=172.18.2.200:2905): created dynamic ASP asp-dyn-0 20200621060103377 DLSS7 <000c> osmo_ss7.c:1384 0: asp-asp-dyn-0: Restarting ASP asp-dyn-0, r=172.18.2.20:2905<->l=0.0.0.0:2905 20200621060103377 DLSS7 <000c> osmo_ss7.c:1432 0: asp-asp-dyn-0: ASP Restart for server not implemented yet! 20200621060103377 DLSS7 <000c> fsm.c:461 XUA_ASP(asp-dyn-0){ASP_DOWN}: Allocated 20200621060103377 DLSS7 <000c> osmo_ss7.c:1889 XUA_ASP(asp-dyn-0){ASP_DOWN}: Received Event SCTP-EST.ind 20200621060103377 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-0: No Layer Manager, dropping M-SCTP_ESTABLISH.indication 20200621060103377 DLSS7 <000c> osmo_ss7.c:1812 (r=172.18.2.20:2905<->l=172.18.2.200:2905): New m3ua connection accepted 20200621060103377 DLSS7 <000c> osmo_ss7.c:1834 (r=172.18.2.20:2905<->l=172.18.2.200:2905): matched connection to ASP asp-dyn-0 20200621060103377 DLSS7 <000c> osmo_ss7.c:1889 XUA_ASP(asp-dyn-0){ASP_DOWN}: Received Event SCTP-EST.ind 20200621060103377 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-0: No Layer Manager, dropping M-SCTP_ESTABLISH.indication 20200621060103377 DLSS7 <000c> osmo_ss7.c:1579 0: asp-asp-dyn-0: xua_srv_conn_cb(): sctp_recvmsg() returned 12 (flags=0x8080) 20200621060103377 DLSS7 <000c> osmo_ss7.c:1512 0: asp-asp-dyn-0: xUA SRV SCTP NOTIFICATION 32773 flags=0x0 20200621060103377 DLSS7 <000c> osmo_ss7.c:1525 0: asp-asp-dyn-0: xUA SRV SHUTDOWN_EVENT 20200621060103377 DLSS7 <000c> osmo_ss7.c:1775 asp-dyn-0: SCTP connection closed 20200621060103377 DLSS7 <000c> osmo_ss7.c:1781 XUA_ASP(asp-dyn-0){ASP_DOWN}: Received Event SCTP-COMM_DOWN.ind 20200621060103377 DLSS7 <000c> xua_asp_fsm.c:668 XUA_ASP(asp-dyn-0){ASP_DOWN}: state_chg to ASP_DOWN 20200621060103377 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-0: No Layer Manager, dropping M-ASP_DOWN.indication 20200621060103377 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-0: No Layer Manager, dropping M-SCTP_RELEASE.indication 20200621060103377 DLSS7 <000c> osmo_ss7.c:1344 0: asp-asp-dyn-0: Destroying ASP 20200621060103377 DLSS7 <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-0){ASP_DOWN}: Terminating (cause = OSMO_FSM_TERM_REQUEST) 20200621060103377 DLSS7 <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-0){ASP_DOWN}: Freeing instance 20200621060103377 DLSS7 <000c> fsm.c:573 XUA_ASP(asp-dyn-0){ASP_DOWN}: Deallocated 20200621060103377 DLSS7 <000c> osmo_ss7.c:1812 (r=172.18.2.20:2905<->l=172.18.2.200:2905): New m3ua connection accepted 20200621060103377 DLSS7 <000c> osmo_ss7.c:1853 (r=172.18.2.20:2905<->l=172.18.2.200:2905): created dynamic ASP asp-dyn-1 20200621060103377 DLSS7 <000c> osmo_ss7.c:1384 0: asp-asp-dyn-1: Restarting ASP asp-dyn-1, r=172.18.2.20:2905<->l=0.0.0.0:2905 20200621060103377 DLSS7 <000c> osmo_ss7.c:1432 0: asp-asp-dyn-1: ASP Restart for server not implemented yet! 20200621060103377 DLSS7 <000c> fsm.c:461 XUA_ASP(asp-dyn-1){ASP_DOWN}: Allocated 20200621060103377 DLSS7 <000c> osmo_ss7.c:1889 XUA_ASP(asp-dyn-1){ASP_DOWN}: Received Event SCTP-EST.ind 20200621060103377 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-1: No Layer Manager, dropping M-SCTP_ESTABLISH.indication 20200621060103377 DLSS7 <000c> osmo_ss7.c:1579 0: asp-asp-dyn-1: xua_srv_conn_cb(): sctp_recvmsg() returned 12 (flags=0x8080) 20200621060103377 DLSS7 <000c> osmo_ss7.c:1512 0: asp-asp-dyn-1: xUA SRV SCTP NOTIFICATION 32773 flags=0x0 20200621060103377 DLSS7 <000c> osmo_ss7.c:1525 0: asp-asp-dyn-1: xUA SRV SHUTDOWN_EVENT 20200621060103377 DLSS7 <000c> osmo_ss7.c:1775 asp-dyn-1: SCTP connection closed 20200621060103377 DLSS7 <000c> osmo_ss7.c:1781 XUA_ASP(asp-dyn-1){ASP_DOWN}: Received Event SCTP-COMM_DOWN.ind 20200621060103377 DLSS7 <000c> xua_asp_fsm.c:668 XUA_ASP(asp-dyn-1){ASP_DOWN}: state_chg to ASP_DOWN 20200621060103377 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-1: No Layer Manager, dropping M-ASP_DOWN.indication 20200621060103377 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-1: No Layer Manager, dropping M-SCTP_RELEASE.indication 20200621060103377 DLSS7 <000c> osmo_ss7.c:1344 0: asp-asp-dyn-1: Destroying ASP 20200621060103377 DLSS7 <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-1){ASP_DOWN}: Terminating (cause = OSMO_FSM_TERM_REQUEST) 20200621060103377 DLSS7 <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-1){ASP_DOWN}: Freeing instance 20200621060103377 DLSS7 <000c> fsm.c:573 XUA_ASP(asp-dyn-1){ASP_DOWN}: Deallocated 20200621060103377 DLSS7 <000c> osmo_ss7.c:1579 0: asp-asp-dyn-1: xua_srv_conn_cb(): sctp_recvmsg() returned 8 (flags=0x80) 20200621060103377 DLM3UA <000f> m3ua.c:722 0: asp-asp-dyn-1: Received M3UA Message (ASPSM:UP) 20200621060103377 DLSS7 <000c> m3ua.c:686 XUA_ASP(asp-dyn-1){ASP_DOWN}: FSM instance already terminating, not dispatching event ASPSM-ASP_UP 20200621060103377 DLSS7 <000c> osmo_ss7.c:1918 0: asp-asp-dyn-1: Cannot transmit, no asp->server 20200621060105414 DLSS7 <000c> osmo_ss7.c:1579 0: asp-asp-dyn-1: xua_srv_conn_cb(): sctp_recvmsg() returned 8 (flags=0x80) 20200621060105414 DLM3UA <000f> m3ua.c:722 0: asp-asp-dyn-1: Received M3UA Message (ASPSM:UP)
The key aspect is that we receive an ASPSM:UP message while the ASP is terminating, and then things go downhill from there.
Updated by laforge about 4 years ago
I've locally reproduced it (in docker with our "./jenkins.sh". The problem is that since it relates to M3UA traffic between BSC and STP, it happens before we launch TTCN3 and hence before we capture anything in tcpdump by our infrastructure.
Updated by laforge about 4 years ago
- Status changed from Stalled to In Progress
- Assignee set to laforge
Updated by laforge about 4 years ago
- % Done changed from 0 to 20
So the problem is two-fold
- osmo-bsc is sending a rapid sequence of SCTP INIT-SHUTDOWN-INIT-SHUTDOWN-INIT (packets 11 to 28 in the attached stp.pcap) before ever getting to the stage where it sends the first M3UA message (ASPUP). This is clearly not expected, and osmo-bsc shouldn't do that (neels is working on related changes avoiding an ASP restart every time a MSC config item is parsed)
- osmo-stp should of course not crash, no matter what kind of SCTP packet sequences arrive from any valid or invalid peer. I will try to creat a TTCN3 test case for this behavior to reliably reproduce and then look into a fix.
Updated by neels about 4 years ago
- osmo-bsc is sending a rapid sequence of SCTP INIT-SHUTDOWN-INIT-SHUTDOWN-INIT (packets 11 to 28 in the attached stp.pcap) before ever getting to the stage where it sends the first M3UA message (ASPUP). This is clearly not expected, and osmo-bsc shouldn't do that (neels is working on related changes avoiding an ASP restart every time a MSC config item is parsed)
patch for this part is https://gerrit.osmocom.org/c/libosmo-sccp/+/18990
Updated by laforge about 4 years ago
Updated by neels about 4 years ago
- Related to Feature #4635: quirks when initializing SS7 ASP added
Updated by laforge about 4 years ago
- % Done changed from 20 to 90
patch merged. I was unable to reproduce the required tight timing from within a TTCN3 test case. I could only get it implented in native C. Will try how it goes from a TTCN3 test case that uses a native C++ function for the rapid successive connect()/close() calls.
Updated by laforge about 4 years ago
- Status changed from In Progress to Resolved
unable to reliably test for it in a test case :(