Project

General

Profile

Bug #4625

osmo-stp crashes in ttcn3-bsc-tests on first M3UA message

Added by neels 22 days ago. Updated 17 days ago.

Status:
Resolved
Priority:
Immediate
Assignee:
Target version:
-
Start date:
06/20/2020
Due date:
% Done:

90%


Description

I noticed some unexplained test fallout in ttcn3-bsc-tests,
simply re-ran the same test without anything additionally merged (AFAICT),
and suddenly all ttcn3-bsc-tests fail across the board.

Logs show that osmo-stp segfaults.
Unfortunately there is no segfault backtrace in the logs.

Trying to reproduce the same issue does not work.
Running 1:1 the same suite (i.e. by launching ./jenkins.sh in docker-playground/ttcn3-bsc-test) does not crash osmo-stp.

artifacts_ttcn3-bsc-test_1012.zip artifacts_ttcn3-bsc-test_1012.zip 13.7 MB neels, 06/20/2020 06:16 PM
artifacts_ttcn3-bsc-test_1013.zip artifacts_ttcn3-bsc-test_1013.zip 13.7 MB neels, 06/20/2020 06:16 PM
stp.pcap stp.pcap 16 KB laforge, 06/23/2020 03:50 PM

Related issues

Related to libosmo-sccp + libosmo-sigtran - Feature #4635: quirks when initializing SS7 ASPNew06/24/2020

History

#1 Updated by neels 22 days ago

20200620150121545 DLSS7 <000c> osmo_ss7.c:368 0: Creating SS7 Instance
20200620150121545 DLSS7 <000c> osmo_ss7.c:668 0: Creating Route Table system
20200620150121545 DLSS7 <000c> osmo_ss7.c:1384 0: asp-virt-msc0-0: Restarting ASP virt-msc0-0, r=172.18.2.203:23905<->l=172.18.2.200:2905
20200620150121545 DLSS7 <000c> osmo_ss7.c:1432 0: asp-virt-msc0-0: ASP Restart for server not implemented yet!
20200620150121545 DLSS7 <000c> fsm.c:461 XUA_ASP(virt-msc0-0){ASP_DOWN}: Allocated
20200620150121545 DLSS7 <000c> fsm.c:461 XUA_AS(mahlzeit){AS_DOWN}: Allocated
20200620150121545 DLSS7 <000c> osmo_ss7.c:970 0: as-mahlzeit: Created AS
20200620150121545 DLSS7 <000c> fsm.c:461 XUA_AS(virt-msc0){AS_DOWN}: Allocated
20200620150121545 DLSS7 <000c> osmo_ss7.c:970 0: as-virt-msc0: Created AS
20200620150121545 DLSS7 <000c> osmo_ss7.c:990 0: as-virt-msc0: Adding ASP virt-msc0-0 to AS
20200620150121545 DLSS7 <000c> osmo_ss7.c:1384 0: asp-virt-msc1-0: Restarting ASP virt-msc1-0, r=172.18.2.203:23906<->l=172.18.2.200:2905
20200620150121545 DLSS7 <000c> osmo_ss7.c:1432 0: asp-virt-msc1-0: ASP Restart for server not implemented yet!
20200620150121545 DLSS7 <000c> fsm.c:461 XUA_ASP(virt-msc1-0){ASP_DOWN}: Allocated
20200620150121545 DLSS7 <000c> fsm.c:461 XUA_AS(virt-msc1){AS_DOWN}: Allocated
20200620150121545 DLSS7 <000c> osmo_ss7.c:970 0: as-virt-msc1: Created AS
20200620150121545 DLSS7 <000c> osmo_ss7.c:990 0: as-virt-msc1: Adding ASP virt-msc1-0 to AS
20200620150121545 DLSS7 <000c> osmo_ss7.c:1384 0: asp-virt-msc2-0: Restarting ASP virt-msc2-0, r=172.18.2.203:23907<->l=172.18.2.200:2905
20200620150121545 DLSS7 <000c> osmo_ss7.c:1432 0: asp-virt-msc2-0: ASP Restart for server not implemented yet!
20200620150121545 DLSS7 <000c> fsm.c:461 XUA_ASP(virt-msc2-0){ASP_DOWN}: Allocated
20200620150121545 DLSS7 <000c> fsm.c:461 XUA_AS(virt-msc2){AS_DOWN}: Allocated
20200620150121545 DLSS7 <000c> osmo_ss7.c:970 0: as-virt-msc2: Created AS
20200620150121545 DLSS7 <000c> osmo_ss7.c:990 0: as-virt-msc2: Adding ASP virt-msc2-0 to AS
20200620150121545 DLSS7 <000c> osmo_ss7.c:792 0: Creating route: pc=185=0.23.1 mask=0x3fff via AS 'virt-msc0'
20200620150121545 DLSS7 <000c> osmo_ss7.c:792 0: Creating route: pc=2=0.0.2 mask=0x3fff via AS 'virt-msc1'
20200620150121545 DLSS7 <000c> osmo_ss7.c:792 0: Creating route: pc=3=0.0.3 mask=0x3fff via AS 'virt-msc2'
20200620150121545 DLSS7 <000c> osmo_ss7.c:1988 Creating m3ua Server (null):2905
20200620150121545 DLSS7 <000c> osmo_ss7.c:2006 Created m3ua server on (null):2905
20200620150121545 DLSS7 <000c> osmo_ss7.c:2034 (Re)binding m3ua Server to 0.0.0.0:2905
20200620150121546 DLSS7 <000c> osmo_ss7.c:1988 Creating ipa Server (null):5000
20200620150121546 DLSS7 <000c> osmo_ss7.c:2006 Created ipa server on (null):5000
20200620150121546 DLSS7 <000c> osmo_ss7.c:2034 (Re)binding ipa Server to 0.0.0.0:5000
20200620150121546 DLGLOBAL <0000> telnet_interface.c:104 Available via telnet 127.0.0.1 4239
20200620150122909 DLSS7 <000c> osmo_ss7.c:1812 (r=172.18.2.20:2905<->l=172.18.2.200:2905): New m3ua connection accepted
20200620150122909 DLSS7 <000c> osmo_ss7.c:1853 (r=172.18.2.20:2905<->l=172.18.2.200:2905): created dynamic ASP asp-dyn-0
20200620150122909 DLSS7 <000c> osmo_ss7.c:1384 0: asp-asp-dyn-0: Restarting ASP asp-dyn-0, r=172.18.2.20:2905<->l=0.0.0.0:2905
20200620150122909 DLSS7 <000c> osmo_ss7.c:1432 0: asp-asp-dyn-0: ASP Restart for server not implemented yet!
20200620150122909 DLSS7 <000c> fsm.c:461 XUA_ASP(asp-dyn-0){ASP_DOWN}: Allocated
20200620150122909 DLSS7 <000c> osmo_ss7.c:1889 XUA_ASP(asp-dyn-0){ASP_DOWN}: Received Event SCTP-EST.ind
20200620150122909 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-0: No Layer Manager, dropping M-SCTP_ESTABLISH.indication
20200620150122909 DLSS7 <000c> osmo_ss7.c:1812 (r=172.18.2.20:2905<->l=172.18.2.200:2905): New m3ua connection accepted
20200620150122909 DLSS7 <000c> osmo_ss7.c:1834 (r=172.18.2.20:2905<->l=172.18.2.200:2905): matched connection to ASP asp-dyn-0
20200620150122909 DLSS7 <000c> osmo_ss7.c:1889 XUA_ASP(asp-dyn-0){ASP_DOWN}: Received Event SCTP-EST.ind
20200620150122909 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-0: No Layer Manager, dropping M-SCTP_ESTABLISH.indication
20200620150122909 DLSS7 <000c> osmo_ss7.c:1579 0: asp-asp-dyn-0: xua_srv_conn_cb(): sctp_recvmsg() returned 12 (flags=0x8080)
20200620150122909 DLSS7 <000c> osmo_ss7.c:1512 0: asp-asp-dyn-0: xUA SRV SCTP NOTIFICATION 32773 flags=0x0
20200620150122909 DLSS7 <000c> osmo_ss7.c:1525 0: asp-asp-dyn-0: xUA SRV SHUTDOWN_EVENT
20200620150122910 DLSS7 <000c> osmo_ss7.c:1775 asp-dyn-0: SCTP connection closed
20200620150122910 DLSS7 <000c> osmo_ss7.c:1781 XUA_ASP(asp-dyn-0){ASP_DOWN}: Received Event SCTP-COMM_DOWN.ind
20200620150122910 DLSS7 <000c> xua_asp_fsm.c:668 XUA_ASP(asp-dyn-0){ASP_DOWN}: state_chg to ASP_DOWN
20200620150122910 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-0: No Layer Manager, dropping M-ASP_DOWN.indication
20200620150122910 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-0: No Layer Manager, dropping M-SCTP_RELEASE.indication
20200620150122910 DLSS7 <000c> osmo_ss7.c:1344 0: asp-asp-dyn-0: Destroying ASP
20200620150122910 DLSS7 <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-0){ASP_DOWN}: Terminating (cause = OSMO_FSM_TERM_REQUEST)
20200620150122910 DLSS7 <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-0){ASP_DOWN}: Freeing instance
20200620150122910 DLSS7 <000c> fsm.c:573 XUA_ASP(asp-dyn-0){ASP_DOWN}: Deallocated
20200620150122910 DLSS7 <000c> osmo_ss7.c:1812 (r=172.18.2.20:2905<->l=172.18.2.200:2905): New m3ua connection accepted
20200620150122910 DLSS7 <000c> osmo_ss7.c:1853 (r=172.18.2.20:2905<->l=172.18.2.200:2905): created dynamic ASP asp-dyn-1
20200620150122910 DLSS7 <000c> osmo_ss7.c:1384 0: asp-asp-dyn-1: Restarting ASP asp-dyn-1, r=172.18.2.20:2905<->l=0.0.0.0:2905
20200620150122910 DLSS7 <000c> osmo_ss7.c:1432 0: asp-asp-dyn-1: ASP Restart for server not implemented yet!
20200620150122910 DLSS7 <000c> fsm.c:461 XUA_ASP(asp-dyn-1){ASP_DOWN}: Allocated
20200620150122910 DLSS7 <000c> osmo_ss7.c:1889 XUA_ASP(asp-dyn-1){ASP_DOWN}: Received Event SCTP-EST.ind
20200620150122910 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-1: No Layer Manager, dropping M-SCTP_ESTABLISH.indication
20200620150122910 DLSS7 <000c> osmo_ss7.c:1579 0: asp-asp-dyn-1: xua_srv_conn_cb(): sctp_recvmsg() returned 12 (flags=0x8080)
20200620150122910 DLSS7 <000c> osmo_ss7.c:1512 0: asp-asp-dyn-1: xUA SRV SCTP NOTIFICATION 32773 flags=0x0
20200620150122910 DLSS7 <000c> osmo_ss7.c:1525 0: asp-asp-dyn-1: xUA SRV SHUTDOWN_EVENT
20200620150122910 DLSS7 <000c> osmo_ss7.c:1775 asp-dyn-1: SCTP connection closed
20200620150122910 DLSS7 <000c> osmo_ss7.c:1781 XUA_ASP(asp-dyn-1){ASP_DOWN}: Received Event SCTP-COMM_DOWN.ind
20200620150122910 DLSS7 <000c> xua_asp_fsm.c:668 XUA_ASP(asp-dyn-1){ASP_DOWN}: state_chg to ASP_DOWN
20200620150122910 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-1: No Layer Manager, dropping M-ASP_DOWN.indication
20200620150122910 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-1: No Layer Manager, dropping M-SCTP_RELEASE.indication
20200620150122910 DLSS7 <000c> osmo_ss7.c:1344 0: asp-asp-dyn-1: Destroying ASP
20200620150122910 DLSS7 <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-1){ASP_DOWN}: Terminating (cause = OSMO_FSM_TERM_REQUEST)
20200620150122910 DLSS7 <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-1){ASP_DOWN}: Freeing instance
20200620150122910 DLSS7 <000c> fsm.c:573 XUA_ASP(asp-dyn-1){ASP_DOWN}: Deallocated
20200620150122910 DLSS7 <000c> osmo_ss7.c:1579 0: asp-asp-dyn-1: xua_srv_conn_cb(): sctp_recvmsg() returned 8 (flags=0x80)
20200620150122910 DLM3UA <000f> m3ua.c:722 0: asp-asp-dyn-1: Received M3UA Message (ASPSM:UP)
20200620150122910 DLSS7 <000c> m3ua.c:686 XUA_ASP(asp-dyn-1){ASP_DOWN}: FSM instance already terminating, not dispatching event ASPSM-ASP_UP
20200620150122910 DLSS7 <000c> osmo_ss7.c:1918 0: asp-asp-dyn-1: Cannot transmit, no asp->server
20200620150125953 DLSS7 <000c> osmo_ss7.c:1579 0: asp-asp-dyn-1: xua_srv_conn_cb(): sctp_recvmsg() returned 8 (flags=0x80)
20200620150125953 DLM3UA <000f> m3ua.c:722 0: asp-asp-dyn-1: Received M3UA Message (ASPSM:UP)
Segmentation fault (core dumped)

#2 Updated by neels 21 days ago

the spooky test failure happened twice in a row, and now isn't happening any longer.
I applied a change to try and collect the osmo-stp core dump, but the error vanished before a core could be generated.

(Could it have to do with another test or docker container running simultaneously?)

attached are the build artifacts for the two failing runs.

#4 Updated by neels 21 days ago

  • Priority changed from Urgent to Normal

#5 Updated by fixeria 21 days ago

Hi Neels,

please see:

https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bsc-test/1016/artifact/logs/stp/

(gdb) bt
#0  0x00007ff0cc0713ab in _osmo_fsm_inst_dispatch (fi=0x55d0dcc7a050, event=7, data=0x55d0dcc76ce0, 
    file=0x7ff0cc6eb807 <error: Cannot access memory at address 0x7ff0cc6eb807>, line=686) at fsm.c:852
#1  0x00007ff0cc6d34bb in ?? ()
#2  0x12a2026400000000 in ?? ()
#3  0x000055d0dcc75540 in ?? ()
#4  0x0000000000000008 in ?? ()
#5  0x00007ff0cc6e3075 in ?? ()
#6  0x000055d0dcc75540 in ?? ()
#7  0x0000000000000008 in ?? ()
#8  0x000055d0dcc75470 in ?? ()
#9  0x000055d0dcc75010 in ?? ()
#10 0x000055d0dcc75478 in ?? ()
#11 0x00007ff0cc6e319b in ?? ()
#12 0x0000000000000000 in ?? ()

(gdb) p *fi
$3 = {
  list = {
    next = 0x55d0dcc7a0e0, 
    prev = 0x55d0dcc792c8
  }, 
  fsm = 0x4000c, 
  id = 0x55d0dcc76f20 "", 
  name = 0x55d0dcc75340 "XUA_ASP(asp-dyn-1)", 
  priv = 0x91, 
  log_level = 1784082000, 
  state = 32752, 
  T = -590897168, 
  timer = {
    node = {
      rb_parent_color = 0, 
      rb_right = 0x55d0dcc79250, 
      rb_left = 0x55d0dcc78470
    }, 
    list = {
      next = 0x0, 
      prev = 0x0
    }, 
    timeout = {
      tv_sec = 140672198686739, 
      tv_usec = 32
    }, 
    active = 0, 
    cb = 0x0, 
    data = 0x55d0dcc7a050
  },
  proc = {
    parent = 0x55d0dcc792c8, 
    parent_term_event = 3704070224, 
    children = {
      next = 0x80007, 
      prev = 0x55d0dcc784d0
    }, 
    child = {
      next = 0x55d0dcc7a100, 
      prev = 0x101
    }, 
    terminating = 72
  }
}

#6 Updated by fixeria 21 days ago

Looks like use-after-free:

20200621060103377 DLSS7 <000c> osmo_ss7.c:1344 0: asp-asp-dyn-1: Destroying ASP
20200621060103377 DLSS7 <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-1){ASP_DOWN}: Terminating (cause = OSMO_FSM_TERM_REQUEST)
20200621060103377 DLSS7 <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-1){ASP_DOWN}: Freeing instance
20200621060103377 DLSS7 <000c> fsm.c:573 XUA_ASP(asp-dyn-1){ASP_DOWN}: Deallocated  // <--- !!!
...
20200621060103377 DLSS7 <000c> osmo_ss7.c:1579 0: asp-asp-dyn-1: xua_srv_conn_cb(): sctp_recvmsg() returned 8 (flags=0x80)
20200621060103377 DLM3UA <000f> m3ua.c:722 0: asp-asp-dyn-1: Received M3UA Message (ASPSM:UP)
20200621060103377 DLSS7 <000c> m3ua.c:686 XUA_ASP(asp-dyn-1){ASP_DOWN}: FSM instance already terminating, not dispatching event ASPSM-ASP_UP
20200621060103377 DLSS7 <000c> osmo_ss7.c:1918 0: asp-asp-dyn-1: Cannot transmit, no asp->server
20200621060105414 DLSS7 <000c> osmo_ss7.c:1579 0: asp-asp-dyn-1: xua_srv_conn_cb(): sctp_recvmsg() returned 8 (flags=0x80)
20200621060105414 DLM3UA <000f> m3ua.c:722 0: asp-asp-dyn-1: Received M3UA Message (ASPSM:UP)

#7 Updated by laforge 20 days ago

  • Priority changed from Normal to Immediate

I saw this again in ttcn3-bsc-test build #1016

20200621060103376 DLSS7 <000c> osmo_ss7.c:1812 (r=172.18.2.20:2905<->l=172.18.2.200:2905): New m3ua connection accepted
20200621060103377 DLSS7 <000c> osmo_ss7.c:1853 (r=172.18.2.20:2905<->l=172.18.2.200:2905): created dynamic ASP asp-dyn-0
20200621060103377 DLSS7 <000c> osmo_ss7.c:1384 0: asp-asp-dyn-0: Restarting ASP asp-dyn-0, r=172.18.2.20:2905<->l=0.0.0.0:2905
20200621060103377 DLSS7 <000c> osmo_ss7.c:1432 0: asp-asp-dyn-0: ASP Restart for server not implemented yet!
20200621060103377 DLSS7 <000c> fsm.c:461 XUA_ASP(asp-dyn-0){ASP_DOWN}: Allocated
20200621060103377 DLSS7 <000c> osmo_ss7.c:1889 XUA_ASP(asp-dyn-0){ASP_DOWN}: Received Event SCTP-EST.ind
20200621060103377 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-0: No Layer Manager, dropping M-SCTP_ESTABLISH.indication
20200621060103377 DLSS7 <000c> osmo_ss7.c:1812 (r=172.18.2.20:2905<->l=172.18.2.200:2905): New m3ua connection accepted
20200621060103377 DLSS7 <000c> osmo_ss7.c:1834 (r=172.18.2.20:2905<->l=172.18.2.200:2905): matched connection to ASP asp-dyn-0
20200621060103377 DLSS7 <000c> osmo_ss7.c:1889 XUA_ASP(asp-dyn-0){ASP_DOWN}: Received Event SCTP-EST.ind
20200621060103377 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-0: No Layer Manager, dropping M-SCTP_ESTABLISH.indication
20200621060103377 DLSS7 <000c> osmo_ss7.c:1579 0: asp-asp-dyn-0: xua_srv_conn_cb(): sctp_recvmsg() returned 12 (flags=0x8080)
20200621060103377 DLSS7 <000c> osmo_ss7.c:1512 0: asp-asp-dyn-0: xUA SRV SCTP NOTIFICATION 32773 flags=0x0
20200621060103377 DLSS7 <000c> osmo_ss7.c:1525 0: asp-asp-dyn-0: xUA SRV SHUTDOWN_EVENT
20200621060103377 DLSS7 <000c> osmo_ss7.c:1775 asp-dyn-0: SCTP connection closed
20200621060103377 DLSS7 <000c> osmo_ss7.c:1781 XUA_ASP(asp-dyn-0){ASP_DOWN}: Received Event SCTP-COMM_DOWN.ind
20200621060103377 DLSS7 <000c> xua_asp_fsm.c:668 XUA_ASP(asp-dyn-0){ASP_DOWN}: state_chg to ASP_DOWN
20200621060103377 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-0: No Layer Manager, dropping M-ASP_DOWN.indication
20200621060103377 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-0: No Layer Manager, dropping M-SCTP_RELEASE.indication
20200621060103377 DLSS7 <000c> osmo_ss7.c:1344 0: asp-asp-dyn-0: Destroying ASP
20200621060103377 DLSS7 <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-0){ASP_DOWN}: Terminating (cause = OSMO_FSM_TERM_REQUEST)
20200621060103377 DLSS7 <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-0){ASP_DOWN}: Freeing instance
20200621060103377 DLSS7 <000c> fsm.c:573 XUA_ASP(asp-dyn-0){ASP_DOWN}: Deallocated
20200621060103377 DLSS7 <000c> osmo_ss7.c:1812 (r=172.18.2.20:2905<->l=172.18.2.200:2905): New m3ua connection accepted
20200621060103377 DLSS7 <000c> osmo_ss7.c:1853 (r=172.18.2.20:2905<->l=172.18.2.200:2905): created dynamic ASP asp-dyn-1
20200621060103377 DLSS7 <000c> osmo_ss7.c:1384 0: asp-asp-dyn-1: Restarting ASP asp-dyn-1, r=172.18.2.20:2905<->l=0.0.0.0:2905
20200621060103377 DLSS7 <000c> osmo_ss7.c:1432 0: asp-asp-dyn-1: ASP Restart for server not implemented yet!
20200621060103377 DLSS7 <000c> fsm.c:461 XUA_ASP(asp-dyn-1){ASP_DOWN}: Allocated
20200621060103377 DLSS7 <000c> osmo_ss7.c:1889 XUA_ASP(asp-dyn-1){ASP_DOWN}: Received Event SCTP-EST.ind
20200621060103377 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-1: No Layer Manager, dropping M-SCTP_ESTABLISH.indication
20200621060103377 DLSS7 <000c> osmo_ss7.c:1579 0: asp-asp-dyn-1: xua_srv_conn_cb(): sctp_recvmsg() returned 12 (flags=0x8080)
20200621060103377 DLSS7 <000c> osmo_ss7.c:1512 0: asp-asp-dyn-1: xUA SRV SCTP NOTIFICATION 32773 flags=0x0
20200621060103377 DLSS7 <000c> osmo_ss7.c:1525 0: asp-asp-dyn-1: xUA SRV SHUTDOWN_EVENT
20200621060103377 DLSS7 <000c> osmo_ss7.c:1775 asp-dyn-1: SCTP connection closed
20200621060103377 DLSS7 <000c> osmo_ss7.c:1781 XUA_ASP(asp-dyn-1){ASP_DOWN}: Received Event SCTP-COMM_DOWN.ind
20200621060103377 DLSS7 <000c> xua_asp_fsm.c:668 XUA_ASP(asp-dyn-1){ASP_DOWN}: state_chg to ASP_DOWN
20200621060103377 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-1: No Layer Manager, dropping M-ASP_DOWN.indication
20200621060103377 DLSS7 <000c> xua_asp_fsm.c:113 0: asp-asp-dyn-1: No Layer Manager, dropping M-SCTP_RELEASE.indication
20200621060103377 DLSS7 <000c> osmo_ss7.c:1344 0: asp-asp-dyn-1: Destroying ASP
20200621060103377 DLSS7 <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-1){ASP_DOWN}: Terminating (cause = OSMO_FSM_TERM_REQUEST)
20200621060103377 DLSS7 <000c> osmo_ss7.c:1351 XUA_ASP(asp-dyn-1){ASP_DOWN}: Freeing instance
20200621060103377 DLSS7 <000c> fsm.c:573 XUA_ASP(asp-dyn-1){ASP_DOWN}: Deallocated
20200621060103377 DLSS7 <000c> osmo_ss7.c:1579 0: asp-asp-dyn-1: xua_srv_conn_cb(): sctp_recvmsg() returned 8 (flags=0x80)
20200621060103377 DLM3UA <000f> m3ua.c:722 0: asp-asp-dyn-1: Received M3UA Message (ASPSM:UP)
20200621060103377 DLSS7 <000c> m3ua.c:686 XUA_ASP(asp-dyn-1){ASP_DOWN}: FSM instance already terminating, not dispatching event ASPSM-ASP_UP
20200621060103377 DLSS7 <000c> osmo_ss7.c:1918 0: asp-asp-dyn-1: Cannot transmit, no asp->server
20200621060105414 DLSS7 <000c> osmo_ss7.c:1579 0: asp-asp-dyn-1: xua_srv_conn_cb(): sctp_recvmsg() returned 8 (flags=0x80)
20200621060105414 DLM3UA <000f> m3ua.c:722 0: asp-asp-dyn-1: Received M3UA Message (ASPSM:UP)

The key aspect is that we receive an ASPSM:UP message while the ASP is terminating, and then things go downhill from there.

#8 Updated by laforge 19 days ago

I've locally reproduced it (in docker with our "./jenkins.sh". The problem is that since it relates to M3UA traffic between BSC and STP, it happens before we launch TTCN3 and hence before we capture anything in tcpdump by our infrastructure.

#9 Updated by laforge 19 days ago

  • Status changed from Stalled to In Progress
  • Assignee set to laforge

#10 Updated by laforge 18 days ago

  • % Done changed from 0 to 20

So the problem is two-fold

  1. osmo-bsc is sending a rapid sequence of SCTP INIT-SHUTDOWN-INIT-SHUTDOWN-INIT (packets 11 to 28 in the attached stp.pcap) before ever getting to the stage where it sends the first M3UA message (ASPUP). This is clearly not expected, and osmo-bsc shouldn't do that (neels is working on related changes avoiding an ASP restart every time a MSC config item is parsed)
  2. osmo-stp should of course not crash, no matter what kind of SCTP packet sequences arrive from any valid or invalid peer. I will try to creat a TTCN3 test case for this behavior to reliably reproduce and then look into a fix.

#11 Updated by neels 18 days ago

  1. osmo-bsc is sending a rapid sequence of SCTP INIT-SHUTDOWN-INIT-SHUTDOWN-INIT (packets 11 to 28 in the attached stp.pcap) before ever getting to the stage where it sends the first M3UA message (ASPUP). This is clearly not expected, and osmo-bsc shouldn't do that (neels is working on related changes avoiding an ASP restart every time a MSC config item is parsed)

patch for this part is https://gerrit.osmocom.org/c/libosmo-sccp/+/18990

#13 Updated by neels 17 days ago

  • Related to Feature #4635: quirks when initializing SS7 ASP added

#14 Updated by laforge 17 days ago

  • % Done changed from 20 to 90

patch merged. I was unable to reproduce the required tight timing from within a TTCN3 test case. I could only get it implented in native C. Will try how it goes from a TTCN3 test case that uses a native C++ function for the rapid successive connect()/close() calls.

#15 Updated by laforge 17 days ago

  • Status changed from In Progress to Resolved

unable to reliably test for it in a test case :(

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)