Actions
Bug #3050
closedvty: it is possible to crash the MSC by issuing a 'show subscriber' command
Start date:
03/09/2018
Due date:
% Done:
100%
Resolution:
Spec Reference:
Description
MSC sometimes hits a segfault, trying to list a deallocated subscriber connection, by iterating gsmnet->subscr_conns.
(Details follow)
Updated by neels about 6 years ago
backtrace:
20180309154726816 DPAG DEBUG gsm_subscriber.c:73 Paging success for MSISDN:101 (event=0) 20180309154726816 DPAG DEBUG osmo_msc.c:411 Paging can stop for MSISDN:101 20180309154726816 DPAG DEBUG gsm_subscriber.c:103 Paging without action. 20180309154726816 DMM DEBUG subscr_conn.c:133 Subscr_Conn(626598408)[0x5555558e32e0]{SUBSCR_CONN_S_ACCEPTED}: Received Event SUBSCR_CONN_E_BUMP 20180309154726816 DMM DEBUG subscr_conn.c:169 Subscr_Conn(626598408)[0x5555558e32e0]{SUBSCR_CONN_S_ACCEPTED}: bump: releasing conn 20180309154726816 DMM DEBUG subscr_conn.c:170 Subscr_Conn(626598408)[0x5555558e32e0]{SUBSCR_CONN_S_ACCEPTED}: state_chg to SUBSCR_CONN_S_RELEASED 20180309154726816 DMM DEBUG subscr_conn.c:247 Subscr_Conn(626598408)[0x5555558e32e0]{SUBSCR_CONN_S_RELEASED}: Terminating (cause = OSMO_FSM_TERM_REGULAR) 20180309154726816 DVLR DEBUG subscr_conn.c:247 Process_Access_Request_VLR(626598408)[0x5555558e8590]{PR_ARQ_S_DONE}: Terminating (cause = OSMO_FSM_TERM_PARENT) 20180309154726816 DVLR DEBUG subscr_conn.c:247 Process_Access_Request_VLR(626598408)[0x5555558e8590]{PR_ARQ_S_DONE}: Removing from parent Subscr_Conn(626598408)[0x5555558e32e0] 20180309154726816 DVLR DEBUG subscr_conn.c:247 Process_Access_Request_VLR(626598408)[0x5555558e8590]{PR_ARQ_S_DONE}: Freeing instance 20180309154726816 DVLR DEBUG fsm.c:346 Process_Access_Request_VLR(626598408)[0x5555558e8590]{PR_ARQ_S_DONE}: Deallocated 20180309154726816 DMM DEBUG subscr_conn.c:226 Subscr_Conn(626598408)[0x5555558e32e0]{SUBSCR_CONN_S_RELEASED}: conn_fsm_cleanup conn MSISDN:101 20180309154726816 DMM DEBUG osmo_msc.c:311 msc_subscr_conn_close(vsub=MSISDN:101, cause=2): no conn fsm, releasing directly without release event. 20180309154726816 DBSSAP INFO a_iface.c:420 (subscr MSISDN:101, conn_id 2) Tx BSSMAP CLEAR COMMAND to BSC 20180309154726816 DLSCCP DEBUG sccp_scoc.c:1615 Received SCCP User Primitive N-DATA.request) 20180309154726816 DLSCCP DEBUG sccp_scoc.c:1657 SCCP-SCOC(2)[0x5555558e2f00]{ACTIVE}: Received Event N-DATA.req 20180309154726816 DLSS7 DEBUG sccp_scrc.c:391 sccp_scrc_rx_scoc_conn_msg: HDR=(CO:CODT,V=0,LEN=0), PART(T=Routing Context,L=4,D=00000000), PART(T=Destination Reference,L=4,D=00000003), PART(T=Data,L=6,D=000420040109) 20180309154726816 DLSS7 DEBUG osmo_ss7_hmrt.c:278 m3ua_hmdc_rx_from_l2(): dpc=187=0.23.3 not local, message is for routing 20180309154726816 DLSS7 DEBUG osmo_ss7_hmrt.c:227 Found route for dpc=187=0.23.3: pc=0=0.0.0 mask=0x0=0.0.0 via AS as-clnt-OsmoMSC-A-Iu proto=m3ua 20180309154726816 DLSS7 DEBUG osmo_ss7_hmrt.c:233 rt->dest.as proto is M3UA for dpc=187=0.23.3 20180309154726816 DLSS7 DEBUG m3ua.c:507 XUA_AS(as-clnt-OsmoMSC-A-Iu)[0x5555558c5820]{AS_ACTIVE}: Received Event AS-TRANSFER.req 20180309154726816 DRLL DEBUG osmo_msc.c:229 subscr MSISDN:101: Freeing subscriber connection 0x5555558e3130 here the GSMTAP log shows llist_del(conn = 0x5555558e3130) from a temporary hack: DEBUGP(DREF, "llist_del(conn = %p)\n", conn); llist_del(&conn->entry); 20180309154726816 DMM DEBUG subscr_conn.c:247 Subscr_Conn(626598408)[0x5555558e32e0]{SUBSCR_CONN_S_RELEASED}: Freeing instance 20180309154726816 DMM DEBUG fsm.c:346 Subscr_Conn(626598408)[0x5555558e32e0]{SUBSCR_CONN_S_RELEASED}: Deallocated 20180309154726817 DLINP DEBUG stream.c:279 connected write 20180309154726817 DLINP DEBUG stream.c:204 sending data [...] 20180309154726823 DLSCCP DEBUG sccp_scoc.c:1548 Received CO:RELCO for local reference 2 20180309154726823 DLSCCP DEBUG sccp_scoc.c:1581 SCCP-SCOC(2)[0x5555558e2f00]{DISCONN_PEND}: Received Event RCOC-RELEASE_COMPLETE.ind 20180309154726823 DLSCCP DEBUG sccp_scoc.c:1060 SCCP-SCOC(2)[0x5555558e2f00]{DISCONN_PEND}: state_chg to IDLE 20180309154726823 DLSCCP DEBUG sccp_scoc.c:420 SCCP-SCOC(2)[0x5555558e2f00]{IDLE}: Terminating (cause = OSMO_FSM_TERM_REQUEST) 20180309154726823 DLSCCP DEBUG sccp_scoc.c:420 SCCP-SCOC(2)[0x5555558e2f00]{IDLE}: Freeing instance 20180309154726823 DLSCCP DEBUG fsm.c:346 SCCP-SCOC(2)[0x5555558e2f00]{IDLE}: Deallocated Program received signal SIGSEGV, Segmentation fault. vty_dump_one_conn (vty=vty@entry=0x5555558e3a30, conn=conn@entry=0x5555558e3130) at ../../../../src/osmo-msc/src/libmsc/vty_interface_layer3.c:67 67 vty_out(vty, "%08x %22s %3s %5u %3u %08x %c /%1u %27s %s", (gdb) bt #0 vty_dump_one_conn (vty=vty@entry=0x5555558e3a30, conn=conn@entry=0x5555558e3130) at ../../../../src/osmo-msc/src/libmsc/vty_interface_layer3.c:67 #1 0x00005555555771d8 in subscr_dump_full_vty (vty=vty@entry=0x5555558e3a30, vsub=vsub@entry=0x5555558e4b90) at ../../../../src/osmo-msc/src/libmsc/vty_interface_layer3.c:211 #2 0x00005555555780b8 in show_subscr (self=<optimized out>, vty=0x5555558e3a30, argc=<optimized out>, argv=0x7fffffffd8a0) at ../../../../src/osmo-msc/src/libmsc/vty_interface_layer3.c:370 #3 0x00007ffff7bc3b44 in cmd_execute_command_real (vline=vline@entry=0x5555558e7180, vty=vty@entry=0x5555558e3a30, cmd=0x0) at ../../../../src/libosmocore/src/vty/command.c:2275 #4 0x00007ffff7bc57e1 in cmd_execute_command (vline=vline@entry=0x5555558e7180, vty=vty@entry=0x5555558e3a30, cmd=cmd@entry=0x0, vtysh=vtysh@entry=0) at ../../../../src/libosmocore/src/vty/command.c:2308 #5 0x00007ffff7bc7cb1 in vty_command (buf=<optimized out>, vty=0x5555558e3a30) at ../../../../src/libosmocore/src/vty/vty.c:420 #6 vty_execute (vty=0x5555558e3a30) at ../../../../src/libosmocore/src/vty/vty.c:684 #7 vty_read (vty=<optimized out>) at ../../../../src/libosmocore/src/vty/vty.c:1426 #8 0x00007ffff7bc9769 in client_data (fd=0x5555558e1878, what=1) at ../../../../src/libosmocore/src/vty/telnet_interface.c:135 #9 0x00007ffff732fda1 in osmo_fd_disp_fds (_eset=0x7fffffffe4d0, _wset=0x7fffffffe450, _rset=0x7fffffffe3d0) at ../../../src/libosmocore/src/select.c:216 #10 osmo_select_main (polling=<optimized out>) at ../../../src/libosmocore/src/select.c:256 #11 0x000055555555f84c in main (argc=1, argv=<optimized out>) at ../../../../src/osmo-msc/src/osmo-msc/msc_main.c:531 (gdb) p conn $1 = (const struct gsm_subscriber_connection *) 0x5555558e3130 (gdb) p *conn $2 = {entry = {next = 0x5555558e3230, prev = 0x5555558e7218}, use_count = 262405, use_tokens = 0, received_cm_service_request = 192, vsub = 0x5555558e3130, expire_timer_stopped = 113 'q', next_rp_ref = 0 '\000', conn_fsm = 0x7fff028508f0, silent_call = 0, mncc_rtp_bridge = 0, network = 0x0, in_release = 208, via_ran = (RAN_GERAN_A | unknown: 21844), classmark = {classmark1_set = false, classmark1 = {pwr_lev = 0 '\000', a5_1 = 0 '\000', es_ind = 0 '\000', rev_lev = 0 '\000', spare = 0 '\000'}, classmark2_len = 0 '\000', classmark2 = "\000\000", classmark3_len = 0 '\000', classmark3 = '\000' <repeats 13 times>}, lac = 0, encr = {alg_id = 112 'p', key_len = 248 '\370', key = "\312\366\377\177\000\000\004\000\000\000\000\000\000\000\000"}, n_sd_next = "\000\000\000", rtp = { mgcp_ctx = 0x0, mgcp_rtp_endpoint = 0, local_port_ran = 0, local_addr_ran = "\000\000\000\000\000\002\000\000\000\000\221\000\000\000\000", remote_port_ran = 0, remote_addr_ran = "\360\b\205\002\377\177\000\000\320\060\216UUU\000", local_port_cn = 13008, local_addr_cn = "\216UUU\000\000\000\000\000\000\000\000\000\000`2", remote_port_cn = 21902, remote_addr_cn = "UU", '\000' <repeats 13 times>}, iu = {ue_ctx = 0x7ffff6caefec, rab_id = 32 ' '}, a = {scu = 0x0, bsc_addr = {presence = 0, ri = OSMO_SCCP_RI_NONE, gt = {gti = 0 '\000', tt = 0 '\000', npi = 0, nai = 1435382576, digits = "UU\000\000\060\061\216UUU\000\000\027\001\004\000\000\000\000\000\300\062\216UUU\000\000\000\000\000"}, pc = 187, ssn = 113, ip = {v4 = {s_addr = 0}, v6 = { __in6_u = {__u6_addr8 = "\000\000\000\000\360\b\205\002\377\177\000\000\000\000\000", __u6_addr16 = {0, 0, 2288, 645, 32767, 0, 0, 0}, __u6_addr32 = {0, 42273008, 32767, 0}}}}}, conn_id = 0}} (gdb)Code state:
- I added some pointer information to the logs.
- I added a VTY command to page the subscriber (paging without an action cb).
unfortunately the DREF is missing from above log, I only see it in the gsmtap_log. There I clearly see a log of
"llist_del(conn = 0x5555558e3130)"
and later on in the gdb backtrace, that same pointer shows up from iterating gsmnet->subscr_conns. How is that possible?
(gdb) p conn $1 = (const struct gsm_subscriber_connection *) 0x5555558e3130
(No other conns have been established, so no other one has been put in its place)
Updated by neels about 6 years ago
- Status changed from New to In Progress
- Assignee set to neels
- % Done changed from 0 to 90
Ah, it's not from iterating the network->subscr_conns, it's from vsub->msc_conn_ref, which is still left stale!
https://gerrit.osmocom.org/7176
Updated by neels about 6 years ago
- Status changed from In Progress to Resolved
- % Done changed from 90 to 100
merged
Actions