Project

General

Profile

Actions

Bug #3062

closed

osmo-msc crash while running osmo-gsm-tester voice:nanobts

Added by pespin about 6 years ago. Updated almost 6 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
Start date:
03/13/2018
Due date:
% Done:

100%

Resolution:
Spec Reference:

Description

dmesg:

[3019686.470628] osmo-msc[5749]: segfault at 10 ip 00007fb7d25be03f sp 00007ffcf5f136e8 error 4 in libosmocore.so.9.0.0[7fb7d25aa000+22000]

Attached crash report with core file and binary used.

Content of osmo-msc_git_hashes.txt, containing git revision for each component in the rootfs:

b10ec0be5ffcd8759bb8b1461549a7eaf570bd9b libosmocore
d3b8b69977207235056f6033e5828796e5620049 libosmo-abis
9d726d0e56665b4439bb4efe782be06be2e83ffb libosmo-netif
0cad3b4538710a91881ddbb80424c2d7c62de962 libsmpp34
58fcc5344ea0e914c7f9be69c309998d243142bd libosmo-sccp
4f7613eeb1f5cbb107af11920f9270721a4720ae osmo-mgw
fa10eda6d70a4ecb37b6dfe629e665a16f8d3f59 osmo-msc


Files

crash.tar.gz crash.tar.gz 5.99 MB pespin, 03/13/2018 02:56 PM
run-success.tar.gz run-success.tar.gz 253 KB pespin, 03/13/2018 03:23 PM

Related issues

Related to OsmoMSC - Bug #3122: fix subscr_conn fsm: safely catch all compl-l3 failures, properly handle all release situationsResolvedneels03/28/2018

Actions
Actions #1

Updated by pespin about 6 years ago

Run same test scenario next time, everything runs successfully.

There can be seen a few difference from ofono point of view which may give hints on the difference of steps:
Crashing one:

15:02:39.919850 tst                        /sierra_1: DBG: Dialing: 2349
15:02:40.030812 tst                        /sierra_1: DBG: Adding /sierra_1/voicecall01 to call list
15:02:40.031886 tst                        /sierra_2: DBG: Waiting for incoming call from: 2348
15:02:40.035831 tst                        /sierra_1: DBG: 'org.ofono.VoiceCallManager'.CallAdded() -> /sierra_1/voicecall01="{'Name': '', 'RemoteHeld': False, 'RemoteMultiparty': False, 'Emergency': False, 'State': 'dialing', 'Multiparty': False, 'LineIdentification': '2349'}" 
15:02:40.036713 tst                        /sierra_1: DBG: Call already exists '/sierra_1/voicecall01'
15:02:44.063539 tst                        /sierra_2: DBG: 'org.ofono.VoiceCallManager'.CallAdded() -> /sierra_2/voicecall01="{'Name': '', 'RemoteHeld': False, 'RemoteMultiparty': False, 'Emergency': False, 'State': 'dialing', 'Multiparty': False, 'LineIdentification': '2348'}" 
15:02:45.085959 tst                 mo_mt_call.py:43: dial success
15:02:45.208064 tst                        /sierra_1: DBG: call state: alerting
15:02:45.219170 tst                        /sierra_2: DBG: call state: incoming
15:02:45.219887 tst                        /sierra_2: DBG: Answer call /sierra_2/voicecall01
15:02:45.230227 tst                        /sierra_2: DBG: call state: incoming
15:02:45.293917 tst                        /sierra_1: DBG: call state: alerting
15:02:46.307953 tst                        /sierra_1: DBG: call state: alerting
15:02:47.322194 tst                        /sierra_1: DBG: call state: alerting
15:02:48.336572 tst                        /sierra_1: DBG: call state: alerting
15:02:49.350422 tst                        /sierra_1: DBG: call state: alerting
15:02:50.364705 tst                        /sierra_1: DBG: call state: alerting
15:02:51.378847 tst                        /sierra_1: DBG: call state: alerting
15:02:52.382323 tst                        /sierra_2: DBG: 'org.ofono.VoiceCallManager'.CallRemoved() -> /sierra_2/voicecall01
15:02:52.394862 tst                        /sierra_1: DBG: call state: alerting
15:02:53.410070 tst                        /sierra_1: DBG: call state: alerting
15:02:54.424374 tst                        /sierra_1: DBG: call state: alerting
15:02:55.438432 tst                        /sierra_1: DBG: call state: alerting
15:02:56.452517 tst                        /sierra_1: DBG: call state: alerting
15:02:57.466964 tst                        /sierra_1: DBG: call state: alerting
15:02:58.481001 tst                        /sierra_1: DBG: call state: alerting
15:02:59.495322 tst                        /sierra_1: DBG: call state: alerting
15:03:00.497691 run    osmo-msc_10.42.42.6(pid=5749): DBG: Cleanup
15:03:00.498565 run    osmo-msc_10.42.42.6(pid=5749): ERR: Terminated: ERROR {rc=-11}  [trial↪voice:nanobts↪osmo-msc_10.42.42.6↪osmo-msc_10.42.42.6(pid=5749)]

Working one:

16:15:34.867474 tst                        /sierra_1: DBG: Dialing: 2351
16:15:34.982822 tst                        /sierra_1: DBG: Adding /sierra_1/voicecall01 to call list
16:15:34.983774 tst                        /sierra_2: DBG: Waiting for incoming call from: 2350
16:15:34.987333 tst                        /sierra_1: DBG: 'org.ofono.VoiceCallManager'.CallAdded() -> /sierra_1/voicecall01="{'State': 'dialing', 'RemoteHeld': False, 'LineIdentification': '2351', 'Emergency': False, 'Multiparty': False, 'RemoteMultiparty': False, 'Name': ''}" 
16:15:34.988151 tst                        /sierra_1: DBG: Call already exists '/sierra_1/voicecall01'
16:15:35.994692 tst                        /sierra_2: DBG: 'org.ofono.NetworkRegistration'.PropertyChanged() -> Strength=60
16:15:39.016523 tst                        /sierra_2: DBG: 'org.ofono.VoiceCallManager'.CallAdded() -> /sierra_2/voicecall01="{'State': 'dialing', 'RemoteHeld': False, 'LineIdentification': '2350', 'Emergency': False, 'Multiparty': False, 'RemoteMultiparty': False, 'Name': ''}" 
16:15:40.038712 tst                 mo_mt_call.py:43: dial success
16:15:40.170234 tst                        /sierra_1: DBG: call state: alerting
16:15:40.181010 tst                        /sierra_2: DBG: call state: incoming
16:15:40.181749 tst                        /sierra_2: DBG: Answer call /sierra_2/voicecall01
16:15:40.192153 tst                        /sierra_2: DBG: call state: incoming
16:15:40.243483 tst                        /sierra_1: DBG: call state: alerting
16:15:41.257555 tst                        /sierra_1: DBG: call state: active
16:15:41.269610 tst                        /sierra_2: DBG: call state: active
16:15:41.270376 tst                 mo_mt_call.py:49: answer success, call established and ongoing
16:15:46.404230 tst                        /sierra_1: DBG: call state: active
16:15:46.416582 tst                        /sierra_2: DBG: call state: active
16:15:46.417448 tst                        /sierra_2: DBG: Hang up call /sierra_2/voicecall01
16:15:46.471815 tst                        /sierra_1: DBG: call_id_list: ['/sierra_1/voicecall01']
16:15:47.475741 tst                        /sierra_2: DBG: 'org.ofono.VoiceCallManager'.CallRemoved() -> /sierra_2/voicecall01
16:15:47.476564 tst                        /sierra_1: DBG: 'org.ofono.VoiceCallManager'.CallRemoved() -> /sierra_1/voicecall01
16:15:47.477449 tst                        /sierra_1: DBG: call_id_list: []
16:15:47.477985 tst                        /sierra_2: DBG: call_id_list: []
16:15:47.478525 tst                 mo_mt_call.py:56: hangup success
16:15:47.593366 tst                    mo_mt_call.py: Test passed (212.5 sec)

Actions #2

Updated by neels about 6 years ago

  • Status changed from New to In Progress
  • Assignee set to neels
Actions #3

Updated by neels about 6 years ago

  • % Done changed from 0 to 30

Got the failure reproduced locally by a test added on branch neels/os3062.

The cause is that when receiving a BSSMAP Clear Request from the BSC, the MSC shoots the conn to the moon right away, while the T308 would still like to complete the CC Release -> CC Release Complete dance. We thus have a T308 timer running on an already freed connection, and once it fires 10 seconds later, it ends in a segfault.

Still need to figure out how to fix the situation among the intricacies of the conn FSM and the externally fired trans.cc.timer. A proper way might be to incorporate the timer we have always had into the relatively new subscriber conn FSM... but for now it would suffice to stay around until the timers are done. I believe the main fix is to trigger a less drastic event when we receive a BSSMAP Clear Request from the BSC.

Also a safeguard should make sure that no timers are running when the transaction is freed, and loudly error-log about it.

Actions #4

Updated by neels about 6 years ago

  • % Done changed from 30 to 40

got an effective safeguard in https://gerrit.osmocom.org/7273 which fixes the segfault, but the underlying failure to gracefully tear down is still pending.

Actions #5

Updated by neels about 6 years ago

  • Related to Bug #3122: fix subscr_conn fsm: safely catch all compl-l3 failures, properly handle all release situations added
Actions #6

Updated by neels almost 6 years ago

  • % Done changed from 40 to 90
Actions #7

Updated by neels almost 6 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 90 to 100
Actions

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)