Bug #3396

TC_ho_int is indicating port problem during handover

Added by dexter 5 months ago. Updated 4 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
Start date: 07/16/2018
Due date:
% Done: 100%
Spec Reference:

Description

Since the coverage has increased, TC_ho_int fails consistently. Surprisingly, TC_ho_int stops failing when the testcase is executed a second time without restarting osmo-bsc. Only a freshly restarted osmo-bsc seems to trigger the problem.

<0008> bsc_vty.c:1577 (bts=0,trx=0,ts=1,ss=0) (ARFCN 871) --> BTS 1 Manually triggering Handover from VTY
<0007> handover_logic.c:135 SUBSCR_CONN[0x55afd26d4eb0]{ACTIVE}: Received Event HO_START
<0007> bsc_subscr_conn_fsm.c:504 SUBSCR_CONN[0x55afd26d4eb0]{ACTIVE}: state_chg to WAIT_HO_COMPL
<0007> handover_logic.c:333 SUBSCR_CONN[0x55afd26d4eb0]{WAIT_HO_COMPL}: Received Event HO_COMPL
<0007> bsc_subscr_conn_fsm.c:801 SUBSCR_CONN[0x55afd26d4eb0]{WAIT_HO_COMPL}: state_chg to WAIT_MDCX_BTS_HO
<0000> mgcp_client_fsm.c:644 MGCP_CONN[0x55afd26f0e80]{ST_READY}: Cannot MDCX, port == 0
<0007> osmo_bsc_sigtran.c:325 Tx MSC CLEAR REQUEST
<0007> osmo_bsc_sigtran.c:346 Sending connection (id=1) oriented data to MSC: RI=SSN_PC,PC=0.23.1,SSN=BSSAP (00 04 22 04 01 20 )
<0007> bsc_subscr_conn_fsm.c:806 SUBSCR_CONN[0x55afd26d4eb0]{WAIT_MDCX_BTS_HO}: transition to state CLEARING not permitted!
<0007> osmo_bsc_audio.c:56 Connecting BTS to port: 10002 conn: 1
<0012> input/ipaccess.c:243 Sign link vanished, dead socket
<0012> input/ipaccess.c:71 Forcing socket shutdown with no signal link set
<0012> bts_ipaccess_nanobts.c:407 (bts=0) Dropping OML link.
<0014> osmo_bsc_main.c:352 Lost some E1 TEI link: 1 0x7f494abd8070
<0012> bts_ipaccess_nanobts.c:391 (bts=0,trx=0) Dropping RSL link.
<0014> osmo_bsc_main.c:352 Lost some E1 TEI link: 2 0x7f494abd8070
<0012> input/ipaccess.c:243 Sign link vanished, dead socket
<0012> input/ipaccess.c:71 Forcing socket shutdown with no signal link set
<0012> bts_ipaccess_nanobts.c:407 (bts=1) Dropping OML link.
<0014> osmo_bsc_main.c:352 Lost some E1 TEI link: 1 0x7f494aba2070
<0007> bsc_api.c:689 (bts=1,trx=0,ts=1,pchan=TCH/F) (ss=0,NONE) (IMSI:001019876543210) S_LCHAN_UNEXPECTED_RELEASE
<0007> osmo_bsc_api.c:457 Tx MSC CLEAR REQUEST
<0007> osmo_bsc_api.c:465 SUBSCR_CONN[0x55afd26d4eb0]{WAIT_MDCX_BTS_HO}: Received Event TX_SCCP
<0007> osmo_bsc_sigtran.c:325 Tx MSC CLEAR REQUEST
<0007> osmo_bsc_sigtran.c:346 Sending connection (id=1) oriented data to MSC: RI=SSN_PC,PC=0.23.1,SSN=BSSAP (00 04 22 04 01 01 )
<0012> bts_ipaccess_nanobts.c:391 (bts=1,trx=0) Dropping RSL link.
<0014> osmo_bsc_main.c:352 Lost some E1 TEI link: 2 0x7f494aba2070

History

#1 Updated by dexter 5 months ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 80

I managed to find the reason for the handover problem. I did not notice the problem at first because I kept osmo-bsc running while debugging the TTCN3 tests. When osmo-bsc is restarted the test fails, but as long as I kept osmo-bsc running the test passed. The reason was that the same lchan as in the previous test got re-used, and the struct members that hold the ip/port were still populated from the previous run (the IPACC negotiation did take place, but a moment too late).

So first of all I made it fail reliably by clearing those struct members in lchan_free():
https://gerrit.osmocom.org/#/c/osmo-bsc/+/10038 chan_alloc: delete rtp voice related in lchan_free()
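
To illustrate why clearing the members makes the failure reproducible, here is a minimal sketch. It is not the code from the gerrit change: `struct demo_lchan`, its fields and both functions are hypothetical stand-ins for the real lchan handling. The only point is that a re-used lchan must not carry the ip/port of the previous call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for the real lchan struct. */
struct demo_lchan {
	bool in_use;
	uint32_t rtp_ip;   /* BTS-side RTP address, 0 = not negotiated yet */
	uint16_t rtp_port; /* BTS-side RTP port, 0 = not negotiated yet */
};

/* Release the lchan and drop the RTP voice related state, so that the
 * next user of this lchan cannot see a stale ip/port. */
static void demo_lchan_free(struct demo_lchan *lchan)
{
	assert(lchan != NULL);
	lchan->in_use = false;
	lchan->rtp_ip = 0;
	lchan->rtp_port = 0;
}

/* True only once a port has been negotiated for the current use. */
static bool demo_lchan_rtp_ready(const struct demo_lchan *lchan)
{
	assert(lchan != NULL);
	return lchan->rtp_port != 0;
}
```

With this in place a second test run starts from port == 0, just like a freshly restarted osmo-bsc, so the late IPACC negotiation is no longer masked by leftover values.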

Since the IPACC negotiation completes a tiny bit too late, I added a signal handler to handover_logic that listens on SS_ABISIP for an IPACC CRCX. Before I signal to the GSCON FSM that the handover is done, I check whether the lchan struct holds a valid port. If it does, everything is ok and I proceed. If not, I set a flag. The next IPACC CRCX that comes along will then trigger the signal that tells the GSCON FSM that the handover is (finally) done.

https://gerrit.osmocom.org/#/c/osmo-bsc/+/10039 handover_logic: make sure IPACC is done before MGCP
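
The deferral logic described above can be sketched roughly as follows. This is an illustration under assumptions, not the content of change 10039: `struct demo_ho`, `demo_on_ho_complete()` and `demo_on_ipacc_crcx()` are hypothetical names, and a plain counter stands in for dispatching the event to the GSCON FSM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handover state; the real code keeps this in other structs. */
struct demo_ho {
	uint16_t rtp_port;            /* 0 until the IPACC CRCX delivers a port */
	bool ho_done_pending;         /* handover finished, but port still missing */
	unsigned int ho_done_signals; /* stand-in for events sent to the GSCON FSM */
};

/* Stand-in for telling the GSCON FSM that the handover is done. */
static void demo_signal_ho_done(struct demo_ho *ho)
{
	ho->ho_done_signals++;
}

/* Called when the handover itself has completed. */
static void demo_on_ho_complete(struct demo_ho *ho)
{
	assert(ho != NULL);
	if (ho->rtp_port != 0)
		demo_signal_ho_done(ho);    /* port known: proceed right away */
	else
		ho->ho_done_pending = true; /* port missing: wait for the CRCX */
}

/* Called from the (hypothetical) signal handler when an IPACC CRCX arrives. */
static void demo_on_ipacc_crcx(struct demo_ho *ho, uint16_t port)
{
	assert(ho != NULL);
	ho->rtp_port = port;
	if (ho->ho_done_pending) {
		ho->ho_done_pending = false;
		demo_signal_ho_done(ho);    /* the deferred "handover done" */
	}
}
```

Whichever of the two events arrives first, the FSM is notified exactly once, and only after a non-zero port is available, which is the condition the "Cannot MDCX, port == 0" log line complains about.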

The reason this problem went undetected, presumably for quite some time, is that as_media() did not verify whether there is an MDCX or not. Now it does, and it detects the problem. Unfortunately, as_handover also needed some fixing to make it more robust against the race between IPACC, MGCP and RSL.

https://gerrit.osmocom.org/#/c/osmo-ttcn3-hacks/+/10040 MSC_ConnectionHandler: make as_handover more race robust

#2 Updated by dexter 4 months ago

  • Status changed from In Progress to Resolved
  • % Done changed from 80 to 100

https://gerrit.osmocom.org/#/c/osmo-bsc/+/10039/ is not merged yet, but TC_ho_int is passing again, so let's abandon gerrit change 10039.
