Bug #2898
closedOsmoBSC can generate BSSMAP ASSIGNMENT FAIL after BSSMAP ASSIGNMENT COMPLETE
0%
Description
I've just observed that there is at least one scenario in which we are sending a BSSMAP ASSIGNMENT FAIL after we've already confirmed it's successful
(this is with laforge/fsm branch):
Mon Jan 29 22:22:17 2018 DLINP <0013> input/ipa.c:263 accept()ed new link from 127.0.0.1 to port 3003 Mon Jan 29 22:22:17 2018 DRSL <0004> bsc_init.c:332 bootstrapping RSL for BTS/TRX (0/0) on ARFCN 871 using MCC=1 MNC=1 LAC=1 CID=0 BSIC=63 Mon Jan 29 22:22:19 2018 DRSL <0004> abis_rsl.c:1891 BTS 0 CHAN RQD: reason: answer to paging (ra=0x23, neci=0x01, chreq_reason=0x01) Mon Jan 29 22:22:19 2018 DRLL <0000> chan_alloc.c:412 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) Allocating lchan=0 as SDCCH Mon Jan 29 22:22:19 2018 DRSL <0004> abis_rsl.c:1973 (bts=0,trx=0,ts=0,ss=0) Activating ARFCN(871) SS(0) lctype SDCCH r=PAGING ra=0x23 ta=0 Mon Jan 29 22:22:19 2018 DRSL <0004> abis_rsl.c:597 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) Tx RSL Channel Activate with act_type=INITIAL Mon Jan 29 22:22:19 2018 DRSL <0004> abis_rsl.c:626 (bts=0,trx=0,ts=0,ss=0) state NONE -> ACTIVATION REQUESTED Mon Jan 29 22:22:19 2018 DRSL <0004> abis_rsl.c:1637 (bts=0,trx=0,ts=0,ss=0) CHANNEL ACTIVATE ACK Mon Jan 29 22:22:19 2018 DRSL <0004> abis_rsl.c:1289 (bts=0,trx=0,ts=0,ss=0) state ACTIVATION REQUESTED -> ACTIVE Mon Jan 29 22:22:19 2018 DRLL <0000> abis_rsl.c:2177 (bts=0,trx=0,ts=0,ss=0) SAPI=0 ESTABLISH INDICATION Mon Jan 29 22:22:19 2018 DMSC <0008> fsm.c:243 SUBSCR_CONN[0x5631ee52f2f0]{INIT}: Allocated Mon Jan 29 22:22:19 2018 DMSC <0008> osmo_bsc_api.c:225 Tx MSC COMPL L3 Mon Jan 29 22:22:19 2018 DMSC <0008> osmo_bsc_sigtran.c:251 Initializing resources for new SIGTRAN connection to MSC: RI=SSN_PC,PC=0.23.1,SSN=BSSAP... Mon Jan 29 22:22:19 2018 DMSC <0008> osmo_bsc_api.c:303 SUBSCR_CONN[0x5631ee52f2f0]{INIT}: Received Event MO-CONNECT.req Mon Jan 29 22:22:19 2018 DMSC <0008> osmo_bsc_sigtran.c:293 Allocated new connection id: 39 Mon Jan 29 22:22:19 2018 DMSC <0008> osmo_bsc_sigtran.c:297 Opening new SIGTRAN connection (id=39) to MSC: RI=SSN_PC,PC=0.23.1,SSN=BSSAP Mon Jan 29 22:22:19 2018 DMSC <0008> bsc_subscr_conn_fsm.c:106 SUBSCR_CONN[0x5631ee52f2f0]{INIT}: state_chg to WAIT_CC Mon Jan 29 22:22:19 2018 DMSC <0008> osmo_bsc_sigtran.c:187 N-CONNECT.cnf(39, ) Mon Jan 29 22:22:19 2018 DMSC <0008> osmo_bsc_sigtran.c:190 SUBSCR_CONN[0x5631ee52f2f0]{WAIT_CC}: Received Event MO-CONNET.cfm Mon Jan 29 22:22:19 2018 DMSC <0008> bsc_subscr_conn_fsm.c:138 SUBSCR_CONN[0x5631ee52f2f0]{WAIT_CC}: state_chg to ACTIVE Mon Jan 29 22:22:19 2018 DMSC <0008> osmo_bsc_sigtran.c:204 N-DATA.ind(39, 00 11 01 0b 03 01 08 01 7c 06 01 02 03 04 09 26 7d 01 80 ) Mon Jan 29 22:22:19 2018 DMSC <0008> a_reset.c:187 A-RESET(msc-0)[0x5631ee52d8f0]{CONN}: Received Event 2 Mon Jan 29 22:22:19 2018 DMSC <0008> osmo_bsc_bssap.c:917 Rx MSC DT1 BSSMAP ASSIGNMENT REQ Mon Jan 29 22:22:19 2018 DMSC <0008> osmo_bsc_bssap.c:817 Found matching audio type: full rate SPEECH_V1 for channel_type = { ch_indctr=0x1, ch_rate_type=0x8, perm_spch=[ 01 ] } Mon Jan 29 22:22:19 2018 DMSC <0008> osmo_bsc_mgcp.c:283 SUBSCR_CONN[0x5631ee52f2f0]{ACTIVE}: Received Event ASSIGNMENT_CMD Mon Jan 29 22:22:19 2018 DMSC <0008> bsc_subscr_conn_fsm.c:164 SUBSCR_CONN[0x5631ee52f2f0]{ACTIVE}: ASSIGNMENT_CMD(full_rate=1, chan_mode=1) Mon Jan 29 22:22:19 2018 DRLL <0000> chan_alloc.c:412 (bts=0,trx=0,ts=1,pchan=TCH/F) Allocating lchan=0 as TCH_F Mon Jan 29 22:22:19 2018 DRSL <0004> abis_rsl.c:597 (bts=0,trx=0,ts=1,pchan=TCH/F) Tx RSL Channel Activate with act_type=INTRA_NORM_ASS Mon Jan 29 22:22:19 2018 DRSL <0004> abis_rsl.c:626 (bts=0,trx=0,ts=1,ss=0) state NONE -> ACTIVATION REQUESTED Mon Jan 29 22:22:19 2018 DMSC <0008> bsc_subscr_conn_fsm.c:167 SUBSCR_CONN[0x5631ee52f2f0]{ACTIVE}: state_chg to WAIT_ASS_CMPL Mon Jan 29 22:22:19 2018 DRSL <0004> abis_rsl.c:1637 (bts=0,trx=0,ts=1,ss=0) CHANNEL ACTIVATE ACK Mon Jan 29 22:22:19 2018 DRSL <0004> abis_rsl.c:1289 (bts=0,trx=0,ts=1,ss=0) state ACTIVATION REQUESTED -> ACTIVE Mon Jan 29 22:22:19 2018 DMSC <0008> bsc_api.c:916 Sending assignment on chan: 0x7f882be7d328 Mon Jan 29 22:22:19 2018 DRLL <0000> abis_rsl.c:2177 (bts=0,trx=0,ts=1,ss=0) SAPI=0 DATA INDICATION Mon Jan 29 22:22:19 2018 DRLL <0000> chan_alloc.c:538 (bts=0,trx=0,ts=0,ss=0) starting release sequence Mon Jan 29 22:22:19 2018 DRSL <0004> chan_alloc.c:539 (bts=0,trx=0,ts=0,ss=0) state ACTIVE -> RELEASE REQUESTED Mon Jan 29 22:22:19 2018 DRLL <0000> abis_rsl.c:1205 (bts=0,trx=0,ts=0,ss=0) RSL RLL RELEASE REQ (link_id=0x00, reason=1) Mon Jan 29 22:22:19 2018 DRSL <0004> abis_rsl.c:2400 (bts=0,trx=0,ts=1,ss=0) IPAC_BIND speech_mode=0x10 RTP_PAYLOAD=3 Mon Jan 29 22:22:19 2018 DMSC <0008> osmo_bsc_api.c:443 POSTPONE MSC ASSIGN COMPL Mon Jan 29 22:22:19 2018 DRLL <0000> abis_rsl.c:2177 (bts=0,trx=0,ts=0,ss=0) SAPI=0 RELEASE CONFIRMATION Mon Jan 29 22:22:19 2018 DRSL <0004> abis_rsl.c:2567 (bts=0,trx=0,ts=1,ss=0) IPAC_CRCX_ACK LOCAL_IP=127.0.0.2 LOCAL_PORT=9002 CON_ID=1 RTP_PAYLOAD2=0x03 Mon Jan 29 22:22:19 2018 DMSC <0008> osmo_bsc_audio.c:55 Connecting BTS to port: 10002 conn: 1 Mon Jan 29 22:22:19 2018 DRSL <0004> abis_rsl.c:2439 (bts=0,trx=0,ts=1,ss=0) IPAC_MDCX IP=127.0.0.3 PORT=10002 RTP_PAYLOAD=3 RTP_PAYLOAD2=3 CONN_ID=1 speech_mode=0x00 Mon Jan 29 22:22:19 2018 DRSL <0004> abis_rsl.c:2578 (bts=0,trx=0,ts=1,ss=0) IPAC_MDCX_ACK CON_ID=1 RTP_PAYLOAD2=0x03 REMOTE_IP=127.0.0.3 REMOTE_PORT=10002 Mon Jan 29 22:22:19 2018 DMSC <0008> osmo_bsc_audio.c:94 Tx MSC ASSIGN COMPL (POSTPONED) Mon Jan 29 22:22:19 2018 DMSC <0008> osmo_bsc_bssap.c:1055 Sending assignment complete message... (id=39) Mon Jan 29 22:22:19 2018 DMSC <0008> osmo_bsc_sigtran.c:326 Tx MSC ASSIGNMENT COMPL Mon Jan 29 22:22:19 2018 DMSC <0008> osmo_bsc_sigtran.c:347 Sending connection (id=39) oriented data to MSC: RI=SSN_PC,PC=0.23.1,SSN=BSSAP (00 14 02 15 00 21 98 2c 01 40 01 7c 06 7f 00 00 03 2a fa 7e 01 80 ) Mon Jan 29 22:22:21 2018 DRSL <0004> abis_rsl.c:1771 (bts=0,trx=0,ts=0,ss=0) T3111 expired: releasing RF Channel Mon Jan 29 22:22:21 2018 DRSL <0004> abis_rsl.c:873 (bts=0,trx=0,ts=0,ss=0) RF Channel Release Mon Jan 29 22:22:21 2018 DRSL <0004> abis_rsl.c:943 (bts=0,trx=0,ts=0,ss=0) RF CHANNEL RELEASE ACK Mon Jan 29 22:22:21 2018 DRSL <0004> abis_rsl.c:86 (bts=0,trx=0,ts=0,ss=0) state RELEASE REQUESTED -> NONE Mon Jan 29 22:22:25 2018 DMSC <0008> fsm.c:182 SUBSCR_CONN[0x5631ee52f2f0]{WAIT_ASS_CMPL}: Timeout of T10 Mon Jan 29 22:22:25 2018 DMSC <0008> osmo_bsc_sigtran.c:326 Tx MSC ASSIGNMENT FAIL Mon Jan 29 22:22:25 2018 DMSC <0008> osmo_bsc_sigtran.c:347 Sending connection (id=39) oriented data to MSC: RI=SSN_PC,PC=0.23.1,SSN=BSSAP (00 04 03 04 01 01 ) Mon Jan 29 22:22:25 2018 DMSC <0008> bsc_subscr_conn_fsm.c:447 SUBSCR_CONN[0x5631ee52f2f0]{WAIT_ASS_CMPL}: state_chg to ACTIVE Mon Jan 29 22:22:31 2018 DMSC <0008> osmo_bsc_sigtran.c:204 N-DATA.ind(39, 00 0c 53 0a 09 01 00 01 02 03 04 05 06 07 ) Mon Jan 29 22:22:31 2018 DMSC <0008> a_reset.c:187 A-RESET(msc-0)[0x5631ee52d8f0]{CONN}: Received Event 2 Mon Jan 29 22:22:31 2018 DMSC <0008> osmo_bsc_bssap.c:917 Rx MSC DT1 BSSMAP CIPHER MODE CMD Mon Jan 29 22:22:31 2018 DRSL <0004> bsc_api.c:774 (subscr unknown) Cipher Mode: cipher=0 key=0001020304050607 include_imeisv=1 Mon Jan 29 22:22:31 2018 DRLL <0000> abis_rsl.c:2177 (bts=0,trx=0,ts=1,ss=0) SAPI=0 DATA INDICATION Mon Jan 29 22:22:31 2018 DMSC <0008> osmo_bsc_api.c:167 CIPHER MODE COMPLETE from MS, forwarding to MSC Mon Jan 29 22:22:31 2018 DMSC <0008> osmo_bsc_sigtran.c:326 Tx MSC CIPHER MODE COMPLETE Mon Jan 29 22:22:31 2018 DMSC <0008> osmo_bsc_sigtran.c:347 Sending connection (id=39) oriented data to MSC: RI=SSN_PC,PC=0.23.1,SSN=BSSAP (00 03 55 2c 01 )
As the log shows, this is related to the fact that T10 is expiring, and also due to the fact that the MGCP FSM is doing things that are none of its concern (such as delaying the ASSIGNMENT COMPLETE). This needs to be untangled.
Files
Updated by dexter almost 6 years ago
I have checked this back. The current state of laforge/fsm is still uses the old implementation in osmo_bsc_mgcp.c, its very likely that this implementation produces wired side effects. However, I did not go too much into details there since the implementation is obsolete.
Instead I checked the code pathes in pmaier/fsm.
- There we GSM0808_T10_TIMER_NR is used only once. Thats on the transition from ST_WAIT_CRCX_BTS to ST_WAIT_ASS_CMPL. In gscon_fsm_wait_ass_cmpl() there is no assignment complete generated so far. So there it can not generate an early ASSIGNMENT COMPLETE.
- I also checked if it is possible to generate an assignment failure after the ASSIGNMENT COMPLETE has been sent. The ASSIGNMENT COMPLETE is generated by send_ass_compl(). The call is done when the MDCX for the MSC side completes in gscon_fsm_wait_crcx_msc(). Should the MGW time out we should never get a GSCON_EV_MGW_CRCX_RESP_MSC event and the ASSIGNMENT COMPLETE should never be generated. Instead the timeout handler takes responsibility of that and immediately sends the FSM to ST_ACTIVE. In case the MGW is just slow and we should get the GSCON_EV_MGW_CRCX_RESP_MSC evet in ST_ACTIVE, the event should be tossed and nothing should happen.
For now we should be safe but we should keep this open since the MGCP stuff is not yet sufficiently untangled from the MGCP handling.
Updated by laforge over 5 years ago
please investigate if this is still an issue after the fsm branch has been merged. Might be good to add a test?
Updated by dexter over 5 years ago
I had another look on this. The function send_ass_compl() is responsible for sending the ASSIGNMENT COMPLETE, while gsm0808_create_assignment_failure() followed by a sigtran_send() sends the ASSSIGNMENT FAILURE.
As far as I can see here the two functions are not called one after another in the code path of any state. Both functions are called when a state change to ST_ACTIVE happens, unless we are already in ST_ACTIVE, then the state remains unmodified. Once an assignment complete is done, the only way to get an assignment failure would be to perform another assignment which fails. Also important to note is that we change to ST_ACTIVE immediately after we call send_ass_compl(), there is no timeout that can strike and possibly cause an out of order ASSIGNMENT FAILURE.
From the TTCN3 test point of view we could simulate some failure cases for the ASSIGNMENT procedure. We would try all sorts of failures (MGCP) in order to see that the FSM always returns with an ASSIGNMENT FAILURE properly.
Updated by pespin about 5 years ago
The error cause of the message is "RR cause value: Protocol error unspecified (111)". As per "3GPP TS 04.08" "3.4.3.3 Abnormal cases":
On the mobile station side, if a lower layer failure happens on the new channel before the ASSIGNMENT COMPLETE message has been sent, the mobile station deactivates the new channels, reactivates the old channels, reconnects the TCHs if any and triggers the establishment of the main signalling link. It then sends a ASSIGNMENT FAILURE message, cause "protocol error unspecified" on the main DCCH and resumes the normal operation, as if no assignment attempt had occurred. The operational parameters (e.g. ciphering mode) when returning on the old channel are those applied before the procedure.
Not sure what the issue may be though. Maybe the clocks of the 2 nanoBTS boards are not synced correctly? I am not sure how are the clocks calibrated or if they need calibration at all in nanoBTS.
Updated by laforge about 5 years ago
On Thu, Oct 04, 2018 at 10:29:29AM +0000, pespin [REDMINE] wrote:
Not sure what the issue may be though. Maybe the clocks of the 2 nanoBTS boards are not synced correctly? I am not sure how are the clocks calibrated or if they need calibration at all in nanoBTS.
they need to be calibrated, it can be triggered/executed via OML, similar to ipaccess-config.
I thought we had that implemented. It's part of the "perform test" functions, IIRC.
To my surprise, the wiki doesn't seem to contain any related information, which is sad.
However, in any case, the multi-TRX setup uses only one clock anyway as it is interconnected
by the TIB (timing interface bus). So for multi-TRX it doesn't matter.
However, for hand-over between two distinct BTSs, they should all have their OCXO
calibrated to the same reference (whether sysmoBTS or nanoBTS).