Bug #3708
closedosmo-bsc: Wrong handling of PDCH DEACT NACK
100%
Description
Due to bug fixed in osmo-bts-trx 133a3d96dc07ebda4dfc7899dab9c0d0c80c9fea, osmo-bts-trx was sending a PDCH DEACT NACK followed by an ACK (in osmo-gsm-tester test dynts\:trx-b200+mod-bts0-dynts67-ipa+cfg-codec-fr-any).
Let's see how osmo-bsc behaves in that case.
After gprs, a phone call between the 2 MS is places, so the dynts which was in PDCH is selected and needs to be moved to TCH/F:
20181123034306831 DAS <0012> fsm.c:299 assignment(conn5_IMSI901700000023847)[0x612000001720]{WAIT_LCHAN_ACTIVE}: Allocated 20181123034306832 DAS <0012> fsm.c:329 assignment(conn5_IMSI901700000023847)[0x612000001720]{WAIT_LCHAN_ACTIVE}: is child of SUBSCR_CONN(conn5_IMSI901700000023847)[0x612000001ba0] 20181123034306832 DRLL <0000> lchan_select.c:159 (bts=0) lchan_select_by_type(TCH_F) 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F: (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4,state=IN_USE) is != TCH/F 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F: (bts=0,trx=0,ts=1,pchan=SDCCH8,state=UNUSED) is != TCH/F 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F: (bts=0,trx=0,ts=2,pchan=SDCCH8,state=UNUSED) is != TCH/F 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F: (bts=0,trx=0,ts=3,pchan=SDCCH8,state=UNUSED) is != TCH/F 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F: (bts=0,trx=0,ts=4,pchan=SDCCH8,state=UNUSED) is != TCH/F 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F: (bts=0,trx=0,ts=5,pchan=SDCCH8,state=UNUSED) is != TCH/F 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F: (bts=0,trx=0,ts=7,pchan_on_init=TCH/F_PDCH,pchan=PDCH,state=PDCH) is != TCH/F 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/H: (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4,state=IN_USE) is != TCH/H 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/H: (bts=0,trx=0,ts=1,pchan=SDCCH8,state=UNUSED) is != TCH/H 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/H: (bts=0,trx=0,ts=2,pchan=SDCCH8,state=UNUSED) is != TCH/H 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/H: (bts=0,trx=0,ts=3,pchan=SDCCH8,state=UNUSED) is != TCH/H 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/H: (bts=0,trx=0,ts=4,pchan=SDCCH8,state=UNUSED) is != TCH/H 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/H: (bts=0,trx=0,ts=5,pchan=SDCCH8,state=UNUSED) is != TCH/H 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/H: (bts=0,trx=0,ts=7,pchan_on_init=TCH/F_PDCH,pchan=PDCH,state=PDCH) is != TCH/H 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F_PDCH as TCH/F: (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4,state=IN_USE) is != TCH/F_PDCH 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F_PDCH as TCH/F: (bts=0,trx=0,ts=1,pchan=SDCCH8,state=UNUSED) is != TCH/F_PDCH 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F_PDCH as TCH/F: (bts=0,trx=0,ts=2,pchan=SDCCH8,state=UNUSED) is != TCH/F_PDCH 20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F_PDCH as TCH/F: (bts=0,trx=0,ts=3,pchan=SDCCH8,state=UNUSED) is != TCH/F_PDCH 20181123034306833 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F_PDCH as TCH/F: (bts=0,trx=0,ts=4,pchan=SDCCH8,state=UNUSED) is != TCH/F_PDCH 20181123034306833 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F_PDCH as TCH/F: (bts=0,trx=0,ts=5,pchan=SDCCH8,state=UNUSED) is != TCH/F_PDCH 20181123034306833 DRLL <0000> lchan_select.c:86 looking for lchan TCH/F_PDCH as TCH/F: (bts=0,trx=0,ts=7,pchan_on_init=TCH/F_PDCH,pchan=PDCH,state=PDCH) ss=0 is available after dyn PC HAN change
fsm state is changed to WAIT_PDCH_DEACT and PDCH DEACT is sent to osmo-bts-trx:
20181123034306833 DTS <0011> lchan_fsm.c:577 timeslot(0-0-7-TCH_F_PDCH)[0x612000008620]{PDCH}: Received Event TS_EV_LCHAN_REQUESTED 20181123034306833 DTS <0011> timeslot_fsm.c:347 timeslot(0-0-7-TCH_F_PDCH)[0x612000008620]{PDCH}: state_chg to WAIT_PDCH_DEACT 20181123034306833 DTS <0011> abis_rsl.c:2124 timeslot(0-0-7-TCH_F_PDCH)[0x612000008620]{WAIT_PDCH_DEACT}: (pchan_is=PDCH) Tx: ip.access dyn TS: PDCH DEACT
Then the (buggy in this case) NACK is received, and we move the state into BORKEN state, but the channel is not released at that time, it just keeps in that state. It should have been released at this time!
20181123034306840 DCHAN <0010> abis_rsl.c:1124 lchan(0-0-7-TCH_F_PDCH-0)[0x612000003520]{WAIT_TS_READY}: (type=TCH_F) Rx IPAC_PDCH_DEACT_NACK 20181123034306840 DTS <0011> abis_rsl.c:1098 timeslot(0-0-7-TCH_F_PDCH)[0x612000008620]{WAIT_PDCH_DEACT}: Received Event TS_EV_PDCH_DEACT_NACK 20181123034306840 DTS <0011> timeslot_fsm.c:172 timeslot(0-0-7-TCH_F_PDCH)[0x612000008620]{WAIT_PDCH_DEACT}: (pchan_is=PDCH) Received PDCH deactivation NACK 20181123034306840 DTS <0011> timeslot_fsm.c:176 timeslot(0-0-7-TCH_F_PDCH)[0x612000008620]{WAIT_PDCH_DEACT}: state_chg to BORKEN
Then the PDCH DEACT ACT arrives (due to bug mentioned in osmo-bts-trx). It is correctly handled as not permitted, since we already received the NACK. Atually, this FSM maybe should already be gone or should have started at list to close the channel by this time:
20181123034306880 DLMI <0017> input/ipaccess.c:251 RX 2: 08 4c 01 0f 20181123034306880 DCHAN <0010> abis_rsl.c:1124 lchan(0-0-7-TCH_F_PDCH-0)[0x612000003520]{WAIT_TS_READY}: (type=TCH_F) Rx IPAC_PDCH_DEACT_ACK 20181123034306880 DTS <0011> abis_rsl.c:1098 timeslot(0-0-7-TCH_F_PDCH)[0x612000008620]{BORKEN}: Received Event TS_EV_PDCH_DEACT_ACK 20181123034306880 DTS <0011> abis_rsl.c:1098 timeslot(0-0-7-TCH_F_PDCH)[0x612000008620]{BORKEN}: Event TS_EV_PDCH_DEACT_ACK not permitted
Finally, after some time, the timer T23001 fires and the conn is released. Why did it wait until timer to release it if we knew since the NACK that it was not possible to go on?! (This is actually TS6, but actually we see same behavior for TS7, see below).
20181123034308762 DCHAN <0010> fsm.c:189 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{WAIT_TS_READY}: Timeout of T23001 20181123034308762 DCHAN <0010> lchan_fsm.c:1297 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{WAIT_TS_READY}: (type=TCH_F) Handling failure, will then transition to state UNUSED 20181123034308762 DCHAN <0010> lchan_fsm.c:78 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{WAIT_TS_READY}: (type=TCH_F) lchan allocation failed in state WAIT_TS_READY: Timeout 20181123034308762 DCHAN <0010> lchan_fsm.c:107 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{WAIT_TS_READY}: (type=TCH_F) Signalling Assignment FSM of error (lchan allocation failed in state WAIT_TS_READY: Timeout) 20181123034308762 DAS <0012> lchan_fsm.c:1297 assignment(conn4_0-0-6-TCH_F_PDCHasPDCH-0)[0x6120000021a0]{WAIT_LCHAN_ACTIVE}: Received Event ASSIGNMENT_EV_LCHAN_ERROR 20181123034308762 DAS <0012> assignment_fsm.c:625 assignment(conn4_0-0-6-TCH_F_PDCHasPDCH-0)[0x6120000021a0]{WAIT_LCHAN_ACTIVE}: (bts=0,trx=0,ts=6,ss=0) Assignment failed in state WAIT_LCHAN_ACTIVE, cause RADIO INTERFACE MESSAGE FAILURE: Failed to activate lchan (bts=0,trx=0,ts=6,ss=0) 20181123034308762 DAS <0012> assignment_fsm.c:625 assignment(conn4_0-0-6-TCH_F_PDCHasPDCH-0)[0x6120000021a0]{WAIT_LCHAN_ACTIVE}: (bts=0,trx=0,ts=6,ss=0) incrementing rate counter: assignment:error Assigment failed for other reason.
Then cascade teardown happens:
20181123034308763 DAS <0012> assignment_fsm.c:127 assignment(conn4_0-0-6-TCH_F_PDCHasPDCH-0)[0x6120000021a0]{WAIT_LCHAN_ACTIVE}: (bts=0,trx=0,ts=6,ss=0) Assignment failed 20181123034308763 DAS <0012> assignment_fsm.c:128 assignment(conn4_0-0-6-TCH_F_PDCHasPDCH-0)[0x6120000021a0]{WAIT_LCHAN_ACTIVE}: Terminating (cause = OSMO_FSM_TERM_ERROR) 20181123034308763 DAS <0012> assignment_fsm.c:128 assignment(conn4_0-0-6-TCH_F_PDCHasPDCH-0)[0x6120000021a0]{WAIT_LCHAN_ACTIVE}: Removing from parent SUBSCR_CONN(conn4)[0x612000002620] 20181123034308763 DCHAN <0010> lchan_fsm.c:1324 lchan_rtp(0-0-6-TCH_F_PDCH-0)[0x612000002020]{WAIT_LCHAN_READY}: Received Event LCHAN_RTP_EV_ROLLBACK 20181123034308763 DCHAN <0010> lchan_rtp_fsm.c:223 lchan_rtp(0-0-6-TCH_F_PDCH-0)[0x612000002020]{WAIT_LCHAN_READY}: Terminating (cause = OSMO_FSM_TERM_REQUEST) 20181123034308764 DCHAN <0010> lchan_rtp_fsm.c:223 lchan_rtp(0-0-6-TCH_F_PDCH-0)[0x612000002020]{WAIT_LCHAN_READY}: Removing from parent lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0] 20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:441 mgw-endpoint(conn4)[0x612000001ea0]{IN_USE}: (rtpbridge/0@mgw) CI[0] to-BTS CI=AA0C7641: DLCX :0: notify=NULL 20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:482 mgw-endpoint(conn4)[0x612000001ea0]{IN_USE}: (rtpbridge/0@mgw) CI[0] to-BTS CI=AA0C7641: DLCX :0: Scheduling 20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:485 mgw-endpoint(conn4)[0x612000001ea0]{IN_USE}: state_chg to WAIT_MGW_RESPONSE 20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:530 mgw-endpoint(conn4)[0x612000001ea0]{WAIT_MGW_RESPONSE}: (rtpbridge/0@mgw) CI[0] to-BTS CI=AA0C7641: Sending MGCP: DLCX AA0C7641 20181123034308764 DRLL <0000> mgcp_client_fsm.c:686 MGCP_CONN(to-BTS)[0x612000001d20]{ST_READY}: Received Event EV_DLCX 20181123034308764 DLMGCP <0023> mgcp_client.c:953 Queued 52 bytes for MGCP GW 20181123034308764 DRLL <0000> mgcp_client_fsm.c:347 MGCP_CONN(to-BTS)[0x612000001d20]{ST_READY}: state_chg to ST_DLCX_RESP 20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:625 mgw-endpoint(conn4)[0x612000001ea0]{WAIT_MGW_RESPONSE}: (rtpbridge/0@mgw) Sent messages: 1 20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:598 mgw-endpoint(conn4)[0x612000001ea0]{WAIT_MGW_RESPONSE}: (rtpbridge/0@mgw) CI in use: 0, waiting for response: 0 20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:603 mgw-endpoint(conn4)[0x612000001ea0]{WAIT_MGW_RESPONSE}: Terminating (cause = OSMO_FSM_TERM_REGULAR) 20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:603 mgw-endpoint(conn4)[0x612000001ea0]{WAIT_MGW_RESPONSE}: Removing from parent SUBSCR_CONN(conn4)[0x612000002620] 20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:603 mgw-endpoint(conn4)[0x612000001ea0]{WAIT_MGW_RESPONSE}: Freeing instance 20181123034308764 DRSL <0003> fsm.c:381 mgw-endpoint(conn4)[0x612000001ea0]{WAIT_MGW_RESPONSE}: Deallocated 20181123034308765 DMSC <0007> mgw_endpoint_fsm.c:603 SUBSCR_CONN(conn4)[0x612000002620]{ASSIGNMENT}: Received Event FORGET_MGW_ENDPOINT 20181123034308765 DCHAN <0010> lchan_rtp_fsm.c:742 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{WAIT_TS_READY}: Received Event LCHAN_EV_RTP_RELEASED 20181123034308765 DCHAN <0010> lchan_rtp_fsm.c:223 lchan_rtp(0-0-6-TCH_F_PDCH-0)[0x612000002020]{WAIT_LCHAN_READY}: Freeing instance 20181123034308765 DCHAN <0010> fsm.c:381 lchan_rtp(0-0-6-TCH_F_PDCH-0)[0x612000002020]{WAIT_LCHAN_READY}: Deallocated 20181123034308765 DCHAN <0010> lchan_rtp_fsm.c:223 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{WAIT_TS_READY}: Received Event LCHAN_EV_RTP_RELEASED 20181123034308765 DCHAN <0010> lchan_fsm.c:1338 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{WAIT_TS_READY}: transition to state WAIT_RLL_RTP_RELEASED not permitted! 20181123034308765 DAS <0012> assignment_fsm.c:128 assignment(conn4_0-0-6-TCH_F_PDCHasPDCH-0)[0x6120000021a0]{WAIT_LCHAN_ACTIVE}: Freeing instance 20181123034308765 DAS <0012> fsm.c:381 assignment(conn4_0-0-6-TCH_F_PDCHasPDCH-0)[0x6120000021a0]{WAIT_LCHAN_ACTIVE}: Deallocated 20181123034308765 DMSC <0007> assignment_fsm.c:128 SUBSCR_CONN(conn4)[0x612000002620]{ASSIGNMENT}: Received Event ASSIGNMENT_END 20181123034308765 DMSC <0007> bsc_subscr_conn_fsm.c:399 SUBSCR_CONN(conn4)[0x612000002620]{ASSIGNMENT}: state_chg to ACTIVE 20181123034308765 DCHAN <0010> lchan_fsm.c:1297 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{WAIT_TS_READY}: state_chg to UNUSED 20181123034308765 DCHAN <0010> lchan_fsm.c:371 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{UNUSED}: (type=TCH_F) Clearing lchan state 20181123034308765 DTS <0011> lchan_fsm.c:404 timeslot(0-0-6-TCH_F_PDCH)[0x6120000087a0]{BORKEN}: Received Event TS_EV_LCHAN_UNUSED 20181123034308766 DLMGCP <0023> mgcp_client.c:749 Tx MGCP msg to MGCP GW: 'DLCX 3 rtpbridge/0@mgw MGCP 1.0' 20181123034308766 DLMGCP <0023> mgcp_client.c:751 Sending msg to MGCP GW size: 52 20181123034308766 DLINP <0015> stream.c:279 connected write 20181123034308766 DLINP <0015> stream.c:204 sending data 20181123034308769 DRLL <0000> mgcp_client_fsm.c:423 MGCP_CONN(to-BTS)[0x612000001d20]{ST_DLCX_RESP}: Received Event EV_DLCX_RESP 20181123034308769 DRLL <0000> mgcp_client_fsm.c:439 MGCP_CONN(to-BTS)[0x612000001d20]{ST_DLCX_RESP}: Terminating (cause = OSMO_FSM_TERM_REGULAR) 20181123034308769 DRLL <0000> mgcp_client_fsm.c:439 MGCP_CONN(to-BTS)[0x612000001d20]{ST_DLCX_RESP}: Freeing instance 20181123034308769 DRLL <0000> fsm.c:381 MGCP_CONN(to-BTS)[0x612000001d20]{ST_DLCX_RESP}: Deallocated
Similarly and even later, the other conn (other leg of the call, actually this is the one which we know failed initially, TS 7) fails due to timer expiring, while it should have been torn down a lot before:
20181123034311833 DCHAN <0010> fsm.c:189 lchan(0-0-7-TCH_F_PDCH-0)[0x612000003520]{WAIT_TS_READY}: Timeout of T23001 20181123034311833 DCHAN <0010> lchan_fsm.c:1297 lchan(0-0-7-TCH_F_PDCH-0)[0x612000003520]{WAIT_TS_READY}: (type=TCH_F) Handling failure, will then transition to state UNUSED 20181123034311834 DCHAN <0010> lchan_fsm.c:78 lchan(0-0-7-TCH_F_PDCH-0)[0x612000003520]{WAIT_TS_READY}: (type=TCH_F) lchan allocation failed in state WAIT_TS_READY: Timeout 20181123034311834 DCHAN <0010> lchan_fsm.c:107 lchan(0-0-7-TCH_F_PDCH-0)[0x612000003520]{WAIT_TS_READY}: (type=TCH_F) Signalling Assignment FSM of error (lchan allocation failed in state WAIT_TS_READY: Timeout) 20181123034311834 DAS <0012> lchan_fsm.c:1297 assignment(conn5_IMSI901700000023847_0-0-7-TCH_F_PDCHasPDCH-0)[0x612000001720]{WAIT_LCHAN_ACTIVE}: Received Event ASSIGNMENT_EV_LCHAN_ERROR 20181123034311834 DAS <0012> assignment_fsm.c:625 assignment(conn5_IMSI901700000023847_0-0-7-TCH_F_PDCHasPDCH-0)[0x612000001720]{WAIT_LCHAN_ACTIVE}: (bts=0,trx=0,ts=7,ss=0) Assignment failed in state WAIT_LCHAN_ACTIVE, cause RADIO INTERFACE MESSAGE FAILURE: Failed to activate lchan (bts=0,trx=0,ts=7,ss=0) 20181123034311834 DAS <0012> assignment_fsm.c:625 assignment(conn5_IMSI901700000023847_0-0-7-TCH_F_PDCHasPDCH-0)[0x612000001720]{WAIT_LCHAN_ACTIVE}: (bts=0,trx=0,ts=7,ss=0) incrementing rate counter: assignment:error Assigment failed for other reason. 20181123034311834 DMSC <0007> osmo_bsc_sigtran.c:367 Tx MSC: BSSMAP: ASSIGNMENT FAIL ....
Files
Related issues