Project

General

Profile

Actions

Bug #3708

closed

osmo-bsc: Wrong handling of PDCH DEACT NACK

Added by pespin over 5 years ago. Updated almost 4 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
A-bis RSL
Start date:
11/26/2018
Due date:
% Done:

100%

Spec Reference:

Description

Due to bug fixed in osmo-bts-trx 133a3d96dc07ebda4dfc7899dab9c0d0c80c9fea, osmo-bts-trx was sending a PDCH DEACT NACK followed by an ACK (in osmo-gsm-tester test dynts\:trx-b200+mod-bts0-dynts67-ipa+cfg-codec-fr-any).

Let's see how osmo-bsc behaves in that case.

After gprs, a phone call between the 2 MS is places, so the dynts which was in PDCH is selected and needs to be moved to TCH/F:

20181123034306831 DAS <0012> fsm.c:299 assignment(conn5_IMSI901700000023847)[0x612000001720]{WAIT_LCHAN_ACTIVE}: Allocated
20181123034306832 DAS <0012> fsm.c:329 assignment(conn5_IMSI901700000023847)[0x612000001720]{WAIT_LCHAN_ACTIVE}: is child of SUBSCR_CONN(conn5_IMSI901700000023847)[0x612000001ba0]
20181123034306832 DRLL <0000> lchan_select.c:159 (bts=0) lchan_select_by_type(TCH_F)
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F: (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4,state=IN_USE) is != TCH/F
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F: (bts=0,trx=0,ts=1,pchan=SDCCH8,state=UNUSED) is != TCH/F
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F: (bts=0,trx=0,ts=2,pchan=SDCCH8,state=UNUSED) is != TCH/F
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F: (bts=0,trx=0,ts=3,pchan=SDCCH8,state=UNUSED) is != TCH/F
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F: (bts=0,trx=0,ts=4,pchan=SDCCH8,state=UNUSED) is != TCH/F
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F: (bts=0,trx=0,ts=5,pchan=SDCCH8,state=UNUSED) is != TCH/F
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F: (bts=0,trx=0,ts=7,pchan_on_init=TCH/F_PDCH,pchan=PDCH,state=PDCH) is != TCH/F
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/H: (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4,state=IN_USE) is != TCH/H
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/H: (bts=0,trx=0,ts=1,pchan=SDCCH8,state=UNUSED) is != TCH/H
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/H: (bts=0,trx=0,ts=2,pchan=SDCCH8,state=UNUSED) is != TCH/H
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/H: (bts=0,trx=0,ts=3,pchan=SDCCH8,state=UNUSED) is != TCH/H
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/H: (bts=0,trx=0,ts=4,pchan=SDCCH8,state=UNUSED) is != TCH/H
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/H: (bts=0,trx=0,ts=5,pchan=SDCCH8,state=UNUSED) is != TCH/H
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/H: (bts=0,trx=0,ts=7,pchan_on_init=TCH/F_PDCH,pchan=PDCH,state=PDCH) is != TCH/H
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F_PDCH as TCH/F: (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4,state=IN_USE) is != TCH/F_PDCH
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F_PDCH as TCH/F: (bts=0,trx=0,ts=1,pchan=SDCCH8,state=UNUSED) is != TCH/F_PDCH
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F_PDCH as TCH/F: (bts=0,trx=0,ts=2,pchan=SDCCH8,state=UNUSED) is != TCH/F_PDCH
20181123034306832 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F_PDCH as TCH/F: (bts=0,trx=0,ts=3,pchan=SDCCH8,state=UNUSED) is != TCH/F_PDCH
20181123034306833 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F_PDCH as TCH/F: (bts=0,trx=0,ts=4,pchan=SDCCH8,state=UNUSED) is != TCH/F_PDCH
20181123034306833 DRLL <0000> lchan_select.c:71 looking for lchan TCH/F_PDCH as TCH/F: (bts=0,trx=0,ts=5,pchan=SDCCH8,state=UNUSED) is != TCH/F_PDCH
20181123034306833 DRLL <0000> lchan_select.c:86 looking for lchan TCH/F_PDCH as TCH/F: (bts=0,trx=0,ts=7,pchan_on_init=TCH/F_PDCH,pchan=PDCH,state=PDCH) ss=0 is available after dyn PC
HAN change

fsm state is changed to WAIT_PDCH_DEACT and PDCH DEACT is sent to osmo-bts-trx:

20181123034306833 DTS <0011> lchan_fsm.c:577 timeslot(0-0-7-TCH_F_PDCH)[0x612000008620]{PDCH}: Received Event TS_EV_LCHAN_REQUESTED
20181123034306833 DTS <0011> timeslot_fsm.c:347 timeslot(0-0-7-TCH_F_PDCH)[0x612000008620]{PDCH}: state_chg to WAIT_PDCH_DEACT
20181123034306833 DTS <0011> abis_rsl.c:2124 timeslot(0-0-7-TCH_F_PDCH)[0x612000008620]{WAIT_PDCH_DEACT}: (pchan_is=PDCH) Tx: ip.access dyn TS: PDCH DEACT

Then the (buggy in this case) NACK is received, and we move the state into BORKEN state, but the channel is not released at that time, it just keeps in that state. It should have been released at this time!

20181123034306840 DCHAN <0010> abis_rsl.c:1124 lchan(0-0-7-TCH_F_PDCH-0)[0x612000003520]{WAIT_TS_READY}: (type=TCH_F) Rx IPAC_PDCH_DEACT_NACK
20181123034306840 DTS <0011> abis_rsl.c:1098 timeslot(0-0-7-TCH_F_PDCH)[0x612000008620]{WAIT_PDCH_DEACT}: Received Event TS_EV_PDCH_DEACT_NACK
20181123034306840 DTS <0011> timeslot_fsm.c:172 timeslot(0-0-7-TCH_F_PDCH)[0x612000008620]{WAIT_PDCH_DEACT}: (pchan_is=PDCH) Received PDCH deactivation NACK
20181123034306840 DTS <0011> timeslot_fsm.c:176 timeslot(0-0-7-TCH_F_PDCH)[0x612000008620]{WAIT_PDCH_DEACT}: state_chg to BORKEN

Then the PDCH DEACT ACT arrives (due to bug mentioned in osmo-bts-trx). It is correctly handled as not permitted, since we already received the NACK. Atually, this FSM maybe should already be gone or should have started at list to close the channel by this time:

20181123034306880 DLMI <0017> input/ipaccess.c:251 RX 2: 08 4c 01 0f
20181123034306880 DCHAN <0010> abis_rsl.c:1124 lchan(0-0-7-TCH_F_PDCH-0)[0x612000003520]{WAIT_TS_READY}: (type=TCH_F) Rx IPAC_PDCH_DEACT_ACK
20181123034306880 DTS <0011> abis_rsl.c:1098 timeslot(0-0-7-TCH_F_PDCH)[0x612000008620]{BORKEN}: Received Event TS_EV_PDCH_DEACT_ACK
20181123034306880 DTS <0011> abis_rsl.c:1098 timeslot(0-0-7-TCH_F_PDCH)[0x612000008620]{BORKEN}: Event TS_EV_PDCH_DEACT_ACK not permitted

Finally, after some time, the timer T23001 fires and the conn is released. Why did it wait until timer to release it if we knew since the NACK that it was not possible to go on?! (This is actually TS6, but actually we see same behavior for TS7, see below).

20181123034308762 DCHAN <0010> fsm.c:189 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{WAIT_TS_READY}: Timeout of T23001
20181123034308762 DCHAN <0010> lchan_fsm.c:1297 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{WAIT_TS_READY}: (type=TCH_F) Handling failure, will then transition to state UNUSED
20181123034308762 DCHAN <0010> lchan_fsm.c:78 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{WAIT_TS_READY}: (type=TCH_F) lchan allocation failed in state WAIT_TS_READY: Timeout
20181123034308762 DCHAN <0010> lchan_fsm.c:107 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{WAIT_TS_READY}: (type=TCH_F) Signalling Assignment FSM of error (lchan allocation failed in state WAIT_TS_READY: Timeout)
20181123034308762 DAS <0012> lchan_fsm.c:1297 assignment(conn4_0-0-6-TCH_F_PDCHasPDCH-0)[0x6120000021a0]{WAIT_LCHAN_ACTIVE}: Received Event ASSIGNMENT_EV_LCHAN_ERROR
20181123034308762 DAS <0012> assignment_fsm.c:625 assignment(conn4_0-0-6-TCH_F_PDCHasPDCH-0)[0x6120000021a0]{WAIT_LCHAN_ACTIVE}: (bts=0,trx=0,ts=6,ss=0) Assignment failed in state WAIT_LCHAN_ACTIVE, cause RADIO INTERFACE MESSAGE FAILURE: Failed to activate lchan (bts=0,trx=0,ts=6,ss=0)
20181123034308762 DAS <0012> assignment_fsm.c:625 assignment(conn4_0-0-6-TCH_F_PDCHasPDCH-0)[0x6120000021a0]{WAIT_LCHAN_ACTIVE}: (bts=0,trx=0,ts=6,ss=0) incrementing rate counter: assignment:error Assigment failed for other reason.

Then cascade teardown happens:

20181123034308763 DAS <0012> assignment_fsm.c:127 assignment(conn4_0-0-6-TCH_F_PDCHasPDCH-0)[0x6120000021a0]{WAIT_LCHAN_ACTIVE}: (bts=0,trx=0,ts=6,ss=0) Assignment failed
20181123034308763 DAS <0012> assignment_fsm.c:128 assignment(conn4_0-0-6-TCH_F_PDCHasPDCH-0)[0x6120000021a0]{WAIT_LCHAN_ACTIVE}: Terminating (cause = OSMO_FSM_TERM_ERROR)
20181123034308763 DAS <0012> assignment_fsm.c:128 assignment(conn4_0-0-6-TCH_F_PDCHasPDCH-0)[0x6120000021a0]{WAIT_LCHAN_ACTIVE}: Removing from parent SUBSCR_CONN(conn4)[0x612000002620]
20181123034308763 DCHAN <0010> lchan_fsm.c:1324 lchan_rtp(0-0-6-TCH_F_PDCH-0)[0x612000002020]{WAIT_LCHAN_READY}: Received Event LCHAN_RTP_EV_ROLLBACK
20181123034308763 DCHAN <0010> lchan_rtp_fsm.c:223 lchan_rtp(0-0-6-TCH_F_PDCH-0)[0x612000002020]{WAIT_LCHAN_READY}: Terminating (cause = OSMO_FSM_TERM_REQUEST)
20181123034308764 DCHAN <0010> lchan_rtp_fsm.c:223 lchan_rtp(0-0-6-TCH_F_PDCH-0)[0x612000002020]{WAIT_LCHAN_READY}: Removing from parent lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]
20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:441 mgw-endpoint(conn4)[0x612000001ea0]{IN_USE}: (rtpbridge/0@mgw) CI[0] to-BTS CI=AA0C7641: DLCX :0: notify=NULL
20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:482 mgw-endpoint(conn4)[0x612000001ea0]{IN_USE}: (rtpbridge/0@mgw) CI[0] to-BTS CI=AA0C7641: DLCX :0: Scheduling
20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:485 mgw-endpoint(conn4)[0x612000001ea0]{IN_USE}: state_chg to WAIT_MGW_RESPONSE
20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:530 mgw-endpoint(conn4)[0x612000001ea0]{WAIT_MGW_RESPONSE}: (rtpbridge/0@mgw) CI[0] to-BTS CI=AA0C7641: Sending MGCP: DLCX AA0C7641
20181123034308764 DRLL <0000> mgcp_client_fsm.c:686 MGCP_CONN(to-BTS)[0x612000001d20]{ST_READY}: Received Event EV_DLCX
20181123034308764 DLMGCP <0023> mgcp_client.c:953 Queued 52 bytes for MGCP GW
20181123034308764 DRLL <0000> mgcp_client_fsm.c:347 MGCP_CONN(to-BTS)[0x612000001d20]{ST_READY}: state_chg to ST_DLCX_RESP
20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:625 mgw-endpoint(conn4)[0x612000001ea0]{WAIT_MGW_RESPONSE}: (rtpbridge/0@mgw) Sent messages: 1
20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:598 mgw-endpoint(conn4)[0x612000001ea0]{WAIT_MGW_RESPONSE}: (rtpbridge/0@mgw) CI in use: 0, waiting for response: 0
20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:603 mgw-endpoint(conn4)[0x612000001ea0]{WAIT_MGW_RESPONSE}: Terminating (cause = OSMO_FSM_TERM_REGULAR)
20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:603 mgw-endpoint(conn4)[0x612000001ea0]{WAIT_MGW_RESPONSE}: Removing from parent SUBSCR_CONN(conn4)[0x612000002620]
20181123034308764 DRSL <0003> mgw_endpoint_fsm.c:603 mgw-endpoint(conn4)[0x612000001ea0]{WAIT_MGW_RESPONSE}: Freeing instance
20181123034308764 DRSL <0003> fsm.c:381 mgw-endpoint(conn4)[0x612000001ea0]{WAIT_MGW_RESPONSE}: Deallocated
20181123034308765 DMSC <0007> mgw_endpoint_fsm.c:603 SUBSCR_CONN(conn4)[0x612000002620]{ASSIGNMENT}: Received Event FORGET_MGW_ENDPOINT
20181123034308765 DCHAN <0010> lchan_rtp_fsm.c:742 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{WAIT_TS_READY}: Received Event LCHAN_EV_RTP_RELEASED
20181123034308765 DCHAN <0010> lchan_rtp_fsm.c:223 lchan_rtp(0-0-6-TCH_F_PDCH-0)[0x612000002020]{WAIT_LCHAN_READY}: Freeing instance
20181123034308765 DCHAN <0010> fsm.c:381 lchan_rtp(0-0-6-TCH_F_PDCH-0)[0x612000002020]{WAIT_LCHAN_READY}: Deallocated
20181123034308765 DCHAN <0010> lchan_rtp_fsm.c:223 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{WAIT_TS_READY}: Received Event LCHAN_EV_RTP_RELEASED
20181123034308765 DCHAN <0010> lchan_fsm.c:1338 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{WAIT_TS_READY}: transition to state WAIT_RLL_RTP_RELEASED not permitted!
20181123034308765 DAS <0012> assignment_fsm.c:128 assignment(conn4_0-0-6-TCH_F_PDCHasPDCH-0)[0x6120000021a0]{WAIT_LCHAN_ACTIVE}: Freeing instance
20181123034308765 DAS <0012> fsm.c:381 assignment(conn4_0-0-6-TCH_F_PDCHasPDCH-0)[0x6120000021a0]{WAIT_LCHAN_ACTIVE}: Deallocated
20181123034308765 DMSC <0007> assignment_fsm.c:128 SUBSCR_CONN(conn4)[0x612000002620]{ASSIGNMENT}: Received Event ASSIGNMENT_END
20181123034308765 DMSC <0007> bsc_subscr_conn_fsm.c:399 SUBSCR_CONN(conn4)[0x612000002620]{ASSIGNMENT}: state_chg to ACTIVE
20181123034308765 DCHAN <0010> lchan_fsm.c:1297 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{WAIT_TS_READY}: state_chg to UNUSED
20181123034308765 DCHAN <0010> lchan_fsm.c:371 lchan(0-0-6-TCH_F_PDCH-0)[0x6120000036a0]{UNUSED}: (type=TCH_F) Clearing lchan state
20181123034308765 DTS <0011> lchan_fsm.c:404 timeslot(0-0-6-TCH_F_PDCH)[0x6120000087a0]{BORKEN}: Received Event TS_EV_LCHAN_UNUSED
20181123034308766 DLMGCP <0023> mgcp_client.c:749 Tx MGCP msg to MGCP GW: 'DLCX 3 rtpbridge/0@mgw MGCP 1.0'
20181123034308766 DLMGCP <0023> mgcp_client.c:751 Sending msg to MGCP GW size: 52
20181123034308766 DLINP <0015> stream.c:279 connected write
20181123034308766 DLINP <0015> stream.c:204 sending data
20181123034308769 DRLL <0000> mgcp_client_fsm.c:423 MGCP_CONN(to-BTS)[0x612000001d20]{ST_DLCX_RESP}: Received Event EV_DLCX_RESP
20181123034308769 DRLL <0000> mgcp_client_fsm.c:439 MGCP_CONN(to-BTS)[0x612000001d20]{ST_DLCX_RESP}: Terminating (cause = OSMO_FSM_TERM_REGULAR)
20181123034308769 DRLL <0000> mgcp_client_fsm.c:439 MGCP_CONN(to-BTS)[0x612000001d20]{ST_DLCX_RESP}: Freeing instance
20181123034308769 DRLL <0000> fsm.c:381 MGCP_CONN(to-BTS)[0x612000001d20]{ST_DLCX_RESP}: Deallocated

Similarly and even later, the other conn (other leg of the call, actually this is the one which we know failed initially, TS 7) fails due to timer expiring, while it should have been torn down a lot before:

20181123034311833 DCHAN <0010> fsm.c:189 lchan(0-0-7-TCH_F_PDCH-0)[0x612000003520]{WAIT_TS_READY}: Timeout of T23001
20181123034311833 DCHAN <0010> lchan_fsm.c:1297 lchan(0-0-7-TCH_F_PDCH-0)[0x612000003520]{WAIT_TS_READY}: (type=TCH_F) Handling failure, will then transition to state UNUSED
20181123034311834 DCHAN <0010> lchan_fsm.c:78 lchan(0-0-7-TCH_F_PDCH-0)[0x612000003520]{WAIT_TS_READY}: (type=TCH_F) lchan allocation failed in state WAIT_TS_READY: Timeout
20181123034311834 DCHAN <0010> lchan_fsm.c:107 lchan(0-0-7-TCH_F_PDCH-0)[0x612000003520]{WAIT_TS_READY}: (type=TCH_F) Signalling Assignment FSM of error (lchan allocation failed in state WAIT_TS_READY: Timeout)
20181123034311834 DAS <0012> lchan_fsm.c:1297 assignment(conn5_IMSI901700000023847_0-0-7-TCH_F_PDCHasPDCH-0)[0x612000001720]{WAIT_LCHAN_ACTIVE}: Received Event ASSIGNMENT_EV_LCHAN_ERROR
20181123034311834 DAS <0012> assignment_fsm.c:625 assignment(conn5_IMSI901700000023847_0-0-7-TCH_F_PDCHasPDCH-0)[0x612000001720]{WAIT_LCHAN_ACTIVE}: (bts=0,trx=0,ts=7,ss=0) Assignment failed in state WAIT_LCHAN_ACTIVE, cause RADIO INTERFACE MESSAGE FAILURE: Failed to activate lchan (bts=0,trx=0,ts=7,ss=0)
20181123034311834 DAS <0012> assignment_fsm.c:625 assignment(conn5_IMSI901700000023847_0-0-7-TCH_F_PDCHasPDCH-0)[0x612000001720]{WAIT_LCHAN_ACTIVE}: (bts=0,trx=0,ts=7,ss=0) incrementing rate counter: assignment:error Assigment failed for other reason.
20181123034311834 DMSC <0007> osmo_bsc_sigtran.c:367 Tx MSC: BSSMAP: ASSIGNMENT FAIL
....


Files

merge.pcap merge.pcap 1.46 MB pespin, 11/26/2018 05:17 PM
osmo-bsc.log osmo-bsc.log 442 KB pespin, 11/26/2018 05:18 PM

Related issues

Related to OsmoBTS - Bug #3706: osmo-bts-trx: PDCH DEACT ACK is sent before receiving response from TRXResolvedpespin11/23/2018

Actions
Related to OsmoMSC - Feature #3236: Rx Assignment Failure from BSC does nothingResolvedneels05/04/2018

Actions
Actions

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)