Project

General

Profile

Actions

Bug #6208

closed

modem: "UL_TBF[0x55a5d78de0]{ASSIGN}: Event RX_UL_ACK_NACK not permitted"

Added by fixeria 5 months ago. Updated 5 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
OsmocomBB modem (host)
Target version:
Start date:
10/06/2023
Due date:
% Done:

0%

Resolution:
Spec Reference:
Tags:

Description

I am observing this error while testing on the mssdr-ms host:

20231005173418051 DLLC DEBUG llc.c:145 modem_llc_prim_down_cb(): Rx GRR-UNITDATA.request ll=[01 c0 09 08 16 08 09 10 10 00 00 00 10 10 2e d7 b8 ]
20231005173418051 DRLCMAC INFO rlcmac_prim.c:522 Rx from upper layers: GRR-UNITDATA.request
20231005173418092 DRLCMAC INFO fsm.c:456 UL_TBF[0x55a5d72f00]{NEW}: Allocated  
20231005173418092 DRLCMAC INFO fsm.c:456 UL_TBF_ASS[0x55a5d725d0]{IDLE}: Allocated
20231005173418092 DRLCMAC INFO tbf_ul_ass_fsm.c:711 UL_TBF_ASS[0x55a5d725d0]{IDLE}: Received Event START_FROM_DL_TBF   
20231005173418092 DRLCMAC INFO tbf_ul_ass_fsm.c:301 UL_TBF[0x55a5d72f00]{NEW}: Received Event UL_ASS_START
20231005173418092 DRLCMAC INFO tbf_ul_fsm.c:148 UL_TBF[0x55a5d72f00]{NEW}: state_chg to ASSIGN  
20231005173418092 DRLCMAC INFO tbf_ul_ass_fsm.c:303 UL_TBF_ASS[0x55a5d725d0]{IDLE}: state_chg to WAIT_PKT_UL_ASS    
20231005173418188 DRLCMAC INFO rlcmac.c:793 TS=3 FN=1551883 Rx Pkt UL ACK/NACK
20231005173418189 DRLCMAC INFO tbf_ul.c:317 UL_TBF[0x55a5d72f00]{ASSIGN}: Received Event RX_UL_ACK_NACK
20231005173418189 DRLCMAC ERROR tbf_ul.c:317 UL_TBF[0x55a5d72f00]{ASSIGN}: Event RX_UL_ACK_NACK not permitted    
20231005173418189 DRLCMAC INFO tbf_ul.c:273 TBF(UL:NR-2:TLLI-91223344) Tx CS update: CS-1 -> CS-2
20231005173418230 DRLCMAC INFO rlcmac.c:793 TS=3 FN=1551892 Rx Pkt DL Dummy Ctrl Block    
20231005173418231 DRLCMAC INFO fsm.c:568 UL_TBF_ASS[0x55a5d725d0]{WAIT_PKT_UL_ASS}: Deallocated
20231005173418231 DRLCMAC INFO tbf_ul_fsm.c:74 UL_TBF[0x55a5d72f00]{ASSIGN}: Send L1CTL-CFG_UL_TBF.req ul_tbf_nr=2 (release)      
20231005173418231 DRLCMAC INFO fsm.c:568 UL_TBF[0x55a5d72f00]{ASSIGN}: Deallocated     

Please find the PCAPs (BTS side + MS side) attached.


Files

mssdr_gprs_ms.pcap mssdr_gprs_ms.pcap 1.19 MB fixeria, 10/05/2023 05:40 PM
mssdr_gprs_bts.pcapng mssdr_gprs_bts.pcapng 21.6 MB fixeria, 10/05/2023 05:40 PM

Related issues

Related to OsmocomBB - Bug #6209: modem: no response to ICMP Echo Req (ping to MS) with mssdr-msResolvedpespin10/06/2023

Actions
Actions #1

Updated by fixeria 5 months ago

As can be seen from the MS side PCAP (GSMTAP logging), there is another weird problem:

20231005173417913 DRLCMAC INFO tbf_ul.c:1161 UL_TBF[0x55a5d72f00]{FLOW}: Received Event FIRST_UL_DATA_SENT
20231005173417913 DRLCMAC INFO tbf_ul_fsm.c:179 UL_TBF[0x55a5d72f00]{FLOW}: First UL block sent, stop T3164
20231005173417913 DRLCMAC INFO tbf_ul.c:1163 UL_TBF[0x55a5d72f00]{FLOW}: Received Event LAST_UL_DATA_SENT
20231005173417913 DRLCMAC INFO tbf_ul_fsm.c:105 UL_TBF[0x55a5d72f00]{FLOW}: Last UL block sent (CV=0), start T3182
20231005173417913 DRLCMAC INFO tbf_ul_fsm.c:201 UL_TBF[0x55a5d72f00]{FLOW}: state_chg to FINISHED
20231005173417949 DRLCMAC INFO tbf_ul.c:1163 UL_TBF[0x55a5d72f00]{FINISHED}: Received Event LAST_UL_DATA_SENT      
20231005173417949 DRLCMAC INFO tbf_ul_fsm.c:241 UL_TBF[0x55a5d72f00]{FINISHED}: Data block with CV=0 retransmit attempts=1
20231005173417949 DRLCMAC INFO tbf_ul_fsm.c:105 UL_TBF[0x55a5d72f00]{FINISHED}: Last UL block sent (CV=0), start T3182
20231005173417972 DRLCMAC INFO tbf_ul.c:1163 UL_TBF[0x55a5d72f00]{FINISHED}: Received Event LAST_UL_DATA_SENT   
20231005173417972 DRLCMAC INFO tbf_ul_fsm.c:241 UL_TBF[0x55a5d72f00]{FINISHED}: Data block with CV=0 retransmit attempts=2
20231005173417973 DRLCMAC INFO tbf_ul_fsm.c:105 UL_TBF[0x55a5d72f00]{FINISHED}: Last UL block sent (CV=0), start T3182
20231005173417991 DRLCMAC INFO tbf_ul.c:1163 UL_TBF[0x55a5d72f00]{FINISHED}: Received Event LAST_UL_DATA_SENT
20231005173417991 DRLCMAC INFO tbf_ul_fsm.c:241 UL_TBF[0x55a5d72f00]{FINISHED}: Data block with CV=0 retransmit attempts=3
20231005173417991 DRLCMAC INFO tbf_ul_fsm.c:105 UL_TBF[0x55a5d72f00]{FINISHED}: Last UL block sent (CV=0), start T3182
20231005173418009 DRLCMAC INFO tbf_ul.c:1163 UL_TBF[0x55a5d72f00]{FINISHED}: Received Event LAST_UL_DATA_SENT
20231005173418009 DRLCMAC INFO tbf_ul_fsm.c:241 UL_TBF[0x55a5d72f00]{FINISHED}: Data block with CV=0 retransmit attempts=4    
20231005173418009 DRLCMAC INFO tbf_ul_fsm.c:105 UL_TBF[0x55a5d72f00]{FINISHED}: Last UL block sent (CV=0), start T3182
20231005173418010 DRLCMAC NOTICE tbf_ul_fsm.c:245 UL_TBF[0x55a5d72f00]{FINISHED}: TBF establishment failure (Data block with CV=0 retransmit attempts=4)
20231005173418010 DRLCMAC INFO fsm.c:568 UL_TBF_ASS[0x55a5d725d0]{IDLE}: Deallocated
20231005173418010 DRLCMAC INFO tbf_ul_fsm.c:74 UL_TBF[0x55a5d72f00]{FINISHED}: Send L1CTL-CFG_UL_TBF.req ul_tbf_nr=1 (release)
20231005173418010 DRLCMAC INFO fsm.c:568 UL_TBF[0x55a5d72f00]{FINISHED}: Deallocated
Actions #2

Updated by fixeria 5 months ago

  • Related to Bug #6209: modem: no response to ICMP Echo Req (ping to MS) with mssdr-ms added
Actions #3

Updated by pespin 5 months ago

Regarding your 1st comment, it's another issue some I'm ignoring it here now. Describing the issue from the initial ticket, which also has a pcap mssdr_gprs_bts.pcapng:

  • frame_nr=44323 MS sends first and last UL block (CV=0) of the UL TBF containing the GMM Identity Respnse (IMEI). Network uses the already existing DL TBF to send the next GMM Identity Request (IMSI).
  • frame_nr=44393 PCU sends PKT UL ACK/NACK FinalACK=1 [RRBP] at gsm_fn=1551849 telling the MS it received all the data, and expects PKT CTRL ACK at N+13=1551862
  • [Missing: PKT CTRL ACK transmitted by MS to PCU at gsm_fn=1551862, it was probably dropped]
  • frame_nr=44860 PCU sends PKT UPLINK ACK/NACK [RRBP] which confirms the MS didn't receive the PKT CTRL ACK, so it is trying to ack everything again.
  • frame_nr=44875 MS uses the PKT DL ACK/NACK to set "Channel_request_description" field and hence ask for a new UL TBF.
  • frame_nr=44938 PCU sends PKT UL ASS [RRBP] to assign a new UL TBF.
  • frame_nr=45329 MS PKT CTRL ACKs the PKT UL ASS.

So the main problem here is that the MS is finishing the UL TBF after sending the PKT CTRL ACK, and if the PCU doesn't received that PKT CTRL ACK, both sides become out of sync: MS thinks the UL TBF is finished, and PCU tries to ACK everything again because it doesn't know if the PKT CTRL ACK was lost or if MS didn't receive the PKT UL ACK/NACK in first place.

So while PCU is trying to finish the UL TBF, MS considers it finished and asks PCU for a new UL TBF using the existing DL TBF (through DL ACK/NACK in frame_nr=44875). That frame_nr=44875 (PKT DL ACK/NACK) is the responisble for triggering this:

20231005173418051 DLLC DEBUG llc.c:145 modem_llc_prim_down_cb(): Rx GRR-UNITDATA.request ll=[01 c0 09 08 16 08 09 10 10 00 00 00 10 10 2e d7 b8 ]
20231005173418051 DRLCMAC INFO rlcmac_prim.c:522 Rx from upper layers: GRR-UNITDATA.request
20231005173418092 DRLCMAC INFO fsm.c:456 UL_TBF[0x55a5d72f00]{NEW}: Allocated  
20231005173418092 DRLCMAC INFO fsm.c:456 UL_TBF_ASS[0x55a5d725d0]{IDLE}: Allocated
20231005173418092 DRLCMAC INFO tbf_ul_ass_fsm.c:711 UL_TBF_ASS[0x55a5d725d0]{IDLE}: Received Event START_FROM_DL_TBF   
20231005173418092 DRLCMAC INFO tbf_ul_ass_fsm.c:301 UL_TBF[0x55a5d72f00]{NEW}: Received Event UL_ASS_START
20231005173418092 DRLCMAC INFO tbf_ul_fsm.c:148 UL_TBF[0x55a5d72f00]{NEW}: state_chg to ASSIGN  
20231005173418092 DRLCMAC INFO tbf_ul_ass_fsm.c:303 UL_TBF_ASS[0x55a5d725d0]{IDLE}: state_chg to WAIT_PKT_UL_ASS

Then, since the PCU still keeps to ack the old UL TBF, MS says this when reciving it:

20231005173418188 DRLCMAC INFO rlcmac.c:793 TS=3 FN=1551883 Rx Pkt UL ACK/NACK
20231005173418189 DRLCMAC INFO tbf_ul.c:317 UL_TBF[0x55a5d72f00]{ASSIGN}: Received Event RX_UL_ACK_NACK
20231005173418189 DRLCMAC ERROR tbf_ul.c:317 UL_TBF[0x55a5d72f00]{ASSIGN}: Event RX_UL_ACK_NACK not permitted 

Because that's indeed for the older UL TBF.

Actions #4

Updated by pespin 5 months ago

TS 44.060 9.3.2.4.2 :

If the PACKET UPLINK ACK/NACK message has the Final Ack Indicator bit set to "1" and the mobile station does
not initiate the establishment of a new uplink TBF according to one of the procedures described above, the mobile
station shall transmit the PACKET CONTROL ACKNOWLEDGEMENT message and release the TBF.

so the MS is acting correctly in the sense that it's freeing the UL TBF after theoretically having sent the PKT CTRL ACK (which doesn't seem to arrive to the PCU).
The question to investigate is: why didn't the PKT CTRL ACK reach the PCU?

Actions #5

Updated by pespin 5 months ago

Logs from osmo-bts/osmo-pcu regarding the never received PKT CTRL ACK in gsm_fn=1551862

44767    17:34:18.095240221    Oct  5, 2023 19:34:18.095240221 CEST    127.0.0.1    54671    127.0.0.1    4729    GSMTAP    211    1551862/1170/00/34/38 (bts=0,trx=0,ts=3) PDTCH: Received PDTCH bid=0
44772    17:34:18.099987467    Oct  5, 2023 19:34:18.099987467 CEST    127.0.0.1    54671    127.0.0.1    4729    GSMTAP    211    1551863/1170/01/35/39 (bts=0,trx=0,ts=3) PDTCH: Received PDTCH bid=1
44779    17:34:18.104640189    Oct  5, 2023 19:34:18.104640189 CEST    127.0.0.1    54671    127.0.0.1    4729    GSMTAP    211    1551864/1170/02/36/40 (bts=0,trx=0,ts=3) PDTCH: Received PDTCH bid=2 
44807    17:34:18.109678724    Oct  5, 2023 19:34:18.109678724 CEST    127.0.0.1    54671    127.0.0.1    4729    GSMTAP    211    1551865/1170/03/37/41 (bts=0,trx=0,ts=3) PDTCH: Received PDTCH bid=3
44808    17:34:18.109869726    Oct  5, 2023 19:34:18.109869726 CEST    127.0.0.1    54671    127.0.0.1    4729    GSMTAP    245    1551865/1170/03/37/41 (bts=0,trx=0,ts=3) PDTCH: Received bad data (rc=-5, BER 0/0) ending at fn=81/104 
44809    17:34:18.109960165    Oct  5, 2023 19:34:18.109960165 CEST    127.0.0.1    54671    127.0.0.1    4729    GSMTAP    218    1551862/1170/00/34/38 Rx PH-DATA.ind chan_nr=PDCH on TS3 link_id=0x00 len=0
44810    17:34:18.110042684    Oct  5, 2023 19:34:18.110042684 CEST    127.0.0.1    54671    127.0.0.1    4729    GSMTAP    202    Sending data indication: sapi=PDTCH arfcn=979 block=6 data=

Actions #6

Updated by pespin 5 months ago

After looking at the MS pcap, it seems both issues reported in this ticket are actually related. The "TBF establishment failure (Data block with CV=0 retransmit attempts=4)" is the culprit.
That line basically says that the UL TBF is freed, at frame_nr=788.
Then, at frame_nr=801:

801      17:34:18.032486      Oct  5, 2023 19:34:18.032486000 CEST      127.0.0.1      42239      127.0.1.1      4729      GSMTAP      201      TS=3 FN=1551849 Rx Pkt UL ACK/NACK: UL_TBF TFI=0 not found

Since the TBF is gone due to retransmit problem of course it won't send a PKT CTRL ACK for that UL ACK/NACK.

So the main issue here is the fn-advance making the CV=0 retransmit 4 times and hence free the UL TBF.

Actions #7

Updated by pespin 5 months ago

  • Status changed from New to Feedback
  • Assignee changed from pespin to fixeria

The problem of the MS freeing the UL TBF after retransmitting 4 CV=0 blocks should be fixed here:
https://gerrit.osmocom.org/c/libosmo-gprs/+/34679 tbf_ul_fsm: Avoid freeing tbf after 4 CV=0 retrans in acknowledged mode

That limitation really only applied when in Unacknowledged operation mode, which we don't really support/use.

There's still the more general problem of huge delay between UL/DL (fn-advance/rts-advance), which means up to 4-5 DL blocks can be scheduled before having received the answer to the first one in PCU, which is quite a lot and for sure undesired. Assigning this ticket to fixeria since there was some extra testing he wanted to do regarding this topic.

Actions #8

Updated by fixeria 5 months ago

  • Status changed from Feedback to Resolved

Thanks, I cherry-picked the patch and deployed it on the mssdr-ms:

20231006213325367 DRLCMAC INFO tbf_ul.c:1161 UL_TBF[0x557d247f00]{FLOW}: Received Event FIRST_UL_DATA_SENT
20231006213325367 DRLCMAC INFO tbf_ul_fsm.c:179 UL_TBF[0x557d247f00]{FLOW}: First UL block sent, stop T3164
20231006213325368 DRLCMAC INFO tbf_ul.c:1163 UL_TBF[0x557d247f00]{FLOW}: Received Event LAST_UL_DATA_SENT
20231006213325368 DRLCMAC INFO tbf_ul_fsm.c:105 UL_TBF[0x557d247f00]{FLOW}: Last UL block sent (CV=0), start T3182
20231006213325368 DRLCMAC INFO tbf_ul_fsm.c:201 UL_TBF[0x557d247f00]{FLOW}: state_chg to FINISHED
20231006213325404 DRLCMAC INFO tbf_ul.c:1163 UL_TBF[0x557d247f00]{FINISHED}: Received Event LAST_UL_DATA_SENT
20231006213325404 DRLCMAC INFO tbf_ul_fsm.c:238 UL_TBF[0x557d247f00]{FINISHED}: Data block with CV=0 retransmit attempts=1
20231006213325404 DRLCMAC INFO tbf_ul_fsm.c:105 UL_TBF[0x557d247f00]{FINISHED}: Last UL block sent (CV=0), start T3182
20231006213325427 DRLCMAC INFO tbf_ul.c:1163 UL_TBF[0x557d247f00]{FINISHED}: Received Event LAST_UL_DATA_SENT
20231006213325427 DRLCMAC INFO tbf_ul_fsm.c:238 UL_TBF[0x557d247f00]{FINISHED}: Data block with CV=0 retransmit attempts=2
20231006213325427 DRLCMAC INFO tbf_ul_fsm.c:105 UL_TBF[0x557d247f00]{FINISHED}: Last UL block sent (CV=0), start T3182
20231006213325445 DRLCMAC INFO tbf_ul.c:1163 UL_TBF[0x557d247f00]{FINISHED}: Received Event LAST_UL_DATA_SENT
20231006213325446 DRLCMAC INFO tbf_ul_fsm.c:238 UL_TBF[0x557d247f00]{FINISHED}: Data block with CV=0 retransmit attempts=3
20231006213325446 DRLCMAC INFO tbf_ul_fsm.c:105 UL_TBF[0x557d247f00]{FINISHED}: Last UL block sent (CV=0), start T3182
20231006213325464 DRLCMAC INFO tbf_ul.c:1163 UL_TBF[0x557d247f00]{FINISHED}: Received Event LAST_UL_DATA_SENT
20231006213325464 DRLCMAC INFO tbf_ul_fsm.c:238 UL_TBF[0x557d247f00]{FINISHED}: Data block with CV=0 retransmit attempts=4
20231006213325464 DRLCMAC INFO tbf_ul_fsm.c:105 UL_TBF[0x557d247f00]{FINISHED}: Last UL block sent (CV=0), start T3182
20231006213325487 DRLCMAC INFO rlcmac.c:793 TS=3 FN=1655277 Rx Pkt UL ACK/NACK
20231006213325487 DRLCMAC INFO tbf_ul.c:317 UL_TBF[0x557d247f00]{FINISHED}: Received Event RX_UL_ACK_NACK
20231006213325487 DRLCMAC INFO tbf_ul_fsm.c:232 UL_TBF[0x557d247f00]{FINISHED}: state_chg to RELEASING

Things seem to look a lot better now. We were cutting it really close :)

  • The Rx Pkt UL ACK/NACK happens ~13 ms after the 4th retransmit attempt.
  • Without your patch we used to free the TBF right after the 4th retransmit attempt.
Actions

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)