Project

General

Profile

Bug #5033

N3101 potentially being updated only during RBBP poll but not during USF poll

Added by pespin 3 months ago. Updated 11 days ago.

Status:
Feedback
Priority:
Normal
Assignee:
Target version:
-
Start date:
02/17/2021
Due date:
% Done:

90%

Spec Reference:

Description

N3101 spec can be found in TS 44.060 "13.4 Counters on the Network side". Regarding possible values it states: "N3101max shall be greater than 8."

N3101 is set per-bts over PCUIF. In turn the BTS gets that N3101 value from BSC over OML (NM_ATT_IPACC_RLC_CFG). The value "10" is harcoded in osmo-bsc here:
https://git.osmocom.org/osmo-bsc/tree/src/osmo-bsc/bts_ipaccess_nanobts_omlattr.c#n170

Something alarming though... spec says: "the network will increment counter N3101 for each USF for which no data is received for this TBF.", and from looking at the code I have the feeling we are only increasing it only during tbf::poll_timeout(), but not each time that an UL packet from the tbf was expected (eg data ul packets, polled by USF). It looks as if N3101 was implementing N3105 in mind instead (which triggers T3195 which also ends up calling tbf_timeout_free()).

N3105 seems to be implemented (also increasing in tbf::poll_timeout()), but it may not be necessarily being increased in all RRBP cases (eg. CTRL ACKins a Packet Cell Change Continue). A dl_tbf::is_waiting_for_rrbp(curr_fn, curr_ts); function covering all cases would be welcome here.

First of all, I'd look at whether poll_timeout() is called when the BTS doesn't receive data for a given MS which was asked to transmit by USF. That can be done easily by writing a TTCN3 case in PCU_Tests.
AFAIU poll_timeout() should only be called when a poll FN is reserved by means of RRBP, but not USF (see sched_select_uplink() in gprs_rlcmac_sched.cpp).

To see other current polling scheduler issues, see #5020.


Related issues

Related to OsmoPCU - Bug #5020: osmo-pcu: Poll and SBA allocation improvements and fixesResolved02/11/2021

Related to OsmoPCU - Bug #3928: Missing PCU_Tests.ttcn for various timers / time-outsIn Progress04/15/2019

Associated revisions

Revision f30fa1d2 (diff)
Added by pespin 2 months ago

pcu: Introduce test TC_n3101_max_t3169

This test currently fails to pass in master osmo-pcu (and latest) due to
T3169 not being implemented exactly as per specs (due to limitations in
detecting lost UL blocks with assigned USF).

Related: OS#5033
Change-Id: I56177850f084cdaf4fcac63ebdcdff9cef4e7a5d

Revision 81c549d5 (diff)
Added by pespin about 2 months ago

direct_phy: Support submitting DATA.ind with len=0 to upper layers

Since recently (see Depends below), BTS side submits DATA.ind with len=0
to announce nothing was received on that UL block FN. This will allow
osmo-pcu track time more accurately, and use this information to quickly
find out if a UL block was expected as requested by RRBP or USF poll and
increment counters such as N3101 (finally being able to properly
implement timers such as T3619).

This patch does the same for direct phy feature, where the osmo-pcu
process receives the DATA.ind directly from the DSP.

Depends: osmo-bts.git Change-Id I343c7a721dab72411edbca816c8864926bc329fb

Related: OS#5033
Change-Id: I9a835e16ef0e5a68c003a93d1a33233aa43464ae

Revision f0b54b83 (diff)
Added by pespin about 1 month ago

Set matching USF if available when polling a UL TBF

When the scheduler detects it's time to receive a UL block due to a
scheduled poll, if that polling is done on a UL TBF, then use its USF if
available instead of using USF_UNUSED (=7) when sending a DL block on
that same FN.

This is not really needed for correct work, since MS take care
themselves of scheduling a UL block when they receive the poll (RRBP)
some time before, and don't check the USF at the time of transmitting.
In any case, it helps understand better when looking at pcap traces that
indeed it a UL block from that MS was requested, instead of setting USF
to 7.

Related: OS#5033
Change-Id: I2ad9d8ea6afc8f83192033470bd27010a7474430

Revision 96dca55d (diff)
Added by pespin about 1 month ago

Track scheduled UL blocks through USF

This way PCU can now detect whether scheduled UL blocks through USF
were never received. This allows in a follow-up patch to start
increasing N3101 properly.

Related: OS#5033
Change-Id: Ia99c9edad6e5bd837e9baeb4fb2683b227887957

Revision cdc5a829 (diff)
Added by pespin about 1 month ago

Track scheduled UL blocks through USF

This way PCU can now detect whether scheduled UL blocks through USF
were never received. This allows in a follow-up patch to start
increasing N3101 properly.

Related: OS#5033
Change-Id: Ia99c9edad6e5bd837e9baeb4fb2683b227887957

Revision c62663d4 (diff)
Added by pespin about 1 month ago

Properly implement N3101

N3101 is incremented by unanswered USF requests, not from unanswered
POLLs.

Related: OS#5033
Change-Id: I1a55bdd39db8843976915b9f74fadb0942298413

Revision 7bd92a3e (diff)
Added by pespin about 1 month ago

Set matching USF if available when polling a UL TBF

When the scheduler detects it's time to receive a UL block due to a
scheduled poll, if that polling is done on a UL TBF, then use its USF if
available instead of using USF_UNUSED (=7) when sending a DL block on
that same FN.

This is not really needed for correct work, since MS take care
themselves of scheduling a UL block when they receive the poll (RRBP)
some time before, and don't check the USF at the time of transmitting.
In any case, it helps understand better when looking at pcap traces that
indeed it a UL block from that MS was requested, instead of setting USF
to 7.

Related: OS#5033
Change-Id: I2ad9d8ea6afc8f83192033470bd27010a7474430

Revision c1f38c7f (diff)
Added by pespin about 1 month ago

Track scheduled UL blocks through USF

This way PCU can now detect whether scheduled UL blocks through USF
were never received. This allows in a follow-up patch to start
increasing N3101 properly.

Related: OS#5033
Change-Id: Ia99c9edad6e5bd837e9baeb4fb2683b227887957

Revision 0b998b15 (diff)
Added by pespin about 1 month ago

Properly implement N3101

N3101 is incremented by unanswered USF requests, not from unanswered
POLLs.

Related: OS#5033
Change-Id: I1a55bdd39db8843976915b9f74fadb0942298413

Revision f580a637 (diff)
Added by pespin 17 days ago

pcu: Introduce test TC_n3103_max_t3169

Related: OS#5033
Change-Id: I90661ce249ff2a2040b9e82688cf413eb65e0c8e

Revision cc895dfc (diff)
Added by pespin 16 days ago

pcu: Introduce test TC_n3103_max_t3169

Related: OS#5033
Change-Id: I90661ce249ff2a2040b9e82688cf413eb65e0c8e

Revision 39412e2f (diff)
Added by pespin 15 days ago

Stop abusing T3169

Now that we finally handle N3101 and N3103 correctly, we can fix abuse
of T3169 we were doing to make sure TBFs were freed.

According to 3GPP TS 44.060, T3169 should be armed:
  • N3101_MAX reached
  • N3103_MAX reached

Furthermore, when T3169 is enabled, the tbf should be in state
RELEASING so that its USF is not used.

TODO: There's a specific case where RELEASING state can go back to working
state:
"""
If N3101 reaches the value N3101max, the network shall stop sending
PACKET UPLINK ACK/NACK messages to the mobile station for that TBF
and shall start timer T3169 for the TBF. If an RLC/MAC block is received
from the TBF when timer T3169 is running, the network shall stop timer
T3169 and resume sending PACKET UPLINK ACK/NACK messages to the TBF.
When T3169 expires, the network may consider the TBF as released and
reuse the TFI value.
"""

See full description: https://osmocom.org/issues/5033#note-2

Related: OS#5033
Change-Id: I2cec531e2633281b88f69ba065c0105580c81076

Revision 846fd248 (diff)
Added by pespin 11 days ago

Stop abusing T3169

Now that we finally handle N3101 and N3103 correctly, we can fix abuse
of T3169 we were doing to make sure TBFs were freed.

According to 3GPP TS 44.060, T3169 should be armed:
  • N3101_MAX reached
  • N3103_MAX reached

Furthermore, when T3169 is enabled, the tbf should be in state
RELEASING so that its USF is not used.

See full description: https://osmocom.org/issues/5033#note-2

Related: OS#5033
Change-Id: I2cec531e2633281b88f69ba065c0105580c81076

History

#1 Updated by pespin 3 months ago

  • Related to Bug #5020: osmo-pcu: Poll and SBA allocation improvements and fixes added

#2 Updated by pespin 2 months ago

Related issue:
According to specs, T3169 is only to be armed when:
  • N3101_MAX is reached, by incrementing N3101 (unanswered USF transmitted blocks).
  • N3103_MAX is reached, by incrementing N3103 (unanswered last UL ACK by means of CTRL ACK by MS, incremented on each retransmition of UL ACK).

Furthermore, when T3169 is enabled, the tbf should be in state RELEASING so that its USF is not used (see below some uses of it don't fullfil this).

However, I see T3169 being armed in up to 5 places:
  • tbf.cpp gprs_rlcmac_tbf::poll_timeout(): This is at first sight the only proper place where it should happen, that is, after increasing N3101. It has to yet bee seen whether this whole block is correctly placed here in this function (see rant at the description of the ticket).
  • tbf.cpp gprs_rlcmac_tbf::poll_timeout(): Check for the "N3103_MAX" condition. This one looks good to me.
  • bts.cpp bts_rcv_rach(): T3169 is IMHO abused here and wrongly used, since the TBF is not set to RELEASE state (for obvious reasons). It needs to be checked with a TTCN3 what happens exactly under the scenario where MS never transmits UL data after requesting the UL ASS, specially for single block allocations. If a TBF exists, AFAIU we should simply rely on N3101 and drop the T_START for this case.
  • tbf_ul.cpp tbf_alloc_ul(): Same here, we should rely on N3101 to eventually start T3169.
  • tbf_ul.cpp gprs_rlcmac_ul_tbf::rcv_data_block_acknowledged(): This seems like a complete abuse of T3169 in order to make sure to drop eventually the TBF.

So all in all, my feeling is that N3101 counting may be broken and hence we rely on abusing T3169 by setting it directly in different places.

#3 Updated by pespin 2 months ago

pespin wrote:

  • bts.cpp bts_rcv_rach(): T3169 is IMHO abused here and wrongly used, since the TBF is not set to RELEASE state (for obvious reasons). It needs to be checked with a TTCN3 what happens exactly under the scenario where MS never transmits UL data after requesting the UL ASS, specially for single block allocations. If a TBF exists, AFAIU we should simply rely on N3101 and drop the T_START for this case.

https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/23232 pcu: Introduce test TC_n3101_max_t3169

So my concerns seems to be true. AFAICT there's no system in place for osmo-pcu now (PCUIF issue?) to detect USF assigned UL blocks for which no data was received, hence N3101 doesn't seem to be incremented, only by lost RRBP polls (gprs_rlcmac_tbf::poll_timeout() only called in that case, and N3101 being incremented there).

I still need to investigate further what goes on in BTS and PCUIF when the BTS is unable to decode a PDCH block (FN). A potentially good improvement would be to send some sort of NOPE-ind as we do in TRXDv1 nowadays.

#4 Updated by pespin about 1 month ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 60

remote: https://gerrit.osmocom.org/c/osmo-pcu/+/23486 Fix: left shift cannot be repesented in type int [NEW]
remote: https://gerrit.osmocom.org/c/osmo-pcu/+/23487 sched: Fix scheduling UL TBF not matching conditions [NEW]
remote: https://gerrit.osmocom.org/c/osmo-pcu/+/23488 sched: Simplify usf selection code [NEW]
remote: https://gerrit.osmocom.org/c/osmo-pcu/+/23489 Set matching USF if available when polling a UL TBF [NEW]
remote: https://gerrit.osmocom.org/c/osmo-pcu/+/23490 pdch: Add mising pdch_ulc_release_node in Rx Cell Change Notif [NEW]
remote: https://gerrit.osmocom.org/c/osmo-pcu/+/23491 pdch_ulc: Create helper API pdch_ulc_release_node [NEW]
remote: https://gerrit.osmocom.org/c/osmo-pcu/+/23492 Track scheduled UL blocks through USF [NEW]
remote: https://gerrit.osmocom.org/c/osmo-pcu/+/23493 Properly implement N3101 [NEW]

After this bunch of patches become merged, we already track all sort of UL blocks in a unified way. What's missing regarding this ticket:
  • Make sure TTCN3 tests for T3169, N3101, N3103 and N3105 are passing fine. That means, investigating whether those counters are being incremented/reset in a correct way.
  • Rework T3169 to only be armed as a result of N3101_MAX, N3103_MAX, etc. (when TBF goes into RELEASE state iirc), instead of having it always armed and re-arm it every itme something is received.

#5 Updated by pespin 17 days ago

What's missing regarding this ticket:
  • Make sure TTCN3 tests for T3169, N3101, N3103 and N3105 are passing fine. That means, investigating whether those counters are being incremented/reset in a correct way.

This is done in several tests, all of them behaving fine:

    execute( TC_n3101_max_t3169() );
    execute( TC_n3103_max_t3169() );
    execute( TC_n3105_max_t3195() );

I just submitted the missing one for N3103 here:
https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/23811 pcu: Introduce test TC_n3103_max_t3169

  • Rework T3169 to only be armed as a result of N3101_MAX, N3103_MAX, etc. (when TBF goes into RELEASE state iirc), instead of having it always armed and re-arm it every itme something is received.

^ This is still TODO.

#6 Updated by pespin 11 days ago

  • Status changed from In Progress to Feedback
  • % Done changed from 60 to 90

pespin wrote:

  • Rework T3169 to only be armed as a result of N3101_MAX, N3103_MAX, etc. (when TBF goes into RELEASE state iirc), instead of having it always armed and re-arm it every itme something is received.

^ This is still TODO.

Done here, once merged the ticket can be closed:
https://gerrit.osmocom.org/c/osmo-pcu/+/23872 Stop abusing T3169

#7 Updated by pespin 11 days ago

  • Related to Bug #3928: Missing PCU_Tests.ttcn for various timers / time-outs added

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)