Feature #5500

closed

MS-Side GPRS RLC/MAC implementation

Added by laforge about 2 years ago. Updated 7 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: OsmocomBB Layer 2 (LAPDm)
Target version:
Start date: 07/01/2022
Due date:
% Done: 100%
Resolution:
Spec Reference: 3GPP TS 44.060
Tags:

Description

In the Osmocom universe, we currently only have a PCU/network side RLC/MAC implementation in osmo-pcu.

In order to support GPRS in OsmocomBB, we will need a RLC/MAC implementation for the MS side.


Files

graphviz.svg (6.11 KB): GRR FSM diagram, fixeria, 06/17/2023 01:09 PM
grr_fsm_diagram.png (21.5 KB): fixeria, 06/17/2023 01:14 PM
gprsulmstrx.pcapng (22 KB): Hoernchen, 09/20/2023 01:23 PM
mssdr_gprs.pcap.gz (1.22 MB): fixeria, 09/26/2023 08:05 PM
mssdr_gprs_msclass_hack.pcap.gz (162 KB): fixeria, 09/26/2023 08:29 PM

Checklist

  • Tx Countdown procedure (BS_CV_MAX)
  • Handle Pkt Dl Ass in UL TBF (section 8.1.1.1.3)
  • l1gprs: add TBF starting time to UL/DL TBF CFG.req
  • l1gprs: address tbf_nr problems pointed out by pespin
  • modem: implement GRR-FSM

Subtasks 5 (0 open, 5 closed)

Bug #6102: libosmo-gprs-rlcmac: DL ACK/NACK endless loop again PCU after 128 bsns received (Resolved, pespin, 07/20/2023)
Feature #6108: libosmo-gprs-rlcmac: Bug in CV calculation ending with last message as CV=1 (Resolved, pespin, 07/21/2023)
Bug #6130: modem: Fix Submitting CCCH_DATA.ind with hardcoded fn=0 to libosmo-gprs-rlcmac (Resolved, pespin, 08/02/2023)
Feature #6131: modem: Implement pkt-access-procedure retransmission (Resolved, fixeria, 08/02/2023)
Feature #6133: modem: Support passing start_time_fn in L1CTL-CFG_UL_TBF.req towards L1CTL (Resolved, fixeria, 08/02/2023)

Related issues

Related to OsmocomBB - Feature #5501: MS-side GPRS Mobility Management (GMM) + Session Management (SM) (Stalled, pespin, 07/01/2022)
Related to OsmocomBB - Feature #5502: MS-side LLC implementation (Stalled, pespin, 07/01/2022)
Related to libosmocore - Bug #3626: LAPDm code pulls both 'l1h' and 'l2h' of msgb (Resolved, pespin, 10/04/2018)
Related to OsmocomBB - Bug #6201: modem: signal proper MS Radio Access Capability (New, 10/04/2023)
Related to OsmocomBB - Feature #6132: Add MS_GPRS_Tests to osmo-ttcn3-hacks (New, fixeria, 08/02/2023)
Actions #1

Updated by laforge about 2 years ago

  • Related to Feature #5501: MS-side GPRS Mobility Management (GMM) + Session Management (SM) added
Actions #2

Updated by laforge almost 2 years ago

  • Tags set to ARDC
Actions #3

Updated by fixeria over 1 year ago

  • Assignee set to fixeria
Actions #4

Updated by fixeria over 1 year ago

  • Assignee deleted (fixeria)
Actions #5

Updated by fixeria over 1 year ago

Initial l1gprs implementation can be found here:

https://cgit.osmocom.org/osmocom-bb/log/?h=fixeria/trxcon_gprs
https://cgit.osmocom.org/osmocom-bb/commit/?h=fixeria/trxcon_gprs&id=f74895cb89b4bfe4608262b66c57c3d94c408487

Currently it simply decodes Downlink RLC/MAC signalling blocks (using libosmo-gprs-rlcmac) and prints them.

Actions #6

Updated by fixeria over 1 year ago

Here is the l1gprs_test application, which can be used for running TTCN-3 tests specifically against the RLC/MAC layer:

https://cgit.osmocom.org/osmocom-bb/commit/?h=fixeria/trxcon_gprs&id=5f03b0710b8d221d1708e596b9bf0af16ae6a689

The key idea is to allow attaching directly to the RLC/MAC layer, without the need to run a chain of fake_trx.py and osmo-bts-trx. This is similar to what we do in the ttcn3-pcu-test (attaching directly over the PCUIF), however instead of being the MS side the testsuite will be acting as the PCU. For the lower l1gprs_test/ttcn3 communication (towards the PHY) I propose to (re)use the L1CTL server and protocol, so that we can simply use the existing L1CTL records/templates: DATA.req and DATA.ind would carry Uplink and Downlink RLC/MAC blocks, respectively. For the upper communication (towards the LLC layer) we will likely need to extend the L1CTL protocol anyway.

Actions #7

Updated by fixeria over 1 year ago

Just to clarify: I have been working on the initial RLC/MAC sub-layer for trxcon (the libl1gprs) while Eric was absent. Somehow I overlooked this ticket, so it remained untouched until now. Sharing what has been done so far. I was assigned to work on #3400, and will also be following up on #5599. Once I am done with these tickets, I would like to get back here and continue working on the libl1gprs and the corresponding TTCN-3 tests. This does not mean I want to prevent anyone else from working on this ticket; we can work together, as there are plenty of things to do here. Just indicating my preferences.

Actions #8

Updated by pespin over 1 year ago

  • Spec Reference set to 3GPP TS 44.060
Actions #9

Updated by fixeria over 1 year ago

  • Status changed from New to In Progress
  • Assignee set to fixeria

I am resuming work on the l1gprs (part of trxcon) and l1gprs_test (special binary for TTCN-3 testing).

Actions #10

Updated by fixeria over 1 year ago

Quick status update:

I submitted a Work-in-Progress GRR layer implementation to Gerrit:

https://gerrit.osmocom.org/c/osmocom-bb/+/30743 trxcon: add initial GPRS L1 implementation - libl1gprs.la [NEW]
https://gerrit.osmocom.org/c/osmocom-bb/+/30744 trxcon: call trxcon_set_log_cfg() from trxcon_main.c [NEW]
https://gerrit.osmocom.org/c/osmocom-bb/+/30745 trxcon: add (optional) l1gprs_test binary for TTCN-3 [NEW]

I also submitted a TTCN-3 testsuite skeleton for the GRR layer:

https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/29816 gprs-modem: initial testsuite skeleton for GPRS modem [WIP]

as well as some improvements and patches adding send templates for L1CTL PDUs:

https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/30737 library/L1CTL_Types: eliminate warning about missing 'h0h1' field [NEW]
https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/30738 library/L1CTL_Types: add send template for L1CTL_DATA_IND [NEW]
https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/30739 library/L1CTL_PortType: allow sending L1ctlDlMessage via L1CTL_PT [NEW]
https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/30740 library: move tr_PTCCHDownlinkMsg to the proper module [NEW]
https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/30741 library/RLCMAC_Templates: add ts_PTCCHDownlinkMsg [NEW]
https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/30742 library/RLCMAC_Templates: add ts_RLCMAC_DL_DUMMY_CTRL [NEW]

At this stage I am able to send some Downlink RLC/MAC signalling messages and see them parsed by the GRR layer:

20221219195204577 DL1C NOTICE L1CTL server got a new connection (id=0) (l1ctl_server.c:171)
20221219195204577 DGRR NOTICE l1gprs[0x0x55b1d3eea160]: Resetting GRR state (l1gprs_test.c:82)
20221219195204656 DGRR DEBUG l1gprs[0x0x55b1d3eea160]: (PDCH-7) Rx PDTCH/D block (fn=1337, len=23): 47 94 00 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b  (gprs.c:136)
20221219195204671 DLCSN1 INFO osmo_csn1_stream_decode (type: Pkt DL Dummy Ctrl Block (37): MESSAGE_TYPE = 37 | PAGE_MODE = 0 | Exist_PERSISTENCE_LEVEL = 0 | Padding = 0|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43| (ts_44_060.c:4796)
20221219195204724 DGRR DEBUG l1gprs[0x0x55b1d3eea160]: (PDCH-7) Rx PDTCH/D block (fn=1337, len=23): 47 94 00 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b  (gprs.c:136)
20221219195204724 DLCSN1 INFO osmo_csn1_stream_decode (type: Pkt DL Dummy Ctrl Block (37): MESSAGE_TYPE = 37 | PAGE_MODE = 0 | Exist_PERSISTENCE_LEVEL = 0 | Padding = 0|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43| (ts_44_060.c:4796)
20221219195204724 DGRR DEBUG l1gprs[0x0x55b1d3eea160]: (PDCH-7) Rx PDTCH/D block (fn=1337, len=23): 47 94 00 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b  (gprs.c:136)
20221219195204724 DLCSN1 INFO osmo_csn1_stream_decode (type: Pkt DL Dummy Ctrl Block (37): MESSAGE_TYPE = 37 | PAGE_MODE = 0 | Exist_PERSISTENCE_LEVEL = 0 | Padding = 0|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43|43| (ts_44_060.c:4796)
20221219195204724 DGRR DEBUG l1gprs[0x0x55b1d3eea160]: (PDCH-7) Rx PTCCH/D block (fn=1337, len=23): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2b 2b 2b 2b 2b 2b 2b  (gprs.c:164)
20221219195204724 DL1D NOTICE l1c[0x55b1d3eea060]: L1CTL connection error: read() failed (rc=0): Success (l1ctl_server.c:55)
20221219195204724 DL1C NOTICE l1c[0x55b1d3eea060]: Closing L1CTL connection (l1ctl_server.c:206)
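
The first two octets of the PDTCH/D blocks in the log above can also be decoded by hand, which is handy for cross-checking the CSN.1 decoder output. Below is a minimal Python sketch, assuming the Downlink MAC header layout for RLC/MAC control blocks from 3GPP TS 44.060 (payload type, RRBP, S/P, USF) followed by a 6-bit MESSAGE_TYPE and 2-bit PAGE_MODE. The names `DlCtrlBlockHeader` and `parse_dl_ctrl_block` are made up for this illustration and are not part of libosmo-gprs-rlcmac.

```python
from dataclasses import dataclass


@dataclass
class DlCtrlBlockHeader:
    payload_type: int  # 2 bits, 1 = control block without optional octets
    rrbp: int          # 2 bits, relative reserved block period
    sp: int            # 1 bit, RRBP field valid
    usf: int           # 3 bits, uplink state flag
    message_type: int  # 6 bits
    page_mode: int     # 2 bits


def parse_dl_ctrl_block(block: bytes) -> DlCtrlBlockHeader:
    """Decode the MAC header and message type of a DL control block.

    Only valid for payload type 1 (no optional RLC/MAC control header
    octets); other payload types shift the message contents.
    """
    if len(block) < 2:
        raise ValueError("need at least 2 octets")
    mac, first = block[0], block[1]
    return DlCtrlBlockHeader(
        payload_type=(mac >> 6) & 0x03,
        rrbp=(mac >> 4) & 0x03,
        sp=(mac >> 3) & 0x01,
        usf=mac & 0x07,
        message_type=(first >> 2) & 0x3F,
        page_mode=first & 0x03,
    )


# First octets of the block seen in the log above.
hdr = parse_dl_ctrl_block(bytes.fromhex("47 94 00 2b"))
print(hdr.usf, hdr.message_type, hdr.page_mode)  # 7 37 0
```

The result agrees with the DLCSN1 log lines: MESSAGE_TYPE = 37 (Pkt DL Dummy Ctrl Block) and PAGE_MODE = 0.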

Now I am implementing the TBF FSM (looking at the one in osmo-pcu.git as the reference).

Actions #11

Updated by fixeria over 1 year ago

I submitted a few patches for libosmo-gprs-rlcmac implementing the missing bits for decoding the IA Rest Octets IE:

https://gerrit.osmocom.org/c/libosmo-gprs/+/30797 rlcmac: add decoder and test vectors for IA Rest Octets [NEW]
https://gerrit.osmocom.org/c/libosmo-gprs/+/30798 rlcmac: fix coding of EGPRS Packet Uplink Assignment in IA RestOctets [NEW]
https://gerrit.osmocom.org/c/libosmo-gprs/+/30799 rlcmac: rename s/IA_EGPRS_00_t/IA_EGPRS_PktUlAss_t/ [NEW]
https://gerrit.osmocom.org/c/libosmo-gprs/+/30800 rlcmac: implement the missing IA_MultiBlock_PktDlAss_t [NEW]

This is needed for Uplink and Downlink TBF establishment via AGCH.

Actions #12

Updated by fixeria over 1 year ago

Quick status update:

Some time ago pespin merged the modem application skeleton:

https://gerrit.osmocom.org/c/osmocom-bb/+/30260 host/layer23: Add modem app

I reworked this skeleton into a layer23 app, and implemented passive decoding of SI{3,4,13} and IA Rest Octets:

https://gerrit.osmocom.org/c/osmocom-bb/+/30812 layer23/sysinfo: implement decoding of SI13 Rest Octets
https://gerrit.osmocom.org/c/osmocom-bb/+/30870 modem: passive decoding of SI{3,4,13} and IA Rest Octets

This is the minimum set of Rest Octets (except the SI4) the GRR layer needs to decode and process (according to TS 44.060).

Apart from this, I have a draft implementation of the logic triggering Uplink TBF allocation by sending RACH.req with specific RA values (as per 3GPP TS 44.018, 9.1.8) and the logic matching Immediate Assignment messages containing the Uplink TBF information for us. This is not really useful until we have implemented the lower layers (RLC/MAC) maintaining the TBF and the upper layers (LLC & at least GMM) sending and receiving data over the TBFs.
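
For illustration, here is a rough Python sketch of how such RA values could be generated. It assumes the Channel Request coding from 3GPP TS 44.018, 9.1.8 as I read it: `01111xxx` (excluding `01111111`) for one-phase packet access and `01110xxx` for single block packet access, which is used for two-phase access. The function names are hypothetical and do not correspond to the modem app code.

```python
import random

RA_ONE_PHASE = 0x78     # 01111xxx: one-phase packet access
RA_SINGLE_BLOCK = 0x70  # 01110xxx: single block packet access (two-phase)


def gen_packet_access_ra(two_phase: bool, rng=random) -> int:
    """Return an 8-bit RA value for a packet access Channel Request."""
    if two_phase:
        # all 3 random bits are free for single block packet access
        return RA_SINGLE_BLOCK | rng.randrange(8)
    # assumption: 01111111 is not a valid one-phase code, so use 0..6 only
    return RA_ONE_PHASE | rng.randrange(7)


def is_packet_access_ra(ra: int) -> bool:
    """Check whether an RA value falls into one of the two ranges above."""
    return (ra & 0xF0) == 0x70 and ra != 0x7F


ra = gen_packet_access_ra(two_phase=False)
print(f"RA=0x{ra:02x}")
```

An RA value of 0x78 is one of the values this can produce for one-phase access.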

I will continue working on the GRR logic and will focus on integration with the upper LLC layer.

Actions #13

Updated by pespin about 1 year ago

I submitted a patchset to libosmo-gprs-rlcmac to add APIs for the upper-side primitives (GRR and GMMRR SAPs):
https://gerrit.osmocom.org/c/libosmo-gprs/+/31075 rlcmac: Support extending log categories in the future
https://gerrit.osmocom.org/c/libosmo-gprs/+/31076 rlcmac: Introduce primitives for SAPs towards higher layers

Actions #14

Updated by pespin about 1 year ago

And modem app is already using those with this patch:
https://gerrit.osmocom.org/c/osmocom-bb/+/31077 modem: Initial integration of libosmo-gprs-rlcmac

Actions #15

Updated by pespin about 1 year ago

Actions #16

Updated by pespin about 1 year ago

I started implementing some of the RLC state here:
remote: https://gerrit.osmocom.org/c/libosmo-gprs/+/31098 rlcmac: Enqueue LLC PDUs based on RadioPriority and SAPI [WIP] [NEW]
remote: https://gerrit.osmocom.org/c/libosmo-gprs/+/31099 WIP: rlcmac: Initial implementation [WIP] [NEW]

With current state I can already Enqueue a GMM Attach Request from the modem app (hardcoded blob since we have no libosmo-gprs-gmm yet), and it goes LLC->RLCMAC, where it is stored in a priority list.
RLC/MAC already detects it has new data to send and that there's no active UL TBF for that entity (MS, tlli), hence it allocates a new ul_tbf object for that entity and instructs it to start an UL ASS towards the network. FSMs already exist for that scenario; they wait to receive the CCCH Imm.Ass, and then go either 1phase or 2phase (Tx Pkt Resource Req + Rx Pkt Ul Ass + Tx Pkt Ctrl Ack).

I also implemented an initial scheduler which should be able to trigger transmit of the Ul pkts from the FSM above (encoding of packets themselves still not done). I couldn't test it yet because I lack RTS.ind from lower layers.

What I need to use in the library which is not yet available (fixeria you may want to help here):
- API to instruct lower layers to send a RACH.req
- API to receive the CCCH Imm Ass from lower layers
- API from lower layers to get RTS.ind (with info: trx, ts, fn, and MAC header received from DL like USF).

Actions #17

Updated by pespin about 1 year ago

  • % Done changed from 0 to 20

The initial UL TBF assignment code is still waiting for review in gerrit.
I have meanwhile been working on importing osmo-pcu code which serves as a base to encode uplink RLC/MAC data from LLC frames when triggered by the scheduler.

Actions #18

Updated by pespin about 1 year ago

  • Checklist item Tx Countdown procedure (BS_CV_MAX) added
  • Checklist item Handle Pkt Dl Ass in UL TBF (section 8.1.1.1.3) added
  • % Done changed from 20 to 40

I have been implementing a lot more parts of the RLCMAC layer. Together with l1ctl patches from fixeria we can now do some early tests with the modem app against osmocom CNI using trxcon.

So far the modem can already start a UL TBF and submit the GMM Attach Req, and osmo-pcu forwards that correctly to the SGSN, which answers back. However, I have not yet implemented handling of Pkt Dl Ass for UL_TBF, so it doesn't go any further.

Stuff found to be missing / not implemented added to the check list.
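
For reference, the countdown value (CV) used by the Tx Countdown procedure can be sketched as below. This follows my reading of 3GPP TS 44.060, section 9.3.1, where x = round((TBC - BSN' - 1) / (NTS * K)), round() rounds upwards, and CV is 15 whenever x exceeds BS_CV_MAX. The function name and signature are made up for this example and are not the libosmo-gprs-rlcmac API.

```python
def countdown_value(tbc: int, bsn_abs: int, nts: int,
                    bs_cv_max: int, k: int = 1) -> int:
    """Compute CV for an uplink RLC data block.

    tbc       -- total number of RLC data blocks in the TBF
    bsn_abs   -- absolute BSN (BSN') of the block being sent
    nts       -- number of timeslots assigned to the UL TBF
    bs_cv_max -- BS_CV_MAX broadcast by the network
    k         -- blocks per radio block (1 for GPRS coding schemes)
    """
    remaining = tbc - bsn_abs - 1
    if remaining < 0:
        raise ValueError("BSN' must be smaller than TBC")
    # integer ceiling division: round() in the spec rounds upwards
    x = -(-remaining // (nts * k))
    return x if x <= bs_cv_max else 15


# Last three blocks of a 10-block TBF on one timeslot, BS_CV_MAX=6
print([countdown_value(10, bsn, 1, 6) for bsn in (7, 8, 9)])  # [2, 1, 0]
```

With this definition, only the very last block (BSN' = TBC - 1) gets CV=0.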

Actions #19

Updated by fixeria about 1 year ago

  • Checklist item l1gprs: add TBF starting time to UL/DL TBF CFG.req added
  • Checklist item l1gprs: address tbf_nr problems pointed out by pespin added

I have been working on the l1gprs, a small GPRS (MAC) library for trxcon. Last week I also spent some time integrating it into the virtphy. Eventually (whenever we have spare time) we're planning to integrate l1gprs into the firmware for Calypso based phones, so that it will be possible to run the modem app using these cheap phones as the PHY. But this is a low priority at the moment, the main focus will be on SDR PHY. The patch is currently in Gerrit:

https://gerrit.osmocom.org/c/osmocom-bb/+/30743 {trxcon,virt_phy}: shared GPRS L1 (MAC) implementation

I am also adding the necessary TTCN-3 types in order to be able to simulate various scenarios in tests.

Actions #20

Updated by fixeria about 1 year ago

  • Checklist item l1gprs: address tbf_nr problems pointed out by pespin set to Done
Actions #21

Updated by fixeria about 1 year ago

fixeria wrote in #note-19:

I am also adding the necessary TTCN-3 types in order to be able to simulate various scenarios in tests.

I am done removing the old and adding the new GPRS related L1CTL messages and templates:

https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/31973 library: L1CTL: merge L1ctl{Ul,Dl}Message into L1ctlMessage [NEW]
https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/31974 library: L1CTL: rework GPRS related message definitions [NEW]

This patchset already helped me to find and fix a bug in osmocom-bb, but still needs further testing.

https://gerrit.osmocom.org/c/osmocom-bb/+/31968 l1ctl_proto: fix unpacked struct in l1ctl_gprs_dl_block_ind [NEW]

As a bonus, we should be able to execute more tests from ttcn3-bts-test, which have been disabled for trxcon:

https://gerrit.osmocom.org/c/docker-playground/+/31975 ttcn3-bts-test: enable running GPRS related tests with trxcon [NEW]

Actions #22

Updated by fixeria about 1 year ago

Today I spent quite some time debugging RLC/MAC scheduling problems. As it turned out, we had a copy-paste bug in libosmo-gprs:

https://gerrit.osmocom.org/c/libosmo-gprs/+/32059 rlcmac: fix wrong MSGT_PACKET_{RESOURCE_REQUEST->DOWNLINK_ACK_NACK} [NEW]

so the modem app was sending incorrect MESSAGE_TYPE, confusing osmo-pcu and making it complain:

DRLCMAC NOTICE pdch.cpp:687 PACKET RESOURCE REQ unknown uplink TFI=0                                                                                                     
DRLCMAC NOTICE pdch_ul_controller.c:324 PDCH(bts=0,trx=0,ts=7) Timeout for registered POLL (FN=774098, reason=DL_ACK): TBF(DL:TFI-0-0-0:STATE-FLOW:GPRS:TLLI-0xe1c5d364)

Having GSMTAP logging emitted by trxcon was quite useful, here is a patch adding it:

https://gerrit.osmocom.org/c/osmocom-bb/+/32056 trxcon: add GSMTAP logging target if '-g' is given [NEW]

We also observed the timing problem we were afraid of (even with --trx-advance=0):

DSCHD FATAL trxcon(0)[0x564f18167a00]: TS7-PDTCH l1sched_prim_dequeue(): current Fn=334187, prim Fn=334186

In this case Fn=334187 corresponds to ul_bid=1, while Fn=334186 corresponds to ul_bid=0. This means that we are late scheduling the Uplink block, and the Tx queue was empty at Fn=334186. I was running both trxcon and the modem app with the default scheduler policy; running them with SCHED_RR 99 seems to solve this:

20230325042052489 DGPRS DEBUG trxcon(0)[0x55c3b366cf70]: (PDCH-7) Rx DL BLOCK.ind (fn=2065141, len=23): 40 94 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b  (l1gprs.c:403)
20230325042052490 DGPRS DEBUG trxcon(0)[0x55c3b366cf70]: (PDCH-7) Rx UL BLOCK.req (fn=2065145, len=34): 3c 01 01 e1 c5 d3 64 01 c0 01 08 01 04 97 07 00 00 01 0a 00 05 f4  e1 c5 d3 64 00 f0 00 00 00 00 00 00  (l1gprs.c:364)                                                                                                                     
20230325042052494 DSCHD FATAL trxcon(0)[0x55c3b366cf70]: TS7-PDTCH l1sched_pull_burst(): PDCH Fn=2065145/17, bid=0 (sched_trx.c:123)
...
20230325042052508 DGPRS DEBUG trxcon(0)[0x55c3b366cf70]: (PDCH-7) Rx DL BLOCK.ind (fn=2065145, len=23): 40 94 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b  (l1gprs.c:403)
20230325042052509 DGPRS DEBUG trxcon(0)[0x55c3b366cf70]: (PDCH-7) Rx UL BLOCK.req (fn=2065149, len=34): 00 01 02 0d e1 c5 d3 64 fd d1 f4 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 00  (l1gprs.c:364)                                                                                                                     
20230325042052512 DSCHD FATAL trxcon(0)[0x55c3b366cf70]: TS7-PDTCH l1sched_pull_burst(): PDCH Fn=2065149/21, bid=0 (sched_trx.c:123)
...
20230325042052526 DGPRS DEBUG trxcon(0)[0x55c3b366cf70]: (PDCH-7) Rx DL BLOCK.ind (fn=2065149, len=23): 40 94 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b  (l1gprs.c:403)
20230325042052527 DGPRS DEBUG trxcon(0)[0x55c3b366cf70]: (PDCH-7) Rx UL BLOCK.req (fn=2065154, len=34): 3c 01 01 e1 c5 d3 64 01 c0 01 08 01 04 97 07 00 00 01 0a 00 05 f4 e1 c5 d3 64 00 f0 00 00 00 00 00 00  (l1gprs.c:364)                                                                                                                     
20230325042052535 DSCHD FATAL trxcon(0)[0x55c3b366cf70]: TS7-PDTCH l1sched_pull_burst(): PDCH Fn=2065154/26, bid=0 (sched_trx.c:123)
...
20230325042052549 DGPRS DEBUG trxcon(0)[0x55c3b366cf70]: (PDCH-7) Rx DL BLOCK.ind (fn=2065154, len=23): 40 24 00 80 40 00 00 00 00 00 00 00 7c 38 ba 6c 98 01 4b 2b 2b 2b 2b  (l1gprs.c:403)
20230325042052550 DGPRS DEBUG trxcon(0)[0x55c3b366cf70]: (PDCH-7) Rx UL BLOCK.req (fn=2065158, len=34): 00 01 02 0d e1 c5 d3 64 fd d1 f4 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 00  (l1gprs.c:364)
20230325042052554 DSCHD FATAL trxcon(0)[0x55c3b366cf70]: TS7-PDTCH l1sched_pull_burst(): PDCH Fn=2065158/30, bid=0 (sched_trx.c:123)

On average, trxcon gets UL BLOCK.req from the upper layers about 4 ms before the expected Tx time (next ul_bid=0). Not that bad compared to the theoretical ideal of 5.77 ms (see https://people.osmocom.org/fixeria/pdch.txt). It is also notable that it takes the upper layers ~1 ms to send UL BLOCK.req in response to DL BLOCK.ind (which also serves as RTS.ind).
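
The frame numbers and the timing figure above can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the PDCH 52-multiframe layout (12 radio blocks of 4 frames each, with frames 12, 25, 38 and 51 used for PTCCH or idle), and the Uplink being delayed by 3 timeslots relative to the Downlink. The helper names are hypothetical, and pdch.txt remains the authoritative derivation of the 5.77 ms value.

```python
from typing import Optional, Tuple

TDMA_FRAME_MS = 120.0 / 26.0      # ~4.615 ms per TDMA frame
TIMESLOT_MS = TDMA_FRAME_MS / 8   # ~0.577 ms per timeslot
UL_DL_OFFSET_SLOTS = 3            # UL frame starts 3 timeslots after DL


def pdch_block_bid(fn: int) -> Optional[Tuple[int, int]]:
    """Map a TDMA Fn to (radio block number, burst id) on a PDCH.

    Returns None for PTCCH/idle frames (fn % 52 in 12, 25, 38, 51).
    """
    fn52 = fn % 52
    if fn52 % 13 == 12:
        return None
    block = (fn52 // 13) * 3 + (fn52 % 13) // 4
    bid = (fn52 % 13) % 4
    return block, bid


def ul_turnaround_ms() -> float:
    """Time between the end of a DL block and the start of the next UL block.

    DL slot tn of frame n ends (tn + 1) slots after the start of frame n.
    UL slot tn of frame n + 1 starts (8 + 3 + tn) slots after that same
    reference, so the gap is always 10 timeslots, regardless of tn.
    """
    return (8 + UL_DL_OFFSET_SLOTS - 1) * TIMESLOT_MS


print(pdch_block_bid(334186))  # (8, 0)
print(pdch_block_bid(334187))  # (8, 1)
print(pdch_block_bid(2065145)) # (4, 0)
print(round(ul_turnaround_ms(), 2))  # 5.77
```

The first two results match the FATAL message quoted earlier in this note (ul_bid=0 at Fn=334186, ul_bid=1 at Fn=334187), and the third matches the `Fn=2065145/17, bid=0` log line.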

I submitted two additional patches preventing early/late block transmission in trxcon:

https://gerrit.osmocom.org/c/osmocom-bb/+/32057 trxcon: l1sched_prim_dequeue(): check TDMA Fn in PDCH prims [NEW]
https://gerrit.osmocom.org/c/osmocom-bb/+/32058 trxcon: do not call l1sched_prim_dequeue() at ul_bid != 0 [NEW]

Still not getting the GMM Attach procedure completed, looks like the modem does not respond to GMM Identity Request (IMEI).

Actions #23

Updated by fixeria about 1 year ago

fixeria wrote in #note-22:

Still not getting the GMM Attach procedure completed, looks like the modem does not respond to GMM Identity Request (IMEI).

Reported this to the GMM/SM ticket, see https://osmocom.org/issues/5501#note-11 and https://osmocom.org/issues/5501#note-12.

I often use osmo_interact_vty.py from osmo-python-tests.git to send VTY commands like test 1 gmm attach, and I noticed that it recently stopped working. As it turned out, this is related to the new prompt in the mobile/modem apps; here is the fix: https://gerrit.osmocom.org/c/python/osmo-python-tests/+/32063.

Actions #24

Updated by fixeria 12 months ago

  • Checklist item l1gprs: queue Uplink blocks added

We had to go back to the old approach (originally implemented by Harald for the virt_phy) of queuing Uplink blocks in the l1gprs state, because of the race conditions (see #note-22). The idea is to have two queues: one for USF based scheduling (dynamic), another for TDMA Fn based scheduling (fixed time). Implementing this required a bit of refactoring. For instance, I had to rework the trxcon's l1sched prim API:

https://gerrit.osmocom.org/c/osmocom-bb/+/32304 trxcon/l1sched: rework the primitive API

and update the libosmo-gprs L1CTL primitives:

remote: https://gerrit.osmocom.org/c/libosmo-gprs/+/32411 rlcmac: fix st_new_on_enter(): actually release the TBF
remote: https://gerrit.osmocom.org/c/libosmo-gprs/+/32412 rlcmac: fix typo in TBF CFG logging messages
remote: https://gerrit.osmocom.org/c/libosmo-gprs/+/32413 rlcmac: cfg_ul_tbf_req: indicate USF for each active timeslot

I am now cleaning up the actual patch for trxcon/l1gprs and planning to submit it soon.
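
To illustrate the two-queue idea described in this note, here is a simplified Python model. It is not the l1gprs code: the class, its method names, and the choice to let fixed-time blocks take precedence over USF-scheduled ones are all assumptions made for this sketch, and TDMA Fn wrap-around is ignored.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class UlBlockQueues:
    # blocks to be sent whenever our USF is granted (dynamic allocation)
    usf_queue: deque = field(default_factory=deque)
    # blocks to be sent at an exact TDMA Fn (e.g. responses to polling)
    fn_queue: Dict[int, bytes] = field(default_factory=dict)

    def enqueue_usf(self, data: bytes) -> None:
        self.usf_queue.append(data)

    def enqueue_fn(self, fn: int, data: bytes) -> None:
        self.fn_queue[fn] = data

    def dequeue(self, fn: int, usf_granted: bool) -> Optional[bytes]:
        """Pick the UL block to transmit at the given Fn, if any."""
        # drop fixed-time blocks whose Tx time has already passed
        # (assumption: no Fn wrap-around handling in this sketch)
        for stale in [f for f in self.fn_queue if f < fn]:
            del self.fn_queue[stale]
        # assumption: a block scheduled for this exact Fn goes first
        if fn in self.fn_queue:
            return self.fn_queue.pop(fn)
        if usf_granted and self.usf_queue:
            return self.usf_queue.popleft()
        return None


q = UlBlockQueues()
q.enqueue_usf(b"data block")
q.enqueue_fn(104, b"ctrl ack")
print(q.dequeue(100, usf_granted=True))   # b'data block'
print(q.dequeue(104, usf_granted=False))  # b'ctrl ack'
```

Keeping the blocks in the L1 state like this means the PHY does not depend on the upper layers answering each RTS.ind in time.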

Actions #25

Updated by fixeria 12 months ago

fixeria wrote in #note-24:

We had to go back to the old approach (originally implemented by Harald for the virt_phy) of queuing Uplink blocks in the l1gprs state, because of the race conditions (see #note-22). The idea is to have two queues: one for USF based scheduling (dynamic), another for TDMA Fn based scheduling (fixed time).

For the record, yesterday we had a Jitsi call with laforge and pespin. We discussed this and an even more radical idea of moving most of the RLC/MAC layer down to trxcon and virt_phy. None of us liked the latter idea, so we tried to understand what may be causing the race conditions. laforge suggested analyzing the problem a bit deeper and raised a very good point: blocking logging might be the culprit. Indeed, trxcon is simply calling osmo_init_logging2(), which does not enable logging wqueue on its own. I will add a call to log_target_file_switch_to_wqueue() and test everything again.

Actions #26

Updated by fixeria 12 months ago

Meanwhile, I've submitted some fixes and improvements:

remote: https://gerrit.osmocom.org/c/osmocom-bb/+/32501 virt_phy: fix memleaks in l1ctl_rx_gprs_ul_block_req() [NEW]
remote: https://gerrit.osmocom.org/c/osmocom-bb/+/32502 virt_phy: fix bogous TDMA Fn check in l1ctl_rx_gprs_ul_block_req() [NEW]
remote: https://gerrit.osmocom.org/c/osmocom-bb/+/32503 l1gprs: reorder #includes, add missing <stdbool.h> [NEW]

and pushed my WIP patch adding the queues to Gerrit, just in case we ever need it (I hope not):

remote: https://gerrit.osmocom.org/c/osmocom-bb/+/32504 [WIP] l1gprs: implement queueing of Uplink blocks [NEW]

Actions #27

Updated by fixeria 12 months ago

fixeria wrote in #note-25:

... laforge suggested to analyze the problem a bit deeper and raised a very good point: blocking logging might be the culprit. Indeed, trxcon is simply calling osmo_init_logging2(), which does not enable logging wqueue on its own. I will add a call to log_target_file_switch_to_wqueue() and test everything again.

Below are patches addressing this:

remote: https://gerrit.osmocom.org/c/osmocom-bb/+/32516 trxcon: ignore TRXCON_EV_TX_DATA_CNF in TRXCON_ST_PACKET_DATA [NEW]
remote: https://gerrit.osmocom.org/c/osmocom-bb/+/32517 trxcon: reduce DGPRS logging level to LOGL_NOTICE [NEW]
remote: https://gerrit.osmocom.org/c/osmocom-bb/+/32518 trxcon: use non-blocking stderr logging by default [NEW]

With these applied, trxcon no longer schedules late, but sometimes tries to schedule too early:

20230426201233395 DSCHD ERROR trxcon(0)[0x6120000000a0]: TS7-PDTCH l1sched_lchan_prim_dequeue(): dropping Tx primitive (current Fn=2171102, prim Fn=2171104) (sched_prim.c:379)
20230426201233418 DSCHD ERROR trxcon(0)[0x6120000000a0]: TS7-PDTCH l1sched_lchan_prim_dequeue(): dropping Tx primitive (current Fn=2171107, prim Fn=2171108) (sched_prim.c:379)
20230426201233437 DSCHD ERROR trxcon(0)[0x6120000000a0]: TS7-PDTCH l1sched_lchan_prim_dequeue(): dropping Tx primitive (current Fn=2171111, prim Fn=2171112) (sched_prim.c:379)

I already faced this before and even have a WIP patch, which I need to get merged:

https://gerrit.osmocom.org/c/osmocom-bb/+/32058 trxcon: do not call l1sched_prim_dequeue() at ul_bid != 0

Actions #28

Updated by fixeria 12 months ago

fixeria wrote in #note-27:

I already faced this before and even have a WIP patch, which I need to get merged:

https://gerrit.osmocom.org/c/osmocom-bb/+/32058 trxcon: do not call l1sched_prim_dequeue() at ul_bid != 0

I reworked this patch and submitted an updated version. This patch fixes the scheduler, so it does not lose UL blocks anymore by scheduling their transmission at bid != 0. However, I am still observing Uplink scheduling issues. Looks like there is a race condition between Uplink and Downlink scheduling?

20230502021957724 DSCH NOTICE trxcon(0)[0x6120000000a0]: l1sched_pull_burst(): PDTCH/U Tx time (fn=56103) (sched_trx.c:125)
20230502021957744 DGPRS NOTICE trxcon(0)[0x6120000000a0]: (PDCH-7) Rx DL BLOCK.ind (fn=56103, len=23): 47 94 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b  (l1gprs.c:403)
20230502021957747 DSCH NOTICE trxcon(0)[0x6120000000a0]: l1sched_pull_burst(): PDTCH/U Tx time (fn=56108) (sched_trx.c:125)
20230502021957765 DSCH NOTICE trxcon(0)[0x6120000000a0]: l1sched_pull_burst(): PDTCH/U Tx time (fn=56112) (sched_trx.c:125)
20230502021957767 DGPRS NOTICE trxcon(0)[0x6120000000a0]: (PDCH-7) Rx DL BLOCK.ind (fn=56108, len=23): 40 94 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b  (l1gprs.c:403)
20230502021957768 DGPRS NOTICE trxcon(0)[0x6120000000a0]: (PDCH-7) Rx UL BLOCK.req (fn=56112, len=54): 00 00 00 45 01 c0 05 08 16 08 3a 05 06 68 30 60 42 92 e9 9e 92 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 00  (l1gprs.c:364)
20230502021957784 DSCH NOTICE trxcon(0)[0x6120000000a0]: l1sched_pull_burst(): PDTCH/U Tx time (fn=56116) (sched_trx.c:125)
20230502021957784 DSCHD ERROR trxcon(0)[0x6120000000a0]: TS7-PDTCH l1sched_lchan_prim_dequeue(): dropping Tx primitive (current Fn=56116, prim Fn=56112) (sched_prim.c:378)
20230502021957786 DGPRS NOTICE trxcon(0)[0x6120000000a0]: (PDCH-7) Rx DL BLOCK.ind (fn=56112, len=23): 40 94 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b  (l1gprs.c:403)
20230502021957786 DGPRS NOTICE trxcon(0)[0x6120000000a0]: (PDCH-7) Rx UL BLOCK.req (fn=56116, len=54): 00 00 00 45 01 c0 05 08 16 08 3a 05 06 68 30 60 42 92 e9 9e 92 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 00  (l1gprs.c:364)
20230502021957804 DGPRS NOTICE trxcon(0)[0x6120000000a0]: (PDCH-7) Rx DL BLOCK.ind (fn=56116, len=34): 08 00 02 75 43 c0 01 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 80  (l1gprs.c:403)
20230502021957805 DGPRS NOTICE trxcon(0)[0x6120000000a0]: (PDCH-7) Rx UL BLOCK.req (fn=56121, len=54): 00 00 00 45 01 c0 05 08 16 08 3a 05 06 68 30 60 42 92 e9 9e 92 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 00  (l1gprs.c:364)

Normally we should first see l1sched_pull_burst(): PDTCH/U Tx time printed at bid=0, and then Rx DL BLOCK.ind at bid=4. But somehow in the logging snippet above we see PDTCH/U Tx time printed twice at 20230502021957747 and 20230502021957765, and only then Rx DL BLOCK.ind. This means that somehow Uplink was faster than Downlink. This is not possible in systems with proper TDMA clock and might be happening because in trxcon we're scheduling Uplink bursts based on timers.

My intermediate conclusion is that the link between trxcon and the modem app is not a problem, because on the average we get an UL BLOCK.req 1 ms after we sent DL BLOCK.ind. There is something wrong with timer driven Uplink scheduling in trxcon, and these race conditions may not be the case when operating a real setup (osmo-trx-ms + SDR). In any case, getting the virtual setup to work reliably is still important, IMO.

Actions #29

Updated by pespin 12 months ago

Summarized list of the things which prevent stable use/test of higher RLCMAC side:
A- virtphy: Sometimes the ref in CCCH ImmAss doesn't match the stored ref_match in modem app. T2 or T3 doesn't match. 2 examples below:

[03.05.2023 11:25:43] <pespin> 20230503112445100 DRSL NOTICE grr.c:431 Rx RACH.conf (RA=0x78, T1=16, T3=4, T2=14)
20230503112445127 DRR ERROR grr.c:255 grr_rx_imm_ass(): no req_ref match (RA=0x78, T1=16, T3=3, T2=18)

[03.05.2023 12:14:36] <pespin> 20230503121357010 DRSL NOTICE grr.c:431 Rx RACH.conf (RA=0x78, T1=9, T3=4, T2=24)
20230503121357038 DRR ERROR grr.c:255 grr_rx_imm_ass(): no req_ref match (RA=0x78, T1=9, T3=3, T2=2)

B- trx: Most of the time the UL blocks are scheduled too late, which means they are dropped and not received by the SGSN (IIUC should be fixed by fixeria's https://gerrit.osmocom.org/c/osmocom-bb/+/32575)

C.1- virtphy/trx: If I apply this osmo-pcu fix (https://gerrit.osmocom.org/c/osmo-pcu/+/32339), which ends up making the UL TBF be destroyed immediately after sending PKT CTRL ACK, then that PKT CTRL ACK is never transmitted to the PCU (never shows up in GSMTAP). This probably means that the message is freed upon TBF free, or that the block/burst scheduler at the lower layers is not scheduling it. This is reproducible 100% of the time with that osmo-pcu patch applied. It probably happens because no TBF is registered in that TS anymore. This should probably be fixed once we support applying a new TBF cfg (TBF_CFG.req) at a specific FN in time. The lower layer probably has to keep it in a llist until the FN matches and then apply it.
This is needed for instance to implement TBF Starting time, or to be able to transmit that pending block as I said in this section.

C.2- Once we fix the issue described above, we can further test and merge the osmo-pcu patch, which should result in quicker GMM Attach establishment, since it currently only works by luck because the SGSN retransmits the GMM Id Req.

Actions #30

Updated by fixeria 11 months ago

fixeria wrote in #note-28:

My intermediate conclusion is that the link between trxcon and the modem app is not a problem, because on the average we get an UL BLOCK.req 1 ms after we sent DL BLOCK.ind. There is something wrong with timer driven Uplink scheduling in trxcon, and these race conditions may not be the case when operating a real setup (osmo-trx-ms + SDR). In any case, getting the virtual setup to work reliably is still important, IMO.

I submitted a patch removing the trxcon's internal clock module:

https://gerrit.osmocom.org/c/osmocom-bb/+/32575 trxcon: get rid of the timer driven clock module

This makes the scheduling a lot more reliable and eliminates the "dropping Tx primitive" problem.

I was observing a few ttcn3-bts-test regressions with this patch applied, but most of them were due to problems in TTCN-3 code:

https://cgit.osmocom.org/osmo-ttcn3-hacks/commit/?id=6c0644970afc93c4d5f86a176ac580dbdbefe319

Once merged, we can consider the problem B mentioned in #note-29 resolved.

Actions #31

Updated by pespin 11 months ago

  • Checklist item Tx Countdown procedure (BS_CV_MAX) set to Done
  • Checklist item Handle Pkt Dl Ass in UL TBF (section 8.1.1.1.3) set to Done
Actions #32

Updated by fixeria 11 months ago

fixeria wrote in #note-30:

I was observing a few ttcn3-bts-test regressions with this patch applied, but most of them were due to problems in TTCN-3 code:

https://cgit.osmocom.org/osmo-ttcn3-hacks/commit/?id=6c0644970afc93c4d5f86a176ac580dbdbefe319

Actually, I found a few additional regressions and had to work around them:

https://gerrit.osmocom.org/c/osmocom-bb/+/33165 fake_trx.py: remove SETSLOT based burst filtering

I also addressed code review comments and kept the fn-advance feature, as suggested by Pau. But I would like to have it disabled by default:

https://gerrit.osmocom.org/c/osmocom-bb/+/33166 trxcon: do not advance Uplink TDMA Fn by default

Actions #33

Updated by fixeria 11 months ago

pespin wrote in #note-29:

Summarized list of the things which prevent stable use/test of higher RLCMAC side:
A- virtphy: Sometimes the ref in CCCH ImmAss doesn't match the stored ref_match in modem app. T2 or T3 doesn't match.

As we figured out last week, the problem causing this is the lack of NOPE indications when using virtphy:

  • osmo-pcu needs to keep track of the current TDMA time in order to resolve the frame number of RACH.ind properly;
  • osmo-bts-virtual is not sending NOPE indications (empty PCUIF DATA.ind) to osmo-pcu;
  • in the absence of empty PCUIF DATA.ind, osmo-pcu is resolving TDMA Fn incorrectly.

pespin submitted several patches for osmo-pcu, but the problem remains AFAICT.

B- trx: Most of the time the UL blocks are scheduled too late, which means they are dropped and not received by the SGSN (IIUC should be fixed by fixeria's https://gerrit.osmocom.org/c/osmocom-bb/+/32575)

The patch removing trxcon's internal clock module is still in review (see my recent ticket updates).

C.1- virtphy/trx: If I apply this osmo-pcu fix (https://gerrit.osmocom.org/c/osmo-pcu/+/32339), which causes the UL TBF to be destroyed immediately after sending PKT CTRL ACK, then that PKT CTRL ACK is never transmitted to the PCU (it never shows up in GSMTAP). This probably means the message is freed upon TBF free, or the block/burst scheduler at the lower layers is not scheduling it. This is reproducible 100% of the time with that osmo-pcu patch applied. It probably happens because no TBF is registered any more in that TS. This should probably be fixed once we support applying a new TBF cfg (TBF_CFG.req) at a specific FN in time. The lower layer probably has to keep it in a llist until the FN matches and then apply it.

I managed to reproduce the problem, and as can be seen l1gprs is dropping UL BLOCK.req in the absence of UL/DL TBFs:

20230605205202871 DGPRS INFO trxcon(0)[0x6120000000a0]: Rx Uplink TBF config: tbf_ref=0, slotmask=0x00 (l1gprs.c:288)
20230605205202872 DGPRS DEBUG trxcon(0)[0x6120000000a0]: (PDCH-7) Unlinked UL-TBF#000 (l1gprs.c:159)
20230605205202872 DGPRS INFO trxcon(0)[0x6120000000a0]: UL-TBF#000 is unregistered and free()d (l1gprs.c:164)
20230605205202872 DGPRS DEBUG trxcon(0)[0x6120000000a0]: (PDCH-7) Rx UL BLOCK.req (fn=477815, len=23): 40 05 e0 b1 09 8c 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b 2b  (l1gprs.c:364)
20230605205202872 DGPRS ERROR trxcon(0)[0x6120000000a0]: (PDCH-7) Rx UL BLOCK.req, but this PDCH has no configured TBFs (l1gprs.c:369)

The logging clearly shows that the upper layers are sending a UL BLOCK.req after having sent the Uplink TBF config removing the TBF. It should work just fine if the upper layers first send the UL BLOCK.req before sending the Uplink TBF config, since l1gprs is not free()ing blocks which have already passed TBF presence checks. pespin does that make sense to you?

Actions #34

Updated by fixeria 11 months ago

fixeria wrote in #note-33:

The logging clearly shows that the upper layers are sending a UL BLOCK.req after having sent the Uplink TBF config removing the TBF. It should work just fine if the upper layers first send the UL BLOCK.req before sending the Uplink TBF config, since l1gprs is not free()ing blocks which have already passed TBF presence checks.

If this is impossible to implement for whatever reason, we can remove the TBF presence checks:

diff --git a/src/shared/l1gprs.c b/src/shared/l1gprs.c
index fd18e4ea..8236f4b6 100644
--- a/src/shared/l1gprs.c
+++ b/src/shared/l1gprs.c
@@ -365,11 +365,13 @@ int l1gprs_handle_ul_block_req(struct l1gprs_state *gprs,
                  "Rx UL BLOCK.req (fn=%u, len=%zu): %s\n",
                  ntohl(l1br->hdr.fn), data_len, osmo_hexdump(l1br->data, data_len));

+#if 0
        if ((pdch->ul_tbf_count == 0) && (pdch->dl_tbf_count == 0)) {
                LOGP_PDCH(pdch, LOGL_ERROR,
                          "Rx UL BLOCK.req, but this PDCH has no configured TBFs\n");
                return -EINVAL;
        }
+#endif

        *req = (struct l1gprs_prim_ul_block_req) {
                .hdr = {
@@ -409,11 +411,13 @@ struct msgb *l1gprs_handle_dl_block_ind(struct l1gprs_state *gprs,
                  BLOCK_IND_IS_PTCCH(ind) ? "PTCCH" : "PDTCH",
                  ind->hdr.fn, ind->data_len, osmo_hexdump(ind->data, ind->data_len));

+#if 0
        if ((pdch->ul_tbf_count == 0) && (pdch->dl_tbf_count == 0)) {
                LOGP_PDCH(pdch, LOGL_ERROR,
                          "Rx DL BLOCK.ind, but this PDCH has no configured TBFs\n");
                return NULL;
        }
+#endif

        msg = l1gprs_l1ctl_msgb_alloc(L1CTL_GPRS_DL_BLOCK_IND);
        if (OSMO_UNLIKELY(msg == NULL)) {

These checks were added to prevent us from shooting ourselves in the foot; they're not strictly needed.

Actions #35

Updated by pespin 11 months ago

Status update:
A- virtphy: RACH ref_req sometimes not matching: found out to be due to virtphy not sending NOPE.ind, so the cur_fn in osmo-pcu is not updated. Leaving it as is (wontfix) since we can go back to using trxcon (see below).

B- trxcon: UL blocks are scheduled too late: I tested that it is fixed by https://gerrit.osmocom.org/c/osmocom-bb/+/33166 (and the 2 patches before it)

C.1- last PKT CTRL ACK scheduled immediately before freeing the UL TBF is not transmitted. Fixed by https://gerrit.osmocom.org/c/libosmo-gprs/+/33187

C.2- quicker GMM Attach establishment once osmo-pcu patch is merged (https://gerrit.osmocom.org/c/osmo-pcu/+/32339). That patch fixes the issue and (correctly) delays the DL TBF assignment, but that means the MS moves to the IDLE state much more frequently. Since we didn't implement moving to IDLE state from packet-active state, we cannot go further if this patch is merged, hence merging it is delayed until we support going to and back from IDLE state.

Summary: "A" is WONTFIX, B and C.1 are fixed, C.2 requires implementing going to and back from the packet-IDLE state. We discussed that fixeria is going to look at implementing RLCMAC<->L1CTL primitives to go to/from IDLE state, and I can use them wherever needed inside the RLCMAC layer.

Actions #36

Updated by fixeria 11 months ago

  • Checklist item deleted (l1gprs: queue Uplink blocks)
Actions #37

Updated by fixeria 11 months ago

I did a little research on the starting time todo item:

  • 3GPP TS 44.018 (section 10.5.2.38) defines the Starting Time IE
    • this is an optional IE of the Immediate Assignment message itself (section 9.1.18)
    • similar to the Request Reference IE, the reduced encoding is used (FN modulo 42432)
  • 3GPP TS 44.018 (section 10.5.2.16) defines the IA Rest Octets IE
    • there we have the TBF_STARTING_TIME field (16 bit)
    • the encoding is the same as the value part of the Starting Time IE
  • 3GPP TS 44.060 (section 12.21) defines Starting Frame Number Description IE
    • this IE is used in various messages:
      • 11.2.7 Packet Downlink Assignment
      • 11.2.7a Multiple TBF Downlink Assignment
      • 11.2.29 Packet Uplink Assignment
      • 11.2.29a Multiple TBF Uplink Assignment
      • 11.2.31 Packet Timeslot Reconfigure
      • 11.2.31a Multiple TBF Timeslot Reconfigure
    • the encoding is not always the same as used for the TBF_STARTING_TIME field
      • Absolute Frame Number Encoding (same as the value part of the Starting Time IE)
      • Relative Frame Number Encoding (calculated from Fn of the block containing this IE)

I am guessing we don't really care about the Starting Time IE and it's mostly used for CS domain? Still I would be interested to see how the phone behaves when receiving an Immediate Assignment with both Starting Time IE and the TBF_STARTING_TIME field being present, and even worse containing different values ;) For now I suggest to ignore this IE.

In the case of the Starting Frame Number Description IE, we have two encoding variants. I suggest to offload all the complexity of the Fn calculation, be it absolute or relative, to the upper part of the RLC/MAC layers (libosmo-gprs-rlcmac) and use the absolute Fn on the L1CTL interface when talking to l1gprs. pespin agree?

Note that despite being called "absolute" in some places, it's actually the reduced Fn encoding (Fn % 42432):

3GPP TS 44.018 section 10.5.2.38

T1' (octet 2)  The T1' field is coded as the binary representation of (FN div 1326) mod 32.
T3 (octet 2 and 3)  The T3 field is coded as the binary representation of FN mod 51.
                    Bit 3 of octet 2 is the most significant bit and bit 6 of octet 3 is the least significant bit.
T2 (octet 3)  The T2 field is coded as the binary representation of FN mod 26.
NOTE 1: The frame number, FN modulo 42432 can be calculated as 51x((T3-T2) mod 26)+T3+51x26xT1'

The starting time and the times mentioned above are with reference to the frame numbering in the concerned cell.
They are given in units of frames (around 4.615 ms).

The Starting Time IE can encode only an interval of time of 42 432 frames, that is to say around 195.8 seconds.
To remove any ambiguity, the specification for a reception at time T is that the encoded interval is (T-10808, T+31623).
In rigorous terms, if we note ST the starting time:

  if 0 <= (ST-T) mod 42432 <= 31623, the indicated time is the next time when FN mod 42432 is equal to ST;
  if 32024 <= (ST-T) mod 42432 <= 42431, the indicated time has already elapsed.

The reception time T is not specified here precisely. To allow room for various MS implementations,
the limit between the two behaviours above may be anywhere within the interval defined by:

  31624 <= (ST-T) mod 42432 <= 32023.
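The NOTE 1 formula and the window rules quoted above can be sketched in C as follows. This is a minimal sketch and not code from libosmo-gprs or l1gprs: starting_time_to_rfn(), starting_time_classify() and enum st_status are hypothetical names, and reporting the implementation-defined band as a separate result is an assumption made for illustration.

```c
#include <stdint.h>

#define RFN_MODULUS 42432u /* 32 * 26 * 51 */

enum st_status {
	ST_UPCOMING,     /* 0 <= (ST-T) mod 42432 <= 31623 */
	ST_ELAPSED,      /* 32024 <= (ST-T) mod 42432 <= 42431 */
	ST_IMPL_DEFINED, /* 31624..32023: left to the MS implementation */
};

/* Reduced frame number (FN mod 42432) from T1', T2 and T3, per NOTE 1:
 * 51 x ((T3 - T2) mod 26) + T3 + 51 x 26 x T1' */
uint32_t starting_time_to_rfn(uint8_t t1p, uint8_t t2, uint8_t t3)
{
	/* (T3 - T2) mod 26, computed so that it never goes negative */
	uint32_t tt = ((uint32_t)t3 + 26u - (uint32_t)t2) % 26u;

	return 51u * tt + t3 + 51u * 26u * t1p;
}

/* Classify starting time st_rfn relative to the reception time cur_fn */
enum st_status starting_time_classify(uint32_t st_rfn, uint32_t cur_fn)
{
	uint32_t st = st_rfn % RFN_MODULUS;
	uint32_t t = cur_fn % RFN_MODULUS;
	uint32_t diff = (st + RFN_MODULUS - t) % RFN_MODULUS;

	if (diff <= 31623u)
		return ST_UPCOMING;
	if (diff >= 32024u)
		return ST_ELAPSED;
	return ST_IMPL_DEFINED;
}
```

For example, FN=1000 is encoded as T1'=0, T2=12, T3=31, and the formula gives back 51 x 19 + 31 = 1000.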

Actions #38

Updated by fixeria 11 months ago

I have submitted a Work-in-Progress implementation to Gerrit:

https://gerrit.osmocom.org/c/libosmo-gprs/+/33230 rlcmac: l1ctl_prim: add 'start_rfn' field to cfg_{ul,dl}_tbf_req [NEW]
https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/33231 library: L1CTL: add 'start_rfn' to L1ctlGprs{Ul,Dl}TbfCfgReq [NEW]
https://gerrit.osmocom.org/c/osmocom-bb/+/33222 l1ctl_proto: add 'start_fn' field to UL/DL TBF CFG.req messages [NEW]
https://gerrit.osmocom.org/c/osmocom-bb/+/33223 l1gprs: implement TBF starting time support [NEW]
https://gerrit.osmocom.org/c/osmocom-bb/+/33232 modem: pass TBF starting time from CFG UL/DL TBF Req [NEW]

After addressing code review comments by Pau, I started looking into libosmo-gprs-rlcmac. AFAICS, parsing of the TBF starting time is already implemented (grep for tbf_starting_time). I thought I could just pass the starting time via OSMO_GPRS_RLCMAC_L1CTL_CFG_{UL,DL}_TBF.Req primitives and be done with it, but then I figured out that sending of these primitives is delayed until the current TDMA Fn reaches the starting time Fn (grep for fn_cmp). This means that the RLC/MAC already implements handling of the starting time (I didn't know that), and now I am not sure why I was asked to implement this in the lower layers. I guess the idea was to offload this from the upper RLC/MAC implementation to l1gprs? But what's wrong with the existing implementation?

Also, I think I found a problem, see TBF_StartingTime_to_fn():

/* 12.21 Starting Frame Number Description */
uint32_t TBF_StartingTime_to_fn(const StartingTime_t *tbf_start_time, uint32_t curr_fn)
{
        const struct gsm_time g_time = { 
                .t1 = tbf_start_time->N32,
                .t2 = tbf_start_time->N51,
                .t3 = tbf_start_time->N26
        };  
        return gsm_gsmtime2fn(&g_time);

}

This function is currently returning a reduced TDMA Fn value (i.e. Fn % 42432). Note that argument curr_fn is not used.
I guess the idea was to convert the reduced Fn to absolute Fn using the current Fn?
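One way curr_fn could be used is to expand the reduced Fn into a full TDMA Fn, picking the occurrence that falls into the window defined for the Starting Time IE. The following is only a sketch of that idea: expand_rfn() is a hypothetical helper (not the existing TBF_StartingTime_to_fn()), and treating the implementation-defined band 31624..32023 as "already elapsed" is an assumption.

```c
#include <stdint.h>

#define GSM_HYPERFRAME 2715648u /* 2048 * 26 * 51, equals 64 * 42432 */
#define RFN_MODULUS    42432u   /* 32 * 26 * 51 */

/* Expand a reduced Fn (Fn % 42432) to an absolute Fn near curr_fn */
uint32_t expand_rfn(uint32_t rfn, uint32_t curr_fn)
{
	uint32_t st = rfn % RFN_MODULUS;
	uint32_t t = curr_fn % RFN_MODULUS;
	uint32_t diff = (st + RFN_MODULUS - t) % RFN_MODULUS;

	/* Upcoming: the next time Fn % 42432 equals the starting time */
	if (diff <= 31623u)
		return (curr_fn + diff) % GSM_HYPERFRAME;

	/* Otherwise the indicated time lies (42432 - diff) frames in the past */
	return (curr_fn + GSM_HYPERFRAME - (RFN_MODULUS - diff)) % GSM_HYPERFRAME;
}
```

Because the hyperframe length is an exact multiple of 42432, the result stays consistent with the reduced Fn across the hyperframe wrap.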

Actions #39

Updated by pespin 11 months ago

Reminder & status update: I'm currently having a look at T3192 (DL TBF) and T3168 (UL TBF) in the MS, since we may need to implement those to delay going to IDLE state.
I'm also looking at the PCU counterparts (T3193 and T3169), and I submitted a bunch of patches fixing N3** and T3** in osmo-pcu, since they were mangled for both UL and DL TBF.
I also need to re-review https://gerrit.osmocom.org/c/osmo-pcu/+/32339 after reading about those timers, since we may actually be able to still send the PKT DL ASS while MS has T3192 or T3168 running (see TS 44.060 Table 11.2.7.2: PACKET DOWNLINK ASSIGNMENT field "CONTROL_ACK" ("Poll" in wireshark) and "TBF Starting Time" in same section).

Actions #40

Updated by pespin 11 months ago

All of the above started because we also talked with fixeria about the possibility of setting "TBF Starting Time" during CCCH Imm Ass [ro=PktDlAss] to loosely match our internal X2002 osmo-pcu timer. We also have X2001 for PACCH PktDlAss.

dexter btw it seems osmo-pcu requires PCUIF .cnf for ImmAss, see TBF_EV_ASSIGN_PCUIF_CNF in tbf_dl_fsm.c, so looks like the bts/bsc patches you submitted may need to be reverted (if they were already merged)

Actions #41

Updated by fixeria 10 months ago

Status update: last week I have been working on the GRR FSM. The current state can be found here:

https://cgit.osmocom.org/osmocom-bb/log/?h=fixeria/modem_grr_fsm

I have finished designing and integrating the new FSM and can perform GMM Attach.
What's missing is the actual logic in the RLC/MAC layer triggering state transitions between PACKET_IDLE and PACKET_TRANSFER.

Actions #43

Updated by fixeria 10 months ago

Today I worked on the PDCH ESTABLISH/RELEASE.Req primitives:

https://gerrit.osmocom.org/c/libosmo-gprs/+/33370 rlcmac: add OSMO_GPRS_RLCMAC_L1CTL_PDCH_{ESTABLISH,RELEASE}.req
https://cgit.osmocom.org/osmocom-bb/commit/?h=fixeria/modem_grr_fsm&id=e690a58b7b3033ea31d33e99ddaaa3154e89a7aa

Still need to fix one issue: the timeslot mask is indicated in UL/DL TBF.Req, but not in these new primitives. Both trxcon/virtphy need to be modified to take this into account, since currently L1CTL_DM_EST_REQ is expected to contain a timeslot number. Also need to implement handling of the release primitive in the modem app. Finally, need to implement the logic in RLC/MAC triggering sending of the release primitive (pespin is currently investigating T3192, so I'll wait for him to figure out when we can declare the release).

Actions #44

Updated by fixeria 10 months ago

  • Related to Bug #3626: LAPDm code pulls both 'l1h' and 'l2h' of msgb added
Actions #45

Updated by pespin 8 months ago

Status update: We are sorting out several architectural problems regarding the lower layer primitives between trxcon (l1gprs) <-> modem <-> libosmo-gprs.
fixeria We have start_fn feature kind of working now, but we may need to do another architectural change to provide RTS.ind from within l1gprs, see my comment in version 10 of patch https://gerrit.osmocom.org/c/osmocom-bb/+/33223/10

Actions #46

Updated by pespin 7 months ago

  • Checklist item l1gprs: add TBF starting time to UL/DL TBF CFG.req set to Done

RTS.ind is already being served from within l1gprs since a few days/weeks ago.

fixeria is now set to work on implementing pkt-access-procedure retransmission (#6131).

On my side, I'm waiting for osmo-trx-ms in the physical setup to become reliable when transmitting RACH.req, since most of the time they are not being seen by OsmoBTS, which makes it really hard to test most of the stuff on the upper layers coming after that step. I'm expecting Hoernchen to have a look at that.

Once I can easily test next steps reliably, I can come back to testing the whole stack in the physical setup.

Actions #47

Updated by Hoernchen 7 months ago

Should work with https://gerrit.osmocom.org/c/osmo-trx/+/34462 - I have not seen any missing ul rach attempts, it always gets to the attach request/id response by adding 2 ts of "frame advance" to the ul bursts to get the tx data faster, see attachment. RACH retry is still important, but this should be sufficient to continue testing.

Actions #48

Updated by fixeria 7 months ago

  • Checklist item modem: implement GRR-FSM set to Done
Actions #49

Updated by fixeria 7 months ago

  • Assignee changed from pespin to fixeria

So I played with the mssdr-{bts,ms} setup a bit. Some notes below.

Hoernchen wrote in #note-47:

Should work with https://gerrit.osmocom.org/c/osmo-trx/+/34462 [...]

I can confirm that this patch does the trick. Channel access procedure works reliably now.

RACH retry is still important, but this should be sufficient to continue testing.

FYI, I submitted patches implementing it to Gerrit (see #6131).

I had to update [lib]osmo-stuff on the mssdr-ms host because nothing worked (too old versions). After that, the mobile app started to work just fine.
However, the modem app still does not: I am observing the Uplink issues. Below is a logging snippet with some additional FATAL prints I added manually:

20230923212834524 DSCH NOTICE trxcon(0)[0x556cd4d900]: (Re)configure TDMA timeslot #3 as PDCH (sched_trx.c:276)
20230923212834524 DSCH NOTICE trxcon(0)[0x556cd4d900]: TS3-PDTCH activating (sched_trx.c:476)
20230923212834525 DSCH NOTICE trxcon(0)[0x556cd4d900]: TS3-PTCCH activating (sched_trx.c:476)
20230923212834525 DSCHD FATAL trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(fn=2073313) (sched_lchan_pdtch.c:110)
20230923212834539 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS RTS.ind (fn=2073318) (trxcon_fsm.c:648)
20230923212834546 DSCHD FATAL trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(fn=2073318) (sched_lchan_pdtch.c:110)
20230923212834562 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS RTS.ind (fn=2073322) (trxcon_fsm.c:648)
20230923212834565 DSCHD FATAL trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(fn=2073322) (sched_lchan_pdtch.c:110)
20230923212834581 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS RTS.ind (fn=2073326) (trxcon_fsm.c:648)
20230923212834584 DSCHD FATAL trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(fn=2073326) (sched_lchan_pdtch.c:110)
20230923212834599 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS RTS.ind (fn=2073331) (trxcon_fsm.c:648)
20230923212834604 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS UL BLOCK.req (fn=2073331) (trxcon_fsm.c:599)
20230923212834606 DSCHD FATAL trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(fn=2073331) (sched_lchan_pdtch.c:110)
20230923212834622 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS RTS.ind (fn=2073335) (trxcon_fsm.c:648)
20230923212834625 DSCHD FATAL trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(fn=2073335) (sched_lchan_pdtch.c:110)
20230923212834626 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS UL BLOCK.req (fn=2073335) (trxcon_fsm.c:599)
20230923212834641 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS RTS.ind (fn=2073339) (trxcon_fsm.c:648)
20230923212834644 DSCHD FATAL trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(fn=2073339) (sched_lchan_pdtch.c:110)
20230923212834644 DSCHD ERROR trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(): dropping Tx primitive (current Fn=2073339, prim Fn=2073335) (sched_lchan_pdtch.c:121)
20230923212834644 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS UL BLOCK.req (fn=2073339) (trxcon_fsm.c:599)
20230923212834659 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS RTS.ind (fn=2073344) (trxcon_fsm.c:648)
20230923212834663 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS UL BLOCK.req (fn=2073344) (trxcon_fsm.c:599)
20230923212834667 DSCHD FATAL trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(fn=2073344) (sched_lchan_pdtch.c:110)
20230923212834667 DSCHD ERROR trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(): dropping Tx primitive (current Fn=2073344, prim Fn=2073339) (sched_lchan_pdtch.c:121)
20230923212834682 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS RTS.ind (fn=2073348) (trxcon_fsm.c:648)
20230923212834685 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS UL BLOCK.req (fn=2073348) (trxcon_fsm.c:599)
20230923212834685 DSCHD FATAL trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(fn=2073348) (sched_lchan_pdtch.c:110)
20230923212834685 DSCHD ERROR trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(): dropping Tx primitive (current Fn=2073348, prim Fn=2073344) (sched_lchan_pdtch.c:121)
20230923212834701 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS RTS.ind (fn=2073352) (trxcon_fsm.c:648)
20230923212834704 DSCHD FATAL trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(fn=2073352) (sched_lchan_pdtch.c:110)
20230923212834704 DSCHD ERROR trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(): dropping Tx primitive (current Fn=2073352, prim Fn=2073348) (sched_lchan_pdtch.c:121)
20230923212834704 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS UL BLOCK.req (fn=2073352) (trxcon_fsm.c:599)
20230923212834719 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS RTS.ind (fn=2073357) (trxcon_fsm.c:648)
20230923212834724 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS UL BLOCK.req (fn=2073357) (trxcon_fsm.c:599)
20230923212834726 DSCHD FATAL trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(fn=2073357) (sched_lchan_pdtch.c:110)
20230923212834727 DSCHD ERROR trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(): dropping Tx primitive (current Fn=2073357, prim Fn=2073352) (sched_lchan_pdtch.c:121)
20230923212834742 DAPP FATAL trxcon(0)[0x556cd4d900]{PACKET_DATA}: == GPRS RTS.ind (fn=2073361) (trxcon_fsm.c:648)
20230923212834745 DSCHD FATAL trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(fn=2073361) (sched_lchan_pdtch.c:110)
20230923212834745 DSCHD ERROR trxcon(0)[0x556cd4d900]: TS3-PDTCH prim_dequeue_pdtch(): dropping Tx primitive (current Fn=2073361, prim Fn=2073357) (sched_lchan_pdtch.c:121)
20230923212834746 DSCH NOTICE trxcon(0)[0x556cd4d900]: Delete TDMA timeslot #3 (sched_trx.c:226)
20230923212834747 DGPRS ERROR trxcon(0)[0x556cd4d900]: (PDCH-3) Rx UL BLOCK.req (fn=2073361, len=34), but this PDCH has no configured TBFs (l1gprs.c:656)
20230923212834775 DL1C NOTICE trxcon(0)[0x556cd4d900]{PACKET_DATA}: Received reset request (FULL) (l1ctl.c:439)
20230923212834775 DSCH NOTICE trxcon(0)[0x556cd4d900]: Reset scheduler and clock counter (sched_trx.c:190)

Here is a brief summary in tabular form:

Fn          UL BLOCK.req  prim_dequeue_pdtch()  Outcome
fn=2073331  +5ms          +7ms                  OK
fn=2073335  +4ms          +3ms                  DROPPED
fn=2073339  +3ms          +3ms                  DROPPED
fn=2073344  +4ms          +8ms                  DROPPED?
fn=2073348  +3ms          +3ms                  DROPPED
fn=2073352  +3ms          +3ms                  DROPPED
fn=2073357  +5ms          +8ms                  DROPPED?

  • It takes the upper layers somewhere between 3 ms and 5 ms to deliver an UL block (fits the theoretical max. 5.77 ms).
    • This is on RPi, running without RT prio. I saw a delay of ~1 ms on an old Intel Haswell CPU.
  • The PHY gives the upper layers lead time of ~3 ms to submit an Uplink block, but sometimes 7-8 ms.
    • Given the 2-3 TS periods of advance, I would expect (10 - 2) * 0.577 = 4.616 ms or (10 - 3) * 0.577 = 4.039 ms?
    • The 7-8 ms is most likely due to IDLE and PTCCH slots on the block boundaries.
  • Only one UL block was lucky to make it (fn=2073331).
    • All other UL blocks were dropped by the scheduler due to Fn mismatch.
    • Some blocks are getting dropped even though they were received in time.

There's actually a problem in trxcon's l1sched: UL blocks poison the Tx queue if received late. Such stale blocks are dropped while attempting to dequeue a next block, so even if a next block was received in time it does not get transmitted and becomes a stale block itself. This is why UL blocks with fn=2073344 and fn=2073357 were dropped, despite being received in time. I will work on fixing this, so that one late UL block does not block the whole queue. The bugfix will reduce the number of dropped frames, but would not solve the problem of "slow" layer23.
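The intended behaviour (one late block must not poison the rest of the queue) could look roughly like the sketch below. It is an illustration under assumptions, not the actual l1sched code or the eventual fix: struct tx_prim, struct tx_queue, fn_compare(), tx_enqueue() and tx_dequeue() are hypothetical names, the queue is assumed to be in Fn order, and a real implementation would also free the dropped primitives.

```c
#include <stddef.h>
#include <stdint.h>

#define GSM_HYPERFRAME 2715648u /* 2048 * 26 * 51 TDMA frames */

struct tx_prim {
	struct tx_prim *next;
	uint32_t fn; /* TDMA Fn this block is meant for */
};

struct tx_queue {
	struct tx_prim *head;
	struct tx_prim *tail;
	unsigned int dropped; /* number of stale prims discarded so far */
};

/* Wrap-aware comparison: <0 if a is before b, 0 if equal, >0 if after */
int fn_compare(uint32_t a, uint32_t b)
{
	uint32_t delta = (a + GSM_HYPERFRAME - b) % GSM_HYPERFRAME;

	if (delta == 0)
		return 0;
	return delta < GSM_HYPERFRAME / 2 ? 1 : -1;
}

/* Append a primitive to the tail of the queue */
void tx_enqueue(struct tx_queue *q, struct tx_prim *p)
{
	p->next = NULL;
	if (q->tail != NULL)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
}

/* Return the primitive for cur_fn, discarding all stale ones in front of it */
struct tx_prim *tx_dequeue(struct tx_queue *q, uint32_t cur_fn)
{
	while (q->head != NULL) {
		struct tx_prim *p = q->head;
		int cmp = fn_compare(p->fn, cur_fn);

		if (cmp > 0)
			return NULL; /* head is for a later block, keep it queued */

		/* Unlink the head: it is either the one we want or stale */
		q->head = p->next;
		if (q->head == NULL)
			q->tail = NULL;
		p->next = NULL;

		if (cmp == 0)
			return p;
		q->dropped++; /* stale block, keep looking */
	}
	return NULL;
}
```

With this approach the block for fn=2073339 would still be transmitted even if the late block for fn=2073335 sits in front of it, provided it reaches the queue before the scheduler asks for it.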

Actions #50

Updated by fixeria 7 months ago

  • Status changed from In Progress to Feedback

fixeria wrote in #note-49:

There's actually a problem in trxcon's l1sched: UL blocks poison the Tx queue if received late. Such stale blocks are dropped while attempting to dequeue a next block, so even if a next block was received in time it does not get transmitted and becomes a stale block itself. This is why UL blocks with fn=2073344 and fn=2073357 were dropped, despite being received in time. I will work on fixing this, so that one late UL block does not block the whole queue. The bugfix will reduce the number of dropped frames, but would not solve the problem of "slow" layer23.

Patches submitted to Gerrit:

https://gerrit.osmocom.org/c/libosmocore/+/34520 gsm: add gsm0502_fn_compare() for comparing TDMA FNs
https://gerrit.osmocom.org/c/osmo-pcu/+/34521 pdch_ul_controller: migrate from fn_cmp() to gsm0502_fn_compare()
https://gerrit.osmocom.org/c/osmocom-bb/+/34522 l1gprs: migrate to gsm0502_fn_compare()
https://gerrit.osmocom.org/c/osmocom-bb/+/34523 trxcon/l1sched: rework dequeueing of PDCH Tx prims

I will deploy them on the mssdr-ms host and give it another spin tomorrow.

Actions #51

Updated by fixeria 7 months ago

fixeria wrote in #note-50:

Patches submitted to Gerrit:

https://gerrit.osmocom.org/c/libosmocore/+/34520 gsm: add gsm0502_fn_compare() for comparing TDMA FNs
https://gerrit.osmocom.org/c/osmo-pcu/+/34521 pdch_ul_controller: migrate from fn_cmp() to gsm0502_fn_compare()
https://gerrit.osmocom.org/c/osmocom-bb/+/34522 l1gprs: migrate to gsm0502_fn_compare()
https://gerrit.osmocom.org/c/osmocom-bb/+/34523 trxcon/l1sched: rework dequeueing of PDCH Tx prims

I will deploy them on the mssdr-ms host and give it another spin tomorrow.

All patches have been merged, and meanwhile deployed on the mssdr-ms host. I did some testing, and things are looking better, but as I said before this does not fix the latency problems: I am still seeing dropped Uplink blocks because they're arriving late.

What I also noticed is that the PCU allocates multi-slot TBFs, which are not expected to work (due to the PHY limitations, IIRC). pespin I am attaching a PCAP for you, since you said you would be interested to take a look.

Actions #52

Updated by fixeria 7 months ago

(Looks like the category gprs VTY option is broken somehow, since the PCAP contains no RLC/MAC blocks at all)

Actions #53

Updated by fixeria 7 months ago

fixeria wrote in #note-51:

What I also noticed is that the PCU allocates multi-slot TBFs, which are not expected to work (due to the PHY limitations, IIRC).

As can be seen from frame 598 in the PCAP I previously attached, the PCU allocates a multi-slot DL TBF employing TS4, TS5, and TS6, most likely assuming the default msclass 12. After forcing the default multislot class to 1 (by adding multislot-class default 1 to osmo-pcu.cfg), the modem app successfully performed the ATTACH procedure, and even got a PDP CONTEXT activated. \o/ I am attaching a new PCAP.

Actions #54

Updated by fixeria 7 months ago

  • Assignee changed from fixeria to pespin
Actions #55

Updated by fixeria 7 months ago

fixeria wrote in #note-50:

Patches submitted to Gerrit:

https://gerrit.osmocom.org/c/libosmocore/+/34520 gsm: add gsm0502_fn_compare() for comparing TDMA FNs
https://gerrit.osmocom.org/c/osmo-pcu/+/34521 pdch_ul_controller: migrate from fn_cmp() to gsm0502_fn_compare()
https://gerrit.osmocom.org/c/osmocom-bb/+/34522 l1gprs: migrate to gsm0502_fn_compare()
https://gerrit.osmocom.org/c/osmocom-bb/+/34523 trxcon/l1sched: rework dequeueing of PDCH Tx prims

A patch bumping osmocom-bb submodule revision in osmo-trx.git:

https://gerrit.osmocom.org/c/osmo-trx/+/34623 osmo-trx-ms: bump osmocom-bb submodule commit

Actions #56

Updated by fixeria 7 months ago

  • Related to Bug #6201: modem: signal proper MS Radio Access Capability added
Actions #57

Updated by fixeria 7 months ago

  • Related to Feature #6132: Add MS_GPRS_Tests to osmo-ttcn3-hacks added
Actions #58

Updated by fixeria 7 months ago

  • Status changed from Feedback to Resolved
  • % Done changed from 40 to 100

We agreed that it's time to close this one and track other problems in separate tickets.
