Project

General

Profile

Actions

Bug #6338

open

ms-sdr gprs startup issues

Added by Hoernchen about 1 month ago. Updated about 1 month ago.

Status:
Feedback
Priority:
Normal
Assignee:
Category:
-
Target version:
-
Start date:
01/23/2024
Due date:
% Done:

0%

Resolution:
Spec Reference:

Description

Most of the time the startup fails at the assignment stage, see attached log & pcap (mssdr.log continues like this indefinitely)


Files

gprs.pcapng gprs.pcapng 8.61 MB Hoernchen, 01/23/2024 12:25 PM
modem.log modem.log 5.06 KB Hoernchen, 01/23/2024 12:25 PM
mssdr.log mssdr.log 10.5 KB Hoernchen, 01/23/2024 12:25 PM
modem2.log modem2.log 11.8 KB Hoernchen, 01/24/2024 08:35 AM
Actions #1

Updated by pespin about 1 month ago

  • Assignee changed from pespin to Hoernchen

Seems like you are not running with proper permissions (eg. as root):

20240123131700331 DTUN INFO apn.c:87 APN(internet): Opening TUN device modem4
20240123131700331 DLGLOBAL INFO tun.c:357 TUN(internet,if=modem4/0,ns=foobar): Open tun: Switch to netns 'foobar'
20240123131700332 DLGLOBAL ERROR tun.c:361 Open tun: Cannot switch to netns 'foobar': Permission denied (13)
20240123131700332 DTUN ERROR apn.c:96 APN(internet): Failed to configure tun device
20240123131700332 DTUN ERROR apn_fsm.c:72 APN[0x559b145150]{INACTIVE}: transition to state DISABLED not permitted!
20240123131700332 DLGLOBAL NOTICE cpu_sched_vty.c:471 Setting SCHED_RR priority 1

Seems like it's sending the RACH req correctly (matches pcap from osmo-bts point of view) but it never receives the Imm Ass from CCCH (doesn't match what osmo-bts is doing according to pcap).

So it looks like a problem in ms-sdr. fixeria I see last line is "Tx Dedic.Mode Est Req", but shouldn't lower layers answer the request?

Looking at mssdr.log, it seems the modem is actually receiving the ImmAss because it's switching to TS=3:

20240123131700836 DL1C NOTICE trxcon(0)[0x556c9c98e0]{BCCH_CCCH}: Received CCCH mode request (COMBINED) (l1ctl.c:494)
20240123131700836 DSCH NOTICE trxcon(0)[0x556c9c98e0]: (Re)configure TDMA timeslot #0 as BCCH+CCCH+SDCCH/4+SACCH/4 (sched_trx.c:276)
20240123131700836 DSCH NOTICE trxcon(0)[0x556c9c98e0]: TS0-SCH activating (sched_trx.c:476)
20240123131700836 DSCH NOTICE trxcon(0)[0x556c9c98e0]: TS0-BCCH activating (sched_trx.c:476)
20240123131700836 DSCH NOTICE trxcon(0)[0x556c9c98e0]: TS0-RACH activating (sched_trx.c:476)
20240123131700836 DSCH NOTICE trxcon(0)[0x556c9c98e0]: TS0-CCCH activating (sched_trx.c:476)
20240123131702249 DL1C NOTICE trxcon(0)[0x556c9c98e0]{BCCH_CCCH}: Received 8-bit RACH request (offset=0, ra=0x79) (l1ctl.c:542)
20240123131702288 DSCHD NOTICE trxcon(0)[0x556c9c98e0]: TS0-RACH Scheduled 8-bit RACH (TS0: GSM, GMSK) at fn=176882 (sched_lchan_rach.c:130)
20240123131702503 DL1C NOTICE trxcon(0)[0x556c9c98e0]{BCCH_CCCH}: Received L1CTL_DM_EST_REQ (tn=3, chan_nr=0xc3, tsc=7, tch_mode=SIGNALLING) (l1ctl.c:630)
20240123131702503 DL1C NOTICE trxcon(0)[0x556c9c98e0]{BCCH_CCCH}: L1CTL_DM_EST_REQ indicates single ARFCN GSM900 979 (l1ctl.c:572)
20240123131702503 DSCH NOTICE trxcon(0)[0x556c9c98e0]: Reset scheduler  (sched_trx.c:190)
20240123131702503 DSCH NOTICE trxcon(0)[0x556c9c98e0]: Delete TDMA timeslot #0 (sched_trx.c:226)
20240123131702503 DSCH NOTICE trxcon(0)[0x556c9c98e0]: Add a new TDMA timeslot #3 (sched_trx.c:207)
20240123131702503 DSCH NOTICE trxcon(0)[0x556c9c98e0]: (Re)configure TDMA timeslot #3 as PDCH (sched_trx.c:276)
20240123131702503 DSCH NOTICE trxcon(0)[0x556c9c98e0]: TS3-PDTCH activating (sched_trx.c:476)
20240123131702503 DSCH NOTICE trxcon(0)[0x556c9c98e0]: TS3-PTCCH activating (sched_trx.c:476)

Afterwards however I see some errors saying TS3 is not available? something happened in between?

20240123131702526 DGPRS ERROR trxcon(0)[0x556c9c98e0]: (PDCH-3) Rx DL BLOCK.ind (fn=176930), but this PDCH has no configured TBFs (l1gprs.c:696)
20240123131702526 DGPRS ERROR trxcon(0)[0x556c9c98e0]: (PDCH-3) Rx RTS.ind (fn=176934, usf=255), but this PDCH has no configured TBFs (l1gprs.c:774)

I have the feeling we are missing lots of stuff there. Hoernchen which errors levels where used on the UE side? Can you use something like this in modem and report back?

 logging level set-all info
 logging level rlcmac debug
 logging level llc debug
 logging level sndcp debug
 logging level tun debug
 logging level rr debug

You can also maybe try running trxcon with:

--debug=DSCH,2:DSCHD,2:DGPRS,2:DAPP,2

Actions #2

Updated by pespin about 1 month ago

  • Status changed from New to Feedback
Actions #3

Updated by fixeria about 1 month ago

pespin wrote in #note-1:

So it looks like a problem in ms-sdr. fixeria I see last line is "Tx Dedic.Mode Est Req", but shouldn't lower layers answer the request?

No, there is no confirmation for this procedure. The L1 simply tunes to PDCH [or fails to do so] and that's it.

Afterwards however I see some errors saying TS3 is not available? something happened in between?

The problem is that somehow we never received any TBF configuration request from libosmo-gprs-rlcmac... Do you remember what is triggering sending those?

Actions #4

Updated by Hoernchen about 1 month ago

Error log level for modem was just the default, same for the lower sdr/ms layer. Yes, the permission issue, just ignore that, that is unrelated and just a failed attempt to nest netns(s), and I am currently not getting to the point where the tun device would matter..
Attached modem log with suggested log levels.

Actions #5

Updated by Hoernchen about 1 month ago

... it looks like just increasing the modem stdout debug levels makes it work much better, so I guess this "fine" for the time being.. This is somewhat unexpected, tho, usually stuff breaks by slowing it down by logging too much.

Actions #6

Updated by Hoernchen about 1 month ago

  • Status changed from Feedback to Stalled

I'll just set this to stalled right now, I suppose I can work with the current state, unless it degrades again.

Actions #7

Updated by pespin about 1 month ago

fixeria wrote in #note-3:

The problem is that somehow we never received any TBF configuration request from libosmo-gprs-rlcmac... Do you remember what is triggering sending those?

fixeria it does, see the "Tx Dedic.Mode Est Req" which should be configuring the TS3 PDCH and which se saw happening in the mssdr.log.

It is also confirmed by looking at updated log file with more verbosity:

20240124092047994 DRLCMAC DEBUG rlcmac_prim.c:677 Rx from lower layers: L1CTL-CCCH_DATA.indication
20240124092047994 DRLCMAC DEBUG rlcmac_prim.c:545 Tx to lower layers: L1CTL-PDCH_ESTABLISH.request
20240124092047994 DRR DEBUG rlcmac.c:156 GPRS-RR(1)[0x55800585b0]{PACKET_ACCESS}: Received Event GRR_EV_PDCH_ESTABLISH_REQ
20240124092047994 DRR INFO grr.c:485 GPRS-RR(1)[0x55800585b0]{PACKET_ACCESS}: PDCH Establish.Req: TSC=7, H0, ARFCN=979
20240124092047994 DL1C INFO l1ctl.c:553 Tx Dedic.Mode Est Req (arfcn=979, chan_nr=0xc3)
20240124092047994 DRR DEBUG grr.c:521 GPRS-RR(1)[0x55800585b0]{PACKET_ACCESS}: state_chg to PACKET_TRANSFER

Actions #8

Updated by fixeria about 1 month ago

pespin wrote in #note-7:

fixeria wrote in #note-3:

The problem is that somehow we never received any TBF configuration request from libosmo-gprs-rlcmac...

it does, see the "Tx Dedic.Mode Est Req" which should be configuring the TS3 PDCH and which se saw happening in the mssdr.log.

You're talking about OSMO_GPRS_RLCMAC_L1CTL_PDCH_ESTABLISH:

/* OSMO_GPRS_RLCMAC_L1CTL_PDCH_ESTABLISH | Req */
struct {
        uint8_t ts_nr;  /* Timeslot Number */
        uint8_t tsc;    /* Training Sequence Code */
        uint8_t ta;     /* Timing Advance */
        bool fh;        /* Frequency Hopping */
        union { 
                uint16_t arfcn;
                struct {
                        uint8_t hsn;
                        uint8_t maio;
                        uint8_t ma_len; 
                        uint8_t ma[8];
                } fhp; /* fh == true */
        };      
} pdch_est_req;

But this primitive contains no UL/DL TBF configuration parameters. It contains only those necessary to switch from CCCH/BCCH to PDCH.

I was specifically mentioning TBF configuration primitives, one of OSMO_GPRS_RLCMAC_L1CTL_CFG_[UD]L_TBF.Req:

/* OSMO_GPRS_RLCMAC_L1CTL_CFG_UL_TBF | Req */
struct {
        uint8_t ul_tbf_nr;
        uint8_t ul_slotmask;
        uint32_t start_fn; /* TBF starting time (absolute Fn), UINT32_MAX = Invalid */
        uint8_t ul_usf[8]; /* USF for each PDCH indicated in the slotmask */
} cfg_ul_tbf_req;
/* OSMO_GPRS_RLCMAC_L1CTL_CFG_DL_TBF | Req */
struct {
        uint8_t dl_tbf_nr;
        uint8_t dl_slotmask;
        uint32_t start_fn; /* TBF starting time (absolute Fn), UINT32_MAX = Invalid */
        uint8_t dl_tfi; /* DL TFI for all PDCHs indicated in the slotmask */
} cfg_dl_tbf_req;

None of those were received, so this is why trxcon/l1gprs complains "... but this PDCH has no configured TBFs".

Actions #9

Updated by pespin about 1 month ago

  • Status changed from Stalled to Feedback
Looking at modem2.log:
20240124092047741 DRLCMAC INFO tbf_ul_ass_fsm.c:277 UL_TBF_ASS[0x5580151b60]{IDLE}: state_chg to WAIT_CCCH_IMM_ASS
...
20240124092047994 DRLCMAC DEBUG rlcmac_prim.c:677 Rx from lower layers: L1CTL-CCCH_DATA.indication
20240124092047994 DRLCMAC DEBUG rlcmac_prim.c:545 Tx to lower layers: L1CTL-PDCH_ESTABLISH.request

L1CTL-CCCH_DATA.indication ends up calling function gprs_rlcmac_handle_ccch_imm_ass() in libosmo-gprs-rlcmac which handles the CCCH Imm.Ass. message:+

Then, that code paths calls this function:

bool gprs_rlcmac_tbf_ul_ass_wait_ccch_imm_ass(const struct gprs_rlcmac_ul_tbf *ul_tbf)
{
    return ul_tbf->ul_ass_fsm.fi->state == GPRS_RLCMAC_TBF_UL_ASS_ST_WAIT_CCCH_IMM_ASS;
}

static int gprs_rlcmac_handle_ccch_imm_ass_ul_tbf(uint8_t ts_nr, uint32_t fn, const struct gsm48_imm_ass *ia, const IA_RestOctets_t *iaro)
{
    int rc = -ENOENT;
    struct gprs_rlcmac_entity *gre;
    struct gprs_rlcmac_ul_tbf *ul_tbf;
    struct tbf_ul_ass_ev_rx_ccch_imm_ass_ctx d = {
        .fn = fn,
        .ts_nr = ts_nr,
        .ia = ia,
        .iaro = iaro
    };

    llist_for_each_entry(gre, &g_rlcmac_ctx->gre_list, entry) {
        ul_tbf = gre->ul_tbf;
        if (!ul_tbf)
            continue;
        if (!gprs_rlcmac_tbf_ul_ass_wait_ccch_imm_ass(ul_tbf))
            continue;
        rc = osmo_fsm_inst_dispatch(ul_tbf->ul_ass_fsm.fi,
                        GPRS_RLCMAC_TBF_UL_ASS_EV_RX_CCCH_IMM_ASS,
                        &d);
        break;
    }
    return rc;
}

In there, gprs_rlcmac_tbf_ul_ass_wait_ccch_imm_ass() should return true because log says "tbf_ul_ass_fsm.c:277 UL_TBF_ASS[0x5580151b60]{IDLE}: state_chg to WAIT_CCCH_IMM_ASS". So GPRS_RLCMAC_TBF_UL_ASS_EV_RX_CCCH_IMM_ASS should be dispatched, but I don't see it in the log file... Acording to same log, the event should be printed if it was dispatched, so it's not reaching that point...

Hoernchen you'd need to debug starting at gprs_rlcmac_handle_ccch_imm_ass() through gprs_rlcmac_handle_ccch_imm_ass_ul_tbf() and see what happens and report back. I don't really get why it's not dispatching the event...

Actions #10

Updated by Hoernchen about 1 month ago

https://osmocom.org/issues/6342 is more important right now, because if it works it just keeps working, and it appears to be timing related.

Actions

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)