Bug #4658

open

Wrong burst order in a multi-trx setup

Added by fixeria over 3 years ago. Updated over 3 years ago.

Status: Stalled
Priority: High
Assignee:
Category: TRX Toolkit
Target version: -
Start date: 07/08/2020
Due date:
% Done: 80%
Resolution:
Spec Reference:

Description

While running the existing test cases from ttcn3-bts-test on hopping channels (see #4546), I noticed that trxcon sometimes starts to consume a lot of CPU time. As it turned out, this happens because the burst loss detection logic in trxcon somehow concludes that the whole TDMA hyper-frame is lost, so it tries to substitute ~2715647 allegedly lost TDMA frames with dummy bursts. Of course this is a bug, because we are not supposed to compensate for more than one TDMA multi-frame period. The immediate problem was a missing 'return' statement:

https://gerrit.osmocom.org/c/osmocom-bb/+/19183 trxcon/scheduler: subst_frame_loss(): print current TDMA fn
https://gerrit.osmocom.org/c/osmocom-bb/+/19184 trxcon/scheduler: fix subst_frame_loss(): do not compensate too much

However, I was interested to know what exactly tricks the burst loss detection logic into thinking that so many frames are lost.

/*! Return the difference of two specified TDMA frame numbers (subtraction) */
#define GSM_TDMA_FN_SUB(a, b) \
        ((a + GSM_TDMA_HYPERFRAME - b) % GSM_TDMA_HYPERFRAME)

/* How many frames elapsed since the last one? */
elapsed = GSM_TDMA_FN_SUB(fn, lchan->tdma.last_proc);
if (elapsed > mf->period) {
        LOGP(DSCHD, LOGL_NOTICE, "Too many (>%u) contiguous TDMA frames elapsed (%u) " 
                                 "since the last processed fn=%u (current %u)\n",
                                 mf->period, elapsed, lchan->tdma.last_proc, fn);
        return -EIO;
} else if (elapsed == 0) {
        LOGP(DSCHD, LOGL_ERROR, "No TDMA frames elapsed since the last processed " 
                                "fn=%u, must be a bug?\n", lchan->tdma.last_proc);
        return -EIO;
}

A slightly more informative logging message gives us a hint:

sched_trx.c:640 Too many (>104) contiguous TDMA frames elapsed (2715647) since the last processed fn=633 (current fn=632)

So a burst with TDMA fn=632 was, for some reason, received late: we had already received a burst with TDMA fn=633.

This is definitely unexpected, and of course the subtraction results in a huge number: ((632 + 2715648) - 633) % 2715648 == 2715647.
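
To make the wrap-around concrete, here is a minimal sketch of the same arithmetic. The macro is the one quoted above, with its arguments parenthesized. fn_signed_diff() is a hypothetical helper, not part of trxcon or libosmocore. It assumes that any modular difference larger than half a hyper-frame means the burst arrived out of order rather than that frames were lost.

```c
#include <stdint.h>

#define GSM_TDMA_HYPERFRAME 2715648

/* Same arithmetic as the macro quoted above, arguments parenthesized */
#define GSM_TDMA_FN_SUB(a, b) \
        (((a) + GSM_TDMA_HYPERFRAME - (b)) % GSM_TDMA_HYPERFRAME)

/* Hypothetical helper (an assumption, not existing trxcon code):
 * interpret the modular difference as a signed distance, so that a
 * burst arriving one frame late yields -1 instead of 2715647. */
static int32_t fn_signed_diff(uint32_t fn, uint32_t last_proc)
{
        uint32_t diff = GSM_TDMA_FN_SUB(fn, last_proc);

        /* More than half a hyper-frame "ahead" is treated as "behind" */
        if (diff > GSM_TDMA_HYPERFRAME / 2)
                return (int32_t) diff - GSM_TDMA_HYPERFRAME;

        return (int32_t) diff;
}
```

With the values from the log line, GSM_TDMA_FN_SUB(632, 633) evaluates to 2715647, while fn_signed_diff(632, 633) evaluates to -1. This shows that the frames are merely swapped, not lost.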


Files

trxd_order.pcapng (14.8 KB), fixeria, 07/08/2020 01:32 PM

Related issues

Related to OsmoBTS - Feature #4546: baseband frequency hopping support for osmo-bts-trx (Resolved, fixeria, 05/12/2020)
Related to Cellular Network Infrastructure - Feature #4006: TRX protocol: wind of change (Stalled, fixeria, 05/17/2019)