Bug #4765

closed

tester osmo-trx-uhd uplink half-broken, sdcch/sacch failures

Added by pespin over 3 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
-
Target version:
-
Start date:
09/25/2020
Due date:
% Done:

100%

Spec Reference:

Description

Running osmo-gsm-tester tests with osmo-bts-trx today, I see SMS MO/MT not working because the acknowledgement arrives at the MS too late: the MS retransmits the request (roughly every 100 ms) before the earlier Accept arrives.
The delay seems to appear in osmo-bts-trx: a CM Service Accept or LU Accept arrives over RSL, and it takes around 300 ms until the data is sent on Um according to GSMTAP in the pcap trace. In the meantime, the MS retransmits.

I attach the full run of osmo-gsm-tester performing a MO USSD to ask for the msisdn with an osmo-bts-trx + b200.
I also attach 2 pcaps taken while testing, one during a MO MT SMS and one during the MO USSD explained above.


Files

ussd_run.tgz 131 KB pespin, 09/25/2020 01:56 PM
accept_too_late_in_gsmtap_1msUSSD.pcapng 121 KB pespin, 09/25/2020 01:56 PM
accept_too_late_in_gsmtap.pcapng 441 KB pespin, 09/25/2020 01:56 PM
accept_too_late_in_gsmtap_withgsmtaplog.pcapng.xz 13.9 MB Hoernchen, 09/25/2020 08:28 PM
os4765_meas.png 52 KB fixeria, 09/30/2020 08:21 AM
sdcch.png 1.42 MB tester sdcch broken frame Hoernchen, 10/08/2020 08:48 PM
tester_sdcch_broken.pcap 19.5 KB bid0123 detected, but BER high/broken data Hoernchen, 10/08/2020 08:50 PM
Actions #1

Updated by Hoernchen over 3 years ago

Looking at the lapdm in accept_too_late_in_gsmtap.pcapng with filter (lapdm) && (gsmtap.chan_type == 6) there are two problems here:
  • UL appears to be completely broken, so there are a lot of DL retransmissions
  • wireshark does not dissect those messages properly: they are shown as just a (Fragment), even though they are complete messages.

605 is an ID request, retransmitted in 617, 640 and 654 but not dissected. There is no UL response until the rf channel release.

UL appears to kind of start working after 20s or so, so the LU request gets accepted afterwards. But even though there are no retransmissions, wireshark drops the ball again in packet 855, which contains a complete LU accept but is again not dissected and is marked as (Fragment).

The next LU appears to work and is finally dissected properly on the ul and dl.

Then a service request happens, and the UE starts sending a genuine segmented data blob with the M bit set to 1 in packet 1073, but there is no further UL data. The network then tries to release the channel, which is not acknowledged either, so it is again retransmitted in 1187.

This is repeated once.

Then another attempt at least makes it to 2 UL fragments before failing...

well, at some later point it actually manages to get all three fragments through, but the cp ack is never acknowledged by the MS and keeps getting repeated on the DL.
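
For readers checking the fragments by hand: the M bit mentioned above lives in the LAPDm length indicator octet (length in bits 8..3, M in bit 2, EL in bit 1). The following is only a minimal sketch for decoding that octet; the function name and the example octet values are made up for illustration and are not taken from the attached captures.

```python
def parse_length_indicator(octet):
    """Split a LAPDm length indicator octet into its three fields.

    Layout: bits 8..3 = length, bit 2 = M (more segments follow),
    bit 1 = EL (last octet of the length field).
    """
    if not 0 <= octet <= 0xFF:
        raise ValueError("length indicator must fit in one octet")
    return {
        "length": (octet >> 2) & 0x3F,   # number of payload octets in this frame
        "more": bool((octet >> 1) & 1),  # True: message continues in the next I frame
        "el": bool(octet & 1),
    }


# Hypothetical example values, not read from the pcap:
first_segment = parse_length_indicator(0x53)  # 20 octets, M=1, EL=1
last_segment = parse_length_indicator(0x19)   # 6 octets, M=0, EL=1
print(first_segment, last_segment)
```

A frame with "more" set is a genuine fragment; a frame with M=0 that is still shown as (Fragment) is the dissector problem described in the comment.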

Actions #2

Updated by Hoernchen over 3 years ago

Everything is broken: as expected, UL bursts are not detected, see (gsmtap_log.string contains "669517") in the extended log.
This run is osmo-trx 0fbdfefebc30e389c69f132abda42f79bf8bd620, which is the convolution fix.

Actions #4

Updated by Hoernchen over 3 years ago

  • Subject changed from osmo-bts-sysmo forwarding LU/CM Service Accept to MS too late to tester osmo-bts-sysmo / osmo-trx-uhd uplink half-broken, sdcch/sacch failures
Actions #5

Updated by pespin over 3 years ago

I think I made a typo writing "osmo-bts-sysmo" on the topic, I probably meant osmo-bts-trx.

Actions #6

Updated by Hoernchen over 3 years ago

  • Subject changed from tester osmo-bts-sysmo / osmo-trx-uhd uplink half-broken, sdcch/sacch failures to tester osmo-trx-uhd uplink half-broken, sdcch/sacch failures
Actions #7

Updated by Hoernchen over 3 years ago

Filtering the verbose capture with (gsmtap_log.string contains "GMSK") || (gsmtap_log.string contains "bad data") || osmo_trxc || lapdm || gsm_ipa shows that the UL bursts osmo-bts complains about are just missing and are actually nope indications. So the UL bursts are not broken; for some reason they are just never detected and therefore never delivered to osmo-bts.

Actions #8

Updated by fixeria over 3 years ago

I quickly analyzed 'accept_too_late_in_gsmtap_1msUSSD.pcapng', and the first thing that caught my attention is the measurement reports:

(gsm_a.dtap.msg_rr_type == 0x15) || (gsm_abis_rsl.msg_type == 0x28)

As can be seen, we have non-zero BER on both Uplink and Downlink.

Actions #9

Updated by fixeria over 3 years ago

Some observations:

  • Downlink RxLevel (reported by the MS) remains equal to 13 (around -100 dBm), and this is expected, because the BTS transmits with constant power.
  • Uplink RxLevel (reported by the BTS) changes in range 41 .. 47 (-70 dBm .. -68 dBm), not significant.
  • RxQual 7 corresponds to BER > 12.8% (maximum) and we reach it.
  • Some Downlink Measurement reports are lost.
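
To make the RxQual figure above easier to read, here is a small sketch of the usual RxQual to BER mapping (the ranges are the commonly quoted ones from 3GPP TS 45.008; the helper names are mine and the exact boundary handling is simplified).

```python
# BER ranges in percent per RxQual value 0..7; None means "no upper bound".
RXQUAL_BER_PERCENT = (
    (0.0, 0.2),
    (0.2, 0.4),
    (0.4, 0.8),
    (0.8, 1.6),
    (1.6, 3.2),
    (3.2, 6.4),
    (6.4, 12.8),
    (12.8, None),
)


def rxqual_to_ber_range(rxqual):
    """Return the (low, high) BER range in percent for an RxQual value."""
    if not 0 <= rxqual <= 7:
        raise ValueError("RxQual must be in 0..7")
    return RXQUAL_BER_PERCENT[rxqual]


def describe_rxqual(rxqual):
    """Human-readable form of the BER range for an RxQual value."""
    low, high = rxqual_to_ber_range(rxqual)
    if high is None:
        return f"RxQual {rxqual}: BER > {low}%"
    if low == 0.0:
        return f"RxQual {rxqual}: BER < {high}%"
    return f"RxQual {rxqual}: {low}% <= BER < {high}%"


for value in range(8):
    print(describe_rxqual(value))
```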
Actions #10

Updated by laforge over 3 years ago

  • Assignee set to pespin
  • Priority changed from Normal to High
Actions #11

Updated by Hoernchen over 3 years ago

  • I've patched the osmo-trx binary to disable SSE (which was recently activated for the tester): no change, so it's not SSE related at all.
  • I've dumped the bursts, and... well, see the attached plot for one frame that matches the attached pcap. There is a burst in ts0 that is detected (and broken?), but all timeslots look a bit odd considering everything is connected with cables. It's also hard to explain why all the other TS appear to have the same signal level/rssi: this is the uplink after all, and there should not be anything transmitting there, since this is the sms test.
  • red in the plot is abs()
Actions #12

Updated by Hoernchen over 3 years ago

roh can you take a look at the physical tester setup? There is no reasonable explanation for such a bad SNR and a constant noise level on all timeslots in the uplink; genuine bursts should at least have more power than the noise.

Actions #13

Updated by pespin over 3 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100

The issue seems fixed by increasing rx-gain from previous 25 to 50.

The problem is probably related to an incorrect uplink RSSI being reported (due to #4468), so the BTS instructs the MS to lower its tx power through the MS power loop. As a result, short channel transitions work, but at some point the BTS starts receiving nothing from the MS (so CM Service Req+Accept is done fine, but the CP-DATA is never received at the BTS despite being sent by the MS according to the ofono qmi log in journald).

By increasing the rx-gain to 50 the received signal can still be decoded. I also implemented an improvement for the calculated RSSI (#4468) which should also improve the situation (by providing more realistic values, so the BTS does not lower the MS tx power so much).

Why did it use to work with rx-gain 25 and doesn't now? I bet it is because of some improvement/implementation of the MS power loop or some fix in measurement reports.
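
To illustrate the suspected mechanism, here is a toy simulation of a power control loop fed with a biased RSSI. This is not the osmo-bts algorithm: the function name, the target level, the step size, the power limits and the bias are all hypothetical values picked only to show the effect, and it assumes the RSSI is over-estimated.

```python
def run_power_loop(path_loss_db, rssi_bias_db, target_dbm=-75, step_db=2,
                   ms_max_dbm=33, ms_min_dbm=5, rounds=30):
    """Toy MS power loop; returns (final MS tx power, true rx level at the BTS).

    All parameters are illustrative assumptions, not osmo-bts defaults.
    """
    ms_tx = ms_max_dbm
    for _ in range(rounds):
        true_rx = ms_tx - path_loss_db       # what really arrives at the BTS
        reported = true_rx + rssi_bias_db    # what the BTS believes it receives
        if reported > target_dbm and ms_tx - step_db >= ms_min_dbm:
            ms_tx -= step_db                 # BTS asks the MS to lower its power
        elif reported < target_dbm and ms_tx + step_db <= ms_max_dbm:
            ms_tx += step_db                 # BTS asks the MS to raise its power
    return ms_tx, ms_tx - path_loss_db


# Same hypothetical path loss, with and without an over-estimated RSSI.
good_tx, good_rx = run_power_loop(path_loss_db=100, rssi_bias_db=0)
bad_tx, bad_rx = run_power_loop(path_loss_db=100, rssi_bias_db=15)
print("unbiased RSSI: MS tx", good_tx, "dBm, true rx", good_rx, "dBm")
print("biased RSSI:   MS tx", bad_tx, "dBm, true rx", bad_rx, "dBm")
```

With an over-estimated RSSI the loop settles on a lower MS power, so the true level at the receiver ends up well below the target. That would match short exchanges working right after channel activation and longer ones failing once the loop has converged.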
