Project

General

Profile

Actions

Bug #3025

closed

BTS_Tests.TC_paging_tmsi_200percent fails

Added by laforge about 6 years ago. Updated over 5 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
Start date:
03/02/2018
Due date:
% Done:

100%

Spec Reference:

Description

According to
https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bts-test/test_results_analyzer/
this test case fails consistently, it must be investigated.


Files


Related issues

Related to OsmoBTS - Feature #3155: execute BTS_Tests.ttcn with real (C123) phone hardware in LTHW setupStalledpespin04/10/2018

Actions
Actions #1

Updated by laforge about 6 years ago

Expected (924 .. 978) pagings but have 900

So somehow on the jenkins less paging messages make it compared to my local setup. interesting.

Actions #2

Updated by laforge almost 6 years ago

on the jenkins build slave:

09:30:00.302175 mtc BTS_Tests.ttcn:1569 num_paging_sent=1359 rcvd_msgs=228 rcvd_ids=900
09:30:00.302235 mtc BTS_Tests.ttcn:1660 setverdict(fail): none -> fail reason: "Expected (924 .. 978) pagings but have 900", new component reason: "Expected (924 .. 978) pagings but have 900" 

on my laptop:

14:20:13.136162 mtc BTS_Tests.ttcn:1569 num_paging_sent=1359 rcvd_msgs=244 rcvd_ids=964

so for some reason, on my laptop 244-228=16 more messages have been received, each containing 4 identities, resulting in a difference of 64 paged IDs. But why?

Actions #3

Updated by laforge almost 6 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100
Actions #4

Updated by pespin over 5 years ago

  • Status changed from Resolved to New

This test fails with osmo-bts-trx+osmo-trx+motorola CXX but in the other direction, that is, it's now expecting less messages than the amount arriving:

MTC@6a1670a53352: Setting RSL_SYSTEM_INFO_4 (4): '31061C62F224002A4740E50400'O
MTC@6a1670a53352: pch_blocks_per_sec=67.973856 interval=0.014712
...
MTC@6a1670a53352: num_paging_sent=1359 rcvd_msgs=318 rcvd_ids=1258
MTC@6a1670a53352: Test case TC_paging_tmsi_200percent finished. Verdict: fail reason: Expected (896 .. 978) pagings but have 1258

So it seems https://gerrit.osmocom.org/7811 is not the proper definitive fix for this test.

Actions #5

Updated by pespin over 5 years ago

  • Assignee changed from laforge to pespin
Actions #6

Updated by pespin over 5 years ago

  • Related to Feature #3155: execute BTS_Tests.ttcn with real (C123) phone hardware in LTHW setup added
Actions #7

Updated by pespin over 5 years ago

This is what we get when running docker non-HW test:

10:41:24.054141 mtc BTS_Tests.ttcn:1761 num_paging_sent=1359 rcvd_msgs=236 rcvd_ids=935

Ratio of received ids per message is similar (1272/322=3.95 vs 935/236=3.96), but in the non-HW one a lot less messages are received in the MS (318 vs 236).

Actions #8

Updated by laforge over 5 years ago

On Wed, Sep 26, 2018 at 01:51:28PM +0000, pespin [REDMINE] wrote:

Ratio of received ids per message is similar (1272/322=3.95 vs 935/236=3.96), but in the non-HW one a lot less messages are received in the MS (318 vs 236).

this is odd. After all, the bandwidth of the downlink CCCH (and the PCH within
that) is fixed. It is impacted by BS_AG_BLKS_RES as well as the channel configuration
(combined vs. non-combined BCCH/CCCH). But neither of those should change between
executing with trxcon+fake_trx or with real hardware.

The number of PCH slots determines the number of PCH messages that can be sent,
and the number of PCH messages determines the number of paging identities,
depending on whether PAGING TYPE 1/2/3 is generated based on the identities.

So in order to debug this, I think I would start to count the number of PCH
messages sent on the BTS side vs. those received by the MS. This can be done
completely outside of the osmo-gsm-tester setup and in any virtual or real setup.

pespin can help you with computing the theoritcal amount of PCH blocks on a
combined and a non-combined CCCH, which should give you some context as to what
we actually expect.

Make sure to distinguish between
  • the number of RSL PAGING messages on Abis,
  • the number of paging blocks sent on the PCH
  • the number of identities contained in above blocks

Those figures are all separate, and there are processes within OsmoBTS which
aggregate RSL PAGING commands into PCH blocks with multiple identities. And
it's that paging aggregation process that we actually want to test under overload
conditions, i.e. when a sustained flow of RSL PAGING messages exceeds twice
the capacity of the PCH capacity, even if the aggregation works ideal.

Actions #9

Updated by pespin over 5 years ago

Comparing time span of HW and trxcon setups (time from 1st packet to last one):

GSMTAP Paging Info:
  • HW: 37 seconds
  • trxcon: 27 seconds
RSL Paging Command:
  • HW: 32 seconds
  • trxcon: 20 seconds

So the issue is clearly with TTCN3 in the HW setup not sending RSL paging commands in the specified time (20 seconds). As a result, BTS receives paging at a slower peace and hence has time to forward more IDs to the MS.

Following comparision of graphs from pcap trace in both setups shows the issue (HW setup run sends less paging commands per second). This is probably due to the HW setup running osmo-trx on an APU machine, which eats most of the CPU and thus TTCN3 test is not scheduled on time. TTCN3 test always sends a paging command, then waits for a fixed interval, instad of accunting for scheduling delays.

So we need to fix the test to account for scheduling delay, then it should be fine.

Actions #10

Updated by pespin over 5 years ago

After my changes to account for scheduling delays, variation of packet frequency and time span over whole test look a lot better, and more similar in both machines (my high end laptop running trxcon setup and the low end APU running osmo-trx with HW setup).

However, since now the test really follows the 20 seconds, throughput of received paging IDs has slowed down a bit, because previous implementation always drifted a bit (like 2 secs) even in high end setups, so osmo-bts had more time to manage the pagings.

For imsi_200: "Expected (543 .. 577) pagings but have 541" on trxcon, "Expected (543 .. 577) pagings but have 536" on HW setup.
For tmsi_200: "Expected (896 .. 978) pagings but have 883" on trxcon, "(896 .. 978) pagings but have 874" on HW setup (and sometimes it even passes with current thresholds).

Actions #11

Updated by pespin over 5 years ago

  • Status changed from New to Feedback

Should be fixed by:
https://gerrit.osmocom.org/#/c/osmo-ttcn3-hacks/+/11103 bts: Account for elapsed time in paging timer

I tested it several time in my PC (docker, trxcon) as well as on real HW with osmo-trx.

Actions #12

Updated by pespin over 5 years ago

  • Status changed from Feedback to Resolved

Merged, closing.

Actions

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)