Bug #2344

OsmoTRX is not using SCHED_RR

Added by laforge 5 months ago. Updated 26 days ago.

Status: Stalled
Priority: Normal
Assignee:
Target version: -
Start date: 06/29/2017
Due date:
% Done: 80%
Spec Reference:

Description

It's quite odd that OsmoTRX, being real-time critical, is not using the SCHED_RR scheduling policy (see http://man7.org/linux/man-pages/man7/sched.7.html), similar to what OsmoBTS is using.

osmo-trx-sched_rr.diff (595 Bytes) laforge, 06/29/2017 01:23 PM


Related issues

Related to OsmoBTS - Bug #2325: sporadic shutdown of osmo-bts-trx in osmo-gsm-tester runs (no clock from trx) Stalled 06/13/2017

History

#1 Updated by laforge 5 months ago

  • Status changed from New to In Progress

#2 Updated by laforge 5 months ago

  • Related to Bug #2325: sporadic shutdown of osmo-bts-trx in osmo-gsm-tester runs (no clock from trx) added

#3 Updated by laforge 5 months ago

  • % Done changed from 0 to 20

Using a quick experimental patch to enable SCHED_RR, I can get osmo-trx and osmo-bts-trx to run on a system with load > 100, created by the following stress-ng command line:
stress-ng --vm 10 --hdd 10 --cpu 10 --open 10 --sem 10 --sock 10 --float 10 --io 10 --timer 10

With my SCHED_RR patch, osmo-trx and osmo-bts-trx are running perfectly fine, despite the high system load. Prior to the SCHED_RR patch, osmo-trx almost immediately throws errors like

NOTICE 140480399431424 15:19:43.5 Transceiver.cpp:381:pushRadioVector: dumping STALE burst in TRX->USRP interface

and osmo-bts-trx shows high clock deviation like

<0006> scheduler_trx.c:1704 We were 16 FN faster than TRX, compensating

so definitely, using SCHED_RR significantly improves performance under high system load.

#4 Updated by laforge 5 months ago

Attaching a patch for unconditional use of SCHED_RR.

#5 Updated by laforge 5 months ago

  • % Done changed from 40 to 80

A proper patch is now in https://gerrit.osmocom.org/#/c/3080/1

However, I think we should actually make it a default.

#6 Updated by ttsou 5 months ago

OsmoTRX already uses SCHED_RR by default. However, Harald's positive result confirms a counterintuitive finding: instead of setting individual RT priorities on the primary I/O threads (as below), better results are obtained by calling sched_setscheduler() from main(). The main() thread simply loops on sleep() during operation, which is why priorities were not set up this way before.

The likely reason for the improved behavior is that the libusb event handler, which runs below osmo-trx, was not being prioritized before. The scheduling setting in main() prioritizes UHD threads and likely prevents osmo-trx from preempting libusb and UHD, which both run in userspace.

I've seen similar improvements in higher bandwidth LTE test cases, but never reached the same conclusion in osmo-trx.

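/* Thread entry point for the lower receive loop. */
void *RxLowerLoopAdapter(Transceiver *transceiver)
{
  /* Per-thread RT priority, applied through the UHD scheduling wrapper */
  transceiver->setPriority(0.45);

  while (1) {
    transceiver->driveReceiveRadio();
    pthread_testcancel();  /* cancellation point so the thread can be stopped */
  }
  return NULL;
}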
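
For comparison, the process-wide approach discussed above (calling sched_setscheduler() from main()) might look roughly like the following. This is only a minimal sketch, not the patch from gerrit: the helper name enable_sched_rr() and the priority value are hypothetical, and it assumes Linux with default thread attributes, where threads created afterwards inherit the creator's scheduling policy.

```cpp
#include <sched.h>
#include <cerrno>
#include <cstdio>
#include <cstring>

// Hypothetical helper (not taken from the osmo-trx tree): switch the caller
// to SCHED_RR at the given static priority.
// Returns 0 on success, or an errno value on failure.
int enable_sched_rr(int priority)
{
    // Reject priorities outside the range the kernel accepts for SCHED_RR.
    const int lo = sched_get_priority_min(SCHED_RR);
    const int hi = sched_get_priority_max(SCHED_RR);
    if (priority < lo || priority > hi)
        return EINVAL;

    struct sched_param param;
    std::memset(&param, 0, sizeof(param));
    param.sched_priority = priority;

    // A pid of 0 refers to the caller. Without CAP_SYS_NICE or a suitable
    // RLIMIT_RTPRIO this is expected to fail with EPERM.
    if (sched_setscheduler(0, SCHED_RR, &param) != 0) {
        const int err = errno;
        std::fprintf(stderr, "sched_setscheduler: %s\n", std::strerror(err));
        return err;
    }
    return 0;
}
```

To have any effect on library threads, such a call would presumably need to run early in main(), before UHD and libusb spawn their threads, for example as enable_sched_rr(1) with an assumed priority of 1.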

#7 Updated by ttsou 5 months ago

Can you remove or disable the calls to RadioDevice::setPriority() and still maintain the same performance?

The current settings set priorities through the UHD scheduling wrapper, which then enables SCHED_RR and priorities by calling pthread_setschedparam(). I'm not sure these calls have a significant effect when RT scheduling is already configured by the parent thread.

I also prefer not to have separate scheduling implementations that could potentially conflict.
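
One way to investigate the question above is to compare the scheduling policy a newly created thread reports with that of its parent. The following is a rough sketch using plain pthreads calls only. All helper names are hypothetical, and it does not go through the UHD wrapper or RadioDevice::setPriority().

```cpp
#include <pthread.h>
#include <sched.h>

// Hypothetical helpers, not part of osmo-trx or UHD.

// Return the scheduling policy of the calling thread, or -1 on error.
int current_thread_policy()
{
    int policy = -1;
    struct sched_param param;
    if (pthread_getschedparam(pthread_self(), &policy, &param) != 0)
        return -1;
    return policy;
}

// Try to move the calling thread to SCHED_RR at `priority`.
// Returns 0 on success or an error number (e.g. EPERM without privileges).
int set_thread_sched_rr(int priority)
{
    struct sched_param param = {};
    param.sched_priority = priority;
    return pthread_setschedparam(pthread_self(), SCHED_RR, &param);
}

// Thread body: store the policy this thread observes into *out.
static void *report_policy(void *out)
{
    *static_cast<int *>(out) = current_thread_policy();
    return nullptr;
}

// Spawn a thread with default attributes and return the policy it sees,
// or -1 if the thread could not be created.
int child_thread_policy()
{
    pthread_t tid;
    int policy = -1;
    if (pthread_create(&tid, nullptr, report_policy, &policy) != 0)
        return -1;
    pthread_join(tid, nullptr);
    return policy;
}
```

If child_thread_policy() already returns SCHED_RR after the parent has been switched, a later per-thread call such as set_thread_sched_rr() would only change the priority level, not the policy. That would be consistent with the suspicion that the extra calls have little effect.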

#8 Updated by laforge 26 days ago

  • Status changed from In Progress to Stalled
