Bug #3971

osmo-trx-lms: makes kernel eat all system memory when run under realtime priority

Added by pespin about 2 months ago. Updated about 2 months ago.

Status: New
Priority: Normal
Assignee:
Category: LimeSDR
Target version: -
Start date: 05/03/2019
Due date:
% Done: 0%
Spec Reference:

Description

Initially found and described in detail here: https://osmocom.org/issues/3339?#note-15

My system freezes completely for 2-5 seconds before/during the time osmo-trx starts failing to read/write as described in OS#3339. This happens about 30 seconds after starting osmo-trx-lms. My X server blocks, and music from a YouTube video playing in the background either stops or plays in a 1-second loop. When I regain control of my system, I can see the read/write failure from OS#3339 in the osmo-trx logs.

With htop one can easily see that, upon starting osmo-trx-lms, memory usage suddenly grows until it fills my 16GB. That is when my system freezes and osmo-trx starts failing; during that time the kernel is working heavily to free up memory.

Interestingly, if I strace osmo-trx-lms I don't see this issue, although CPU consumption also drops a lot in that case. strace only shows heavy use of accept(), poll() and select().

If I press Ctrl+Z (SIGTSTP) to stop osmo-trx-lms, the kernel stops acquiring memory (and releases most of it). Once I use "fg" to resume the process (SIGCONT), it goes back to acquiring memory at a very high rate. The same happens if I use gdb to stop and resume the process.

Allocation happens in kernel memory, not process-related memory:

kernel dynamic memory         10.2G    1009.3M       9.2G  <-----!!!!!!!

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
26117200 26114592  99%    0.25K 816487       32   6531896K filp            <---!!!!!!!
26120640 26118794  99%    0.06K 408135       64   1632540K kmalloc-64      <---!!!!!!!
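To track the two suspicious caches over time without an interactive tool, the per-cache counters can be read from /proc/slabinfo. The sketch below assumes the slabinfo 2.x line layout (name, active objects, total objects, object size, ...); the struct and function names are hypothetical, and reading /proc/slabinfo usually requires root.

```c
#include <stdio.h>
#include <string.h>

/* Leading fields of one cache entry in /proc/slabinfo (format 2.x). */
struct slab_stat {
    char name[64];
    unsigned long active_objs;
    unsigned long num_objs;
    unsigned long objsize;      /* bytes per object */
};

/* Parse the leading fields of a /proc/slabinfo data line.
 * Returns 0 on success, -1 for header lines or malformed input. */
int parse_slabinfo_line(const char *line, struct slab_stat *out)
{
    if (line[0] == '#' || strncmp(line, "slabinfo", 8) == 0)
        return -1;
    if (sscanf(line, "%63s %lu %lu %lu", out->name, &out->active_objs,
               &out->num_objs, &out->objsize) != 4)
        return -1;
    return 0;
}

/* Approximate memory held by the objects of one cache, in KiB. */
unsigned long slab_object_kib(const struct slab_stat *s)
{
    return (unsigned long)(((unsigned long long)s->num_objs * s->objsize) / 1024);
}

/* Print the two caches named in this ticket; returns how many were found. */
int print_suspect_caches(FILE *fp)
{
    char line[512];
    struct slab_stat s;
    int found = 0;

    while (fgets(line, sizeof(line), fp)) {
        if (parse_slabinfo_line(line, &s) != 0)
            continue;
        if (strcmp(s.name, "filp") == 0 || strcmp(s.name, "kmalloc-64") == 0) {
            printf("%-12s objs=%lu size=%luB ~%lu KiB\n",
                   s.name, s.num_objs, s.objsize, slab_object_kib(&s));
            found++;
        }
    }
    return found;
}
```

Opening /proc/slabinfo with fopen() and passing the stream to print_suspect_caches() once per second should show whether filp and kmalloc-64 grow in lockstep. The figure is an estimate based on object counts, so it can come out slightly lower than the CACHE SIZE column, which appears to count whole slabs.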

  • Reproducible both on LimeSDR-USB and LimeSDR-mini HW.
  • Reproducible both on USB2 and USB3.
  • Reproducible both on LimeSuite 18.10.* and 19.01.*
  • Reproducible both on kernel 4.19.4-arch1-1-ARCH and 5.0.9-arch1-1-ARCH
  • Reproducible on 1.0.22-1
  • Reproducible both with ASan enabled and disabled.

Related issues

Related to OsmoTRX - Bug #3339: osmo-trx-lms "expect ... got ... diff ff0" error message (Resolved, 06/13/2018)

Related to OsmoTRX - Bug #3981: osmo-trx-lms: Overruns appear from time to time (New, 05/07/2019)

History

#1 Updated by pespin about 2 months ago

  • Related to Bug #3339: osmo-trx-lms "expect ... got ... diff ff0" error message added

#2 Updated by pespin about 2 months ago

I found how to reproduce it, or avoid reproducing it, on my system:
Add "rt-prio 18" to osmo-trx-lms.cfg -> BUG
Remove it -> no memleak.

So somehow switching the process to realtime priority makes the kernel not free memory in time. It looks like it's not really a memleak, since if you pause the process the memory is freed a few seconds later. Still, it looks like the kernel is not freeing memory quickly enough to keep up with the allocation pace.

"rt-prio 18" in osmo-trx-lms.cfg basically means osmo-trx-lms is going to call this during startup:

    struct sched_param param;
    memset(&param, 0, sizeof(param));
    param.sched_priority = 18;
    rc = sched_setscheduler(getpid(), SCHED_RR, &param);
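For experimenting with the same scheduler switch outside osmo-trx-lms, a standalone variant with error handling might look like the sketch below. set_rr_priority() and is_rr_scheduled() are hypothetical helper names, not osmo-trx code. The call is expected to fail with EPERM unless the process has CAP_SYS_NICE or a sufficient RLIMIT_RTPRIO.

```c
#include <errno.h>
#include <sched.h>
#include <string.h>

/* Ask the kernel to run the calling process under SCHED_RR at `prio`.
 * Returns 0 on success or a negative errno value, e.g. -EPERM when the
 * process lacks the privilege to use realtime scheduling. */
int set_rr_priority(int prio)
{
    struct sched_param param;

    /* Reject priorities outside the range the kernel accepts for SCHED_RR. */
    if (prio < sched_get_priority_min(SCHED_RR) ||
        prio > sched_get_priority_max(SCHED_RR))
        return -EINVAL;

    memset(&param, 0, sizeof(param));
    param.sched_priority = prio;
    /* A pid of 0 means "the calling process", equivalent to getpid(). */
    if (sched_setscheduler(0, SCHED_RR, &param) < 0)
        return -errno;
    return 0;
}

/* Report whether the calling process currently runs under SCHED_RR. */
int is_rr_scheduled(void)
{
    return sched_getscheduler(0) == SCHED_RR;
}
```

Calling set_rr_priority(18) early in a test program and logging the return value makes it clear whether the realtime policy was actually applied before looking at memory behaviour.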

#3 Updated by pespin about 2 months ago

  • Description updated (diff)

#4 Updated by pespin about 2 months ago

I created a ticket on the LimeSuite GitHub to let other LimeSuite users/developers know about the issue: https://github.com/myriadrf/LimeSuite/issues/263

#5 Updated by pespin about 2 months ago

  • Related to Bug #3981: osmo-trx-lms: Overruns appear from time to time added
