Bug #4136
Updated by pespin over 4 years ago
Originally https://osmocom.org/issues/3339, I deleted it by error.
Initially found and described in detail here: https://osmocom.org/issues/3339?#note-15
Related LimeSuite github issue: https://github.com/myriadrf/LimeSuite/issues/263
My system totally freezes for 2-5 seconds before/during the time osmo-trx starts failing reading/writing on OS#3339. That happens about 30 second after starting osmo-trx-lms. My XServer blocks and music playing from a youtube video on the background also either stops or plays in a 1 sec loop. When I recover control of my system, I can see in the logs of osmo-trx the read/write failure from OS#3339.
Through htop one can easily see that upon starting osmo-trx-lms, memory suddenly grows until filling my 16GB, and then is when my system freezes and osmo-trx starts failing, during that time kernel is working heavily to free up memory.
Interestingly, if I strace the osmo-trx-lms I don't see this kind of issue, but it's true too that the CPU consumption drops a lot too. strace only shows heavy use of calls: accept(), poll() and select().
If I ctrl+z (SIGSTOP) the osmo-trx-lms, the kernel stops acquiring memory (and releases most of it). Once I use "fg" to SIGCONTINUE the process, it continues acquiring memory like crazy. Same if I use gdb to do the same kind of operation.
Allocation happens in kernel memory, not process-related memory:
<pre>
kernel dynamic memory 10.2G 1009.3M 9.2G <-----!!!!!!!
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
26117200 26114592 99% 0.25K 816487 32 6531896K filp <---!!!!!!!
26120640 26118794 99% 0.06K 408135 64 1632540K kmalloc-64 <---!!!!!!!
</pre>
* Reproducible both on LimeSDR-USB and LimeSDR-mini HW.
* reproducible both on USB2 and USB3.
* Reproducible both on LimeSuite 18.10.* and 19.01.*
* Reproducible both on kernel 4.19.4-arch1-1-ARCH and 5.0.9-arch1-1-ARCH
* Reproducible on 1.0.22-1
* Reproducible both with ASan enabled or disabled.
---------------
I found how to reproduce it or avoid reproducing it on my system:
Add "rt-prio 18" on osmo-trx-lms.cfg -> BUG
remove it -> no memleak.
So somehow changing the process to use realtime priority makes the kernel not free stuff on time. looks like it's not really a memleak, since if you pause the process the memory is freed at some point a few secs later. But still looks like the kernel is not freeing memory quick enough to keep up with the allocation pace.
"rt-prio 18" in osmo-trx-lms.cfg basically means osmo-trx-lms is going to call this during startup:
<pre>
struct sched_param param;
memset(¶m, 0, sizeof(param));
param.sched_priority = 18;
rc = sched_setscheduler(getpid(), SCHED_RR, ¶m);
</pre>
----------------------------
Issue still happening on my Thinkpad x280 laptop with:
LimeSite: 19.04.0.r0.abf82140
kernel: 5.2.2-arch1-1-ARCH x86_64
libusb: 1.0.22-1
A comment was added in related github issue from LimeSuite developer:
<pre>
If it libusb ralated, then it might be "libusb_handle_events_timeout_completed()" that is being run in a loop (the loop is started when opening device). The timeout is set to 250ms in LimeSuite so it blocks for up to 250ms if there is no USB events. I don't know what it does internally while waiting for events.
</pre>
----------------------------
Attached a perf record of osmo-trx-lms running with rt-prio and seeing memory growing (around 2GB till stopped iirc).
<pre>
Samples: 106K of event 'cycles', Event count (approx.): 88212958338
Overhead Command Shared Object Symbol
10.52% osmo-trx-lms [kernel.kallsyms] [k] do_syscall_64 ◆
5.61% osmo-trx-lms [kernel.kallsyms] [k] syscall_return_via_sysret ▒
5.48% osmo-trx-lms [kernel.kallsyms] [k] entry_SYSCALL_64 ▒
5.20% osmo-trx-lms [kernel.kallsyms] [k] preempt_count_add ▒
4.72% osmo-trx-lms [kernel.kallsyms] [k] _raw_spin_lock ▒
4.04% osmo-trx-lms [kernel.kallsyms] [k] kmem_cache_alloc ▒
2.67% osmo-trx-lms [kernel.kallsyms] [k] __alloc_file ▒
2.24% osmo-trx-lms [kernel.kallsyms] [k] preempt_count_sub ▒
1.89% osmo-trx-lms [kernel.kallsyms] [k] get_mem_cgroup_from_mm ▒
1.86% osmo-trx-lms [kernel.kallsyms] [k] kmem_cache_free ▒
1.74% osmo-trx-lms [kernel.kallsyms] [k] __call_rcu ▒
1.60% osmo-trx-lms [kernel.kallsyms] [k] _raw_spin_lock_irq ▒
1.58% osmo-trx-lms [kernel.kallsyms] [k] debug_smp_processor_id ▒
1.50% osmo-trx-lms libpthread-2.29.so [.] __libc_accept ▒
1.45% osmo-trx-lms [kernel.kallsyms] [k] truncate_inode_pages_range ▒
1.41% osmo-trx-lms [kernel.kallsyms] [k] in_lock_functions ▒
1.39% osmo-trx-lms [kernel.kallsyms] [k] memcg_kmem_get_cache ▒
1.34% osmo-trx-lms [kernel.kallsyms] [k] __sys_accept4 ▒
1.31% osmo-trx-lms [kernel.kallsyms] [k] __fput ▒
1.18% osmo-trx-lms [kernel.kallsyms] [k] task_work_run ▒
1.06% osmo-trx-lms [kernel.kallsyms] [k] task_work_add ▒
1.05% osmo-trx-lms [kernel.kallsyms] [k] __fget ▒
0.98% osmo-trx-lms [kernel.kallsyms] [k] _raw_spin_lock_bh ▒
0.97% osmo-trx-lms [kernel.kallsyms] [k] rcu_segcblist_enqueue ▒
0.90% osmo-trx-lms [kernel.kallsyms] [k] inet_csk_accept ▒
0.88% osmo-trx-lms [kernel.kallsyms] [k] fput_many ▒
0.87% osmo-trx-lms [kernel.kallsyms] [k] new_slab ▒
0.81% osmo-trx-lms [kernel.kallsyms] [k] do_softirq.part.0 ▒
0.78% osmo-trx-lms [kernel.kallsyms] [k] __fsnotify_parent ▒
0.76% osmo-trx-lms [kernel.kallsyms] [k] exit_to_usermode_loop ▒
0.75% osmo-trx-lms [kernel.kallsyms] [k] __srcu_read_lock ▒
0.72% osmo-trx-lms [kernel.kallsyms] [k] syscall_trace_enter ▒
0.70% osmo-trx-lms [kernel.kallsyms] [k] __rcu_read_unlock ▒
0.70% osmo-trx-lms [kernel.kallsyms] [k] inet_release ▒
0.62% osmo-trx-lms [kernel.kallsyms] [k] _raw_spin_unlock ▒
0.60% osmo-trx-lms [kernel.kallsyms] [k] __fget_light ▒
0.59% osmo-trx-lms [kernel.kallsyms] [k] __audit_syscall_exit ▒
0.58% osmo-trx-lms [kernel.kallsyms] [k] memset_erms ▒
0.56% osmo-trx-lms [kernel.kallsyms] [k] __local_bh_enable_ip ▒
0.56% osmo-trx-lms libpthread-2.29.so [.] __pthread_enable_asynccancel ▒
0.56% osmo-trx-lms [kernel.kallsyms] [k] __srcu_read_unlock ▒
0.55% osmo-trx-lms [kernel.kallsyms] [k] inet_accept ▒
0.55% osmo-trx-lms [kernel.kallsyms] [k] _raw_spin_trylock ▒
0.55% osmo-trx-lms [kernel.kallsyms] [k] __rcu_read_lock ▒
0.55% osmo-trx-lms [kernel.kallsyms] [k] __alloc_fd ▒
0.54% osmo-trx-lms libpthread-2.29.so [.] __pthread_disable_asynccancel ▒
0.53% osmo-trx-lms [kernel.kallsyms] [k] sock_alloc_inode ▒
0.53% osmo-trx-lms [kernel.kallsyms] [k] kmem_cache_alloc_trace ▒
0.52% osmo-trx-lms [kernel.kallsyms] [k] sock_close ▒
0.51% osmo-trx-lms [kernel.kallsyms] [k] _atomic_dec_and_lock ▒
0.50% osmo-trx-lms [kernel.kallsyms] [k] fsnotify ▒
0.49% osmo-trx-lms [kernel.kallsyms] [k] __this_cpu_preempt_check ▒
0.48% osmo-trx-lms [kernel.kallsyms] [k] down_write ▒
0.48% osmo-trx-lms [kernel.kallsyms] [k] entry_SYSCALL_64_after_hwframe ▒
0.48% osmo-trx-lms [kernel.kallsyms] [k] __d_instantiate ▒
0.48% osmo-trx-lms [kernel.kallsyms] [k] up_write ▒
0.47% osmo-trx-lms [kernel.kallsyms] [k] percpu_counter_add_batch ▒
0.47% osmo-trx-lms [kernel.kallsyms] [k] memcg_kmem_put_cache ▒
0.46% osmo-trx-lms [kernel.kallsyms] [k] __x64_sys_accept ▒
0.45% osmo-trx-lms [kernel.kallsyms] [k] inode_init_always ▒
0.44% osmo-trx-lms [kernel.kallsyms] [k] __x86_indirect_thunk_rax ▒
0.44% osmo-trx-lms [kernel.kallsyms] [k] _raw_spin_unlock_irq ▒
0.44% osmo-trx-lms [kernel.kallsyms] [k] lockref_put_return ▒
0.43% osmo-trx-lms [kernel.kallsyms] [k] truncate_inode_pages_final ▒
0.41% osmo-trx-lms [kernel.kallsyms] [k] get_random_u32 ▒
0.40% osmo-trx-lms [kernel.kallsyms] [k] tcp_release_cb ▒
0.39% osmo-trx-lms [kernel.kallsyms] [k] __audit_syscall_entry ▒
0.39% osmo-trx-lms [kernel.kallsyms] [k] map_id_range_down ▒
0.37% osmo-trx-lms [kernel.kallsyms] [k] __sock_release ▒
0.37% osmo-trx-lms [kernel.kallsyms] [k] memcpy_erms ▒
0.34% osmo-trx-lms [kernel.kallsyms] [k] evict ▒
0.34% osmo-trx-lms [kernel.kallsyms] [k] __d_alloc ▒
0.31% osmo-trx-lms [kernel.kallsyms] [k] alloc_file ▒
0.31% osmo-trx-lms [kernel.kallsyms] [k] __dentry_kill ▒
0.29% osmo-trx-lms [kernel.kallsyms] [k] alloc_file_pseudo ▒
0.27% osmo-trx-lms [kernel.kallsyms] [k] ksoftirqd_running ▒
0.25% osmo-trx-lms [kernel.kallsyms] [k] sock_alloc ▒
0.24% osmo-trx-lms [kernel.kallsyms] [k] locks_remove_file ▒
0.23% osmo-trx-lms [kernel.kallsyms] [k] dput ▒
0.23% osmo-trx-lms [kernel.kallsyms] [k] get_next_ino ▒
0.23% osmo-trx-lms [kernel.kallsyms] [k] put_unused_fd ▒
0.22% osmo-trx-lms [kernel.kallsyms] [k] release_sock ▒
0.22% osmo-trx-lms [kernel.kallsyms] [k] __kthread_should_park ▒
0.22% osmo-trx-lms [kernel.kallsyms] [k] iput ▒
0.21% osmo-trx-lms [kernel.kallsyms] [k] dentry_unlink_inode ▒
0.21% osmo-trx-lms [kernel.kallsyms] [k] d_set_d_op ▒
0.20% osmo-trx-lms [kernel.kallsyms] [k] ____fput ▒
0.20% osmo-trx-lms [kernel.kallsyms] [k] find_next_zero_bit ▒
0.20% osmo-trx-lms [kernel.kallsyms] [k] get_page_from_freelist ▒
</pre>