Project

General

Profile

Actions

Bug #6247

closed

osmo-trx-uhd: segfault with libuhd v4.5.0.0

Added by fixeria 4 months ago. Updated 4 months ago.

Status:
Rejected
Priority:
Normal
Assignee:
Category:
UHD
Target version:
-
Start date:
11/06/2023
Due date:
% Done:

0%

Spec Reference:

Description

I just recompiled osmo-trx-uhd (992a49e586279e422f03b3766b3b4cd1aef409b8) due to a libuhd package update:

ноя 06 23:25:51 LEGION osmo-trx-uhd[336498]: /opt/osmocom/bin/osmo-trx-uhd: error while loading shared libraries: libuhd.so.4.4.0: cannot open shared object file: No such file or directory

and now it's crashing on POWERON from osmo-bts-trx:

fixeria@LEGION:~$ osmo-trx-uhd -C /etc/osmocom/osmo-trx-uhd.cfg
DLGLOBAL NOTICE telnet_interface.c:88 Available via telnet 127.0.0.1 4237
DLCTRL NOTICE control_if.c:1014 CTRL at 127.0.0.1 4236
DMAIN INFO osmo-trx.cpp:515 SSE3 support compiled in and supported by CPU
DMAIN INFO osmo-trx.cpp:527 SSE4.1 support compiled in and supported by CPU
DMAIN INFO osmo-trx.cpp:583 Config Settings
   Log Level............... 0
   Device args............. type=b200
   TRX Base Port........... 5700
   TRX Address............. 127.0.101.1
   GSM BTS Address......... 127.0.101.1
   Channels................ 1
   Tx Samples-per-Symbol... 4
   Rx Samples-per-Symbol... 4
   EDGE support............ 0
   Extended RACH support... 0
   Reference............... 0
   Filler Burst Type....... Empty bursts
   Filler Burst TSC........ 0
   Filler Burst RACH Delay. 0
   Multi-Carrier........... 0
   LO freq. offset......... 0
   RSSI to dBm offset...... 45
   Swap channels........... 0
   Tx Antennas............. 'TX/RX'
   Rx Antennas............. 'RX2'

[INFO] [UHD] linux; GNU C++ version 13.2.1 20230801; Boost_108300; UHD_4.5.0.0-0-unknown
DDEV INFO UHDDevice.cpp:521 Using discovered UHD device type=b200,name=MyB210,serial=30AC26A,product=B210
DDEVDRV INFO b200_impl.cpp:421 [B200] Detected Device: B210
DDEVDRV INFO b200_impl.cpp:468 [B200] Operating over USB 3.
DDEVDRV INFO b200_impl.cpp:619 [B200] Initialize CODEC control...
DDEVDRV INFO b200_impl.cpp:688 [B200] Initialize Radio control...
DDEVDRV INFO b200_impl.cpp:1099 [B200] Performing register loopback test... 
DDEVDRV INFO b200_impl.cpp:1108 [B200] Register loopback test passed
DDEVDRV INFO b200_impl.cpp:1099 [B200] Performing register loopback test... 
DDEVDRV INFO b200_impl.cpp:1108 [B200] Register loopback test passed
DDEVDRV INFO b200_impl.cpp:814 [B200] Setting master clock rate selection to 'automatic'.
DDEVDRV INFO b200_impl.cpp:1149 [B200] Asking for clock rate 16.000000 MHz... 
DDEVDRV INFO b200_impl.cpp:1162 [B200] Actually got clock rate 16.000000 MHz.
DMAIN INFO UHDDevice.cpp:225 Antennas configured successfully
DDEV INFO UHDDevice.cpp:575 Waiting for clock reference lock (max 15s)...
DDEV INFO UHDDevice.cpp:584 Selected clock source is internal
DDEVDRV INFO multi_usrp.cpp:382 [MULTI_USRP] Setting master clock rate selection to 'manual'.
DDEVDRV INFO b200_impl.cpp:1149 [B200] Asking for clock rate 26.000000 MHz... 
DDEVDRV INFO b200_impl.cpp:1162 [B200] Actually got clock rate 26.000000 MHz.
DDEV INFO UHDDevice.cpp:288 Rates configured for B210 4 SPS
DDEV INFO UHDDevice.cpp:248 Supported Tx gain range [0; 89.75]
DDEV INFO UHDDevice.cpp:253 Supported Rx gain range [0; 76]
DDEV INFO UHDDevice.cpp:257 Default setting Tx gain for channel 0 to 44.875
DDEV INFO UHDDevice.cpp:264 Default setting Rx gain for channel 0 to 38
DDEV INFO UHDDevice.cpp:633 Device configuration: Single USRP:
  Device: B-Series Device
  Mboard 0: B210
  RX Channel: 0
    RX DSP: 0
    RX Dboard: A
    RX Subdev: FE-RX2
  TX Channel: 0
    TX DSP: 0
    TX Dboard: A
    TX Subdev: FE-TX2
DMAIN NOTICE osmo-trx.cpp:621 -- Transceiver active with 1 channel(s)
DLSTATS NOTICE stats.c:169 Stats timer expire_count=4: We missed 3 timers
DLGLOBAL NOTICE rate_ctr.c:350 Stats timer expire_count=16: We missed 15 timers
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'POWEROFF'
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP POWEROFF 0'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'RFMUTE 1'
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP RFMUTE 0 1'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'RFMUTE 0'
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP RFMUTE 0 0'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'SETFORMAT 2'
DTRXCTRL INFO Transceiver.cpp:1054 [chan=0] BTS requests TRXD version switch: 2
DTRXCTRL INFO Transceiver.cpp:1056 [chan=0] rejecting TRXD version 2 in favor of 1
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP SETFORMAT 1 2'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'SETFORMAT 1'
DTRXCTRL INFO Transceiver.cpp:1054 [chan=0] BTS requests TRXD version switch: 1
DTRXCTRL NOTICE Transceiver.cpp:1060 [chan=0] switching to TRXD version 1
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP SETFORMAT 1 1'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'RXTUNE 904800'
DDEV ERROR UHDDevice.cpp:65 No Tx Power measurements exist for device N2XX 1 SPS on band GSM900, using fallback..
DDEV INFO UHDDevice.cpp:1028 [chan=0] set_freq(9.048e+08, Rx): Tune Result:
    Target RF  Freq: 904.800000 (MHz)
    Actual RF  Freq: 904.799999 (MHz)
    Target DSP Freq: -0.000001 (MHz)
    Actual DSP Freq: -0.000001 (MHz)

DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP RXTUNE 0 904800'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'TXTUNE 949800'
DDEV INFO UHDDevice.cpp:1028 [chan=0] set_freq(9.498e+08, Tx): Tune Result:
    Target RF  Freq: 949.800000 (MHz)
    Actual RF  Freq: 949.800000 (MHz)
    Target DSP Freq: 0.000000 (MHz)
    Actual DSP Freq: 0.000000 (MHz)

DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP TXTUNE 0 949800'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'SETTSC 7'
DTRXCTRL NOTICE Transceiver.cpp:1033 Changing TSC from 0 to 7
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP SETTSC 0 7'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'NOHANDOVER 0 4'
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP NOHANDOVER 0 0 4'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'NOMTXPOWER'
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP NOMTXPOWER 0 13'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'POWERON'
DMAIN NOTICE Transceiver.cpp:287 Starting the transceiver
DMAIN INFO radioInterface.cpp:187 Starting radio device
DDEV INFO UHDDevice.cpp:730 Starting USRP...
DMAIN INFO Threads.cpp:52 Thread 139641902577344 (task 486804) set name: UHDAsyncEvent
DDEV INFO UHDDevice.cpp:705 Initial timestamp 49667006
DDEV INFO UHDDevice.cpp:747 The current time is 45.8486 seconds
DMAIN INFO radioInterface.cpp:208 Radio started
DMAIN INFO Threads.cpp:52 Thread 139641936148160 (task 486805) set name: TxLower
DMAIN INFO Threads.cpp:52 Thread 139641927755456 (task 486806) set name: RxLower
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP POWERON 0'
DMAIN INFO Threads.cpp:52 Thread 139641919362752 (task 486807) set name: RxUpper0
DMAIN INFO Threads.cpp:52 Thread 139641910970048 (task 486808) set name: TxUpper0
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'SETSLOT 0 4'
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP SETSLOT 0 0 4'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'SETSLOT 1 7'
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP SETSLOT 0 1 7'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'SETSLOT 2 1'
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP SETSLOT 0 2 1'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'SETSLOT 3 1'
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP SETSLOT 0 3 1'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'SETSLOT 4 1'
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP SETSLOT 0 4 1'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'SETSLOT 5 1'
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP SETSLOT 0 5 1'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'SETSLOT 6 3'
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP SETSLOT 0 6 3'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'SETSLOT 7 13'
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP SETSLOT 0 7 13'
LLDTRXCLK NOTICE Transceiver.cpp:1199 Sending CLOCK indications
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'SETPOWER 5'
DDEV INFO UHDDevice.cpp:369 Set TX gain to 84.75dB, ~8.3 dBm (asked for 84.75 dB, ~8.3 dBm)
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP SETPOWER 0 5'
DTRXCTRL INFO Transceiver.cpp:928 [chan=0] command is 'SETPOWER 4'
DDEV INFO UHDDevice.cpp:369 Set TX gain to 85.75dB, ~9.3 dBm (asked for 85.75 dB, ~9.3 dBm)
DTRXCTRL INFO Transceiver.cpp:1082 [chan=0] response is 'RSP SETPOWER 0 4'
DDEV ERROR UHDDevice.cpp:780 No packet received, implementation timed-out
DDEV FATAL UHDDevice.cpp:784 UHD: Receive timed out
DMAIN FATAL radioInterface.cpp:340 Receive error 0
DTRXDUL FATAL Transceiver.cpp:1360 Something went wrong in thread RxLower, requesting stop
DMAIN NOTICE osmo-trx.cpp:588 Shutting down transceiver...
DMAIN NOTICE Transceiver.cpp:345 Stopping the transceiver
DMAIN INFO Transceiver.cpp:358 Stopping the device
terminate called after throwing an instance of 'uhd::assertion_error'
  what():  AssertionError: accum_timeout < _timeout
  in wait_for_ack
  at /usr/src/debug/libuhd/uhd-4.5.0.0/host/lib/usrp/b200/b200_radio_ctrl_core.cpp:227

Aborted (core dumped)

Here is a backtrace:

(gdb) bt
#0  0x00007f00f1eac83c in ?? () from /usr/lib/libc.so.6
#1  0x00007f00f1e5c668 in raise () from /usr/lib/libc.so.6
#2  0x00007f00f1e44542 in abort () from /usr/lib/libc.so.6
#3  0x00007f00f209ca6f in __gnu_cxx::__verbose_terminate_handler () at /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/vterminate.cc:95
#4  0x00007f00f20b011c in __cxxabiv1::__terminate (handler=<optimized out>) at /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:48
#5  0x00007f00f20af0aa in __cxa_call_terminate (ue_header=0x557ffd404190) at /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/eh_call.cc:54
#6  0x00007f00f20af82a in __cxxabiv1::__gxx_personality_v0 (version=<optimized out>, actions=6, exception_class=5138137972254386944, ue_header=<optimized out>, context=0x7fffa5796680)
    at /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/eh_personality.cc:688
#7  0x00007f00f38a452a in _Unwind_RaiseException_Phase2 (exc=exc@entry=0x557ffd404190, context=context@entry=0x7fffa5796680, frames_p=frames_p@entry=0x7fffa5796588) at /usr/src/debug/gcc/gcc/libgcc/unwind.inc:64
#8  0x00007f00f38a502d in _Unwind_Resume (exc=exc@entry=0x557ffd404190) at /usr/src/debug/gcc/gcc/libgcc/unwind.inc:242
#9  0x0000557ffcf1b115 in __gthread_mutex_unlock (__mutex=<optimized out>) at /usr/include/c++/13.2.1/x86_64-pc-linux-gnu/bits/gthr-default.h:779
#10 __gthread_recursive_mutex_unlock (__mutex=<optimized out>) at /usr/include/c++/13.2.1/x86_64-pc-linux-gnu/bits/gthr-default.h:832
#11 std::recursive_mutex::unlock (this=<optimized out>) at /usr/include/c++/13.2.1/mutex:139
#12 Mutex::unlock (this=<optimized out>) at ../../../src/osmo-trx/CommonLibs/Threads.h:62
#13 ScopedLock::~ScopedLock (this=<optimized out>, __in_chrg=<optimized out>) at ../../../src/osmo-trx/CommonLibs/Threads.h:76
#14 Transceiver::stop (this=0x557ffd3e0c50) at ../../../src/osmo-trx/Transceiver52M/Transceiver.cpp:372
#15 0x0000557ffcf3be59 in Transceiver::~Transceiver (this=0x557ffd3e0c50, __in_chrg=<optimized out>) at ../../../src/osmo-trx/Transceiver52M/Transceiver.cpp:158
#16 0x0000557ffcf22bf4 in trx_stop () at ../../../src/osmo-trx/Transceiver52M/osmo-trx.cpp:590
#17 0x0000557ffcf1fb96 in main (argc=<optimized out>, argv=<optimized out>) at ../../../src/osmo-trx/Transceiver52M/osmo-trx.cpp:713

Looks like it's crashing during Transceiver::stop(), so the backtrace is not very useful.

Actions #1

Updated by Hoernchen 4 months ago

This is a crash within UHD; the actual problem is that the b2xx does not send any data to the host, and uhd itself implodes.

Actions #2

Updated by Hoernchen 4 months ago

  • Status changed from New to Rejected

invalid: user error due to outdated b2xx fw/fpga image.

Actions #3

Updated by fixeria 4 months ago

  • Assignee set to fixeria

For the record (may be useful for those running into this too), the problem was resolved by downloading FW/FPGA images from:

https://github.com/EttusResearch/uhd/releases/download/v4.5.0.0/uhd-images_4.5.0.0.tar.xz

and extracting them to /usr/share/uhd/images manually.

uhd_images_downloader reports no updates, so it cannot be used.
https://files.ettus.com/binaries/cache/b2xx/ contains old images from January 2023 (uhd-92c09f7).

For Arch there exists https://aur.archlinux.org/packages/libuhd-firmware, but it's not up to date yet. I flagged it as such and asked the author to bump the version.

Actions #4

Updated by fixeria 4 months ago

Hmm, it worked reliably yesterday, but today it segfaults again after a few minutes of running.

ноя 07 15:43:35 LEGION osmo-trx-uhd[517460]: DDEV ERROR UHDDevice.cpp:782 No packet received, implementation timed-out
ноя 07 15:43:35 LEGION osmo-trx-uhd[517460]: DDEV FATAL UHDDevice.cpp:786 UHD: Receive timed out 
ноя 07 15:43:35 LEGION osmo-trx-uhd[517460]: DMAIN FATAL radioInterface.cpp:340 Receive error 0
ноя 07 15:43:35 LEGION osmo-trx-uhd[517460]: DTRXDUL FATAL Transceiver.cpp:1360 Something went wrong in thread RxLower, requesting stop
ноя 07 15:43:35 LEGION osmo-trx-uhd[517460]: DMAIN NOTICE osmo-trx.cpp:588 Shutting down transceiver...
ноя 07 15:43:35 LEGION osmo-trx-uhd[517460]: DMAIN NOTICE Transceiver.cpp:345 Stopping the transceiver
ноя 07 15:43:35 LEGION osmo-trx-uhd[517460]: DMAIN INFO Transceiver.cpp:358 Stopping the device
ноя 07 15:43:37 LEGION osmo-trx-uhd[517460]: terminate called after throwing an instance of 'uhd::assertion_error'
ноя 07 15:43:37 LEGION osmo-trx-uhd[517460]:   what():  AssertionError: accum_timeout < _timeout
ноя 07 15:43:37 LEGION osmo-trx-uhd[517460]:   in wait_for_ack
ноя 07 15:43:37 LEGION osmo-trx-uhd[517460]:   at /usr/src/debug/libuhd/uhd-4.5.0.0/host/lib/usrp/b200/b200_radio_ctrl_core.cpp:227
ноя 07 15:43:37 LEGION systemd-coredump[518036]: Process 517460 (osmo-trx-uhd) of user 1000 dumped core.
Actions #5

Updated by fixeria 4 months ago

The actual reason was [insufficient] power. I realized this after doing libuhd downgrade and seeing the same behavior.

Since recently I started using a 100W USB-C PD charger, which is a lot smaller and lighter than the original 230W (huge brick) power supply from LENOVO.
Apparently the laptop fails to deliver enough power to USRP while charging via USB-C PD port. It worked yesterday because the battery was at 100%.
Lesson learned. Either use the original power supply, or wait for the battery to charge. Or use a dedicated power supply for USRP.

fixeria wrote in #note-3:

For Arch there exists https://aur.archlinux.org/packages/libuhd-firmware, but it's not up to date yet. I flagged it as such and asked the author to bump the version.

BTW, I adopted this package and updated pkgver to v4.5.0.0, so now it matches the libuhd package version.

Actions

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)