Bug #3562
closed
osmo-trx-uhd doesn't exit after UHD device disappears
Description
When operating osmo-trx-uhd and then unplugging the USB connector of the USRP B2xx, osmo-trx-uhd prints the following error message:
[ERROR] [UHD] An unexpected exception was caught in a task loop.The task loop will now exit, things may not work.EnvironmentError: IOError: usb rx8 transfer status: LIBUSB_TRANSFER_NO_DEVICE
but continues to run.
I think the expectation is that the process should exit and be re-spawned until the re-plugged device is found again. Consider, for example, a temporary glitch leading to USB device re-enumeration.
Updated by laforge over 5 years ago
- Subject changed from osmo-trx-uhd doesn't exist after UHD device disappears to osmo-trx-uhd doesn't exit after UHD device disappears
Also, let's double-check that osmo-trx-uhd doesn't recover internally after the device is re-plugged.
Updated by pespin over 5 years ago
- Assignee set to pespin
Assigning to myself, since I recently did related work on the same issue with osmo-trx-lms.
self-reminder: I can try to test this with my LimeSDR using osmo-trx-uhd.
Updated by pespin over 5 years ago
I unplugged the USB cable from my PC while running a network with osmo-trx-uhd and a B200. This is what I get; the process stops:
Thu Nov 1 18:34:50 2018 DMAIN <0000> Transceiver.cpp:1038 [tid=140280087037696] ClockInterface: sending IND CLOCK 2701351
terminate called after throwing an instance of 'uhd::io_error'
what(): EnvironmentError: IOError: usb rx6 transfer status: LIBUSB_TRANSFER_ERROR
signal 6 received
talloc report on 'OsmoTRX' (total 3336 bytes in 15 blocks)
telnet_connection contains 1 bytes in 1 blocks (ref 0) 0x60b0000afd30
logging contains 2955 bytes in 9 blocks (ref 0) 0x60b000013810
struct trx_ctx contains 380 bytes in 3 blocks (ref 0) 0x6140000006a0
msgb contains 0 bytes in 1 blocks (ref 0) 0x608000004f80
full talloc report on 'OsmoTRX' (total 3336 bytes in 15 blocks)
telnet_connection contains 1 bytes in 1 blocks (ref 0) 0x60b0000afd30
logging contains 2955 bytes in 9 blocks (ref 0) 0x60b000013810
Configure logging Set the log level for a specified category Main generic category Device/Driver specific code Logging from within LimeSuite itself Library-internal global log family LAPD in libosmogsm A-bis Intput Subsystem A-bis B-Subchannel TRAU Frame Multiplex A-bis Input Driver for Signalling A-bis Input Driver for B-Channels (voice) Layer3 Short Message Service (SMS) Control Interface GPRS GTP library Statistics messages and logging Generic Subscriber Update Protocol Osmocom Authentication Protocol libosmo-sigtran Signalling System 7 libosmo-sigtran SCCP Implementation libosmo-sigtran SCCP User Adaptation libosmo-sigtran MTP3 User Adaptation libosmo-mgcp Media Gateway Control Protocol libosmo-netif Jitter Buffer Deprecated alias for 'no logging level force-all' contains 779 bytes in 1 blocks (ref 0) 0x6180000014e0
logging level (main|dev|lms|lglobal|llapd|linp|lmux|lmi|lmib|lsms|lctrl|lgtp|lstats|lgsup|loap|lss7|lsccp|lsua|lm3ua|lmgcp|ljibuf) everything contains 142 bytes in 1 blocks (ref 0) 0x611000002760
Configure logging Set the log level for a specified category Main generic category Device/Driver specific code Logging from within LimeSuite itself Library-internal global log family LAPD in libosmogsm A-bis Intput Subsystem A-bis B-Subchannel TRAU Frame Multiplex A-bis Input Driver for Signalling A-bis Input Driver for B-Channels (voice) Layer3 Short Message Service (SMS) Control Interface GPRS GTP library Statistics messages and logging Generic Subscriber Update Protocol Osmocom Authentication Protocol libosmo-sigtran Signalling System 7 libosmo-sigtran SCCP Implementation libosmo-sigtran SCCP User Adaptation libosmo-sigtran MTP3 User Adaptation libosmo-mgcp Media Gateway Control Protocol libosmo-netif Jitter Buffer Log debug messages and higher levels Log informational messages and higher levels Log noticeable messages and higher levels Log error messages and higher levels Log only fatal messages contains 914 bytes in 1 blocks (ref 0) 0x6190000230e0
logging level (main|dev|lms|lglobal|llapd|linp|lmux|lmi|lmib|lsms|lctrl|lgtp|lstats|lgsup|loap|lss7|lsccp|lsua|lm3ua|lmgcp|ljibuf) (debug|info|notice|error|fatal) contains 163 bytes in 1 blocks (ref 0) 0x612000000ca0
struct log_target contains 212 bytes in 2 blocks (ref 0) 0x612000000820
struct log_category contains 44 bytes in 1 blocks (ref 0) 0x60d000000720
struct log_info contains 744 bytes in 2 blocks (ref 0) 0x60d000000650
struct log_info_cat contains 704 bytes in 1 blocks (ref 0) 0x6180000000e0
struct trx_ctx contains 380 bytes in 3 blocks (ref 0) 0x6140000006a0
192.168.30.1 contains 13 bytes in 1 blocks (ref 0) 0x60b0000ad600
192.168.30.100 contains 15 bytes in 1 blocks (ref 0) 0x60b0000ad080
msgb contains 0 bytes in 1 blocks (ref 0) 0x608000004f80
./run_out.sh: line 12: 13790 Aborted (core dumped) $@
So it seems SIGABRT (signal 6) is raised, and after the talloc report is printed, some seemingly random strings are printed as well.
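For context on the abort: "terminate called after throwing an instance of ..." is the message the GNU C++ runtime prints when an exception is never caught; std::terminate() then calls abort(), which raises SIGABRT. A minimal sketch of one way to turn such a failure into an orderly exit follows, assuming the exception originates in a worker thread. It does not use UHD and is not the osmo-trx code; read_from_device() and run_worker_once() are hypothetical names invented for illustration.

```cpp
#include <exception>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical stand-in for a device read that fails once the USB device is
// gone; real code would call into the device driver here.
void read_from_device()
{
    throw std::runtime_error("usb rx6 transfer status: LIBUSB_TRANSFER_ERROR");
}

// Runs the worker in its own thread and returns the error text, or an empty
// string if the worker finished without throwing.
std::string run_worker_once()
{
    std::exception_ptr failure;

    std::thread worker([&failure] {
        try {
            read_from_device();
        } catch (...) {
            // Without this catch the exception would escape the thread
            // function and the runtime would call std::terminate().
            failure = std::current_exception();
        }
    });
    worker.join(); // join() synchronizes, so reading 'failure' is safe here

    if (!failure)
        return "";
    try {
        std::rethrow_exception(failure);
    } catch (const std::exception &e) {
        return e.what();
    } catch (...) {
        return "unknown error";
    }
}
```

A caller could log the returned text and leave main() with a non-zero exit code, so that a supervisor can re-spawn the process instead of collecting a core dump.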
Updated by pespin over 5 years ago
Another similar, but not identical, result (a different exception is raised) when quickly unplugging the USB cable on the B200 side. Again, the process stops (aborts):
terminate called after throwing an instance of 'uhd::io_error'
what(): EnvironmentError: IOError: usb rx6 transfer status: LIBUSB_TRANSFER_NO_DEVICE
[ERROR] [UHDsignal 6 received
] An unexpected exception was caught in a task loop.The task loop will now exit, things may not work.EnvironmentError: IOError: usb rx8 transfer status: LIBUSB_TRANSFER_NO_DEVICE
talloc report on 'OsmoTRX' (total 3336 bytes in 15 blocks)
telnet_connection contains 1 bytes in 1 blocks (ref 0) 0x60b0000afd30
logging contains 2955 bytes in 9 blocks (ref 0) 0x60b000013810
struct trx_ctx contains 380 bytes in 3 blocks (ref 0) 0x6140000006a0
msgb contains 0 bytes in 1 blocks (ref 0) 0x608000004f80
full talloc report on 'OsmoTRX' (total 3336 bytes in 15 blocks)
telnet_connection contains 1 bytes in 1 blocks (ref 0) 0x60b0000afd30
logging contains 2955 bytes in 9 blocks (ref 0) 0x60b000013810
Configure logging Set the log level for a specified category Main generic category Device/Driver specific code Logging from within LimeSuite itself Library-internal global log family LAPD in libosmogsm A-bis Intput Subsystem A-bis B-Subchannel TRAU Frame Multiplex A-bis Input Driver for Signalling A-bis Input Driver for B-Channels (voice) Layer3 Short Message Service (SMS) Control Interface GPRS GTP library Statistics messages and logging Generic Subscriber Update Protocol Osmocom Authentication Protocol libosmo-sigtran Signalling System 7 libosmo-sigtran SCCP Implementation libosmo-sigtran SCCP User Adaptation libosmo-sigtran MTP3 User Adaptation libosmo-mgcp Media Gateway Control Protocol libosmo-netif Jitter Buffer Deprecated alias for 'no logging level force-all' contains 779 bytes in 1 blocks (ref 0) 0x6180000014e0
logging level (main|dev|lms|lglobal|llapd|linp|lmux|lmi|lmib|lsms|lctrl|lgtp|lstats|lgsup|loap|lss7|lsccp|lsua|lm3ua|lmgcp|ljibuf) everything contains 142 bytes in 1 blocks (ref 0) 0x611000002760
Configure logging Set the log level for a specified category Main generic category Device/Driver specific code Logging from within LimeSuite itself Library-internal global log family LAPD in libosmogsm A-bis Intput Subsystem A-bis B-Subchannel TRAU Frame Multiplex A-bis Input Driver for Signalling A-bis Input Driver for B-Channels (voice) Layer3 Short Message Service (SMS) Control Interface GPRS GTP library Statistics messages and logging Generic Subscriber Update Protocol Osmocom Authentication Protocol libosmo-sigtran Signalling System 7 libosmo-sigtran SCCP Implementation libosmo-sigtran SCCP User Adaptation libosmo-sigtran MTP3 User Adaptation libosmo-mgcp Media Gateway Control Protocol libosmo-netif Jitter Buffer Log debug messages and higher levels Log informational messages and higher levels Log noticeable messages and higher levels Log error messages and higher levels Log only fatal messages contains 914 bytes in 1 blocks (ref 0) 0x6190000230e0
logging level (main|dev|lms|lglobal|llapd|linp|lmux|lmi|lmib|lsms|lctrl|lgtp|lstats|lgsup|loap|lss7|lsccp|lsua|lm3ua|lmgcp|ljibuf) (debug|info|notice|error|fatal) contains 163 bytes in 1 blocks (ref 0) 0x612000000ca0
struct log_target contains 212 bytes in 2 blocks (ref 0) 0x612000000820
struct log_category contains 44 bytes in 1 blocks (ref 0) 0x60d000000720
struct log_info contains 744 bytes in 2 blocks (ref 0) 0x60d000000650
struct log_info_cat contains 704 bytes in 1 blocks (ref 0) 0x6180000000e0
struct trx_ctx contains 380 bytes in 3 blocks (ref 0) 0x6140000006a0
192.168.30.1 contains 13 bytes in 1 blocks (ref 0) 0x60b0000ad600
192.168.30.100 contains 15 bytes in 1 blocks (ref 0) 0x60b0000ad080
msgb contains 0 bytes in 1 blocks (ref 0) 0x608000004f80
./run_out.sh: line 12: 14376 Aborted (core dumped) $@
Updated by pespin almost 5 years ago
- Status changed from New to Resolved
These should all be fixed after recent patches to fix exit sequence race conditions in osmo-trx:
commit 21032b75c00710ab30a0a74a4006608a58295d99
Author: Pau Espin Pedrol <pespin@sysmocom.de>
Date: Fri Mar 29 19:20:06 2019 +0100

osmo-trx: Use signalfd to serialize signals in main thread ctx

This should avoid problematic scenarios where different signal handlers are running on different threads in parallel. Furthermore, we make sure those signals are always run by main loop thread.

Change-Id: I9b9d9793be9af11dbe433e0ce09b7ac57a3bdfb5

commit d01c7b98b63dfda1e72e289eccd7a384c658069f
Author: Pau Espin Pedrol <pespin@sysmocom.de>
Date: Fri Mar 29 18:36:30 2019 +0100

osmo-trx: Avoid handling signals after shutdown triggered

Recently a blocked osmo-trx process was found after sending SIGTERM to it. Apparently one thread was handling SIGTERM and calling fprintf() (grabbing libc lock) while another thread was handling another signal and also grabbing similar lock. Both threads looked deadlocked there. Probably this change doesn't fix the block on its own, but at least simplifies scenarios inside signal ctx which can go wrong.

Change-Id: If91621913b8b03d8a0f4c863be0b0d479f97e8a1
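To illustrate the signalfd approach named in the first commit, here is a minimal Linux-only sketch. It is not the code from the patch: setup_signal_fd() and read_one_signal() are hypothetical helper names, and the set of signals is an assumption. The idea is that the signals are blocked, so no asynchronous handler ever runs, and they are instead read as ordinary data from a file descriptor by a single thread.

```cpp
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

// Blocks SIGINT, SIGTERM and SIGUSR1 in the calling thread and returns a
// file descriptor from which they can be read, or -1 on error. Calling this
// before any other thread is created lets those threads inherit the mask.
int setup_signal_fd()
{
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGINT);
    sigaddset(&mask, SIGTERM);
    sigaddset(&mask, SIGUSR1);

    // Blocked signals stay pending instead of interrupting a thread.
    if (pthread_sigmask(SIG_BLOCK, &mask, nullptr) != 0)
        return -1;

    return signalfd(-1, &mask, SFD_CLOEXEC);
}

// Reads one pending signal from the descriptor (blocking until one arrives)
// and returns its number, or -1 on a failed or short read.
int read_one_signal(int fd)
{
    struct signalfd_siginfo si;
    ssize_t n = read(fd, &si, sizeof(si));
    if (n != static_cast<ssize_t>(sizeof(si)))
        return -1;
    return static_cast<int>(si.ssi_signo);
}
```

In a main loop the descriptor would typically be polled together with the other file descriptors, so shutdown logic runs in normal thread context where calls such as fprintf() are safe.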