Project

General

Profile

Bug #2723

osmo-trx-uhd: failure to update CLOCK IND, timeout with LimeSDR

Added by pespin 11 months ago. Updated 6 months ago.

Status:
New
Priority:
Low
Assignee:
Category:
LimeSDR
Target version:
-
Start date:
12/08/2017
Due date:
% Done:

0%

Spec Reference:

Description

Running latest mast4ers of all software components (osmocom, limesuite, soapy) and uhd 3.10.

I first start osmo-bts-trx, which enques ctrl cmd to "POWER OFF" the trx

..
20171208175145985 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:601 Open transceiver for phy0.0
20171208175145985 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:221 Enqueuing TRX control command 'CMD POWEROFF'
..
(BSC OML enters here to configure everything:)
20171208175146336 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:221 Enqueuing TRX control command 'CMD RXTUNE 1782000'
20171208175146336 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:221 Enqueuing TRX control command 'CMD TXTUNE 1877000'
20171208175146336 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:221 Enqueuing TRX control command 'CMD SETTSC 7'
20171208175146336 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:221 Enqueuing TRX control command 'CMD POWERON'
20171208175146336 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:221 Enqueuing TRX control command 'CMD SETRXGAIN 1'
20171208175146336 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:221 Enqueuing TRX control command 'CMD SETPOWER 20'
20171208175146336 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:221 Enqueuing TRX control command 'CMD SETSLOT 0 5'
20171208175146379 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:221 Enqueuing TRX control command 'CMD SETSLOT 1 1'  <--  (until 7)
..
(Then, several of this as osmo-trx is still stopped, so the timer to re-send the first command in the que retriggers:)
20171208175147985 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:172 No response from transceiver for phy0.0
...
(Then, I power on osmo-trx, whichs answers the commands):
20171208175245995 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:379 Response message: 'RSP POWEROFF 0'
20171208175246094 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:379 Response message: 'RSP RXTUNE 0 1782000'
20171208175246177 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:379 Response message: 'RSP TXTUNE 0 1877000'
20171208175246177 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:379 Response message: 'RSP SETTSC 0 7'
20171208175247078 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:379 Response message: 'RSP POWERON 0'
20171208175247080 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:124 Clock indication: fn=1786123
20171208175247080 DL1C <0006> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/scheduler_trx.c:1619 TRX Clock Ind: elapsed_us=-1506114067, elapsed_fn=-929525, er
ror_us=-1511323488
20171208175247080 DL1C <0006> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/scheduler_trx.c:1632 GSM clock skew: old fn=0, new fn=1786123
20171208175247584 DTRX <000b> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/trx_if.c:379 Response message: 'RSP SETRXGAIN 0 0'
20171208175248931 DL1C <0006> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-trx/scheduler_trx.c:1510 No more clock from transceiver
20171208175248931 DOML <0001> /home/pespin/dev/sysmocom/git/osmo-bts/src/common/bts.c:210 Shutting down BTS 0, Reason No clock from osmo-trx

Currently, as a response to POWERON, osmo-trx is forced to send its first clock indication. On receiving this first clock indication, osmo-bts-trx calls trx_sched_clock() which sets transceiver_available=true and calls trx_setup_clock(), which sets up the timer to process FNs based on time elapsed, and also checks that the skew with clock from osmo-trx (CLOCK IND) is not too big (around 400 FN), otherwise it exits as we see in the above trace.

The skew happens because after sending its first CLOCK IND, osmo-trx is not sending more of them. The first one is forced in code at RSP POWERON time, but next ones are sent based on some internal clocks, and the conditions to send more CLOCK IND are not happening:

if (mForceClockInterface || mTransmitDeadlineClock > mLastClockUpdateTime + GSM::Time(216,0)) {
    LOG(INFO) << "ClockInterface: sending clock ind: force="<< mForceClockInterface << " " << mTransmitDeadlineClock << " > " << mLastClockUpdateTime + GSM::Time(216,0);
    mForceClockInterface = false;
    writeClockInterface();
  }

In osmo-trx logs, I see this all the time:

INFO 140737354032896 17:52:48.3 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/Transceiver.cpp:880:driveReceiveRadio: ClockInterface: skipping clock ind: force=0 3:1786262 > 3:1786337
INFO 140737354032896 17:52:48.3 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/Transceiver.cpp:880:driveReceiveRadio: ClockInterface: skipping clock ind: force=0 3:1786262 > 3:1786337
ERR 140737354032896 17:52:48.3 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/UHDDevice.cpp:1383:write: Skipping buffer data: timestamp=20902860 time_end=20899800
INFO 140737354032896 17:52:48.3 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/Transceiver.cpp:880:driveReceiveRadio: ClockInterface: skipping clock ind: force=0 3:1786262 > 3:1786337
INFO 140737354032896 17:52:48.3 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/Transceiver.cpp:880:driveReceiveRadio: ClockInterface: skipping clock ind: force=0 3:1786262 > 3:1786337
ERR 140737354032896 17:52:48.3 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/UHDDevice.cpp:1383:write: Skipping buffer data: timestamp=20907960 time_end=20904900
INFO 140737354032896 17:52:48.3 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/Transceiver.cpp:880:driveReceiveRadio: ClockInterface: skipping clock ind: force=0 3:1786262 > 3:1786337
INFO 140737354032896 17:52:48.3 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/Transceiver.cpp:880:driveReceiveRadio: ClockInterface: skipping clock ind: force=0 3:1786262 > 3:1786337
ERR 140737354032896 17:52:48.3 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/UHDDevice.cpp:1383:write: Skipping buffer data: timestamp=20913060 time_end=20910000
INFO 140737354032896 17:52:48.3 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/Transceiver.cpp:880:driveReceiveRadio: ClockInterface: skipping clock ind: force=0 3:1786262 > 3:1786337
INFO 140737354032896 17:52:48.3 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/Transceiver.cpp:880:driveReceiveRadio: ClockInterface: skipping clock ind: force=0 3:1786262 > 3:1786337
ERR 140737354032896 17:52:48.3 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/UHDDevice.cpp:1383:write: Skipping buffer data: timestamp=20918160 time_end=20915100
INFO 140737354032896 17:52:48.3 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/Transceiver.cpp:880:driveReceiveRadio: ClockInterface: skipping clock ind: force=0 3:1786262 > 3:1786337
INFO 140737354032896 17:52:48.3 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/Transceiver.cpp:880:driveReceiveRadio: ClockInterface: skipping clock ind: force=0 3:1786262 > 3:1786337
ERR 140737354032896 17:52:48.3 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/UHDDevice.cpp:1383:write: Skipping buffer data: timestamp=20922240 time_end=20920200

So it seems the clocks are not being updated, and as they are not updated, there's no new updated clock indications being sent to osmo-bts-trx, which then closes with the timeout.


Related issues

Related to OsmoTRX - Bug #2816: osmo-trx-uhd: bug with limesdr send timed outClosed2018-01-02

Related to OsmoTRX - Bug #3039: LimeSDR: Lots of lost signalling blocks ends voice call with "Connection Failure" after a few secondsClosed2018-03-07

Has duplicate OsmoTRX - Support #3316: osmo-trx-uhd fails after OpenBTS starts, LimeSDRClosed2018-06-04

History

#1 Updated by pespin 11 months ago

I forge to add that after a while (even after I already stopped osmo-bts-trx), I get the following lines in the output:

INFO 140737354032896 17:52:48.6 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/Transceiver.cpp:880:driveReceiveRadio: ClockInterface: skipping clock ind: force=0 3:1786262 > 3:1786337
ERR 140737354032896 17:52:48.7 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/UHDDevice.cpp:844:check_rx_md_err: No packet received, implementation timed-out at 27.0777 sec.
ALERT 140737354032896 17:52:48.7 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/UHDDevice.cpp:848:check_rx_md_err: UHD: Receive timed out
[New Thread 0x7fffeb7fe700 (LWP 14511)]
ALERT 140737354032896 17:52:48.7 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/radioInterface.cpp:320:pullBuffer: Receive error 0
INFO 140737354032896 17:52:48.7 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/Transceiver.cpp:880:driveReceiveRadio: ClockInterface: skipping clock ind: force=0 3:1786262 > 3:1786337
[New Thread 0x7fffe155e700 (LWP 14512)]
[New Thread 0x7fffe0d5d700 (LWP 14513)]
ALERT 140737354032896 17:52:48.7 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/UHDDevice.cpp:867:check_rx_md_err: UHD: Loss of monotonic time
[Thread 0x7fffe155e700 (LWP 14512) exited]
[Thread 0x7fffe0d5d700 (LWP 14513) exited]
[Thread 0x7fffeb7fe700 (LWP 14511) exited]
[New Thread 0x7fffeb7fe700 (LWP 14517)]
[New Thread 0x7fffe155e700 (LWP 14519)]
[New Thread 0x7fffe0d5d700 (LWP 14518)]
[Thread 0x7fffe0d5d700 (LWP 14518) exited]
[Thread 0x7fffe155e700 (LWP 14519) exited]
[Thread 0x7fffeb7fe700 (LWP 14517) exited]
[New Thread 0x7fffeb7fe700 (LWP 14520)]
[New Thread 0x7fffe155e700 (LWP 14521)]
[New Thread 0x7fffe0d5d700 (LWP 14522)]
ALERT 140737354032896 17:52:48.7 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/UHDDevice.cpp:868:check_rx_md_err: Current time: 0, Previous time: 27.0758
INFO 140737354098432 17:52:50.2 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/UHDDevice.cpp:473:setTxGain: Set TX gain to 44dB (asked for 44dB)
INFO 140737354098432 17:52:50.2 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/Transceiver.cpp:685:driveControl: command is CMD SETPOWER 20
ALERT 140737354065664 17:52:50.4 /home/pespin/dev/sysmocom/git/osmo-trx/Transceiver52M/UHDDevice.cpp:1022:writeSamples: UHD: Device send timed out

#2 Updated by ttsou 11 months ago

UHDDevice.cpp:844:check_rx_md_err: No packet received, implementation timed-out at 27.0777 sec.
UHDDevice.cpp:848:check_rx_md_err: UHD: Receive timed out
UHDDevice.cpp:867:check_rx_md_err: UHD: Loss of monotonic time
UHDDevice.cpp:1022:writeSamples: UHD: Device send timed out

The above truncated sequence of errors indicates that the device has stopped (or never started) sending sample packets to the host. The cause is device specific, but similar issues are often associated with flow control and/or overrun handling.

The receive sample packet stream from the device drives the TRX internal clock loop that drives both the CLK IND messages and the TRX transmit path.

To resolve this issue, determine why the LimeSDR is not sending samples to the host.

#3 Updated by laforge 11 months ago

  • Related to Bug #2816: osmo-trx-uhd: bug with limesdr send timed out added

#4 Updated by laforge 11 months ago

  • Priority changed from Normal to High

#5 Updated by laforge 9 months ago

  • Category set to LimeSDR

#6 Updated by pespin 6 months ago

  • Subject changed from osmo-trx: failure to update CLOCK IND, timeout with LimeSDR to osmo-trx-uhd: failure to update CLOCK IND, timeout with LimeSDR

#7 Updated by pespin 6 months ago

Using latest LimeSuite b1f38113328fa5128adf11fc6e54898cbfea4a78.

Running latest master osmo-trx-uhd I can still see this issue.
However, using latest osmo-trx-lms (branch laforge/lime), the clock is far more stable, with similar error values than when running osmo-trx-uhd with an ettus b200. osmo-bts-trx doesn't exit prematurely in this case.

#8 Updated by laforge 6 months ago

  • Priority changed from High to Low

reclassifying as "Low" as we are working on osmo-bts-lms to address this.

#9 Updated by pespin 5 months ago

  • Has duplicate Support #3316: osmo-trx-uhd fails after OpenBTS starts, LimeSDR added

#10 Updated by pespin 5 months ago

  • Related to Bug #3039: LimeSDR: Lots of lost signalling blocks ends voice call with "Connection Failure" after a few seconds added

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)