Bug #6015
openosmo-pcu (with ericsson RBS) is unable to keep TRAU frames in sync
Description
The problem is observed with an RBS installation that uses a DUG20 but a different (presumably older, RRUS 01?) transceiver. The symptom is that the PCU is unable to get a reliable sync on the TRAU frames coming from the CCU (transceiver unit). The CCU eventually closes the channel.
We see the following messages:
Mon Apr 24 09:34:08 2023 <0010> trau/trau_sync.c:519 trau_sync(trau-sync){FRAME_ALIGNED}: state_chg to FRAME_ALIGNMENT_LOST
The reasons for this could be:
- Unreliable E1 connection (the setup uses icE1usb+osmo-e1d)
- The transceiver module has a slightly different TRAU frame format (unlikely)
- Bug in the TRAU synchronizer of libosmo-abis
Updated by dexter 8 months ago
(While testing on Monday keith noticed that the GPS sync was lost. To be sure that this is not the problem he has re-tested it with proper GPS sync, unfortunately with the same result.)
Meanwhile I have analyzed the problem a bit further. I am very sure that the problem is not due to an unreliable E1 connection. In fact, the connection and the TRAU sync seem to be ok. When filtering the log with
grep "CCU-\|LOST\|PCU-\|555555555555555555\|synchronized" ./pcu-startup-64k.log
One can see that it goes through the sync procedure. But even though we send TRAU frames to the CCU, it does not stop sending SYNC indications. That is why we see an "In sync with CCU" message in response to every data indication we send. The CCU on the other end also seems to understand the frame since dbe and dfe are 0. If there were problems with the frame format we should get at least dfe=1 in the log. Also when looking at both logs we see the exact same behavior. So I think we can rule out a frame format problem.
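For anyone who wants to post-process the log further, the same filter can be written as a few lines of Python. This is only a sketch: the search terms are the ones from the grep command above, while the function name `filter_trau_log` is made up for illustration.

```python
import re

# Same alternatives as in the grep expression above.
PATTERN = re.compile(r"CCU-|LOST|PCU-|555555555555555555|synchronized")

def filter_trau_log(lines):
    """Return only the lines that mention one of the search terms."""
    return [line for line in lines if PATTERN.search(line)]

if __name__ == "__main__":
    # File name as used in the grep command above.
    with open("./pcu-startup-64k.log", errors="replace") as f:
        for line in filter_trau_log(f):
            print(line, end="")
```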
I have the impression that the sync procedure somehow fails. Maybe the CCU fails to finish the procedure and still thinks that it is not synced yet. Maybe it tries to tell us: "You are sending me data frames, but I am not synced; here is a SYNC indication!"
One must also take into account that the latency of the E1 line is about 13 TRAU frames, so when we send a CCU_SYNC_IND, it will take some frames until it reaches the CCU; then the CCU has to respond, and this also takes time. Maybe we have to send the SYNC indications for longer. I have hacked up osmo-pcu so that it does not immediately stop the synchronization. Once in sync, it will continue to send CCU-SYNC-IND for a few more TRAU frames.
The changes can be found on osmo-pcu.git pmaier/ccuhacks
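To illustrate the idea (this is not the code on the branch, just a rough sketch of the principle): after sync is first detected, keep sending SYNC indications for a fixed number of additional frames before switching to data frames. The class name `SyncHoldoff`, the method `on_frame` and the constants are made up for this example. The 20 ms frame duration is the usual TRAU frame period and is assumed here; the 13 frames are the approximate latency mentioned above.

```python
FRAME_MS = 20              # assumed TRAU frame period in milliseconds
LINK_LATENCY_FRAMES = 13   # approximate E1 latency reported above

class SyncHoldoff:
    """Keep sending SYNC indications for extra_frames after sync is seen."""

    def __init__(self, extra_frames):
        self.extra_frames = extra_frames
        self.remaining = None  # None: sync has not been detected yet

    def on_frame(self, in_sync):
        """Decide what to send in this frame period: 'SYNC' or 'DATA'."""
        if not in_sync:
            # Sync lost (or never reached): restart the procedure.
            self.remaining = None
            return "SYNC"
        if self.remaining is None:
            # First frame in which sync was detected: start the hold-off.
            self.remaining = self.extra_frames
        if self.remaining > 0:
            self.remaining -= 1
            return "SYNC"
        return "DATA"

# One-way latency in milliseconds under the assumptions above.
latency_ms = LINK_LATENCY_FRAMES * FRAME_MS
```

Under these assumptions the latency works out to 13 * 20 ms = 260 ms, so a hold-off shorter than the round trip may well end before the CCU's reaction to our first SYNC indication has arrived.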
keith: can you try the modified osmo-pcu version and attach the log output to this ticket?
Updated by keith 7 months ago
- File pcu_log.gz pcu_log.gz added
I should also mention that once the PCU changes to FRAME_ALIGNED, osmo-e1d logs:
mux_demux.c:142 (I0:L0:T6) TS read underflow: We had 8 bytes to read, but socket returned only -1
I only noticed this now because I was running osmo-e1d in a terminal window.
I have not been able to get the dahdi driver running yet, I'm almost ashamed to say. :-(
I think it would be great if you were able to verify the icE1usb and osmo-e1d with your DUG/RUS.
I hope this is enough pcu log.
Updated by dexter 7 months ago
I have now tested what happens when we try to use our setup (RUS 02 B3 with DUG 20 01) with icE1usb (GPS synced). The behavior I see is indeed the same. I have tested with 64kbps and 16kbps. I also tried out the pmaier/ccuhacks branch, but the hacks (relaxing the sync procedure) did not help to fix the problem.
Since I have only tried it with a temporary setup on my laptop, I would say that to investigate this further we should set up a more permanent testbed with an icE1usb in the testrack. We should also try the dahdi kernel driver; maybe this one offers better performance than osmo-e1d.
Updated by dexter 3 months ago
I only had a very brief look at this, so I am not sure if this relates to the problem that we are facing here:
https://discourse.osmocom.org/t/octoi-bit-errors/134
https://osmocom.org/issues/6169
Updated by laforge 3 months ago
On Fri, Sep 08, 2023 at 08:22:31AM +0000, dexter wrote:
I only had a very brief look at this, so I am not sure if this relates to the problem that we are facing here:
It is not, as we are not using OCTOI but just icE1usb in this case. Furthermore, libosmo-abis is using the correct buffer policy for B-channels.