Project

General

Profile

Actions

Bug #3104

closed

sysmobts with 201705 image no audio first seconds after call is accepted

Added by pespin about 6 years ago. Updated almost 6 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
-
Start date:
03/23/2018
Due date:
% Done:

100%

Spec Reference:

Description

I have a sysmbots with an entire network running on it (bts, mgw, hlr, bsc, msc, sgsn, ggsn, etc.). It is running 201705 nightly image, with current latest osmocom master code.

I have 2 MS registered on it. I start a call from one MS to the other, and I accept it and I see rtp stream being sent, but during the first 5-10 seconds there's no audio. It can be easily tested by blowing air into the micrphone of one MS since the call is accepted. Once the audio appears, there's no lag on it (like the audio first heard it's not the audio I produced 5 seconds ago).

I think I have the same issue running only osmo-bts-symo in the sysmobts and the rest of the network in my PC.

However, I think (I need to re-confirm) that when I use the same network on my PC, but using osmo-bts-trx with sysmocell5000, I don't see this behavior.

I attach a trace showing the issue (sorry no gsmtap enabled)


Files


Related issues

Related to OsmoBTS - Bug #3297: Properly deal with SSRC changesResolveddexter05/28/2018

Actions
Actions #1

Updated by dexter about 6 years ago

Since I never experienced such an issue with current master, but quite old sysmobts firmware we suspect that this could be a problem with osmo-bts. I had a quick look at the trace, it looks ok, at least there seem not to be any obvious delays or other quirks. The RTP is there, maybe it contains a lot of silence frames at the beginning of the call?

Actions #2

Updated by dexter almost 6 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 30

I tried to test it with Paus sysmobts, but unfortunately that is not in a working state at the moment, so I upgraded mine to sysmocom-core-image-sysmobts-v2-20170807010027.rootfs.ubi

So far I am not able to reproduce the problem, everything looks fine. I get voice throughput immediately after the call is established.

Actions #3

Updated by dexter almost 6 years ago

In order to see with if the issue has something to do with the phones (Samsung Galaxy S2) I did another test, also here Audio is immediately there.

Actions #4

Updated by dexter almost 6 years ago

  • Assignee changed from dexter to pespin
Actions #5

Updated by pespin almost 6 years ago

Updated to latest 201705 nightly and tested following 2 scenarios:
- Complete CNI running in the sysmobts
- osmo-bts in sysmobts connected to CN in my PC

Both scenarios show the issue I'm describing.

Actions #6

Updated by pespin almost 6 years ago

It seems the issue also appears when using a sysmocell5k (with osmo-bts-trx) and the core network on my PC.

Using dexter's CN with myt sysmobts doesn't show the issue, so it's probably something related to the CN. We spotted a main difference between his and my setup: I'm using only 1 osmo-mgw (in both my PC and in sysmobts scenarios), while dexter is using 2 (one for the bsc, one for the msc).

Actions #7

Updated by dexter almost 6 years ago

  • % Done changed from 30 to 90

I managed to reproduce the problem using pespin's osmo-mgw.cfg. I think I managed to pinpoint the problem. It looks like a problem with the SSRC and since there is a VTY option to enable SSRC patching my guess ist that this is a known problem.

Try with:
rtp-patch ssrc

btw.: Even if it seems to have nothing to do with the issue I would also add:
rtp-patch timestamp

Actions #8

Updated by pespin almost 6 years ago

  • Status changed from In Progress to Feedback
  • % Done changed from 90 to 30

I confirm enabling "rtp-patch ssrc" in osmo-mgw.cfg solves the issue.

However, I think it would be interesting to get an explanation in here on why this option intends to patch the SSRC and how it comes it fixes this issue. I did a look at the code but I couldn't find any documentation explaining it.

Actions #9

Updated by laforge almost 6 years ago

On Mon, May 14, 2018 at 12:22:57PM +0000, dexter [REDMINE] wrote:

I managed to reproduce the problem using pespin's osmo-mgw.cfg. I think I managed to pinpoint the problem. It looks like a problem with the SSRC and since there is a VTY option to enable SSRC patching my guess ist that this is a known problem.

SSRC patching was implemented only for nanoBTS which have a problem if the SSRC of their peer
suddenly changes within a call. This is the case in hand-over situations.

With OsmoBTS no such issue has existed in the past. And if there was such an issue now,
we have full control over the source code, so we should fix it properly there)

Actions #10

Updated by pespin almost 6 years ago

  • Assignee changed from pespin to dexter
Actions #11

Updated by laforge almost 6 years ago

  • Priority changed from Normal to Urgent
Actions #12

Updated by dexter almost 6 years ago

I have investigated the problem. It was really hard to spot because the two
traces look alomst identical on the first look. One really has to look very
close to spot the difference. Also the code in osmo-mgw modifies various struct
members when rtp_end->force_constant_ssrc is enabled. However, I think I now
have an understanding for the problem:

no_ssrc_patching_rtp_rtcp_mgcp_ipacc.pcapng ===========================================

The first call leg gets connected, so far we find uniform
SSRC numbers in the trace.

#1
MGW BTS
4010 ==============A2===========> 57398
4010 <=============A2============ 57398

When we look later in the trace when both lengs are already
connected we find the following situation

#1
MGW BTS
4010 ==============CD===========> 57398
4010 <=============A2============ 57398

#2
MGW BTS
4014 ==============A2===========> 60010
4014 <=============CD============ 60010

The SSRC numbers are mixed up. Each leg uses both. The reason
for this is that when the streams from both legs get crossed,
the SSRC numbers get crossed too. This also means that a
stream that has begun with one SSRC number suddenly transmits
with another one (unexpectetly for the receiver)

ssrc_patching_rtp_rtcp_mgcp_ipacc.pcapng ========================================

The beginning is the same as without SSRC patching enabled.

#1
MGW BTS
4002 ==============2F===========> 38925
4002 <=============2F============ 38925

But the end result looks different:

#1
MGW BTS
4002 ==============2F===========> 38925
4002 <=============2F============ 38925

#2
MGW BTS
4006 ==============BB===========> 53999
4006 <=============BB============ 53999

As we can see, the SSRC numbers are still uniform / did not
suddenly change.

I also had a look at the BTS side and it seems that rtp_session_recvm_with_ts() in
osmo_ortp.c (libosmo-abis, recv_with_cb()) does not return anything. Presumably ortp
already decides that the received packets are not valid.

Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(102880): ERROR!
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(103040): ERROR!
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(103200): ERROR!
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(103360): ERROR!
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(103520): ERROR!
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(103680): ERROR!
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(103840): ERROR!
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(104000): ERROR!
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:0 Cannot use the scheduled mode: the scheduler is not started. Call ortp_scheduler_init() at the begginning of the application.Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:0 can't guess current timestamp because session is not scheduled.Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:143 osmo-ortp(16384): timestamp_jump, new TS 0, resyncing
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(104160): ERROR!
Fri May 25 17:28:51 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(104320): ERROR!
Fri May 25 17:28:51 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(104480): ERROR!
Fri May 25 17:28:51 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(104640): ERROR!

The question is now what exactly is the problem here. When looking at RFC 3550,
8.2 Collision Resolution and Loop Detection. I find the following:

"In this algorithm, packets from a newly conflicting source address will be ignored and packets
from the original source address will be kept. If no packets arrive from the original source for an
extended period, the table entry will be timed out and the new source will be able to take over.
This might occur if the original source detects the collision and moves to a new source identifier,
but in the usual case an RTCP BYE packet will be received from the original source to delete the
state without having to wait for a timeout."

When I get this correct, then this is not really a bug in osmo-bts. When the
SSRC suddenly changes, ortp discards the packets that contain the new SSRC
because it assumes that these packets were send in error. RFC 3550 also states
that when the original source remained silent for some time, the stream from
the new source may be accepted.

This fits in the picture. Presumably we just need to add some logic to the MGW
that it sends an RTCP BYE when the stream gets switched. This might fix the
problem.

At the moment I can not see any RTCP BYE in the RTCP stream:
(no_ssrc_patchin_and_no_rtcp_omit_filtered_by_rtcp.pcapng)

Actions #13

Updated by laforge almost 6 years ago

as discussed in a brief phone call with dexter, the underlying problem seems to be that the RTP streams are looped back in the first place. If the stream is in "rcvonly" or "sendonly" mode, then the audio is uni-directional. Only once it is put in "sendrecv" during MDCX, it will be bi-directional. At that point, the SSRC won't change at MDCX time.

We still need to consider SSRC changes during intra-BSC hand-over, but let's spin that into a separate ticket.

Actions #14

Updated by laforge almost 6 years ago

Actions #15

Updated by laforge almost 6 years ago

Actions #16

Updated by laforge almost 6 years ago

  • Related to Bug #3297: Properly deal with SSRC changes added
Actions #17

Updated by dexter almost 6 years ago

  • % Done changed from 30 to 100

I have now fixed the problem that the connection is initially created in loopback mode. Now the SSRC does does not change anymore in the beginning (however, it will naturally change on handovers, see also #3297). See also:

https://gerrit.osmocom.org/#/c/osmo-mgw/+/9356 client: do not start connections in loopback mode
https://gerrit.osmocom.org/#/c/osmo-msc/+/9357 mgcp: do not start connections in loopback mode
https://gerrit.osmocom.org/#/c/osmo-msc/+/9358 mgcp: hack to keep IuUP working
https://gerrit.osmocom.org/#/c/osmo-mgw/+/9359 mgcp_network: do not log destination invalid ip/port as error

Actions #18

Updated by laforge almost 6 years ago

  • Status changed from In Progress to Resolved

100% but not resolved? fixing.

Actions

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)