Bug #3297

closed

Properly deal with SSRC changes

Added by laforge almost 6 years ago. Updated almost 6 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
Start date: 05/28/2018
Due date:
% Done: 100%
Spec Reference:

Description

From #3104, we found out that osmo-bts (with osmo-ortp) seems to be dropping RTP packets when the SSRC changes.

I also had a look at the BTS side and it seems that rtp_session_recvm_with_ts() in
osmo_ortp.c (libosmo-abis, recv_with_cb()) does not return anything. Presumably ortp
already decides that the received packets are not valid.

Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(102880): ERROR!
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(103040): ERROR!
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(103200): ERROR!
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(103360): ERROR!
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(103520): ERROR!
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(103680): ERROR!
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(103840): ERROR!
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(104000): ERROR!
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:0 Cannot use the scheduled mode: the scheduler is not started. Call ortp_scheduler_init() at the begginning of the application.
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:0 can't guess current timestamp because session is not scheduled.
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:143 osmo-ortp(16384): timestamp_jump, new TS 0, resyncing
Fri May 25 17:28:50 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(104160): ERROR!
Fri May 25 17:28:51 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(104320): ERROR!
Fri May 25 17:28:51 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(104480): ERROR!
Fri May 25 17:28:51 2018 <0015> trau/osmo_ortp.c:190 osmo_rtp_socket_poll(104640): ERROR!

The question now is what exactly the problem is here. In RFC 3550,
section 8.2 (Collision Resolution and Loop Detection), I find the following:

"In this algorithm, packets from a newly conflicting source address will be ignored and packets
from the original source address will be kept. If no packets arrive from the original source for an
extended period, the table entry will be timed out and the new source will be able to take over.
This might occur if the original source detects the collision and moves to a new source identifier,
but in the usual case an RTCP BYE packet will be received from the original source to delete the
state without having to wait for a timeout."

If I understand this correctly, then this is not really a bug in osmo-bts. When the
SSRC suddenly changes, ortp discards the packets that contain the new SSRC
because it assumes that these packets were sent in error. RFC 3550 also states
that when the original source has remained silent for some time, the stream from
the new source may be accepted.
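
To make the quoted RFC 3550 behaviour concrete, here is a minimal sketch of a single-entry source table. It is only a model under stated assumptions: the struct, the function names and the 5000 ms timeout used in the usage note are hypothetical, the transport address is reduced to one integer, and nothing here is taken from libortp or osmo-bts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-entry source table; RFC 3550 does not mandate this layout. */
struct src_entry {
	bool     valid;      /* entry is currently bound to a source address */
	uint32_t addr;       /* address of the original source (simplified to one integer) */
	uint32_t last_seen;  /* time of the last kept packet, in ms */
	uint32_t timeout_ms; /* assumed inactivity timeout, illustrative only */
};

/* Returns true if an RTP packet from 'addr' arriving at 'now_ms' is kept. */
static bool src_on_rtp(struct src_entry *e, uint32_t addr, uint32_t now_ms)
{
	assert(e != NULL);
	/* A silent original source times out, so a new source can take over. */
	if (e->valid && now_ms - e->last_seen >= e->timeout_ms)
		e->valid = false;
	if (!e->valid) {
		e->valid = true;
		e->addr = addr;
		e->last_seen = now_ms;
		return true;
	}
	if (addr != e->addr)
		return false; /* newly conflicting source: ignored */
	e->last_seen = now_ms;
	return true;
}

/* An RTCP BYE from the original source deletes the state without waiting. */
static void src_on_bye(struct src_entry *e, uint32_t addr)
{
	assert(e != NULL);
	if (e->valid && addr == e->addr)
		e->valid = false;
}
```

With a table initialised as `{ .timeout_ms = 5000 }`, a packet from a second address is rejected while the first source is still active, and is kept either once 5000 ms have passed without traffic from the first source or right after `src_on_bye()` has been called for the first source.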

This fits the picture. Presumably we just need to add some logic to the MGW
so that it sends an RTCP BYE when the stream gets switched. This might fix the
problem.

At the moment I can not see any RTCP BYE in the RTCP stream:
(no_ssrc_patchin_and_no_rtcp_omit_filtered_by_rtcp.pcapng)


Related issues

Related to OsmoBTS - Bug #3104: sysmobts with 201705 image no audio first seconds after call is accepted (Resolved, dexter, 03/23/2018)

Actions #1

Updated by laforge almost 6 years ago

  • Related to Bug #3104: sysmobts with 201705 image no audio first seconds after call is accepted added
Actions #2

Updated by laforge almost 6 years ago

libortp has a function called rtp_session_set_ssrc_changed_threshold which is documented as "Sets the number of packets containing a new SSRC that will trigger the "ssrc_changed" callback."

In reality, the code in rtp_session_rtp_parse() not only uses this threshold to call the call-back, but also to decide if this new SSRC is going to be accepted from now on.

The default value is "50", which with our 20 ms codec frame duration means one second of silence after an SSRC change.

We could and probably should use rtp_session_set_ssrc_changed_threshold with a much lower threshold to be more tolerant/welcoming to SSRC changes.

Even with #3104 properly fixed (no SSRC change during start-up), we will still see SSRC changes during hand-over between BTSs, so it's important to also fix this on the BTS/ORTP side.

Unfortunately, the "BYE" RTCP packet would not reset libortp into a state where it would blindly accept any new/different SSRC. It's basically simply discarded, as far as I read the libortp source. So I guess rtp_session_set_ssrc_changed_threshold() is our only option. The question is whether we set it to 0, 1, or something like 5 (100 ms). I'd probably go for "0".
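
To put numbers on the candidate thresholds, here is a minimal sketch of a threshold gate and the dropout it causes at 20 ms per frame. This is a simplified model, not libortp's code: `struct ssrc_gate`, `ssrc_gate_accept()` and `dropout_ms()` are hypothetical names, and whether libortp's own comparison differs by one packet is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_MS 20u /* codec frame duration mentioned above */

/* Hypothetical model of "drop N packets with a new SSRC, then switch". */
struct ssrc_gate {
	bool     have_ssrc; /* an SSRC has been locked on to */
	uint32_t ssrc;      /* currently accepted SSRC */
	unsigned threshold; /* packets with a new SSRC to drop before switching */
	unsigned pending;   /* consecutive packets seen with a differing SSRC */
};

/* Returns true if a packet carrying 'ssrc' is accepted. */
static bool ssrc_gate_accept(struct ssrc_gate *g, uint32_t ssrc)
{
	assert(g != NULL);
	if (!g->have_ssrc) {
		g->have_ssrc = true;
		g->ssrc = ssrc;
		return true;
	}
	if (ssrc == g->ssrc) {
		g->pending = 0;
		return true;
	}
	if (g->pending >= g->threshold) {
		g->ssrc = ssrc; /* enough packets seen: switch to the new SSRC */
		g->pending = 0;
		return true;
	}
	g->pending++;
	return false; /* still below the threshold: packet is dropped */
}

/* Audio gap in ms after an SSRC switch for a given threshold. */
static unsigned dropout_ms(unsigned threshold)
{
	struct ssrc_gate g = { .threshold = threshold };
	unsigned dropped = 0;

	(void) ssrc_gate_accept(&g, 0x1111u); /* lock on to the original stream */
	while (!ssrc_gate_accept(&g, 0x2222u)) /* feed the new stream until accepted */
		dropped++;
	return dropped * FRAME_MS;
}
```

In this model `dropout_ms(50)` gives 1000, `dropout_ms(5)` gives 100, and `dropout_ms(0)` gives 0, which matches the figures discussed above.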

Actions #3

Updated by dexter almost 6 years ago

I think I have narrowed down the problem a little bit.

In rtp_session_recvm_with_ts() we can see that there are three ways to obtain rtp data:

    /*calculate the stream timestamp from the user timestamp */
    ts = jitter_control_get_compensated_timestamp(&session->rtp.jittctl,user_ts);
    if (session->rtp.jittctl.enabled==TRUE){
        if (session->permissive) {
            mp = rtp_getq_permissive(&session->rtp.rq, ts,&rejected);
        }
        else{
            mp = rtp_getq(&session->rtp.rq, ts,&rejected);
        }
    }else mp=getq(&session->rtp.rq);/*no jitter buffer at all*/

At the moment we are using rtp_getq(). This function checks the timestamp and then discards the packet - at least in our case. Then there is rtp_getq_permissive() which seems to be less strict here. I think it would work. To use the permissive mode we need to set the session->permissive flag, but it appears to be an internal flag and I am not sure if we should touch it from libosmocore. This probably messes everything up.

We can also see that when jittctl.enabled is set to FALSE, getq() is used to get the RTP data. We can control this flag from outside using rtp_session_enable_jitter_buffer(), which we already call in osmo_rtp_socket_set_param(). When I hardcode rtp_session_enable_jitter_buffer(rs->sess, FALSE); in there (I do not know enough about this yet), the problem vanishes.

However, I think we use the jitter buffer feature for a reason, so disabling it is presumably not an option.
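
The two knobs discussed in this comment map onto the three dequeue paths as in the following small sketch. It only restates the branch structure of the snippet above; the enum and function names are hypothetical and nothing here calls into libortp.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three ways the snippet obtains a packet. */
enum dequeue_path {
	DEQ_RTP_GETQ,            /* rtp_getq(): timestamp-checked */
	DEQ_RTP_GETQ_PERMISSIVE, /* rtp_getq_permissive(): less strict */
	DEQ_PLAIN_GETQ           /* getq(): no jitter buffer at all */
};

/* Mirrors the if/else ladder: jitter buffer flag first, then permissive flag. */
static enum dequeue_path select_dequeue_path(bool jitter_enabled, bool permissive)
{
	if (!jitter_enabled)
		return DEQ_PLAIN_GETQ;
	return permissive ? DEQ_RTP_GETQ_PERMISSIVE : DEQ_RTP_GETQ;
}

/* Name of the function that the given path ends up calling. */
static const char *dequeue_path_name(enum dequeue_path p)
{
	switch (p) {
	case DEQ_RTP_GETQ:            return "rtp_getq";
	case DEQ_RTP_GETQ_PERMISSIVE: return "rtp_getq_permissive";
	case DEQ_PLAIN_GETQ:          return "getq";
	}
	assert(0 && "unknown dequeue path");
	return "?";
}
```

Note that the permissive flag has no effect once the jitter buffer is disabled, which is why hardcoding FALSE bypasses the timestamp check entirely.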

Actions #4

Updated by dexter almost 6 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 100

The problem is definitely related to the jitter buffer. When I disable the jitter buffer via VTY, then the problem vanishes. However, this is no solution anyway.

Presumably the problem has something to do with packets which are still in some queue after the timestamp has jumped. Then probably this queue has to run empty first until libortp detects the timestamp change. I also see that we have a callback that handles timestamp changes by calling void rtp_session_resync (RtpSession *session). Since a changed SSRC means that the entire stream changed we should definitely do a resync on those events as well.

The ssrc change threshold is now changed to zero, so we detect ssrc changes immediately. Also when an ssrc change is detected we call rtp_session_resync() now. The audio dropouts are now gone.
https://gerrit.osmocom.org/#/c/libosmo-abis/+/9379 ortp: resynchronize rtp session on timestamp changes
https://gerrit.osmocom.org/#/c/libosmo-abis/+/9380 ortp: detect ssrc changes immediately
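
The idea behind the two patches can be sketched with a small self-contained model: adopt a new SSRC on its first packet (threshold zero) and have the change hook request a resync so that the timestamp base is learned again. This is not the code from the Gerrit changes; `struct rx_model`, `rx_feed()` and `rx_resync()` are hypothetical stand-ins for the session, the receive path and rtp_session_resync().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct rx_model;
typedef void (*ssrc_changed_cb)(struct rx_model *rx);

/* Hypothetical receiver state, heavily simplified. */
struct rx_model {
	uint32_t ssrc;          /* SSRC currently accepted */
	bool     ts_synced;     /* false: learn the timestamp base from the next packet */
	uint32_t ts_base;       /* stream timestamp used as the reference point */
	unsigned resync_count;  /* number of resyncs requested so far */
	ssrc_changed_cb on_ssrc_changed; /* stand-in for the "ssrc_changed" callback */
};

/* Stand-in for a resync: forget the timestamp base. */
static void rx_resync(struct rx_model *rx)
{
	assert(rx != NULL);
	rx->ts_synced = false;
	rx->resync_count++;
}

/* Feed one packet; returns its timestamp relative to the current base. */
static uint32_t rx_feed(struct rx_model *rx, uint32_t ssrc, uint32_t ts)
{
	assert(rx != NULL);
	if (rx->ts_synced && ssrc != rx->ssrc) {
		/* Threshold zero: the new SSRC is adopted on its first packet. */
		rx->ssrc = ssrc;
		if (rx->on_ssrc_changed)
			rx->on_ssrc_changed(rx);
	}
	if (!rx->ts_synced) {
		rx->ssrc = ssrc;
		rx->ts_base = ts;
		rx->ts_synced = true;
	}
	return ts - rx->ts_base; /* unsigned arithmetic wraps if the base is stale */
}
```

With `on_ssrc_changed` set to `rx_resync`, a new stream starting at timestamp 0 after an old one at 104000 is immediately re-based and yields small relative timestamps. With the hook left NULL, the same packet is measured against the stale base and its relative timestamp wraps around to a huge value, which is the situation the resync is meant to avoid.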

Actions #5

Updated by dexter almost 6 years ago

  • Status changed from In Progress to Resolved