Gaps in RTP stream originating from BTS cause problems for downstream MGW
With the current osmo-bts code, the RTP stream that originates from the BTS for the call uplink intentionally pauses (i.e., the BTS keeps track of the RTP time base, but deliberately chooses not to send packets) under any of the following conditions:
- whenever radio errors on UL prevent decoding
- whenever TCH bursts are stolen for FACCH
- whenever the MS exercises DTX on UL
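To illustrate what "keeps track of RTP time base, but chooses not to send packets" means in practice, here is a toy sketch (all names are mine, not actual osmo-bts code): the 20 ms frame clock keeps running during a pause, so the first packet after the pause carries a timestamp jump of pause_frames * 160, while the sequence number advances by only one.

```c
#include <stdint.h>

/* Hypothetical illustration, not osmo-bts code: per-call RTP TX state. */
struct rtp_tx_state {
	uint16_t seq;       /* increments only for packets actually sent */
	uint32_t timestamp; /* advances every 20 ms frame, sent or not */
};

/* Called once per 20 ms frame interval; 'send' is 0 while output is
 * paused (radio error, FACCH stealing, or uplink DTX silence). */
static int rtp_frame_tick(struct rtp_tx_state *st, int send)
{
	st->timestamp += 160;	/* time base advances unconditionally */
	if (!send)
		return 0;	/* no packet emitted during the pause */
	st->seq++;		/* seq counts only real packets */
	return 1;
}
```

A receiver thus sees consecutive sequence numbers but a timestamp gap, which is exactly the "intentional pause" signature described above.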
However, this design creates a problem for downstream transcoding MGW implementations that rely on this RTP stream as their timing source. Consider a configuration in which the RTP stream originating from an IP-based BTS such as sysmoBTS passes through the two required OsmoMGW instances (one for BSC, one for MSC) and then goes to an external MGW that performs transcoding for a PSTN interface. Furthermore, suppose that the interface to PSTN is SIP+RTP, as opposed to a TDM trunk, hence the transcoding function is pure software without any TDM hardware to act as a timing pacer. (In an alternative scenario, the problem at hand would remain the same if the transcoding function were to be integrated into MSC-serving OsmoMGW itself.)
In this scenario, whenever the RTP stream from the BTS is flowing, the best course of action the transcoding MGW can take (in terms of latency minimization) is to forward each RTP packet (with the necessary transcoding) as soon as it arrives from the BTS, relayed through the OsmoMGW instances. Any other course of action by the transcoding MGW, such as resynchronizing to its own time base, would only add latency while providing no benefit. But if the RTP stream originating from the BTS suddenly stops, what then?
Now let us further suppose that the G.711 RTP stream toward PSTN is expected to be continuous, without any breaks or pauses except those caused by Internet packet loss, and that the transcoding MGW is responsible for generating comfort noise toward PSTN whenever the mobile user is silent and his/her MS is exercising uplink DTX. But if the transcoding MGW uses the RTP stream from the BTS as its timing source, how will it pace its own generated comfort noise packets if this timing source goes away when the BTS decides to pause its RTP output? Even if the MGW has its own local clock and even if that clock is very good (e.g., ntpd using a local GPS timing receiver), it will be impossible to implement switching between BTS RTP stream timing and local clock timing without causing a noticeable timing discontinuity in the RTP stream going toward PSTN, which is not acceptable.
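To make the discontinuity concrete, here is a toy model (the numbers and function names are mine, purely for illustration): for 8 kHz G.711, RTP timestamps advance by 160 units per 20 ms frame. Pacing from received BTS packets gives a timestamp of frame_count * 160; pacing from a local wall clock gives elapsed_ms * 8. Unless the two clock bases happen to agree exactly, switching from one to the other produces a jump.

```c
#include <stdint.h>

/* Timestamp as derived from the count of frames received from the BTS. */
static uint32_t ts_from_bts(uint32_t frames_received)
{
	return frames_received * 160;	/* 160 samples per 20 ms frame */
}

/* Timestamp as derived from the MGW's own local clock. */
static uint32_t ts_from_local_clock(uint32_t elapsed_ms)
{
	return elapsed_ms * 8;		/* 8000 Hz / 1000 ms */
}
```

If, say, 100 frames have arrived from the BTS but the local clock reads 2003 ms (a mere 3 ms of skew), the two bases disagree by 24 timestamp units, and that offset appears as an audible discontinuity at the moment of switching.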
The only solution I can think of is to make the RTP stream from the BTS strictly continuous, without any intentional gaps or pauses, such that gaps will only occur as a result of IP packet loss. Packet loss events are still unavoidable, but they are an error condition that can be expected (and assumed at the system engineering level) to be infrequent, whereas uplink DTX is a fully expected normal operation condition.
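A sketch of what "strictly continuous" would mean for the BTS transmit path (names and structure are hypothetical, not a patch against osmo-bts): every 20 ms frame interval produces exactly one RTP packet, with the payload chosen among decoded speech, a SID/comfort-noise update, ECU output, or an explicit BFI marker, so the timestamp always advances by exactly 160 and gaps can only come from IP packet loss.

```c
#include <stdint.h>

/* Hypothetical payload classes for the uplink RTP stream. */
enum ul_payload {
	PAYLOAD_SPEECH,	/* good frame decoded from radio UL */
	PAYLOAD_SID,	/* comfort-noise update during uplink DTX */
	PAYLOAD_ECU,	/* error concealment unit output */
	PAYLOAD_BFI,	/* explicit bad-frame indication */
};

struct rtp_state {
	uint32_t timestamp;
};

/* Always advances the timestamp and always selects some payload;
 * there is deliberately no "send nothing" outcome. */
static enum ul_payload next_packet(struct rtp_state *st, int decode_ok,
				   int in_dtx_silence, int have_ecu)
{
	st->timestamp += 160;	/* 20 ms at 8 kHz, unconditionally */
	if (decode_ok)
		return PAYLOAD_SPEECH;
	if (in_dtx_silence)
		return PAYLOAD_SID;
	return have_ecu ? PAYLOAD_ECU : PAYLOAD_BFI;
}
```

With this structure the downstream transcoding MGW can treat every arriving packet as a 20 ms pacing tick, which is the whole point of the proposal.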
Looking at the current osmo-bts code, I see two places where the RTP stream is paused intentionally, as opposed to the sending code simply not being implemented:
- In the osmo-bts-trx version, in the rx_tchf_fn() and rx_tchh_fn() functions right after the bfi: goto label, there is a check for DTXu: if the last valid frame received from the MS was a SID, then RTP sending (ECU or BFI) is intentionally suppressed.
- In all versions, in the l1sap_tch_ind() function in common/l1sap.c, there is a check for (tch_ind->lqual_cb >= bts->min_qual_norm): if this condition isn't met, RTP output is suppressed even if the lower layer provided a frame to send out, presumably from the ECU.
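The second check can be rendered in simplified form like this (the struct and field names follow the text above, but this is a sketch, not the actual osmo-bts code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch only: minimal stand-ins for the structures involved. */
struct tch_ind { int16_t lqual_cb; };		/* link quality, centibels */
struct bts_cfg { int16_t min_qual_norm; };	/* configured threshold */

/* RTP output goes ahead only if measured link quality meets the
 * configured minimum; otherwise the frame from the lower layer
 * (e.g. ECU output) is dropped and the stream pauses. */
static bool tch_rtp_allowed(const struct tch_ind *tch_ind,
			    const struct bts_cfg *bts)
{
	return tch_ind->lqual_cb >= bts->min_qual_norm;
}
```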
In the case of osmo-bts-sysmo (the version of primary interest to me, given the hardware I have to work with), there is the additional complication that the ECU and BFI packet code that already exists in osmo-bts-trx is missing altogether - but porting that code from osmo-bts-trx to osmo-bts-sysmo will be easy in comparison to the more fundamental problem of apparently conflicting intentions.