Feature #2515

integrate osmo-mgw in osmo-bsc

Added by dexter almost 2 years ago. Updated over 1 year ago.

Status:
Closed
Priority:
High
Assignee:
Category:
-
Target version:
-
Start date:
09/18/2017
Due date:
% Done:

100%

Resolution:
Spec Reference:

Description

osmo-mgw has reached a development state where it makes sense to try out how it performs in a real-life situation. osmo-bsc seems like a good test target for that and requires MGCP features anyway to support handover. The complexity can be limited by leaving osmo-msc on the legacy MGCP while making changes only to osmo-bsc. When done, osmo-bsc should not behave any differently on the A-Interface.

History

#1 Updated by dexter almost 2 years ago

  • % Done changed from 0 to 50

Status:

  • Integrated the mgcp-client into osmo-bsc (we have its VTY, and can use its services)
  • Implemented some infrastructure to send CRCX and MDCX for the BTS side.

By now we should be halfway done here. The only thing missing is the creation of the second connection, which points to the MSC side. Besides the fact that voice calls are broken now (expected), everything looks fine. I can see the CRCX/MDCX commands in the mgw log. The issued IP addresses and port numbers also look plausible. The only thing that might point towards a problem is that I do not hear anything, even when both connections are in loopback mode, but we will see. Right now it is important that the signalling works fine; debugging the streams makes more sense once the signalling is complete.
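For readers unfamiliar with the commands mentioned above, here is a minimal sketch of what a CRCX that creates a connection in loopback mode looks like on the wire, following the generic MGCP syntax from RFC 3435. This is not the mgcp-client code from osmo-bsc: the helper name `mgcp_format_crcx` and the endpoint name used in the usage note are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of libosmo-mgcp-client): format a minimal
 * MGCP CRCX into buf. The command line is "CRCX <trans-id> <endpoint> MGCP 1.0",
 * followed by the call ID (C:) and the connection mode (M:).
 * Returns the message length, or -1 if buf is too small. */
static int mgcp_format_crcx(char *buf, size_t len, unsigned int trans_id,
                            const char *endpoint, unsigned int call_id,
                            const char *mode)
{
    int rc;

    assert(buf && endpoint && mode);

    rc = snprintf(buf, len,
                  "CRCX %u %s MGCP 1.0\r\n"
                  "C: %x\r\n"
                  "M: %s\r\n",
                  trans_id, endpoint, call_id, mode);
    if (rc < 0 || (size_t)rc >= len)
        return -1; /* output error or truncated message */
    return rc;
}
```

Calling `mgcp_format_crcx(buf, sizeof(buf), 1, "1@mgw", 0x2a, "loopback")` with the made-up endpoint `1@mgw` produces a three-line request whose `M: loopback` line asks the gateway to echo received RTP back to the sender. A later MDCX would carry a different `M:` value to leave loopback mode.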

#2 Updated by dexter almost 2 years ago

  • Tracker changed from Bug to Feature

#3 Updated by dexter almost 2 years ago

  • % Done changed from 50 to 60

Status:

I was wondering why there were no RTP packets flowing. The BTS remained completely silent. I have debugged this now (signalling problem) and now I get echoed audio (both connections are in loopback mode) on both phones.

#4 Updated by dexter almost 2 years ago

  • Status changed from New to In Progress

#5 Updated by dexter almost 2 years ago

  • There was some progress this week. I managed to pinpoint the problem that caused the BTS not to send RTP packets. Now the BTS sends its RTP packets to osmo-mgw and osmo-mgw loops them back, since I am currently creating the connection in loopback mode and do not change the mode with the MDCX. This is for testing. I can hear the voice echoing on the mobiles.
  • I have now started working on the part that points to the network side. This barely works at the moment: I got voice from one mobile to the other, but not the other way around. I will see what's wrong there tomorrow. I am sure it is something simple.

Currently the code that controls it all is callback hell. Once we have it working in both directions, we will have to rework this as an Osmo-FSM; then we also get proper timeouts in case something goes wrong during the MGCP connection negotiation.

#6 Updated by dexter almost 2 years ago

  • % Done changed from 60 to 70

There is some progress on the Osmo-FSM side. The program flow that creates and removes the connections from the MGCP-GW is now implemented. But the FSM still does not handle exceptional cases, that is what is next on my agenda. It also needs to be tested a bit more, also in terms of re-using endpoints.

#7 Updated by dexter almost 2 years ago

The FSM should now be able to handle exceptional cases. If it runs into a problem it will, depending on how far it got, jump directly to the state where the DLCX is generated; then the state machine is freed. In case of a timeout (MGCPGW is unresponsive) the FSM will make a last attempt to send a DLCX and then it will be freed.
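The error handling described above can be sketched as a small, self-contained state machine. This is only an illustration of the idea, not the code under review: it does not use the libosmocore FSM API, and all state, event and field names are hypothetical. It assumes that a DLCX is only needed once at least one connection exists on the gateway, and that a timeout always triggers a last DLCX attempt.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states: two BTS-side steps, one network-side step,
 * then the established call, the DLCX step and the freed instance. */
enum conn_state { ST_CRCX_BTS, ST_MDCX_BTS, ST_CRCX_NET,
                  ST_ESTABLISHED, ST_DLCX, ST_FREED };
enum conn_event { EV_OK, EV_ERROR, EV_TIMEOUT, EV_CALL_END };

struct conn_fsm {
    enum conn_state state;
    bool have_conn;         /* at least one connection exists on the MGW */
    unsigned int dlcx_sent; /* stand-in for actually sending a DLCX */
};

/* Leave the normal flow: go through the DLCX state if there is something
 * to delete (or if we want a last attempt), otherwise free right away. */
static void conn_fsm_teardown(struct conn_fsm *f, bool force_dlcx)
{
    if (f->have_conn || force_dlcx) {
        f->dlcx_sent++;
        f->state = ST_DLCX;
    } else {
        f->state = ST_FREED;
    }
}

static void conn_fsm_dispatch(struct conn_fsm *f, enum conn_event ev)
{
    assert(f);

    switch (f->state) {
    case ST_CRCX_BTS:
        if (ev == EV_OK) {
            f->have_conn = true;
            f->state = ST_MDCX_BTS;
        } else {
            conn_fsm_teardown(f, ev == EV_TIMEOUT);
        }
        break;
    case ST_MDCX_BTS:
        if (ev == EV_OK)
            f->state = ST_CRCX_NET;
        else
            conn_fsm_teardown(f, ev == EV_TIMEOUT);
        break;
    case ST_CRCX_NET:
        if (ev == EV_OK)
            f->state = ST_ESTABLISHED;
        else
            conn_fsm_teardown(f, ev == EV_TIMEOUT);
        break;
    case ST_ESTABLISHED:
        if (ev == EV_CALL_END)
            conn_fsm_teardown(f, false);
        break;
    case ST_DLCX:
        /* DLCX answered, rejected or timed out: free in every case */
        f->state = ST_FREED;
        break;
    case ST_FREED:
        break;
    }
}
```

Starting from `ST_CRCX_BTS`, three `EV_OK` events reach `ST_ESTABLISHED`; an error or timeout at any later step detours through `ST_DLCX` before the instance ends up in `ST_FREED`.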

Neels and I also think it might make sense to integrate resend logic directly into the mgcp-client. Our MGW is capable of handling resends, so it would be a good feature to make our system more reliable.

I have to take another look at the Assignment Complete message. At the moment it does not matter which IP address/port is sent back. This is due to the weird BTS autodetection method we find in osmo-bsc-mgcp. I have to check whether this is working properly. I will take a trace and monitor which IP/port is sent back. Then we should be fine.

At the moment the whole thing is suffering from the unfinished state of osmo-mgw. Things will get easier when https://gerrit.osmocom.org/#/c/4003/ has made it into master.

#8 Updated by laforge almost 2 years ago

On Wed, Oct 04, 2017 at 04:46:29PM +0000, dexter [REDMINE] wrote:

At the moment the whole thing is suffering from the unfinished state of osmo-mgw. Things will get easier
when https://gerrit.osmocom.org/#/c/4003/ made it into master.

Just gave my +2 and merged it. Hope we can see your osmo-bsc and osmo-msc
integration patches in review soon.

#9 Updated by dexter almost 2 years ago

  • % Done changed from 70 to 80

Status update:

I have now also tested the FSM with some exceptional cases, and it seems to work fine. If there are timeouts, the state machine will terminate (either gracefully or hard, depending on the situation). If even that fails somehow, the state machine is freed when the A interface connection is cleared (thanks talloc).

I still need to report back the IP/port on the A-Interface. The bsc_nat test also needs some attention, because it uses the legacy mgcp server, which seems to have problems with the client altogether. I am also considering renaming the states to something more concise.

see also branch pmaier/bscmgw on osmo-bsc.git

#10 Updated by laforge almost 2 years ago

Hi Philipp,

On Fri, Oct 13, 2017 at 01:48:56PM +0000, dexter [REDMINE] wrote:

If even that fails somehow the state machine is freed when the A
interface connection is cleared (thanks talloc).

This makes me wonder: Are you using talloc destructor call-backs? If
talloc simply frees the fsm instance, then the FSM could still have
timers running. Those timers need to be stopped. Also, the fsm_inst
is in a list of FSMs, from which it must be removed.

I still need to report back the IP/Port on the A-Interface.

That's of course quite important to have it working.

#11 Updated by dexter almost 2 years ago

This makes me wonder: Are you using talloc destructor call-backs? If
talloc simply frees the fsm instance, then the FSM could still have
timers running. Those timers need to be stopped. Also, the fsm_inst
is in a list of FSMs, from which it must be removed.

I see, so supplying a talloc context with

struct osmo_fsm_inst *osmo_fsm_inst_alloc(struct osmo_fsm *fsm, void *ctx, void *priv,
int log_level, const char *id);

is not enough? Maybe there should be a destructor callback. Do we still use this method somewhere?

Anyway, if everything works as expected, the FSM conn struct, where the context is held, should never be freed before the FSM is terminated.
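The concern discussed above can be illustrated with a small sketch: an instance that has an armed timer and sits in a global list must be cleaned up before its memory is released, otherwise both the timer and the list keep referring to freed memory. The code below deliberately uses neither talloc nor libosmocore; every name in it is hypothetical, and `fsm_inst_destroy()` merely stands in for whatever hook performs the cleanup (for example a destructor registered with `talloc_set_destructor()`, or terminating the FSM before its parent context is freed).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for an FSM instance (not struct osmo_fsm_inst). */
struct fsm_inst {
    struct fsm_inst *prev, *next; /* membership in the global instance list */
    bool timer_armed;
};

static struct fsm_inst *all_instances; /* head of the instance list */
static int armed_timers;               /* timers that may still fire */

/* Allocate an instance, link it into the list and arm its timer. */
static struct fsm_inst *fsm_inst_new(void)
{
    struct fsm_inst *fi = calloc(1, sizeof(*fi));

    if (!fi)
        return NULL;
    fi->next = all_instances;
    if (all_instances)
        all_instances->prev = fi;
    all_instances = fi;
    fi->timer_armed = true;
    armed_timers++;
    return fi;
}

/* Cleanup that has to happen before the memory goes away:
 * stop the timer, unlink from the list, and only then free. */
static void fsm_inst_destroy(struct fsm_inst *fi)
{
    assert(fi);

    if (fi->timer_armed) {
        fi->timer_armed = false;
        armed_timers--;
    }
    if (fi->prev)
        fi->prev->next = fi->next;
    else
        all_instances = fi->next;
    if (fi->next)
        fi->next->prev = fi->prev;
    free(fi);
}
```

Calling plain `free()` on such an instance instead of `fsm_inst_destroy()` would leave `all_instances` and the timer bookkeeping pointing at released memory, which is the failure mode raised in the question.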

#12 Updated by dexter almost 2 years ago

Now the correct IP address is reported back to the MSC, so technically everything should be complete now. I will send it for review once I have renamed the state constants.

#13 Updated by laforge almost 2 years ago

On Tue, Oct 17, 2017 at 04:19:59PM +0000, dexter [REDMINE] wrote:

Now the correct IP address is reported back to the MSC, so technically everything should be complete now. I will send it for review once I have renamed the state constants.

This may be a good point in time to re-test OsmoBSC not only against OsmoMSC but also against NG40.

#14 Updated by dexter almost 2 years ago

  • % Done changed from 80 to 90

I have now pushed my patches to gerrit for review: https://gerrit.osmocom.org/4334

This patch does not address handover. From the MGCP point of view, a handover is just one MDCX: we only need to update the connection that points to the BTS. Everything else stays the same, since the BTS is just commanded to send its RTP stream to the same location where the previous one did. This is very simple to implement. I would prefer to do this in a separate FSM (and a separate patch) with two states. It either times out (the bad case, in which we would have to kill the call) or it reaches its done state. The handover terminates in osmo_bsc_audio.c/handle_abisip_signal()/S_ABISIP_MDCX_ACK; that's where we would trigger the MDCX to osmo-mgw.

#15 Updated by dexter over 1 year ago

The review issues are now fixed. Unfortunately jenkins can not verify the build yet because there are still libosmo-mgcp patches pending.

#16 Updated by dexter over 1 year ago

Yesterday I noticed that there was still a segfault that only occurs when the mgw is offline. I fixed a few odd state transitions in the FSM and also removed the automatic freeing of all context data when the FSM is terminating. The freeing is now done externally when the sccp-connection is closed. It now also builds fine in jenkins.

#17 Updated by dexter over 1 year ago

Another odd problem fixed: in some rare cases the FSM may terminate before a late response from the MGW arrives. When the response finally arrives, it triggers its callback, and the callback then tries to interact with the FSM that no longer exists. This is now fixed.
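The ticket does not say how this was fixed. One common way to guard against such late responses, sketched below under that caveat, is to let the pending request hold a reference to its owner that is cleared when the owner terminates, so that a late callback has nothing left to touch. All names here are hypothetical and the code is independent of the mgcp-client and libosmocore APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified owner of the request (stand-in for the FSM instance). */
struct owner_fsm {
    bool terminated;
    int responses_handled;
};

/* An outstanding request to the MGW. 'fi' is NULL once the owner is gone. */
struct pending_req {
    struct owner_fsm *fi;
};

/* Terminate the owner and detach its outstanding request, if any. */
static void owner_fsm_terminate(struct owner_fsm *fi, struct pending_req *req)
{
    assert(fi);

    fi->terminated = true;
    if (req)
        req->fi = NULL; /* a late response must not reach the owner */
}

/* Response callback: returns true if the response was delivered,
 * false if it arrived too late and was dropped. */
static bool on_mgw_response(struct pending_req *req)
{
    assert(req);

    if (!req->fi)
        return false;
    req->fi->responses_handled++;
    return true;
}
```

A response that arrives while the owner is alive is counted as handled; after `owner_fsm_terminate()` the same callback returns `false` and leaves the owner untouched.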

#18 Updated by dexter over 1 year ago

  • % Done changed from 90 to 100

The patch now also addresses handover. Successfully tested with two SysmoBTS, two K800i and manual triggering through VTY. When the call is active and a handover event occurs, an extra MDCX is generated to update the connection data inside the MGW.

#19 Updated by laforge over 1 year ago

On Mon, Nov 06, 2017 at 04:43:17PM +0000, dexter [REDMINE] wrote:

The patch now also addresses handover.

Congratulations! This is good news.

#20 Updated by laforge over 1 year ago

  • Priority changed from Normal to High

#21 Updated by dexter over 1 year ago

  • Status changed from In Progress to Resolved

#22 Updated by laforge over 1 year ago

  • Status changed from Resolved to Closed
