Bug #4952

Fix NRI routing in case SGSN is down

Added by laforge about 1 month ago. Updated 7 days ago.

Status: New
Priority: High
Assignee:
Target version: -
Start date: 01/15/2021
Due date:
% Done: 0%
Spec Reference:

Related issues

Related to osmo-gbproxy - Feature #4951: more TTCN3 tests for SGSN pooling (In Progress, 01/15/2021)

Related to osmo-gbproxy - Bug #4897: gbproxy2: Re-introduce handling of NS_AFF_CAUSE_FAILURE (New, 12/08/2020)

History

#1 Updated by laforge about 1 month ago

  • Subject changed from ix NRI routing in case SGSN is down to Fix NRI routing in case SGSN is down
  • Priority changed from Normal to High

I believe the NAS Node Selection Function must take into account whether or not the given SGSN is currently available.

Let's assume we have two SGSNs in the pool:
  • SGSN 0 serves NRI 3
  • SGSN 1 serves NRI 4

Now assume SGSN 0 has an outage.

Any traffic without a TLLI, or with a TLLI that maps to the NULL NRI, will now trigger the selection function. We currently choose any configured SGSN unless it is administratively disabled with "no allow-attach". However, we do not check whether the given PTP-BVC at that SGSN is currently available.
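A minimal sketch of what an availability-aware selection could look like, using the two-SGSN pool from the example above. This is not the gbproxy implementation: the `Sgsn` class, its field names, `select_sgsn` and the round-robin policy are all assumptions made for illustration. The only point is that a pool member is skipped when its PTP-BVC is down, in the same way it is skipped when "no allow-attach" is set.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Sgsn:
    """Hypothetical model of one pool member as seen by the proxy."""
    name: str
    nris: frozenset            # NRI values served by this SGSN
    allow_attach: bool = True  # False models "no allow-attach"
    bvc_up: bool = True        # False models an unavailable PTP-BVC


def select_sgsn(pool, counter):
    """Round-robin (assumed policy) over SGSNs that are attachable and reachable."""
    candidates = [s for s in pool if s.allow_attach and s.bvc_up]
    if not candidates:
        return None  # nothing usable; the caller has to decide what to do
    return candidates[next(counter) % len(candidates)]


pool = [
    Sgsn("SGSN 0", frozenset({3}), bvc_up=False),  # the outage from the example
    Sgsn("SGSN 1", frozenset({4})),
]
rr = count()
picks = [select_sgsn(pool, rr).name for _ in range(3)]
print(picks)  # ['SGSN 1', 'SGSN 1', 'SGSN 1']
```

With the `bvc_up` check removed, the same loop would alternate between both SGSNs and send every second new attach to the dead pool member, which is the behavior described above.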

#2 Updated by laforge about 1 month ago

Another interesting question is what is supposed to happen with traffic carrying an NRI for the now-defunct SGSN pool member. If we simply route it to any other SGSN, that SGSN will not know what to do with that traffic. But at least it should then return some error to the MS, so the MS can re-attach?
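To make the question concrete, here is a hedged sketch of the "route it to any other SGSN" option. It is one possible behavior, not something the specs or gbproxy are confirmed to do. The function names, the 10-bit NRI length and the dictionary layout are assumptions; the sketch also ignores the TLLI type (a random or auxiliary TLLI carries no NRI). The NRI is taken from the bits starting at bit 23, as it is placed in the P-TMSI.

```python
NRI_BITLEN = 10  # assumed NRI length; the real value is a network setting


def nri_from_tlli(tlli, bitlen=NRI_BITLEN):
    """Extract the NRI, which occupies bits 23 downwards of the P-TMSI/TLLI."""
    return (tlli >> (24 - bitlen)) & ((1 << bitlen) - 1)


def route(tlli, nri_owner, sgsn_up):
    """Return (sgsn_name, reason) for an uplink message carrying this TLLI.

    nri_owner maps NRI -> SGSN name; sgsn_up maps SGSN name -> bool.
    """
    owner = nri_owner.get(nri_from_tlli(tlli))
    if owner is not None and sgsn_up[owner]:
        return owner, "nri-match"
    # NRI not owned by anyone (e.g. the NULL NRI) or owner down:
    # fall back to the first SGSN that is currently up.
    for name in sorted(sgsn_up):
        if sgsn_up[name]:
            return name, "owner-down" if owner is not None else "new-selection"
    return None, "no-sgsn-available"


owners = {3: "SGSN 0", 4: "SGSN 1"}
up = {"SGSN 0": False, "SGSN 1": True}  # SGSN 0 has an outage

tlli_nri3 = 0xC0000000 | (3 << 14)  # local TLLI with NRI 3
tlli_nri4 = 0xC0000000 | (4 << 14)  # local TLLI with NRI 4

print(route(tlli_nri3, owners, up))  # ('SGSN 1', 'owner-down')
print(route(tlli_nri4, owners, up))  # ('SGSN 1', 'nri-match')
```

In the "owner-down" case SGSN 1 receives traffic for a subscriber context it has never seen, so what happens next depends entirely on how that SGSN reacts to the unknown TLLI.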

#3 Updated by laforge about 1 month ago

  • Related to Feature #4951: more TTCN3 tests for SGSN pooling added

#4 Updated by laforge about 1 month ago

laforge wrote:

Another interesting question is what is supposed to happen with traffic carrying an NRI for the now-defunct SGSN pool member. If we simply route it to any other SGSN, that SGSN will not know what to do with that traffic. But at least it should then return some error to the MS, so the MS can re-attach?

daniel, lynxis: any feedback on this one? Any ideas? What is the expected behavior in your understanding of the specs?

#5 Updated by daniel about 1 month ago

  • Related to Bug #4897: gbproxy2: Re-introduce handling of NS_AFF_CAUSE_FAILURE added

#6 Updated by daniel about 1 month ago

laforge wrote:

Another interesting question is what is supposed to happen with traffic carrying an NRI for the now-defunct SGSN pool member. If we simply route it to any other SGSN, that SGSN will not know what to do with that traffic. But at least it should then return some error to the MS, so the MS can re-attach?

Yeah, I believe it will work just like you describe. I don't really see how SGSN pooling can help with this sort of failure, other than by offering a new SGSN to reconnect to.

If an outage can be planned in advance, you should use the load-redistribution function. For that you would:
  • Mark the SGSN as "no allow-attach" in gbproxy so new connections ignore this SGSN
  • Have the SGSN reallocate its NULL-NRI on (periodic) RA update, set the update timer to its minimum value and force the MS to standby

See https://projects.sysmocom.de/attachments/download/4350/SGSNs_in_Pool.pdf (pg. 8-10)

That way all MS currently on that SGSN will slowly migrate away and the SGSN can be taken offline.

But you are right that gbproxy currently does not correctly handle the case where an SGSN is down (e.g. because NS failed).

#7 Updated by daniel 7 days ago

  • Assignee set to daniel
