Bug #4952
Fix NRI routing in case SGSN is down
Status: closed
Added by laforge almost 3 years ago. Updated over 2 years ago.
100%
Related issues
Updated by laforge almost 3 years ago
- Subject changed from ix NRI routing in case SGSN is down to Fix NRI routing in case SGSN is down
- Priority changed from Normal to High
I believe the NAS Node Selection Function must take into account whether or not the given SGSN is currently available.
Let's assume we have two SGSNs in the pool:
- SGSN 0 serves NRI 3
- SGSN 1 serves NRI 4
Now assume SGSN 0 has an outage.
Any traffic without a TLLI, or with a TLLI mapping to the NULL NRI, will now trigger the selection function. We currently choose any configured SGSN, unless it is administratively disabled with "no allow-attach". However, we do not check whether the given PTP-BVC at that SGSN is currently available.
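To illustrate the missing check, here is a minimal sketch of a selection function that skips SGSNs that are down. It is not the osmo-gbproxy code: the `Sgsn` class, the `available` flag and `select_sgsn` are hypothetical names, and the round-robin order is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Sgsn:
    name: str
    nris: Set[int] = field(default_factory=set)
    allow_attach: bool = True  # False models "no allow-attach"
    available: bool = True     # hypothetical flag: link/BVC towards this SGSN is up


def select_sgsn(pool: List[Sgsn], start: int = 0) -> Optional[Sgsn]:
    """Walk the pool round-robin from `start` and return the first SGSN
    that both allows attach and is currently available."""
    for offset in range(len(pool)):
        candidate = pool[(start + offset) % len(pool)]
        if candidate.allow_attach and candidate.available:
            return candidate
    return None  # nothing usable: the caller has to drop or reject


pool = [Sgsn("SGSN 0", {3}), Sgsn("SGSN 1", {4})]
pool[0].available = False      # SGSN 0 has an outage
print(select_sgsn(pool).name)  # SGSN 1
```

Without the `candidate.available` test, the call above would return SGSN 0, which is the behaviour described in this report.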
Updated by laforge almost 3 years ago
Another interesting question is what is supposed to happen with traffic carrying an NRI for the now-defunct SGSN pool member. If we simply route it to any other SGSN, that SGSN will not know what to do with that traffic. But at least it should then return some error to the MS, so the MS can re-attach?
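The "route it to any other SGSN" option could be sketched roughly as follows. This is only an illustration under assumptions: `route`, `nri_owner` and `up` are hypothetical names, `None` stands for "no TLLI or NULL NRI", and whether the fallback SGSN really answers with an error that makes the MS re-attach is the open question here, not something the code shows.

```python
from typing import Dict, Optional


def route(nri: Optional[int],
          nri_owner: Dict[int, str],
          up: Dict[str, bool]) -> Optional[str]:
    """Return the name of the SGSN that should receive the traffic."""
    owner = nri_owner.get(nri)       # None for unknown NRI or nri=None
    if owner is not None and up.get(owner, False):
        return owner                 # normal case: the NRI owner is reachable
    # Fallback: owner is down (or there is no owner); pick any SGSN that is up.
    for name, is_up in up.items():
        if is_up:
            return name
    return None                      # whole pool is down


nri_owner = {3: "SGSN 0", 4: "SGSN 1"}
up = {"SGSN 0": False, "SGSN 1": True}  # SGSN 0 has an outage
print(route(3, nri_owner, up))          # SGSN 1
```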
Updated by laforge almost 3 years ago
- Related to Feature #4951: more TTCN3 tests for SGSN pooling added
Updated by laforge almost 3 years ago
laforge wrote:
Another interesting question is what is supposed to happen with traffic carrying an NRI for the now-defunct SGSN pool member. If we simply route it to any other SGSN, that SGSN will not know what to do with that traffic. But at least it should then return some error to the MS, so the MS can re-attach?
daniel, lynxis: any feedback on this one? Any ideas? What is the expected behavior in your understanding of the specs?
Updated by daniel almost 3 years ago
- Related to Bug #4897: gbproxy2: Re-introduce handling of NS_AFF_CAUSE_FAILURE added
Updated by daniel almost 3 years ago
laforge wrote:
Another interesting question is what is supposed to happen with traffic carrying an NRI for the now-defunct SGSN pool member. If we simply route it to any other SGSN, that SGSN will not know what to do with that traffic. But at least it should then return some error to the MS, so the MS can re-attach?
Yeah, I believe it will work just like you describe. I don't really see how SGSN pooling can help with this sort of failure (other than offering a new SGSN to reconnect to).
If an outage can be planned in advance you should use the load-redistribution function. For that you would:
- Mark the SGSN as "no allow-attach" in gb-proxy so new connections ignore this SGSN
- Have the SGSN reallocate its NULL-NRI on (periodic) RA update, set the update timer to its minimum value and force the MS to standby
See https://projects.sysmocom.de/attachments/download/4350/SGSNs_in_Pool.pdf (pg. 8-10)
That way all MS currently on that SGSN will slowly migrate away and the SGSN can be taken offline.
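As a rough illustration of that migration, here is a toy model of the two steps. Everything in it is an assumption made for the example: `NULL_NRI = 0` is a placeholder value, and `ra_update` and `proxy_route` are hypothetical helpers, not SGSN or gbproxy functions.

```python
NULL_NRI = 0  # placeholder; the real NULL-NRI is a deployment setting


def ra_update(ms, offloading):
    """SGSN side: an offloading SGSN hands out the NULL-NRI on RA update."""
    if ms["sgsn"] in offloading:
        ms["nri"] = NULL_NRI


def proxy_route(ms, pool):
    """Proxy side: a NULL-NRI triggers a new selection among SGSNs that
    still allow attach; any other NRI keeps the current SGSN."""
    if ms["nri"] == NULL_NRI:
        for name, cfg in pool.items():
            if cfg["allow_attach"]:
                # Model the new SGSN assigning an identity with its own NRI.
                ms["sgsn"], ms["nri"] = name, cfg["nri"]
                break
    return ms["sgsn"]


pool = {
    "SGSN 0": {"nri": 3, "allow_attach": False},  # marked "no allow-attach"
    "SGSN 1": {"nri": 4, "allow_attach": True},
}
ms = {"sgsn": "SGSN 0", "nri": 3}
print(proxy_route(ms, pool))  # SGSN 0 (still routed by its NRI)
ra_update(ms, {"SGSN 0"})     # periodic RA update during offload
print(proxy_route(ms, pool))  # SGSN 1
```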
But you are right that gbproxy currently doesn't correctly handle the case where an SGSN is down (e.g. because NS failed).
Updated by daniel over 2 years ago
- Status changed from New to In Progress
- % Done changed from 0 to 20
Updated by daniel over 2 years ago
- % Done changed from 20 to 60
TTCN3 Test here: https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/24442
Fails without the osmo-gbproxy fix.
Fix for osmo-gbproxy: https://gerrit.osmocom.org/c/osmo-gbproxy/+/24443
Updated by daniel over 2 years ago
- Status changed from In Progress to Resolved
- % Done changed from 60 to 100
Patches are merged and the new test should pass on the next run.