Redundancy between GbProxy

In commercial setups with redundant networks and redundant SGSNs (through SGSN pooling), having OsmoGbProxy be a single point of failure is not desired. However, there is no official specification for providing redundancy at that level, because a Gb proxy is not a specified network element; it simply exists.

OsmoGbProxy sits in between the BSS and SGSN and terminates the NS connections while transparently routing BSSGP messages back and forth.

To provide redundancy towards the SGSN, multiple OsmoGbProxy processes need to appear as belonging to the same NS Entity (NSE). Either the SGSN has different NS-VCs configured that point to the different GbProxies, or one GbProxy advertises (through IP-SNS) the other GbProxy as another endpoint. This should be entirely transparent to the SGSN. Initially only IP-SNS will be supported on the SGSN side.

Implications:

  • NS needs to be able to announce "foreign" IP endpoints to the SGSN in SNS-CONFIG
  • NS needs to be able to disable/enable the transmission of SNS-SIZE to the SGSN at runtime
  • the SNS-CONFIG from the SGSN (listing its IP endpoints) is only received by the "primary" gbproxy, i.e. the one that started the SNS-SIZE/CONFIG procedure
  • we will likely have to replicate that SGSN-originated SNS-CONFIG to the "secondary" gbproxy, perhaps by simply spoofing that UDP packet (and suppressing the response). That way we would at least not need to invent new parsers etc.
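
The first implication can be illustrated with a small sketch. It packs the IPv4 endpoints of both proxies into the 8-octet IP4 element layout used by IP-SNS (IPv4 address, UDP port, signalling weight, data weight). The class and function names, addresses, ports and weights are made-up example values, and the code only builds the element list, not a complete SNS-CONFIG PDU.

```python
import ipaddress
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Ip4Element:
    """One IPv4 endpoint as advertised via IP-SNS (illustrative type)."""
    address: str
    udp_port: int
    sig_weight: int
    data_weight: int

    def pack(self) -> bytes:
        # 4 octets address, 2 octets port, 1 octet for each of the two weights
        return struct.pack(
            "!4sHBB",
            ipaddress.IPv4Address(self.address).packed,
            self.udp_port,
            self.sig_weight,
            self.data_weight,
        )


def build_ip4_element_list(local, foreign):
    """Concatenate local and "foreign" (peer gbproxy) endpoints."""
    return b"".join(elem.pack() for elem in [*local, *foreign])


# Hypothetical example values: the primary advertises its own endpoint
# plus the endpoint of the secondary gbproxy.
primary = Ip4Element("192.0.2.10", 23000, sig_weight=1, data_weight=1)
secondary = Ip4Element("192.0.2.11", 23000, sig_weight=0, data_weight=1)
payload = build_ip4_element_list([primary], [secondary])
```

In this example the foreign endpoint carries a signalling weight of 0, which matches the idea of advertising signalling capacity only on the primary.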

On the BSS side we also need to share an NSE:

  • each BSS is one NSE with multiple NS-VC (otherwise no redundancy is possible), no way to split that
  • a likely implementation would implement a 1:1 mapping of NS-VCs from BSS to SGSN side (thus also a 1:1 mapping between BSS NSE and SGSN NSE)
  • this also ensures downlink load sharing is performed inside the SGSN and gbproxy doesn't have to re-route user plane traffic
  • if one NS-VC on the BSS side fails, we block the corresponding NS-VC on the SGSN side. This causes the SGSN to send the traffic over the remaining NS-VCs, as expected
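
The failure handling described in the last bullet might look roughly like the following sketch. The class, its method names and the NS-VC identifiers are hypothetical; it only tracks which SGSN-side NS-VCs would be blocked and does not send any NS-BLOCK messages.

```python
class NsVcMapper:
    """Tracks a 1:1 mapping of BSS-side to SGSN-side NS-VCs (illustrative)."""

    def __init__(self, mapping):
        # mapping: BSS-side NSVCI -> SGSN-side NS-VC identifier
        self.mapping = dict(mapping)
        self.blocked_sgsn = set()

    def bss_nsvc_down(self, bss_nsvci):
        """BSS-side NS-VC failed: block its SGSN-side counterpart."""
        sgsn_nsvc = self.mapping[bss_nsvci]
        self.blocked_sgsn.add(sgsn_nsvc)
        return sgsn_nsvc

    def bss_nsvc_up(self, bss_nsvci):
        """BSS-side NS-VC recovered: unblock its SGSN-side counterpart."""
        sgsn_nsvc = self.mapping[bss_nsvci]
        self.blocked_sgsn.discard(sgsn_nsvc)
        return sgsn_nsvc

    def usable_sgsn_nsvcs(self):
        """SGSN-side NS-VCs the SGSN can still use for downlink traffic."""
        return sorted(set(self.mapping.values()) - self.blocked_sgsn)


# Hypothetical example: two BSS-side NS-VCs, one of which fails.
mapper = NsVcMapper({101: "sgsn-nsvc-a", 102: "sgsn-nsvc-b"})
mapper.bss_nsvc_down(101)
remaining = mapper.usable_sgsn_nsvcs()
```

Because the mapping is 1:1, the proxy never has to pick an alternative path itself; the SGSN's own load sharing moves traffic to whatever remains unblocked.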

Performing this 1:1 NSE mapping and 1:1 NS-VC mapping on the SGSN side will introduce the following externally visible changes:

  • not just one NSE per gbproxy, but one NSE per BSS-side NSE
  • one IP endpoint on the SGSN-facing gbproxy side per BSS NS-VC (one IP endpoint maps to one BSS-side NS-VC)
  • there will be multiple SGSN-side NS-VCs for each of those endpoints, as the SGSN itself has several IP endpoints
    (typically at least one EP for user traffic and one for signalling traffic)
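
To make the resulting number of NS-VCs concrete, here is a minimal sketch that enumerates the SGSN-side NS-VCs as (local endpoint, SGSN endpoint) pairs. The function name and all addresses and ports are assumptions chosen for illustration.

```python
def sgsn_side_nsvcs(bss_nsvc_to_local_ep, sgsn_endpoints):
    """Map each BSS-side NSVCI to its SGSN-side (local, remote) endpoint pairs."""
    return {
        nsvci: [(local_ep, remote_ep) for remote_ep in sgsn_endpoints]
        for nsvci, local_ep in bss_nsvc_to_local_ep.items()
    }


# Hypothetical example: two BSS NS-VCs, and an SGSN with one signalling
# and one user-plane endpoint.
local_eps = {101: ("192.0.2.10", 23000), 102: ("192.0.2.10", 23001)}
sgsn_eps = [("198.51.100.1", 23000), ("198.51.100.2", 23000)]
nsvcs = sgsn_side_nsvcs(local_eps, sgsn_eps)
total = sum(len(pairs) for pairs in nsvcs.values())
```

With two BSS-side NS-VCs and two SGSN endpoints, this yields four SGSN-side NS-VCs in total, two per local endpoint.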

To simplify handling of BVC signalling traffic:

  • Only advertise signalling-weight > 0 on the primary GbProxy
  • The GbProxy failover detection needs to be faster than IP-SNS failure detection (so that the secondary GbProxy can send an SNS-CHANGEWEIGHT with a signalling weight > 0 before NS resets all state)
  • Any new BVC state is replicated to the secondary GbProxy. The primary waits for an ACK before it handles the message further (forward/reply to SGSN/BSS)
  • primary/secondary decision must be made on a per-BSS (NSE) level (because different BSS could have broken connections to different GbProxies)
  • Since there is no way to force signalling traffic over a particular NS-VC on the BSS side (with FR or non-SNS UDP), the secondary GbProxy will need to forward any signalling traffic it receives to the primary.
    A workaround (especially for IP-based BSS) would be to keep the NS-VC on the secondary GbProxy BLOCKed during normal operation and to unblock it during failover, before NS on the other side has had a chance to detect that all other NS-VCs are down.
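
The per-NSE role decision and the "replicate, then forward" ordering from the list above could be sketched as follows. All names are hypothetical, the signalling weight of 1 is an arbitrary non-zero value, and what should happen when no ACK arrives is left to the caller because the design above does not specify it.

```python
class RedundantGbProxy:
    """Per-NSE primary/secondary bookkeeping for one gbproxy (illustrative)."""

    def __init__(self, primary_for):
        # Set of BSS NSEIs for which this instance currently acts as primary
        self.primary_for = set(primary_for)

    def signalling_weight(self, nsei):
        # Only the primary advertises a signalling weight > 0
        return 1 if nsei in self.primary_for else 0

    def take_over(self, nsei):
        """Promote this instance to primary for a single NSE.

        Returns the signalling weight that would then have to be
        announced to the SGSN.
        """
        self.primary_for.add(nsei)
        return self.signalling_weight(nsei)


def handle_bvc_state_change(state, replicate, forward):
    """Replicate new BVC state first; forward only after the peer ACKed.

    `replicate` is a callable returning True once the secondary has
    acknowledged the state, `forward` passes the message on to SGSN/BSS.
    """
    if not replicate(state):
        return False  # no ACK: nothing is forwarded, caller decides what next
    forward(state)
    return True


# Hypothetical example: this instance is primary for NSE 1 only and
# later takes over NSE 2 after the peer is detected as failed.
proxy = RedundantGbProxy(primary_for={1})
weight_before = proxy.signalling_weight(2)
weight_after = proxy.take_over(2)
```

Keeping the role in a per-NSE set rather than in a single global flag reflects the point that different BSS may have lost their connection to different GbProxies.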

Changes in OsmoGbProxy:

  • osmo-gbproxy and possibly libosmogb will need support for fine-grained control by the application (gbproxy) over which NS-VC a given packet is sent on
  • The IP-SNS state machine needs to be kept in sync between the gbproxies
  • BVC state for the BVC FSMs needs to be replicated
    (gbproxy_bvc, gbproxy_cell, gbproxy_sgsn)
  • The TLLI/IMSI cache can be ignored. The paging/suspend/resume will simply be resent after a timeout.
  • Instead of replaying all BSSGP messages, we will just transmit the new state to the secondary gbproxy (features, locally_block, block_cause, blocked or unblocked). That way the two gbproxies cannot go out of sync.
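
A minimal sketch of such a state transfer is shown below. The field names loosely follow the list in the last bullet, the `nsei`/`bvci` keys are an assumption about how a BVC would be identified, and JSON is only a stand-in for whatever wire format is eventually chosen.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class BvcState:
    """Snapshot of the replicated BVC state (illustrative field names)."""
    nsei: int
    bvci: int
    features: int
    locally_blocked: bool
    block_cause: int
    blocked: bool


def encode_state(state: BvcState) -> bytes:
    """Serialize a full snapshot for transmission to the secondary."""
    return json.dumps(asdict(state), sort_keys=True).encode()


def apply_state(table: dict, raw: bytes) -> BvcState:
    """Overwrite the secondary's view of one BVC with the received snapshot."""
    state = BvcState(**json.loads(raw))
    table[(state.nsei, state.bvci)] = state
    return state


# Hypothetical example: the primary reports a blocked BVC to the secondary.
primary_state = BvcState(nsei=1, bvci=20, features=0,
                         locally_blocked=False, block_cause=0, blocked=True)
secondary_table = {}
apply_state(secondary_table, encode_state(primary_state))
```

Because each message carries the complete state rather than an event, applying the same snapshot twice or missing an intermediate one still leaves the secondary with the primary's latest view.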