Bug #4629

closed

statically configured Gb interface not recovering after SGSN restart

Added by laforge almost 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
Start date: 06/23/2020
Due date:
% Done: 100%
Spec Reference: TS 48.018 Section 8.4

Description

When OsmoSGSN is interworking via Gb with a third-party BSS, we have a problem recovering after an SGSN restart.

The BSS continues to send uplink BSSGP PDUs like nothing happened, and OsmoSGSN responds with BSSGP STATUS (Cause = BVCI unknown). Normally, we would expect the BSS to understand that and follow up with a BVC-RESET in order to re-create the BVC for that BVCI. However, nothing of that sort happens.

In theory, the SGSN could also do a BVC-RESET. But it's a bit of a chicken-and-egg situation: If the BVC does not exist, as the SGSN has just restarted and lost all state, how would it know which BSSes exist out there, and send BVC-RESET to all of them?

So we'd have to cheat a bit and wait until any BSSGP PDU for a non-existent BVC is received, and then use the BVCI from that to send an SGSN-originated BSSGP RESET.
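
A minimal sketch of that workaround, as a toy state model rather than actual OsmoSGSN code. The class and method names are hypothetical, and the PDUs are represented as plain strings instead of encoded messages. It only shows the idea: answer with STATUS as before, and additionally send one BVC-RESET for the BVCI learned from the offending PDU.

```python
from dataclasses import dataclass, field


@dataclass
class SgsnBvcTable:
    """Toy model of the SGSN's BVC state after a restart (hypothetical)."""
    known: set = field(default_factory=set)          # BVCIs with an established BVC
    reset_pending: set = field(default_factory=set)  # BVCIs with a reset in flight

    def on_uplink_pdu(self, bvci: int) -> list:
        """Return the (pdu_name, bvci) pairs the SGSN would send in response."""
        if bvci in self.known:
            return []  # BVC exists: normal processing, nothing special to send
        out = [("STATUS cause=BVCI unknown", bvci)]
        if bvci not in self.reset_pending:
            # First PDU seen for this BVCI since restart: use it to trigger a reset
            self.reset_pending.add(bvci)
            out.append(("BVC-RESET", bvci))
        return out

    def on_reset_ack(self, bvci: int) -> None:
        """The peer acknowledged our reset: the BVC now exists on both sides."""
        self.reset_pending.discard(bvci)
        self.known.add(bvci)


table = SgsnBvcTable()
print(table.on_uplink_pdu(1234))  # STATUS plus one BVC-RESET
print(table.on_uplink_pdu(1234))  # STATUS only, the reset is already pending
table.on_reset_ack(1234)
print(table.on_uplink_pdu(1234))  # empty list, the BVC is known again
```

The `reset_pending` set keeps the SGSN from sending a new BVC-RESET for every uplink PDU that arrives while the first reset is still unanswered.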

Actions #1

Updated by laforge almost 4 years ago

  • Spec Reference set to TS 48.018 Section 8.4

From 3GPP TS 48.018 Section 8.4

A BVC-RESET procedure is performed because of recovery procedures related to:
- a system failure in the SGSN or BSS that affects GPRS BVC functionality (e.g. processor recovery);
...
The BSS may also send BVC-RESET as a means to create the initial mapping between BVCIs and cell identifications. After any of the possible events stated above, the status of the affected BVCs may be inconsistent at the SGSN and the BSS. After performing the BVC Reset procedure all affected BVCs are assumed to be unblocked at the SGSN. The reset procedure forces a consistent state upon SGSN and BSS by requiring that after the completion of the BVC-Reset procedure the BSS initiates the block procedure for all affected BVCs that are marked as blocked at the BSS.

Even more interesting, section 8.4.1 seems to hold the key:

After any failure affecting the NSE, the party (BSS or SGSN) where the failure resided shall reset the signalling BVC. After sending or receiving a BVC-RESET PDU for the signalling BVC, the BSS shall stop all traffic and initiate the BVC-RESET procedure for all BVCs corresponding to PTP functional entities of the underlying network service entity. The BSS must complete the BVC-RESET procedure for signalling BVC before starting PTP BVC-RESET procedures.

So the SGSN does not need to know the BVCI of the individual PtP-BVCs, but it should simply send a BVC-RESET for the signaling BVC (BVCI=0), which should then trigger the related recovery. Let's try to implement that and test it.
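
For illustration, here is a sketch of what such a BVC-RESET for the signalling BVC could look like on the wire. The PDU type, IEI values and cause code are written down as I recall them from TS 48.018 and should be checked against the spec. The function names are hypothetical, the choice of cause is an assumption, and optional or conditional IEs are left out.

```python
import struct

# Values as recalled from 3GPP TS 48.018; verify against the spec before relying on them
PDU_BVC_RESET = 0x22
IEI_BVCI = 0x04
IEI_CAUSE = 0x07
CAUSE_EQUIPMENT_FAILURE = 0x01  # assumption: cause picked only for this example


def tlv(iei: int, value: bytes) -> bytes:
    """Encode one TLV IE using the single-octet length form (extension bit set)."""
    if len(value) >= 128:
        raise ValueError("two-octet length form is not handled in this sketch")
    return bytes([iei, 0x80 | len(value)]) + value


def encode_bvc_reset(bvci: int, cause: int) -> bytes:
    """Build a BVC-RESET PDU carrying only the BVCI and Cause IEs."""
    return (bytes([PDU_BVC_RESET])
            + tlv(IEI_BVCI, struct.pack("!H", bvci))  # BVCI is 16 bit, network byte order
            + tlv(IEI_CAUSE, bytes([cause])))


# BVCI 0 addresses the signalling BVC, so no PTP BVCI needs to be known up front
pdu = encode_bvc_reset(bvci=0, cause=CAUSE_EQUIPMENT_FAILURE)
print(pdu.hex())  # 2204820000078101
```

The point of the example is that nothing in this PDU depends on state the SGSN lost during the restart.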

Actions #2

Updated by ipse almost 4 years ago

Does OsmoSGSN/OsmoPCU support actual static Gb configuration? When we tried to configure that on the OsmoPCU/OsmoGbProxy side, we had to patch the code to achieve static Gb configuration (see our branch). The code was not clean enough to submit it for the master, though.

We also saw that in our case, a commercial SGSN sends BVC-RESET to our PCU as soon as it detects the NSE down, and keeps re-sending it until our PCU responds with ACK. I can share some traces if they could help.
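
The re-sending behaviour described above can be sketched roughly as follows. This is only a model of the observed pattern, not code from any SGSN. The function name, the callback signatures, the interval and the attempt limit are all placeholders chosen for the example.

```python
from typing import Callable, Optional


def reset_until_acked(send_reset: Callable[[], None],
                      wait_for_ack: Callable[[float], bool],
                      interval_s: float = 3.0,
                      max_attempts: Optional[int] = None) -> int:
    """Send BVC-RESET repeatedly until the peer acknowledges it.

    send_reset transmits one BVC-RESET; wait_for_ack blocks for up to
    interval_s seconds and returns True if a BVC-RESET-ACK arrived.
    Returns the number of resets sent, or raises TimeoutError once
    max_attempts resets have gone unanswered (None means retry forever).
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        send_reset()
        attempts += 1
        if wait_for_ack(interval_s):
            return attempts
    raise TimeoutError(f"no BVC-RESET-ACK after {attempts} attempts")


# Usage with a fake peer that only answers the third reset
sent = []
answers = iter([False, False, True])
attempts = reset_until_acked(lambda: sent.append("BVC-RESET"),
                             lambda timeout: next(answers),
                             interval_s=0.0)
print(attempts, sent)
```

Passing the send and wait steps in as callables keeps the retry logic independent of the transport, which also makes it easy to exercise without a real Gb link.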

Actions #3

Updated by laforge almost 4 years ago

On Thu, Jun 25, 2020 at 09:42:53AM +0000, ipse [REDMINE] wrote:

> Does OsmoSGSN/OsmoPCU support actual static Gb configuration?

Yes, OsmoSGSN is working here with a static Gb configuration and a
not-to-be-named third party PCU/BSC, except for the problem of recovery
described here.

> When we tried to configure that on the OsmoPCU/OsmoGbProxy side, we had
> to patch the code to achieve static Gb configuration (see our branch).

gbproxy should always have supported it, as especially on the FR/GRE
side, there are only static Gb configurations.

> The code was not clean enough to submit it for the master, though.

I'll have a look if I find time :)

> We also saw that in our case, a commercial SGSN sends BVC-RESET to our PCU as soon as it detects the NSE down, and keeps re-sending it until our PCU responds with ACK. I can share some traces if they could help.

Interesting behavior. I believe the spec states the exact opposite: As
long as the NSE/NSVC is down, it should not send BVC-RESET and only
start sending them once NS is up. However, for a static NS-IP Gb of
course you don't know when it's up or down as none of the
NS-BLOCK/UNBLOCK/RESET procedures are to be used.

Actions #4

Updated by laforge almost 4 years ago

https://gerrit.osmocom.org/c/osmo-sgsn/+/19027 should fix this. Adding a TTCN-3 test case is not straightforward, as the BVC-RESET is only sent at start-up of the process, and we don't restart the SGSN during tests; tests only start well after the SGSN has been started.

Actions #5

Updated by laforge over 3 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100

The patch was merged long ago.
