Bug #4960

VTY doesn't show BVCs getting blocked on transport network failure

Added by laforge about 1 month ago. Updated about 1 month ago.

Status: New
Priority: High
Assignee:
Target version: -
Start date: 01/19/2021
Due date:
% Done: 0%
Spec Reference:

Description

If I start osmo-gbproxy and run a single TTCN3 test against it (so that all BVCs come up once), the output looks as follows:

OsmoGbProxy> show gbproxy bvc bss
NSEI  2003, SIG-BVCI     0 [UNBLOCKED]
NSEI  2003, PTP-BVCI 20031, RAI 262-42-13135-1 [UNBLOCKED]
NSEI  2003, PTP-BVCI 20032, RAI 262-42-13300-0 [UNBLOCKED]
NSEI  2003, PTP-BVCI 20033, RAI 262-42-13300-0 [UNBLOCKED]
NSEI  2001, SIG-BVCI     0 [UNBLOCKED]
NSEI  2001, PTP-BVCI 20011, RAI 262-42-13135-0 [UNBLOCKED]
NSEI  2002, SIG-BVCI     0 [UNBLOCKED]
NSEI  2002, PTP-BVCI 20021, RAI 262-42-13135-1 [UNBLOCKED]
NSEI  2002, PTP-BVCI 20022, RAI 262-42-13135-2 [UNBLOCKED]
OsmoGbProxy> show gbproxy bvc bss
NSEI  2003, SIG-BVCI     0 [UNBLOCKED]
NSEI  2003, PTP-BVCI 20031, RAI 262-42-13135-1 [UNBLOCKED]
NSEI  2003, PTP-BVCI 20032, RAI 262-42-13300-0 [UNBLOCKED]
NSEI  2003, PTP-BVCI 20033, RAI 262-42-13300-0 [UNBLOCKED]
NSEI  2001, SIG-BVCI     0 [UNBLOCKED]
NSEI  2001, PTP-BVCI 20011, RAI 262-42-13135-0 [UNBLOCKED]
NSEI  2002, SIG-BVCI     0 [UNBLOCKED]
NSEI  2002, PTP-BVCI 20021, RAI 262-42-13135-1 [UNBLOCKED]
NSEI  2002, PTP-BVCI 20022, RAI 262-42-13135-2 [UNBLOCKED]

However, even 10 minutes after the TTCN3 tester terminates (and hence all BSS and SGSN peers are gone), the output is still unchanged.

I guess a normal user would expect the BVCs to go into BLOCKED or some kind of recovery state if the underlying NSE disappears or becomes unavailable.

The same applies to

OsmoGbProxy> show gbproxy cell     
BVCI 20031 RAI 262-42-13135-1: BSS NSEI  2003, SGSN NSEI   101   102 
BVCI 20021 RAI 262-42-13135-1: BSS NSEI  2002, SGSN NSEI   101   102 
BVCI 20011 RAI 262-42-13135-0: BSS NSEI  2001, SGSN NSEI   101   102 
BVCI 20032 RAI 262-42-13300-0: BSS NSEI  2003, SGSN NSEI   101   102 
BVCI 20022 RAI 262-42-13135-2: BSS NSEI  2002, SGSN NSEI   101   102 
BVCI 20033 RAI 262-42-13300-0: BSS NSEI  2003, SGSN NSEI   101   102 

where the BSS NSEIs are still shown long after those NSEs are gone. Interestingly, when you start another test, they temporarily become

OsmoGbProxy> show gbproxy cell 
BVCI 20031 RAI 262-42-13135-1: BSS NSEI <none>, SGSN NSEI   101   102 
BVCI 20021 RAI 262-42-13135-1: BSS NSEI <none>, SGSN NSEI   101   102 
BVCI 20011 RAI 262-42-13135-0: BSS NSEI <none>, SGSN NSEI   101   102 
BVCI 20032 RAI 262-42-13300-0: BSS NSEI <none>, SGSN NSEI   101   102 
BVCI 20022 RAI 262-42-13135-2: BSS NSEI <none>, SGSN NSEI   101   102 
BVCI 20033 RAI 262-42-13300-0: BSS NSEI <none>, SGSN NSEI   101   102 

only to go back to 2001/2002/2003 a few seconds later. So the state is lost (maybe on BVC RESET?). In that case, maybe this would resolve itself if the BVC went to BLOCKED or some other state?

This may not seem super critical, but from an operational point of view, we will be wondering about this as soon as we go into deployment/testing, as will our users, AFAICT.

History

#1 Updated by laforge about 1 month ago

And yes, I'm aware I wrote that code, so I'm not saying it's daniel's fault by assigning this to him. I'm just trying to focus on testing at the moment.

#2 Updated by laforge about 1 month ago

"BLOCKED" is spec-wise the wrong state for the signaling BVCs, as by definition they can never be blocked. At gbproxy start-up the SGSN-side BVCs are in WAIT_RESET_ACK state:

NSEI   101, SIG-BVCI     0 [WAIT_RESET_ACK]
NSEI   102, SIG-BVCI     0 [WAIT_RESET_ACK]

and the BSS side BVCs simply don't exist.

I would argue that the PTP BVCs could actually be deleted when a BSS disappears. This would mean
  • BLOCK each SGSN side PTP BVC for this BVCI
  • destroy the BSS side PTP BVC object
  • possibly also destroy the cell object?

On the other hand, that would also destroy any related counters etc. - and from the operational point of view it might be interesting to keep them around even if there is an outage. After all, the number of BSS/BVC is not something that changes frequently in a production network.

So as an alternative, we could simply mark the PTP BVC on the BSS side as blocked (we don't even need to start a BLOCKING procedure, as that will try to send packets and wait for ACKs). Plus start the BLOCK procedure on the SGSN side as described above.
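The two options above (destroy the BSS-side PTP BVC, or just mark it blocked and keep its counters) can be sketched as a toy model. This is only an illustration in Python, not gbproxy's C code: `Bvc`, `Proxy`, `block_requests` and `on_bss_nse_unavailable` are hypothetical names, and the signaling BVC is deliberately left untouched because BLOCKED is not a valid state for it.

```python
from dataclasses import dataclass, field

# Toy model only: every name here is hypothetical, not an osmo-gbproxy API.


@dataclass
class Bvc:
    nsei: int
    bvci: int
    state: str = "UNBLOCKED"
    counters: dict = field(default_factory=dict)


@dataclass
class Proxy:
    bss_bvcs: list = field(default_factory=list)
    sgsn_bvcs: list = field(default_factory=list)
    # (SGSN NSEI, BVCI) pairs for which a BLOCK procedure would be started
    block_requests: list = field(default_factory=list)


def on_bss_nse_unavailable(proxy, nsei, destroy=False):
    """React to a BSS-side NSE becoming unavailable."""
    def is_affected(b):
        # Only PTP BVCs (BVCI != 0) of the failed NSE are touched.
        return b.nsei == nsei and b.bvci != 0

    bvcis = {b.bvci for b in proxy.bss_bvcs if is_affected(b)}

    # BSS side: a purely local state change, no BLOCK procedure,
    # since nothing could be sent or acknowledged on a dead NSE.
    for b in proxy.bss_bvcs:
        if is_affected(b):
            b.state = "BLOCKED"

    # SGSN side: request a BLOCK procedure for each matching PTP BVC.
    for s in proxy.sgsn_bvcs:
        if s.bvci in bvcis:
            proxy.block_requests.append((s.nsei, s.bvci))

    # Optional variant: drop the BSS-side objects (and their counters).
    if destroy:
        proxy.bss_bvcs = [b for b in proxy.bss_bvcs if not is_affected(b)]
```

With `destroy=False` the BSS-side objects and their counters survive the outage, which is the operational argument made above; `destroy=True` models the deletion variant.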

Maybe all of the above is a "Holzweg" (a wrong track) and we should simply show the NSE ALIVE/DEAD state next to each BVC?
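For illustration, the display-only idea could look roughly like the sketch below. It reproduces the column layout of the `show gbproxy bvc bss` output quoted in the description; the trailing `NSE ALIVE`/`NSE DEAD` suffix and the function name `format_bvc_line` are assumptions made for this example, not existing VTY output or code.

```python
def format_bvc_line(nsei, bvci, bvc_state, nse_alive, rai=None):
    """Render one BVC line, appending the state of the underlying NSE.

    Hypothetical helper: the suffix format is an assumption.
    """
    kind = "SIG" if bvci == 0 else "PTP"  # BVCI 0 is the signaling BVC
    line = f"NSEI {nsei:5d}, {kind}-BVCI {bvci:5d}"
    if rai is not None:
        line += f", RAI {rai}"
    nse = "ALIVE" if nse_alive else "DEAD"
    return f"{line} [{bvc_state}] NSE {nse}"


print(format_bvc_line(2003, 0, "UNBLOCKED", False))
print(format_bvc_line(2003, 20031, "UNBLOCKED", False, rai="262-42-13135-1"))
```

This leaves the BVC state machine alone and only makes the stale state visible to the operator.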

Any comments/ideas?
