Bug #2486

closed

show BTS up/down status + BTS uptime in BSC VTY + CTRL

Added by laforge over 6 years ago. Updated about 6 years ago.

Status: Closed
Priority: Urgent
Assignee:
Category: -
Target version: -
Start date: 09/03/2017
Due date:
% Done: 100%
Spec Reference:

Description

An operator wants to clearly see if a given BTS is fully up+running, degraded or disconnected.

If it's up + running, he wants to see the uptime of the BTS since its last re-connect.

This information should be accessible via both VTY and CTRL.

Actions #1

Updated by laforge over 6 years ago

  • Assignee set to msuraev
Actions #2

Updated by msuraev over 6 years ago

  • % Done changed from 0 to 10

It's implemented to some degree in the latest master of OsmoBSC:

OsmoBSC> sh bts 0
BTS 0 is of sysmobts type in band DCS1800, has CI 20909 LAC 200, BSIC 0 (NCC=0, BCC=0) and 1 TRX
...
Early Classmark Sending: forbidden
  Unit ID: 6969/0/0, OML Stream ID 0xff
  NM State: Oper 'Enabled', Admin 'Unlocked', Avail 'OK'
...
  OML Link state: connected 0 days 0 hours 0 min. 3 sec.
...

I can expose this over the CTRL interface, but there are a few things to clarify first:
  • is the current representation OK, or would it be better to reformat it, show it differently, etc.?
  • the current implementation does not take leap years into account (after >= 1 year of uptime it might be off by 1 day) - shall I try fixing it?
  • right now we show OML as either connected or disconnected. When shall we show "degraded" status?
  • shall we consider anything else in addition to OML?
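
For illustration, a minimal Python sketch of how an uptime in seconds maps to and from the VTY-style string shown above. It works purely on elapsed seconds with fixed-length days, so no calendar arithmetic (and hence no leap-year handling) is involved. The function names are hypothetical, and this is not the OsmoBSC implementation.

```python
import re

def format_uptime(seconds: int) -> str:
    """Render elapsed seconds in the style of the 'OML Link state' VTY line."""
    days, rem = divmod(seconds, 86400)   # fixed 24-hour days, no calendar involved
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{days} days {hours} hours {minutes} min. {secs} sec."

# Matches e.g. "0 days 0 hours 0 min. 3 sec." anywhere in a line.
_UPTIME_RE = re.compile(r"(\d+) days (\d+) hours (\d+) min\. (\d+) sec\.")

def parse_uptime(text: str) -> int:
    """Convert a VTY-style uptime string back to a plain number of seconds."""
    match = _UPTIME_RE.search(text)
    if match is None:
        raise ValueError(f"no uptime found in {text!r}")
    days, hours, minutes, secs = map(int, match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + secs
```

Because the largest unit is the day, the result stays exact for arbitrarily long uptimes.
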
Actions #3

Updated by laforge over 6 years ago

Hi Max,

thanks for the status update.

On Fri, Oct 06, 2017 at 02:49:26PM +0000, msuraev [REDMINE] wrote:

I can expose this over the CTRL interface, but there are a few things to clarify first:
  • is the current representation OK, or would it be better to reformat it, show it differently, etc.?

I think the plain integer number of seconds would be best for a programmatic interface like CTRL

  • the current implementation does not take leap years into account (after >= 1 year of uptime it might be off by 1 day) - shall I try fixing it?

I think it doesn't matter, we can ignore this.

  • right now we show OML as either connected or disconnected. When shall we show "degraded" status?

If OML is connected but somehow not all MOs are in the right state. Should
probably be a per-bts-model specific function that determines this. Can
be very simplistic at first, but then expanded later on.

Loss of RSL while OML is connected would also be degraded.

  • shall we consider anything else in addition to OML?

See above; I think for RSL it should simply lead to "degraded".
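
The rules discussed in this reply can be sketched roughly as follows. This is only an illustration in Python with hypothetical names (`BtsStatus`, `bts_status`); the per-BTS-model MO check is reduced to a single boolean here.

```python
from enum import Enum

class BtsStatus(Enum):
    DISCONNECTED = "disconnected"
    DEGRADED = "degraded"
    CONNECTED = "connected"

def bts_status(oml_up: bool, rsl_up: bool, mos_ok: bool) -> BtsStatus:
    """Classify a BTS from its link states and an overall MO check.

    mos_ok stands in for a per-BTS-model function that reports whether
    all relevant managed objects are in their expected state.
    """
    if not oml_up:
        return BtsStatus.DISCONNECTED
    # OML is up: loss of RSL or any unexpected MO state means "degraded".
    if not rsl_up or not mos_ok:
        return BtsStatus.DEGRADED
    return BtsStatus.CONNECTED
```
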

Actions #4

Updated by msuraev over 6 years ago

  • Status changed from New to In Progress

Thanks for clarification.

laforge wrote:

If OML is connected but somehow not all MOs are in the right state. Should
probably be a per-bts-model specific function that determines this. Can
be very simplistic at first, but then expanded later on.

Is there some sort of MO list for some BTS model which I can use as an example for such check?

Actions #5

Updated by laforge over 6 years ago

On Fri, Oct 06, 2017 at 03:54:52PM +0000, msuraev [REDMINE] wrote:

Issue #2486 has been updated by msuraev.

Status changed from New to In Progress

Thanks for clarification.

laforge wrote:

If OML is connected but somehow not all MOs are in the right state. Should
probably be a per-bts-model specific function that determines this. Can
be very simplistic at first, but then expanded later on.

Is there some sort of MO list for some BTS model which I can use as an example for such check?

Start simple at first, extend later: start with sysmoBTS :)

Actions #6

Updated by msuraev over 6 years ago

  • % Done changed from 10 to 20

A check for the OML link state that also takes the RSL link into consideration is available in gerrit 4169.
Note: this information is already available via CTRL in "oml-connection-state" RO command.

Actions #7

Updated by msuraev over 6 years ago

  • % Done changed from 20 to 60

Gerrit 4169 has been merged. The CTRL command to get the uptime (in seconds) is available in gerrit 4197.

Actions #8

Updated by msuraev over 6 years ago

Gerrit 4169 has been merged.

Not sure what else can be checked for "degraded" state in case of sysmobts. In case of osmo-bts-trx we can also check for osmo-trx connectivity for example.

Also, the documentation for the CTRL commands (maybe VTY too) needs an update - I'll work on that first.

Actions #9

Updated by laforge over 6 years ago

On Wed, Oct 11, 2017 at 02:14:25PM +0000, msuraev [REDMINE] wrote:

Not sure what else can be checked for "degraded" state in case of sysmobts.

I would presume the state of all the various OML MOs (managed objects). Unless they're
all in their expected state, state is degraded.

In case of osmo-bts-trx we can also check for osmo-trx connectivity for example.

If we miss a clock indication from osmo-trx for 1.8 seconds, we kill osmo-bts, so I think
there's no point in indicating a different status in the short time period in between.

Actions #10

Updated by msuraev over 6 years ago

  • Status changed from In Progress to Stalled
  • % Done changed from 60 to 70

Gerrit 4197 has been merged. Example use:
./bsc_control.py -d localhost -p 4249 -g bts.0.oml-connection-state
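
For readers who want to see what such a GET looks like on the wire, here is a rough Python sketch. It assumes the usual Osmocom CTRL framing as I understand it: an IPA header with a 16-bit big-endian length, protocol 0xEE (OSMO) and extension byte 0x00 (CTRL), followed by a plain-text `GET <id> <variable>`. The function names are hypothetical, and no socket handling is shown.

```python
import struct

IPAC_PROTO_OSMO = 0xEE      # assumed IPA stream id for Osmocom extensions
IPAC_PROTO_EXT_CTRL = 0x00  # assumed extension id for the CTRL protocol

def encode_get(op_id: int, variable: str) -> bytes:
    """Build a framed CTRL GET request for one variable."""
    payload = f"GET {op_id} {variable}".encode("ascii")
    # The length field counts the extension byte plus the payload.
    header = struct.pack(">HBB", len(payload) + 1,
                         IPAC_PROTO_OSMO, IPAC_PROTO_EXT_CTRL)
    return header + payload

def decode_reply(frame: bytes) -> tuple:
    """Split a framed CTRL reply into (type, id, rest-of-line)."""
    length, proto, ext = struct.unpack(">HBB", frame[:4])
    if proto != IPAC_PROTO_OSMO or ext != IPAC_PROTO_EXT_CTRL:
        raise ValueError("not a CTRL frame")
    payload = frame[4:3 + length].decode("ascii")
    parts = payload.split(" ", 2)
    if len(parts) != 3:
        raise ValueError(f"malformed CTRL message: {payload!r}")
    kind, op_id, rest = parts
    return kind, op_id, rest
```

For a successful GET, the third element would hold the variable name followed by its value.
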

Actions #11

Updated by msuraev over 6 years ago

Gerrit 4648 sent for review with documentation update.

Actions #12

Updated by msuraev over 6 years ago

laforge wrote:

I would presume the state of all the various OML MOs (managed objects). Unless they're
all in their expected state, state is degraded.

If state is 'locked' for some MO - shall it be considered as "degraded"?

Actions #13

Updated by laforge over 6 years ago

On Thu, Nov 02, 2017 at 02:56:27PM +0000, msuraev [REDMINE] wrote:

Issue #2486 has been updated by msuraev.

laforge wrote:

I would presume the state of all the various OML MOs (managed objects). Unless they're
all in their expected state, state is degraded.

If state is 'locked' for some MO - shall it be considered as "degraded"?

I would argue yes.

In my dreams, at some point in the future the individual MOs would then
have osmo_fsm and one could inquire their detailed state via CTRL. But the overall
state is degraded in the situation you describe.

Actions #14

Updated by laforge over 6 years ago

  • Priority changed from Normal to High
Actions #15

Updated by laforge over 6 years ago

  • Priority changed from High to Urgent
Actions #16

Updated by msuraev over 6 years ago

  • Status changed from Stalled to In Progress

For the tests, NM State can be changed to "locked" via vty:

OsmoBSC# bts 0 oml class bts instance 0 0 0
OsmoBSC(oml)# change-adm-state locked

Actions #17

Updated by msuraev over 6 years ago

  • % Done changed from 70 to 80

Gerrit 5081 has been merged, gerrit 5085 is under review. This will take the generic BTS and TRX MOs into account.

If we exclude MOs specific to certain BTS models (BS11, RBS2k), then we have the following MOs available:
  • TS-related
  • GPRS-related
  • BTS site-manager

I think the TS MOs can be ignored, because locking a single TS should not degrade the entire BTS. GPRS-related MOs should be taken into account only if GPRS is enabled (I will send a follow-up patch). Not sure what to do with the site-manager.
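
The selection proposed above could look roughly like this. It is a Python sketch with hypothetical types and names, not the code in the gerrit patches: BTS and TRX MOs always count, GPRS MOs count only when GPRS is enabled, and TS MOs are skipped. The site-manager is left out only because its handling is still undecided here.

```python
from dataclasses import dataclass

@dataclass
class Mo:
    kind: str   # hypothetical labels: "bts", "trx", "ts", "gprs", "site-manager"
    oper: str   # operational state, e.g. "Enabled"
    admin: str  # administrative state, e.g. "Unlocked" or "Locked"

def mo_in_expected_state(mo: Mo) -> bool:
    """An MO is fine only if it is operationally enabled and not locked."""
    return mo.oper.lower() == "enabled" and mo.admin.lower() == "unlocked"

def is_degraded(mos: list, gprs_enabled: bool) -> bool:
    """Return True if any MO that counts is outside its expected state."""
    relevant = {"bts", "trx"}
    if gprs_enabled:
        relevant.add("gprs")
    # "ts" and "site-manager" MOs are ignored in this sketch.
    return any(mo.kind in relevant and not mo_in_expected_state(mo)
               for mo in mos)
```
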

Actions #18

Updated by msuraev over 6 years ago

Gerrit 5084, 5085, 5092 were sent for review.

Actions #19

Updated by msuraev over 6 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 80 to 100

All pending patches were merged. We now take MO states into account, and the result is available via both the VTY and CTRL interfaces. The manual has been updated as well.

Actions #20

Updated by laforge about 6 years ago

  • Status changed from Resolved to Closed