Bug #3294

transaction: refactor callref allocation

Added by fixeria over 1 year ago. Updated 2 months ago.

Status: Stalled
Priority: High
Assignee:
Category: -
Target version: -
Start date: 05/26/2018
Due date:
% Done: 0%
Resolution:

Description

Each transaction has a field called 'callref', which should contain a unique identifier. This identifier is assigned either by OsmoMSC itself or by an external application (e.g. LCR), and is used to distinguish between multiple allocated transactions.

When a new callref is generated on the MSC side, it comes from one of the following sources:

$ git grep "static uint32_t new_callref" 

src/libmsc/gsm_04_08.c:static uint32_t new_callref = 0x80000001;
src/libmsc/gsm_04_11.c:static uint32_t new_callref = 0x40000001;
src/libmsc/mncc_builtin.c:static uint32_t new_callref = 0x00000001;

So, we have a few ranges for different types of communication:

- from 0x00000001 to 0x40000000 - Call Control, internal MNCC;
- from 0x40000001 to 0x80000000 - SMS messages;
- from 0x80000001 to 0xffffffff - Call Control, external MNCC;

And this is how a new 'callref' value is generated:

new_callref++

I see the following problems:

1) Imagine a network that has been running for a long time. What if the number of calls ever made reaches 0x40000000? The next values will be 0x40000001, 0x40000002, 0x40000003, etc., i.e. they run into the range used for SMS. At some point, this may result in collisions, e.g. two transactions with the same 'callref' value.

Possible solution: instead of doing `new_callref++` manually, create a function (e.g. trans_gen_ref) that prevents overlaps between ranges and resets the counter to its initial value when the end of a range is reached.
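
A minimal sketch of such a generator, to illustrate the idea. The struct ref_range type and its fields are hypothetical and not part of the existing code; only the name trans_gen_ref and the range boundaries come from this description.

```c
#include <stdint.h>

/* Hypothetical per-range generator state (not existing OsmoMSC code). */
struct ref_range {
	uint32_t first; /* first valid value of the range */
	uint32_t last;  /* last valid value of the range (inclusive) */
	uint32_t next;  /* value to hand out on the next call */
};

/* Return the next reference. When the end of the range is reached,
 * wrap back to 'first' instead of running into the neighbouring range. */
static uint32_t trans_gen_ref(struct ref_range *r)
{
	uint32_t ref = r->next;

	if (r->next == r->last)
		r->next = r->first;
	else
		r->next++;
	return ref;
}
```

With a range of 0x00000001 to 0x40000000, the value handed out after 0x40000000 is 0x00000001 again. Wrapping alone does not guarantee uniqueness: a wrapped value may still belong to a long-lived transaction, so the generated value would still need to be checked against the allocated transactions.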

2) trans_alloc(), which is used to allocate a new transaction, doesn't check whether the passed 'callref' value is already in use. In other words, it is possible to allocate several transactions with the same 'callref'. In this case trans_find_by_callref() would work incorrectly.

Possible solution: before allocation, check if given 'callref' is already used.
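
Roughly, the check could look like the sketch below. It uses a simplified stand-in list and hypothetical demo_* names, not the real struct gsm_trans or the actual trans_alloc() signature.

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for a transaction; not the real struct gsm_trans. */
struct demo_trans {
	uint32_t callref;
	struct demo_trans *next;
};

/* Find an allocated transaction by its callref, or return NULL. */
static struct demo_trans *demo_trans_find_by_callref(struct demo_trans *head,
						     uint32_t callref)
{
	for (; head; head = head->next)
		if (head->callref == callref)
			return head;
	return NULL;
}

/* Allocate a transaction unless the callref is already in use.
 * Returns NULL on a duplicate callref or on allocation failure. */
static struct demo_trans *demo_trans_alloc(struct demo_trans **head,
					   uint32_t callref)
{
	struct demo_trans *trans;

	if (demo_trans_find_by_callref(*head, callref))
		return NULL;

	trans = calloc(1, sizeof(*trans));
	if (!trans)
		return NULL;
	trans->callref = callref;
	trans->next = *head;
	*head = trans;
	return trans;
}
```

A second allocation with the same callref then fails instead of silently creating a duplicate, so the caller has to handle the rejection, e.g. by releasing the request.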

3) The field name 'callref' itself looks/sounds like something call related, while this field will also be used as soon as we implement SMS and SS/USSD over GSUP. It would be better to rename it to something more generic, e.g. just 'ref'.

4) Both sides, i.e. the MSC and an external application, are involved in 'callref' generation. There is no master-slave relation... This may result in a situation where an external application asks to allocate a new transaction with a 'callref' that is already in use.

Possible solution: take inspiration from GSM TS 04.07, section 11.2.3 "Transaction identifier", and introduce a direction bit. Probably this can be simplified a bit, e.g. '0' means allocated by the MSC itself, '1' means allocated by an external application.
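
A sketch of how such a direction bit could be coded. Putting the flag into the most significant bit is an assumption made for illustration only, and the REF_* and ref_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed coding: the most significant bit carries the direction,
 * the remaining 31 bits carry the reference value. */
#define REF_DIR_EXTERNAL 0x80000000u
#define REF_VALUE_MASK   0x7fffffffu

/* Compose a reference from a 31-bit value and the direction flag. */
static uint32_t ref_make(uint32_t value, bool external)
{
	return (value & REF_VALUE_MASK) | (external ? REF_DIR_EXTERNAL : 0u);
}

/* True if the reference was allocated by an external application. */
static bool ref_is_external(uint32_t ref)
{
	return (ref & REF_DIR_EXTERNAL) != 0;
}
```

Both sides can then count from 1 independently, since ref_make(1, false) and ref_make(1, true) yield different values. Note that the existing counter in gsm_04_08.c starts at 0x80000001, so its values already have the top bit set although the MSC allocates them; this sketch ignores that legacy range.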


Related issues

Related to OsmoNITB - Feature #1667: Handle callref better (New, 03/22/2016)

Related to OsmoMSC - Feature #2487: MSC side of LCLS (local call local switching) as per the 3GPP specs (Stalled, 09/03/2017)

History

#1 Updated by laforge over 1 year ago

  • Assignee set to stsp

#2 Updated by fixeria over 1 year ago

  • Category set to SS/USSD
  • Status changed from New to Stalled
  • Priority changed from Normal to High

#3 Updated by fixeria over 1 year ago

  • Category deleted (SS/USSD)

#4 Updated by fixeria about 1 year ago

#5 Updated by msuraev 11 months ago

  • Related to Feature #2487: MSC side of LCLS (local call local switching) as per the 3GPP specs added

#6 Updated by msuraev 11 months ago

On a related note: CallRef is sent as part of the LCLS-related config to the BSC, where it's used to match related call legs. The current allocation scheme makes this hard because of the different prefixes used by the CC and MNCC code. I've worked around this by dropping the prefix byte and using the remainder for LCLS, but it's a race-prone hack. When fixing this we should definitely take LCLS into consideration.

#7 Updated by laforge 5 months ago

  • Assignee changed from stsp to neels

neels, please review and determine if it is still valid and high priority. if yes, feel free to discuss with osmith if he can help with the implementation

#8 Updated by neels 4 months ago

This issue is still valid and I guess possible transaction collisions qualify as high prio.

#9 Updated by neels 3 months ago

fixeria wrote:

Possible solution: take inspiration from GSM TS 04.07, section 11.2.3 "Transaction identifier", and introduce a direction bit. Probably this can be simplified a bit, e.g. '0' means allocated by the MSC itself, '1' means allocated by an external application.

Let's first clarify the distinction between the TI and the "callref":

  • The transaction ID is typically a 4-bit identifier communicated between MS and MSC, where the highest bit indicates the origin and values 0-6 can be used for concurrent message flows.
    There may be an "Extended" TI indicated by value 7, and followed by a full octet (with another EXT flag as highest bit and 7 bits for TI values).
    This part is well specified and fine. I'm not sure whether we are able to handle extended TI, but there is no ID collision issue here.
  • The "callref" is an Osmocom invention. It is used for:
    • MNCC protocol callref. AFAICT this is entirely for identification between osmo-msc and MNCC peer;
      In osmo-sip-connector, the callref is used as internal call->id, but does not end up in SIP.
      For MSC-initiated MNCC, it simply takes the callref received from MNCC (also used for call->id).
      For sipcon-initiated MNCC, it has a last_call_id in src/call.c, assigned to call->id = ++last_call_id, which is copied 1:1 to mncc via leg->callref = call->id and then mncc.callref = leg->callref.
      MSC starts assigning callrefs from 0x80000001, while sipcon starts at 5001.
      In practice a collision is not very likely, but it could happen after a sufficient number of calls, possibly causing things like inadvertent call drops or even connecting the wrong call legs to each other (talking to the wrong person).
    • USSD handling: GSUP session_id towards osmo-hlr.
      The libosmocore gsup.h in the session_id API doc says "Unique session identifier and origination flag.",
      but the GSUP specs fail to specify an origination / direction flag for Supplementary Services.
      see e.g. osmo-msc user manual 18.6.20 "Process Supplementary Service Request", 18.8 "Session (transaction) management"
      AFAICT osmo-hlr does not avoid collisions here yet, all I can see is "ss = ss_session_alloc(hlr, gsup->imsi, gsup->session_id);" -- fixeria ?

I guess a first sane step would be to separate the different IDs, i.e. instead of overloading trans->callref, have distinct trans->cc.callref and trans->ss.session_id values.
Then we can resolve ID collision issues for each realm independently.
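
To illustrate the separation, a rough sketch with a hypothetical split_trans type (the real struct gsm_trans has many more members and may be laid out differently):

```c
#include <stdbool.h>
#include <stdint.h>

enum split_trans_type { SPLIT_TRANS_CC, SPLIT_TRANS_SS };

/* Hypothetical layout: each realm keeps its own identifier
 * instead of sharing one overloaded 'callref' field. */
struct split_trans {
	enum split_trans_type type;
	union {
		struct { uint32_t callref; } cc;    /* MNCC call reference */
		struct { uint32_t session_id; } ss; /* GSUP session ID */
	};
};

/* A CC lookup only ever matches CC transactions. */
static bool split_trans_matches_cc(const struct split_trans *t, uint32_t callref)
{
	return t->type == SPLIT_TRANS_CC && t->cc.callref == callref;
}

/* An SS lookup only ever matches SS transactions. */
static bool split_trans_matches_ss(const struct split_trans *t, uint32_t session_id)
{
	return t->type == SPLIT_TRANS_SS && t->ss.session_id == session_id;
}
```

With per-realm lookups, a CC callref of 1 and an SS session_id of 1 no longer collide, so the fixed value ranges per transaction type are not needed for that purpose.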

I agree that the sanest way would be to include an origin/direction flag, so that each side can use the same number space without collisions.
We need to clarify:

  • Can we expect all MNCC client programs to adhere?
    • Implementation: modify (for starters) osmo-sip-connector to heed a direction flag.
    • Specification: does adding the flag constitute a backwards incompatibility?
  • osmo-hlr <--GSUP-SS--> osmo-msc:
    • Impl: Does Supplementary Services via GSUP already employ an origin/direction flag? (not AFAICT)
    • Spec: we need to enhance the GSUP spec in the user manuals to properly specify a direction flag.

Any opinions on this?

#10 Updated by fixeria 2 months ago

Hi Neels,

The libosmocore gsup.h in the session_id API doc says "Unique session identifier and origination flag.", but the GSUP specs fail to specify an origination / direction flag for Supplementary Services. [...]
AFAICT osmo-hlr does not avoid collisions here yet, all I can see is "ss = ss_session_alloc(hlr, gsup->imsi, gsup->session_id);" -- fixeria ?

Oh, wow. Most likely, I was planning to implement this flag at that time, but never had time to start working on it. Of course, neither OsmoMSC nor OsmoHLR implement this flag, so collisions are possible. Moreover, we have not negotiated the coding of that flag so far.

I guess a first sane step would be to separate the different IDs, i.e. instead of overloading trans->callref, have distinct trans->cc.callref and trans->ss.session_id values. Then we can resolve ID collision issues for each realm independently.

Sounds good to me. The current approach of having different value ranges for different transaction types is really odd and error-prone, since we never had all the ranges defined together in a single header file.

  • Can we expect all MNCC client programs to adhere?
    • Specification: does adding the flag constitute a backwards incompatibility?

We could use a single bit (LSB or MSB) as a flag, and set it for all MSC-originated transactions / sessions. If it's set to zero, then the transaction is considered 'foreign'. But since we already use quite big values like 0x80000001 (both LSB and MSB bits are set), this approach will not work.

Also, AFAIR jolly's LCR (Linux Call Router) does check if callref is within some range, and rejects a transaction otherwise. Not sure if anybody is still using it though.

Any opinions on this?

Now it's a good question: how do we introduce this flag and keep backwards compatibility at the same time...

#11 Updated by neels 2 months ago

A problem I see with a single bit origin flag: at most two entities can be involved.
As soon as osmo-hlr forwards to another handler, or maybe at some point handling happens distributed across servers or whatnot, the single bit will be inadequate.
It works well on a 1:1 link like RSL, where lower layers sort out sender and receiver and there are exactly two sides involved. But in GSUP there may be more entities on the wire.

I thought about somehow implicitly figuring out the initiator of a session and storing it internally, but I'm pretty sure that wouldn't work.

A solution that I see is to add a Session Originator IE: define that a session id is always owned by the initiator of a session.
An example:

  • osmo-msc identified on IPA as "FOO" starts a USSD session, creates session_id = 1, sends GSUP session_originator = "FOO", session_id = 1.
  • osmo-hlr receives the message, forwards to an EUSE
  • EUSE "BAR" remembers that this session has ID FOO-1.
  • When "BAR" replies to this session, it again sends session_originator = "FOO", session_id = 1 back to osmo-msc.
  • At some point "BAR" initiates a session. It can also create session_id = 1 and send to osmo-msc session_originator = "BAR", session_id = 1.
  • osmo-msc receives id BAR-1 and is able to distinguish from the unrelated FOO-1 from before.
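
The example above boils down to comparing a compound key. A minimal sketch, with a hypothetical struct session_key that is not part of the GSUP API:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical session key: the ID is only unique per originator. */
struct session_key {
	const char *originator; /* e.g. the IPA name of the initiating peer */
	uint32_t session_id;
};

/* Two keys refer to the same session only if both parts match. */
static bool session_key_equal(const struct session_key *a,
			      const struct session_key *b)
{
	return a->session_id == b->session_id
		&& strcmp(a->originator, b->originator) == 0;
}
```

Here FOO-1 and BAR-1 compare as different sessions, while a reply that carries session_originator = "FOO", session_id = 1 matches the original FOO-1.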

This solution is "backwards compatible" in the sense that older implementations could technically ignore the unknown Session Originator IE and still run into collisions as before.
But I think we usually error out on unknown IEs?
Anyway, adding explicit "session id namespaces" currently seems to me to be the clearest and most future proof solution, even if it involves an incompatibility.

(BTW, we already have Source Name and Destination Name IEs for inter-MSC HO, but we can't re-use that as session originator,
because these are designed as routing targets in the way that if a request had source=A dest=B, the response will have source=B dest=A,
so this isn't able to indicate which side created a session_id.)
