TCH lchan allocation is non-modular and also riddled with holes
To be able to add inter-BSC Handover, I need to allocate an lchan. Since common lchan allocation steps are currently duplicated in Assignment and intra-BSC Handover, I did a review of the current code state to identify the best way to continue.
A review of the current lchan allocation procedures during BSSMAP Assignment and intra-BSC Handover has shown that, besides being non-modular, the lchan allocation and release procedures have numerous "holes" where message communication is not safeguarded by timeouts. In addition, the sequence of events is not ideal, and at least one wrong action is taken during handover error handling.
I created message sequence charts of Assignment and Handover, and marked numerous errors with red (needs a fix) and orange (could be improved) notes.
In osmo-bsc/doc/, do 'make msc' to generate PNGs from the message sequence charts, or look for "red" in the .msc files.
I have also made a plan for a separate lchan allocation FSM that should fix most of the problems identified here.
The above review has convinced me that there is no good quick way around a proper FSM that can be re-used in a modular way.
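To illustrate what "no holes" could mean for such an FSM, here is a minimal sketch in Python. It is not the osmo-bsc implementation and does not use the libosmocore FSM API; the state names, event names and timeout values are all made up for illustration. The point is only the invariant: every state that waits for an asynchronous message has a timeout, and a timeout always leads to a defined follow-up state.

```python
from enum import Enum, auto


class St(Enum):
    # Hypothetical state names, not taken from osmo-bsc.
    UNUSED = auto()
    WAIT_ACTIV_ACK = auto()       # Channel Activation sent, waiting for the ACK
    WAIT_RLL_ESTABLISH = auto()   # waiting for the MS to show up on the lchan
    ESTABLISHED = auto()
    WAIT_RF_RELEASE_ACK = auto()  # RF Channel Release sent, waiting for the ACK
    BROKEN = auto()               # release was never acknowledged


# Hypothetical timeout values in seconds, one per waiting state.
TIMEOUTS = {
    St.WAIT_ACTIV_ACK: 5.0,
    St.WAIT_RLL_ESTABLISH: 5.0,
    St.WAIT_RF_RELEASE_ACK: 5.0,
}

# The "no holes" invariant: no waiting state without a timeout.
assert all(s in TIMEOUTS for s in St if s.name.startswith("WAIT_"))

TRANSITIONS = {
    (St.UNUSED, "activate"): St.WAIT_ACTIV_ACK,
    (St.WAIT_ACTIV_ACK, "activ_ack"): St.WAIT_RLL_ESTABLISH,
    (St.WAIT_ACTIV_ACK, "activ_nack"): St.UNUSED,
    (St.WAIT_RLL_ESTABLISH, "rll_establish_ind"): St.ESTABLISHED,
    (St.ESTABLISHED, "release"): St.WAIT_RF_RELEASE_ACK,
    (St.WAIT_RF_RELEASE_ACK, "rf_release_ack"): St.UNUSED,
}


class LchanFsm:
    def __init__(self):
        self.state = St.UNUSED
        self.deadline = None  # absolute time at which the current state times out

    def _enter(self, state, now):
        self.state = state
        timeout = TIMEOUTS.get(state)
        self.deadline = None if timeout is None else now + timeout

    def dispatch(self, event, now):
        """Handle an event; return False if it is not allowed in this state."""
        nxt = TRANSITIONS.get((self.state, event))
        if nxt is None:
            return False
        self._enter(nxt, now)
        return True

    def poll(self, now):
        """Fire the timeout of the current state if it has expired."""
        if self.deadline is None or now < self.deadline:
            return False
        if self.state is St.WAIT_RF_RELEASE_ACK:
            self._enter(St.BROKEN, now)               # nothing left to retry
        else:
            self._enter(St.WAIT_RF_RELEASE_ACK, now)  # give up and release
        return True
```

Assignment and Handover would then both drive the same object and only differ in what they do once the lchan reaches the established state, which is the modularity argued for above.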
- Checklist item specify new lchan allocation FSM by message sequence chart set to Done
- Checklist item implement new lchan allocation FSM set to Done
- Checklist item implement TTCN3 tests to verify successful use for Assignment set to Done
- Checklist item implement TTCN3 tests to verify successful use for intra-BSC Handover set to Done
- % Done changed from 20 to 90
(the initial Assignment and intra-BSC handover ttcn3 tests already exist)
The ttcn3-bsc-tests pass for all of AoIP, SCCPlite and LCLS.
Pau has sent me a tar of an osmo-gsm-tester run containing errors to be fixed.
Some code review items have been fixed, but various cosmetic review items are TBD.
When all is ready, before merging, make sure to tag a release.
- so far relied on ttcn3 and osmo-gsm-tester test suites, now tested in detail with actual phones and BTSes
- there were still scores of problems. Fixed on branch neels/inter-bsc-ho; not submitted to gerrit yet. Test suite coverage doesn't catch these errors:
  - in reality messages come in different order than in ttcn3 tests.
  - I created RTP reflection loops instead of forwarding, test suite doesn't catch that. (we should probably verify MGCP messages' port information)
  - Osmocom style dyn TS failed to switch PCHAN mode after PDCH deactivation.
  - HO Failure message caused old lchan's RTP to be DLCX'd
- Also identified a couple errors in ttcn3-bsc-tests. Patches on gerrit.
- during handover, noticed large audio gap (several seconds) with AMR. With FR1, only a short gap. (I dimly remember some talk about shortening a timeout/sync? was actually using a slightly old osmo-bts-sysmo.)
- Re-Refactored lchan FSM to start connecting RTP earlier, so that (other than the old code) we switch RTP to new lchan upon HO Detect (and roll back in case of later error). Didn't help that much with FR1 handover audio gap though. Could make sense to look at pcap and analyse timing in detail.
- Handover often fails with RSL Handover Failed message, even though all BTSes and phones are in excellent reception conditions. Maybe there needs to be a little wait between Lchan Activ Ack of new lchan and Handover Command???
- various cosmetic code review items still not resolved. Focusing on functional testing first.
- Verified/fixed again that ttcn3-bsc-tests pass
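As a rough idea of what "verify MGCP messages' port information" could look like, here is a small Python sketch. It assumes one possible reading of "reflection loop": an MDCX that tells an MGW connection to send RTP to that connection's own local address and port. The function names are hypothetical, and this is not code from ttcn3-bsc-tests; only the SDP `c=` and `m=` line formats are standard.

```python
import re


def sdp_rtp_endpoint(mgcp_msg):
    """Return (ip, port) from the SDP part of an MGCP message, or None."""
    # The SDP body follows the MGCP header lines after one empty line.
    _head, sep, sdp = mgcp_msg.replace("\r\n", "\n").partition("\n\n")
    if not sep:
        return None
    ip = re.search(r"^c=IN IP4 (\S+)$", sdp, re.M)
    port = re.search(r"^m=audio (\d+) ", sdp, re.M)
    if not ip or not port:
        return None
    return ip.group(1), int(port.group(1))


def is_reflection(conn_local, mdcx_msg):
    """True if the MDCX points the connection at its own local RTP endpoint.

    conn_local is the (ip, port) the MGW reported for this connection,
    e.g. as parsed from the response to the CRCX.
    """
    return sdp_rtp_endpoint(mdcx_msg) == conn_local
```

A test could parse each connection's local endpoint from the MGW's CRCX response, then fail if any later MDCX for that connection is a reflection, or more strictly, if its endpoint differs from the peer address the test expects.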
- Checklist item deleted (implement TTCN3 tests to verify proper timeout actions for each and every asynchronous messaging during Assignment)
- Checklist item deleted (implement TTCN3 tests to verify proper timeout actions for each and every asynchronous messaging during Handover)
- Status changed from In Progress to Resolved
- % Done changed from 90 to 100
All changes have been merged to osmo-bsc master. "unfortunately" I also require very detailed ttcn3 tests for osmo-bsc to close this issue. Moving to #3479.