Race condition: OsmoBTS sends empty INFO_ind to PCU socket if not all SI have arrived from BSC via RSL
OsmoBSC --A-bis OML/RSL-- OsmoBTS --/tmp/pcu_bts socket-- OsmoPCU
I wrote a fix for #3854, and to make sure that it works, I'm writing a TTCN3 test that verifies INFO_ind arriving at the PCU socket.
The test suite emulates both the BSC and the PCU, and connects both at the same time. The BTS sends an INFO_ind containing empty values, such as CellId = 0, unless I force a sleep before the test suite connects to the PCU socket.
In Wireshark I saw that the BTS does not wait until all System Information (SI) messages have arrived before it sends the INFO_ind:
- SI3
- SI2
- pcu socket: "Sending info"
- SI4
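To make the ordering problem concrete, here is a minimal sketch in C of one way the INFO_ind could be held back until the SI have arrived. This is not the OsmoBTS code and not the fix from any of the patches mentioned here: the struct, the function names, and in particular the set of SI types treated as required (SI2, SI3, SI4) are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical SI identifiers; these are not the OsmoBTS enums. */
enum si_type { SI_2 = 0, SI_3, SI_4, SI_COUNT };

/* Assumption for illustration: these SI must be present before INFO_ind. */
#define SI_REQUIRED_MASK ((1u << SI_2) | (1u << SI_3) | (1u << SI_4))

/* Hypothetical, heavily reduced model of the BTS state. */
struct bts_model {
	uint32_t si_valid;   /* bitmask of SI types received so far */
	bool pcu_connected;  /* a client is connected to the PCU socket */
	bool info_ind_sent;  /* INFO_ind was already sent */
};

/* Send INFO_ind only if the PCU is connected and all required SI are known.
 * Returns true if INFO_ind was sent by this call. */
static bool maybe_send_info_ind(struct bts_model *bts)
{
	if (bts->info_ind_sent || !bts->pcu_connected)
		return false;
	if ((bts->si_valid & SI_REQUIRED_MASK) != SI_REQUIRED_MASK)
		return false;
	bts->info_ind_sent = true;
	return true;
}

/* An SI arrived via RSL; returns true if this triggered INFO_ind. */
static bool on_si_received(struct bts_model *bts, enum si_type si)
{
	assert(si < SI_COUNT);
	bts->si_valid |= 1u << si;
	return maybe_send_info_ind(bts);
}

/* The PCU connected to the socket; returns true if this triggered INFO_ind. */
static bool on_pcu_connected(struct bts_model *bts)
{
	bts->pcu_connected = true;
	return maybe_send_info_ind(bts);
}
```

Replaying the sequence from the trace (SI3, SI2, PCU connect, SI4) against this model, the PCU connect does not trigger INFO_ind; it is sent only once SI4 has arrived, so the order in which the BSC and the PCU connect no longer matters.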
I'm preparing a separate patch that introduces the sleep; with that patch reverted, the test reproduces the problem.
- Sleep patch: https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/15327
- Test that reproduces the problem (with reverted sleep patch): https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/15328
As noted in the meeting yesterday, a few ttcn3-bts-test and ttcn3-bts-test-latest tests have recently started to fail:
- TC_pcu_socket_connect_multi (Unexpected unix domain connect result)
- TC_pcu_socket_connect_si3gprs (SI3 indicates no GPRS despite PCU socket connected)
- TC_si_sched_13_2bis_2ter_2quater (Error: Insufficient SI in array)
(The new test TC_pcu_socket_verify_info_ind is only failing in latest, and this is expected.)
I found that they have been failing since the sleep commit mentioned above:
https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/15327 ("bts: f_init_pcu: sleep before connect")
When I ran TC_si_sched_13_2bis_2ter_2quater, I found several log messages like these in osmo-bts.log:
20190904083328552 DL1P <0007> sysinfo.c:160 PH-RTS-IND: Unable to determine actual BS_AG_BLKS_RES value as SI3 is not available yet, fallback to 1
20190904083328552 DL1P <0007> sysinfo.c:160 GSMTAP: Unable to determine actual BS_AG_BLKS_RES value as SI3 is not available yet, fallback to 1
I think that this is another race condition bug in the test suite or in OsmoBTS: the additional sleep introduced by the patch should not make anything fail if everything correctly waited for the messages indicating that the information is available.
So, in order not to invest too much time in this (I'm really getting side-tracked from #3925, which I was originally working on), I'm going to revert that sleep commit for now. This makes TC_pcu_socket_verify_info_ind fail in master until this issue (#4179) is resolved.