Bug #6074

closed

Current master as of 2023-06-25 broken in my environment

Added by falconia 11 months ago. Updated 9 months ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
Start date: 06/27/2023
Due date:
% Done: 0%
Spec Reference:
Description

Let me preface this report by saying that it is certainly not up to the quality standard for proper bug reports, and as a developer I am expected to do better. However, in my defense I have to point out that I have only one working BTS, and when I reached the limit of how much downtime I could allow on my production network, I had to abort the investigation and go back to a stable build.

On 2023-06-25 (Themyscira time zone, GMT-08:00 all year round, no DST) I made an attempt to update my production network (specifically the software that runs on my Slackware Linux server, namely OsmoHLR, OsmoSTP, OsmoMSC, OsmoBSC and one OsmoMGW for both MSC and BSC) from the now-elderly 2021-11 release to what was current master as of this attempt two days ago. (The osmo-bts-sysmo process running on the sysmoBTS box is already up to recent master, keeping up with changes I've been submitting and getting mainlined in OsmoBTS, but the other processes running on the Slackware server are a different story.) Once I got the new software version up and running, I saw this behavior: the two test MS I had on hand at the time registered successfully (good LU), USSD also worked as expected, but voice calls were completely broken, both in my production config with external MNCC and even when I switched to internal MNCC for a test. I never tested SMS.

My first test call was made in the production-style setup with ThemWi external MNCC, using the themwi-test-mtc command line program, which makes a single-leg (MT only, no MO leg) test call to a single connected MS. Result: the test phone started ringing and then stopped a fraction of a second later; the output from OsmoMSC on the MNCC socket was MNCC_CALL_CONF_IND, then MNCC_RTP_CREATE, and then immediately MNCC_REL_IND. I looked in syslog (that's how I get logs from OsmoCNI components) - the excerpt corresponding to this test call is attached as log-frag1.txt. The logging verbosity levels were unchanged from my previously stable production setup that ran the 2021-11 release. These lines drew my attention: line numbers 9, 10, 12 and 13 in the log-frag1.txt attachment.

For the next experiment I took themwi-system-sw out of the equation by switching to internal MNCC (thereby reducing the setup to pure Osmocom, without any ThemWi components) and dialing a test call from one MS to another. This time the destination phone never rang at all, and the calling phone indicated call failure right away. Seeing the same errors in syslog as in the previous case of ThemWi external MNCC, I started looking closer at OsmoMGW and MGCP. I enabled higher logging verbosity in OsmoMGW, and I ran tcpdump on udp port 2427 (MGCP). log-frag2.txt and mgcp-debug.pcap attachments correspond to this test run. Looking at the pcap in particular, something definitely looks amiss, beginning with the second captured packet where the SDP response from MGW throws the two codecs together into a single invalid rtpmap line - but I don't know enough about this protocol to tell if the bug is on OsmoMSC or OsmoMGW side.
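To make the rtpmap observation easier to check, here is a minimal sketch of a validator for SDP a=rtpmap lines. It assumes the syntax from RFC 8866 (a=rtpmap:<payload type> <encoding name>/<clock rate>[/<encoding parameters>]) and uses a simplified character set for the encoding name. The function name check_rtpmap_lines and both SDP bodies are hypothetical illustrations; they are not copied from mgcp-debug.pcap, and the malformed line only shows one way two codec names could end up fused into one rtpmap attribute.

```python
import re
from typing import List

# Simplified pattern for: a=rtpmap:<pt> <encoding name>/<clock rate>[/<params>]
# Assumption: encoding names use only letters, digits, '.', '_', '+', '-'.
RTPMAP_RE = re.compile(
    r"^a=rtpmap:(\d{1,3}) ([A-Za-z0-9._+-]+)/(\d+)(?:/([A-Za-z0-9]+))?$"
)


def check_rtpmap_lines(sdp: str) -> List[str]:
    """Return every a=rtpmap line in the SDP body that fails the pattern."""
    bad = []
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("a=rtpmap:") and not RTPMAP_RE.match(line):
            bad.append(line)
    return bad


# Hypothetical SDP bodies for illustration only (not taken from the capture).
good_sdp = "\r\n".join([
    "v=0",
    "m=audio 4002 RTP/AVP 110 3",
    "a=rtpmap:110 GSM-EFR/8000",
    "a=rtpmap:3 GSM/8000",
])
bad_sdp = "\r\n".join([
    "v=0",
    "m=audio 4002 RTP/AVP 110",
    "a=rtpmap:110 GSM-EFR;GSM/8000",  # two codec names in one attribute
])

print(check_rtpmap_lines(good_sdp))
print(check_rtpmap_lines(bad_sdp))
```

Running the SDP body of each MGCP response from the capture through such a check would show which packets carry a malformed rtpmap line, although it cannot tell whether OsmoMSC or OsmoMGW introduced the problem.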

After the above experiment I reached the limit of how long I could keep my production network down for debugging, and I switched to the 2023-02 stable release for production use for the time being. I accomplished one goal, updating from the now-elderly 2021-11 release, but I didn't get all the way to current master. In my current setup osmo-hlr and osmo-msc have some local patches, published as branch falconia/production in the respective repos, but if you look at those local patches, you will see that they are very minor. All other components and libraries are stock 2023-02 release.

As for debugging the issue with current master, I have to pause until I am able to acquire a second working BTS. I would need to set up a test network separate from the Themyscira production network, and that requires another BTS.


Files

log-frag1.txt (1.78 KB) falconia, 06/27/2023 02:11 PM
log-frag2.txt (26.4 KB) falconia, 06/27/2023 02:11 PM
mgcp-debug.pcap (3.16 KB) falconia, 06/27/2023 02:11 PM

Related issues

Related to OsmoMSC - Bug #6080: ERROR ptmap contains illegal mapping: codec=4294967295 (Resolved, 06/29/2023)

Related to OsmoMGW - Bug #6081: osmo-mgw fails to parse the semicolon separator in MGCP header like "L: a: GSM-EFR;GSM" (Resolved, neels, 06/30/2023)
