Bug #4573

[centos] ttcn3-msc-test: 177 failures!

Added by fixeria about 1 month ago. Updated about 1 month ago.

Status:
Resolved
Priority:
Normal
Assignee:
Target version:
-
Start date:
06/01/2020
Due date:
% Done:

100%

Spec Reference:

Description

See https://jenkins.osmocom.org/jenkins/view/TTCN3-centos/job/TTCN3-centos-msc-test/2/.

Here is what I noticed in the logs of osmo-stp (build artifacts):

20200531003124852 DLGLOBAL <0000> telnet_interface.c:104 Available via telnet 127.0.0.1 4239
20200531003126073 DLINP <0002> stream.c:113 couldn't activate SCTP events on FD 8
20200531003126073 DLINP <0002> stream.c:113 couldn't activate SCTP events on FD 8
20200531003128331 DLINP <0002> stream.c:113 couldn't activate SCTP events on FD 8
20200531003131074 DLINP <0002> stream.c:113 couldn't activate SCTP events on FD 8
20200531003131075 DLINP <0002> stream.c:113 couldn't activate SCTP events on FD 8
20200531003136075 DLINP <0002> stream.c:113 couldn't activate SCTP events on FD 8
20200531003136076 DLINP <0002> stream.c:113 couldn't activate SCTP events on FD 8
20200531003138405 DLINP <0002> stream.c:113 couldn't activate SCTP events on FD 8
20200531003141077 DLINP <0002> stream.c:113 couldn't activate SCTP events on FD 8
20200531003141077 DLINP <0002> stream.c:113 couldn't activate SCTP events on FD 8
20200531003146078 DLINP <0002> stream.c:113 couldn't activate SCTP events on FD 8

and osmo-msc:

20200531003126073 DSGS <0011> sgs_server.c:186 SGs socket bound to r=NULL<->l=0.0.0.0:29118
20200531003126073 DMSC <0006> msc_main.c:697 A-interface: SCCP user OsmoMSC-A:RI=SSN_PC,PC=(no PC),SSN=BSSAP, cs7-instance 0 ((null))
20200531003126073 DMSC <0006> msc_main.c:716 Iu-interface: SCCP user OsmoMSC-IuCS:RI=SSN_PC,PC=(no PC),SSN=RANAP, cs7-instance 0 ((null))
20200531003126073 DLINP <0015> stream.c:113 couldn't activate SCTP events on FD 12
20200531003126073 DLSS7 <001f> xua_default_lm_fsm.c:354 xua_default_lm(asp-clnt-OsmoMSC-A)[0x831380]{WAIT_ASP_UP}: Ignoring primitive M-ASP_DOWN.indication
20200531003126073 DLINP <0015> stream.c:269 [WAIT_RECONNECT] osmo_stream_cli_write(): not connected, dropping data!

The origin of this error message is libosmo-netif's sctp_sock_activate_events():

/* IMPORTANT: Do NOT enable sender_dry_event here, see
 * https://bugzilla.redhat.com/show_bug.cgi?id=1442784 */
rc = setsockopt(fd, IPPROTO_SCTP, SCTP_EVENTS,
                &event, sizeof(event));

if (rc < 0)
        LOGP(DLINP, LOGL_ERROR, "couldn't activate SCTP events " 
             "on FD %u\n", fd);

Related issues

Related to Cellular Network Infrastructure - Bug #4570: TTCN3-centos-bsc-test: 159 failing tests (Resolved, 05/30/2020)

History

#1 Updated by fixeria about 1 month ago

Huh, build#3 is ok (-173 failures). Still would be good to know what was the reason.

https://jenkins.osmocom.org/jenkins/view/TTCN3-centos/job/TTCN3-centos-msc-test/3/

#2 Updated by laforge about 1 month ago

On Sun, May 31, 2020 at 06:37:19PM +0000, fixeria [REDMINE] wrote:

> /* IMPORTANT: Do NOT enable sender_dry_event here, see
>  * https://bugzilla.redhat.com/show_bug.cgi?id=1442784 */
> rc = setsockopt(fd, IPPROTO_SCTP, SCTP_EVENTS,
>                 &event, sizeof(event));
> 
> if (rc < 0)
>         LOGP(DLINP, LOGL_ERROR, "couldn't activate SCTP events " 
>              "on FD %u\n", fd);
> 

sigh. This is indeed most likely a consequence of https://bugzilla.redhat.com/show_bug.cgi?id=1442784
which means that containers are no longer portable across kernels, if they are using
different definitions...

#3 Updated by laforge about 1 month ago

  • Assignee set to laforge

We already introduced a work-around in https://gerrit.osmocom.org/c/libosmo-netif/+/18097.

I just checked:
  • centos8 still has a kernel before 5.5, i.e. without the additional sctp_send_failure_event_event member of the struct.
  • host2 has kernel 4.9.189, also without the additional sctp_send_failure_event_event

So I'm not quite sure what is causing the incompatibility here...

#4 Updated by laforge about 1 month ago

fixeria wrote:

Huh, build#3 is ok (-173 failures). Still would be good to know what was the reason.

build#3 was running on build2.osmocom.org, while build#2 was running on host2.osmocom.org

  • build2: Debian 10 / Linux build2.osmocom.org 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11) x86_64 GNU/Linux
  • host2: Debian 9 / Linux host2.osmocom.org 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u2 (2019-11-11) x86_64 GNU/Linux

So there appears to be an incompatibility specifically with Centos8 containers on a Debian9 kernel?

#5 Updated by fixeria about 1 month ago

  • Related to Bug #4570: TTCN3-centos-bsc-test: 159 failing tests added

#6 Updated by laforge about 1 month ago

laforge wrote:

So there appears to be an incompatibility specifically with Centos8 containers on a Debian9 kernel?

I've created a fresh debian9 qemu-kvm VM
  • running "Linux d9dc8sctp 4.9.0-12-amd64 #1 SMP Debian 4.9.210-1 (2020-01-20) x86_64 GNU/Linux"
  • installed docker-ce
  • built the ttcn3-msc-test container and the osmo-msc-master-centos8 container
  • ran the test suite

And indeed:
DLINP <0015> stream.c:113 couldn't activate SCTP events on FD 12

it seems there has been even more ABI breakage over time:

Debian9:

struct sctp_event_subscribe {
        __u8 sctp_data_io_event;
        __u8 sctp_association_event;
        __u8 sctp_address_event;
        __u8 sctp_send_failure_event;
        __u8 sctp_peer_error_event;
        __u8 sctp_shutdown_event;
        __u8 sctp_partial_delivery_event;
        __u8 sctp_adaptation_layer_event;
        __u8 sctp_authentication_event;
        __u8 sctp_sender_dry_event;
};

centos8:

struct sctp_event_subscribe {
        __u8 sctp_data_io_event;
        __u8 sctp_association_event;
        __u8 sctp_address_event;
        __u8 sctp_send_failure_event;
        __u8 sctp_peer_error_event;
        __u8 sctp_shutdown_event;
        __u8 sctp_partial_delivery_event;
        __u8 sctp_adaptation_layer_event;
        __u8 sctp_authentication_event;
        __u8 sctp_sender_dry_event;
        __u8 sctp_stream_reset_event;
        __u8 sctp_assoc_reset_event;
        __u8 sctp_stream_change_event;
};

And current mainline linux / Debian unstable:

struct sctp_event_subscribe {
        __u8 sctp_data_io_event;
        __u8 sctp_association_event;
        __u8 sctp_address_event;
        __u8 sctp_send_failure_event;
        __u8 sctp_peer_error_event;
        __u8 sctp_shutdown_event;
        __u8 sctp_partial_delivery_event;
        __u8 sctp_adaptation_layer_event;
        __u8 sctp_authentication_event;
        __u8 sctp_sender_dry_event;
        __u8 sctp_stream_reset_event;
        __u8 sctp_assoc_reset_event;
        __u8 sctp_stream_change_event;
        __u8 sctp_send_failure_event_event;
};

so we have a 10, 13 or 14 byte version.
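
The three lengths can be derived from the member offsets rather than hard-coded. The following is a minimal sketch, assuming a local mirror of the 14-byte layout quoted above; the struct name sctp_event_subscribe_mirror and the SCTP_EVENTS_LEN_* constants are made up for illustration and are not part of the kernel headers or of libosmo-netif.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the 14-byte mainline layout quoted above.
 * Illustration only: this is NOT the definition from <linux/sctp.h>. */
struct sctp_event_subscribe_mirror {
	uint8_t sctp_data_io_event;
	uint8_t sctp_association_event;
	uint8_t sctp_address_event;
	uint8_t sctp_send_failure_event;
	uint8_t sctp_peer_error_event;
	uint8_t sctp_shutdown_event;
	uint8_t sctp_partial_delivery_event;
	uint8_t sctp_adaptation_layer_event;
	uint8_t sctp_authentication_event;
	uint8_t sctp_sender_dry_event;
	uint8_t sctp_stream_reset_event;
	uint8_t sctp_assoc_reset_event;
	uint8_t sctp_stream_change_event;
	uint8_t sctp_send_failure_event_event;
};

/* Hypothetical constants: the length of each historical variant is the
 * offset of the first member that the variant does not have yet. */
enum {
	/* Debian9 variant: ends after sctp_sender_dry_event */
	SCTP_EVENTS_LEN_10 = offsetof(struct sctp_event_subscribe_mirror,
				      sctp_stream_reset_event),
	/* centos8 variant: ends after sctp_stream_change_event */
	SCTP_EVENTS_LEN_13 = offsetof(struct sctp_event_subscribe_mirror,
				      sctp_send_failure_event_event),
	/* mainline variant: the full struct */
	SCTP_EVENTS_LEN_14 = sizeof(struct sctp_event_subscribe_mirror)
};
```

Since every member is a single byte, there is no padding and the offsets equal the member counts.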

#7 Updated by laforge about 1 month ago

Ok, so

  • sctp_stream_reset_event was added in commit 35ea82d611da59f8bea44a37996b3b11bb1d3fd7 (first released in kernel v4.11)
  • sctp_assoc_reset_event was added in commit c95129d127c6d3d9fca189c6f94c539a7f086b1a (first released in kernel v4.12)
  • sctp_stream_change_event was added in commit b444153fb5a647448c2080ad28656ad183cae4fc (first released in kernel v4.12)
  • sctp_send_failure_event_event was added in commit b6e6b5f1da7e8d092f86a4351802c27c0170c5a5 (first released in kernel v5.5)
so
  • kernels < 4.11 have 10 bytes
  • kernel 4.11 has 11 bytes
  • kernels >= 4.12 and < 5.5 have 13 bytes
  • kernels >= 5.5 have 14 bytes
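
To make the mapping concrete, here is a rough sketch in C. It is not the code of the libosmo-netif patch: all names (sctp_events_len_for_kernel, subscribe_fn, sctp_events_subscribe_fallback, fake_kernel_subscribe) are hypothetical, and the fallback only illustrates one possible strategy, assuming the running kernel rejects any length larger than its own struct. The version mapping follows upstream release numbers; distribution kernels with backports may differ.

```c
#include <stddef.h>

/* Hypothetical helper: expected size of struct sctp_event_subscribe for an
 * upstream kernel version, following the list above. */
static size_t sctp_events_len_for_kernel(unsigned int major, unsigned int minor)
{
	if (major > 5 || (major == 5 && minor >= 5))
		return 14;
	if (major > 4 || (major == 4 && minor >= 12))
		return 13;
	if (major == 4 && minor == 11)
		return 11;
	return 10;
}

/* Callback that attempts the subscription with 'len' bytes and returns 0 on
 * success or a negative value on failure, like setsockopt() would. */
typedef int (*subscribe_fn)(size_t len, void *ctx);

/* Try the known struct sizes from largest to smallest. Returns the first
 * length that was accepted, or 0 if every attempt failed. */
static size_t sctp_events_subscribe_fallback(subscribe_fn fn, void *ctx)
{
	static const size_t candidates[] = { 14, 13, 11, 10 };
	size_t i;

	for (i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
		if (fn(candidates[i], ctx) == 0)
			return candidates[i];
	}
	return 0;
}

/* Stand-in for the real setsockopt() call, for demonstration only: 'ctx'
 * points to the struct size of the simulated kernel, and any longer
 * length is rejected. */
static int fake_kernel_subscribe(size_t len, void *ctx)
{
	size_t kernel_len = *(size_t *)ctx;

	return len > kernel_len ? -1 : 0;
}
```

For example, with a simulated 4.9 kernel (10 bytes), sctp_events_subscribe_fallback(fake_kernel_subscribe, &kernel_len) settles on 10 after the larger lengths have been rejected. In real code the callback would wrap setsockopt(fd, IPPROTO_SCTP, SCTP_EVENTS, ...) with the given length.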

#8 Updated by laforge about 1 month ago

  • % Done changed from 0 to 40

#10 Updated by laforge about 1 month ago

  • % Done changed from 40 to 70

second version of patch https://gerrit.osmocom.org/c/libosmo-netif/+/18628 now merged.

#11 Updated by laforge about 1 month ago

  • Status changed from New to Resolved
  • % Done changed from 70 to 100

libosmo-netif with that patch merged is now working in my debian9 vm with centos8 docker container.
