Open Source Mobile Communications https://osmocom.org/
OsmoBSC - Bug #5255: ttcn3-bsc-test-latest: CBSP and LCLS test cases fail since build #1095
https://osmocom.org/issues/5255?journal_id=22673 2021-10-12T10:57:13Z fixeria
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li><li><strong>% Done</strong> changed from <i>0</i> to <i>20</i></li></ul><p>I found the culprit:</p>
<p><a class="external" href="https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bsc-test-latest/1095/artifact/logs/bsc/osmo-bsc.log/*view*/">https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bsc-test-latest/1095/artifact/logs/bsc/osmo-bsc.log/*view*/</a></p>
<pre>
20210930074737198 DLGLOBAL <0015> logging_vty.c:1113 TTCN3 f_logp(): TC_lost_sdcch_during_assignment() start
Segmentation fault (core dumped)
</pre>
<p>This test case was introduced quite recently:</p>
<pre>
commit 92cfa1c45ae1cb52d5aefb774f93468fef607417
Author: Neels Hofmeyr <nhofmeyr@sysmocom.de>
Date: Tue Sep 28 18:29:44 2021 +0200
bsc: add TC_lost_sdcch_during_assignment()
</pre>
<p>and the aim is to reproduce a segfault described in SYS#5627.</p>
https://osmocom.org/issues/5255?journal_id=22679 2021-10-12T13:53:45Z fixeria
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Stalled</i></li><li><strong>% Done</strong> changed from <i>20</i> to <i>40</i></li></ul><p>I decided to back-port a patch fixing the segfault and create a patch release (1.7.0 -> 1.7.1):</p>
<p><a class="external" href="https://gerrit.osmocom.org/c/osmo-bsc/+/25753">https://gerrit.osmocom.org/c/osmo-bsc/+/25753</a> assignment_fsm: Check for conn->lchan</p>
<p><a class="user active" href="https://osmocom.org/users/301771">osmith</a>, <a class="user active" href="https://osmocom.org/users/30187">pespin</a>, may I ask one of you to help with creating the actual patch release? I used to have a docker image with Debian and all the tools needed for osmo-release.sh, but then I ran 'docker system prune --all' and lost it.</p>
https://osmocom.org/issues/5255?journal_id=22681 2021-10-12T15:10:54Z osmith
<ul><li><strong>Status</strong> changed from <i>Stalled</i> to <i>Resolved</i></li><li><strong>% Done</strong> changed from <i>40</i> to <i>100</i></li></ul><p>Sure, done: <a class="external" href="https://git.osmocom.org/osmo-bsc/commit/?h=1.7.1">https://git.osmocom.org/osmo-bsc/commit/?h=1.7.1</a></p>
https://osmocom.org/issues/5255?journal_id=22684 2021-10-12T20:23:26Z fixeria
<ul></ul><p>osmith wrote:</p>
<blockquote>
<p>Sure, done: <a class="external" href="https://git.osmocom.org/osmo-bsc/commit/?h=1.7.1">https://git.osmocom.org/osmo-bsc/commit/?h=1.7.1</a></p>
</blockquote>
<p>Thank you very much!</p>
https://osmocom.org/issues/5255?journal_id=22714 2021-10-14T14:51:36Z fixeria
<ul><li><strong>Status</strong> changed from <i>Resolved</i> to <i>In Progress</i></li><li><strong>% Done</strong> changed from <i>100</i> to <i>80</i></li></ul><p>Unfortunately, latest osmo-bsc still crashes when TC_lost_sdcch_during_assignment is being executed:</p>
<p><a class="external" href="https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bsc-test-latest/1109/artifact/logs/bsc/core">https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bsc-test-latest/1109/artifact/logs/bsc/core</a></p>
<p>This time we get a bit further and see some more logging:</p>
<pre>
20211014074531510 DLGLOBAL <0015> logging_vty.c:1113 TTCN3 f_logp(): TC_lost_sdcch_during_assignment() start
20211014074531758 DAS <0011> assignment_fsm.c:618 assignment(msc0-conn198_subscr-IMSI-001019876543210_0-0-1-TCH_F-0)[0x5579ec6b3070]{WAIT_RR_ASS_COMPLETE}: (bts=0,trx=0,ts=1,ss=0) Assignment failed in state WAIT_RR_ASS_COMPLETE, cause EQUIPMENT FAILURE: Unable to send RR Assignment Command: conn without lchan
20211014074531758 DAS <0011> assignment_fsm.c:148 assignment(msc0-conn198_subscr-IMSI-001019876543210_0-0-1-TCH_F-0)[0x5579ec6b3070]{WAIT_RR_ASS_COMPLETE}: (bts=0,trx=0,ts=1,ss=0) Assignment failed
20211014074531758 DMSC <0007> assignment_fsm.c:149 SUBSCR_CONN(msc0-conn198_subscr-IMSI-001019876543210)[0x5579ec69bd60]{CLEARING}: Event ASSIGNMENT_END not permitted
20211014074531759 DCHAN <000f> lchan_fsm.c:837 lchan(0-0-1-TCH_F-0)[0x5579ec6acf70]{WAIT_RF_RELEASE_ACK}: transition to state WAIT_RLL_RTP_ESTABLISH not permitted!
20211014074531779 DLMGCP <0025> mgcp_client.c:691 Cannot find matching MGCP transaction for trans_id 420
20211014074533758 DCHAN <000f> lchan_fsm.c:81 lchan(0-0-1-TCH_F-0)[0x5579ec6acf70]{WAIT_RF_RELEASE_ACK}: (type=TCH_F) lchan allocation failed in state WAIT_RF_RELEASE_ACK: Timeout
20211014074533759 DCHAN <000f> lchan_fsm.c:116 lchan(0-0-1-TCH_F-0)[0x5579ec6acf70]{WAIT_RF_RELEASE_ACK}: (type=TCH_F) Signalling Assignment FSM of error (lchan allocation failed in state WAIT_RF_RELEASE_ACK: Timeout)
Segmentation fault (core dumped)
</pre>
https://osmocom.org/issues/5255?journal_id=22758 2021-10-17T11:27:28Z fixeria
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Feedback</i></li><li><strong>Assignee</strong> changed from <i>fixeria</i> to <i>neels</i></li></ul><blockquote>
<p>Unfortunately, latest osmo-bsc still crashes when TC_lost_sdcch_during_assignment is being executed: [...]</p>
</blockquote>
<p><a class="user active" href="https://osmocom.org/users/91">neels</a> could you please take a look? I was trying to figure out why it still segfaults, but could not find anything suspicious.</p>
https://osmocom.org/issues/5255?journal_id=22764 2021-10-18T12:31:04Z fixeria
<ul></ul><p>Interestingly enough, I cannot reproduce the segfault locally with osmo-bsc 1.7.1-0-gf20b3086a.</p>
https://osmocom.org/issues/5255?journal_id=22768 2021-10-18T16:42:27Z neels nhofmeyr@sysmocom.de
<ul></ul><p>fixeria wrote:</p>
<blockquote><blockquote>
<p>Unfortunately, latest osmo-bsc still crashes when TC_lost_sdcch_during_assignment is being executed: [...]</p>
</blockquote>
<p><a class="user active" href="https://osmocom.org/users/91">neels</a> could you please take a look? I was trying to figure out why it still segfaults, but could not find anything suspicious.</p>
</blockquote>
<p>osmo-bsc does not crash for me anymore during this test, using current master, where pmaier's fix is merged.<br />The test also passes on jenkins. Where / how did you still see a crash?</p>
https://osmocom.org/issues/5255?journal_id=22769 2021-10-18T17:59:14Z fixeria
<ul></ul><p>neels wrote:</p>
<blockquote>
<p>osmo-bsc does not crash for me anymore during this test, using current master, where pmaier's fix is merged.<br />The test also passes on jenkins. Where / how did you still see a crash?</p>
</blockquote>
<p>Recent master does not crash, but the <strong>latest release</strong> (1.7.1) does; see for instance:</p>
<p><a class="external" href="https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bsc-test-latest/1113/artifact/logs/bsc/">https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bsc-test-latest/1113/artifact/logs/bsc/</a></p>
<p>1.7.1 is basically a patch release with pmaier's fix applied. And somehow it still segfaults on Jenkins.</p>
https://osmocom.org/issues/5255?journal_id=22782 2021-10-19T23:45:04Z fixeria
<ul></ul><p>Good news: I managed to reproduce the segfault in a docker container by running it this way:</p>
<pre>
docker run -it --rm --network=host -v osmo-ttcn3-hacks:/data fixeria/osmo-bsc-latest /usr/bin/osmo-bsc -c /data/bsc/osmo-bsc.cfg
</pre>
<p>and I am even getting the same logging output. Here is a backtrace:</p>
<pre>
#0 _lchan_on_activation_failure (lchan=lchan@entry=0x7f037ea25748, activ_for=<optimized out>,
for_conn=0x0, line=line@entry=1574, file=0x563e3f8d910d "lchan_fsm.c") at lchan_fsm.c:117
#1 0x0000563e3f882317 in _lchan_on_activation_failure (line=1574, file=0x563e3f8d910d "lchan_fsm.c",
for_conn=<optimized out>, activ_for=<optimized out>,
lchan=0x7f037ea25748) at lchan_fsm.c:1574
#2 lchan_fsm_timer_cb (fi=0x563e401a3d00) at lchan_fsm.c:1574
#3 0x00007f037ddd5f16 in fsm_tmr_cb (data=0x563e401a3d00) at fsm.c:325
#4 0x00007f037ddd01a6 in osmo_timers_update () at timer.c:273
#5 0x00007f037ddd0b67 in _osmo_select_main (polling=0) at select.c:373
#6 0x00007f037ddd0ce6 in osmo_select_main_ctx (polling=<optimized out>) at select.c:434
#7 0x0000563e3f81e6bf in main (argc=<optimized out>, argv=<optimized out>) at osmo_bsc_main.c:1039
</pre>
https://osmocom.org/issues/5255?journal_id=22783 2021-10-20T00:05:59Z fixeria
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Stalled</i></li><li><strong>Assignee</strong> changed from <i>neels</i> to <i>osmith</i></li><li><strong>% Done</strong> changed from <i>80</i> to <i>90</i></li></ul><p>We need to back-port another change from the recent master:</p>
<pre>
commit dfd7bef6644d0c0837f7e5498bc5c86362b668dc
Author: Vadim Yanitskiy <vyanitskiy@sysmocom.de>
Date: Sun Jul 11 13:19:22 2021 +0600
lchan_fsm: fix potential NULL-pointer dereference
Change-Id: I373855b95f8bde0ce8f9c2ae7bf95c9135d33484
Related: SYS#5526
</pre>
<p>I submitted a cherry-pick to Gerrit:</p>
<p><a class="external" href="https://gerrit.osmocom.org/c/osmo-bsc/+/25836">https://gerrit.osmocom.org/c/osmo-bsc/+/25836</a> lchan_fsm: fix potential NULL-pointer dereference</p>
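<p>To illustrate the class of bug: the backtrace shows the activation failure handler being entered with for_conn=0x0. The following is only a minimal sketch of a NULL guard for that situation, not the actual osmo-bsc code or the actual patch. The struct layouts, the failures_signalled field and the function on_activation_failure() are hypothetical names made up for this example.</p>

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are NOT the real osmo-bsc structures. */
struct conn {
	int failures_signalled;
};

struct lchan {
	const char *name;
};

/* Handle a failed lchan activation.
 * Returns 1 if the owning conn was notified, 0 if there was no conn. */
int on_activation_failure(struct lchan *lchan, struct conn *for_conn)
{
	if (!for_conn) {
		/* The conn may already be gone (e.g. released while the lchan
		 * was still waiting for a timeout). Without this check, the
		 * dereference below would crash with a segfault. */
		fprintf(stderr, "%s: activation failed, no conn to notify\n",
			lchan->name);
		return 0;
	}

	/* Safe: for_conn is known to be non-NULL here. */
	for_conn->failures_signalled++;
	return 1;
}
```

<p>The point of the sketch is only the ordering: check the pointer first, log and return early, and touch the conn only on the path where it is known to exist.</p>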
<p>And again, I would need some help from <a class="user active" href="https://osmocom.org/users/301771">osmith</a> to create a patch release. This time 1.7.2.</p>
https://osmocom.org/issues/5255?journal_id=22784 2021-10-20T00:11:31Z fixeria
<ul></ul><p>I also cherry-picked both patches to the '2021q1':</p>
<p><a class="external" href="https://gerrit.osmocom.org/c/osmo-bsc/+/25837">https://gerrit.osmocom.org/c/osmo-bsc/+/25837</a> assignment_fsm: Check for conn->lchan [NEW]<br /><a class="external" href="https://gerrit.osmocom.org/c/osmo-bsc/+/25838">https://gerrit.osmocom.org/c/osmo-bsc/+/25838</a> lchan_fsm: fix potential NULL-pointer dereference [NEW]</p>
https://osmocom.org/issues/5255?journal_id=22789 2021-10-20T15:46:30Z fixeria
<ul><li><strong>Assignee</strong> changed from <i>osmith</i> to <i>pespin</i></li></ul><p>Oliver is on holiday this week; Pau agreed to help (thanks!).</p>
https://osmocom.org/issues/5255?journal_id=22791 2021-10-20T16:46:50Z pespin
<ul><li><strong>Status</strong> changed from <i>Stalled</i> to <i>Feedback</i></li><li><strong>Assignee</strong> changed from <i>pespin</i> to <i>fixeria</i></li></ul><p>tag 1.7.2 pushed with commit "lchan_fsm: fix potential NULL-pointer dereference" in it.</p>
<p>Reassigning to <a class="user active" href="https://osmocom.org/users/67">fixeria</a>.</p>
https://osmocom.org/issues/5255?journal_id=22797 2021-10-21T19:42:40Z fixeria
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Resolved</i></li><li><strong>% Done</strong> changed from <i>90</i> to <i>100</i></li></ul><p>Good news: latest osmo-bsc (1.7.2) does not crash anymore:</p>
<p><a class="external" href="https://jenkins.osmocom.org/jenkins/view/TTCN3-centos/job/TTCN3-centos-bsc-test-latest/228/">https://jenkins.osmocom.org/jenkins/view/TTCN3-centos/job/TTCN3-centos-bsc-test-latest/228/</a> (no core file, -36 failures)<br /><a class="external" href="https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bsc-test-latest/1116/">https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bsc-test-latest/1116/</a> (no core file, -36 failures)</p>