https://osmocom.org/https://osmocom.org/favicon.ico?16647414092021-08-06T16:48:53ZOpen Source Mobile CommunicationsOsmoPCU - Bug #5206: Crash: SEGV on current master 945be910 around gprs_rlcmac_tbf::namehttps://osmocom.org/issues/5206?journal_id=224292021-08-06T16:48:53Ziedemam
<ul></ul><p>I've seen this now with a different call stack but it looks like the same kaboom.</p>
<a name="Log-Tail-Type-2"></a>
<h3 >Log Tail Type 2<a href="#Log-Tail-Type-2" class="wiki-anchor">¶</a></h3>
<pre>
<0002> gprs_rlcmac_ts_alloc.cpp:776 No USF available
<0008> tbf.cpp:746 TBF(TFI=0 TLLI=0x78474a51 DIR=UL STATE=NULL EGPRS) Timeslot Allocation failed: trx = 1, single_slot = 0
<0008> tbf_ul.cpp:151 MS(TLLI=0x78474a51, IMSI=618010111567897, TA=0, 12/12, DL) No PDCH resource
<0002> pdch_ul_controller.c:315 PDCH(bts=0,trx=1,ts=6) Expiring FN=1792717 but previous FN=1792708 is still reserved!
<0002> pdch_ul_controller.c:315 PDCH(bts=0,trx=1,ts=6) Expiring FN=1792717 but previous FN=1792713 is still reserved!
<0002> pdch_ul_controller.c:329 PDCH(bts=0,trx=1,ts=6) Timeout for registered POLL (FN=1792713): TBF(TFI=3 TLLI=0xe2569340 DIR=DL STATE=FLOW EGPRS)
<0008> tbf.cpp:629 TBF(TFI=3 TLLI=0xe2569340 DIR=DL STATE=FLOW EGPRS) poll timeout for FN=1792713, TS=6 (curr FN 1792717)
<0008> tbf.cpp:699 TBF(TFI=3 TLLI=0xe2569340 DIR=DL STATE=FLOW EGPRS) Timeout for polling PACKET DOWNLINK ACK: |Assignment was on PACCH|Downlink ACK was received|
<0002> pdch_ul_controller.c:315 PDCH(bts=0,trx=1,ts=6) Expiring FN=1792734 but previous FN=1792721 is still reserved!
<0002> pdch_ul_controller.c:315 PDCH(bts=0,trx=1,ts=6) Expiring FN=1792734 but previous FN=1792726 is still reserved!
<0002> pdch_ul_controller.c:315 PDCH(bts=0,trx=1,ts=6) Expiring FN=1792734 but previous FN=1792730 is still reserved!
<0002> pdch_ul_controller.c:315 PDCH(bts=0,trx=1,ts=7) Expiring FN=1792734 but previous FN=1792730 is still reserved!
<0008> tbf.cpp:458 TBF(TFI=7 TLLI=0xcb2f0a00 DIR=DL STATE=WAIT_RELEASE EGPRS) T3193 timeout expired, freeing TBF
<0008> tbf.cpp:465 TBF(TFI=7 TLLI=0xcb2f0a00 DIR=DL STATE=WAIT_RELEASE EGPRS) T3193 timeout expired, freeing TBF
<0008> tbf.cpp:458 TBF(TFI=14 TLLI=0xee10ac00 DIR=DL STATE=WAIT_RELEASE EGPRS T3193 timeout expired, freeing TBF
<0008> tbf.cpp:465 TBF(TFI=14 TLLI=0xee10ac00 DIR=DL STATE=WAIT_RELEASE EGPRS T3193 timeout expired, freeing TBF
<0002> pdch_ul_controller.c:329 PDCH(bts=0,trx=1,ts=6) Timeout for registered POLL (FN=1792743): TBF(TFI=31 TLLI=0x7c31afb2 DIR=DL STATE=FLOW EGPRS)
<0008> tbf.cpp:629 TBF(TFI=31 TLLI=0x7c31afb2 DIR=DL STATE=FLOW EGPRS) poll timeout for FN=1792743, TS=6 (curr FN 1792743)
<0008> tbf.cpp:699 TBF(TFI=31 TLLI=0x7c31afb2 DIR=DL STATE=FLOW EGPRS) Timeout for polling PACKET DOWNLINK ACK: |Assignment was on PACCH|Downlink ACK was received|
<0002> pdch_ul_controller.c:315 PDCH(bts=0,trx=1,ts=7) Expiring FN=1792743 but previous FN=1792739 is still reserved!
<0008> tbf.cpp:407 TBF(TFI=7 TLLI=0xa86f0e0e DIR=UL STATE=FLOW EGPRS) N3101 exceeded MAX (10)
<0002> gprs_rlcmac_ts_alloc.cpp:776 No USF available
<0008> tbf.cpp:746 TBF(TFI=0 TLLI=0x7e64c836 DIR=UL STATE=NULL EGPRS) Timeslot Allocation failed: trx = 1, single_slot = 0
<0008> tbf_ul.cpp:151 MS(TLLI=0x7e64c836, IMSI=618010111603467, TA=0, 12/12, DL) No PDCH resource
<0008> tbf.cpp:458 TBF(TFI=9 TLLI=0x8d2aa800 DIR=UL STATE=RELEASING EGPRS) T3169 timeout expired, freeing TBF
<0008> tbf.cpp:462 TBF(TFI=9 TLLI=0x8d2aa800 DIR=UL STATE=RELEASING EGPRS) T3169 timeout expired, freeing TBF: |Assignment was on PACCH|No uplink data received yet|
<0002> pdch_ul_controller.c:315 PDCH(bts=0,trx=1,ts=7) Expiring FN=1792760 but previous FN=1792747 is still reserved!
</pre>
<a name="Backtrace-Type-2"></a>
<h3 >Backtrace Type 2<a href="#Backtrace-Type-2" class="wiki-anchor">¶</a></h3>
<pre>
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6c92110 in osmo_fsm_state_name () from /usr/lib64/libosmocore.so.17
#0 0x00007ffff6c92110 in osmo_fsm_state_name () from /usr/lib64/libosmocore.so.17
#1 0x0000000000418fe3 in osmo_fsm_inst_state_name (fi=<optimized out>) at /root/builder/firmware-master/build/rangesdmn-x86-64-core2/usr/include/osmocom/core/fsm.h:233
#2 gprs_rlcmac_tbf::state_name (this=0xe1fb80) at ./tbf.h:370
#3 gprs_rlcmac_tbf::name (this=0xe1fb80) at tbf.cpp:1070
#4 0x000000000041905a in tbf_name (tbf=<optimized out>) at tbf.cpp:1065
#5 0x000000000042ad01 in pdch_ulc_expire_fn (ulc=0x7bd690, fn=fn@entry=1792760) at pdch_ul_controller.c:327
#6 0x000000000040ff0b in pcu_rx_data_ind_pdtch (bts=bts@entry=0x7ad200, pdch=pdch@entry=0x7af720, data=data@entry=0x7fffffffe326 "@\005\341\063\354\233", '+' <repeats 17 times>, len=<optimized out>, fn=1792760, meas=meas@entry=0x7fffffffe2a0) at pcu_l1_if.cpp:285
#7 0x0000000000410834 in pcu_rx_data_ind (data_ind=0x7fffffffe324, bts=0x7ad200) at pcu_l1_if.cpp:418
#8 pcu_rx (pcu_prim=pcu_prim@entry=0x7fffffffe320, pcu_prim_length=<optimized out>) at pcu_l1_if.cpp:992
#9 0x0000000000431399 in pcu_sock_read (bfd=<optimized out>) at osmobts_sock.c:156
#10 0x0000000000431585 in pcu_sock_cb (bfd=0x6974e0 <pcu_sock_state>, flags=1) at osmobts_sock.c:211
#11 0x00007ffff6c8e78c in ?? () from /usr/lib64/libosmocore.so.17
#12 0x00007ffff6c8e836 in osmo_select_main () from /usr/lib64/libosmocore.so.17
#13 0x0000000000406bf5 in main (argc=1, argv=0x7fffffffecd8) at pcu_main.cpp:329
$1 = {si_signo = 11, si_errno = 0, si_code = 1, _sifields = {_pad = {48, 0, 160, 0, 70288, 0, 50806128, 0, 816916108, 32764, 951191690, 32604, 1, 0, 160, 0, 38817760, 0, 47792368, 0, 44, 0, 6622689, 0, 8149536, 0, 45656064, 0}, _kill = {si_pid = 48, si_uid = 0}, _timer = {si_tid = 48, si_overrun = 0, si_sigval = {sival_int = 160, sival_ptr = 0xa0}}, _rt = {si_pid = 48, si_uid = 0, si_sigval = {sival_int = 160, sival_ptr = 0xa0}}, _sigchld = {si_pid = 48, si_uid = 0, si_status = 160, si_utime = 301884661301248, si_stime = 218210658196389888}, _sigfault = {si_addr = 0x30, _addr_lsb = 160, _addr_bnd = {_lower = 0x11290, _upper = 0x3073d70}}, _sigpoll = {si_band = 48, si_fd = 160}}}
</pre> OsmoPCU - Bug #5206: Crash: SEGV on current master 945be910 around gprs_rlcmac_tbf::namehttps://osmocom.org/issues/5206?journal_id=224322021-08-06T17:12:22Zlaforge
<ul><li><strong>Assignee</strong> set to <i>neels</i></li></ul> OsmoPCU - Bug #5206: Crash: SEGV on current master 945be910 around gprs_rlcmac_tbf::namehttps://osmocom.org/issues/5206?journal_id=224462021-08-09T17:02:04Zneelsnhofmeyr@sysmocom.de
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li></ul><p>The code where the SEGV is reported:</p>
<pre>
const char *tbf_name(const gprs_rlcmac_tbf *tbf)
{
return tbf ? tbf->name() : "(no TBF)";
}
const char *gprs_rlcmac_tbf::name() const
{
snprintf(m_name_buf, sizeof(m_name_buf) - 1,
"TBF(TFI=%d TLLI=0x%08x DIR=%s STATE=%s%s)",
m_tfi, tlli(),
direction == GPRS_RLCMAC_UL_TBF ? "UL" : "DL",
state_name(),
is_egprs_enabled() ? " EGPRS" : ""
);
m_name_buf[sizeof(m_name_buf) - 1] = '\0';
return m_name_buf;
}
</pre>
<p>AFAICT this cannot be a NULL dereference.</p>
<ul>
<li>tbf_name() checks against NULL tbf</li>
<li>tlli() checks against a NULL m_ms</li>
<li>state_name() uses osmo_fsm_inst_state_name() which is NULL safe</li>
<li>state_fsm is a member struct and cannot be NULL</li>
<li>is_gprs_enabled() merely returns a bool member</li>
</ul>
<p>Since it's not a NULL deref, I guess it must be a use-after-free??</p> OsmoPCU - Bug #5206: Crash: SEGV on current master 945be910 around gprs_rlcmac_tbf::namehttps://osmocom.org/issues/5206?journal_id=224472021-08-09T20:21:24Zneelsnhofmeyr@sysmocom.de
<ul></ul><p>seems that the ulc->tree_root traversed in pdch_ulc_expire_fn() upon receiving a pdtch data ind<br />still points at a entry that has already been deallocated.</p>
<p>I'm pretty sure that the actual cause for the error happens way before the symptom shows:<br />at some point a tbf gets freed without calling tbf_free(), and it busts up pdch_ulc_expire_fn() some time later.</p>
<p>The reason is not trivial: tbf_free() properly calls tbf_unlink_pdch() that<br />removes the tbf entry via gprs_rlcmac_pdch::detach_tbf() and pdch_ulc_release_tbf()</p>
<p>Next checking whether:</p>
<ul>
<li>some error handling path frees a tbf in a different way,</li>
<li>some parent talloc ctx might be freed and "pulls the carpet".</li>
</ul>
<p>If I don't find any cause, we could add some strategic logging and analyse allocation and freeing next time it happens.</p> OsmoPCU - Bug #5206: Crash: SEGV on current master 945be910 around gprs_rlcmac_tbf::namehttps://osmocom.org/issues/5206?journal_id=225382021-09-15T09:25:52Zlaforge
<ul><li><strong>Assignee</strong> changed from <i>neels</i> to <i>pespin</i></li></ul> OsmoPCU - Bug #5206: Crash: SEGV on current master 945be910 around gprs_rlcmac_tbf::namehttps://osmocom.org/issues/5206?journal_id=225622021-09-15T12:36:59Zpespin
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Feedback</i></li><li><strong>Assignee</strong> changed from <i>pespin</i> to <i>iedemam</i></li></ul><p>Same as <a class="issue tracker-1 status-3 priority-3 priority-high3 closed" title="Bug: Crash: SEGV on current master 945be910 (Resolved)" href="https://osmocom.org/issues/5205">#5205</a>, let's ask <a class="user active" href="https://osmocom.org/users/732038">iedemam</a> if he stills sees it with recent versions of osmo-pcu after FSM refactoring was merged (end of August).</p> OsmoPCU - Bug #5206: Crash: SEGV on current master 945be910 around gprs_rlcmac_tbf::namehttps://osmocom.org/issues/5206?journal_id=227432021-10-15T15:49:05Zpespin
<ul></ul><p>I believe this should be fixed in current osmo-pcu.git master b0aba591433c7c22298035453713724172d1cfbc</p>
<p>Please <a class="user active" href="https://osmocom.org/users/732038">iedemam</a> see if you can still reproduce.</p> OsmoPCU - Bug #5206: Crash: SEGV on current master 945be910 around gprs_rlcmac_tbf::namehttps://osmocom.org/issues/5206?journal_id=228102021-10-25T06:03:24Zlaforge
<ul></ul><p>pespin wrote:</p>
<blockquote>
<p>I believe this should be fixed in current osmo-pcu.git master b0aba591433c7c22298035453713724172d1cfbc</p>
<p>Please <a class="user active" href="https://osmocom.org/users/732038">iedemam</a> see if you can still reproduce.</p>
</blockquote>
<p>any update here, can the issue be closed?</p> OsmoPCU - Bug #5206: Crash: SEGV on current master 945be910 around gprs_rlcmac_tbf::namehttps://osmocom.org/issues/5206?journal_id=228152021-10-25T08:37:44Ziedemam
<ul></ul><p>Also looking good here, thanks. Can be closed.</p> OsmoPCU - Bug #5206: Crash: SEGV on current master 945be910 around gprs_rlcmac_tbf::namehttps://osmocom.org/issues/5206?journal_id=228212021-10-25T09:22:38Zpespin
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Resolved</i></li><li><strong>% Done</strong> changed from <i>0</i> to <i>100</i></li></ul>