Bug #3952
closedMSGB_ABORT in msgb_l3trim called from gprs_llc_rcvmsg()
100%
Description
crash is observed with a large number and variey of phones attaching.
Happens within a few minutes of startup of SGSN
The offending msgb passed to gprs_llc_rcvmsg():
(gdb) print *msg $2 = {list = {next = 0x0, prev = 0x0}, {dst = 0x0, trx = 0x0}, lchan = 0x0, l1h = 0x0, l2h = 0x84c3fc "", l3h = 0x0, l4h = 0x0, cb = {8700928, 8700951, 8700938, 12350589634047246338, 0}, data_len = 3072, len = 32, head = 0x84c3e8 "", tail = 0x84c41c "", data = 0x84c3fc "", _data = 0x84c3e8 ""}
from gprs_llc_hdr_parse() :
(gdb) print llhp $3 = {sapi = 1 '\001', is_cmd = 1 '\001', ack_req = 0 '\000', is_encrypted = 0 '\000', seq_rx = 0, seq_tx = 0, fcs = 11772444, fcs_calc = 11772444, data = 0x0, data_len = 0, crc_length = 2, cmd = GPRS_LLC_NULL}
Then we do:
msg->l3h = llhp.data; (0x0)
msgb_l3trim(msg, llhp.data_len); (0)
and so in msgb_l3trim(msg, l3len)
msgb_trim(msg, (msg->l3h - msg->data) + l3len);
that is to say (0x0 - 0x84c3fc) + 0) results in passing a negative length to msgb_trim and so...
if (len < 0)
MSGB_ABORT(msg, "Negative length is not allowed\n");
backtrace:
(gdb) bt #0 0x00007ffff5402067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007ffff5403448 in __GI_abort () at abort.c:89 #2 0x00007ffff7314310 in osmo_panic () from /usr/lib/x86_64-linux-gnu/libosmocore.so.12 #3 0x000000000041c31c in msgb_trim (len=<optimized out>, msg=<optimized out>) at /usr/include/osmocom/core/msgb.h:487 #4 msgb_l3trim (l3len=<optimized out>, msg=<optimized out>) at /usr/include/osmocom/core/msgb.h:504 #5 gprs_llc_rcvmsg (msg=0x84c360, tv=0x72ea) at gprs_llc.c:950 #6 0x00007ffff775df5b in bssgp_rcvmsg () from /usr/lib/x86_64-linux-gnu/libosmogb.so.6 #7 0x00007ffff7757e3a in ?? () from /usr/lib/x86_64-linux-gnu/libosmogb.so.6 #8 0x00007ffff77597fa in gprs_ns_rcvmsg () from /usr/lib/x86_64-linux-gnu/libosmogb.so.6 #9 0x00007ffff7759974 in ?? () from /usr/lib/x86_64-linux-gnu/libosmogb.so.6 #10 0x00007ffff7308dd4 in osmo_select_main () from /usr/lib/x86_64-linux-gnu/libosmocore.so.12 #11 0x00000000004050d7 in main (argc=1, argv=0x0) at sgsn_main.c:524
LLC debug log leading up to this was:
<0011> gprs_llc_parse.c:81 LLC SAPI=1 C U GEA0 IOV-UI=0x000000 FCS=0x172bab CMD=UI DATA <0011> gprs_llc.c:342 Sending XID type NULL (8 bytes) request to MS... <0011> gprs_llc_parse.c:81 LLC SAPI=1 C U GEA0 IOV-UI=0x000000 FCS=0xea1c55 CMD=UI DATA <0011> gprs_llc_parse.c:81 LLC SAPI=1 C U GEA0 IOV-UI=0x000000 FCS=0xb3a21c msgb(0x84c360): Negative length is not allowed
Updated by keith almost 5 years ago
patch to stop this crash:
http://git.osmocom.org/osmo-sgsn/commit/?h=keith/prevent_MSGB_ABORT
Updated by laforge almost 5 years ago
Looking at gprs_llc_hdr_parse(), there are more cases where ghp->data = NULL.
Basically, in all cases except the following three, ghp->data will be NULL:- I frames (only in ABM mode, which we don't implement)
- UI frames (which is what we actually do handle)
- U frames if they contain XID
Everything else doesn't contain any payload.
So I think in order to solve the problem in a more general way, we should
turn it upside down:
- only ever pass UI frames up to GMM/SNDCP
- actually refuse/drop any of the frame types we don't support. ABM is optional
in LLC, and not really needed as we already use the RLC/MAC mode in relaible/confirmed
mode
The best way to handle this is to first write a couple of TTCN-3 tests that
reproduce various scenarios we don't support, and then add the fix[es] to osmo-sgsn
as a follow-up.
Updated by laforge almost 5 years ago
Simple test case reproducing the bug: https://gerrit.osmocom.org/#/c/osmo-ttcn3-hacks/+/13760
Updated by laforge almost 5 years ago
- Status changed from New to In Progress
- Assignee set to laforge
- % Done changed from 0 to 80
Updated by laforge almost 5 years ago
Patch to address the problem is https://gerrit.osmocom.org/#/c/osmo-sgsn/+/13764
Updated by laforge almost 5 years ago
- Status changed from In Progress to Resolved
- % Done changed from 80 to 100
patch and testcase have both been merged.