Bug #4603
closed
lots of SNDCP defrag queue msgb's allocated
100%
Description
In one of the crashes of #4602, there were 2783 msgb's with "SNDCP Defrag" allocated at the time of the crash:
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: msgb contains 3473671 bytes in 2785 blocks (ref 0) 0x55d97cc7b340
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: Attach Request contains 3208 bytes in 1 blocks (ref 0) 0x55d97d058100
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: Attach Request contains 3208 bytes in 1 blocks (ref 0) 0x55d97cd993d0
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13ec90
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13e870
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13e450
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13e030
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13dc10
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13d7f0
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13d3d0
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13cfb0
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13cb90
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13c770
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13c350
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13bf30
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13bb10
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13b6f0
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13b2d0
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13aeb0
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13aa90
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13a670
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d13a250
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d139e30
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d139a10
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d1395f0
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d1391d0
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d138db0
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d138990
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d138570
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d138150
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d137d30
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d137910
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d1374f0
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d1370d0
Jun 08 15:28:15 osmo-cn osmo-sgsn[4976]: SNDCP Defrag contains 949 bytes in 1 blocks (ref 0) 0x55d97d136cb0
...
So clearly we have a memory leak here.
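For anyone who wants to tally such a report by chunk name instead of counting by eye, here is a minimal sketch in C. It assumes only the line layout visible in the log above ("<name> contains <N> bytes in <M> blocks"), optionally preceded by a syslog prefix ending in "]: ". The names struct report_entry, parse_report_line() and count_named() are made up for this illustration and are not part of osmo-sgsn or libosmocore.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed report line (hypothetical helper type for this sketch). */
struct report_entry {
	char name[64];		/* chunk name, e.g. "SNDCP Defrag" */
	unsigned long bytes;
	unsigned long blocks;
};

/* Parse one report line. Returns 0 on success, -1 if the line does not match. */
static int parse_report_line(const char *line, struct report_entry *out)
{
	const char *start, *mid;
	size_t len;

	assert(line && out);

	/* Skip the syslog prefix ("... osmo-sgsn[4976]: ") if there is one. */
	start = strstr(line, "]: ");
	start = start ? start + 3 : line;

	/* Everything before " contains " is the chunk name. */
	mid = strstr(start, " contains ");
	if (!mid)
		return -1;
	len = (size_t)(mid - start);
	if (len == 0 || len >= sizeof(out->name))
		return -1;
	memcpy(out->name, start, len);
	out->name[len] = '\0';

	if (sscanf(mid, " contains %lu bytes in %lu blocks",
		   &out->bytes, &out->blocks) != 2)
		return -1;
	return 0;
}

/* Count the lines whose chunk name equals 'name'. */
static unsigned int count_named(const char *const *lines, size_t n,
				const char *name)
{
	struct report_entry e;
	unsigned int count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (parse_report_line(lines[i], &e) == 0 &&
		    strcmp(e.name, name) == 0)
			count++;
	}
	return count;
}
```

Feeding every line of the report to count_named() with the name "SNDCP Defrag" gives the number of such msgb's that were still allocated when the report was written.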
The "SNDCP Defrag" msgb's are created in defrag_segments() at the end of the defragmentation process, i.e. once all of the fragments have been received. The msgb contains the defragmented (complete) PDU.
The way I understand defrag_segments(): it will only free the 'expnd' message if any DCOMP or PCOMP is active. But it will not actually free the 'msg' that is passed into sgsn_rx_sndcp_ud_ind(). And the latter function is not free()ing msg either. In fact, it does not even use the 'msg' argument at all ?!?
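To make the suspected pattern concrete, here is a simplified model in C. It is not the osmo-sgsn code: struct buf, the fixed-size pool, deliver_pdu() and reassemble_leaky() are hypothetical stand-ins for the msgb, the allocator, sgsn_rx_sndcp_ud_ind() and defrag_segments(). The only thing it tries to reproduce is the behaviour described above: the expanded buffer is released, the reassembled one never is.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 8

/* Hypothetical stand-in for a message buffer (not the real struct msgb). */
struct buf {
	int in_use;
	size_t len;
	unsigned char data[64];
};

static struct buf pool[POOL_SIZE];

/* Hand out a free slot, or NULL once the pool is exhausted. */
static struct buf *buf_alloc(void)
{
	size_t i;

	for (i = 0; i < POOL_SIZE; i++) {
		if (!pool[i].in_use) {
			pool[i].in_use = 1;
			pool[i].len = 0;
			return &pool[i];
		}
	}
	return NULL;
}

static void buf_free(struct buf *b)
{
	if (b)
		b->in_use = 0;
}

/* Number of buffers currently allocated. */
static size_t live_bufs(void)
{
	size_t i, n = 0;

	for (i = 0; i < POOL_SIZE; i++)
		n += pool[i].in_use ? 1 : 0;
	return n;
}

/* Upper layer: reads the PDU, but neither keeps nor frees any buffer. */
static void deliver_pdu(const unsigned char *pdu, size_t len)
{
	(void)pdu;
	(void)len;
}

/* Reassembly with the suspected bug. Returns 0 on success, -1 if no buffer
 * could be allocated. */
static int reassemble_leaky(const unsigned char *pdu, size_t len,
			    int compression_active)
{
	struct buf *msg = buf_alloc();

	if (!msg)
		return -1;
	assert(len <= sizeof(msg->data));
	memcpy(msg->data, pdu, len);
	msg->len = len;

	if (compression_active) {
		struct buf *expnd = buf_alloc();

		if (!expnd)
			return -1;
		/* Plain copy as a placeholder for decompression. */
		memcpy(expnd->data, msg->data, msg->len);
		expnd->len = msg->len;
		deliver_pdu(expnd->data, expnd->len);
		buf_free(expnd);	/* expnd is released ... */
	} else {
		deliver_pdu(msg->data, msg->len);
	}
	/* BUG: ... but msg is never released on any path. */
	return 0;
}
```

In this model every completed reassembly permanently costs one buffer, so the pool runs dry after POOL_SIZE calls regardless of whether compression is active.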
Updated by laforge almost 4 years ago
- Status changed from New to In Progress
- % Done changed from 0 to 20
So the normal flow of events is:
- handle_nsip_read()
  - gprs_ns_rcvmsg()
    - gprs_ns_process_msg()
      - gprs_ns_rx_unitdata()
        - sgsn_ns_cb(GPRS_NS_EVT_UNIT_DATA, ...)
          - bssgp_rcvmsg()
            - bssgp_rx_ptp()
              - bssgp_rx_ul_ud()
                - sgsn_main.c:bssgp_prim_cb(SAP_BSSGP_LL, PRIM_BSSGP_UL_UD, PRIM_OP_INDICATION)
                  - gprs_llc.c:gprs_llc_rcvmsg()
                    - sndcp_llunitdata_ind()
                      - if defragmentation needed, defrag_input()
                        - create a copy of the payload in defrag_enqueue()
                        - eventually end up in defrag_segments(), where the "SNDCP Defrag" msg is allocated
                      - if no defragmentation is needed, sgsn_rx_sndcp_ud_ind()
                      - if defragmentation needed, defrag_input()
- return up the stack to handle_nsip_read() where the msgb is free'd
- memory ownership of the msgb is never transferred anywhere during the entire NS/BSSGP/LLC/SGSN stack traversal
- anyone wanting to take a copy of the memory needs to do so
- sgsn_rx_sndcp_ud_ind() will not take ownership
=> defrag_segments() must free the msgb it allocates.
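The ownership rule above can be sketched as follows. This is an illustration in plain C under stated assumptions, not the merged patch: malloc()/free() stand in for the msgb allocator, and rx_pdu(), struct kept_pdu and reassemble_and_deliver() are hypothetical names. The receiver only borrows the buffer and copies what it wants to keep; the function that allocated the reassembly buffer frees it after the call returns.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int live_allocs;	/* allocations not yet freed, for illustration */

static void *counted_alloc(size_t len)
{
	void *p;

	assert(len > 0);
	p = malloc(len);
	assert(p);
	live_allocs++;
	return p;
}

static void counted_free(void *p)
{
	if (!p)
		return;
	live_allocs--;
	free(p);
}

/* Hypothetical upper layer state: it owns only its private copy. */
struct kept_pdu {
	unsigned char *data;
	size_t len;
};

/* The receiver borrows 'pdu' for the duration of the call. Since it wants
 * to keep the data, it takes a copy and never frees the caller's buffer. */
static void rx_pdu(struct kept_pdu *keep, const unsigned char *pdu, size_t len)
{
	counted_free(keep->data);	/* drop the previous copy, if any */
	keep->data = counted_alloc(len);
	memcpy(keep->data, pdu, len);
	keep->len = len;
}

/* Join two fragments, deliver the result, then free what was allocated here. */
static void reassemble_and_deliver(struct kept_pdu *keep,
				   const unsigned char *a, size_t alen,
				   const unsigned char *b, size_t blen)
{
	unsigned char *msg = counted_alloc(alen + blen);

	memcpy(msg, a, alen);
	memcpy(msg + alen, b, blen);
	rx_pdu(keep, msg, alen + blen);
	counted_free(msg);	/* the allocator of the buffer releases it */
}
```

With this convention the number of live allocations stays constant no matter how many PDUs are reassembled, which is the property the leaking code lacked.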
Updated by laforge almost 4 years ago
- % Done changed from 20 to 50
Proposed fix in https://gerrit.osmocom.org/c/osmo-sgsn/+/18733
Updated by laforge almost 4 years ago
As far as I can tell this bug has been present since the original defragmentation implementation was added in 2010 (!). It seems we never encountered a sufficient number of fragmented SNDCP messages to really run into serious resource exhaustion problems ?!?
Updated by laforge almost 4 years ago
- Status changed from In Progress to Resolved
- % Done changed from 50 to 100
patch merged.