Bug #1733

closed

nat: Memory leak in osmo-bsc_nat?

Added by zecke almost 8 years ago. Updated almost 8 years ago.

Status: Closed
Priority: High
Assignee:
Category: -
Target version: -
Start date: 05/22/2016
Due date:
% Done: 100%
Resolution:
Spec Reference:

Description

The osmo-bsc_nat process was selected to be killed, but it is not clear that it was the process that consumed too much memory (it might just have been the unlucky one asking for memory). Look into a possible memory leak. Right now I can only think of two places that changed: Osmux and my new token-based auth (which is not used, but its memory might not be freed).

It currently sits at 10MB of resident memory. Let's have a look in a bit.

Actions #1

Updated by laforge almost 8 years ago

  • Assignee set to daniel
Actions #2

Updated by zecke almost 8 years ago

-full talloc report on 'nat' (total 5366856 bytes in 33246 blocks)
+full talloc report on 'nat' (total 5525325 bytes in 33358 blocks)
     telnet_connection              contains      1 bytes in   1 blocks (ref 0) 0x11dcc60
-    struct bsc_nat                 contains 3851016 bytes in 32875 blocks (ref 0) 0x1143a60
-        struct nat_sccp_connection     contains    128 bytes in   2 blocks (ref 0) 0x1a58ad0
-            XXX                contains     16 bytes in   1 blocks (ref 0) 0x1a2e820
-        struct nat_sccp_connection     contains    128 bytes in   2 blocks (ref 0) 0x1a2e750
-            XXX                contains     16 bytes in   1 blocks (ref 0) 0x1a4ea80
-        struct nat_sccp_connection     contains    128 bytes in   2 blocks (ref 0) 0x1a69620
-            XXX                contains     16 bytes in   1 blocks (ref 0) 0x1913040
-        struct nat_sccp_connection     contains    128 bytes in   2 blocks (ref 0) 0x1a2c860
-            XXX                contains     16 bytes in   1 blocks (ref 0) 0x1917ef0
-        struct nat_sccp_connection     contains    128 bytes in   2 blocks (ref 0) 0x19f4cf0
-            XXX                contains     16 bytes in   1 blocks (ref 0) 0x1a3ce10
-        struct nat_sccp_connection     contains    128 bytes in   2 blocks (ref 0) 0x1919b20
-            XXX                contains     16 bytes in   1 blocks (ref 0) 0x1a2c750
-        struct nat_sccp_connection     contains    128 bytes in   2 blocks (ref 0) 0x19b0200
-            XXX                contains     16 bytes in   1 blocks (ref 0) 0x1a4ef00
-        struct nat_sccp_connection     contains    128 bytes in   2 blocks (ref 0) 0x1a7d5c0
-            XXX                contains     16 bytes in   1 blocks (ref 0) 0x1a3cd90
+    struct bsc_nat                 contains 3853396 bytes in 32953 blocks (ref 0) 0x1143a60
+        struct nat_sccp_connection     contains    112 bytes in   1 blocks (ref 0) 0x1a2ccd0
+        struct bsc_connection          contains    472 bytes in   1 blocks (ref 0) 0x1909dc0
+        struct nat_sccp_connection     contains    128 bytes in   2 blocks (ref 0) 0x1a2cf50
+            XXX                contains     16 bytes in   1 blocks (ref 0) 0x1911bc0

This seems to be IMSIs stolen here:

                        con->filter_state.con_type = con_type;
                        con->filter_state.imsi_checked = filter;
                        bsc_nat_extract_lac(bsc, con, parsed, msg);
                        if (imsi)
                                con->filter_state.imsi = talloc_steal(con, imsi);

So we need to see how/why it sometimes remains scoped by:

+        struct bsc_connection          contains    665 bytes in  12 blocks (ref 0) 0x1a7a250
+            XXX                contains     16 bytes in   1 blocks (ref 0) 0x1a1cb60
+            XXX                contains     16 bytes in   1 blocks (ref 0) 0x198e090
+            XXX                contains     16 bytes in   1 blocks (ref 0) 0x1a2c8d0

But as this is scoped by either the nat_sccp_connection or the bsc_connection, the memory is freed once the TCP connection is dead, or freeing can be forced by resetting the connection.

Actions #3

Updated by zecke almost 8 years ago

The other leak seems to be in the Osmux/RTP code.

-    msgb                           contains 1515551 bytes in 362 blocks (ref 0) 0x11432a0
+    msgb                           contains 1671640 bytes in 396 blocks (ref 0) 0x11432a0
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bccb30
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bcba50
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bca970
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bc9890
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bc65f0
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bbac50
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bb79b0
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bc1190
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bc3350
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bd84d0
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bb8a90
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bc76d0
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bc4430
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bbefd0
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bb1470
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bc87b0
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bc5510
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bc2270
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bbdef0
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bbbd30
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bc00b0
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bb9b70
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bbce10
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bb0390
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1baf2b0
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bad0f0
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bac010
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1baaf30
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bb57f0
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bb4710
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bae1d0
         RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1ba5ad0
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bb3630
         RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1ba49f0
-        OSMUX test                     contains    165 bytes in   1 blocks (ref 0) 0x190d210
-        OSMUX test                     contains    165 bytes in   1 blocks (ref 0) 0x1919cf0
-        OSMUX test                     contains    165 bytes in   1 blocks (ref 0) 0x1919bf0
         RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1ba3910
-        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1ba1750
         RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1ba0670
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1ba6bb0
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bb2550
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1ba1750
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1ba8d70
         RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1ba9e50
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1ba7c90
+        RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1bb68d0
         RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1b9f590
         RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1b9e4b0
         RTP                            contains   4232 bytes in   1 blocks (ref 0) 0x1b9d3d0

git grep '"RTP"'
libmgcp/mgcp_network.c:         was_rtcp ? "RTCP" : "RTP",
libmgcp/mgcp_osmux.c:   msg = msgb_alloc(4096, "RTP");

Not sure how we go from 4096 to 4232 allocated bytes, but I think it is most likely this place. As this is scoped by the msgb context, the bytes will never go away. The two reports were taken 30 minutes apart; no RTP packet should stay in a queue that long (and I don't think the addresses just happen to be recycled, as the total memory usage grows).
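
A possible explanation for the 4096 vs. 4232 difference, as a hedged sketch: if the message buffer header and its payload are allocated as one chunk (which is how I recall msgb_alloc working, but treat that as an assumption), the report would show header size plus payload size. The struct below is a hypothetical stand-in named fake_msgb; its field layout is an assumption modelled on a msgb-style header, not copied from the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical doubly linked list node, standing in for a list head. */
struct list_node {
    struct list_node *next, *prev;
};

/* Hypothetical msgb-like header. The field layout is an assumption. */
struct fake_msgb {
    struct list_node list;
    void *dst;
    void *lchan;
    unsigned char *l1h, *l2h, *l3h, *l4h;
    unsigned long cb[5];
    uint16_t data_len;
    uint16_t len;
    unsigned char *head, *tail, *data;
    unsigned char _data[];      /* payload follows the header directly */
};

/* Size of one chunk when header and payload are allocated together. */
static size_t chunk_size(size_t payload)
{
    assert(payload <= SIZE_MAX - sizeof(struct fake_msgb));
    return sizeof(struct fake_msgb) + payload;
}
```

With 8-byte pointers and 8-byte unsigned long, this header should come to 136 bytes, so chunk_size(4096) would be 4232, matching the per-block size in the report. On other ABIs the header size differs.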

Actions #4

Updated by zecke almost 8 years ago

IMSI "leak":

  • We analyze access-lists before there is a tracked SCCP connection
  • The libfilter/ code is shared by BSC/NAT but:
                            filter = bsc_nat_filter_sccp_cr(bsc, msg, parsed,
                                                    &con_type, &imsi, &cause);
    ... will call
    bsc_nat_filter.c:bsc_nat_filter_sccp_cr
    .. which will fill out the bsc_filter_request and set the req.ctx to bsc (of type struct bsc_connection).
    .. then it goes to libfilter into code like this:
    
     *imsi = talloc_strdup(ctx, mi_string);
    

So this explains why the IMSI is scoped by the bsc_connection. It doesn't explain why it is not freed. So it takes another path as well. I think it comes from later Identity Requests.

Actions #5

Updated by zecke almost 8 years ago

IMSI "leak":

diff --git a/openbsc/src/osmo-bsc_nat/bsc_nat_filter.c b/openbsc/src/osmo-bsc_nat/bsc_nat_filter.c
index 393aea3..e735290 100644
--- a/openbsc/src/osmo-bsc_nat/bsc_nat_filter.c
+++ b/openbsc/src/osmo-bsc_nat/bsc_nat_filter.c
@@ -109,7 +109,7 @@ int bsc_nat_filter_dt(struct bsc_connection *bsc, struct msgb *msg,
        if (!hdr48)
                return -1;

-       req.ctx = bsc;
+       req.ctx = con;
        req.black_list = &bsc->nat->imsi_black_list;
        req.access_lists = &bsc->nat->access_lists;
        req.local_lst_name = bsc->cfg->acc_lst_name;
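
To illustrate why the choice of req.ctx matters, here is a toy sketch of hierarchical ownership. It is not talloc; struct ctx, ctx_new, ctx_free and stranded_blocks are hypothetical names invented for this example. It assumes that con is the short-lived per-call connection and bsc the long-lived one, and shows that a block parented to the long-lived context stays around after the short-lived one is freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy hierarchical context: freeing a parent frees all of its children. */
struct ctx {
    struct ctx *parent;
    struct ctx *first_child;
    struct ctx *next_sibling;
};

static int live_blocks;     /* number of blocks currently allocated */

static struct ctx *ctx_new(struct ctx *parent)
{
    struct ctx *c = calloc(1, sizeof(*c));
    if (!c)
        return NULL;
    c->parent = parent;
    if (parent) {
        c->next_sibling = parent->first_child;
        parent->first_child = c;
    }
    live_blocks++;
    return c;
}

static void ctx_free(struct ctx *c)
{
    if (!c)
        return;
    if (c->parent) {
        /* Unlink this block from its parent's child list. */
        struct ctx **pp = &c->parent->first_child;
        while (*pp && *pp != c)
            pp = &(*pp)->next_sibling;
        if (*pp)
            *pp = c->next_sibling;
    }
    /* Each child unlinks itself, so the loop terminates. */
    while (c->first_child)
        ctx_free(c->first_child);
    free(c);
    live_blocks--;
}

/* Returns how many "IMSI" blocks outlive their call while bsc is alive. */
static int stranded_blocks(int scope_to_call, int calls)
{
    int before = live_blocks;
    struct ctx *bsc = ctx_new(NULL);        /* long-lived connection */
    if (!bsc)
        return -1;
    for (int i = 0; i < calls; i++) {
        struct ctx *call = ctx_new(bsc);    /* short-lived connection */
        if (!call)
            break;
        (void)ctx_new(scope_to_call ? call : bsc);  /* the "IMSI" */
        ctx_free(call);
    }
    int stranded = live_blocks - before - 1;    /* exclude bsc itself */
    ctx_free(bsc);                              /* releases everything */
    assert(live_blocks == before);
    return stranded;
}
```

Calling stranded_blocks(0, n) models the old req.ctx = bsc behaviour, where one block per call accumulates until the long-lived connection goes away; stranded_blocks(1, n) models req.ctx = con, where nothing accumulates.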
Actions #6

Updated by daniel almost 8 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 30

osmux: There is an issue when a circuit is deleted while it still has msgs in the buffer. The buffer contains a list of circuits which contain a list of msgs.
When the circuit is deleted the msgs are lost. The proposed fix is to dequeue and free the msgs if a circuit is deleted.

https://gerrit.osmocom.org/#/c/119
https://gerrit.osmocom.org/#/c/120
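
The idea of the fix can be sketched as follows. This is a minimal illustration with hypothetical types and function names (struct batch, circuit_add, circuit_enqueue, circuit_del), not the code from the patches above: deleting a circuit first drains and frees its queued messages, then frees the circuit itself.

```c
#include <assert.h>
#include <stdlib.h>

struct msg {
    struct msg *next;
};

struct circuit {
    struct circuit *next;
    int ccid;               /* circuit identifier */
    struct msg *queue;      /* messages waiting to be sent */
};

struct batch {
    struct circuit *circuits;
};

static int live_msgs;       /* messages currently allocated */

static struct circuit *circuit_add(struct batch *b, int ccid)
{
    struct circuit *c = calloc(1, sizeof(*c));
    if (!c)
        return NULL;
    c->ccid = ccid;
    c->next = b->circuits;
    b->circuits = c;
    return c;
}

static int circuit_enqueue(struct circuit *c)
{
    struct msg *m = calloc(1, sizeof(*m));
    if (!m)
        return -1;
    m->next = c->queue;
    c->queue = m;
    live_msgs++;
    return 0;
}

static void circuit_del(struct batch *b, int ccid)
{
    struct circuit **pp = &b->circuits;
    while (*pp && (*pp)->ccid != ccid)
        pp = &(*pp)->next;
    if (!*pp)
        return;
    struct circuit *c = *pp;
    *pp = c->next;
    /* Dequeue and free pending messages so they are not lost. */
    while (c->queue) {
        struct msg *m = c->queue;
        c->queue = m->next;
        free(m);
        live_msgs--;
    }
    free(c);
}

/* Queues three messages, deletes the circuit, reports what is left. */
static int pending_after_delete(void)
{
    struct batch b = { NULL };
    struct circuit *c = circuit_add(&b, 1);
    if (!c)
        return -1;
    for (int i = 0; i < 3; i++)
        circuit_enqueue(c);
    circuit_del(&b, 1);
    assert(b.circuits == NULL);
    return live_msgs;
}
```

Without the draining loop in circuit_del, the three queued messages would stay allocated with nothing left pointing at them, which is the kind of growth seen in the msgb context.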

Actions #7

Updated by laforge almost 8 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 30 to 100

daniel wrote:

osmux: There is an issue when a circuit is deleted while it still has msgs in the buffer. The buffer contains a list of circuits which contain a list of msgs.
When the circuit is deleted the msgs are lost. The proposed fix is to dequeue and free the msgs if a circuit is deleted.

https://gerrit.osmocom.org/#/c/119
https://gerrit.osmocom.org/#/c/120

The "proposed" fix has long been merged, resolving the ticket. Please don't wait for me to spot such things...
