Project

General

Profile

Bug #3997

Retransmit queue is not cleared

Added by manatails 5 months ago. Updated 4 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
Start date:
05/11/2019
Due date:
% Done:

100%

Spec Reference:

Description

osmo-ggsn slowly fills its retransmit queue every few minutes, but queued items are not removed.

following is the view of the queue with QUEUE_DEBUG 1.

53 1 61635 54 52 1557380751 0
54 1 61636 55 53 1557380760 0
55 1 61637 56 54 1557380777 0
56 1 61638 57 55 1557380787 0
57 1 61639 58 56 1557380830 0
58 1 61640 59 57 1557380906 0
59 1 61641 60 58 1557381026 0
60 1 61642 61 59 1557381074 0
61 1 61643 62 60 1557381086 0
62 1 61644 63 61 1557381153 0
63 1 61645 64 62 1557381300 0
64 1 61646 65 63 1557381305 0
65 1 61647 66 64 1557381564 0
66 1 61648 67 65 1557381619 0
67 1 61649 68 66 1557381630 0
68 1 61650 69 67 1557381668 0
69 1 61651 70 68 1557381704 0
70 1 61652 71 69 1557381732 0
71 1 61653 72 70 1557381737 0
72 1 61654 73 71 1557381742 0
73 1 61655 74 72 1557381747 0
74 1 61656 75 73 1557381752 0
75 1 61657 76 74 1557381757 0
76 1 61658 77 75 1557381784 0
77 1 61659 78 76 1557381988 0
78 1 61660 79 77 1557382047 0
79 1 61661 80 78 1557382230 0

Checklist

  • create TTCN3 tests to exercise GTP-C retransmissions in OsmoGGSN
  • create TTCN3 tests to exercise GTP-C retransmissions in OsmoSGSN

Related issues

Related to OsmoGGSN (former OpenGGSN) - Bug #1740: add unit tests for queue_seqdel() and callersResolved06/01/2016

History

#1 Updated by pespin 5 months ago

  • Status changed from New to In Progress
  • Assignee set to pespin

It took me a while to reproduce because i was using only 1 MS. For some reason until I didn't register a 2nd MS (while 1st one was downloading a big file), I didn't start seeing the queue dump every few seconds.

So to given a better example, one first sees for instance this state:

20190527125321073 DGGSN <0002> /home/pespin/dev/sysmocom/git/osmo-ggsn/ggsn/ggsn.c:852 PDP(901700000015254:6): Packet received on APN(internet): forwarding to tun tun4
20190527125321106 DTUN <0001> /home/pespin/dev/sysmocom/git/osmo-ggsn/ggsn/ggsn.c:818 Received packet for APN(internet) from tun tun4
Begin queue_seqget seq = 26654
End queue_seqget. Not found
queue_newmsg 26654
Begin queue_seqset seq = 26654
SIZEOF PEER 8, *PEER 16
End queue_seqset
Queue: 0x7fffe9ac7800 Next: 31 First: 0 Last: 30
# State seq next prev timeout retrans
0 1 26624 1 -1 1558952675 0
1 1 26625 2 0 1558952853 0
2 1 26626 3 1 1558952858 0
3 1 26627 4 2 1558953034 0
4 1 26628 5 3 1558953037 0
5 1 26629 6 4 1558953207 0
6 1 26630 7 5 1558953268 0
7 1 26631 8 6 1558953537 0
8 1 26632 9 7 1558953579 0
9 1 26633 10 8 1558953662 0
10 1 26634 11 9 1558953784 0
11 1 26635 12 10 1558953969 0
12 1 26636 13 11 1558953971 0
13 1 26637 14 12 1558954371 0
14 1 26638 15 13 1558954373 0
15 1 26639 16 14 1558954377 0
16 1 26640 17 15 1558954380 0
17 1 26641 18 16 1558954383 0
18 1 26642 19 17 1558954389 0
19 1 26643 20 18 1558954395 0
20 1 26644 21 19 1558954401 0
21 1 26645 22 20 1558954407 0
22 1 26646 23 21 1558954413 0
23 1 26647 24 22 1558954419 0
24 1 26648 25 23 1558954425 0
25 1 26649 26 24 1558954431 0
26 1 26650 27 25 1558954437 0
27 1 26651 28 26 1558954443 0
28 1 26652 29 27 1558954449 0
29 1 26653 30 28 1558954455 0
30 1 26654 -1 29 0 0
31 0 0 0 0 0 0
32 0 0 0 0 0 0
33 0 0 0 0 0 0
34 0 0 0 0 0 0

Then, a few seconds later:

Begin queue_seqget seq = 26677
End queue_seqget. Not found
queue_newmsg 26677
Begin queue_seqset seq = 26677
SIZEOF PEER 8, *PEER 16
End queue_seqset
Queue: 0x7fffe9ac7800 Next: 54 First: 0 Last: 53
# State seq next prev timeout retrans
0 1 26624 1 -1 1558952675 0
1 1 26625 2 0 1558952853 0
2 1 26626 3 1 1558952858 0
3 1 26627 4 2 1558953034 0
4 1 26628 5 3 1558953037 0
5 1 26629 6 4 1558953207 0
6 1 26630 7 5 1558953268 0
7 1 26631 8 6 1558953537 0
8 1 26632 9 7 1558953579 0
9 1 26633 10 8 1558953662 0
10 1 26634 11 9 1558953784 0
11 1 26635 12 10 1558953969 0
12 1 26636 13 11 1558953971 0
13 1 26637 14 12 1558954371 0
14 1 26638 15 13 1558954373 0
15 1 26639 16 14 1558954377 0
16 1 26640 17 15 1558954380 0
17 1 26641 18 16 1558954383 0
18 1 26642 19 17 1558954389 0
19 1 26643 20 18 1558954395 0
20 1 26644 21 19 1558954401 0
21 1 26645 22 20 1558954407 0
22 1 26646 23 21 1558954413 0
23 1 26647 24 22 1558954419 0
24 1 26648 25 23 1558954425 0
25 1 26649 26 24 1558954431 0
26 1 26650 27 25 1558954437 0
27 1 26651 28 26 1558954443 0
28 1 26652 29 27 1558954449 0
29 1 26653 30 28 1558954455 0
30 1 26654 31 29 1558954461 0
31 1 26655 32 30 1558954471 0
32 1 26656 33 31 1558954473 0
33 1 26657 34 32 1558954479 0
34 1 26658 35 33 1558954485 0
35 1 26659 36 34 1558954491 0
36 1 26660 37 35 1558954497 0
37 1 26661 38 36 1558954503 0
38 1 26662 39 37 1558954509 0
39 1 26663 40 38 1558954515 0
40 1 26664 41 39 1558954521 0
41 1 26665 42 40 1558954527 0
42 1 26666 43 41 1558954533 0
43 1 26667 44 42 1558954539 0
44 1 26668 45 43 1558954545 0
45 1 26669 46 44 1558954551 0
46 1 26670 47 45 1558954557 0
47 1 26671 48 46 1558954563 0
48 1 26672 49 47 1558954569 0
49 1 26673 50 48 1558954575 0
50 1 26674 51 49 1558954581 0
51 1 26675 52 50 1558954587 0
52 1 26676 53 51 1558954593 0
53 1 26677 -1 52 0 0
54 0 0 0 0 0 0
55 0 0 0 0 0 0

So it can be seen that the amount of used packets in the queue increses over time (every few seconds).

From seq numbers seen in log and matching with wireshark output, that filling of the list is triggered by osmo-sgsn sending an Echo Request and osmo-ggsn answering with an Echo Response.
I now need to look into why the list increases when that happens.

#2 Updated by pespin 5 months ago

  • Status changed from In Progress to Feedback
  • % Done changed from 0 to 90

Osmo-ggsn is receiving the request and inserting the echo response in gsn->queue_resp through gtp_dublicate()->queue_seqget()).

retrans are handled and freed through gtp_retrans() called upon trigger of ggsn->gtp_timer. But that timer was never started during startup, so it will never trigger.

Following patch fixes the issue, as can be seen in sample output below:
https://gerrit.osmocom.org/#/c/osmo-ggsn/+/14201 ggsn: Start gtp retrans timer during startup

Begin queue_seqget seq = 28725
End queue_seqget. Not found
queue_newmsg 28725
Begin queue_seqset seq = 28725
SIZEOF PEER 8, *PEER 16
End queue_seqset
Queue: 0x7fffe9ac7800 Next: 54 First: 43 Last: 53
# State seq next prev timeout retrans
0 0 0 0 0 0 0
1 0 0 0 0 0 0
2 0 0 0 0 0 0
3 0 0 0 0 0 0
4 0 0 0 0 0 0
5 0 0 0 0 0 0
6 0 0 0 0 0 0
7 0 0 0 0 0 0
8 0 0 0 0 0 0
9 0 0 0 0 0 0
10 0 0 0 0 0 0
11 0 0 0 0 0 0
12 0 0 0 0 0 0
13 0 0 0 0 0 0
14 0 0 0 0 0 0
15 0 0 0 0 0 0
16 0 0 0 0 0 0
17 0 0 0 0 0 0
18 0 0 0 0 0 0
19 0 0 0 0 0 0
20 0 0 0 0 0 0
21 0 0 0 0 0 0
22 0 0 0 0 0 0
23 0 0 0 0 0 0
24 0 0 0 0 0 0
25 0 0 0 0 0 0
26 0 0 0 0 0 0
27 0 0 0 0 0 0
28 0 0 0 0 0 0
29 0 0 0 0 0 0
30 0 0 0 0 0 0
31 0 0 0 0 0 0
32 0 0 0 0 0 0
33 0 0 0 0 0 0
34 0 0 0 0 0 0
35 0 0 0 0 0 0
36 0 0 0 0 0 0
37 0 0 0 0 0 0
38 0 0 0 0 0 0
39 0 0 0 0 0 0
40 0 0 0 0 0 0
41 0 0 0 0 0 0
42 0 0 0 0 0 0
43 1 28715 44 -1 1558969344 0
44 1 28716 45 43 1558969350 0
45 1 28717 46 44 1558969356 0
46 1 28718 47 45 1558969362 0
47 1 28719 48 46 1558969368 0
48 1 28720 49 47 1558969374 0
49 1 28721 50 48 1558969380 0
50 1 28722 51 49 1558969386 0
51 1 28723 52 50 1558969392 0
52 1 28724 53 51 1558969398 0
53 1 28725 -1 52 0 0
54 0 0 0 0 0 0
55 0 0 0 0 0 0
56 0 0 0 0 0 0

#3 Updated by laforge 5 months ago

  • Checklist item create TTCN3 tests to exercise GTP-C retransmissions in OsmoGGSN added
  • Checklist item create TTCN3 tests to exercise GTP-C retransmissions in OsmoSGSN added

#4 Updated by pespin 5 months ago

This patch should also improve a bit the amount of packets in the resp queue in some specific scenarios:
https://gerrit.osmocom.org/#/c/osmo-ggsn/+/14211 gtp: Take queue_resp into account to schedule retrans timer

#5 Updated by pespin 5 months ago

Submitted one test to verify retrans queue_resp in osmo-ggsn:

https://gerrit.osmocom.org/#/c/osmo-ttcn3-hacks/+/14240 ggsn: Add test to verify retransmit queue_resp

#6 Updated by pespin 5 months ago

  • Checklist item create TTCN3 tests to exercise GTP-C retransmissions in OsmoGGSN set to Done
  • Checklist item create TTCN3 tests to exercise GTP-C retransmissions in OsmoSGSN set to Done

Submitted similar patch to test retransmit queues in osmo-sgsn:
https://gerrit.osmocom.org/#/c/osmo-ttcn3-hacks/+/14262 sgsn: Add tests to check GTP retransmit queue

Once merged we can close the ticket.

#7 Updated by pespin 5 months ago

  • Related to Bug #1740: add unit tests for queue_seqdel() and callers added

#8 Updated by pespin 4 months ago

  • Status changed from Feedback to Resolved
  • % Done changed from 90 to 100

Merged, closing.

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)