Project

General

Profile

Bug #4995

handle ENOBUFS on write to AF_PACKET socket

Added by laforge 4 months ago. Updated 3 months ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
-
Start date:
01/30/2021
Due date:
% Done:

100%

Spec Reference:

Description

AF_PACKET sockets have the following "incredible" semantics:
  • even when marked write-able by select/poll, they still many return -1 / ENOBUFS in case the socket buffer and/or tx-queue of the driver is full
  • there is no way to safely/sanely wait for buffer space to become available again
  • the only option is to re-try until it finally succeeds, preferably after a "reasonable" amount of sleep, considering the data rate of the underlying transport medium

I'm attaching a reproducer to demonstrate the problem, both with select() and without.

See also SYS#5343

The problem is further increased due to the fact that libosmocore/ns2 doesn't even notice if this happens.

The call chain looks like this:
  • frame_relay.c:osmo_fr_tx_dlc() is used to transmit NS messages by the NS2 core
  • frame_relay.c will do whatever handling internaly
  • gprs_ns2fr.c:fr_tx_cb() is the call-back we register with the frame_relay.c cre
    • puts msgb into osmo_wqueue
  • wqueue code calls gprs_ns2_fr.c:handle_netif_write() as write-callback when the FD is write-able (always!)
    • we directlry retunr te result of the write() syscall (which is -1 in case of error)
    • osmo_wqueue_bfd_cb() calls that write_cb(), but
      • it expects a -errno type return value, not the return value of a syscall that's likely just -1.
      • only treats "-EAGAIN" as a trigger to re-enqueue the just-dequeued message

So all in all, we are using a write_queue, but it will never really queue anything, as the socket is always writable and we don't realize if the write actually fails.

What makes this even worse: A shared write_queue for user traffic and Q.933 LMI (or even NS-ALIVE) traffic is actively dangerous
  • if Q.933 starts to fail ,the entire link will be marked dead at the FR level
  • if NS-ALIVE starts to fail, the NS-VC will be marked as DEAD
Now the question is what to do about it:
  1. as an absolute minimum, we should have a counter and/or error messages if ENOBUFS or any other error happens
  2. users of osmo_wqueue should always return a "-errno" style return value. We should audit all our code, not just this example
  3. there should be some notification of the uppwer layers/application when ENOBUFS happens
  4. have separate queues or some other prioritization that prefers Q.933 LMI traffic and NS-signaling (ALIVE/ALIVE-ACK) over all user traffic (NS-UNITDATA)
packet.c packet.c 3.94 KB reproducer program laforge, 01/30/2021 10:33 AM

Related issues

Related to libosmocore - Bug #4974: gbproxy-ttcn3-test over framerelay are unstableResolved01/25/2021

Associated revisions

Revision 335c550f (diff)
Added by laforge 4 months ago

ns2: Use proper return value from write_queue callback function

write_queue expects a -errno value on error, not '-1'.

Change-Id: I93c858facfe7e1c533df8dccc4502a574686bc8a
Related: OS#4995

Revision f0073d70 (diff)
Added by laforge 4 months ago

ns2: Log ERROR if we cannot transmit a packet due to ENOBUFS

Related: OS#4995
Change-Id: I2ba64e96c60e23d2e6c8ecdcab0b52b3833f092c

Revision 0b7614b4 (diff)
Added by laforge 4 months ago

osmo-ns-dummy: Add simple NS traffic generator

This adds a simple NS traffic generator that can be used to perform
load testing on NS links, particularly those with limited bandwidth
such as frame-relay E1 lines.

Related: OS#4995
Change-Id: Iad3b694c85962dbbc6b4a27a0ed5bc841add464f

Revision d06128d1 (diff)
Added by laforge 4 months ago

ns2: Work around AF_PACKET socket ENOBUFS problems

AF_PACKET sockets cannot be written-to using select(), as they
will always return "writable" but then still fail with ENOBUFS.

This also means that we cannot use osmo_wqueue() as it assumes
that a writable socket can actually be written to.

As there's no way to figure out when exactly we can again perform
a successful write, we have no other option but to start a timer
and re-try at a later time.

We will scale that timer based on the estimated duration of transmission
for the just-failed PDU on the line rate of a 31TS E1 Line.

Furthermore, with this patch, we stop queueing anything but signaling
traffic (NS-BVCI=0) or Q.933 LMI. User data (NS-BVCI != 0) will
instead be dropped in order to avoid buffer-bloat.

Change-Id: I4f79a246236c94175ceffa5f3f556c8931c6bc62
Closes: OS#4995

History

#1 Updated by laforge 4 months ago

#2 Updated by laforge 4 months ago

  • % Done changed from 0 to 10

After some thinking about possible approaches with prioritiy queues or the like, I think I concluded

  • osmo_wqueue cannot be used at all due to its integration with select and assumption that select would only return if a socket is really write-able, ...
  • instead of complex multiple priority queues, I think we should implement one new mechanism with the following policy
    • enqueue Q.933 LMI at the head of the qeuue, instead of the tail
    • enqueue NS-ALIVE/ACK at the head of the queue, instad of the tail
    • enqueue NS-UNITDATA with BVCI==0 at the tail of the queue
    • never enqueue NS-UNITDATA with BVCI!=0, i.e. user traffic on PTP BVCs

The dequeue mechanism would be entirely driven by the ENOBUFS returns and a related osmo_timer. So if we get ENOBUFS, we start a timer and retry later. The duration of the timer can be baesed on a rough estimate of the transimit time of the just-failed packet.

This should
  • keep our queue very short, and
  • ensure that all essential signaling on all layers (LMI/NS/BVC) gets reliable delivery
    • even BVC-RESET for PTP BVC happens on NS-BVCI 0), while
  • dropping that traffic which we really can afford to drop (user plane)
  • also ensures we don't get into buffer bloat

The number of dropped packets should have a counter, and it ideally should also provide some feedback to the higher layers so that we can incorporate it in the per-BVC flow control. I would consider the latter a second, possibly even optional step.

#3 Updated by laforge 4 months ago

  • Status changed from New to In Progress

I can actually reproduce the dropping of vital messages like Q.933 or NS-ALIVE/ACK under high load.

Using osmo-ns-dummy with the load-generator patch from https://gerrit.osmocom.org/c/libosmocore/+/22553 and this config snippet:

ns-traffic-generator foo
 nsei 1001
 bvci 0
 packet-size 1400
 interval-us 2500
 lsp 0
 lsp-mode fixed

will try to transmit at more than twice the capacity of the E1 line and hence reproduce the problem very quickly. Within a minute or so, the FR/HDLC link will be declared as unreliable.

#4 Updated by laforge 4 months ago

  • % Done changed from 10 to 70

laforge wrote:

After some thinking about possible approaches with prioritiy queues or the like, I think I concluded

  • osmo_wqueue cannot be used at all due to its integration with select and assumption that select would only return if a socket is really write-able, ...
  • instead of complex multiple priority queues, I think we should implement one new mechanism with the following policy
    • enqueue Q.933 LMI at the head of the qeuue, instead of the tail
    • enqueue NS-ALIVE/ACK at the head of the queue, instad of the tail
    • enqueue NS-UNITDATA with BVCI==0 at the tail of the queue
    • never enqueue NS-UNITDATA with BVCI!=0, i.e. user traffic on PTP BVCs

This is now implemented in https://gerrit.osmocom.org/c/libosmocore/+/22555

The "Q.933 / NS-ALIVE starvation" can no longer be observed when doing overload testing using this patch.

#5 Updated by laforge 4 months ago

  • Description updated (diff)

#6 Updated by laforge 3 months ago

  • Status changed from In Progress to Resolved
  • % Done changed from 70 to 100

patch merged.

#7 Updated by laforge 3 months ago

  • Related to Bug #4974: gbproxy-ttcn3-test over framerelay are unstable added

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)