1. 13 Dec, 2019 20 commits
  2. 12 Dec, 2019 2 commits
  3. 11 Dec, 2019 13 commits
    • bpf: Emit audit messages upon successful prog load and unload · bae141f5
      Daniel Borkmann authored
      Allow for audit messages to be emitted upon BPF program load and
      unload for having a timeline of events. The load itself is in
      syscall context, so additional info about the process initiating
      the BPF prog creation can be logged and later directly correlated
      to the unload event.
      
      The only information really needed from the BPF side is the globally
      unique prog ID. Audit user space tooling can then query / dump all
      info needed about the specific BPF program right upon the load event
      and enrich the record, so the changes needed here can be kept
      small and non-intrusive to the core.
      
      Raw example output:
      
        # auditctl -D
        # auditctl -a always,exit -F arch=x86_64 -S bpf
        # ausearch --start recent -m 1334
        ...
        ----
        time->Wed Nov 27 16:04:13 2019
        type=PROCTITLE msg=audit(1574867053.120:84664): proctitle="./bpf"
        type=SYSCALL msg=audit(1574867053.120:84664): arch=c000003e syscall=321   \
          success=yes exit=3 a0=5 a1=7ffea484fbe0 a2=70 a3=0 items=0 ppid=7477    \
          pid=12698 auid=1001 uid=1001 gid=1001 euid=1001 suid=1001 fsuid=1001    \
          egid=1001 sgid=1001 fsgid=1001 tty=pts2 ses=4 comm="bpf"                \
          exe="/home/jolsa/auditd/audit-testsuite/tests/bpf/bpf"                  \
          subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null)
        type=UNKNOWN[1334] msg=audit(1574867053.120:84664): prog-id=76 op=LOAD
        ----
        time->Wed Nov 27 16:04:13 2019
        type=UNKNOWN[1334] msg=audit(1574867053.120:84665): prog-id=76 op=UNLOAD
        ...
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Co-developed-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Paul Moore <paul@paul-moore.com>
      Link: https://lore.kernel.org/bpf/20191206214934.11319-1-jolsa@kernel.org
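The BPF audit record in the raw output above carries just two fields, `prog-id` and `op`. As a rough illustration of how user space tooling might pick them out of a line, here is a minimal C sketch. The struct and the function name `parse_bpf_audit` are hypothetical, and the sketch assumes the record has exactly the `prog-id=<n> op=<LOAD|UNLOAD>` shape shown in the example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the two fields seen in the raw output above. */
struct bpf_audit_event {
    unsigned int prog_id;
    int is_load;            /* 1 for op=LOAD, 0 for op=UNLOAD */
};

/*
 * Parse one audit line. Returns 0 on success, -1 if the line has no
 * "prog-id=... op=..." pair or the op value is not LOAD/UNLOAD.
 */
int parse_bpf_audit(const char *line, struct bpf_audit_event *ev)
{
    char op[8];
    const char *p = strstr(line, "prog-id=");

    if (!p || sscanf(p, "prog-id=%u op=%7s", &ev->prog_id, op) != 2)
        return -1;

    if (strcmp(op, "LOAD") == 0)
        ev->is_load = 1;
    else if (strcmp(op, "UNLOAD") == 0)
        ev->is_load = 0;
    else
        return -1;

    return 0;
}
```

With the ID in hand, tooling could correlate the LOAD and UNLOAD events for the same program, as the commit message describes.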
    • bpf: Switch to offsetofend in BPF_PROG_TEST_RUN · b590cb5f
      Stanislav Fomichev authored
      Switch the existing pattern of "offsetof(..., member) + FIELD_SIZEOF(...,
      member)" to "offsetofend(..., member)", which does exactly what
      we need without all the copy-paste.
      Suggested-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20191210191933.105321-1-sdf@google.com
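To make the equivalence behind the offsetofend switch concrete, the following user space sketch compares the two spellings. The macro definitions are stand-ins written for this example (the kernel has its own), and `struct demo_ctx` is a hypothetical struct, not one touched by the patch.

```c
#include <stddef.h>

/* User space stand-ins for the kernel macros, defined here for illustration. */
#define FIELD_SIZEOF(type, member) (sizeof(((type *)0)->member))
#define offsetofend(type, member) \
    (offsetof(type, member) + FIELD_SIZEOF(type, member))

/* Hypothetical context struct used only in this example. */
struct demo_ctx {
    unsigned int len;
    unsigned int mark;
    unsigned long long tstamp;
};

/* Old pattern: the member name has to be repeated in two macros. */
size_t end_of_mark_old(void)
{
    return offsetof(struct demo_ctx, mark) +
           FIELD_SIZEOF(struct demo_ctx, mark);
}

/* New pattern: one macro, the member is named once. */
size_t end_of_mark_new(void)
{
    return offsetofend(struct demo_ctx, mark);
}
```

Both functions return the offset of the first byte past `mark`, so the switch is purely a readability change.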
    • libbpf: Bump libpf current version to v0.0.7 · 09c4708d
      Andrii Nakryiko authored
      A new development cycle starts; bump to v0.0.7 proactively.
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/20191209224022.3544519-1-andriin@fb.com
    • ARM: net: bpf: Improve prologue code sequence · c4533128
      Russell King authored
      Improve the prologue code sequence to be able to take advantage of
      64-bit stores, changing the code from:
      
        push    {r4, r5, r6, r7, r8, r9, fp, lr}
        mov     fp, sp
        sub     ip, sp, #80     ; 0x50
        sub     sp, sp, #600    ; 0x258
        str     ip, [fp, #-100] ; 0xffffff9c
        mov     r6, #0
        str     r6, [fp, #-96]  ; 0xffffffa0
        mov     r4, #0
        mov     r3, r4
        mov     r2, r0
        str     r4, [fp, #-104] ; 0xffffff98
        str     r4, [fp, #-108] ; 0xffffff94
      
      to the tighter:
      
        push    {r4, r5, r6, r7, r8, r9, fp, lr}
        mov     fp, sp
        mov     r3, #0
        sub     r2, sp, #80     ; 0x50
        sub     sp, sp, #600    ; 0x258
        strd    r2, [fp, #-100] ; 0xffffff9c
        mov     r2, #0
        strd    r2, [fp, #-108] ; 0xffffff94
        mov     r2, r0
      
      resulting in a saving of three instructions.
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/E1ieH2g-0004ih-Rb@rmk-PC.armlinux.org.uk
    • cxgb4: add support for high priority filters · c2193999
      Shahjada Abul Husain authored
      T6 has a separate region, known as the high priority filter region,
      that allows classifying packets going through the ULD path. So,
      query firmware for HPFILTER resources and enable the high
      priority offload filter support when it is available.
      Signed-off-by: Shahjada Abul Husain <shahjada@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • enetc: remove variable 'tc_max_sized_frame' set but not used · 6525b5ef
      Chen Wandun authored
      Fixes gcc '-Wunused-but-set-variable' warning:
      
      drivers/net/ethernet/freescale/enetc/enetc_qos.c: In function enetc_setup_tc_cbs:
      drivers/net/ethernet/freescale/enetc/enetc_qos.c:195:6: warning: variable tc_max_sized_frame set but not used [-Wunused-but-set-variable]
      
      Fixes: c431047c ("enetc: add support Credit Based Shaper(CBS) for hardware offload")
      Signed-off-by: Chen Wandun <chenwandun@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: add support for TLV device stats · ca866ee8
      Jakub Kicinski authored
      Device stats are currently hard coded in the PCI BAR0 layout.
      Add the ability to read them from the TLV area instead.
      Names for the stats are maintained by the driver, and their
      meaning is documented. This allows us to more easily add and
      remove device stats.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
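For readers unfamiliar with TLV (type-length-value) areas, the idea of looking up an entry by type can be sketched generically as below. This is not the NFP layout: the 2-byte little-endian type and length fields, and the names `get_le16` and `tlv_find`, are assumptions made purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 16-bit value from a byte buffer. */
static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Walk a TLV area and return a pointer to the value of the first entry
 * whose type matches 'want', storing its length in *len_out.
 * Returns NULL if no such entry exists or an entry is truncated.
 * Assumed (hypothetical) layout: 2-byte type, 2-byte length, then value.
 */
const uint8_t *tlv_find(const uint8_t *buf, size_t size, uint16_t want,
                        uint16_t *len_out)
{
    size_t off = 0;

    while (size - off >= 4) {
        uint16_t type = get_le16(buf + off);
        uint16_t len = get_le16(buf + off + 2);

        off += 4;
        if (len > size - off)
            return NULL;        /* value runs past the end of the area */
        if (type == want) {
            *len_out = len;
            return buf + off;
        }
        off += len;             /* skip this value, move to next header */
    }
    return NULL;
}
```

Because each entry is self-describing, new stats can be added or removed without changing a fixed register layout, which is the benefit the commit message points to.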
    • tcp: Cleanup duplicate initialization of sk->sk_state. · 5000b28b
      Kuniyuki Iwashima authored
      When a TCP socket is created, sk->sk_state is initialized to TCP_CLOSE
      twice: once in sock_init_data() and again in tcp_init_sock(). Since
      tcp_init_sock() is always called after sock_init_data(), it is not
      necessary to update sk->sk_state in tcp_init_sock().

      Before v2.1.8, the code of the two functions was in inet_create(). The
      v2.1.8 patch added tcp_v4/v6_init_sock() and thereby duplicated the
      initialization of sk->state.
      Signed-off-by: Kuniyuki Iwashima <kuni1840@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • enetc: add software timestamping · 4caefbce
      Michael Walle authored
      Provide a software TX timestamp and add it to the ethtool query
      interface.
      
      skb_tx_timestamp() is also needed if one would like to use PHY
      timestamping.
      Signed-off-by: Michael Walle <michael@walle.cc>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'tipc-introduce-variable-window-congestion-control' · bb9d8454
      David S. Miller authored
      Jon Maloy says:
      
      ====================
      tipc: introduce variable window congestion control
      
      We improve throughput greatly by introducing a variant of the Reno
      congestion control algorithm at the link level.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: introduce variable window congestion control · 16ad3f40
      Jon Maloy authored
      We introduce a simple variable window congestion control for links.
      The algorithm is inspired by the Reno algorithm, covering the 'slow
      start', 'congestion avoidance', and 'fast recovery' modes.
      
      - We introduce hard lower and upper window limits per link, still
        different and configurable per bearer type.
      
      - We introduce a 'slow start threshold' variable, initially set to
        the maximum window size.
      
      - We let a link start at the minimum congestion window, i.e. in slow
        start mode, and then let it grow rapidly (+1 per received ACK) until
        it reaches the slow start threshold and enters congestion avoidance
        mode.
      
      - In congestion avoidance mode we increment the congestion window for
        each window-size number of acked packets, up to a possible maximum
        equal to the configured maximum window.
      
      - For each non-duplicate NACK received, we drop back to fast recovery
        mode, by setting both the slow start threshold and the
        congestion window to (current_congestion_window / 2).
      
      - If the timeout handler finds that the transmit queue has not moved
        since the previous timeout, it drops the link back to slow start
        and forces a probe containing the last sent sequence number to be
        sent to the peer, so that the peer can discover the stale situation.
      
      This change does in reality have effect only on unicast ethernet
      transport, as we have seen that there is no room whatsoever for
      increasing the window max size for the UDP bearer.
      For now, we also choose to keep the limits for the broadcast link
      unchanged and equal.
      
      This algorithm seems to give a 50-100% throughput improvement for
      messages larger than MTU.
      Suggested-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
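The window rules listed in this commit message can be sketched as a small state machine. This is a reading of the message text only, not the TIPC implementation: all names (`cc_link`, `cc_on_ack`, and so on) are hypothetical, clamping the halved window to the lower limit is an assumption based on the "hard lower limit" wording, and leaving the threshold untouched on a stale timeout is likewise an assumption.

```c
/* Hypothetical per-link congestion state, named for this sketch only. */
struct cc_link {
    unsigned int min_win;   /* hard lower window limit */
    unsigned int max_win;   /* hard upper window limit */
    unsigned int ssthresh;  /* slow start threshold */
    unsigned int cwnd;      /* current congestion window */
    unsigned int acked;     /* ACKs counted in congestion avoidance */
};

/* Start at the minimum window with the threshold at the maximum. */
void cc_init(struct cc_link *l, unsigned int min_win, unsigned int max_win)
{
    l->min_win = min_win;
    l->max_win = max_win;
    l->ssthresh = max_win;
    l->cwnd = min_win;
    l->acked = 0;
}

void cc_on_ack(struct cc_link *l)
{
    if (l->cwnd < l->ssthresh) {
        l->cwnd++;                  /* slow start: +1 per received ACK */
        return;
    }
    /* Congestion avoidance: +1 per window-size number of ACKs. */
    if (++l->acked >= l->cwnd) {
        l->acked = 0;
        if (l->cwnd < l->max_win)
            l->cwnd++;
    }
}

/* Non-duplicate NACK: fast recovery, halve window and threshold. */
void cc_on_nack(struct cc_link *l)
{
    unsigned int half = l->cwnd / 2;

    if (half < l->min_win)          /* assumption: respect the lower limit */
        half = l->min_win;
    l->ssthresh = half;
    l->cwnd = half;
    l->acked = 0;
}

/* Transmit queue did not move since last timeout: back to slow start. */
void cc_on_stale_timeout(struct cc_link *l)
{
    l->cwnd = l->min_win;           /* assumption: ssthresh is left as-is */
    l->acked = 0;
}
```

For example, with limits 2 and 8 the window grows 2 to 8 over six ACKs, a NACK halves it to 4, and it then takes four further ACKs to reach 5.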
    • tipc: eliminate more unnecessary nacks and retransmissions · d3b09995
      Jon Maloy authored
      When we increase the link transmit window we often observe the following
      scenario:
      
      1) A STATE message bypasses a sequence of traffic packets and arrives
         at the receiver far ahead of them. STATE messages contain a
         'peers_snd_nxt' field to indicate which was the last packet sent
         from the peer. This mechanism is intended as a last resort for the
         receiver to detect missing packets, e.g., during very low traffic
         when there is no packet flow to help early loss detection.
      2) The receiving link compares the 'peers_snd_nxt' field to its own
         'rcv_nxt', finds that there is a gap, and immediately sends a
         NACK message back to the peer.
      3) When this NACK arrives at the sender, all the requested
         retransmissions are performed, since it is a first-time request.
      Just like in the scenario described in the previous commit this leads
      to many redundant retransmissions, with decreased throughput as a
      consequence.
      
      We fix this by adding two more conditions before we send a NACK in
      this situation. First, the deferred queue must be empty, so we cannot
      assume that the potential packet loss has already been detected by
      other means. Second, we check the 'peers_snd_nxt' field only in probe/
      probe_reply messages, thus turning this into a true mechanism of last
      resort as it was really meant to be.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
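The two added conditions can be read as a simple predicate, sketched below. The struct, the function name `should_send_nack`, and the use of plain unsigned sequence numbers (ignoring wraparound) are assumptions for illustration; this is not the code from the patch.

```c
#include <stdbool.h>

/* Hypothetical receive-side link state for this sketch. */
struct rx_link {
    unsigned int rcv_nxt;       /* next in-order sequence number expected */
    unsigned int deferred_len;  /* packets waiting in the deferred queue */
};

/*
 * Decide whether a STATE message should trigger a NACK.
 * Sequence number wraparound is deliberately ignored here.
 */
bool should_send_nack(const struct rx_link *l, unsigned int peers_snd_nxt,
                      bool is_probe_or_reply)
{
    /* Second condition: only probe/probe_reply messages are considered. */
    if (!is_probe_or_reply)
        return false;

    /* First condition: a non-empty deferred queue means the loss was
     * already detected by other means. */
    if (l->deferred_len != 0)
        return false;

    /* Last resort: the peer has sent past what we have received. */
    return peers_snd_nxt > l->rcv_nxt;
}
```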
    • tipc: eliminate gap indicator from ACK messages · 02288248
      Jon Maloy authored
      When we increase the link send window we sometimes observe the
      following scenario:
      
      1) A packet #N arrives out of order far ahead of a sequence of older
         packets which are still under way. The packet is added to the
         deferred queue.
      2) The missing packets arrive in sequence, and for each 16th of them
         an ACK is sent back to the sender, as it should be.
      3) When building those ACK messages, it is checked if there is a gap
         between the link's 'rcv_nxt' and the first packet in the deferred
         queue. This is always the case until packet number #N-1 arrives, and
         a 'gap' indicator is added, effectively turning them into NACK
         messages.
      4) When those NACKs arrive at the sender, all the requested
         retransmissions are done, since it is a first-time request.
      
      This sometimes leads to a huge number of redundant retransmissions,
      causing a drop in max throughput. This problem gets worse when a
      later commit introduces variable window congestion control,
      since it drops the link back to 'fast recovery' much more often
      than necessary.
      
      We now fix this by not sending any 'gap' indicator in regular ACK
      messages. We already have a mechanism for sending explicit NACKs
      in place, and this is sufficient to keep up the packet flow.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
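To see why the scenario in this commit message produces several NACK-like ACKs, the small simulation below counts them. It models only the description above: in-order packets arrive one at a time, an ACK is built every `ack_every` packets, and under the old behaviour an ACK carries a gap while `rcv_nxt` is still below the deferred packet. The function name and parameters are hypothetical.

```c
/*
 * Count ACKs that would carry a 'gap' indicator while the packets
 * rcv_nxt .. deferred_seq-1 arrive in order and packet 'deferred_seq'
 * already sits in the deferred queue. 'ack_every' must be non-zero.
 */
unsigned int count_gap_acks(unsigned int rcv_nxt, unsigned int deferred_seq,
                            unsigned int ack_every, int old_behaviour)
{
    unsigned int received = 0, gap_acks = 0;

    while (rcv_nxt < deferred_seq) {
        rcv_nxt++;              /* next in-order packet arrives */
        received++;
        if (received % ack_every != 0)
            continue;           /* no ACK due yet */

        /* An ACK is built now. The old code added a gap whenever one
         * still existed; the new code never does for regular ACKs. */
        if (old_behaviour && rcv_nxt < deferred_seq)
            gap_acks++;
    }
    return gap_acks;
}
```

With 50 missing packets and an ACK every 16th, the old behaviour yields three gap-carrying ACKs, each of which would request retransmission of packets that are merely still in flight.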
  4. 10 Dec, 2019 5 commits
    • ppp: Adjust indentation into ppp_async_input · 08cbc75f
      Nathan Chancellor authored
      Clang warns:
      
      ../drivers/net/ppp/ppp_async.c:877:6: warning: misleading indentation;
      statement is not part of the previous 'if' [-Wmisleading-indentation]
                                      ap->rpkt = skb;
                                      ^
      ../drivers/net/ppp/ppp_async.c:875:5: note: previous statement is here
                                      if (!skb)
                                      ^
      1 warning generated.
      
      This warning occurs because there is a space before the tab on this
      line. Clean up this entire block's indentation so that it is consistent
      with the Linux kernel coding style and clang no longer warns.
      
      Fixes: 6722e78c ("[PPP]: handle misaligned accesses")
      Link: https://github.com/ClangBuiltLinux/linux/issues/800
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: smc911x: Adjust indentation in smc911x_phy_configure · 5c61e223
      Nathan Chancellor authored
      Clang warns:
      
      ../drivers/net/ethernet/smsc/smc911x.c:939:3: warning: misleading
      indentation; statement is not part of the previous 'if'
      [-Wmisleading-indentation]
               if (!lp->ctl_rfduplx)
               ^
      ../drivers/net/ethernet/smsc/smc911x.c:936:2: note: previous statement
      is here
              if (lp->ctl_rspeed != 100)
              ^
      1 warning generated.
      
      This warning occurs because there is a space after the tab on this line.
      Remove it so that the indentation is consistent with the Linux kernel
      coding style and clang no longer warns.
      
      Fixes: 0a0c72c9 ("[PATCH] RE: [PATCH 1/1] net driver: Add support for SMSC LAN911x line of ethernet chips")
      Link: https://github.com/ClangBuiltLinux/linux/issues/796
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: tulip: Adjust indentation in {dmfe, uli526x}_init_module · fe06bf3d
      Nathan Chancellor authored
      Clang warns:
      
      ../drivers/net/ethernet/dec/tulip/uli526x.c:1812:3: warning: misleading
      indentation; statement is not part of the previous 'if'
      [-Wmisleading-indentation]
              switch (mode) {
              ^
      ../drivers/net/ethernet/dec/tulip/uli526x.c:1809:2: note: previous
      statement is here
              if (cr6set)
              ^
      1 warning generated.
      
      ../drivers/net/ethernet/dec/tulip/dmfe.c:2217:3: warning: misleading
      indentation; statement is not part of the previous 'if'
      [-Wmisleading-indentation]
              switch(mode) {
              ^
      ../drivers/net/ethernet/dec/tulip/dmfe.c:2214:2: note: previous
      statement is here
              if (cr6set)
              ^
      1 warning generated.
      
      This warning occurs because there is a space before the tab on these
      lines. Remove them so that the indentation is consistent with the Linux
      kernel coding style and clang no longer warns.
      
      While we are here, adjust the default block in dmfe_init_module to have
      a proper break between the label and assignment and add a space between
      the switch and opening parentheses to avoid a checkpatch warning.
      
      Fixes: e1c3e501 ("[PATCH] initialisation cleanup for ULI526x-net-driver")
      Link: https://github.com/ClangBuiltLinux/linux/issues/795
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'dp83867-fix-fifo-depth' · 80bfc3b4
      David S. Miller authored
      Dan Murphy says:
      
      ====================
      Fix Tx/Rx FIFO depth for DP83867
      
      The DP83867 supports both the RGMII and SGMII modes.  The Tx and Rx FIFO depths
      are configurable in these modes but may not be applicable for both modes.
      
      When the device is configured for RGMII mode the Tx FIFO depth is applicable
      and for SGMII mode both Tx and Rx FIFO depth settings are applicable.  When
      the driver was originally written only the RGMII device was available and there
      were no standard fifo-depth DT properties.
      
      The patchset converts the special ti,fifo-depth property to the standard
      tx-fifo-depth property while still allowing the ti,fifo-depth property to be
      set so as to maintain backward compatibility.
      
      In addition to this change, rx-fifo-depth property support was added; it is
      only written when the device is configured for SGMII mode.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: dp83867: Add rx-fifo-depth and tx-fifo-depth · e02d1816
      Dan Murphy authored
      This code changes the TI specific ti,fifo-depth to the common
      tx-fifo-depth property.  The tx depth is applicable for both RGMII and
      SGMII modes of operation.
      
      rx-fifo-depth was added as well but this is only applicable for SGMII
      mode.
      
      So in summary:
      - in RGMII mode, write the tx fifo depth only
      - in SGMII mode, write both the rx and tx fifo depths
      
      If a property is not populated in the device tree, its value is set
      to the default.
      Signed-off-by: Dan Murphy <dmurphy@ti.com>
      Reported-by: Adrian Bunk <bunk@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
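The mode-dependent rule summarized in this commit message can be sketched as follows. All names here are hypothetical and the default value is a placeholder, not the driver's. The lookup order (legacy `ti,fifo-depth` first, then `tx-fifo-depth`, then the default) is an assumption; the message only says that the legacy property remains accepted.

```c
#include <stdbool.h>

enum phy_mode { MODE_RGMII, MODE_SGMII };

/* Placeholder default, not the value used by the driver. */
#define FIFO_DEPTH_DEFAULT 1

/* Hypothetical view of the device tree properties that were found. */
struct dt_props {
    bool has_legacy; int legacy;   /* ti,fifo-depth */
    bool has_tx;     int tx;       /* tx-fifo-depth */
    bool has_rx;     int rx;       /* rx-fifo-depth */
};

struct fifo_cfg {
    int tx_depth;
    int rx_depth;
    bool write_rx;   /* rx depth is only written in SGMII mode */
};

struct fifo_cfg resolve_fifo(const struct dt_props *dt, enum phy_mode mode)
{
    struct fifo_cfg cfg;

    /* Tx depth applies in both modes; fall back to the default. */
    if (dt->has_legacy)
        cfg.tx_depth = dt->legacy;
    else if (dt->has_tx)
        cfg.tx_depth = dt->tx;
    else
        cfg.tx_depth = FIFO_DEPTH_DEFAULT;

    /* Rx depth is resolved the same way but only used for SGMII. */
    cfg.rx_depth = dt->has_rx ? dt->rx : FIFO_DEPTH_DEFAULT;
    cfg.write_rx = (mode == MODE_SGMII);

    return cfg;
}
```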