1. 17 Feb, 2019 27 commits
    • Merge branch 'devlink-add-the-ability-to-update-device-flash' · eaec2efb
      David S. Miller authored
      Jakub Kicinski says:
      
      ====================
      devlink: add the ability to update device flash
      
      This series is the second step to allow troubleshooting and recovering
      devices in bad state without the use of netdevs as handles.  We can
      already query FW versions over devlink, now we add the ability to update
      the FW.  This will allow drivers to implement some form of "limp-mode"
      where the device can't really be used for networking and hence has no
      netdev, but we can interrogate it over devlink and fix the broken FW.
      
      Small but nice advantage of devlink is that it only holds the devlink
      instance lock during flashing, unlike ethtool which holds rtnl_lock().
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: devlink: allow flashing the device via devlink · 5c5696f3
      Jakub Kicinski authored
      Devlink now allows updating device flash.  Implement this
      callback.
      
      Compared to ethtool update we no longer have to release
      the networking locks - devlink doesn't take them.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ethtool: add compat for flash update · 4eceba17
      Jakub Kicinski authored
      If the driver does not support the ethtool flash update operation,
      call into devlink.
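
The fallback described above can be modelled in user space roughly as follows. This is only a sketch: `struct flash_ops`, `devlink_flash_update()`, `driver_flash()` and `ethtool_flash()` are hypothetical stand-ins invented for illustration, not the kernel's real structures or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver's ethtool operations. */
struct flash_ops {
    int (*flash_device)(const char *file);
};

/* Hypothetical devlink-side handler; counts how often it is reached. */
static int devlink_calls;

static int devlink_flash_update(const char *file)
{
    (void)file;
    devlink_calls++;
    return 0;
}

/* Example driver callback, used only to exercise the non-fallback path. */
static int driver_flash(const char *file)
{
    (void)file;
    return 7;
}

/* Prefer the driver's ethtool callback; fall back to devlink if absent. */
static int ethtool_flash(const struct flash_ops *ops, const char *file)
{
    assert(file != NULL);
    if (ops == NULL || ops->flash_device == NULL)
        return devlink_flash_update(file);   /* compat path */
    return ops->flash_device(file);
}
```

A driver that leaves `flash_device` unset is routed to the devlink handler, while a driver that sets it keeps its existing behaviour.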
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • devlink: add flash update command · 76726ccb
      Jakub Kicinski authored
      Add devlink flash update command. Advanced NICs have firmware
      stored in flash and often cryptographically secured. Updating
      that flash is handled by management firmware. Ethtool has a
      flash update command which served us well, however, it has two
      shortcomings:
       - it takes rtnl_lock unnecessarily - really flash update has
         nothing to do with networking, so using a networking device
         as a handle is suboptimal, which leads us to the second one:
       - it requires a functioning netdev - in case device enters an
         error state and can't spawn a netdev (e.g. communication
         with the device fails) there is no netdev to use as a handle
         for flashing.
      
      Devlink already has the ability to report the firmware versions,
      now with the ability to update the firmware/flash we will be
      able to recover devices in bad state.
      
      To enable updates of sub-components of the FW allow passing
      component name.  This name should correspond to one of the
      versions reported in devlink info.
      
      v1: - replace target id with component name (Jiri).
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-phy-improve-and-use-phy_resolve_aneg_linkmode' · 8e31c474
      David S. Miller authored
      Heiner Kallweit says:
      
      ====================
      net: phy: improve and use phy_resolve_aneg_linkmode
      
      Improve phy_resolve_aneg_linkmode and use it in genphy_read_status.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: use phy_resolve_aneg_linkmode in genphy_read_status · 5502b218
      Heiner Kallweit authored
      Now that we have phy_resolve_aneg_linkmode() we can make
      genphy_read_status() much simpler.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: improve phy_resolve_aneg_linkmode · a2703de7
      Heiner Kallweit authored
      We have the settings array of modes which is sorted based on aneg
      priority. Instead of checking each mode manually let's simply iterate
      over the sorted settings.
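
The idea of walking a priority-sorted table instead of testing each mode by hand can be sketched as below. The table contents, the bit numbering and the names `struct mode_setting` and `resolve_aneg()` are assumptions made for this illustration and do not mirror the kernel's actual settings array.

```c
#include <assert.h>
#include <stddef.h>

struct mode_setting {
    int speed;          /* Mbit/s */
    int full_duplex;    /* 1 = full, 0 = half */
    unsigned int bit;   /* hypothetical bit position in a link-mode mask */
};

/* Sorted from most to least preferred autonegotiation result. */
static const struct mode_setting settings[] = {
    { 1000, 1, 5 }, { 1000, 0, 4 },
    {  100, 1, 3 }, {  100, 0, 2 },
    {   10, 1, 1 }, {   10, 0, 0 },
};

/* Return the best mode advertised by both sides, or NULL if none. */
static const struct mode_setting *resolve_aneg(unsigned int adv,
                                               unsigned int lp)
{
    unsigned int common = adv & lp;
    size_t i;

    for (i = 0; i < sizeof(settings) / sizeof(settings[0]); i++) {
        assert(settings[i].bit < 32);
        if (common & (1u << settings[i].bit))
            return &settings[i];
    }
    return NULL;
}
```

Because the table is ordered by priority, the first match is the resolved mode, and supporting a new mode only requires a new table entry.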
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: cgroup: verify that filter is not NULL during walk · 8b58d12f
      Vlad Buslov authored
      Check that filter is not NULL before passing it to tcf_walker->fn()
      callback in cls_cgroup_walk(). This can happen when cls_cgroup_change()
      failed to set first filter.
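
A user-space model of such a guarded walk might look like the sketch below. `struct walker`, `single_filter_walk()` and `count_cb()` are hypothetical names used only for illustration; the real walker takes additional arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified version of a classifier walker. */
struct walker {
    int (*fn)(void *filter, struct walker *w);
    int count;
    int skip;
    int stop;
};

/* Example callback: bumps the int that the "filter" pointer refers to. */
static int count_cb(void *filter, struct walker *w)
{
    (void)w;
    ++*(int *)filter;
    return 0;
}

/* Walk a classifier that holds at most one filter. */
static void single_filter_walk(void *filter, struct walker *w)
{
    assert(w != NULL && w->fn != NULL);
    if (w->count < w->skip)
        goto skip;
    if (filter == NULL)     /* no filter was installed: nothing to report */
        return;
    if (w->fn(filter, w) < 0)
        w->stop = 1;
skip:
    w->count++;
}
```

With the NULL test in place, the callback never sees a filter pointer that a failed change operation left unset.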
      
      Fixes: ed76f5ed ("net: sched: protect filter_chain list with filter_chain_lock mutex")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: matchall: verify that filter is not NULL in mall_walk() · d66022cd
      Vlad Buslov authored
      Check that filter is not NULL before passing it to tcf_walker->fn()
      callback. This can happen when mall_change() failed to offload filter to
      hardware.
      
      Fixes: ed76f5ed ("net: sched: protect filter_chain list with filter_chain_lock mutex")
      Reported-by: Ido Schimmel <idosch@mellanox.com>
      Tested-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: route: don't set arg->stop in route4_walk() when empty · 3027ff41
      Vlad Buslov authored
      Some classifiers set arg->stop in their implementation of tp->walk() API
      when empty. Most classifiers do not adhere to that convention. Do not
      set arg->stop in route4_walk() to unify tp->walk() behavior among
      classifier implementations.
      
      Fixes: ed76f5ed ("net: sched: protect filter_chain list with filter_chain_lock mutex")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: fw: don't set arg->stop in fw_walk() when empty · 31a99848
      Vlad Buslov authored
      Some classifiers set arg->stop in their implementation of tp->walk() API
      when empty. Most classifiers do not adhere to that convention. Do not
      set arg->stop in fw_walk() to unify tp->walk() behavior among classifier
      implementations.
      
      Fixes: ed76f5ed ("net: sched: protect filter_chain list with filter_chain_lock mutex")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: caif: use skb helpers instead of open-coding them · 1eb00162
      Jann Horn authored
      Use existing skb_put_data() and skb_trim() instead of open-coding them,
      with the skb_put_data() first so that logically, `skb` still contains the
      data to be copied in its data..tail area when skb_put_data() reads it.
      This change on its own is a cleanup, and it is also necessary for potential
      future integration of skbuffs with things like KASAN.
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: core: Extend thermal module with per QSFP module thermal zones · 6a79507c
      Vadim Pasternak authored
      Add a dedicated thermal zone for each QSFP/SFP module. The current
      temperature is obtained from the module's temperature sensor and the
      trip points are set based on the warning and critical thresholds
      read from the module.
      
      A cooling device (fan) is bound to all the thermal zones. The
      thermal zone governor is set to user space in order to avoid
      collisions between thermal zones.
      For example, one thermal zone might want to increase the speed of
      the fan, whereas another one would like to decrease it.
      
      Deferring this decision to user space allows the user to make
      the most suitable decision.
      Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'neigh-tracepoints' · 3c136c54
      David S. Miller authored
      Roopa Prabhu says:
      
      ====================
      tracepoints in neighbor subsystem
      
      Roopa Prabhu (2):
        trace: events: add a few neigh tracepoints
        neigh: hook tracepoints in neigh update code
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • neigh: hook tracepoints in neigh update code · 56dd18a4
      Roopa Prabhu authored
      hook tracepoints at the end of functions that
      update a neigh entry. neigh_update gets an additional
      tracepoint to trace the update flags and old and new
      neigh states.
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • trace: events: add a few neigh tracepoints · 9c03b282
      Roopa Prabhu authored
      The goal here is to trace neigh state changes covering all possible
      neigh update paths. Plus have a specific trace point in neigh_update
      to cover flags sent to neigh_update.
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-phy-add-and-use-genphy_c45_an_config_an' · 9e8ccd89
      David S. Miller authored
      Heiner Kallweit says:
      
      ====================
      net: phy: add and use genphy_c45_an_config_an
      
      This series adds genphy_c45_an_config_an() and uses it in the
      marvell10g driver. In addition patch 4 aligns the aneg configuration
      with what is done in genphy_config_aneg().
      
      v2:
      - in patch 2 changed function name to genphy_c45_an_config_aneg
      - in patch 3 add a comment regarding 1000BaseT vendor registers
      
      v3:
      - rebase patch 3
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: marvell10g: check for newly set aneg · 3ce2a027
      Heiner Kallweit authored
      Even if the advertisement registers content didn't change, we may have
      just switched to aneg, and therefore have to trigger an aneg restart.
      This matches the behavior of genphy_config_aneg().
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: marvell10g: use genphy_c45_an_config_aneg · 3de97f3c
      Andrew Lunn authored
      Use new function genphy_c45_an_config_aneg() in mv3310_config_aneg().
      
      v2:
      - add a comment regarding 1000BaseT vendor registers
      v3:
      - rebased
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      [hkallweit1@gmail.com: patch split]
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: add genphy_c45_an_config_aneg · 9a5dc8af
      Andrew Lunn authored
      C45 configuration of 10/100 and multi-gigabit auto negotiation
      advertisement is standardized. Configuration of 1000Base-T however
      appears to be vendor specific. Move the generic code out of the
      Marvell driver into the common phy-c45.c file.
      
      v2:
      - change function name to genphy_c45_an_config_aneg
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      [hkallweit1@gmail.com: use new helper linkmode_adv_to_mii_10gbt_adv_t and split patch]
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: add helper linkmode_adv_to_mii_10gbt_adv_t · 744e458a
      Heiner Kallweit authored
      Add a helper linkmode_adv_to_mii_10gbt_adv_t(), similar to
      linkmode_adv_to_mii_adv_t.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 885e6319
      David S. Miller authored
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf-next 2019-02-16
      
      The following pull-request contains BPF updates for your *net-next* tree.
      
      The main changes are:
      
      1) numerous libbpf API improvements, from Andrii, Andrey, Yonghong.
      
      2) test all bpf progs in alu32 mode, from Jiong.
      
      3) skb->sk access and bpf_sk_fullsock(), bpf_tcp_sock() helpers, from Martin.
      
      4) support for IP encap in lwt bpf progs, from Peter.
      
      5) remove XDP_QUERY_XSK_UMEM dead code, from Jan.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tools/libbpf: support bigger BTF data sizes · 5aab392c
      Andrii Nakryiko authored
      While it's understandable why kernel limits number of BTF types to 65535
      and size of string section to 64KB, in libbpf as user-space library it's
      too restrictive. E.g., pahole converting DWARF to BTF type information
      for Linux kernel generates more than 3 million BTF types and more than
      3MB of strings, before deduplication. So to allow btf__dedup() to do its
      work, we need to be able to load bigger BTF sections using btf__new().
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests: bpf: test_lwt_ip_encap: add negative tests. · 9d6b3584
      Peter Oskolkov authored
      As requested by David Ahern:
      
      - add negative tests (no routes, explicitly unreachable destinations)
        to exercise error handling code paths;
      - do not exit on test failures, but instead print a summary of
        passed/failed tests at the end.
      
      Future patches will add TSO and VRF tests.
      Signed-off-by: Peter Oskolkov <posk@google.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • net: stmmac: use correct define to get rx timestamp on GMAC4 · f186a82b
      Alexandre Torgue authored
      In dwmac4_wrback_get_rx_timestamp_status we are looking for an RX timestamp.
      Receive descriptors are handled there, so we should use the defines
      related to receive descriptors. This does not change the functional behavior,
      as RDES3_RDES1_VALID=TDES3_RS1V=BIT(26), but it makes the code easier to read.
      Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • atm: clean up vcc_seq_next() · d0edde8d
      Dan Carpenter authored
      It's confusing to call PTR_ERR(v).  The PTR_ERR() function is basically
      a fancy cast to long so it makes you wonder, was IS_ERR() intended?  But
      that doesn't make sense because vcc_walk() doesn't return error
      pointers.
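
To illustrate why PTR_ERR() on an ordinary pointer is confusing, here is a small user-space model of the error-pointer idiom. The lowercase helpers `err_ptr()`, `ptr_err()` and `is_err()` are stand-ins written for this sketch, assuming the usual convention that the top 4095 address values encode negative errno codes.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr(long error)
{
    assert(error < 0 && error >= -MAX_ERRNO);
    return (void *)(intptr_t)error;
}

/* Only a cast: it does not check whether ptr encodes an error at all. */
static inline long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True only for pointers in the reserved error range. */
static inline int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

In this model `ptr_err()` returns a meaningful code only when `is_err()` holds; on a valid or NULL pointer it just yields the address as a number, which is why calling it on a value that is never an error pointer reads like a mistake.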
      
      This patch doesn't affect runtime, it's just a cleanup.
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sock: consistent handling of extreme SO_SNDBUF/SO_RCVBUF values · 4057765f
      Guillaume Nault authored
      SO_SNDBUF and SO_RCVBUF (and their *BUFFORCE version) may overflow or
      underflow their input value. This patch aims at providing explicit
      handling of these extreme cases, to get a clear behaviour even with
      values bigger than INT_MAX / 2 or lower than INT_MIN / 2.
      
      For simplicity, only SO_SNDBUF and SO_SNDBUFFORCE are described here,
      but the same explanation and fix apply to SO_RCVBUF and SO_RCVBUFFORCE
      (with 'SNDBUF' replaced by 'RCVBUF' and 'wmem_max' by 'rmem_max').
      
      Overflow of positive values
      ===========================
      
      When handling SO_SNDBUF or SO_SNDBUFFORCE, if 'val' exceeds
      INT_MAX / 2, the buffer size is set to its minimum value because
      'val * 2' overflows, and max_t() considers that it's smaller than
      SOCK_MIN_SNDBUF. For SO_SNDBUF, this can only happen with
      net.core.wmem_max > INT_MAX / 2.
      
      SO_SNDBUF and SO_SNDBUFFORCE are actually designed to let users probe
      for the maximum buffer size by setting an arbitrary large number that
      gets capped to the maximum allowed/possible size. Having the upper
      half of the positive integer space to potentially reduce the buffer
      size to its minimum value defeats this purpose.
      
      This patch caps the base value to INT_MAX / 2, so that bigger values
      don't overflow and keep setting the buffer size to its maximum.
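
The capped computation for SO_SNDBUF can be modelled in user space as in the sketch below. `sndbuf_after_patch()` is a hypothetical name, and `MIN_LEN` is a placeholder for SOCK_MIN_SNDBUF, whose real value is kernel-specific; the unsigned comparison against wmem_max is an assumption chosen to reproduce the described treatment of negative values.

```c
#include <assert.h>
#include <limits.h>

#define MIN_LEN 4608   /* placeholder for SOCK_MIN_SNDBUF, not the real value */

/* Model of SO_SNDBUF with the cap: returns the resulting buffer size. */
static int sndbuf_after_patch(int val, int wmem_max)
{
    assert(wmem_max >= MIN_LEN);

    /* Compared as unsigned, a negative val looks larger than wmem_max. */
    if ((unsigned int)val > (unsigned int)wmem_max)
        val = wmem_max;

    /* Cap the base value so that doubling cannot overflow an int. */
    if (val > INT_MAX / 2)
        val = INT_MAX / 2;

    val *= 2;
    return val < MIN_LEN ? MIN_LEN : val;
}
```

With wmem_max set to INT_MAX, a request of INT_MAX yields INT_MAX - 1 in this model rather than collapsing to the minimum size.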
      
      Underflow of negative values
      ============================
      
      For negative numbers, SO_SNDBUF always considers them bigger than
      net.core.wmem_max, which is bounded by [SOCK_MIN_SNDBUF, INT_MAX].
      Therefore such values are set to net.core.wmem_max and we're back to
      the behaviour of positive integers described above (return maximum
      buffer size if wmem_max <= INT_MAX / 2, return SOCK_MIN_SNDBUF
      otherwise).
      
      However, SO_SNDBUFFORCE behaves differently. The user value is
      directly multiplied by two and compared with SOCK_MIN_SNDBUF. If
      'val * 2' doesn't underflow or if it underflows to a value smaller
      than SOCK_MIN_SNDBUF then buffer size is set to its minimum value.
      Otherwise the buffer size is set to the underflowed value.
      
      This patch treats negative values passed to SO_SNDBUFFORCE as null, to
      prevent underflows. Therefore negative values now always set the buffer
      size to its minimum value.
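
A comparable user-space sketch for SO_SNDBUFFORCE follows, under the same caveats: `sndbufforce_after_patch()` is a hypothetical name and `MIN_LEN` is a placeholder for SOCK_MIN_SNDBUF rather than its real value.

```c
#include <assert.h>
#include <limits.h>

#define MIN_LEN 4608   /* placeholder for SOCK_MIN_SNDBUF, not the real value */

/* Model of SO_SNDBUFFORCE with negative values treated as zero. */
static int sndbufforce_after_patch(int val)
{
    /* Negative requests become 0, so doubling can never underflow. */
    if (val < 0)
        val = 0;

    /* Large requests are capped so that doubling cannot overflow. */
    if (val > INT_MAX / 2)
        val = INT_MAX / 2;

    assert(val >= 0 && val <= INT_MAX / 2);
    val *= 2;
    return val < MIN_LEN ? MIN_LEN : val;
}
```

Every negative input, including INT_MIN, ends up at the minimum size in this model, while INT_MAX ends up at INT_MAX - 1.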
      
      Even though SO_SNDBUF behaves inconsistently by setting buffer size to
      the maximum value when passed a negative number, no attempt is made to
      modify this behaviour. There may exist some programs that rely on using
      negative numbers to set the maximum buffer size. Avoiding overflows
      because of extreme net.core.wmem_max values is the most we can do here.
      
      Summary of altered behaviours
      =============================
      
      val      : user-space value passed to setsockopt()
      val_uf   : the underflowed value resulting from doubling val when
                 val < INT_MIN / 2
      wmem_max : short for net.core.wmem_max
      val_cap  : min(val, wmem_max)
      min_len  : minimal buffer length (that is, SOCK_MIN_SNDBUF)
      max_len  : maximal possible buffer length, regardless of wmem_max (that
                 is, INT_MAX - 1)
      ^^^^     : altered behaviour
      
      SO_SNDBUF:
      +-------------------------+-------------+------------+----------------+
      |       CONDITION         | OLD RESULT  | NEW RESULT |    COMMENT     |
      +-------------------------+-------------+------------+----------------+
      | val < 0 &&              |             |            | No overflow,   |
      | wmem_max <= INT_MAX/2   | wmem_max*2  | wmem_max*2 | keep original  |
      |                         |             |            | behaviour      |
      +-------------------------+-------------+------------+----------------+
      | val < 0 &&              |             |            | Cap wmem_max   |
      | INT_MAX/2 < wmem_max    | min_len     | max_len    | to prevent     |
      |                         |             | ^^^^^^^    | overflow       |
      +-------------------------+-------------+------------+----------------+
      | 0 <= val <= min_len/2   | min_len     | min_len    | Ordinary case  |
      +-------------------------+-------------+------------+----------------+
      | min_len/2 < val &&      | val_cap*2   | val_cap*2  | Ordinary case  |
      | val_cap <= INT_MAX/2    |             |            |                |
      +-------------------------+-------------+------------+----------------+
      | min_len/2 < val &&      |             |            | Cap val_cap    |
      | INT_MAX/2 < val_cap     | min_len     | max_len    | again to       |
      | (implies that           |             | ^^^^^^^    | prevent        |
      | INT_MAX/2 < wmem_max)   |             |            | overflow       |
      +-------------------------+-------------+------------+----------------+
      
      SO_SNDBUFFORCE:
      +------------------------------+---------+---------+------------------+
      |          CONDITION           | BEFORE  | AFTER   |     COMMENT      |
      |                              | PATCH   | PATCH   |                  |
      +------------------------------+---------+---------+------------------+
      | val < INT_MIN/2 &&           | min_len | min_len | Underflow with   |
      | val_uf <= min_len            |         |         | no consequence   |
      +------------------------------+---------+---------+------------------+
      | val < INT_MIN/2 &&           | val_uf  | min_len | Set val to 0 to  |
      | val_uf > min_len             |         | ^^^^^^^ | avoid underflow  |
      +------------------------------+---------+---------+------------------+
      | INT_MIN/2 <= val < 0         | min_len | min_len | No underflow     |
      +------------------------------+---------+---------+------------------+
      | 0 <= val <= min_len/2        | min_len | min_len | Ordinary case    |
      +------------------------------+---------+---------+------------------+
      | min_len/2 < val <= INT_MAX/2 | val*2   | val*2   | Ordinary case    |
      +------------------------------+---------+---------+------------------+
      | INT_MAX/2 < val              | min_len | max_len | Cap val to       |
      |                              |         | ^^^^^^^ | prevent overflow |
      +------------------------------+---------+---------+------------------+
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 16 Feb, 2019 13 commits