1. 04 Dec, 2018 1 commit
  2. 03 Dec, 2018 25 commits
    • Merge branch 'udp-msg_zerocopy' · 6e360f73
      David S. Miller authored
      Willem de Bruijn says:
      
      ====================
      udp msg_zerocopy
      
      Enable MSG_ZEROCOPY for udp sockets
      
      Patch 1/3 is the main patch, a rework of RFC patch
        http://patchwork.ozlabs.org/patch/899630/
        more details in the patch commit message
      
      Patch 2/3 is an optimization to remove a branch from the UDP hot path
        and refcount_inc/refcount_dec_and_test pair when zerocopy is used.
        This used to be included in the first patch in v2.
      
      Patch 3/3 runs the already existing udp zerocopy tests
        as part of kselftest
      
      See also recent Linux Plumbers presentation
        https://linuxplumbersconf.org/event/2/contributions/106/attachments/104/128/willemdebruijn-lpc2018-udpgso-presentation-20181113.pdf
      
      Changes:
        v1 -> v2
          - Fixup reverse christmas tree violation
        v2 -> v3
          - Split refcount avoidance optimization into separate patch
            - Fix refcount leak on error in fragmented case
              (thanks to Paolo Abeni for pointing this one out!)
            - Fix refcount inc on zero
        v3 -> v4
          - Move skb_zcopy_set below the only kfree_skb that might cause
            a premature uarg destroy before skb_zerocopy_put_abort
            - Move the entire skb_shinfo assignment block, to keep that
              cacheline access in one place
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: extend zerocopy tests to udp · db63e489
      Willem de Bruijn authored
      Both msg_zerocopy and udpgso_bench have udp zerocopy variants.
      Exercise these as part of the standard kselftest run.
      
      With udp, msg_zerocopy has no control channel. Ensure that the
      receiver exits after the sender by accounting for the initial
      delay in starting them (in msg_zerocopy.sh).
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: elide zerocopy operation in hot path · 52900d22
      Willem de Bruijn authored
      With MSG_ZEROCOPY, each skb holds a reference to a struct ubuf_info.
      Release of its last reference triggers a completion notification.
      
      The TCP stack in tcp_sendmsg_locked holds an extra ref independent of
      the skbs, because it can build, send and free skbs within its loop,
      possibly reaching refcount zero and freeing the ubuf_info too soon.
      
      The UDP stack currently also takes this extra ref, but does not need
      it as all skbs are sent after return from __ip(6)_append_data.
      
      Avoid the extra refcount_inc and refcount_dec_and_test, and generally
      the sock_zerocopy_put in the common path, by passing the initial
      reference to the first skb.
      
      This approach is taken instead of initializing the refcount to 0, as
      that would generate error "refcount_t: increment on 0" on the
      next skb_zcopy_set.
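
The reference handoff described above can be illustrated with a toy user-space model. This is only a sketch, not kernel code: `UbufInfo`, `send_with_extra_ref` and `send_with_handoff` are hypothetical names, and the `ref_ops` counter exists purely to compare how many refcount operations each scheme performs.

```python
class UbufInfo:
    """Toy stand-in for struct ubuf_info: completes when the refcount hits zero."""

    def __init__(self):
        self.refcnt = 1         # initial reference, owned by the sender
        self.completed = False  # set when the completion notification would fire
        self.ref_ops = 0        # number of inc/dec operations (illustration only)

    def get(self):
        if self.refcnt == 0:
            raise RuntimeError("refcount_t: increment on 0")
        self.refcnt += 1
        self.ref_ops += 1

    def put(self):
        self.refcnt -= 1
        self.ref_ops += 1
        if self.refcnt == 0:
            self.completed = True


def send_with_extra_ref(n_skbs):
    """Sender keeps its own reference while the skbs are built."""
    uarg = UbufInfo()
    for _ in range(n_skbs):
        uarg.get()              # every skb takes a reference
    uarg.put()                  # sender drops its extra reference
    for _ in range(n_skbs):
        uarg.put()              # skbs are freed after transmission
    return uarg


def send_with_handoff(n_skbs):
    """The first skb inherits the initial reference; later skbs take their own."""
    uarg = UbufInfo()
    for i in range(n_skbs):
        if i > 0:
            uarg.get()
    for _ in range(n_skbs):
        uarg.put()
    return uarg
```

In this model a single-skb send needs three refcount operations with the extra reference and one with the handoff, which is the increment/decrement pair the patch avoids.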
      
      Changes
        v3 -> v4
          - Move skb_zcopy_set below the only kfree_skb that might cause
            a premature uarg destroy before skb_zerocopy_put_abort
            - Move the entire skb_shinfo assignment block, to keep that
              cacheline access in one place
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: msg_zerocopy · b5947e5d
      Willem de Bruijn authored
      Extend zerocopy to udp sockets. Allow setting sockopt SO_ZEROCOPY and
      interpret flag MSG_ZEROCOPY.
      
      This patch was previously part of the zerocopy RFC patchsets. Zerocopy
      is not effective at small MTU. With segmentation offload building
      larger datagrams, the benefit of page flipping outweighs the cost of
      generating a completion notification.
      
      tools/testing/selftests/net/msg_zerocopy.sh after applying follow-on
      test patch and making skb_orphan_frags_rx same as skb_orphan_frags:
      
          ipv4 udp -t 1
          tx=191312 (11938 MB) txc=0 zc=n
          rx=191312 (11938 MB)
          ipv4 udp -z -t 1
          tx=304507 (19002 MB) txc=304507 zc=y
          rx=304507 (19002 MB)
          ok
          ipv6 udp -t 1
          tx=174485 (10888 MB) txc=0 zc=n
          rx=174485 (10888 MB)
          ipv6 udp -z -t 1
          tx=294801 (18396 MB) txc=294801 zc=y
          rx=294801 (18396 MB)
          ok
      
      Changes
        v1 -> v2
          - Fixup reverse christmas tree violation
        v2 -> v3
          - Split refcount avoidance optimization into separate patch
            - Fix refcount leak on error in fragmented case
              (thanks to Paolo Abeni for pointing this one out!)
            - Fix refcount inc on zero
            - Test sock_flag SOCK_ZEROCOPY directly in __ip_append_data.
              This is needed since commit 5cf4a853 ("tcp: really ignore
              MSG_ZEROCOPY if no SO_ZEROCOPY") did the same for tcp.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'wireless-drivers-next-for-davem-2018-11-30' of... · ce01a56b
      David S. Miller authored
      Merge tag 'wireless-drivers-next-for-davem-2018-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
      
      Kalle Valo says:
      
      ====================
      wireless-drivers-next patches for 4.21
      
      First set of patches for 4.21. Most notable here is support for
      Quantenna's QSR1000/QSR2000 chipsets and more flexible ways to provide
      nvram files for brcmfmac.
      
      Major changes:
      
      brcmfmac
      
      * add support for first trying to get a board specific nvram file
      
      * add support for getting nvram contents from EFI variables
      
      qtnfmac
      
      * use single PCIe driver for all platforms and rename
        Kconfig option CONFIG_QTNFMAC_PEARL_PCIE to CONFIG_QTNFMAC_PCIE
      
      * add support for QSR1000/QSR2000 (Topaz) family of chipsets
      
      ath10k
      
      * add support for WCN3990 firmware crash recovery
      
      * add firmware memory dump support for QCA4019
      
      wil6210
      
      * add firmware error recovery while in AP mode
      
      ath9k
      
      * remove experimental notice from dynack feature
      
      iwlwifi
      
      * PCI IDs for some new 9000-series cards
      
      * improve antenna usage on connection problems
      
      * new firmware debugging infrastructure
      
      * some more work on 802.11ax
      
      * improve support for multiple RF modules with 22000 devices
      
      cordic
      
      * move cordic macros and defines to a public header file
      
      * convert brcmsmac and b43 to fully use cordic library
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'davinci_emac-read-the-MAC-address-from-nvmem' · 37a0bc39
      David S. Miller authored
      Bartosz Golaszewski says:
      
      ====================
      davinci_emac: read the MAC address from nvmem
      
      This series is part of a bigger series that aims at removing the platform
      data structure from the at24 EEPROM driver[1].
      
      We provide a generalized version of of_get_nvmem_mac_address(), switch the
      only user of the of_ variant to using it, remove the previous
      implementation and use the new routine in the davinci_emac driver.
      
      [1] https://lkml.org/lkml/2018/11/13/884
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: davinci_emac: use nvmem_get_mac_address() · 18dbfc81
      Bartosz Golaszewski authored
      All DaVinci boards still supported in board files now define nvmem
      cells containing the MAC address. We want to stop using the setup
      callback from at24 so the MAC address for those users will no longer
      be provided over platform data. If we didn't get a valid MAC in pdata,
      try nvmem before resorting to a random MAC.
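
The fallback order (platform data, then nvmem, then a random address) can be sketched in user space as follows. This is an illustrative model rather than the driver code: `pick_mac` and its `read_nvmem` callback are hypothetical, and the validity check only mirrors the usual "not all-zero, not multicast" rule.

```python
import os


def is_valid_ether_addr(mac: bytes) -> bool:
    """Valid here means: 6 bytes, not all-zero, multicast bit clear."""
    return len(mac) == 6 and any(mac) and not (mac[0] & 0x01)


def random_ether_addr() -> bytes:
    """Random unicast address with the locally administered bit set."""
    raw = bytearray(os.urandom(6))
    raw[0] = (raw[0] & 0xFE) | 0x02
    return bytes(raw)


def pick_mac(pdata_mac, read_nvmem):
    """Return (mac, source), trying pdata, then nvmem, then a random MAC."""
    if pdata_mac is not None and is_valid_ether_addr(pdata_mac):
        return pdata_mac, "pdata"
    try:
        nvmem_mac = read_nvmem()   # hypothetical callback standing in for nvmem
    except OSError:
        nvmem_mac = None
    if nvmem_mac is not None and is_valid_ether_addr(nvmem_mac):
        return nvmem_mac, "nvmem"
    return random_ether_addr(), "random"
```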
      Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • of: net: kill of_get_nvmem_mac_address() · afa64a72
      Bartosz Golaszewski authored
      We've switched all users to nvmem_get_mac_address(). Remove the now
      dead code.
      Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: cadence: switch to using nvmem_get_mac_address() · cce41b8f
      Bartosz Golaszewski authored
      We now have a generalized helper routine to read the MAC address from
      nvmem which takes struct device as argument. The nvmem subsystem will
      then try device tree first before all other potential providers.
      Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
      Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: provide nvmem_get_mac_address() · 0e839df9
      Bartosz Golaszewski authored
      We already have of_get_nvmem_mac_address() but some non-DT systems want
      to read the MAC address from NVMEM too. Implement a generalized routine
      that takes struct device as argument.
      Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: detect when object movement between tables might have invalidated a lookup · 82208d0d
      NeilBrown authored
      Some users of rhashtables might need to move an object from one table
      to another -  this appears to be the reason for the incomplete usage
      of NULLS markers.
      
      To support these, we store a unique NULLS_MARKER at the end of
      each chain, and when a search fails to find a match, we check
      if the NULLS marker found was the expected one.  If not, the search
      may not have examined all objects in the target bucket, so it is
      repeated.
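
The retry rule can be shown with a small model. This is a sketch under stated assumptions, not the rhashtable implementation: `Nulls`, `Node`, `walk` and `lookup` are hypothetical names, and marker uniqueness is modelled by Python object identity rather than by the address of the chain head.

```python
class Nulls:
    """End-of-chain marker; one distinct instance per bucket."""

    def __init__(self, bucket_id):
        self.bucket_id = bucket_id


class Node:
    def __init__(self, key, nxt):
        self.key = key
        self.next = nxt


def walk(head, key):
    """Follow a chain; return (node, None) on a hit or (None, marker) at its end."""
    cur = head
    while isinstance(cur, Node):
        if cur.key == key:
            return cur, None
        cur = cur.next
    return None, cur


def lookup(read_head, expected_marker, key, max_tries=8):
    """Repeat the walk whenever it ends on a marker other than the expected one."""
    for _ in range(max_tries):
        node, marker = walk(read_head(), key)
        if node is not None:
            return node
        if marker is expected_marker:
            return None          # the whole target chain was examined
    raise RuntimeError("chain kept changing during lookup")
```

A walk that starts on a node which has since moved to another table ends on the other table's marker, so the miss is not trusted and the search restarts from the current head.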
      
      The unique NULLS_MARKER is derived from the address of the
      head of the chain.  As this cannot be derived at load-time the
      static rhnull in rht_bucket_nested() needs to be initialised
      at run time.
      
      Any caller of a lookup function must still be prepared for the
      possibility that the object returned is in a different table - it
      might have been there for some time.
      
      Note that this does NOT provide support for other uses of
      NULLS_MARKERs such as allocating with SLAB_TYPESAFE_BY_RCU or changing
      the key of an object and re-inserting it in the same table.
      These could only be done safely if new objects were inserted
      at the *start* of a hash chain, and that is not currently the case.
      Signed-off-by: NeilBrown <neilb@suse.com>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'hns3-ethtool-dump' · 77ac327c
      David S. Miller authored
      Salil Mehta says:
      
      ====================
      Adds VF/PF PCIe reg dump(ethtool -d) support to HNS3 driver
      
      This patchset adds VF/PF PCIe register dump support to HNS3 VF and PF
      driver using "ethtool -d" command.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: Adds support to dump(using ethool-d) PCIe regs in HNS3 PF driver · ea4750ca
      Jian Shen authored
      This patch adds support to dump PF PCIe registers using ethtool -d
      for HNS3 PF Driver.
      Signed-off-by: Jian Shen <shenjian15@huawei.com>
      Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: Support "ethtool -d" for HNS3 VF driver · 1600c3e5
      Jian Shen authored
      This patch adds "ethtool -d" support for HNS3 VF Driver.
      Signed-off-by: Jian Shen <shenjian15@huawei.com>
      Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: improve generic EEE ethtool functions · d1420bb9
      Heiner Kallweit authored
      So far the two functions consider neither member eee_enabled nor
      eee_active. Therefore network drivers have to do this in some kind
      of glue code. I think this can be avoided.
      
      Getting EEE parameters:
      When not advertising any EEE mode, we can't consider EEE to be enabled.
      Therefore interpret "EEE enabled" as "we advertise at least one EEE
      mode". The same goes for "EEE active": interpret it as "the EEE modes
      advertised by both link partners have at least one mode in common".
      
      Setting EEE parameters:
      If eee_enabled isn't set, don't advertise any EEE mode and restart
      aneg if needed to switch off EEE. If eee_enabled is set and
      data->advertised is empty (e.g. because EEE was disabled), advertise
      everything we support as default. This way EEE can easily be switched
      on/off by doing ethtool --set-eee <if> eee on/off, w/o any additional
      parameters.
      
      The changes to both functions shouldn't break any existing user.
      Once the changes have been applied, at least some users can be
      simplified.
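
The get/set rules described in this message can be modelled on plain bitmasks. This is a sketch, not the phylib code: `eee_get`, `eee_set` and the mode bit values are illustrative, and masking a non-empty request with the supported modes is an assumption the message does not spell out.

```python
EEE_100TX = 0x2   # illustrative mode bits
EEE_1000T = 0x4


def eee_get(advertised: int, lp_advertised: int) -> dict:
    """Derive the two flags from the local and link-partner advertisements."""
    return {
        "eee_enabled": advertised != 0,
        "eee_active": (advertised & lp_advertised) != 0,
    }


def eee_set(supported: int, current_adv: int, eee_enabled: bool, requested_adv: int):
    """Return (new_advertisement, restart_aneg_needed)."""
    if not eee_enabled:
        new_adv = 0                           # switch EEE off
    elif requested_adv == 0:
        new_adv = supported                   # default: advertise all we support
    else:
        new_adv = requested_adv & supported   # assumption: never exceed support
    return new_adv, new_adv != current_adv
```

Under this model, `eee_set(EEE_100TX | EEE_1000T, 0, True, 0)` re-enables EEE with every supported mode, which corresponds to a bare `eee on`.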
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'VXLAN-underlay-VRF' · 79dfab43
      David S. Miller authored
      Alexis Bauvin says:
      
      ====================
      net: Add VRF support for VXLAN underlay
      
      v6 -> v7:
      - proper locking for device in udp_tunnel following Sabrina Dubroca's advice
      
      v5 -> v6:
      - remove automatic rebinding patch following Roopa Prabhu's advice
      
      v4 -> v5:
      - move test script to its own patch (6/6)
      - add schematic for test script
      - apply David Ahern comments to the test script
      
      v3 -> v4:
      - rename vxlan_is_in_l3mdev_chain to netdev_is_upper master
      - move it to net/core/dev.c
      - make it return bool instead of int
      - check if remote_ifindex is zero before resolving the l3mdev
      - add testing script
      
      v2 -> v3:
      - fix build when CONFIG_NET_IPV6 is off
      - fix build "unused l3mdev_master_upper_ifindex_by_index" build error with some
        configs
      
      v1 -> v2:
      - move vxlan_get_l3mdev from vxlan driver to l3mdev driver as
        l3mdev_master_upper_ifindex_by_index
      - vxlan: rename variables named l3mdev_ifindex to ifindex
      
      v0 -> v1:
      - fix typos
      
      We are trying to isolate the VXLAN traffic from different VMs with VRF as shown
      in the schemas below:
      
      +-------------------------+   +----------------------------+
      | +----------+            |   |     +------------+         |
      | |          |            |   |     |            |         |
      | | tap-red  |            |   |     |  tap-blue  |         |
      | |          |            |   |     |            |         |
      | +----+-----+            |   |     +-----+------+         |
      |      |                  |   |           |                |
      |      |                  |   |           |                |
      | +----+---+              |   |      +----+----+           |
      | |        |              |   |      |         |           |
      | | br-red |              |   |      | br-blue |           |
      | |        |              |   |      |         |           |
      | +----+---+              |   |      +----+----+           |
      |      |                  |   |           |                |
      |      |                  |   |           |                |
      |      |                  |   |           |                |
      | +----+--------+         |   |     +--------------+       |
      | |             |         |   |     |              |       |
      | |  vxlan-red  |         |   |     |  vxlan-blue  |       |
      | |             |         |   |     |              |       |
      | +------+------+         |   |     +-------+------+       |
      |        |                |   |             |              |
      |        |           VRF  |   |             |          VRF |
      |        |           red  |   |             |         blue |
      +-------------------------+   +----------------------------+
               |                                  |
               |                                  |
       +---------------------------------------------------------+
       |       |                                  |              |
       |       |                                  |              |
       |       |         +--------------+         |              |
       |       |         |              |         |              |
       |       +---------+  eth0.2030   +---------+              |
       |                 |  10.0.0.1/24 |                        |
       |                 +-----+--------+                    VRF |
       |                       |                            green|
       +---------------------------------------------------------+
                               |
                               |
                          +----+---+
                          |        |
                          |  eth0  |
                          |        |
                          +--------+
      
      iproute2 commands to reproduce the setup:
      
      ip link add green type vrf table 1
      ip link set green up
      ip link add eth0.2030 link eth0 type vlan id 2030
      ip link set eth0.2030 master green
      ip addr add 10.0.0.1/24 dev eth0.2030
      ip link set eth0.2030 up
      
      ip link add blue type vrf table 2
      ip link set blue up
      ip link add br-blue type bridge
      ip link set br-blue master blue
      ip link set br-blue up
      ip link add vxlan-blue type vxlan id 2 local 10.0.0.1 dev eth0.2030 \
       port 4789
      ip link set vxlan-blue master br-blue
      ip link set vxlan-blue up
      ip link set tap-blue master br-blue
      ip link set tap-blue up
      
      ip link add red type vrf table 3
      ip link set red up
      ip link add br-red type bridge
      ip link set br-red master red
      ip link set br-red up
      ip link add vxlan-red type vxlan id 3 local 10.0.0.1 dev eth0.2030 \
       port 4789
      ip link set vxlan-red master br-red
      ip link set vxlan-red up
      ip link set tap-red master br-red
      ip link set tap-red up
      
      We faced some issues in the datapath; here are the details:
      
      * Egress traffic:
      The vxlan packets are sent directly to the default VRF because it's where the
      socket is bound, therefore the traffic has a default route via eth0. The
      workaround is to force this traffic to VRF green with ip rules.
      
      * Ingress traffic:
      When receiving the traffic on eth0.2030 the vxlan socket is unreachable from
      VRF green. The workaround is to enable *udp_l3mdev_accept* sysctl, but
      this breaks isolation between overlay and underlay: packets sent from
      blue or red by e.g. a guest VM will be accepted by the socket, allowing
      injection of VXLAN packets from the overlay.
      
      This patch series fixes the issues described above by allowing the VXLAN socket
      to be bound to a specific VRF device, so that lookups use the correct table.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • test/net: Add script for VXLAN underlay in a VRF · 03f1c26b
      Alexis Bauvin authored
      This script tests the support of a VXLAN underlay in a non-default VRF.
      
      It does so by simulating two hypervisors and two VMs, an extended L2
      between the VMs with the hypervisors as VTEPs with the underlay in a
      VRF, and finally by pinging the two VMs.
      
      It also tests that moving the underlay from one VRF to another works when
      the VXLAN interface is brought down and up again.
      Signed-off-by: Alexis Bauvin <abauvin@scaleway.com>
      Reviewed-by: Amine Kherbouche <akherbouche@scaleway.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Tested-by: Amine Kherbouche <akherbouche@scaleway.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: add support for underlay in non-default VRF · aab8cc36
      Alexis Bauvin authored
      Creating a VXLAN device with its underlay in a non-default VRF makes
      egress route lookups fail or return incorrect results, since they resolve in
      the default VRF, and makes ingress fail because the socket listens in the
      default VRF.
      
      This patch binds the underlying UDP tunnel socket to the l3mdev of the
      lower device of the VXLAN device. This will listen in the proper VRF and
      output traffic from said l3mdev, matching l3mdev routing rules and
      looking up the correct routing table.
      
      When the VXLAN device does not have a lower device, or the lower device
      is in the default VRF, the socket will not be bound to any interface,
      keeping the previous behaviour.
      
      The underlay l3mdev is deduced from the VXLAN lower device
      (IFLA_VXLAN_LINK).
      
      +----------+                         +---------+
      |          |                         |         |
      | vrf-blue |                         | vrf-red |
      |          |                         |         |
      +----+-----+                         +----+----+
           |                                    |
           |                                    |
      +----+-----+                         +----+----+
      |          |                         |         |
      | br-blue  |                         | br-red  |
      |          |                         |         |
      +----+-----+                         +---+-+---+
           |                                   | |
           |                             +-----+ +-----+
           |                             |             |
      +----+-----+                +------+----+   +----+----+
      |          |  lower device  |           |   |         |
      |   eth0   | <- - - - - - - | vxlan-red |   | tap-red | (... more taps)
      |          |                |           |   |         |
      +----------+                +-----------+   +---------+
      Signed-off-by: Alexis Bauvin <abauvin@scaleway.com>
      Reviewed-by: Amine Kherbouche <akherbouche@scaleway.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Tested-by: Amine Kherbouche <akherbouche@scaleway.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • l3mdev: add function to retreive upper master · 6a6d6681
      Alexis Bauvin authored
      Existing functions to retrieve the l3mdev of a device did not walk the
      master chain to find the upper master. This patch adds a function to
      find the l3mdev, even indirectly through e.g. a bridge:
      
      +----------+
      |          |
      | vrf-blue |
      |          |
      +----+-----+
           |
           |
      +----+-----+
      |          |
      | br-blue  |
      |          |
      +----+-----+
           |
           |
      +----+-----+
      |          |
      |   eth0   |
      |          |
      +----------+
      
      This will properly resolve the l3mdev of eth0 to vrf-blue.
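
The walk up the master chain can be sketched with a toy device model. `NetDev` and `upper_l3mdev_ifindex` are hypothetical names, and the ifindex values below are made up; returning 0 for "no l3mdev found" is an assumption of this sketch.

```python
class NetDev:
    """Minimal stand-in for a network device with an optional master."""

    def __init__(self, name, ifindex, master=None, is_l3mdev=False):
        self.name = name
        self.ifindex = ifindex
        self.master = master
        self.is_l3mdev = is_l3mdev


def upper_l3mdev_ifindex(dev):
    """Follow master links upward until an l3mdev is found; 0 if there is none."""
    cur = dev
    while cur is not None and not cur.is_l3mdev:
        cur = cur.master
    return cur.ifindex if cur is not None else 0


# The topology from the diagram: eth0 -> br-blue -> vrf-blue
vrf_blue = NetDev("vrf-blue", 10, is_l3mdev=True)
br_blue = NetDev("br-blue", 11, master=vrf_blue)
eth0 = NetDev("eth0", 2, master=br_blue)
```

Here `upper_l3mdev_ifindex(eth0)` resolves through the bridge to the ifindex of `vrf-blue`, while a device with no master resolves to 0.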
      Signed-off-by: Alexis Bauvin <abauvin@scaleway.com>
      Reviewed-by: Amine Kherbouche <akherbouche@scaleway.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Tested-by: Amine Kherbouche <akherbouche@scaleway.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp_tunnel: add config option to bind to a device · da5095d0
      Alexis Bauvin authored
      UDP tunnel sockets are always opened unbound to a specific device. This
      patch allows the socket to be bound to a custom device, which
      incidentally makes UDP tunnels VRF-aware if binding to an l3mdev.
      Signed-off-by: Alexis Bauvin <abauvin@scaleway.com>
      Reviewed-by: Amine Kherbouche <akherbouche@scaleway.com>
      Tested-by: Amine Kherbouche <akherbouche@scaleway.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mlxsw-fw_load_policy' · e3dd7627
      David S. Miller authored
      Ido Schimmel says:
      
      ====================
      mlxsw: Add 'fw_load_policy' devlink parameter
      
      Shalom says:
      
      Currently, drivers do not have the ability to control the firmware
      loading policy and they always use their own fixed policy. This prevents
      drivers from running the device with a different firmware version for
      testing and/or debugging purposes. For example, testing a firmware bug
      fix.
      
      For these situations, the new devlink generic parameter,
      'fw_load_policy', gives the ability to control this option and allows
      drivers to run with a different firmware version than required by the
      driver.
      
      Patch #1 adds the new parameter to devlink. The other two patches, #2
      and #3, add support for this parameter in the mlxsw driver.
      
      Example:
        # Query the devlink parameters supported by the device
          $ devlink dev param show
          pci/0000:03:00.0:
            name fw_load_policy type generic
              values:
                cmode driverinit value driver
      
        # Flash new firmware using ethtool
          $ ethtool -f swp1 mellanox/mlxsw_spectrum-13.1703.4.mfa2
      
        # Toggle parameter
          $ devlink dev param set pci/0000:03:00.0 name fw_load_policy value flash cmode driverinit
      
        # devlink reset
          $ devlink dev reload pci/0000:03:00.0
      
        # Query firmware version to show changes took effect
          $ ethtool -i swp1
          driver: mlxsw_spectrum
          version: 1.0
          firmware-version: 13.1703.4
          expansion-rom-version:
          bus-info: 0000:03:00.0
          supports-statistics: yes
          supports-test: no
          supports-eeprom-access: no
          supports-register-dump: no
          supports-priv-flags: no
      
      iproute2 patches available here:
      https://github.com/tshalom/iproute2-next
      
      v2:
      * Change 'fw_version_check' to 'fw_load_policy' with values 'driver' and
        'flash' (Jakub)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Load firmware version based on devlink parameter · 064501c5
      Shalom Toledo authored
      Load firmware version based on 'fw_load_policy' devlink parameter. The
      driver supports these two options:
          * DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_DRIVER (0)
            Default, load firmware version preferred by the driver
          * DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_FLASH (1)
            Load firmware currently stored in flash
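
How the two values might steer firmware selection can be sketched as below. Only the numeric values 0 and 1 come from the text above; `choose_firmware`, its arguments and the plain equality comparison between versions are assumptions made for illustration, not the driver's actual logic.

```python
FW_LOAD_POLICY_DRIVER = 0   # load the version preferred by the driver
FW_LOAD_POLICY_FLASH = 1    # run whatever is currently stored in flash


def choose_firmware(policy, flash_version, driver_version):
    """Return (version_to_run, needs_flashing) for the given policy."""
    if policy == FW_LOAD_POLICY_FLASH:
        return flash_version, False      # keep the flashed image untouched
    if flash_version == driver_version:
        return driver_version, False     # already the preferred version
    return driver_version, True          # driver policy: flash preferred image
```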
      
      The second option, 'flash', allows the device to run with a different firmware
      version than preferred by the driver for testing and/or debugging purposes.
      For example, testing a firmware bug fix.
      Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: core: Reset firmware after flash during driver initialization · 03bffcad
      Shalom Toledo authored
      After flashing new firmware during the driver initialization flow (reload
      or not), the driver should do a firmware reset when it gets -EAGAIN in
      order to load the new one.
      Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • devlink: Add 'fw_load_policy' generic parameter · 846e980a
      Shalom Toledo authored
      Many drivers load the device's firmware image during the initialization
      flow either from the flash or from the disk. Currently this option is not
      controlled by the user and the driver decides from where to load the
      firmware image.
      
      'fw_load_policy' gives the ability to control this option which allows the
      user to choose between different loading policies supported by the driver.
      
      This parameter can be useful while testing and/or debugging the device. For
      example, testing a firmware bug fix.
      Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: don't allow __set_phy_supported to add unsupported modes · 6915bf3b
      Heiner Kallweit authored
      Currently __set_phy_supported allows adding modes w/o checking whether
      the PHY supports them. This is wrong, it should never add modes but
      only remove modes we don't want to support.
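
The "remove only, never add" rule amounts to masking the supported set. A minimal sketch, with hypothetical mode bits and a hypothetical `limit_supported` helper rather than the phylib function itself:

```python
MODE_10 = 0x1     # illustrative mode bits
MODE_100 = 0x2
MODE_1000 = 0x4


def limit_supported(supported: int, max_speed: int) -> int:
    """Clear modes faster than max_speed; never set a bit the PHY lacks."""
    allowed = MODE_10
    if max_speed >= 100:
        allowed |= MODE_100
    if max_speed >= 1000:
        allowed |= MODE_1000
    return supported & allowed   # AND can only remove modes
```

A PHY that supports only 10 and 100 Mbit/s keeps exactly those two modes even when `max_speed` is 1000, because the mask is ANDed with what the PHY already reports.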
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 01 Dec, 2018 13 commits
  4. 30 Nov, 2018 1 commit
    • Merge branch 'qed-Doorbell-overflow-recovery' · 734317d9
      David S. Miller authored
      Ariel Elior says:
      
      ====================
      qed*: Doorbell overflow recovery
      
      Doorbell Overflow
      If enough CPU cores send doorbells at a sufficiently high rate, they
      can overflow the message FIFO of the doorbell queue block. When the fill level
      reaches its maximum, the device stops accepting all doorbells from that PF until
      a recovery procedure has taken place.
      
      Doorbell Overflow Recovery
      The recovery procedure basically means resending the last doorbell for every
      doorbelling entity. A doorbelling entity is anything which may send doorbells:
      L2 tx ring, rdma sq/rq/cq, light l2, vf l2 tx ring, spq, etc. This relies on
      the design assumption that all doorbells are aggregative, so the last doorbell
      carries the information of all previous doorbells.
      
      APIs
      All doorbelling entities need to register with the mechanism before sending
      doorbells. The registration entails providing the doorbell address the entity
      would be using, and a virtual address where last doorbell data can be found.
      Typically fastpath structures already have this construct.
      
      Executing the recovery procedure
      Handling the attentions, iterating over all the registered entities and
      resending their doorbells, is all handled within qed core module.
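
The register-then-replay scheme can be sketched as a small registry. This is an illustrative model, not the qed API: `DoorbellRecovery`, its method names and the `write_doorbell` callback are hypothetical, and the "virtual address of the last doorbell data" is modelled as a callable that returns the current value.

```python
class DoorbellRecovery:
    """Registry of doorbelling entities that can replay their last doorbell."""

    def __init__(self, write_doorbell):
        self._write = write_doorbell   # callback: (db_addr, data) -> None
        self._entries = {}             # db_addr -> callable returning last data

    def add(self, db_addr, read_last_data):
        """Register an entity before it starts sending doorbells."""
        self._entries[db_addr] = read_last_data

    def delete(self, db_addr):
        """Unregister an entity; unknown addresses are ignored."""
        self._entries.pop(db_addr, None)

    def recover(self):
        """Resend the last doorbell of every registered entity."""
        for db_addr, read_last_data in self._entries.items():
            self._write(db_addr, read_last_data())
        return len(self._entries)
```

Because doorbells are assumed to be aggregative, replaying only the most recent value per entity is enough; no history of earlier doorbells is kept.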
      
      Relevance
      All doorbelling entities in all protocols need to register with the mechanism,
      via the new APIs. Technically this is quite simple (just call the API). Some
      protocol fastpath implementations may not have the doorbell data stored anywhere
      (computing it from scratch every time) and will have to add such a place.
      Such cases are rare, and storing the data is better practice anyway (it saves
      some cycles on the fastpath).
      
      Performance Penalty
      No performance penalty should be incurred as a result of this feature. If
      anything, performance can improve by avoiding recalculation of doorbell data
      every time a doorbell is sent (in some flows).
      
      Add the database used to register doorbelling entities, and APIs for adding
      and deleting entries, and logic for traversing the database and doorbelling
      once on behalf of all entities.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>