1. 03 Dec, 2015 40 commits
    • Merge branch 'vsock-virtio' · c402293b
      David S. Miller authored
      Stefan Hajnoczi says:
      
      ====================
      Add virtio transport for AF_VSOCK
      
      v2:
       * Rebased onto Linux v4.4-rc2
       * vhost: Refuse to assign reserved CIDs
       * vhost: Refuse guest CID if already in use
       * vhost: Only accept correctly addressed packets (no spoofing!)
       * vhost: Support flexible rx/tx descriptor layout
       * vhost: Add missing total_tx_buf decrement
       * virtio_transport: Fix total_tx_buf accounting
       * virtio_transport: Add virtio_transport global mutex to prevent races
       * common: Notify other side of SOCK_STREAM disconnect (fixes shutdown
         semantics)
       * common: Avoid recursive mutex_lock(tx_lock) for write_space (fixes deadlock)
       * common: Define VIRTIO_VSOCK_TYPE_STREAM/DGRAM hardware interface constants
       * common: Define VIRTIO_VSOCK_SHUTDOWN_RCV/SEND hardware interface constants
       * common: Fix peer_buf_alloc inheritance on child socket
      
      This patch series adds a virtio transport for AF_VSOCK (net/vmw_vsock/).
      AF_VSOCK is designed for communication between virtual machines and
      hypervisors.  It is currently only implemented for VMware's VMCI transport.
      
      This series implements the proposed virtio-vsock device specification from
      here:
      http://comments.gmane.org/gmane.comp.emulators.virtio.devel/855
      
      Most of the work was done by Asias He and Gerd Hoffmann a while back.  I have
      picked up the series again.
      
      The QEMU userspace changes are here:
      https://github.com/stefanha/qemu/commits/vsock
      
      Why virtio-vsock?
      -----------------
      Guest<->host communication is currently done over the virtio-serial device.
      This makes it hard to port sockets API-based applications and is limited to
      static ports.
      
      virtio-vsock uses the sockets API so that applications can rely on familiar
      SOCK_STREAM and SOCK_DGRAM semantics.  Applications on the host can easily
      connect to guest agents because the sockets API allows multiple connections to
      a listen socket (unlike virtio-serial).  This simplifies the guest<->host
      communication and eliminates the need for extra processes on the host to
      arbitrate virtio-serial ports.
      
      Overview
      --------
      This series adds 3 pieces:
      
      1. virtio_transport_common.ko - core virtio vsock code that uses vsock.ko
      
      2. virtio_transport.ko - guest driver
      
      3. drivers/vhost/vsock.ko - host driver
      
      Howto
      -----
      The following kernel options are needed:
        CONFIG_VSOCKETS=y
        CONFIG_VIRTIO_VSOCKETS=y
        CONFIG_VIRTIO_VSOCKETS_COMMON=y
        CONFIG_VHOST_VSOCK=m
      
      Launch QEMU as follows:
        # qemu ... -device vhost-vsock-pci,id=vhost-vsock-pci0,guest-cid=3
      
      Guest and host can communicate via AF_VSOCK sockets.  The host's CID (address)
      is 2 and the guest is automatically assigned a CID (use VMADDR_CID_ANY (-1) to
      bind to it).
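As a rough illustration of the Howto above, a host-side client might connect to a guest agent as sketched below. This is a minimal sketch, assuming Python's `socket.AF_VSOCK` support (Linux only). The helper names and the example port are hypothetical; guest CID 3 is taken from the QEMU command line above.

```python
import socket

# The host CID is 2, as stated above. VMADDR_CID_ANY is -1 expressed as an
# unsigned 32-bit value.
VMADDR_CID_HOST = 2
VMADDR_CID_ANY = 0xFFFFFFFF


def vsock_address(cid, port):
    """Return a (cid, port) address tuple after basic range checks."""
    if not 0 <= cid <= 0xFFFFFFFF:
        raise ValueError("CID must fit in an unsigned 32-bit integer")
    if not 0 <= port <= 0xFFFFFFFF:
        raise ValueError("port must fit in an unsigned 32-bit integer")
    return (cid, port)


def connect_to_guest(cid, port):
    """Open a SOCK_STREAM connection to a guest agent (hypothetical helper)."""
    # AF_VSOCK is only looked up here, so importing this module works even
    # on platforms without vsock support.
    sock = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    try:
        sock.connect(vsock_address(cid, port))
    except OSError:
        sock.close()
        raise
    return sock
```

With the QEMU invocation above, `connect_to_guest(3, 1234)` would try to reach a guest agent listening on the arbitrary example port 1234.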
      
      Status
      ------
      There are a few design changes I'd like to make to the virtio-vsock device:
      
      1. The 3-way handshake isn't necessary over a reliable transport (virtqueue).
         Spoofing packets is also impossible so the security aspects of the 3-way
         handshake (including syn cookie) add nothing.  The next version will have a
         single operation to establish a connection.
      
      2. Credit-based flow control doesn't work for SOCK_DGRAM since multiple clients
         can transmit to the same listen socket.  There is no way for the clients to
         coordinate buffer space with each other fairly.  The next version will drop
         credit-based flow control for SOCK_DGRAM and only rely on best-effort
         delivery.  SOCK_STREAM still has guaranteed delivery.
      
      3. In the next version only the host will be able to establish connections
         (i.e. to connect to a guest agent).  This is for security reasons since
         there is currently no ability to provide host services only to certain
         guests.  This also matches how AF_VSOCK works on modern VMware hypervisors.
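The fairness problem in item 2 can be illustrated with a toy model. The sketch below is not the driver's code. It assumes each sender estimates the receiver's free space as `peer_buf_alloc - (tx_cnt - peer_fwd_cnt)`, where `tx_cnt` and `peer_fwd_cnt` are hypothetical counter names, and it shows how two datagram senders that track credit independently can together exceed a buffer that each believes it has to itself.

```python
class CreditSender:
    """Toy sender that tracks the receiver's buffer space on its own."""

    def __init__(self, peer_buf_alloc):
        self.peer_buf_alloc = peer_buf_alloc  # receiver's advertised buffer size
        self.tx_cnt = 0        # bytes this sender has transmitted so far
        self.peer_fwd_cnt = 0  # bytes the receiver has reported as consumed

    def peer_free(self):
        """Free space at the receiver, as seen by this sender alone."""
        return self.peer_buf_alloc - (self.tx_cnt - self.peer_fwd_cnt)

    def send(self, nbytes):
        """Transmit at most as many bytes as this sender's credit allows."""
        allowed = min(nbytes, self.peer_free())
        self.tx_cnt += allowed
        return allowed


BUF = 4096
sender_a = CreditSender(BUF)
sender_b = CreditSender(BUF)

# Each sender sees a full buffer of credit, so both transmit 4096 bytes and
# the receiver is offered twice what it can hold.
in_flight = sender_a.send(4096) + sender_b.send(4096)
```

In this model `in_flight` ends up larger than `BUF`, which is the coordination gap described above.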
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • VSOCK: Add Makefile and Kconfig · 8a2a2029
      Asias He authored
      Enable virtio-vsock and vhost-vsock.
      Signed-off-by: Asias He <asias@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • VSOCK: Introduce vhost-vsock.ko · 98bb8928
      Asias He authored
      VM sockets vhost transport implementation. This module runs in host
      kernel.
      Signed-off-by: Asias He <asias@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • VSOCK: Introduce virtio-vsock.ko · 32e61b06
      Asias He authored
      VM sockets virtio transport implementation. This module runs in guest
      kernel.
      Signed-off-by: Asias He <asias@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • VSOCK: Introduce virtio-vsock-common.ko · 80a19e33
      Asias He authored
      This module contains the common code and header files for the
      virtio-vsock and vhost-vsock kernel modules.
      Signed-off-by: Asias He <asias@redhat.com>
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mpls: support for dead routes · c89359a4
      Roopa Prabhu authored
      Adds support for RTNH_F_DEAD and RTNH_F_LINKDOWN flags on mpls
      routes due to link events. Also adds code to ignore dead
      routes during route selection.
      
      Unlike IP routes, MPLS routes are not deleted when the route goes
      dead. This is the current MPLS behaviour and this patch does not change
      it. With this patch, however, routes will be marked dead.
      Dead routes are not reported to userspace (this is consistent with IPv4
      routes).
      
      dead routes:
      -----------
      $ip -f mpls route show
      100
          nexthop as to 200 via inet 10.1.1.2  dev swp1
          nexthop as to 700 via inet 10.1.1.6  dev swp2
      
      $ip link set dev swp1 down
      
      $ip link show dev swp1
      4: swp1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode
      DEFAULT group default qlen 1000
          link/ether 00:02:00:00:00:01 brd ff:ff:ff:ff:ff:ff
      
      $ip -f mpls route show
      100
          nexthop as to 200 via inet 10.1.1.2  dev swp1 dead linkdown
          nexthop as to 700 via inet 10.1.1.6  dev swp2
      
      linkdown routes:
      ----------------
      $ip -f mpls route show
      100
          nexthop as to 200 via inet 10.1.1.2  dev swp1
          nexthop as to 700 via inet 10.1.1.6  dev swp2
      
      $ip link show dev swp1
      4: swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
      state UP mode DEFAULT group default qlen 1000
          link/ether 00:02:00:00:00:01 brd ff:ff:ff:ff:ff:ff
      
      /* carrier goes down */
      $ip link show dev swp1
      4: swp1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast
      state DOWN mode DEFAULT group default qlen 1000
          link/ether 00:02:00:00:00:01 brd ff:ff:ff:ff:ff:ff
      
      $ip -f mpls route show
      100
          nexthop as to 200 via inet 10.1.1.2  dev swp1 linkdown
          nexthop as to 700 via inet 10.1.1.6  dev swp2
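
The effect of the flags on route selection can be sketched in a few lines. This is a simplified model, not the kernel code: the flag values are those from linux/rtnetlink.h, while the dictionary layout, the helper names and the hash-based choice are assumptions made for illustration. Which flags to skip is left to the caller, and the default skips only dead nexthops.

```python
# Flag values as defined in linux/rtnetlink.h.
RTNH_F_DEAD = 1
RTNH_F_LINKDOWN = 16


def usable_nexthops(nexthops, skip_flags=RTNH_F_DEAD):
    """Return the nexthops that carry none of the flags in skip_flags."""
    return [nh for nh in nexthops if not nh["flags"] & skip_flags]


def select_nexthop(nexthops, flow_hash, skip_flags=RTNH_F_DEAD):
    """Pick one usable nexthop by flow hash, or None if all are skipped."""
    alive = usable_nexthops(nexthops, skip_flags)
    if not alive:
        return None
    return alive[flow_hash % len(alive)]


# Label 100 from the "dead routes" example above, after swp1 was set down.
route_100 = [
    {"label": 200, "via": "10.1.1.2", "dev": "swp1",
     "flags": RTNH_F_DEAD | RTNH_F_LINKDOWN},
    {"label": 700, "via": "10.1.1.6", "dev": "swp2", "flags": 0},
]
```

With this table every flow hash resolves to the swp2 nexthop, because the swp1 nexthop is marked dead.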
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Acked-by: Robert Shearman <rshearma@brocade.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'rsvb-compat-strings' · 9f4842a8
      David S. Miller authored
      Simon Horman says:
      
      ====================
      ravb: More compatibility strings
      
      This short series adds generic Gen2 and Gen3 compatibility strings, as well
      as SoC-specific compatibility strings for the missing Gen2 SoCs.
      
      Key Changes in v2:
      * Include "rcar-" in generic bindings
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: add device tree support for r8a779[123] · af8002d3
      Simon Horman authored
      Simply document new compatibility strings.
      As a previous patch adds a generic R-Car Gen2 compatibility string,
      there appears to be no need for driver updates.
      Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
      Acked-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: add fallback compatibility strings · 0e874361
      Simon Horman authored
      Add fallback compatibility strings for R-Car Gen 2 & 3 SoC Families.
      This is in keeping with the fallback scheme being adopted wherever appropriate
      for drivers for Renesas SoCs.
      Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
      Acked-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv6: restrict hop_limit sysctl setting to range [1; 255] · d6df198d
      Phil Sutter authored
      Setting a value bigger than 255 resulted in using only the lower eight
      bits of that value as it is assigned to the u8 header field. To avoid
      this unexpected result, reject such values.
      
      Setting a value of zero is technically possible, but hosts receiving
      such a packet have to treat it as if hop_limit was set to one, according
      to RFC 2460. Therefore I don't see a use case for that.
      
      Setting a route's hop_limit to zero in iproute2 means to use the sysctl
      default, which is not the case here: setting e.g.
      net.ipv6.conf.eth0.hop_limit=0 will not make the kernel use
      net.ipv6.conf.all.hop_limit for outgoing packets on eth0. To avoid this
      kind of confusion, reject zero.
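
The truncation described above is easy to reproduce outside the kernel. The following is a small sketch with hypothetical helper names: one function mimics assignment to an 8-bit header field, the other applies the [1; 255] restriction.

```python
def truncate_to_u8(value):
    """Mimic assignment to a u8 field: only the low eight bits survive."""
    return value & 0xFF


def validate_hop_limit(value):
    """Accept only values in [1, 255], as the restricted sysctl does."""
    if not 1 <= value <= 255:
        raise ValueError(f"hop_limit {value} is outside [1, 255]")
    return value
```

For example, `truncate_to_u8(300)` yields 44 and `truncate_to_u8(256)` yields 0, which is why values above 255 gave unexpected results before they were rejected.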
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: ptp: Add CONFIG mode support · f5d7837f
      Kazuya Mizuguchi authored
      This patch makes PTP support active in CONFIG mode on R-Car Gen3.
      Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
      Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'netronome-NFP4000-NFP6000' · 03b01cf5
      David S. Miller authored
      Jakub Kicinski says:
      
      ====================
      Netronome NFP4000/NFP6000 NIC VF driver
      
      This patchset adds support for VFs of Netronome's NFP-4000 and NFP-6000
      based NICs. We are currently also preparing the submission for the PF
      driver, but it is not quite ready yet. The PF driver can be found on
      GitHub:
      
      https://github.com/Netronome/nfp-drv-kmods
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: add driver for Netronome NFP4000/NFP6000 NIC VFs · 4c352362
      Jakub Kicinski authored
      Add a driver for Virtual Functions of Netronome's
      NFP-4000 and NFP-6000 based NICs.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pci_ids: add Netronome Systems vendor · 2d1e0254
      Jakub Kicinski authored
      Add PCI vendor id for Netronome Systems.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Rolf Neugebauer <rolf.neugebauer@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · f4f7981e
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      40GbE Intel Wired LAN Driver Updates 2015-12-03
      
      This series contains updates to i40e and i40evf only.
      
      Mitch updates the i40evf driver by increasing the maximum number of queues,
      since future devices will allow for more queue pairs.  He removes a
      duplicate printing of the driver info string in init, since it is
      already printed in probe.  He changes several allocations that did
      not need to be atomic to use GFP_KERNEL instead.  He then reworks
      i40e_sync_vsi_filters() to have a common exit point, so that it properly
      releases the busy lock on the VSI and propagates errors to its callers.
      Finally, he does some whitespace housekeeping in i40evf.
      
      Kiran moves the transmit queue hang detection/recovery code from the
      tx_timeout function to service_task and updates it.  He also fixes a memory
      leak that occurred when users programmed flow-director filters using ethtool
      (sideband filter programming); the cause was that the check of
      'tx_buffer->skb' prevented 'raw_buf' from being freed as part of the cleanup.
      
      Jesse adds the ability to turn packet split off and on using ethtool priv
      flags.  He then does some housekeeping for both the i40e and i40evf drivers,
      which includes removing unused or useless code, correcting whitespace,
      removing a duplicate #include, and fixing an incorrect comment.
      
      Neerav cleans up the functions that gather Flow Control Rx XOFF stats: since
      the driver logic for checking transmit hangs has been moved, these functions
      no longer do anything meaningful.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mlx5-connectx-4-sriov' · c5b6c3ee
      David S. Miller authored
      Or Gerlitz says:
      
      ====================
      Introducing ConnectX-4 Ethernet SRIOV
      
      This patchset introduces the support of Ethernet SRIOV in ConnectX-4
      family of 100G Ethernet NICs.
      
      Some features are still missing, but all the basic SRIOV functionalities
      are there already.
      
      Basic Introduction:
      ConnectX-4 HW architecture provides two kinds of underlying HW switches.
      
      MPFS (Multi Physical Function Switch) or L2 Table in Software terms:
      
      The HCA has one MPFS switch per physical port. This switch is responsible
      for forwarding unicast traffic to the various overlying Physical Functions (PFs).
      Multicast traffic is flooded amongst all the PFs. Each PF can request to
      forward a unicast MAC to its E-Switch Uplink vport (which we will cover later)
      through the SET_L2_TABLE_ENTRY HW command.
      
      MPFS has five ports: four are connected to PFs (one for each) and one is connected
      directly to the Physical Port (Physical Link).
      
      E-Switch (Ethernet Switch):
      
      The HCA has one E-Switch per physical function. The main responsibility of this component is
      to forward unicast/multicast and VLAN tagged/untagged traffic to the various
      Virtual Functions (VFs) allocated by the PF. Unlike MPFS, the PF needs to explicitly
      create the E-Switch FDB table, which is a HW flow table managed by the PF driver
      whenever the vport_group_manager capability bit is set for this PF.
      
      E-Switch has Virtual Port (vport) entities as its ports. vport0 and the uplink vport
      are special kinds of vports: vport0 represents the PF, and the uplink vport
      is connected to the MPFS switch (if it exists) as the PF's external link.
      vport1..vportN represent the VF0..VF(N-1) egress/ingress ports.
      
      E-Switch FDB contains forwarding rules such as:
              UC MAC0 -> vport0(PF).
              UC MAC1 -> vport1.
              UC MAC2 -> vport2.
              MC MACX -> vport0, vport2, Uplink.
              MC MACY -> vport1, Uplink.
      
          For unmatched traffic FDB has the following default rules:
              Unmatched Traffic (src vport != Uplink) -> Uplink.
              Unmatched Traffic (src vport == Uplink) -> vport0(PF).
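
The forwarding rules and the two default rules above amount to a table lookup with a fallback. The sketch below models that in software only. The dictionary layout and the function name are assumptions for illustration, and the MAC names are the placeholders used in the rules above.

```python
UPLINK = "Uplink"
PF_VPORT = "vport0"

# Example FDB contents, copied from the rules listed above.
FDB = {
    "MAC0": ["vport0"],
    "MAC1": ["vport1"],
    "MAC2": ["vport2"],
    "MACX": ["vport0", "vport2", UPLINK],
    "MACY": ["vport1", UPLINK],
}


def fdb_forward(dst_mac, src_vport, fdb=FDB):
    """Return the list of destination ports for a packet."""
    ports = fdb.get(dst_mac)
    if ports is not None:
        return list(ports)
    # Default rules for unmatched traffic.
    if src_vport != UPLINK:
        return [UPLINK]
    return [PF_VPORT]
```

For instance, unmatched traffic arriving from vport2 goes to the Uplink, while unmatched traffic arriving from the Uplink is delivered to vport0 (the PF).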
      
      NIC VPort context:
      Each NIC (VF/PF) has its own vport context, which is used to store the current
      NIC vport state (UC/MC and VLAN lists) and other NIC properties such as MTU, promisc
      mode, etc. The NIC (VF/PF) driver is responsible for constantly updating this context.
      
      FDB rules population:
      Each NIC vport (VF/PF) will notify E-Switch manager of its UC/MC vport
      context changes via modify vport context command, which will be
      translated to an event that will be handled by E-Switch manager (PF)
      which will update FDB table accordingly.
      
      Both PF and VF use the same driver and submit commands directly to the firmware.
      The PF sees the vport_group_manager capability bit and as such runs the code
      to populate the embedded switches as explained above.
      
      The patchset is organized as follows:
      
      Patches 1-2 introduce the basic PCI SRIOV functionality and the support in
      ConnectX-4 for enabling specific VFs via the enable/disable HCA commands. These two
      patches will also be used later for the IB SRIOV flow.
      
      Patches 3-8 introduce the basic E-Switch capabilities and commands to be used later by
      the VF to modify and update its NIC vport context, and by the PF (E-Switch Manager) driver to
      query the VF NIC context and act accordingly.
      
      Patches 9-10 provide the functionality a NIC driver (VF/PF) needs to support SRIOV,
      mainly vport context update support.
      
      Patch 11 ("net/mlx5: Introducing E-Switch and l2 table") introduces the basic
      E-Switch support and the infrastructure to read vport context events and to update
      the MPFS L2 Table with the UC MAC addresses requested by the PF.
      
      Patches 12-18 introduce SRIOV enablement and E-Switch FDB table management.
      They add the basic E-Switch public API to set and get SRIOV properties, to be used
      in the PF netdev SRIOV ndos.
      
      The patchset was applied on top of commit 3f8c0f7e "gianfar: use of_property_read_bool()"
      
      Saeed, Eli and Or.
      
      Changes from V0, addressing feedback from Alex Duyck:
       - patch 09, remove the loop to seek the device address
       - patch 09, avoid using array as returned value from helper function
       - patch 10, fix possible buffer over-run
      
      Changes from V1, addressing feedback from Julia Lawall and the kbuild test robot:
       - patch 11, check the right variable for allocation failure
       - patch 18, eliminate unneeded semicolon
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Add support for SR-IOV ndos · 66e49ded
      Saeed Mahameed authored
      Implement and enable SR-IOV ndos to manage SR-IOV configuration via
      netdev netlink API.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: E-Switch, Introduce get vf statistics · 3b751a2a
      Saeed Mahameed authored
      Add support to get VF statistics using query vport
      counter command.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: E-Switch, Introduce set vport vlan (VST mode) · 9e7ea352
      Saeed Mahameed authored
      Add query and modify functions to control client vlan and qos
      stripping or insertion in E-Switch vport contexts.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: E-Switch, Introduce HCA cap and E-Switch vport context · d6666753
      Saeed Mahameed authored
      Unlike the NIC vport context, the E-Switch vport context is managed by the
      E-Switch manager (vport_group_manager) and not by the NIC (VF) driver.
      
      The E-Switch manager can access (read/modify) the E-Switch context of
      any of its vports.
      
      Currently the E-Switch vport context includes only client and server
      vlan insertion and stripping data (for later support of VST mode).
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: E-Switch, Introduce Vport administration functions · 77256579
      Saeed Mahameed authored
      Implement set VF mac/link state and query VF config,
      to be used later in netdev VF ndos or any other management API.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: E-Switch, Add SR-IOV (FDB) support · 81848731
      Saeed Mahameed authored
      Enabling E-Switch SRIOV for nvfs+1 vports.
      
      Create E-Switch FDB for L2 UC/MC mac steering between VFs/PF and
      external vport (Uplink).
      
      FDB contains forwarding rules such as:
      	UC MAC0 -> vport0(PF).
      	UC MAC1 -> vport1.
      	UC MAC2 -> vport2.
      	MC MACX -> vport0, vport2, Uplink.
      	MC MACY -> vport1, Uplink.
      
      For unmatched traffic FDB has the following default rules:
      	Unmatched Traffic (src vport != Uplink) -> Uplink.
      	Unmatched Traffic (src vport == Uplink) -> vport0(PF).
      
      FDB rules population:
      Each NIC vport (VF) will notify E-Switch manager of its UC/MC vport
      context changes via modify vport context command, which will be
      translated to an event that will be handled by E-Switch manager (PF)
      which will update FDB table accordingly.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: E-Switch, Introduce FDB hardware capabilities · 495716b1
      Saeed Mahameed authored
      Define the hardware structures and capabilities needed
      for E-Switch FDB flow tables and read them on driver load.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: Introducing E-Switch and l2 table · 073bb189
      Saeed Mahameed authored
      E-Switch is the software entity that represents and manages ConnectX4
      inter-HCA ethernet l2 switching.
      
      E-Switch has its own Virtual Ports, each Vport/vNIC/VF can be
      connected to the device through a vport of an e-switch.
      
      Each e-switch is managed by one vNIC identified by
      HCA_CAP.vport_group_manager (usually it is the PF/vport[0]),
      and its main responsibility is to forward each packet to the
      right vport.
      
      e-Switch needs to manage its own l2-table and FDB tables.
      
      L2 table is a flow table that is managed by FW, it is needed for
      Multi-host (Multi PF) configuration for inter HCA switching between
      PFs.
      
      The FDB table is a flow table that is managed entirely by the e-Switch driver;
      its main responsibility is to switch packets between e-Switch internal
      vports and uplink vport that belong to the same.
      
      This patch introduces only e-Switch l2 table management; FDB management
      will come later, when ethernet SRIOV/VFs are enabled.
      
      Preparation for ethernet sriov and l2 table management.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Write vlan list into vport context · aad9e6e4
      Saeed Mahameed authored
      Each Vport/vNIC must notify the underlying e-Switch layer
      of vlan table changes in order to update SR-IOV FDB tables.
      
      We do that in the vlan_rx_add_vid and vlan_rx_kill_vid ndos.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Write UC/MC list and promisc mode into vport context · 5e55da1d
      Saeed Mahameed authored
      Each Vport/vNIC must notify the underlying e-Switch layer
      of UC/MC list and promisc mode updates, in order to update
      l2 tables and SR-IOV FDB tables.
      
      We do that in the set_rx_mode ndo.
      
      Preparation for ethernet SRIOV and l2 table management.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: Introduce access functions to modify/query vport vlans · c0046cf7
      Saeed Mahameed authored
      These functions are needed to notify the upcoming L2 table and SR-IOV
      E-Switch (FDB) manager (PF) of the NIC vport (VF) vlan table changes.
      
      Preparation for ethernet sriov and l2 table management.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: Introduce access functions to modify/query vport promisc mode · d82b7318
      Saeed Mahameed authored
      These functions are needed to notify the upcoming SR-IOV
      E-Switch (FDB) manager (PF) of the NIC vport (VF) promisc mode changes.
      
      Preparation for ethernet sriov and l2 table management.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: Introduce access functions to modify/query vport state · e7546514
      Saeed Mahameed authored
      In preparation for SR-IOV we add here an API to enable each e-switch
      manager (PF) to configure its VFs' link states in the e-switch.
      
      Preparation for ethernet sriov.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: Introduce access functions to modify/query vport mac lists · e16aea27
      Saeed Mahameed authored
      These functions are needed to notify the upcoming L2 table and SR-IOV
      E-Switch (FDB) manager (PF) of the NIC vport (VF) UC/MC mac list
      changes.
      
      Preparation for ethernet sriov and l2 table management.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: Update access functions to Query/Modify vport MAC address · e1d7d349
      Saeed Mahameed authored
      In preparation for SR-IOV we add here an API to enable each e-switch
      client (PF/VF) to configure its L2 MAC addresses and for the e-switch
      manager (usually the PF) to access them in order to be able to
      configure them into the e-switch.
      Therefore we now pass vport num parameter to
      mlx5_query_nic_vport_context, so PF can access other vports contexts.
      
      Preparation for ethernet sriov and l2 table management.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: Add HW capabilities and structs for SR-IOV E-Switch · 54f0a411
      Saeed Mahameed authored
      Update HCA capabilities and HW struct to include needed
      capabilities for upcoming Ethernet Switch (SR-IOV E-Switch).
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5_core: Add base sriov support · fc50db98
      Eli Cohen authored
      This patch adds SRIOV base support for mlx5 supported devices. The same
      driver is used for both PFs and VFs; VFs are identified by the driver
      through the flag MLX5_PCI_DEV_IS_VF added to the pci table entries.
      Virtual functions are created as usual by writing a value to the
      sriov_numvfs sysfs file of the PF device. Upon instantiating VFs, they will
      all be probed by the driver on the hypervisor. One can gracefully unbind
      them through /sys/bus/pci/drivers/mlx5_core/unbind.
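
VF creation through sysfs can be sketched as follows. This is a minimal sketch, assuming the standard PCI sysfs attribute `sriov_numvfs` under `/sys/bus/pci/devices/<address>/`. The helper names are hypothetical and the PCI address in the usage note is a made-up example. Writing the file requires root privileges and a PF that supports SR-IOV.

```python
from pathlib import Path


def sriov_numvfs_path(pci_address, sysfs_root="/sys"):
    """Build the path of the sriov_numvfs attribute for a PF device."""
    return Path(sysfs_root) / "bus" / "pci" / "devices" / pci_address / "sriov_numvfs"


def set_num_vfs(pci_address, num_vfs, sysfs_root="/sys"):
    """Request num_vfs virtual functions on the given PF (needs root)."""
    if num_vfs < 0:
        raise ValueError("number of VFs cannot be negative")
    sriov_numvfs_path(pci_address, sysfs_root).write_text(f"{num_vfs}\n")
```

For a PF at the hypothetical address 0000:03:00.0, `set_num_vfs("0000:03:00.0", 4)` would ask the kernel to instantiate four VFs.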
      
      mlx5_wait_for_vf_pages() was added to ensure that when a VF dies without
      executing proper teardown, the hypervisor driver waits till all of the
      pages that were allocated at the hypervisor to maintain its operation
      are returned.
      
      In order for the VF to be operational, the PF needs to call enable_hca
      for it. This can be done before the VFs are created through a call to
      pci_enable_sriov.
      
      If there are VFs assigned to VMs when the driver of the PF is
      unloaded, all the VFs will experience a system error and the PF driver unloads
      cleanly; in this case pci_disable_sriov is not called and the devices
      will still show up when running lspci. Once the PF driver is reloaded, it will
      sync its data structures which maintain state on its VFs.
      Signed-off-by: Eli Cohen <eli@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fc50db98
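The sysfs workflow described in this commit message (write a VF count to the PF's sriov_numvfs file, optionally unbind a VF from mlx5_core) can be sketched roughly as below. This is a minimal illustration, not part of the patch: the helper names and the PCI addresses are hypothetical, and the directories are passed in as parameters so the sketch can be exercised against a scratch directory rather than the real /sys tree (which needs root and SR-IOV capable hardware).

```python
import tempfile
from pathlib import Path


def set_num_vfs(pf_dir: Path, count: int) -> None:
    """Request `count` VFs by writing to the PF's sriov_numvfs file.

    On a real system pf_dir would be something like
    /sys/bus/pci/devices/<PF PCI address>.
    """
    if count < 0:
        raise ValueError("VF count must be non-negative")
    (pf_dir / "sriov_numvfs").write_text(f"{count}\n")


def unbind_vf(driver_dir: Path, vf_pci_addr: str) -> None:
    """Detach one VF from the driver by writing its PCI address to 'unbind'.

    On a real system driver_dir would be /sys/bus/pci/drivers/mlx5_core.
    """
    (driver_dir / "unbind").write_text(f"{vf_pci_addr}\n")


if __name__ == "__main__":
    # Dry run against a scratch directory that stands in for sysfs.
    root = Path(tempfile.mkdtemp())
    pf_dir = root / "devices" / "0000:03:00.0"   # hypothetical PF address
    driver_dir = root / "drivers" / "mlx5_core"
    pf_dir.mkdir(parents=True)
    driver_dir.mkdir(parents=True)

    set_num_vfs(pf_dir, 4)
    unbind_vf(driver_dir, "0000:03:00.1")        # hypothetical VF address
    print((pf_dir / "sriov_numvfs").read_text().strip())
    print((driver_dir / "unbind").read_text().strip())
```

Passing the directories explicitly keeps the helpers free of hard-coded paths; pointing them at the real sysfs locations is left to the caller.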
    • Eli Cohen's avatar
      net/mlx5_core: Modify enable/disable hca functions · 0b107106
      Eli Cohen authored
      Modify these functions to take a func_id argument stating which device
      we are referring to. This is done in preparation for SRIOV support, where
      a PF driver needs to control its virtual functions.
      Signed-off-by: Eli Cohen <eli@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0b107106
    • Jarod Wilson's avatar
      alx: remove pointless assignment · 24e2416e
      Jarod Wilson authored
      Reasonably sure this doesn't serve any purpose.
      
      CC: Jay Cliburn <jcliburn@gmail.com>
      CC: Chris Snook <chris.snook@gmail.com>
      CC: netdev@vger.kernel.org
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      24e2416e
    • David S. Miller's avatar
      Merge branch 'bonding-team-offload' · c5b8b34c
      David S. Miller authored
      Jiri Pirko says:
      
      ====================
      bonding/team offload + mlxsw implementation
      
      This patchset introduces needed infrastructure for link aggregation
      offload - for both team and bonding. It also implements the offload
      in mlxsw driver.
      
      In particular, this patchset introduces the possibility for an upper driver
      (bond/team/bridge/..) to pass type-specific info down to notifier listeners.
      The info is passed along with the NETDEV_CHANGEUPPER/NETDEV_PRECHANGEUPPER
      notifiers. Listeners (drivers of netdevs being enslaved) can react
      accordingly.
      
      The other extension is for run-time use. This patchset introduces
      a new netdev notifier type - NETDEV_CHANGELOWERSTATE. Along with this
      notification, the upper driver (bond/team/bridge/..) can pass some
      information about a lower device change, particularly link-up and
      TX-enabled states. Listeners (drivers of netdevs being enslaved)
      can react accordingly.
      
      The last part of the patchset is the implementation of LAG offload in mlxsw,
      using both previously introduced infrastructure extensions.
      
      Note that the bond-specific (and ugly) NETDEV_BONDING_INFO used by mlx4
      can be removed and mlx4 can use the extensions this patchset adds.
      I plan to convert it and get rid of NETDEV_BONDING_INFO in
      a follow-up patchset.
      
      v2->v3:
      - one small fix in patch 1
      v1->v2:
      - added patch 1 and 2 per Andy's request
      - a couple of more or less cosmetic changes described in a couple of other patches
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c5b8b34c
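The notifier flow described in the cover letter above can be illustrated with a small userspace model. It is a sketch under stated assumptions, not the kernel API: the event names come from the cover letter, while the classes `UpperInfo`, `LowerStateInfo`, `NotifierChain` and `PortDriver`, the `tx_type` field, and the idea that a listener vetoes an unsupported upper at NETDEV_PRECHANGEUPPER are all modelling choices made here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Event names taken from the cover letter; plain strings in this model.
NETDEV_PRECHANGEUPPER = "NETDEV_PRECHANGEUPPER"
NETDEV_CHANGEUPPER = "NETDEV_CHANGEUPPER"
NETDEV_CHANGELOWERSTATE = "NETDEV_CHANGELOWERSTATE"


@dataclass
class UpperInfo:
    """Hypothetical type-specific info an upper driver passes down."""
    tx_type: str


@dataclass
class LowerStateInfo:
    """Hypothetical run-time state of a lower (enslaved) device."""
    link_up: bool
    tx_enabled: bool


Listener = Callable[[str, str, Optional[object]], bool]


class NotifierChain:
    """Toy notifier chain: a listener returns False to veto an event."""

    def __init__(self) -> None:
        self._listeners: List[Listener] = []

    def register(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def call(self, event: str, dev: str, info: Optional[object] = None) -> bool:
        # all() stops at the first listener that vetoes the event.
        return all(listener(event, dev, info) for listener in self._listeners)


class PortDriver:
    """Listener standing in for the driver of the enslaved netdevs."""

    SUPPORTED_TX_TYPES = {"hash"}  # assumption made for this sketch

    def __init__(self) -> None:
        self.lower_state: Dict[str, LowerStateInfo] = {}

    def handle(self, event: str, dev: str, info: Optional[object]) -> bool:
        if event == NETDEV_PRECHANGEUPPER and isinstance(info, UpperInfo):
            # Refuse enslavement to an upper the device cannot offload.
            return info.tx_type in self.SUPPORTED_TX_TYPES
        if event == NETDEV_CHANGELOWERSTATE and isinstance(info, LowerStateInfo):
            # Remember link-up / TX-enabled so the offload can follow it.
            self.lower_state[dev] = info
        return True


if __name__ == "__main__":
    chain = NotifierChain()
    driver = PortDriver()
    chain.register(driver.handle)
    print(chain.call(NETDEV_PRECHANGEUPPER, "port1", UpperInfo("hash")))
    chain.call(NETDEV_CHANGELOWERSTATE, "port1", LowerStateInfo(True, True))
    print(driver.lower_state["port1"])
```

The point of the model is only the shape of the interaction: the upper driver attaches an info object to the event, and each listener decides how to react based on the event type and that info.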
    • Jiri Pirko's avatar
      mlxsw: spectrum: Implement LAG tx enabled lower state change · 74581206
      Jiri Pirko authored
      Enabling/disabling TX on a LAG port means enabling/disabling distribution
      in our HW.
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      74581206
    • Jiri Pirko's avatar
      mlxsw: spectrum: Implement FDB add/remove/dump for LAG · 8a1ab5d7
      Jiri Pirko authored
      Implement FDB offloading for lagged ports, including learning LAG FDB
      entries, adding/removing static FDB entries and dumping existing LAG FDB
      entries.
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8a1ab5d7
    • Jiri Pirko's avatar
      mlxsw: spectrum: Implement LAG port join/leave · 0d65fc13
      Jiri Pirko authored
      Implement basic procedures for a port joining/leaving a LAG. That
      includes HW setup of the collector and core LAG mapping setup.
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0d65fc13