1. 06 Sep, 2018 40 commits
    • liquidio: Add spoof checking on a VF MAC address · 48875222
      Weilin Chang authored
      1. Provide the API to set/unset the spoof checking feature.
      2. Add a function to periodically report the count of packets found
         with a spoofed VF MAC address.
      3. Prevent the VF MAC address from changing while spoofchk is on for
         the VF, unless the change is issued from the PF.
      Signed-off-by: Weilin Chang <weilin.chang@cavium.com>
      Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mlx5e-updates-2018-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · ddc9cc01
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      mlx5e-updates-2018-09-05
      
      This series provides updates to mlx5 ethernet driver.
      
      1) Starting with a four-patch series to optimize flow counter updates,
      from Vlad Buslov:
      ==============================================
      
      By default the mlx5 driver updates cached counters each second. The update
      function consumes a noticeable amount of CPU resources. The goal of this
      patch series is to optimize the update function.

      Investigation revealed the following bottlenecks in the fs counters
      implementation:
       1) The update code (scheduled each second) iterates over all counters
       twice: first to find and delete counters that are marked for deletion,
       then to actually update the counters.
       2) Counters are stored in an rb tree. Linear iteration over all rb tree
       elements (rb_next in profiling data) consumed ~65% of the time spent in
       the update function.
      
      The following optimizations were implemented:
       1) Instead of just marking counters for deletion, store them in a
       standalone list. This removes the first iteration over the whole
       counters tree.
       2) Store counters in a sorted list to optimize traversing them and
       remove calls to rb_next.

      The first implementation of these changes degraded performance instead
      of improving it. Investigation revealed that the first cache line of
      struct mlx5_fc is full and adding anything to it causes the number of
      cache misses to double. To mitigate that, the following refactorings
      were implemented:
       - Change the 'addlist' list type from doubly linked to singly linked.
       This frees space for one additional pointer that is used to store the
       deletion list (optimization 1).
       - Substitute the rb tree with idr. Idr is a non-intrusive data structure
       and doesn't require adding any new members to struct mlx5_fc. Use the
       free space that became available for the doubly linked sorted list
       that is used for traversing all counters (optimization 2).
      
      The described changes reduced CPU time spent in mlx5_fc_stats_work from
      70% to 44% (global perf profile mode).
      ============================================
      
      The rest of the series are misc updates:
      
      2) From Kamal, Move mlx5e_priv_flags into en_ethtool.c, to avoid a
      compilation warning.
      
      3) From Roi Dayan, Move Q counters allocation and drop RQ to init_rx profile
      function to avoid allocating Q counters when not required.
      
      4) From Shay Agroskin, Replace PTP clock lock from RW lock to seq lock.
      Almost double the packet rate when timestamping is active on multiple TX
      queues.
      
      5) From Natali Shechtman, Set ECN for received packets using CQE indication.

      6) From Alaa Hleihel, don't set CHECKSUM_COMPLETE on SCTP packets.
      CHECKSUM_COMPLETE is not applicable to SCTP protocol.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'dsa-b53-SerDes-support' · 2002bc32
      David S. Miller authored
      Florian Fainelli says:
      
      ====================
      net: dsa: b53: SerDes support
      
      This patch series adds support for the SerDes found on NorthStar Plus
      (NSP) which allows us to use the SFP port on the BCM958625HR board (and
      other similar designs).
      
      Changes in v3:
      
      - properly hunk the request_threaded_irq() bits into patch #2
      
      Changes in v2:
      
      - migrate to threaded interrupt (Andrew)
      - fixed a case where MLO_AN_FIXED's mac_config would still call into
        the serdes_config callback
      - added an additional check on the phylink interface in mac_config
      - default to ARCH_BCM_NSP instead of ARCH_BCM_IPROC which is really
        the NSP Kconfig bit we want
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: b53: Add SerDes support · 0e01491d
      Florian Fainelli authored
      Add support for the Northstar Plus SerDes which is accessed through a
      special page of the switch. Since this is something that most people
      probably will not want to use, make it a configurable option with a
      default on ARCH_BCM_NSP where it is the most useful currently.
      
      The SerDes supports both SGMII and 1000baseX modes for both lanes, and
      2500baseX for one of the lanes, and internally looks like a
      seemingly standard MII PHY, except for the few bits that got repurposed.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: b53: Add PHYLINK support · a8e8b985
      Florian Fainelli authored
      Add support for PHYLINK. Things are reasonably straightforward since we
      do not yet support SerDes interfaces, which leaves us with just
      MLO_AN_PHY and MLO_AN_FIXED to deal with.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: b53: Add helper to set link parameters · 5e004460
      Florian Fainelli authored
      Extract the logic from b53_adjust_link() responsible for overriding a
      given port's link, speed, duplex and pause settings and make two helper
      functions to set the port's configuration and the port's link settings.
      We will make use of both, as separate functions while adding PHYLINK
      support next.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: b53: Make SRAB driver manage port interrupts · 16994374
      Florian Fainelli authored
      Update the SRAB driver to manage per-port interrupts. Since we cannot
      sleep during b53_io_ops, schedule a workqueue whenever we get a port
      specific interrupt. We will later make use of this to call back into
      PHYLINK when there is, e.g., a link state change.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: b53: Add ability to enable/disable port interrupts · 8ca7c160
      Florian Fainelli authored
      Some switches expose individual interrupt line(s) for port specific
      event(s), allow configuring these interrupts at an appropriate time
      during port_enable/disable callbacks where all port specific resources
      are known to be set-up and ready for use.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed*: Utilize FW 8.37.7.0 · a3f72307
      Denis Bolotin authored
      This patch adds a new qed firmware with fixes and support for new features.
      
      Fixes:
      - Fix a rare case of device crash with iWARP, iSCSI or FCoE offload.
      - Fix GRE tunneled traffic when iWARP offload is enabled.
      - Fix RoCE failure in ib_send_bw when using inline data.
      - Fix latency optimization flow for inline WQEs.
      - BigBear 100G fix
      
      RDMA:
      - Reduce task context size.
      - Application page sizes above 2GB support.
      - Performance improvements.
      
      ETH:
      - Tenant DCB support.
      - Replace RSS indirection table update interface.
      
      Misc:
      - Debug Tools changes.
      Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
      Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'rtnetlink-add-IFA_TARGET_NETNSID-for-RTM_GETADDR' · 6ef848ef
      David S. Miller authored
      Christian Brauner says:
      
      ====================
      rtnetlink: add IFA_TARGET_NETNSID for RTM_GETADDR
      
      This iteration mainly addresses the suggestion to use
      IFA_TARGET_NETNSID as the property name. Additionally, an alias for
      the already existing IFLA_IF_NETNSID property is added.
      
      Note that two additional cleanup patches (8/9 and 9/9) were added to
      address concerns raised that passing more than 6 arguments to a function
      will cause additional variables to be pushed onto the stack instead of
      being placed into registers. The way I addressed this is by introducing
      two new struct inet{6}_fill_args that are used to pass common
      information down to inet{6}_fill_if*() functions shortening all those
      functions to three pointer arguments.
      If this is something more people than Kirill find useful, they can be
      kept; if not, they can simply be dropped in later iterations of this
      series or when merging.
      
      Here is a short overview:
      1. Rename from IFA_IF_NETNSID to IFA_TARGET_NETNSID.
      2. Add IFLA_TARGET_NETNSID as an alias for IFLA_IF_NETNSID and switch
         all occurrences over to the new alias.
      3. Add inet_fill_args struct to avoid passing more than 6 arguments in
         inet_fill_if*() functions.
      4. Add inet6_fill_args struct to avoid passing more than 6 arguments in
         inet6_fill_if*() functions.
      
      The only functional change is the export of rtnl_get_net_ns_capable()
      which is needed in case ipv6 is built as a module.
      
      Note, I did not change the property name to IFA_TARGET_NSID as there was
      no clear agreement what would be preferred. My personal preference is to
      keep the IFA_IF_NETNSID name because it aligns naturally with the
      IFLA_IF_NETNSID property for RTM_*LINK requests. Jiri seems to prefer
      this name too.
      However, if there is agreement that another property name makes more
      sense I'm happy to send a v2 that changes this.
      
      To test this patchset I performed 1 million getifaddrs() requests
      against a network namespace containing 5 interfaces (lo, eth{0-4}). The
      first test used a network namespace aware getifaddrs() implementation I
      wrote and the second test used the traditional setns() + getifaddrs()
      method. The results show that this patchset allows userspace to cut
      retrieval time in half:
      1. netns_getifaddrs():      82 microseconds
      2. setns() + getifaddrs(): 162 microseconds
      
      A while back we introduced and enabled IFLA_IF_NETNSID in
      RTM_{DEL,GET,NEW}LINK requests (cf. [1], [2], [3], [4], [5]). This has led
      to significant performance increases since it allows userspace to avoid
      taking the hit of a setns(netns_fd, CLONE_NEWNET), then getting the
      interfaces from the netns associated with the netns_fd. Especially when a
      lot of network namespaces are in use, using setns() becomes increasingly
      problematic when performance matters.
      Usually, RTM_GETLINK requests are followed by RTM_GETADDR requests (cf.
      getifaddrs() style functions and friends). But currently, RTM_GETADDR
      requests do not support a property similar to IFLA_IF_NETNSID for
      RTM_*LINK requests.
      This is problematic since userspace can retrieve interfaces from another
      network namespace by sending an IFLA_IF_NETNSID property along with an
      RTM_GETLINK request but is still forced to use the legacy setns() style of
      retrieving interfaces in RTM_GETADDR requests.
      
      The goal of this series is to make it possible to perform RTM_GETADDR
      requests on different network namespaces. To this end a new IFA_IF_NETNSID
      property for RTM_*ADDR requests is introduced. It can be used to send a
      network namespace identifier along in RTM_*ADDR requests.  The network
      namespace identifier will be used to retrieve the target network namespace
      in which the request is supposed to be fulfilled.  This aligns the behavior
      of RTM_*ADDR requests with the behavior of RTM_*LINK requests.
      
      - The caller must have assigned a valid network namespace identifier for
        the target network namespace.
      - The caller must have CAP_NET_ADMIN in the owning user namespace of the
        target network namespace.
      
      [1]: commit 7973bfd8 ("rtnetlink: remove check for IFLA_IF_NETNSID")
      [2]: commit 5bb8ed07 ("rtnetlink: enable IFLA_IF_NETNSID for RTM_NEWLINK")
      [3]: commit b61ad68a ("rtnetlink: enable IFLA_IF_NETNSID for RTM_DELLINK")
      [4]: commit c310bfcb ("rtnetlink: enable IFLA_IF_NETNSID for RTM_SETLINK")
      [5]: commit 7c4f63ba ("rtnetlink: enable IFLA_IF_NETNSID in do_setlink()")
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: add inet6_fill_args · 203651b6
      Christian Brauner authored
      inet6_fill_if{addr,mcaddr, acaddr}() already took 6 arguments which
      meant the 7th argument would need to be pushed onto the stack on x86.
      Add a new struct inet6_fill_args which holds common information passed
      to inet6_fill_if{addr,mcaddr, acaddr}() and shortens the functions to
      three pointer arguments.
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: add inet_fill_args · 978a46fa
      Christian Brauner authored
      inet_fill_ifaddr() already took 6 arguments which meant the 7th argument
      would need to be pushed onto the stack on x86.
      Add a new struct inet_fill_args which holds common information passed
      to inet_fill_ifaddr() and shortens the function to three pointer arguments.
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      978a46fa
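
The argument-bundling pattern from the two fill_args commits can be sketched in userspace C as follows. This is only an illustration: the struct fields, `struct sketch_buf`, `fill_ifaddr_sketch()` and the output format are hypothetical and are not the kernel's definitions.

```c
#include <stdio.h>

/* Hypothetical bundle of per-request values; field names are illustrative. */
struct fill_args_sketch {
    unsigned int portid;
    unsigned int seq;
    int event;
    unsigned int flags;
    int netnsid;            /* negative means "no target namespace id" */
};

/* Hypothetical output buffer standing in for the message being filled. */
struct sketch_buf {
    char data[128];
};

/*
 * Three pointer arguments instead of a long scalar list: the output buffer,
 * the address being described, and the shared argument bundle.
 * Returns the number of characters written, as snprintf() does.
 */
static int fill_ifaddr_sketch(struct sketch_buf *out, const char *addr,
                              const struct fill_args_sketch *args)
{
    if (args->netnsid >= 0)
        return snprintf(out->data, sizeof(out->data),
                        "seq=%u event=%d addr=%s netnsid=%d",
                        args->seq, args->event, addr, args->netnsid);

    return snprintf(out->data, sizeof(out->data),
                    "seq=%u event=%d addr=%s",
                    args->seq, args->event, addr);
}
```

Adding another per-request value later only touches the struct and the callee, not every call site's argument list.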
    • rtnetlink: s/IFLA_IF_NETNSID/IFLA_TARGET_NETNSID/g · 7e4a8d5a
      Christian Brauner authored
      IFLA_TARGET_NETNSID is the new alias for IFLA_IF_NETNSID. This commit
      replaces all occurrences of IFLA_IF_NETNSID with the new alias to
      indicate that this identifier is the preferred one.
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Cc: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • if_link: add IFLA_TARGET_NETNSID alias · 19d8f1ad
      Christian Brauner authored
      This adds IFLA_TARGET_NETNSID as an alias for IFLA_IF_NETNSID for
      RTM_*LINK requests.
      The new name is clearer and also aligns with the newly introduced
      IFA_TARGET_NETNSID property for RTM_*ADDR requests.
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Suggested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Cc: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rtnetlink: move type calculation out of loop · 87ccbb1f
      Christian Brauner authored
      I don't see how the type - which is one of
      RTM_{GETADDR,GETROUTE,GETNETCONF} - can change. So do the message type
      calculation once before entering the for loop.
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: enable IFA_TARGET_NETNSID for RTM_GETADDR · 6ecf4c37
      Christian Brauner authored
      - Backwards Compatibility:
        If userspace wants to determine whether ipv6 RTM_GETADDR requests
        support the new IFA_TARGET_NETNSID property it should verify that the
        reply includes the IFA_TARGET_NETNSID property. If it does not
        userspace should assume that IFA_TARGET_NETNSID is not supported for
        ipv6 RTM_GETADDR requests on this kernel.
      - From what I gather from current userspace tools that make use of
        RTM_GETADDR requests some of them pass down struct ifinfomsg when they
        should actually pass down struct ifaddrmsg. To not break existing
        tools that pass down the wrong struct we will do the same as for
        RTM_GETLINK | NLM_F_DUMP requests and not error out when the
        nlmsg_parse() fails.
      
      - Security:
        Callers must have CAP_NET_ADMIN in the owning user namespace of the
        target network namespace.
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: enable IFA_TARGET_NETNSID for RTM_GETADDR · d3807145
      Christian Brauner authored
      - Backwards Compatibility:
        If userspace wants to determine whether ipv4 RTM_GETADDR requests
        support the new IFA_TARGET_NETNSID property it should verify that the
        reply includes the IFA_TARGET_NETNSID property. If it does not
        userspace should assume that IFA_TARGET_NETNSID is not supported for
        ipv4 RTM_GETADDR requests on this kernel.
      - From what I gather from current userspace tools that make use of
        RTM_GETADDR requests some of them pass down struct ifinfomsg when they
        should actually pass down struct ifaddrmsg. To not break existing
        tools that pass down the wrong struct we will do the same as for
        RTM_GETLINK | NLM_F_DUMP requests and not error out when the
        nlmsg_parse() fails.
      
      - Security:
        Callers must have CAP_NET_ADMIN in the owning user namespace of the
        target network namespace.
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • if_addr: add IFA_TARGET_NETNSID · 9f3c057c
      Christian Brauner authored
      This adds a new IFA_TARGET_NETNSID property to be used by address
      families such as PF_INET and PF_INET6.
      The IFA_TARGET_NETNSID property can be used to send a network namespace
      identifier as part of a request. If an IFA_TARGET_NETNSID property is
      identified it will be used to retrieve the target network namespace in
      which the request is to be made.
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Cc: Jiri Benc <jbenc@redhat.com>
      Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9f3c057c
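
A rough userspace sketch of an RTM_GETADDR dump request that carries the new property is shown below. It assumes the attribute is a signed 32-bit value and that its number is 10, as in the uapi header of kernels carrying this series; `SKETCH_IFA_TARGET_NETNSID`, `struct getaddr_req` and `build_getaddr_req()` are hypothetical names. Opening the NETLINK_ROUTE socket, sending the request and parsing the reply are left out.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_addr.h>

/*
 * Assumption: matches IFA_TARGET_NETNSID in <linux/if_addr.h> on kernels
 * with this series. Older headers lack the enum entry, hence a local name.
 */
#define SKETCH_IFA_TARGET_NETNSID 10

/* Fixed headers followed by room for exactly one 32-bit attribute. */
struct getaddr_req {
    struct nlmsghdr nh;
    struct ifaddrmsg ifa;
    char attrs[RTA_SPACE(sizeof(int32_t))];
};

static void build_getaddr_req(struct getaddr_req *req, int family,
                              int32_t netnsid)
{
    struct rtattr *rta;

    memset(req, 0, sizeof(*req));
    req->nh.nlmsg_len = NLMSG_LENGTH(sizeof(req->ifa));
    req->nh.nlmsg_type = RTM_GETADDR;
    req->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    req->ifa.ifa_family = family;

    /* Append the namespace id right after the aligned fixed part. */
    rta = (struct rtattr *)((char *)req + NLMSG_ALIGN(req->nh.nlmsg_len));
    rta->rta_type = SKETCH_IFA_TARGET_NETNSID;
    rta->rta_len = RTA_LENGTH(sizeof(netnsid));
    memcpy(RTA_DATA(rta), &netnsid, sizeof(netnsid));
    req->nh.nlmsg_len = NLMSG_ALIGN(req->nh.nlmsg_len) + rta->rta_len;
}
```

Per the series description, the caller needs a valid namespace identifier for the target namespace and CAP_NET_ADMIN in its owning user namespace; a reply that does not echo the property suggests the kernel does not support it.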
    • rtnetlink: add rtnl_get_net_ns_capable() · c383edc4
      Christian Brauner authored
      get_target_net() will be used in follow-up patches in ipv{4,6} codepaths to
      retrieve network namespaces based on network namespace identifiers. So
      remove the static declaration and export in the rtnetlink header. Also,
      rename it to rtnl_get_net_ns_capable() to make it obvious what this
      function is doing.
      Export rtnl_get_net_ns_capable() so it can be used when ipv6 is built as
      a module.
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-lan78xx-Minor-improvements' · d4cc5976
      David S. Miller authored
      Stefan Wahren says:
      
      ====================
      net: lan78xx: Minor improvements
      
      This patch series contains some minor improvements for the lan78xx
      driver.
      
      Changes in V2:
      - Keep Copyright comment as multi-line
      - Add Raghuram's Reviewed-by
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: lan78xx: Make declaration style consistent · 51ceac9f
      Stefan Wahren authored
      This patch makes some declarations more consistent.
      Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
      Reviewed-by: Raghuram Chary Jallipalli <raghuramchary.jallipalli@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: lan78xx: Switch to SPDX identifier · 6be665a5
      Stefan Wahren authored
      Adopt the SPDX license identifier headers to ease license compliance
      management.
      Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: lan78xx: Drop unnecessary strcpy in lan78xx_probe · 7a6b022d
      Stefan Wahren authored
      There is no need for this strcpy because alloc_etherdev() already
      does this job.
      Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
      Reviewed-by: Raghuram Chary Jallipalli <raghuramchary.jallipalli@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: lan78xx: Bail out if lan78xx_get_endpoints fails · fa8cd98c
      Stefan Wahren authored
      We need to bail out if lan78xx_get_endpoints() fails, otherwise the
      result is overwritten.
      
      Fixes: 55d7de9d ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet")
      Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
      Reviewed-by: Raghuram Chary Jallipalli <raghuramchary.jallipalli@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: separate VXLAN and GRE feature handling · 7848418e
      Jakub Kicinski authored
      VXLAN and GRE FW features currently both have to be advertised
      for the driver to enable them.  Separate the handling.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'nfp-improve-the-new-rtsym-helpers' · eebd3faa
      David S. Miller authored
      Jakub Kicinski says:
      
      ====================
      nfp: improve the new rtsym helpers
      
      This set fixes a bug in ABS rtsym handling I added in net-next,
      it expands the error checking and reporting on the rtsym accesses.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: validate rtsym accesses fall within the symbol · e84b2f2d
      Jakub Kicinski authored
      With the accesses to rtsyms now all going via special helpers
      we can easily make sure the driver is not reading past the
      end of the symbol.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e84b2f2d
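
A range check of the kind this commit describes might look like the sketch below. The descriptor and the helper name are hypothetical and much simpler than the driver's real rtsym structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical symbol descriptor; the driver's real structure differs. */
struct sym_sketch {
    const char *name;
    uint64_t size;          /* symbol size in bytes */
};

/*
 * True when [off, off + len) lies inside the symbol. The check avoids
 * computing off + len, so a huge offset cannot wrap around and pass.
 */
static bool sym_access_ok(const struct sym_sketch *sym, uint64_t off,
                          uint64_t len)
{
    return off <= sym->size && len <= sym->size - off;
}
```

Funnelling every access through one helper is what makes a single check like this sufficient.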
    • nfp: prefix rtsym error messages with symbol name · 31e380f3
      Jakub Kicinski authored
      For ease of debugging, preface all error messages with the name
      of the symbol which caused them.  Use the same message format
      for existing messages while at it.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: fix readq on absolute RTsyms · 3c576de3
      Jakub Kicinski authored
      Return the error and report value through the output param.
      
      Fixes: 640917dd ("nfp: support access to absolute RTsyms")
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3c576de3
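
The calling convention named in this fix (status as the return value, data through an output parameter) can be illustrated as below. The types are hypothetical, and treating an absolute symbol's address field as its value is an assumption made for the sketch, not something stated in the commit message.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum sym_kind { SYM_KIND_OBJECT, SYM_KIND_ABS };

/* Hypothetical symbol; not the driver's struct. */
struct rtsym_sketch {
    enum sym_kind kind;
    uint64_t addr;          /* assumed to hold the value for SYM_KIND_ABS */
    const uint64_t *mem;    /* backing storage for object symbols */
};

/* Returns 0 or a negative errno; the value is written to *value only. */
static int sym_readq(const struct rtsym_sketch *sym, uint64_t *value)
{
    if (!sym || !value)
        return -EINVAL;

    if (sym->kind == SYM_KIND_ABS) {
        *value = sym->addr; /* report through the output param ... */
        return 0;           /* ... and keep the return for status */
    }

    if (!sym->mem)
        return -EIO;

    *value = *sym->mem;
    return 0;
}
```

Keeping the two channels separate means a legitimately large or negative-looking value can never be mistaken for an error code.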
    • failover: Add missing check to validate 'slave_dev' in net_failover_slave_unregister · 9e7e6cab
      YueHaibing authored
      Fixes gcc '-Wunused-but-set-variable' warning:
      
      drivers/net/net_failover.c: In function 'net_failover_slave_unregister':
      drivers/net/net_failover.c:598:35: warning:
       variable 'primary_dev' set but not used [-Wunused-but-set-variable]
      
      The validity of 'slave_dev' should be checked.
      
      Fixes: cfc80d9a ("net: Introduce net_failover driver")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netlink: Make groups check less stupid in netlink_bind() · 428f944b
      Dmitry Safonov authored
      As Linus noted, the test for 0 is needless, groups type can follow the
      usual kernel style and 8*sizeof(unsigned long) is BITS_PER_LONG:
      
      > The code [..] isn't technically incorrect...
      > But it is stupid.
      > Why stupid? Because the test for 0 is pointless.
      >
      > Just doing
      >        if (nlk->ngroups < 8*sizeof(groups))
      >                groups &= (1UL << nlk->ngroups) - 1;
      >
      > would have been fine and more understandable, since the "mask by shift
      > count" already does the right thing for a ngroups value of 0. Now that
      > test for zero makes me go "what's special about zero?". It turns out
      > that the answer to that is "nothing".
      [..]
      > The type of "groups" is kind of silly too.
      >
      > Yeah, "long unsigned int" isn't _technically_ wrong. But we normally
      > call that type "unsigned long".
      
      Cleanup my piece of pointlessness.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: netdev@vger.kernel.org
      Fairly-blamed-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      428f944b
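
The simplified check quoted in this commit can be tried out as a standalone function. `mask_groups()` and `SKETCH_BITS_PER_LONG` are hypothetical userspace stand-ins for the kernel code and its BITS_PER_LONG.

```c
#include <limits.h>

/* Userspace stand-in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Keep only the lowest ngroups bits of the requested group mask. */
static unsigned long mask_groups(unsigned long groups, unsigned int ngroups)
{
    /*
     * Shifting by the full width of the type is undefined, so the mask is
     * applied only below it. ngroups == 0 needs no special case: the mask
     * becomes (1UL << 0) - 1 == 0 and clears every bit.
     */
    if (ngroups < SKETCH_BITS_PER_LONG)
        groups &= (1UL << ngroups) - 1;

    return groups;
}
```

For example, `mask_groups(0xff, 4)` keeps the low four bits and `mask_groups(0xff, 0)` clears the mask entirely, which is the point of the quoted review comment.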
    • packet: add sockopt to ignore outgoing packets · fa788d98
      Vincent Whitchurch authored
      Currently, the only way to ignore outgoing packets on a packet socket is
      via the BPF filter.  With MSG_ZEROCOPY, packets that are looped into
      AF_PACKET are copied in dev_queue_xmit_nit(), and this copy happens even
      if the filter run from packet_rcv() would reject them.  So the presence
      of a packet socket on the interface takes away the benefits of
      MSG_ZEROCOPY, even if the packet socket is not interested in outgoing
      packets.  (Even when MSG_ZEROCOPY is not used, the skb is unnecessarily
      cloned, but the cost for that is much lower.)
      
      Add a socket option to allow AF_PACKET sockets to ignore outgoing
      packets to solve this.  Note that the *BSDs already have something
      similar: BIOCSSEESENT/BIOCSDIRECTION and BIOCSDIRFILT.
      
      The first intended user is lldpd.
      Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fa788d98
    • net/mlx5e: don't set CHECKSUM_COMPLETE on SCTP packets · fe1dc069
      Alaa Hleihel authored
      CHECKSUM_COMPLETE is not applicable to SCTP protocol.
      Setting it for SCTP packets leads to CRC32c validation failure.
      
      Fixes: bbceefce ("net/mlx5e: Support RX CHECKSUM_COMPLETE")
      Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
      Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Set ECN for received packets using CQE indication · f007c13d
      Natali Shechtman authored
      In multi-host (MH) NIC scheme, a single HW port serves multiple hosts
      or sockets on the same host.
      The HW uses a mechanism in the PCIe buffer which monitors
      the amount of consumed PCIe buffers per host.
      On a certain configuration, under congestion,
      the HW emulates a switch doing ECN marking on packets using ECN
      indication on the completion descriptor (CQE).
      
      The driver needs to set the ECN bits on the packet SKB
      so that the network stack can react to it; this commit does that.
      Signed-off-by: Natali Shechtman <natali@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Replace PTP clock lock from RW lock to seq lock · 64109f1d
      Shay Agroskin authored
      Changed "priv.clock.lock" lock from 'rw_lock' to 'seq_lock'
      in order to improve packet rate performance.
      
      Tested on Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz.
      Sent 64b packets between two peers connected by ConnectX-5,
      and measured packet rate for the receiver in three modes:
      	no time-stamping (base rate)
      	time-stamping using rw_lock (old lock) for critical region
      	time-stamping using seq_lock (new lock) for critical region
      Only the receiver time stamped its packets.
      
      The measured packet rate improvements are:
      
      	Single flow (multiple TX rings to single RX ring):
      		without timestamping:	  4.26 (M packets)/sec
      		with rw-lock (old lock):  4.1  (M packets)/sec
      		with seq-lock (new lock): 4.16 (M packets)/sec
      		1.46% improvement
      
      	Multiple flows (multiple TX rings to six RX rings):
      		without timestamping: 	  22   (M packets)/sec
      		with rw-lock (old lock):  11.7 (M packets)/sec
      		with seq-lock (new lock): 21.3 (M packets)/sec
      		82.05% improvement
      
      The packet rate improvement is due to the seq-lock requiring no
      atomic operations from its 'readers'.
      Since there are many more 'readers' than 'writers' contending
      for this lock, almost all atomic operations are saved.
      This results in a dramatic decrease in overall
      cache misses.
      Signed-off-by: Shay Agroskin <shayag@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      64109f1d
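
The reader/writer asymmetry that the commit above relies on can be sketched in user space with C11 atomics. This is an illustration of the general sequence-lock technique, not the kernel's seqlock_t and not the mlx5 code: the struct and function names are hypothetical, and the sketch assumes a single writer (the kernel's seqlock additionally serializes writers with a spinlock). The point to notice is that the read side performs only loads and retries; it never writes to shared memory, so readers do not bounce the lock's cache line between CPUs.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical clock state protected by a sequence counter. */
struct clock_snapshot {
    atomic_uint seq;          /* even: stable, odd: write in progress */
    _Atomic uint64_t cycles;
    _Atomic uint64_t nsec;
};

/* Writer side; assumes only one writer runs at a time. */
static void clock_write(struct clock_snapshot *c, uint64_t cycles, uint64_t nsec)
{
    unsigned s = atomic_load_explicit(&c->seq, memory_order_relaxed);

    /* Make the counter odd so readers know an update is in flight. */
    atomic_store_explicit(&c->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&c->cycles, cycles, memory_order_relaxed);
    atomic_store_explicit(&c->nsec, nsec, memory_order_relaxed);

    /* Publish the update and make the counter even again. */
    atomic_store_explicit(&c->seq, s + 2, memory_order_release);
}

/* Reader side: loads only, retried if a writer interfered. */
static void clock_read(struct clock_snapshot *c, uint64_t *cycles, uint64_t *nsec)
{
    unsigned s0, s1;

    do {
        s0 = atomic_load_explicit(&c->seq, memory_order_acquire);
        *cycles = atomic_load_explicit(&c->cycles, memory_order_relaxed);
        *nsec = atomic_load_explicit(&c->nsec, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s1 = atomic_load_explicit(&c->seq, memory_order_relaxed);
    } while ((s0 & 1u) || s0 != s1);
}
```

The trade-off is that a reader may have to repeat its loads when a write overlaps, which is acceptable when writes are rare compared to reads, as the commit describes for the time-stamping path.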
    • Roi Dayan's avatar
      net/mlx5e: Move Q counters allocation and drop RQ to init_rx · 1462e48d
      Roi Dayan authored
      Not all profiles query the HW Q counters in the update_stats() callback.
      HW Q counters are limited per device, and in the case of representors all
      their Q counters are allocated on the parent PF device.
      Avoid redundant allocation of HW Q counters by moving the allocation
      to the init_rx profile callback.
      Signed-off-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      1462e48d
    • Kamal Heib's avatar
      net/mlx5e: Move mlx5e_priv_flags into en_ethtool.c · d2408205
      Kamal Heib authored
      Move the definition of mlx5e_priv_flags into en_ethtool.c because it's
      only used there.
      
      Fixes: 4e59e288 ("net/mlx5e: Introduce net device priv flags infrastructure")
      Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      d2408205
    • Vlad Buslov's avatar
      net/mlx5: Add flow counters idr · 12d6066c
      Vlad Buslov authored
      The previous patch in the series changed the flow counter storage structure
      from rb_tree to linked list in order to improve flow counter traversal
      performance. The drawback of this solution is that flow counter lookup by
      id becomes linear in complexity.
      
      Store pointers to flow counters in an idr in order to improve lookup
      performance to logarithmic again. Idr is a non-intrusive data structure and
      doesn't require extending the flow counter struct with new elements. This
      means that the idr can be used for lookup, while the linked list from the
      previous patch is used for traversal, and struct mlx5_fc size is <= 2 cache
      lines.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Amir Vadai <amir@vadai.me>
      Reviewed-by: Paul Blakey <paulb@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      12d6066c
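
The split between an external lookup index and an embedded traversal list can be sketched as follows. This is a simplified user-space illustration under stated assumptions, not the mlx5 implementation: the fixed-size pointer table is only a stand-in for the kernel idr (which grows dynamically), and `MAX_COUNTERS`, the struct layout, and all function names are hypothetical. What it shows is that the lookup index lives entirely outside the counter object, so the object itself only has to carry the list link.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_COUNTERS 1024u  /* hypothetical bound for this sketch */

struct counter {
    uint32_t id;
    uint64_t packets;
    struct counter *next;   /* only traversal needs a field in the object */
};

struct counter_set {
    struct counter *by_id[MAX_COUNTERS]; /* external index, stand-in for idr */
    struct counter *head;                /* list used for full traversal */
};

/* Register a counter in both the index and the traversal list. */
static int counter_add(struct counter_set *s, struct counter *c)
{
    if (c->id >= MAX_COUNTERS || s->by_id[c->id] != NULL)
        return -1;
    s->by_id[c->id] = c;
    c->next = s->head;
    s->head = c;
    return 0;
}

/* Lookup by id goes through the index and never walks the list. */
static struct counter *counter_lookup(const struct counter_set *s, uint32_t id)
{
    return id < MAX_COUNTERS ? s->by_id[id] : NULL;
}

/* The periodic stats pass walks the list and never touches the index. */
static uint64_t counter_total_packets(const struct counter_set *s)
{
    uint64_t total = 0;

    for (const struct counter *c = s->head; c != NULL; c = c->next)
        total += c->packets;
    return total;
}
```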
    • Vlad Buslov's avatar
      net/mlx5: Store flow counters in a list · 9aff93d7
      Vlad Buslov authored
      In order to improve performance of the flow counter stats query loop that
      traverses all configured flow counters, replace rb_tree with a doubly linked
      list. This change improves performance of traversing flow counters by
      removing the tree traversal. (Profiling data showed that the call to rb_next
      was the top CPU consumer.)
      
      However, lookup of a flow counter in the list becomes linear, instead of
      logarithmic. This problem is fixed by the next patch in the series, which
      adds an idr for fast lookup. Idr is to be used because it is not an intrusive
      data structure and doesn't require adding any new members to struct mlx5_fc,
      which allows its control data part to stay <= 1 cache line in size.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Amir Vadai <amir@vadai.me>
      Reviewed-by: Paul Blakey <paulb@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      9aff93d7
    • Vlad Buslov's avatar
      net/mlx5: Add new list to store deleted flow counters · 6e5e2283
      Vlad Buslov authored
      In order to prevent the flow counters stats work function from traversing
      the whole flow counters tree while searching for deleted flow counters, a new
      list to store deleted flow counters is added to struct mlx5_fc_stats. A
      lockless NULL-terminated singly linked list data type is used for the
      following reasons:
       - This use case only needs to add a single element to the list and
       remove/iterate the whole list. A lockless list doesn't require any
       additional synchronization for these operations.
       - The first cache line of the flow counter data structure only has space to
       store a single additional pointer, which precludes usage of a doubly linked
       list.
      
      Remove the flow counter 'deleted' flag that is no longer needed.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Amir Vadai <amir@vadai.me>
      Reviewed-by: Paul Blakey <paulb@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      6e5e2283
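
The two operations the commit above depends on, adding one element and detaching the whole list, can be sketched in user space with C11 atomics. In the kernel these are provided by the llist API (`llist_add()` and `llist_del_all()`); the code below is only an analogue of that technique with hypothetical names, not the kernel or mlx5 implementation. Each node carries a single `next` pointer, which matches the one-pointer space budget mentioned in the commit message.

```c
#include <stdatomic.h>
#include <stddef.h>

struct lnode {
    struct lnode *next;             /* the only per-node link */
};

struct lhead {
    _Atomic(struct lnode *) first;  /* NULL when the list is empty */
};

/* Push one node at the head; safe against concurrent pushers. */
static void lpush(struct lhead *h, struct lnode *n)
{
    struct lnode *old = atomic_load_explicit(&h->first, memory_order_relaxed);

    do {
        n->next = old;  /* on CAS failure, old is refreshed and we retry */
    } while (!atomic_compare_exchange_weak_explicit(&h->first, &old, n,
                                                    memory_order_release,
                                                    memory_order_relaxed));
}

/* Detach the entire list in one step; the caller then owns every node. */
static struct lnode *ltake_all(struct lhead *h)
{
    return atomic_exchange_explicit(&h->first, (struct lnode *)NULL,
                                    memory_order_acquire);
}
```

Because the consumer takes the whole chain with a single exchange and then walks it privately, neither side needs a lock. Nodes come back in reverse insertion order, which does not matter when the consumer simply frees or processes every entry.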