1. 19 Jul, 2021 3 commits
    • iavf: fix locking of critical sections · 226d5285
      Stefan Assmann authored
      To avoid races between iavf_init_task(), iavf_reset_task(),
      iavf_watchdog_task(), iavf_adminq_task() as well as the shutdown and
      remove functions more locking is required.
      The current protection by __IAVF_IN_CRITICAL_TASK is needed in
      additional places.
      
      - The reset task performs state transitions, therefore needs locking.
      - The adminq task acts on replies from the PF in
        iavf_virtchnl_completion() which may alter the states.
      - The init task is not only run during probe but also if a VF gets stuck
        to reinitialize it.
      - The shutdown function performs a state transition.
      - The remove function performs a state transition and also frees
        resources.
      
      iavf_lock_timeout() is introduced to avoid waiting indefinitely
      and causing a deadlock. On timeout, unlock and print a warning instead.
      Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
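
The idea of a lock attempt that gives up after a timeout can be sketched in userspace C. This is only an illustration, not the driver's code: `lock_timeout()` and `unlock()` are hypothetical names, a C11 `atomic_flag` stands in for the `__IAVF_IN_CRITICAL_TASK` bit, and the one-millisecond polling interval is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: try to take the lock, polling roughly once per
 * millisecond. Returns true once the lock is held, false if about
 * timeout_ms milliseconds passed without acquiring it.
 */
static bool lock_timeout(atomic_flag *lock, unsigned int timeout_ms)
{
    const struct timespec one_ms = { .tv_sec = 0, .tv_nsec = 1000000 };
    unsigned int waited = 0;

    /* test_and_set returns the previous value: true means "already held" */
    while (atomic_flag_test_and_set(lock)) {
        if (waited++ >= timeout_ms)
            return false;           /* give up instead of waiting forever */
        nanosleep(&one_ms, NULL);
    }
    return true;
}

/* Hypothetical helper: release the lock. */
static void unlock(atomic_flag *lock)
{
    atomic_flag_clear(lock);
}
```

A caller that gets `false` back can log a warning and continue with its teardown path rather than blocking indefinitely.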
    • iavf: do not override the adapter state in the watchdog task · 22c8fd71
      Stefan Assmann authored
      The iavf watchdog task overrides adapter->state to __IAVF_RESETTING
      when it detects a pending reset. It then schedules iavf_reset_task(), which
      takes care of the reset.
      
      The reset task is capable of handling the reset without changing
      adapter->state. In fact we lose the state information when the watchdog
      task prematurely changes the adapter state. This may lead to a crash if
      the iavf_remove() function gets called before the reset task runs.
      In that case (if we were in state __IAVF_RUNNING previously) the
      iavf_remove() function triggers iavf_close() which fails to close the
      device because of the incorrect state information.
      
      This may result in a crash due to pending interrupts.
      kernel BUG at drivers/pci/msi.c:357!
      [...]
      Call Trace:
       [<ffffffffbddf24dd>] pci_disable_msix+0x3d/0x50
       [<ffffffffc08d2a63>] iavf_reset_interrupt_capability+0x23/0x40 [iavf]
       [<ffffffffc08d312a>] iavf_remove+0x10a/0x350 [iavf]
       [<ffffffffbddd3359>] pci_device_remove+0x39/0xc0
       [<ffffffffbdeb492f>] __device_release_driver+0x7f/0xf0
       [<ffffffffbdeb49c3>] device_release_driver+0x23/0x30
       [<ffffffffbddcabb4>] pci_stop_bus_device+0x84/0xa0
       [<ffffffffbddcacc2>] pci_stop_and_remove_bus_device+0x12/0x20
       [<ffffffffbddf361f>] pci_iov_remove_virtfn+0xaf/0x160
       [<ffffffffbddf3bcc>] sriov_disable+0x3c/0xf0
       [<ffffffffbddf3ca3>] pci_disable_sriov+0x23/0x30
       [<ffffffffc0667365>] i40e_free_vfs+0x265/0x2d0 [i40e]
       [<ffffffffc0667624>] i40e_pci_sriov_configure+0x144/0x1f0 [i40e]
       [<ffffffffbddd5307>] sriov_numvfs_store+0x177/0x1d0
      Code: 00 00 e8 3c 25 e3 ff 49 c7 86 88 08 00 00 00 00 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 48 8b 7b 28 e8 0d 44
      RIP  [<ffffffffbbbf1068>] free_msi_irqs+0x188/0x190
      
      The solution is to not touch the adapter->state in iavf_watchdog_task()
      and let the reset task handle the state transition.
      Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • i40e: improve locking of mac_filter_hash · 8b4b0691
      Stefan Assmann authored
      i40e_config_vf_promiscuous_mode() calls
      i40e_getnum_vf_vsi_vlan_filters() without acquiring the
      mac_filter_hash_lock spinlock.
      
      This is unsafe because mac_filter_hash may get altered in another thread
      while i40e_getnum_vf_vsi_vlan_filters() traverses the hashes.
      
      Simply adding the spinlock in i40e_getnum_vf_vsi_vlan_filters() is not
      possible as it already gets called in i40e_get_vlan_list_sync() with the
      spinlock held. Therefore add a wrapper that acquires the spinlock, and
      call the correct function where appropriate.
      
      Fixes: 37d318d7 ("i40e: Remove scheduling while atomic possibility")
      Fix-suggested-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
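
The wrapper approach described in this commit message is a common locking pattern: one function that assumes the lock is already held, plus a thin wrapper that takes the lock for callers that do not hold it. A minimal userspace sketch follows, using a pthread mutex in place of the kernel spinlock. All names here (`struct vsi`, `count_vlan_filters_locked()`, `count_vlan_filters()`) are hypothetical and are not the i40e identifiers.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical filter entry; a negative vlan means "no VLAN tag". */
struct filter {
    int vlan;
    struct filter *next;
};

/* Hypothetical container whose filter list is protected by filter_lock. */
struct vsi {
    pthread_mutex_t filter_lock;
    struct filter *filters;
};

/* Core traversal. The caller must already hold vsi->filter_lock. */
static int count_vlan_filters_locked(const struct vsi *vsi)
{
    int n = 0;

    for (const struct filter *f = vsi->filters; f; f = f->next)
        if (f->vlan >= 0)
            n++;
    return n;
}

/* Wrapper for callers that do not hold the lock yet. */
static int count_vlan_filters(struct vsi *vsi)
{
    int n;

    pthread_mutex_lock(&vsi->filter_lock);
    n = count_vlan_filters_locked(vsi);
    pthread_mutex_unlock(&vsi->filter_lock);
    return n;
}
```

Code paths that already hold the lock call the `_locked` variant directly, which avoids taking a non-recursive lock twice.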
  2. 18 Jul, 2021 1 commit
  3. 17 Jul, 2021 11 commits
    • igmp: Add ip_mc_list lock in ip_check_mc_rcu · 23d2b940
      Liu Jian authored
      I got the panic below when doing a fuzz test:
      
      Kernel panic - not syncing: panic_on_warn set ...
      CPU: 0 PID: 4056 Comm: syz-executor.3 Tainted: G    B             5.14.0-rc1-00195-gcff5c4254439-dirty #2
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
      Call Trace:
      dump_stack_lvl+0x7a/0x9b
      panic+0x2cd/0x5af
      end_report.cold+0x5a/0x5a
      kasan_report+0xec/0x110
      ip_check_mc_rcu+0x556/0x5d0
      __mkroute_output+0x895/0x1740
      ip_route_output_key_hash_rcu+0x2d0/0x1050
      ip_route_output_key_hash+0x182/0x2e0
      ip_route_output_flow+0x28/0x130
      udp_sendmsg+0x165d/0x2280
      udpv6_sendmsg+0x121e/0x24f0
      inet6_sendmsg+0xf7/0x140
      sock_sendmsg+0xe9/0x180
      ____sys_sendmsg+0x2b8/0x7a0
      ___sys_sendmsg+0xf0/0x160
      __sys_sendmmsg+0x17e/0x3c0
      __x64_sys_sendmmsg+0x9e/0x100
      do_syscall_64+0x3b/0x90
      entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x462eb9
      Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8
       48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48>
       3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f3df5af1c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
      RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462eb9
      RDX: 0000000000000312 RSI: 0000000020001700 RDI: 0000000000000007
      RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f3df5af26bc
      R13: 00000000004c372d R14: 0000000000700b10 R15: 00000000ffffffff
      
      This is a use-after-free in ip_check_mc_rcu.
      In ip_mc_del_src, the ip_sf_list of pmc has been freed under pmc->lock protection.
      But access to ip_sf_list in ip_check_mc_rcu is not protected by the lock.
      Signed-off-by: Liu Jian <liujian56@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'vmxnet3-version-6' · ab0441b4
      David S. Miller authored
      Ronak Doshi says:
      
      ====================
      vmxnet3: upgrade to version 6
      
      vmxnet3 emulation has recently added several new features, including an
      increase in supported queues, removal of the power-of-2 limitation on
      queues, RSS for ESP IPv6, etc. This patch series extends the vmxnet3 driver
      to leverage these new features.
      
      Compatibility is maintained using the existing vmxnet3 versioning mechanism as
      follows:
      - new features added to vmxnet3 emulation are associated with a new vmxnet3
        version, viz. vmxnet3 version 6.
      - emulation advertises all the versions it supports to the driver.
      - during initialization, the vmxnet3 driver picks the highest version number
        supported by both the emulation and the driver and configures emulation
        to run at that version.
      
      In particular, following changes are introduced:
      
      Patch 1:
        This patch introduces utility macros for vmxnet3 version 6 comparison
        and updates Copyright information.
      
      Patch 2:
        This patch adds support to increase maximum Tx/Rx queues from 8 to 32.
      
      Patch 3:
        This patch removes the limitation of power of 2 on the queues.
      
      Patch 4:
        Uses existing get_rss_hash_opts and set_rss_hash_opts methods to add
        support for ESP IPv6 RSS.
      
      Patch 5:
        This patch reports correct RSS hash type based on the type of RSS
        performed.
      
      Patch 6:
        This patch updates maximum configurable mtu to 9190.
      
      Patch 7:
        With all vmxnet3 version 6 changes incorporated in the vmxnet3 driver,
        with this patch, the driver can configure emulation to run at vmxnet3
        version 6.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
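
The version selection described in the cover letter (pick the highest version supported by both sides) can be sketched as follows. The encoding is an assumption made for illustration: each side's supported versions are given as a bitmask in which bit (n - 1) set means "version n is supported". `pick_version()` is a hypothetical name, not a vmxnet3 function.

```c
#include <stdint.h>

/*
 * Return the highest version supported by both the device and the driver,
 * or 0 if they have no version in common.
 * Assumed encoding: bit (n - 1) set means version n is supported.
 */
static unsigned int pick_version(uint32_t device_mask, uint32_t driver_mask)
{
    uint32_t common = device_mask & driver_mask;
    unsigned int version = 0;

    /* Shift until the mask is empty; the shift count is the 1-based
     * position of the highest common bit. */
    while (common) {
        version++;
        common >>= 1;
    }
    return version;
}
```

With this encoding, a driver that supports versions 1 to 6 paired with an emulation that supports only versions 1 to 4 settles on version 4, so older emulations keep working with a newer driver.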
    • vmxnet3: update to version 6 · ce2639ad
      Ronak Doshi authored
      With all vmxnet3 version 6 changes incorporated in the vmxnet3 driver,
      the driver can configure emulation to run at vmxnet3 version 6, provided
      the emulation advertises support for version 6.
      Signed-off-by: Ronak Doshi <doshir@vmware.com>
      Acked-by: Guolin Yang <gyang@vmware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vmxnet3: increase maximum configurable mtu to 9190 · 8c5663e4
      Ronak Doshi authored
      This patch increases the maximum configurable mtu to 9190
      to accommodate jumbo packets of overlay traffic.
      Signed-off-by: Ronak Doshi <doshir@vmware.com>
      Acked-by: Guolin Yang <gyang@vmware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vmxnet3: set correct hash type based on rss information · b3973bb4
      Ronak Doshi authored
      As vmxnet3 supports IP/TCP/UDP RSS, this patch sets appropriate
      hash type based on the type of RSS performed.
      Signed-off-by: Ronak Doshi <doshir@vmware.com>
      Acked-by: Guolin Yang <gyang@vmware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vmxnet3: add support for ESP IPv6 RSS · 79d124bb
      Ronak Doshi authored
      Vmxnet3 version 4 added support for ESP RSS. However, only IPv4 was
      supported. With vmxnet3 version 6, this patch enables RSS for ESP
      IPv6 packets as well.
      Signed-off-by: Ronak Doshi <doshir@vmware.com>
      Acked-by: Guolin Yang <gyang@vmware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vmxnet3: remove power of 2 limitation on the queues · 15ccf2f4
      Ronak Doshi authored
      With version 6, vmxnet3 relaxes the restriction that the number of
      queues be a power of two. This is helpful in cases (Edge VM) where
      there are fewer than 8 vcpus and the device requires more than 4 queues.
      Signed-off-by: Ronak Doshi <doshir@vmware.com>
      Acked-by: Guolin Yang <gyang@vmware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
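
To see why the power-of-2 restriction matters, consider a VM with 6 vcpus. A small sketch of the arithmetic follows; `round_down_pow2()` and `pick_queue_count()` are hypothetical helpers written for this illustration and are not taken from the driver.

```c
#include <stdbool.h>

/* Largest power of two that is <= n. Assumes n >= 1. */
static unsigned int round_down_pow2(unsigned int n)
{
    unsigned int p = 1;

    while (p <= n / 2)
        p *= 2;
    return p;
}

/*
 * Choose a queue count: at most one queue per vcpu, capped at max_queues,
 * and rounded down to a power of two only when need_pow2 is set.
 */
static unsigned int pick_queue_count(unsigned int vcpus,
                                     unsigned int max_queues,
                                     bool need_pow2)
{
    unsigned int n = vcpus < max_queues ? vcpus : max_queues;

    if (n == 0)
        n = 1;                      /* always keep at least one queue */
    return need_pow2 ? round_down_pow2(n) : n;
}
```

With 6 vcpus, the power-of-2 rule leaves only 4 queues, while the relaxed rule allows all 6 to be used.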
    • vmxnet3: add support for 32 Tx/Rx queues · 39f9895a
      Ronak Doshi authored
      Currently, vmxnet3 supports a maximum of 8 Tx/Rx queues. With the increase
      in the number of vcpus on a VM, to achieve better performance and utilize
      idle vcpus, we need to increase the max number of queues supported.
      
      This patch enhances vmxnet3 to support a maximum of 32 Tx/Rx queues.
      Increasing the Rx queues also increases the probability of distributing
      the traffic from different flows to different queues with RSS.
      Signed-off-by: Ronak Doshi <doshir@vmware.com>
      Acked-by: Guolin Yang <gyang@vmware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vmxnet3: prepare for version 6 changes · 69dbef0d
      Ronak Doshi authored
      vmxnet3 is currently at version 4 and this patch initiates the
      preparation to accommodate changes for up to version 6. It introduces
      utility macros for vmxnet3 version 6 comparison and updates Copyright
      information.
      Signed-off-by: Ronak Doshi <doshir@vmware.com>
      Acked-by: Guolin Yang <gyang@vmware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: keep the skb in rcv queue until the whole data is read · f4919ff5
      Xin Long authored
      Currently, when userspace reads a datagram with a buffer that is
      smaller than this datagram, the data will be truncated and only
      part of it can be received by users. It doesn't seem right that
      users don't know the datagram size and have to use a huge buffer
      to read it to avoid the truncation.
      
      This patch fixes it by keeping the skb in rcv queue until the
      whole data is read by users. Only the last msg of the datagram
      will be marked with MSG_EOR, just as TCP/SCTP does.
      
      Note that this will work as above only when MSG_EOR is set in the
      flags parameter of recvmsg(), so that it won't break any old user
      applications.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Jon Maloy <jmaloy@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · 5242b0c6
      David S. Miller authored
      
      Tony Nguyen says:
      
      ====================
      1GbE Intel Wired LAN Driver Updates 2021-07-16
      
      Vinicius Costa Gomes says:
      
      Add support for steering traffic to specific RX queues using Flex Filters.
      
      As the name implies, Flex Filters are more flexible than using
      Layer-2, VLAN or MAC address filters. One of the reasons is that they
      allow "AND" operations more easily, e.g. when the user wants to steer
      some traffic based on the source MAC address and the packet ethertype.
      
      Future work includes adding support for offloading tc-u32 filters to
      the hardware.
      
      The series is divided as follows:
      
      Patch 1/5, add the low level primitives for configuring Flex filters.
      
      Patch 2/5 and 3/5, allow ethtool to manage Flex filters.
      
      Patch 4/5, when specifying filters that have multiple predicates, use
      Flex filters.
      
      Patch 5/5, Adds support for exposing the i225 LEDs using the LED subsystem.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 16 Jul, 2021 25 commits