1. 08 Mar, 2022 6 commits
    • Dave Ertman's avatar
      ice: Fix error with handling of bonding MTU · 97b01291
      Dave Ertman authored
      When a bonded interface is destroyed, .ndo_change_mtu can be called
      during the tear-down process while the RTNL lock is held.  This is a
      problem since the auxiliary driver linked to the LAN driver needs to be
      notified of the MTU change, and this requires grabbing a device_lock on
      the auxiliary_device's dev.  Currently this is being attempted in the
      same execution context as the call to .ndo_change_mtu which is causing a
      dead-lock.
      
      Move the notification of the changed MTU to a separate execution context
      (watchdog service task) and eliminate the "before" notification.
      
      Fixes: 348048e7 ("ice: Implement iidc operations")
      Signed-off-by: default avatarDave Ertman <david.m.ertman@intel.com>
      Tested-by: default avatarJonathan Toppins <jtoppins@redhat.com>
      Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      97b01291
    • Jacob Keller's avatar
      ice: stop disabling VFs due to PF error responses · 79498d5a
      Jacob Keller authored
      The ice_vc_send_msg_to_vf function has logic to detect "failure"
      responses being sent to a VF. If a VF is sent more than
      ICE_DFLT_NUM_INVAL_MSGS_ALLOWED then the VF is marked as disabled.
      Almost identical logic also existed in the i40e driver.
      
      This logic was added to the ice driver in commit 1071a835 ("ice:
      Implement virtchnl commands for AVF support") which itself copied from
      the i40e implementation in commit 5c3c48ac ("i40e: implement virtual
      device interface").
      
      Neither commit provides a proper explanation or justification of the
      check. In fact, later commits to i40e changed the logic to allow
      bypassing the check in some specific instances.
      
      The "logic" for this seems to be that error responses somehow indicate a
      malicious VF. This is not really true. The PF might be sending an error
      for any number of reasons such as lack of resources, etc.
      
      Additionally, this causes the PF to log an info message for every failed
      VF response which may confuse users, and can spam the kernel log.
      
      This behavior is not documented as part of any requirement for our
      products and other operating system drivers such as the FreeBSD
      implementation of our drivers do not include this type of check.
      
      In fact, the change from dev_err to dev_info in i40e commit 18b7af57
      ("i40e: Lower some message levels") explains that these messages
      typically don't actually indicate a real issue. It is quite likely that
      a user who hits this in practice will be very confused as the VF will be
      disabled without an obvious way to recover.
      
      We already have robust malicious driver detection logic using actual
      hardware detection mechanisms that detect and prevent invalid device
      usage. Remove the logic since its not a documented requirement and the
      behavior is not intuitive.
      
      Fixes: 1071a835 ("ice: Implement virtchnl commands for AVF support")
      Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      79498d5a
    • Jacob Keller's avatar
      i40e: stop disabling VFs due to PF error responses · 5710ab79
      Jacob Keller authored
      The i40e_vc_send_msg_to_vf_ex (and its wrapper i40e_vc_send_msg_to_vf)
      function has logic to detect "failure" responses sent to the VF. If a VF
      is sent more than I40E_DEFAULT_NUM_INVALID_MSGS_ALLOWED, then the VF is
      marked as disabled. In either case, a dev_info message is printed
      stating that a VF opcode failed.
      
      This logic originates from the early implementation of VF support in
      commit 5c3c48ac ("i40e: implement virtual device interface").
      
      That commit did not go far enough. The "logic" for this behavior seems
      to be that error responses somehow indicate a malicious VF. This is not
      really true. The PF might be sending an error for any number of reasons
      such as lacking resources, an unsupported operation, etc. This does not
      indicate a malicious VF. We already have a separate robust malicious VF
      detection which relies on hardware logic to detect and prevent a variety
      of behaviors.
      
      There is no justification for this behavior in the original
      implementation. In fact, a later commit 18b7af57 ("i40e: Lower some
      message levels") reduced the opcode failure message from a dev_err to a
      dev_info. In addition, recent commit 01cbf508 ("i40e: Fix to not
      show opcode msg on unsuccessful VF MAC change") changed the logic to
      allow quieting it for expected failures.
      
      That commit prevented this logic from kicking in for specific
      circumstances. This change did not go far enough. The behavior is not
      documented nor is it part of any requirement for our products. Other
      operating systems such as the FreeBSD implementation of our driver do
      not include this logic.
      
      It is clear this check does not make sense, and causes problems which
      led to ugly workarounds.
      
      Fix this by just removing the entire logic and the need for the
      i40e_vc_send_msg_to_vf_ex function.
      
      Fixes: 01cbf508 ("i40e: Fix to not show opcode msg on unsuccessful VF MAC change")
      Fixes: 5c3c48ac ("i40e: implement virtual device interface")
      Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      5710ab79
    • Michal Maloszewski's avatar
      iavf: Fix adopting new combined setting · 57d03f56
      Michal Maloszewski authored
      In some cases overloaded flag IAVF_FLAG_REINIT_ITR_NEEDED
      which should indicate that interrupts need to be completely
      reinitialized during reset leads to RTNL deadlocks using ethtool -C
      while a reset is in progress.
      To fix, it was added a new flag IAVF_FLAG_REINIT_MSIX_NEEDED
      used to trigger MSI-X reinit.
      New combined setting is fixed adopt after VF reset.
      This has been implemented by call reinit interrupt scheme
      during VF reset.
      Without this fix new combined setting has never been adopted.
      
      Fixes: 209f2f9c ("iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 negotiation")
      Signed-off-by: default avatarGrzegorz Szczurek <grzegorzx.szczurek@intel.com>
      Signed-off-by: default avatarMitch Williams <mitch.a.williams@intel.com>
      Signed-off-by: default avatarMichal Maloszewski <michal.maloszewski@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      57d03f56
    • Michal Maloszewski's avatar
      iavf: Fix handling of vlan strip virtual channel messages · 2cf29e55
      Michal Maloszewski authored
      Modify netdev->features for vlan stripping based on virtual
      channel messages received from the PF. Change is needed
      to synchronize vlan strip status between PF sysfs and iavf ethtool.
      
      Fixes: 5951a2b9 ("iavf: Fix VLAN feature flags after VFR")
      Signed-off-by: default avatarNorbert Ciosek <norbertx.ciosek@intel.com>
      Signed-off-by: default avatarMichal Maloszewski <michal.maloszewski@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      2cf29e55
    • Russell King (Oracle)'s avatar
      net: dsa: mt7530: fix incorrect test in mt753x_phylink_validate() · e5417cbf
      Russell King (Oracle) authored
      Discussing one of the tests in mt753x_phylink_validate() with Landen
      Chao confirms that the "||" should be "&&". Fix this.
      
      Fixes: c288575f ("net: dsa: mt7530: Add the support of MT7531 switch")
      Signed-off-by: default avatarRussell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Link: https://lore.kernel.org/r/E1nRCF0-00CiXD-7q@rmk-PC.armlinux.org.ukSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      e5417cbf
  2. 07 Mar, 2022 6 commits
  3. 06 Mar, 2022 1 commit
  4. 05 Mar, 2022 2 commits
  5. 04 Mar, 2022 3 commits
    • Tung Nguyen's avatar
      tipc: fix kernel panic when enabling bearer · be4977b8
      Tung Nguyen authored
      When enabling a bearer on a node, a kernel panic is observed:
      
      [    4.498085] RIP: 0010:tipc_mon_prep+0x4e/0x130 [tipc]
      ...
      [    4.520030] Call Trace:
      [    4.520689]  <IRQ>
      [    4.521236]  tipc_link_build_proto_msg+0x375/0x750 [tipc]
      [    4.522654]  tipc_link_build_state_msg+0x48/0xc0 [tipc]
      [    4.524034]  __tipc_node_link_up+0xd7/0x290 [tipc]
      [    4.525292]  tipc_rcv+0x5da/0x730 [tipc]
      [    4.526346]  ? __netif_receive_skb_core+0xb7/0xfc0
      [    4.527601]  tipc_l2_rcv_msg+0x5e/0x90 [tipc]
      [    4.528737]  __netif_receive_skb_list_core+0x20b/0x260
      [    4.530068]  netif_receive_skb_list_internal+0x1bf/0x2e0
      [    4.531450]  ? dev_gro_receive+0x4c2/0x680
      [    4.532512]  napi_complete_done+0x6f/0x180
      [    4.533570]  virtnet_poll+0x29c/0x42e [virtio_net]
      ...
      
      The node in question is receiving activate messages in another
      thread after changing bearer status to allow message sending/
      receiving in current thread:
      
               thread 1           |              thread 2
               --------           |              --------
                                  |
      tipc_enable_bearer()        |
        test_and_set_bit_lock()   |
          tipc_bearer_xmit_skb()  |
                                  | tipc_l2_rcv_msg()
                                  |   tipc_rcv()
                                  |     __tipc_node_link_up()
                                  |       tipc_link_build_state_msg()
                                  |         tipc_link_build_proto_msg()
                                  |           tipc_mon_prep()
                                  |           {
                                  |             ...
                                  |             // null-pointer dereference
                                  |             u16 gen = mon->dom_gen;
                                  |             ...
                                  |           }
        // Not being executed yet |
        tipc_mon_create()         |
        {                         |
          ...                     |
          // allocate             |
          mon = kzalloc();        |
          ...                     |
        }                         |
      
      Monitoring pointer in thread 2 is dereferenced before monitoring data
      is allocated in thread 1. This causes kernel panic.
      
      This commit fixes it by allocating the monitoring data before enabling
      the bearer to receive messages.
      
      Fixes: 35c55c98 ("tipc: add neighbor monitoring framework")
      Reported-by: default avatarShuang Li <shuali@redhat.com>
      Acked-by: default avatarJon Maloy <jmaloy@redhat.com>
      Signed-off-by: default avatarTung Nguyen <tung.q.nguyen@dektech.com.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      be4977b8
    • Robert Hancock's avatar
      net: macb: Fix lost RX packet wakeup race in NAPI receive · 0bf476fc
      Robert Hancock authored
      There is an oddity in the way the RSR register flags propagate to the
      ISR register (and the actual interrupt output) on this hardware: it
      appears that RSR register bits only result in ISR being asserted if the
      interrupt was actually enabled at the time, so enabling interrupts with
      RSR bits already set doesn't trigger an interrupt to be raised. There
      was already a partial fix for this race in the macb_poll function where
      it checked for RSR bits being set and re-triggered NAPI receive.
      However, there was a still a race window between checking RSR and
      actually enabling interrupts, where a lost wakeup could happen. It's
      necessary to check again after enabling interrupts to see if RSR was set
      just prior to the interrupt being enabled, and re-trigger receive in that
      case.
      
      This issue was noticed in a point-to-point UDP request-response protocol
      which periodically saw timeouts or abnormally high response times due to
      received packets not being processed in a timely fashion. In many
      applications, more packets arriving, including TCP retransmissions, would
      cause the original packet to be processed, thus masking the issue.
      
      Fixes: 02f7a34f ("net: macb: Re-enable RX interrupt only when RX is done")
      Cc: stable@vger.kernel.org
      Co-developed-by: default avatarScott McNutt <scott.mcnutt@siriusxm.com>
      Signed-off-by: default avatarScott McNutt <scott.mcnutt@siriusxm.com>
      Signed-off-by: default avatarRobert Hancock <robert.hancock@calian.com>
      Tested-by: default avatarClaudiu Beznea <claudiu.beznea@microchip.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0bf476fc
    • Jakub Kicinski's avatar
      Merge tag 'for-net-2022-03-03' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth · 9f3956d6
      Jakub Kicinski authored
      Luiz Augusto von Dentz says:
      
      ====================
      bluetooth pull request for net:
      
       - Fix regression with processing of MGMT commands
       - Fix unbalanced unlock in Set Device Flags
      
      * tag 'for-net-2022-03-03' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
        Bluetooth: hci_sync: Fix not processing all entries on cmd_sync_work
        Bluetooth: hci_core: Fix unbalanced unlock in set_device_flags()
      ====================
      
      Link: https://lore.kernel.org/r/20220303210743.314679-1-luiz.dentz@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      9f3956d6
  6. 03 Mar, 2022 22 commits
    • Linus Torvalds's avatar
      Merge tag 'net-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · b949c21f
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from can, xfrm, wifi, bluetooth, and netfilter.
      
        Lots of various size fixes, the length of the tag speaks for itself.
        Most of the 5.17-relevant stuff comes from xfrm, wifi and bt trees
        which had been lagging as you pointed out previously. But there's also
        a larger than we'd like portion of fixes for bugs from previous
        releases.
      
        Three more fixes still under discussion, including and xfrm revert for
        uAPI error.
      
        Current release - regressions:
      
         - iwlwifi: don't advertise TWT support, prevent FW crash
      
         - xfrm: fix the if_id check in changelink
      
         - xen/netfront: destroy queues before real_num_tx_queues is zeroed
      
         - bluetooth: fix not checking MGMT cmd pending queue, make scanning
           work again
      
        Current release - new code bugs:
      
         - mptcp: make SIOCOUTQ accurate for fallback socket
      
         - bluetooth: access skb->len after null check
      
         - bluetooth: hci_sync: fix not using conn_timeout
      
         - smc: fix cleanup when register ULP fails
      
         - dsa: restore error path of dsa_tree_change_tag_proto
      
         - iwlwifi: fix build error for IWLMEI
      
         - iwlwifi: mvm: propagate error from request_ownership to the user
      
        Previous releases - regressions:
      
         - xfrm: fix pMTU regression when reported pMTU is too small
      
         - xfrm: fix TCP MSS calculation when pMTU is close to 1280
      
         - bluetooth: fix bt_skb_sendmmsg not allocating partial chunks
      
         - ipv6: ensure we call ipv6_mc_down() at most once, prevent leaks
      
         - ipv6: prevent leaks in igmp6 when input queues get full
      
         - fix up skbs delta_truesize in UDP GRO frag_list
      
         - eth: e1000e: fix possible HW unit hang after an s0ix exit
      
         - eth: e1000e: correct NVM checksum verification flow
      
         - ptp: ocp: fix large time adjustments
      
        Previous releases - always broken:
      
         - tcp: make tcp_read_sock() more robust in presence of urgent data
      
         - xfrm: distinguishing SAs and SPs by if_id in xfrm_migrate
      
         - xfrm: fix xfrm_migrate issues when address family changes
      
         - dcb: flush lingering app table entries for unregistered devices
      
         - smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error
      
         - mac80211: fix EAPoL rekey fail in 802.3 rx path
      
         - mac80211: fix forwarded mesh frames AC & queue selection
      
         - netfilter: nf_queue: fix socket access races and bugs
      
         - batman-adv: fix ToCToU iflink problems and check the result belongs
           to the expected net namespace
      
         - can: gs_usb, etas_es58x: fix opened_channel_cnt's accounting
      
         - can: rcar_canfd: register the CAN device when fully ready
      
         - eth: igb, igc: phy: drop premature return leaking HW semaphore
      
         - eth: ixgbe: xsk: change !netif_carrier_ok() handling in
           ixgbe_xmit_zc(), prevent live lock when link goes down
      
         - eth: stmmac: only enable DMA interrupts when ready
      
         - eth: sparx5: move vlan checks before any changes are made
      
         - eth: iavf: fix races around init, removal, resets and vlan ops
      
         - ibmvnic: more reset flow fixes
      
        Misc:
      
         - eth: fix return value of __setup handlers"
      
      * tag 'net-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (92 commits)
        ipv6: fix skb drops in igmp6_event_query() and igmp6_event_report()
        net: dsa: make dsa_tree_change_tag_proto actually unwind the tag proto change
        ixgbe: xsk: change !netif_carrier_ok() handling in ixgbe_xmit_zc()
        selftests: mlxsw: resource_scale: Fix return value
        selftests: mlxsw: tc_police_scale: Make test more robust
        net: dcb: disable softirqs in dcbnl_flush_dev()
        bnx2: Fix an error message
        sfc: extend the locking on mcdi->seqno
        net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error cause by server
        net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error generated by client
        net: arcnet: com20020: Fix null-ptr-deref in com20020pci_probe()
        tcp: make tcp_read_sock() more robust
        bpf, sockmap: Do not ignore orig_len parameter
        net: ipa: add an interconnect dependency
        net: fix up skbs delta_truesize in UDP GRO frag_list
        iwlwifi: mvm: return value for request_ownership
        nl80211: Update bss channel on channel switch for P2P_CLIENT
        iwlwifi: fix build error for IWLMEI
        ptp: ocp: Add ptp_ocp_adjtime_coarse for large adjustments
        batman-adv: Don't expect inter-netns unique iflink indices
        ...
      b949c21f
    • Linus Torvalds's avatar
      Merge tag 'mips-fixes-5.17_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · e58bd49d
      Linus Torvalds authored
      Pull MIPS fixes from Thomas Bogendoerfer:
      
       - Fix memory detection for MT7621 devices
      
       - Fix setnocoherentio kernel option
      
       - Fix warning when CONFIG_SCHED_CORE is enabled
      
      * tag 'mips-fixes-5.17_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        MIPS: ralink: mt7621: use bitwise NOT instead of logical
        mips: setup: fix setnocoherentio() boolean setting
        MIPS: smp: fill in sibling and core maps earlier
        MIPS: ralink: mt7621: do memory detection on KSEG1
      e58bd49d
    • Linus Torvalds's avatar
      Merge tag 'auxdisplay-for-linus-v5.17-rc7' of git://github.com/ojeda/linux · 4d5ae234
      Linus Torvalds authored
      Pull auxdisplay fixes from Miguel Ojeda:
       "A few lcd2s fixes from Andy Shevchenko"
      
      * tag 'auxdisplay-for-linus-v5.17-rc7' of git://github.com/ojeda/linux:
        auxdisplay: lcd2s: Use proper API to free the instance of charlcd object
        auxdisplay: lcd2s: Fix memory leak in ->remove()
        auxdisplay: lcd2s: Fix lcd2s_redefine_char() feature
      4d5ae234
    • Eric Dumazet's avatar
      ipv6: fix skb drops in igmp6_event_query() and igmp6_event_report() · 2d3916f3
      Eric Dumazet authored
      While investigating on why a synchronize_net() has been added recently
      in ipv6_mc_down(), I found that igmp6_event_query() and igmp6_event_report()
      might drop skbs in some cases.
      
      Discussion about removing synchronize_net() from ipv6_mc_down()
      will happen in a different thread.
      
      Fixes: f185de28 ("mld: add new workqueues for process mld events")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Taehee Yoo <ap420073@gmail.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20220303173728.937869-1-eric.dumazet@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      2d3916f3
    • Vladimir Oltean's avatar
      net: dsa: make dsa_tree_change_tag_proto actually unwind the tag proto change · e1bec7fa
      Vladimir Oltean authored
      The blamed commit said one thing but did another. It explains that we
      should restore the "return err" to the original "goto out_unwind_tagger",
      but instead it replaced it with "goto out_unlock".
      
      When DSA_NOTIFIER_TAG_PROTO fails after the first switch of a
      multi-switch tree, the switches would end up not using the same tagging
      protocol.
      
      Fixes: 0b0e2ff1 ("net: dsa: restore error path of dsa_tree_change_tag_proto")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/20220303154249.1854436-1-vladimir.oltean@nxp.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      e1bec7fa
    • Maciej Fijalkowski's avatar
      ixgbe: xsk: change !netif_carrier_ok() handling in ixgbe_xmit_zc() · 6c7273a2
      Maciej Fijalkowski authored
      Commit c685c69f ("ixgbe: don't do any AF_XDP zero-copy transmit if
      netif is not OK") addressed the ring transient state when
      MEM_TYPE_XSK_BUFF_POOL was being configured which in turn caused the
      interface to through down/up. Maurice reported that when carrier is not
      ok and xsk_pool is present on ring pair, ksoftirqd will consume 100% CPU
      cycles due to the constant NAPI rescheduling as ixgbe_poll() states that
      there is still some work to be done.
      
      To fix this, do not set work_done to false for a !netif_carrier_ok().
      
      Fixes: c685c69f ("ixgbe: don't do any AF_XDP zero-copy transmit if netif is not OK")
      Reported-by: default avatarMaurice Baijens <maurice.baijens@ellips.com>
      Tested-by: default avatarMaurice Baijens <maurice.baijens@ellips.com>
      Signed-off-by: default avatarMaciej Fijalkowski <maciej.fijalkowski@intel.com>
      Tested-by: default avatarSandeep Penigalapati <sandeep.penigalapati@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      6c7273a2
    • Jakub Kicinski's avatar
      Merge branch 'selftests-mlxsw-a-couple-of-fixes' · 312f2d50
      Jakub Kicinski authored
      Ido Schimmel says:
      
      ====================
      selftests: mlxsw: A couple of fixes
      
      Patch #1 fixes a breakage due to a change in iproute2 output. The real
      problem is not iproute2, but the fact that the check was not strict
      enough. Fixed by using JSON output instead. Targeting at net so that the
      test will pass as part of old and new kernels regardless of iproute2
      version.
      
      Patch #2 fixes an issue uncovered by the first one.
      ====================
      
      Link: https://lore.kernel.org/r/20220302161447.217447-1-idosch@nvidia.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      312f2d50
    • Amit Cohen's avatar
      selftests: mlxsw: resource_scale: Fix return value · 196f9bc0
      Amit Cohen authored
      The test runs several test cases and is supposed to return an error in
      case at least one of them failed.
      
      Currently, the check of the return value of each test case is in the
      wrong place, which can result in the wrong return value. For example:
      
       # TESTS='tc_police' ./resource_scale.sh
       TEST: 'tc_police' [default] 968                                     [FAIL]
               tc police offload count failed
       Error: mlxsw_spectrum: Failed to allocate policer index.
       We have an error talking to the kernel
       Command failed /tmp/tmp.i7Oc5HwmXY:969
       TEST: 'tc_police' [default] overflow 969                            [ OK ]
       ...
       TEST: 'tc_police' [ipv4_max] overflow 969                           [ OK ]
      
       $ echo $?
       0
      
      Fix this by moving the check to be done after each test case.
      
      Fixes: 059b18e2 ("selftests: mlxsw: Return correct error code in resource scale test")
      Signed-off-by: default avatarAmit Cohen <amcohen@nvidia.com>
      Reviewed-by: default avatarPetr Machata <petrm@nvidia.com>
      Signed-off-by: default avatarIdo Schimmel <idosch@nvidia.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      196f9bc0
    • Amit Cohen's avatar
      selftests: mlxsw: tc_police_scale: Make test more robust · dc975207
      Amit Cohen authored
      The test adds tc filters and checks how many of them were offloaded by
      grepping for 'in_hw'.
      
      iproute2 commit f4cd4f127047 ("tc: add skip_hw and skip_sw to control
      action offload") added offload indication to tc actions, producing the
      following output:
      
       $ tc filter show dev swp2 ingress
       ...
       filter protocol ipv6 pref 1000 flower chain 0 handle 0x7c0
         eth_type ipv6
         dst_ip 2001:db8:1::7bf
         skip_sw
         in_hw in_hw_count 1
               action order 1:  police 0x7c0 rate 10Mbit burst 100Kb mtu 2Kb action drop overhead 0b
               ref 1 bind 1
               not_in_hw
               used_hw_stats immediate
      
      The current grep expression matches on both 'in_hw' and 'not_in_hw',
      resulting in incorrect results.
      
      Fix that by using JSON output instead.
      
      Fixes: 5061e773 ("selftests: mlxsw: Add scale test for tc-police")
      Signed-off-by: default avatarAmit Cohen <amcohen@nvidia.com>
      Reviewed-by: default avatarPetr Machata <petrm@nvidia.com>
      Signed-off-by: default avatarIdo Schimmel <idosch@nvidia.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      dc975207
    • Vladimir Oltean's avatar
      net: dcb: disable softirqs in dcbnl_flush_dev() · 10b6bb62
      Vladimir Oltean authored
      Ido Schimmel points out that since commit 52cff74e ("dcbnl : Disable
      software interrupts before taking dcb_lock"), the DCB API can be called
      by drivers from softirq context.
      
      One such in-tree example is the chelsio cxgb4 driver:
      dcb_rpl
      -> cxgb4_dcb_handle_fw_update
         -> dcb_ieee_setapp
      
      If the firmware for this driver happened to send an event which resulted
      in a call to dcb_ieee_setapp() at the exact same time as another
      DCB-enabled interface was unregistering on the same CPU, the softirq
      would deadlock, because the interrupted process was already holding the
      dcb_lock in dcbnl_flush_dev().
      
      Fix this unlikely event by using spin_lock_bh() in dcbnl_flush_dev() as
      in the rest of the dcbnl code.
      
      Fixes: 91b0383f ("net: dcb: flush lingering app table entries for unregistered devices")
      Reported-by: default avatarIdo Schimmel <idosch@idosch.org>
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/20220302193939.1368823-1-vladimir.oltean@nxp.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      10b6bb62
    • Christophe JAILLET's avatar
      bnx2: Fix an error message · 8ccffe9a
      Christophe JAILLET authored
      Fix an error message and report the correct failing function.
      Signed-off-by: default avatarChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8ccffe9a
    • Niels Dossche's avatar
      sfc: extend the locking on mcdi->seqno · f1fb205e
      Niels Dossche authored
      seqno could be read as a stale value outside of the lock. The lock is
      already acquired to protect the modification of seqno against a possible
      race condition. Place the reading of this value also inside this locking
      to protect it against a possible race condition.
      Signed-off-by: default avatarNiels Dossche <dossche.niels@gmail.com>
      Acked-by: default avatarMartin Habets <habetsm.xilinx@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f1fb205e
    • Luiz Augusto von Dentz's avatar
      Bluetooth: hci_sync: Fix not processing all entries on cmd_sync_work · 008ee9eb
      Luiz Augusto von Dentz authored
      hci_cmd_sync_queue can be called multiple times, each adding a
      hci_cmd_sync_work_entry, before hci_cmd_sync_work is run so this makes
      sure they are all dequeued properly otherwise it creates a backlog of
      entries that are never run.
      
      Link: https://lore.kernel.org/all/CAJCQCtSeUtHCgsHXLGrSTWKmyjaQDbDNpP4rb0i+RE+L2FTXSA@mail.gmail.com/T/
      Fixes: 6a98e383 ("Bluetooth: Add helper for serialized HCI command execution")
      Tested-by: default avatarChris Clayton <chris2553@googlemail.com>
      Signed-off-by: default avatarLuiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      008ee9eb
    • Hans de Goede's avatar
      Bluetooth: hci_core: Fix unbalanced unlock in set_device_flags() · 815d5121
      Hans de Goede authored
      There is only one "goto done;" in set_device_flags() and this happens
      *before* hci_dev_lock() is called, move the done label to after the
      hci_dev_unlock() to fix the following unlock balance:
      
      [   31.493567] =====================================
      [   31.493571] WARNING: bad unlock balance detected!
      [   31.493576] 5.17.0-rc2+ #13 Tainted: G         C  E
      [   31.493581] -------------------------------------
      [   31.493584] bluetoothd/685 is trying to release lock (&hdev->lock) at:
      [   31.493594] [<ffffffffc07603f5>] set_device_flags+0x65/0x1f0 [bluetooth]
      [   31.493684] but there are no more locks to release!
      
      Note this bug has been around for a couple of years, but before
      commit fe92ee64 ("Bluetooth: hci_core: Rework hci_conn_params flags")
      supported_flags was hardcoded to "((1U << HCI_CONN_FLAG_MAX) - 1)" so
      the check for unsupported flags which does the "goto done;" never
      triggered.
      
      Fixes: fe92ee64 ("Bluetooth: hci_core: Rework hci_conn_params flags")
      Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      815d5121
    • David S. Miller's avatar
      Merge branch 'smc-fix' · f8e9bd34
      David S. Miller authored
      D. Wythe says:
      
      ====================
      fix unexpected SMC_CLC_DECL_ERR_REGRMB error
      
      We can easily trigger the SMC_CLC_DECL_ERR_REGRMB exception within
      following script:
      
      server: smc_run nginx
      client: smc_run  ./wrk -c 2000 -t 8 -d 20 http://smc-server
      
      And we can clearly see that this error is also divided into two types:
      
      1. 0x09990003
      2. 0x05000000/0x09990003
      
      Which has the same root causes, but the immediate causes vary.
      
      The root cause of this issues is that remove connections from link group
      is not synchronous with add/delete rtoken entry,  which means that even
      the number of connections is less that SMC_RMBS_PER_LGR_MAX, it does not
      mean that the connection can register rtoken successfully later. In
      other words, the rtoken entry may released, This will cause an
      unexpected SMC_CLC_DECL_ERR_REGRMB to be reported, and then this SMC
      connections have to fallback to TCP.
      
      This patch set handles two types of SMC_CLC_DECL_ERR_REGRMB exceptions
      from different perspectives.
      
      Patch 1: fix the 0x05000000/0x09990003 error.
      Patch 2: fix the 0x09990003 error.
      
      After those patches, there is no SMC_CLC_DECL_ERR_REGRMB exceptions in
      my
      test case any more.
      
      v1 -> v2:
      - add bugfix patch for SMC_CLC_DECL_ERR_REGRMB cause by server side
      v2 -> v3:
      - fix incorrect mail thread
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f8e9bd34
    • D. Wythe's avatar
      net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error cause by server · 4940a1fd
      D. Wythe authored
      The problem of SMC_CLC_DECL_ERR_REGRMB on the server is very clear.
      Based on the fact that whether a new SMC connection can be accepted or
      not depends on not only the limit of conn nums, but also the available
      entries of rtoken. Since the rtoken release is trigger by peer, while
      the conn nums is decrease by local, tons of thing can happen in this
      time difference.
      
      This only thing that needs to be mentioned is that now all connection
      creations are completely protected by smc_server_lgr_pending lock, it's
      enough to check only the available entries in rtokens_used_mask.
      
      Fixes: cd6851f3 ("smc: remote memory buffers (RMBs)")
      Signed-off-by: default avatarD. Wythe <alibuda@linux.alibaba.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4940a1fd
    • D. Wythe's avatar
      net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error generated by client · 0537f0a2
      D. Wythe authored
      The main reason for this unexpected SMC_CLC_DECL_ERR_REGRMB in client
      dues to following execution sequence:
      
      Server Conn A:           Server Conn B:			Client Conn B:
      
      smc_lgr_unregister_conn
                              smc_lgr_register_conn
                              smc_clc_send_accept     ->
                                                              smc_rtoken_add
      smcr_buf_unuse
      		->		Client Conn A:
      				smc_rtoken_delete
      
      smc_lgr_unregister_conn() makes current link available to assigned to new
      incoming connection, while smcr_buf_unuse() has not executed yet, which
      means that smc_rtoken_add may fail because of insufficient rtoken_entry,
      reversing their execution order will avoid this problem.
      
      Fixes: 3e034725 ("net/smc: common functions for RMBs and send buffers")
      Signed-off-by: default avatarD. Wythe <alibuda@linux.alibaba.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0537f0a2
    • Zheyu Ma's avatar
      net: arcnet: com20020: Fix null-ptr-deref in com20020pci_probe() · bd6f1fd5
      Zheyu Ma authored
      During driver initialization, the pointer of card info, i.e. the
      variable 'ci' is required. However, the definition of
      'com20020pci_id_table' reveals that this field is empty for some
      devices, which will cause null pointer dereference when initializing
      these devices.
      
      The following log reveals it:
      
      [    3.973806] KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f]
      [    3.973819] RIP: 0010:com20020pci_probe+0x18d/0x13e0 [com20020_pci]
      [    3.975181] Call Trace:
      [    3.976208]  local_pci_probe+0x13f/0x210
      [    3.977248]  pci_device_probe+0x34c/0x6d0
      [    3.977255]  ? pci_uevent+0x470/0x470
      [    3.978265]  really_probe+0x24c/0x8d0
      [    3.978273]  __driver_probe_device+0x1b3/0x280
      [    3.979288]  driver_probe_device+0x50/0x370
      
      Fix this by checking whether the 'ci' is a null pointer first.
      
      Fixes: 8c14f9c7 ("ARCNET: add com20020 PCI IDs with metadata")
      Signed-off-by: default avatarZheyu Ma <zheyuma97@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bd6f1fd5
    • Eric Dumazet's avatar
      tcp: make tcp_read_sock() more robust · e3d5ea2c
      Eric Dumazet authored
      If recv_actor() returns an incorrect value, tcp_read_sock()
      might loop forever.
      
      Instead, issue a one time warning and make sure to make progress.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Acked-by: default avatarJakub Sitnicki <jakub@cloudflare.com>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/r/20220302161723.3910001-2-eric.dumazet@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      e3d5ea2c
    • Eric Dumazet's avatar
      bpf, sockmap: Do not ignore orig_len parameter · 60ce37b0
      Eric Dumazet authored
      Currently, sk_psock_verdict_recv() returns skb->len
      
      This is problematic because tcp_read_sock() might have
      passed orig_len < skb->len, due to the presence of TCP urgent data.
      
      This causes an infinite loop from tcp_read_sock()
      
      Followup patch will make tcp_read_sock() more robust vs bad actors.
      
      Fixes: ef565928 ("bpf, sockmap: Allow skipping sk_skb parser program")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Acked-by: default avatarJakub Sitnicki <jakub@cloudflare.com>
      Tested-by: default avatarJakub Sitnicki <jakub@cloudflare.com>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/r/20220302161723.3910001-1-eric.dumazet@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      60ce37b0
    • Alex Elder's avatar
      net: ipa: add an interconnect dependency · 1dba41c9
      Alex Elder authored
      In order to function, the IPA driver very clearly requires the
      interconnect framework to be enabled in the kernel configuration.
      State that dependency in the Kconfig file.
      
      This became a problem when CONFIG_COMPILE_TEST support was added.
      Non-Qualcomm platforms won't necessarily enable CONFIG_INTERCONNECT.
      Reported-by: default avatarkernel test robot <lkp@intel.com>
      Fixes: 38a4066f ("net: ipa: support COMPILE_TEST")
      Signed-off-by: default avatarAlex Elder <elder@linaro.org>
      Link: https://lore.kernel.org/r/20220301113440.257916-1-elder@linaro.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      1dba41c9
    • lena wang's avatar
      net: fix up skbs delta_truesize in UDP GRO frag_list · 224102de
      lena wang authored
      The truesize for a UDP GRO packet is added by main skb and skbs in main
      skb's frag_list:
      skb_gro_receive_list
              p->truesize += skb->truesize;
      
      The commit 53475c5d ("net: fix use-after-free when UDP GRO with
      shared fraglist") introduced a truesize increase for frag_list skbs.
      When uncloning skb, it will call pskb_expand_head and trusesize for
      frag_list skbs may increase. This can occur when allocators uses
      __netdev_alloc_skb and not jump into __alloc_skb. This flow does not
      use ksize(len) to calculate truesize while pskb_expand_head uses.
      skb_segment_list
      err = skb_unclone(nskb, GFP_ATOMIC);
      pskb_expand_head
              if (!skb->sk || skb->destructor == sock_edemux)
                      skb->truesize += size - osize;
      
      If we uses increased truesize adding as delta_truesize, it will be
      larger than before and even larger than previous total truesize value
      if skbs in frag_list are abundant. The main skb truesize will become
      smaller and even a minus value or a huge value for an unsigned int
      parameter. Then the following memory check will drop this abnormal skb.
      
      To avoid this error we should use the original truesize to segment the
      main skb.
      
      Fixes: 53475c5d ("net: fix use-after-free when UDP GRO with shared fraglist")
      Signed-off-by: default avatarlena wang <lena.wang@mediatek.com>
      Acked-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/1646133431-8948-1-git-send-email-lena.wang@mediatek.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      224102de