1. 20 Jul, 2022 22 commits
  2. 19 Jul, 2022 18 commits
    • Merge branch 'amt-fix-validation-and-synchronization-bugs' · b3fcfc4f
      Paolo Abeni authored
      Taehee Yoo says:
      
      ====================
      amt: fix validation and synchronization bugs
      
      There are some synchronization issues in the amt module.
      In particular, an amt gateway does not properly synchronize its own
      variables and status (amt->status).
      This patchset uses a workqueue so that handlers run in a single thread.
      A global lock would also work, but it would lead to complex locking.
      
      In this patchset, only the gateway uses the workqueue.
      The reason is that the gateway has to manage its own state and
      variables statefully.
      The relay doesn't need to manage tunnels statefully; stateless is okay.
      So, relay side message handlers can safely be called concurrently.
      But that doesn't mean that no lock is needed.

      Only the amt multicast data message type is not processed by the work
      queue, because it contains actual multicast data.
      So, it should be processed immediately.
      
      When any amt gateway event is triggered (sending a discovery message by
      delayed_work, sending a request message by delayed_work, or receiving
      a message), it stores the event and skb in the event queue
      (amt->events[16]).
      Then, the workqueue processes these events one by one.
      
      The first patch is to use the work queue.
      
      The second patch is to remove unnecessary lock due to a previous patch.
      
      The third patch is to use READ_ONCE() in the amt module.
      Even if the amt module uses a single thread, some variables (ready4,
      ready6, amt->status) can be accessed concurrently.
      
      The fourth patch is to add missing nonce generation logic when it sends a
      new request message.
      
      The fifth patch is to drop unexpected advertisement messages.
      An advertisement message should be received only after the gateway
      sends a discovery message first.
      So, the gateway should drop advertisement messages if it has never
      sent a discovery message, and it should also drop duplicate
      advertisement messages.
      The nonce is a good way to distinguish whether a received message is
      an expected message or not.
      
      The sixth patch is to drop unexpected query messages.
      This is the same behavior as the fifth patch.
      Query messages should be received only after the gateway sends a request
      message first.
      The nonce variable is used to distinguish whether a message is a reply
      to a previous request message or not.
      amt->ready4 and amt->ready6 are used to detect duplicate messages.
      
      The seventh patch is to drop unexpected multicast data.
      An AMT gateway should not receive the multicast data message type before
      the tunnel between gateway and relay is established.
      In order to drop unexpected multicast data messages, it checks amt->status.
      
      The last patch is to fix a locking problem on the relay side.
      amt->nr_tunnels variable is protected by amt->lock.
      But amt_request_handler() doesn't protect this variable.
      
      v2:
       - Use local_bh_disable() instead of rcu_read_lock_bh() in
         amt_membership_query_handler.
       - Fix using uninitialized variables.
       - Fix unexpected start of the event_wq after stopping.
       - Fix possible deadlock in amt_event_work().
       - Add a limit variable in amt_event_work() to prevent infinite working.
       - Rename amt_queue_events() to amt_queue_event().
      ====================
      
      Link: https://lore.kernel.org/r/20220717160910.19156-1-ap420073@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • amt: do not use amt->nr_tunnels outside of lock · 98991848
      Taehee Yoo authored
      amt->nr_tunnels is protected by amt->lock.
      But amt_request_handler() has been using this variable without holding
      amt->lock.
      So, extend the scope of amt->lock in amt_request_handler() to
      protect the amt->nr_tunnels variable.
      
      Fixes: cbc21dc1 ("amt: add data plane of amt interface")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
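The locking rule described in this commit can be illustrated with a small userspace sketch. It is not the driver's code: `struct relay`, `relay_add_tunnel()`, the `max_tunnels` limit, and the spinlock built on `atomic_flag` are all assumptions made for illustration. The point it shows is that the check of the counter and its update happen inside one critical section.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relay state; not the kernel's struct. */
struct relay {
    atomic_flag lock;          /* plays the role of amt->lock */
    unsigned int nr_tunnels;   /* only touched while the lock is held */
    unsigned int max_tunnels;  /* assumed limit, for illustration only */
};

static void relay_lock(struct relay *r)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&r->lock, memory_order_acquire))
        ;
}

static void relay_unlock(struct relay *r)
{
    atomic_flag_clear_explicit(&r->lock, memory_order_release);
}

/* Check the limit and bump the counter in the same critical section. */
static bool relay_add_tunnel(struct relay *r)
{
    bool added = false;

    assert(r != NULL);
    relay_lock(r);
    if (r->nr_tunnels < r->max_tunnels) {
        r->nr_tunnels++;
        added = true;
    }
    relay_unlock(r);
    return added;
}
```

Reading `nr_tunnels` before taking the lock and incrementing it afterwards would let two concurrent callers both pass the check, which is the kind of race the wider lock scope avoids.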
    • amt: drop unexpected multicast data · e882827d
      Taehee Yoo authored
      The AMT gateway interface should not receive unexpected multicast data.
      The multicast data message type should be received only after sending
      an update message, which means the establishment between gateway and
      relay is finished.
      So, amt_multicast_data_handler() checks amt->status.
      
      Fixes: cbc21dc1 ("amt: add data plane of amt interface")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • amt: drop unexpected query message · 239d8866
      Taehee Yoo authored
      The AMT gateway interface should not receive unexpected query messages.
      In order to drop unexpected query messages, it checks the nonce.
      It also checks the ready4 and ready6 variables to drop duplicate messages.
      
      Fixes: cbc21dc1 ("amt: add data plane of amt interface")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • amt: drop unexpected advertisement message · 40185f35
      Taehee Yoo authored
      AMT gateway interface should not receive unexpected advertisement messages.
      In order to drop these packets, it should check nonce and amt->status.
      
      Fixes: cbc21dc1 ("amt: add data plane of amt interface")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
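A rough userspace sketch of the validation idea in this commit: accept an advertisement only if a discovery message was sent, the nonce matches, and no advertisement has been accepted yet. The enum values, `struct gateway`, and `accept_advertisement()` are hypothetical names chosen for the sketch and do not mirror the driver's identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical gateway states; the driver uses its own AMT_STATUS_XXX values. */
enum gw_status { GW_INIT, GW_SENT_DISCOVERY, GW_RECEIVED_ADVERTISEMENT };

struct gateway {
    enum gw_status status;
    uint32_t nonce;   /* nonce carried by the last discovery message */
};

/* Returns true if the advertisement is accepted, false if it is dropped. */
static bool accept_advertisement(struct gateway *gw, uint32_t msg_nonce)
{
    assert(gw != NULL);

    /* Drops both "never sent discovery" and duplicate advertisements. */
    if (gw->status != GW_SENT_DISCOVERY)
        return false;

    /* Drops replies that do not belong to our discovery message. */
    if (msg_nonce != gw->nonce)
        return false;

    gw->status = GW_RECEIVED_ADVERTISEMENT;
    return true;
}
```

Because the status advances on acceptance, a second copy of the same advertisement fails the first check without any extra bookkeeping.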
    • amt: add missing regeneration nonce logic in request logic · 627f1693
      Taehee Yoo authored
      When AMT gateway starts sending a new request message, it should
      regenerate the nonce variable.
      
      Fixes: cbc21dc1 ("amt: add data plane of amt interface")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • amt: use READ_ONCE() in amt module · 928f353c
      Taehee Yoo authored
      There are some data races in the amt module.
      amt->ready4, amt->ready6, and amt->status can be accessed concurrently
      without locks.
      So, it uses READ_ONCE() and WRITE_ONCE().
      
      Fixes: cbc21dc1 ("amt: add data plane of amt interface")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
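For readers unfamiliar with the pattern, here is a simplified userspace approximation of READ_ONCE() and WRITE_ONCE(). The kernel's real macros are more elaborate; the volatile-cast versions below are stand-ins that rely on the GNU `__typeof__` extension, and `struct gateway` with its helpers is hypothetical. They only show the intent: each marked access compiles to exactly one load or store that the compiler may not tear, fuse, or re-read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel macros (assumption: GCC or Clang). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical subset of the fields named in the commit message. */
struct gateway {
    bool ready4;
    bool ready6;
    int status;
};

/* Writer side: publish the flag with a single store. */
static void mark_ready4(struct gateway *gw)
{
    assert(gw != NULL);
    WRITE_ONCE(gw->ready4, true);
}

/* Reader side: lockless check with a single load. */
static bool can_forward_ipv4(struct gateway *gw)
{
    assert(gw != NULL);
    return READ_ONCE(gw->ready4);
}
```

These annotations do not provide ordering or mutual exclusion; they only make an intentionally lockless access well-defined for the compiler.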
    • amt: remove unnecessary locks · 9c343ea6
      Taehee Yoo authored
      With the previous patch, amt gateway handlers are now run by
      a single thread.
      So, most locks for the gateway are no longer needed.
      Remove them.
      
      Fixes: cbc21dc1 ("amt: add data plane of amt interface")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • amt: use workqueue for gateway side message handling · 30e22a6e
      Taehee Yoo authored
      There are some synchronization issues (amt->status, amt->req_cnt, etc.)
      if the interface is in gateway mode, because gateway message handlers
      are processed concurrently.
      This applies a work queue for processing these messages instead of
      expanding the locking context.

      So, the purposes of this patch are to fix existing race conditions and
      to let the gateway validate its own status more correctly.

      When the AMT gateway interface is created, it tries to establish a
      tunnel to the relay.
      The establishment step looks stateless, but it should be managed well.
      In order to handle messages in the gateway, it saves the current
      status (i.e. AMT_STATUS_XXX).
      This patch makes the gateway code run in a single thread.
      
      Now, when any event other than multicast data is triggered (a message
      is received or a delayed work expires), it is stored in the event
      queue (amt->events).
      Then, the single worker processes stored events asynchronously one
      by one.
      The multicast data message type is still processed immediately.

      Now, amt->lock is only needed to access the event queue (amt->events)
      if an interface is in gateway mode.
      
      Fixes: cbc21dc1 ("amt: add data plane of amt interface")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
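The event-queue design described above can be sketched in plain userspace C as a fixed-size ring buffer with a producer function and a single draining worker. The queue length of 16 comes from the cover letter (amt->events[16]) and the per-run limit from its v2 notes; every identifier below (`struct event_queue`, `queue_event()`, `event_work()`, the event kinds) is hypothetical and the locking is only indicated in comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EVENT_QUEUE_LEN 16   /* mirrors amt->events[16] from the cover letter */

/* Hypothetical event kinds; the driver defines its own. */
enum event_type { EV_NONE, EV_SEND_DISCOVERY, EV_SEND_REQUEST, EV_RECEIVE };

struct event {
    enum event_type type;
    void *payload;           /* stands in for the skb */
};

struct event_queue {
    struct event slots[EVENT_QUEUE_LEN];
    size_t head;             /* index of the oldest queued event */
    size_t count;            /* number of queued events */
};

/* Producer side; in the driver this part would run under amt->lock. */
static bool queue_event(struct event_queue *q, enum event_type type, void *payload)
{
    assert(q->count <= EVENT_QUEUE_LEN);
    if (q->count == EVENT_QUEUE_LEN)
        return false;        /* queue full: the caller drops the event */

    size_t tail = (q->head + q->count) % EVENT_QUEUE_LEN;
    q->slots[tail].type = type;
    q->slots[tail].payload = payload;
    q->count++;
    return true;
}

/* Single worker: handle at most `limit` events per run, oldest first. */
static size_t event_work(struct event_queue *q, size_t limit,
                         void (*handle)(const struct event *))
{
    size_t done = 0;

    while (done < limit && q->count > 0) {
        struct event ev = q->slots[q->head];

        q->head = (q->head + 1) % EVENT_QUEUE_LEN;
        q->count--;
        if (handle)
            handle(&ev);     /* state changes happen only in this thread */
        done++;
    }
    return done;
}
```

Because only the worker calls the handlers, gateway state needs no lock of its own; the lock is needed just for the short enqueue and dequeue steps.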
    • net: dsa: vitesse-vsc73xx: silent spi_device_id warnings · 1774559f
      Oleksij Rempel authored
      Add spi_device_id entries to silence SPI warnings.
      
      Fixes: 5fa6863b ("spi: Check we have a spi_device_id for each DT compatible")
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20220717135831.2492844-2-o.rempel@pengutronix.de
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: dsa: sja1105: silent spi_device_id warnings · 855fe499
      Oleksij Rempel authored
      Add spi_device_id entries to silence the following warnings:
       SPI driver sja1105 has no spi_device_id for nxp,sja1105e
       SPI driver sja1105 has no spi_device_id for nxp,sja1105t
       SPI driver sja1105 has no spi_device_id for nxp,sja1105p
       SPI driver sja1105 has no spi_device_id for nxp,sja1105q
       SPI driver sja1105 has no spi_device_id for nxp,sja1105r
       SPI driver sja1105 has no spi_device_id for nxp,sja1105s
       SPI driver sja1105 has no spi_device_id for nxp,sja1110a
       SPI driver sja1105 has no spi_device_id for nxp,sja1110b
       SPI driver sja1105 has no spi_device_id for nxp,sja1110c
       SPI driver sja1105 has no spi_device_id for nxp,sja1110d
      
      Fixes: 5fa6863b ("spi: Check we have a spi_device_id for each DT compatible")
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Link: https://lore.kernel.org/r/20220717135831.2492844-1-o.rempel@pengutronix.de
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • be2net: Fix buffer overflow in be_get_module_eeprom · d7241f67
      Hristo Venev authored
      be_cmd_read_port_transceiver_data assumes that it is given a buffer that
      is at least PAGE_DATA_LEN long, or twice that if the module supports SFF
      8472. However, this is not always the case.
      
      Fix this by passing the desired offset and length to
      be_cmd_read_port_transceiver_data so that we only copy the bytes once.
      
      Fixes: e36edd9d ("be2net: add ethtool "-m" option support")
      Signed-off-by: Hristo Venev <hristo@venev.name>
      Link: https://lore.kernel.org/r/20220716085134.6095-1-hristo@venev.name
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
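The shape of this fix, passing an offset and length so that only the requested bytes are copied, can be sketched as follows. This is a userspace illustration and not the driver's code: `PAGE_LEN`, its value, and `read_page_slice()` are assumptions standing in for the driver's page size constant and read helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_LEN 256   /* hypothetical page size used only in this sketch */

/*
 * Copy `len` bytes starting at `off` from a transceiver page into `out`.
 * The caller only has to provide `len` bytes of space, not a whole page.
 * Returns 0 on success, -1 if the requested range leaves the page.
 */
static int read_page_slice(const uint8_t *page, size_t off, size_t len,
                           uint8_t *out)
{
    assert(page != NULL && out != NULL);

    /* Written this way to avoid overflow in off + len. */
    if (off > PAGE_LEN || len > PAGE_LEN - off)
        return -1;

    memcpy(out, page + off, len);
    return 0;
}
```

Copying a full page into a caller buffer sized for a smaller request is what overruns the buffer; bounding the copy by the requested length removes that assumption.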
    • net: stmmac: remove redunctant disable xPCS EEE call · da791bac
      Wong Vee Khee authored
      Disable is done in stmmac_init_eee() in the event of MAC link down.
      Since enabling or disabling EEE via ethtool will eventually trigger
      a MAC down, remove this redundant call in stmmac_ethtool.c to avoid
      calling xpcs_config_eee() twice.
      
      Fixes: d4aeaed8 ("net: stmmac: trigger PCS EEE to turn off on link down")
      Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
      Link: https://lore.kernel.org/r/20220715122402.1017470-1-vee.khee.wong@linux.intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'fix-2-dsa-issues-with-vlan_filtering_is_global' · 49a2f5c8
      Jakub Kicinski authored
      Vladimir Oltean says:
      
      ====================
      Fix 2 DSA issues with vlan_filtering_is_global
      
      This patch set fixes 2 issues with vlan_filtering_is_global switches.
      
      Both are regressions introduced by refactoring commit d0004a02
      ("net: dsa: remove the "dsa_to_port in a loop" antipattern from the
      core"), which wasn't tested on a wide enough variety of switches.
      
      Tested on the sja1105 driver.
      ====================
      
      Link: https://lore.kernel.org/r/20220715151659.780544-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: fix NULL pointer dereference in dsa_port_reset_vlan_filtering · 1699b4d5
      Vladimir Oltean authored
      The "dp" iterator variable used in dsa_port_reset_vlan_filtering() ->
      dsa_switch_for_each_port() overwrites the "dp" received as argument,
      which is later used to call dsa_port_vlan_filtering() proper.
      
      As a result, switches which do enter that code path (the ones with
      vlan_filtering_is_global=true) will dereference an invalid dp in
      dsa_port_reset_vlan_filtering() after leaving a VLAN-aware bridge.
      
      Use a dedicated "other_dp" iterator variable to prevent this from
      happening.
      
      Fixes: d0004a02 ("net: dsa: remove the "dsa_to_port in a loop" antipattern from the core")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
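The bug class described here, a loop macro that reuses the function argument as its cursor, can be reproduced in a few lines of userspace C. The types, the `for_each_port()` macro, and both functions are hypothetical simplifications; the kernel's dsa_switch_for_each_port() walks a linked list instead of an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified switch and port types. */
struct port { int index; int vlan_filtering; };
struct sw   { struct port ports[4]; size_t num_ports; };

/* Array-based stand-in for a "for each port of this switch" iterator. */
#define for_each_port(_p, _sw) \
    for ((_p) = &(_sw)->ports[0]; (_p) < &(_sw)->ports[(_sw)->num_ports]; (_p)++)

/* Buggy: the loop reuses the argument as its cursor and clobbers it. */
static struct port *port_after_loop_buggy(struct sw *sw, struct port *dp)
{
    int vlan_aware = 0;

    assert(sw != NULL);
    for_each_port(dp, sw)
        vlan_aware += dp->vlan_filtering;
    (void)vlan_aware;
    return dp;   /* no longer the caller's port: one past the last element */
}

/* Fixed: a dedicated cursor leaves the argument untouched. */
static struct port *port_after_loop_fixed(struct sw *sw, struct port *dp)
{
    struct port *other_dp;
    int vlan_aware = 0;

    assert(sw != NULL);
    for_each_port(other_dp, sw)
        vlan_aware += other_dp->vlan_filtering;
    (void)vlan_aware;
    return dp;   /* still the port the caller passed in */
}
```

In the sketch the clobbered pointer ends up one element past the array; with a list-based iterator it ends up pointing at memory that is not a port at all, which is why the later use faults.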
    • net: dsa: fix dsa_port_vlan_filtering when global · 4db2a5ef
      Vladimir Oltean authored
      The blamed refactoring commit replaced a "port" iterator with "other_dp",
      but still looked at the slave_dev of the dp outside the loop, instead of
      other_dp->slave from the loop.

      As a result, dsa_port_vlan_filtering() would call
      dsa_slave_manage_vlan_filtering() only for the port in question, and not
      for all switch ports as expected.
      
      Fixes: d0004a02 ("net: dsa: remove the "dsa_to_port in a loop" antipattern from the core")
      Reported-by: Lucian Banu <Lucian.Banu@westermo.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ixgbe: Add locking to prevent panic when setting sriov_numvfs to zero · 1e53834c
      Piotr Skajewski authored
      It is possible to disable VFs while the PF driver is processing requests
      from the VF driver.  This can result in a panic.
      
      BUG: unable to handle kernel paging request at 000000000000106c
      PGD 0 P4D 0
      Oops: 0000 [#1] SMP NOPTI
      CPU: 8 PID: 0 Comm: swapper/8 Kdump: loaded Tainted: G I      --------- -
      Hardware name: Dell Inc. PowerEdge R740/06WXJT, BIOS 2.8.2 08/27/2020
      RIP: 0010:ixgbe_msg_task+0x4c8/0x1690 [ixgbe]
      Code: 00 00 48 8d 04 40 48 c1 e0 05 89 7c 24 24 89 fd 48 89 44 24 10 83 ff
      01 0f 84 b8 04 00 00 4c 8b 64 24 10 4d 03 a5 48 22 00 00 <41> 80 7c 24 4c
      00 0f 84 8a 03 00 00 0f b7 c7 83 f8 08 0f 84 8f 0a
      RSP: 0018:ffffb337869f8df8 EFLAGS: 00010002
      RAX: 0000000000001020 RBX: 0000000000000000 RCX: 000000000000002b
      RDX: 0000000000000002 RSI: 0000000000000008 RDI: 0000000000000006
      RBP: 0000000000000006 R08: 0000000000000002 R09: 0000000000029780
      R10: 00006957d8f42832 R11: 0000000000000000 R12: 0000000000001020
      R13: ffff8a00e8978ac0 R14: 000000000000002b R15: ffff8a00e8979c80
      FS:  0000000000000000(0000) GS:ffff8a07dfd00000(0000) knlGS:00000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000000000000106c CR3: 0000000063e10004 CR4: 00000000007726e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      PKRU: 55555554
      Call Trace:
       <IRQ>
       ? ttwu_do_wakeup+0x19/0x140
       ? try_to_wake_up+0x1cd/0x550
       ? ixgbevf_update_xcast_mode+0x71/0xc0 [ixgbevf]
       ixgbe_msix_other+0x17e/0x310 [ixgbe]
       __handle_irq_event_percpu+0x40/0x180
       handle_irq_event_percpu+0x30/0x80
       handle_irq_event+0x36/0x53
       handle_edge_irq+0x82/0x190
       handle_irq+0x1c/0x30
       do_IRQ+0x49/0xd0
       common_interrupt+0xf/0xf
      
      This can eventually be reproduced with the following script:
      
      while :
      do
          echo 63 > /sys/class/net/<devname>/device/sriov_numvfs
          sleep 1
          echo 0 > /sys/class/net/<devname>/device/sriov_numvfs
          sleep 1
      done
      
      Add a lock when disabling SR-IOV to prevent VF mailbox communication
      from being processed at the same time.
      
      Fixes: d773d131 ("ixgbe: Fix memory leak when SR-IOV VFs are direct assigned")
      Signed-off-by: Piotr Skajewski <piotrx.skajewski@intel.com>
      Tested-by: Marek Szlosek <marek.szlosek@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Link: https://lore.kernel.org/r/20220715214456.2968711-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • i40e: Fix erroneous adapter reinitialization during recovery process · f838a633
      Dawid Lukwinski authored
      Fix an issue where the driver incorrectly detects the state
      of the recovery process and erroneously reinitializes interrupts,
      which results in a kernel error and call trace message.
      
      The issue was caused by a combination of two factors:
      1. Assuming the EMP reset issued after completing
      firmware recovery means the whole recovery process is complete.
      2. Erroneous reinitialization of interrupt vector after detecting
      the above mentioned EMP reset.
      
      Fix (1) by changing how the recovery state change is detected
      and (2) by adjusting the conditional expression to ensure the proper
      interrupt reinitialization method is used, depending on the situation.
      
      Fixes: 4ff0ee1a ("i40e: Introduce recovery mode support")
      Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com>
      Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Link: https://lore.kernel.org/r/20220715214542.2968762-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>