1. 03 Nov, 2021 33 commits
    • Brett Creeley's avatar
      ice: Fix race conditions between virtchnl handling and VF ndo ops · e6ba5273
      Brett Creeley authored
      The VF can be configured via the PF's ndo ops at the same time the PF is
      receiving/handling virtchnl messages. This has many issues, with
      one of them being the ndo op could be actively resetting a VF (i.e.
      resetting it to the default state and deleting/re-adding the VF's VSI)
      while a virtchnl message is being handled. The following error was seen
      because a VF ndo op was used to change a VF's trust setting while the
      VIRTCHNL_OP_CONFIG_VSI_QUEUES was ongoing:
      
      [35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM
      [35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5
      [35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
      
      Fix this by making sure the virtchnl handling and VF ndo ops that
      trigger VF resets cannot run concurrently. This is done by adding a
      struct mutex cfg_lock to each VF structure. For VF ndo ops, the mutex
      will be locked around the critical operations and VFR. Since the ndo ops
      will trigger a VFR, the virtchnl thread will use mutex_trylock(). This
      is done because if any other thread (i.e. VF ndo op) has the mutex, then
      that means the current VF message being handled is no longer valid, so
      just ignore it.
      
      This issue can be seen using the following commands:
      
      for i in {0..50}; do
              rmmod ice
              modprobe ice
      
              sleep 1
      
              echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
              echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
      
              ip link set ens785f1 vf 0 trust on
              ip link set ens785f0 vf 0 trust on
      
              sleep 2
      
              echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs
              echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs
              sleep 1
              echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
              echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
      
              ip link set ens785f1 vf 0 trust on
              ip link set ens785f0 vf 0 trust on
      done
      
      Fixes: 7c710869 ("ice: Add handlers for VF netdevice operations")
      Signed-off-by: default avatarBrett Creeley <brett.creeley@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      e6ba5273
    • Brett Creeley's avatar
      ice: Fix not stopping Tx queues for VFs · b385cca4
      Brett Creeley authored
      When a VF is removed and/or reset its Tx queues need to be
      stopped from the PF. This is done by calling the ice_dis_vf_qs()
      function, which calls ice_vsi_stop_lan_tx_rings(). Currently
      ice_dis_vf_qs() is protected by the VF state bit ICE_VF_STATE_QS_ENA.
      Unfortunately, this is causing the Tx queues to not be disabled in some
      cases and when the VF tries to re-enable/reconfigure its Tx queues over
      virtchnl the op is failing. This is because a VF can be reset and/or
      removed before the ICE_VF_STATE_QS_ENA bit is set, but the Tx queues
      were already configured via ice_vsi_cfg_single_txq() in the
      VIRTCHNL_OP_CONFIG_VSI_QUEUES op. However, the ICE_VF_STATE_QS_ENA bit
      is set on a successful VIRTCHNL_OP_ENABLE_QUEUES, which will always
      happen after the VIRTCHNL_OP_CONFIG_VSI_QUEUES op.
      
      This was causing the following error message when loading the ice
      driver, creating VFs, and modifying VF trust in an endless loop:
      
      [35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM
      [35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5
      [35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
      
      Fix this by always calling ice_dis_vf_qs() and silencing the error
      message in ice_vsi_stop_tx_ring() since the calling code ignores the
      return anyway. Also, all other places that call ice_vsi_stop_tx_ring()
      catch the error, so this doesn't affect those flows since there was no
      change to the values the function returns.
      
      Other solutions were considered (i.e. tracking which VF queues had been
      "started/configured" in VIRTCHNL_OP_CONFIG_VSI_QUEUES, but it seemed
      more complicated than it was worth. This solution also brings in the
      chance for other unexpected conditions due to invalid state bit checks.
      So, the proposed solution seemed like the best option since there is no
      harm in failing to stop Tx queues that were never started.
      
      This issue can be seen using the following commands:
      
      for i in {0..50}; do
              rmmod ice
              modprobe ice
      
              sleep 1
      
              echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
              echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
      
              ip link set ens785f1 vf 0 trust on
              ip link set ens785f0 vf 0 trust on
      
              sleep 2
      
              echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs
              echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs
              sleep 1
              echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
              echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
      
              ip link set ens785f1 vf 0 trust on
              ip link set ens785f0 vf 0 trust on
      done
      
      Fixes: 77ca27c4 ("ice: add support for virtchnl_queue_select.[tx|rx]_queues bitmap")
      Signed-off-by: default avatarBrett Creeley <brett.creeley@intel.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      b385cca4
    • Sylwester Dziedziuch's avatar
      ice: Fix replacing VF hardware MAC to existing MAC filter · ce572a5b
      Sylwester Dziedziuch authored
      VF was not able to change its hardware MAC address in case
      the new address was already present in the MAC filter list.
      Change the handling of VF add mac request to not return
      if requested MAC address is already present on the list
      and check if its hardware MAC needs to be updated in this case.
      
      Fixes: ed4c068d ("ice: Enable ip link show on the PF to display VF unicast MAC(s)")
      Signed-off-by: default avatarSylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
      Tested-by: default avatarTony Brelinski <tony.brelinski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      ce572a5b
    • Brett Creeley's avatar
      ice: Remove toggling of antispoof for VF trusted promiscuous mode · 0299faea
      Brett Creeley authored
      Currently when a trusted VF enables promiscuous mode spoofchk will be
      disabled. This is wrong and should only be modified from the
      ndo_set_vf_spoofchk callback. Fix this by removing the call to toggle
      spoofchk for trusted VFs.
      
      Fixes: 01b5e89a ("ice: Add VF promiscuous support")
      Signed-off-by: default avatarBrett Creeley <brett.creeley@intel.com>
      Tested-by: default avatarTony Brelinski <tony.brelinski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      0299faea
    • Brett Creeley's avatar
      ice: Fix VF true promiscuous mode · 1a8c7778
      Brett Creeley authored
      When a VF requests promiscuous mode and it's trusted and true promiscuous
      mode is enabled the PF driver attempts to enable unicast and/or
      multicast promiscuous mode filters based on the request. This is fine,
      but there are a couple issues with the current code.
      
      [1] The define to configure the unicast promiscuous mode mask also
          includes bits to configure the multicast promiscuous mode mask, which
          causes multicast to be set/cleared unintentionally.
      [2] All 4 cases for enable/disable unicast/multicast mode are not
          handled in the promiscuous mode message handler, which causes
          unexpected results regarding the current promiscuous mode settings.
      
      To fix [1] make sure any promiscuous mask defines include the correct
      bits for each of the promiscuous modes.
      
      To fix [2] make sure that all 4 cases are handled since there are 2 bits
      (FLAG_VF_UNICAST_PROMISC and FLAG_VF_MULTICAST_PROMISC) that can be
      either set or cleared. Also, since either unicast and/or multicast
      promiscuous configuration can fail, introduce two separate error values
      to handle each of these cases.
      
      Fixes: 01b5e89a ("ice: Add VF promiscuous support")
      Signed-off-by: default avatarBrett Creeley <brett.creeley@intel.com>
      Tested-by: default avatarTony Brelinski <tony.brelinski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      1a8c7778
    • Vladimir Oltean's avatar
      net: dsa: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge · 92f62485
      Vladimir Oltean authored
      Normally it is expected that the dsa_device_ops :: rcv() method finishes
      parsing the DSA tag and consumes it, then never looks at it again.
      
      But commit c0bcf537 ("net: dsa: ocelot: add hardware timestamping
      support for Felix") added support for RX timestamping in a very
      unconventional way. On this switch, a partial timestamp is available in
      the DSA header, but the driver got away with not parsing that timestamp
      right away, but instead delayed that parsing for a little longer:
      
      dsa_switch_rcv():
      	nskb = cpu_dp->rcv(skb, dev); <------------- not here
      	-> ocelot_rcv()
      	...
      
      	skb = nskb;
      	skb_push(skb, ETH_HLEN);
      	skb->pkt_type = PACKET_HOST;
      	skb->protocol = eth_type_trans(skb, skb->dev);
      
      	...
      
      	if (dsa_skb_defer_rx_timestamp(p, skb)) <--- but here
      	-> felix_rxtstamp()
      		return 0;
      
      When in felix_rxtstamp(), this driver accounted for the fact that
      eth_type_trans() happened in the meanwhile, so it got a hold of the
      extraction header again by subtracting (ETH_HLEN + OCELOT_TAG_LEN) bytes
      from the current skb->data.
      
      This worked for quite some time but was quite fragile from the very
      beginning. Not to mention that having DSA tag parsing split in two
      different files, under different folders (net/dsa/tag_ocelot.c vs
      drivers/net/dsa/ocelot/felix.c) made it quite non-obvious for patches to
      come that they might break this.
      
      Finally, the blamed commit does the following: at the end of
      ocelot_rcv(), it checks whether the skb payload contains a VLAN header.
      If it does, and this port is under a VLAN-aware bridge, that VLAN ID
      might not be correct in the sense that the packet might have suffered
      VLAN rewriting due to TCAM rules (VCAP IS1). So we consume the VLAN ID
      from the skb payload using __skb_vlan_pop(), and take the classified
      VLAN ID from the DSA tag, and construct a hwaccel VLAN tag with the
      classified VLAN, and the skb payload is VLAN-untagged.
      
      The big problem is that __skb_vlan_pop() does:
      
      	memmove(skb->data + VLAN_HLEN, skb->data, 2 * ETH_ALEN);
      	__skb_pull(skb, VLAN_HLEN);
      
      aka it moves the Ethernet header 4 bytes to the right, and pulls 4 bytes
      from the skb headroom (effectively also moving skb->data, by definition).
      So for felix_rxtstamp()'s fragile logic, all bets are off now.
      Instead of having the "extraction" pointer point to the DSA header,
      it actually points to 4 bytes _inside_ the extraction header.
      Corollary, the last 4 bytes of the "extraction" header are in fact 4
      stale bytes of the destination MAC address from the Ethernet header,
      from prior to the __skb_vlan_pop() movement.
      
      So of course, RX timestamps are completely bogus when the system is
      configured in this way.
      
      The fix is actually very simple: just don't structure the code like that.
      For better or worse, the DSA PTP timestamping API does not offer a
      straightforward way for drivers to present their RX timestamps, but
      other drivers (sja1105) have established a simple mechanism to carry
      their RX timestamp from dsa_device_ops :: rcv() all the way to
      dsa_switch_ops :: port_rxtstamp() and even later. That mechanism is to
      simply save the partial timestamp to the skb->cb, and complete it later.
      
      Question: why don't we simply populate the skb's struct
      skb_shared_hwtstamps from ocelot_rcv(), and bother with this
      complication of propagating the timestamp to felix_rxtstamp()?
      
      Answer: dsa_switch_ops :: port_rxtstamp() answers the question whether
      PTP packets need sleepable context to retrieve the full RX timestamp.
      Currently felix_rxtstamp() answers "no, thanks" to that question, and
      calls ocelot_ptp_gettime64() from softirq atomic context. This is
      understandable, since Felix VSC9959 is a PCIe memory-mapped switch, so
      hardware access does not require sleeping. But the felix driver is
      preparing for the introduction of other switches where hardware access
      is over a slow bus like SPI or MDIO:
      https://lore.kernel.org/lkml/20210814025003.2449143-1-colin.foster@in-advantage.com/
      
      So I would like to keep this code structure, so the rework needed when
      that driver will need PTP support will be minimal (answer "yes, I need
      deferred context for this skb's RX timestamp", then the partial
      timestamp will still be found in the skb->cb.
      
      Fixes: ea440cd2 ("net: dsa: tag_ocelot: use VLAN information from tagging header when available")
      Reported-by: default avatarPo Liu <po.liu@nxp.com>
      Cc: Yangbo Lu <yangbo.lu@nxp.com>
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      92f62485
    • Ansuel Smith's avatar
      net: dsa: qca8k: make sure PAD0 MAC06 exchange is disabled · 5f15d392
      Ansuel Smith authored
      Some device set MAC06 exchange in the bootloader. This cause some
      problem as we don't support this strange mode and we just set the port6
      as the primary CPU port. With MAC06 exchange, PAD0 reg configure port6
      instead of port0. Add an extra check and explicitly disable MAC06 exchange
      to correctly configure the port PAD config.
      Signed-off-by: default avatarAnsuel Smith <ansuelsmth@gmail.com>
      Fixes: 3fcf734a ("net: dsa: qca8k: add support for cpu port 6")
      Reviewed-by: default avatarVladimir Oltean <olteanv@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5f15d392
    • Ziyang Xuan's avatar
      net: vlan: fix a UAF in vlan_dev_real_dev() · 563bcbae
      Ziyang Xuan authored
      The real_dev of a vlan net_device may be freed after
      unregister_vlan_dev(). Access the real_dev continually by
      vlan_dev_real_dev() will trigger the UAF problem for the
      real_dev like following:
      
      ==================================================================
      BUG: KASAN: use-after-free in vlan_dev_real_dev+0xf9/0x120
      Call Trace:
       kasan_report.cold+0x83/0xdf
       vlan_dev_real_dev+0xf9/0x120
       is_eth_port_of_netdev_filter.part.0+0xb1/0x2c0
       is_eth_port_of_netdev_filter+0x28/0x40
       ib_enum_roce_netdev+0x1a3/0x300
       ib_enum_all_roce_netdevs+0xc7/0x140
       netdevice_event_work_handler+0x9d/0x210
      ...
      
      Freed by task 9288:
       kasan_save_stack+0x1b/0x40
       kasan_set_track+0x1c/0x30
       kasan_set_free_info+0x20/0x30
       __kasan_slab_free+0xfc/0x130
       slab_free_freelist_hook+0xdd/0x240
       kfree+0xe4/0x690
       kvfree+0x42/0x50
       device_release+0x9f/0x240
       kobject_put+0x1c8/0x530
       put_device+0x1b/0x30
       free_netdev+0x370/0x540
       ppp_destroy_interface+0x313/0x3d0
      ...
      
      Move the put_device(real_dev) to vlan_dev_free(). Ensure
      real_dev not be freed before vlan_dev unregistered.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot+e4df4e1389e28972e955@syzkaller.appspotmail.com
      Signed-off-by: default avatarZiyang Xuan <william.xuanziyang@huawei.com>
      Reviewed-by: default avatarJason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      563bcbae
    • Menglong Dong's avatar
      net: udp6: replace __UDP_INC_STATS() with __UDP6_INC_STATS() · 250962e4
      Menglong Dong authored
      __UDP_INC_STATS() is used in udpv6_queue_rcv_one_skb() when encap_rcv()
      fails. __UDP6_INC_STATS() should be used here, so replace it with
      __UDP6_INC_STATS().
      Signed-off-by: default avatarMenglong Dong <imagedong@tencent.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      250962e4
    • Jakub Kicinski's avatar
      ethtool: fix ethtool msg len calculation for pause stats · 1aabe578
      Jakub Kicinski authored
      ETHTOOL_A_PAUSE_STAT_MAX is the MAX attribute id,
      so we need to subtract non-stats and add one to
      get a count (IOW -2+1 == -1).
      
      Otherwise we'll see:
      
        ethnl cmd 21: calculated reply length 40, but consumed 52
      
      Fixes: 9a27a330 ("ethtool: add standard pause stats")
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      Reviewed-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1aabe578
    • Talal Ahmad's avatar
      net: avoid double accounting for pure zerocopy skbs · 9b65b17d
      Talal Ahmad authored
      Track skbs containing only zerocopy data and avoid charging them to
      kernel memory to correctly account the memory utilization for
      msg_zerocopy. All of the data in such skbs is held in user pages which
      are already accounted to user. Before this change, they are charged
      again in kernel in __zerocopy_sg_from_iter. The charging in kernel is
      excessive because data is not being copied into skb frags. This
      excessive charging can lead to kernel going into memory pressure
      state which impacts all sockets in the system adversely. Mark pure
      zerocopy skbs with a SKBFL_PURE_ZEROCOPY flag and remove
      charge/uncharge for data in such skbs.
      
      Initially, an skb is marked pure zerocopy when it is empty and in
      zerocopy path. skb can then change from a pure zerocopy skb to mixed
      data skb (zerocopy and copy data) if it is at tail of write queue and
      there is room available in it and non-zerocopy data is being sent in
      the next sendmsg call. At this time sk_mem_charge is done for the pure
      zerocopied data and the pure zerocopy flag is unmarked. We found that
      this happens very rarely on workloads that pass MSG_ZEROCOPY.
      
      A pure zerocopy skb can later be coalesced into normal skb if they are
      next to each other in queue but this patch prevents coalescing from
      happening. This avoids complexity of charging when skb downgrades from
      pure zerocopy to mixed. This is also rare.
      
      In sk_wmem_free_skb, if it is a pure zerocopy skb, an sk_mem_uncharge
      for SKB_TRUESIZE(skb_end_offset(skb)) is done for sk_mem_charge in
      tcp_skb_entail for an skb without data.
      
      Testing with the msg_zerocopy.c benchmark between two hosts(100G nics)
      with zerocopy showed that before this patch the 'sock' variable in
      memory.stat for cgroup2 that tracks sum of sk_forward_alloc,
      sk_rmem_alloc and sk_wmem_queued is around 1822720 and with this
      change it is 0. This is due to no charge to sk_forward_alloc for
      zerocopy data and shows memory utilization for kernel is lowered.
      
      With this commit we don't see the warning we saw in previous commit
      which resulted in commit 84882cf7.
      Signed-off-by: default avatarTalal Ahmad <talalahmad@google.com>
      Acked-by: default avatarArjun Roy <arjunroy@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9b65b17d
    • Zhang Mingyu's avatar
      net:ipv6:Remove unneeded semicolon · acaea0d5
      Zhang Mingyu authored
      Eliminate the following coccinelle check warning:
      net/ipv6/seg6.c:381:2-3
      Reported-by: default avatarZeal Robot <zealci@zte.com.cn>
      Signed-off-by: default avatarZhang Mingyu <zhang.mingyu@zte.com.cn>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      acaea0d5
    • Lin Ma's avatar
      NFC: add necessary privilege flags in netlink layer · aedddb4e
      Lin Ma authored
      The CAP_NET_ADMIN checks are needed to prevent attackers faking a
      device under NCIUARTSETDRIVER and exploit privileged commands.
      
      This patch add GENL_ADMIN_PERM flags in genl_ops to fulfill the check.
      Except for commands like NFC_CMD_GET_DEVICE, NFC_CMD_GET_TARGET,
      NFC_CMD_LLC_GET_PARAMS, and NFC_CMD_GET_SE, which are mainly information-
      read operations.
      Signed-off-by: default avatarLin Ma <linma@zju.edu.cn>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      aedddb4e
    • David S. Miller's avatar
      Merge branch 'sctp-=security-hook-fixes' · 2bd080b0
      David S. Miller authored
      Xin Long says:
      
      ====================
      security: fixups for the security hooks in sctp
      
      There are a couple of problems in the currect security hooks in sctp:
      
      1. The hooks incorrectly treat sctp_endpoint in SCTP as request_sock in
         TCP, while it's in fact no more than an extension of the sock, and
         represents the local host. It is created when sock is created, not
         when a conn request comes. sctp_association is actually the correct
         one to represent the connection, and created when a conn request
         arrives.
      
      2. security_sctp_assoc_request() hook should also be called in processing
         COOKIE ECHO, as that's the place where the real assoc is created and
         used in the future.
      
      The problems above may cause accept sk, peeloff sk or client sk having
      the incorrect security labels.
      
      So this patchset is to change some hooks and pass asoc into them and save
      these secids into asoc, as well as add the missing sctp_assoc_request
      hook into the COOKIE ECHO processing.
      
      v1->v2:
        - See each patch, and thanks the help from Ondrej, Paul and Richard.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2bd080b0
    • Xin Long's avatar
      security: implement sctp_assoc_established hook in selinux · e7310c94
      Xin Long authored
      Different from selinux_inet_conn_established(), it also gives the
      secid to asoc->peer_secid in selinux_sctp_assoc_established(),
      as one UDP-type socket may have more than one asocs.
      
      Note that peer_secid in asoc will save the peer secid for this
      asoc connection, and peer_sid in sksec will just keep the peer
      secid for the latest connection. So the right use should be do
      peeloff for UDP-type socket if there will be multiple asocs in
      one socket, so that the peeloff socket has the right label for
      its asoc.
      
      v1->v2:
        - call selinux_inet_conn_established() to reduce some code
          duplication in selinux_sctp_assoc_established(), as Ondrej
          suggested.
        - when doing peeloff, it calls sock_create() where it actually
          gets secid for socket from socket_sockcreate_sid(). So reuse
          SECSID_WILD to ensure the peeloff socket keeps using that
          secid after calling selinux_sctp_sk_clone() for client side.
      
      Fixes: 72e89f50 ("security: Add support for SCTP security hooks")
      Reported-by: default avatarPrashanth Prahlad <pprahlad@redhat.com>
      Reviewed-by: default avatarRichard Haines <richard_c_haines@btinternet.com>
      Tested-by: default avatarRichard Haines <richard_c_haines@btinternet.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e7310c94
    • Xin Long's avatar
      security: add sctp_assoc_established hook · 7c2ef024
      Xin Long authored
      security_sctp_assoc_established() is added to replace
      security_inet_conn_established() called in
      sctp_sf_do_5_1E_ca(), so that asoc can be accessed in security
      subsystem and save the peer secid to asoc->peer_secid.
      
      v1->v2:
        - fix the return value of security_sctp_assoc_established() in
          security.h, found by kernel test robot and Ondrej.
      
      Fixes: 72e89f50 ("security: Add support for SCTP security hooks")
      Reported-by: default avatarPrashanth Prahlad <pprahlad@redhat.com>
      Reviewed-by: default avatarRichard Haines <richard_c_haines@btinternet.com>
      Tested-by: default avatarRichard Haines <richard_c_haines@btinternet.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7c2ef024
    • Xin Long's avatar
      security: call security_sctp_assoc_request in sctp_sf_do_5_1D_ce · e215dab1
      Xin Long authored
      The asoc created when receives the INIT chunk is a temporary one, it
      will be deleted after INIT_ACK chunk is replied. So for the real asoc
      created in sctp_sf_do_5_1D_ce() when the COOKIE_ECHO chunk is received,
      security_sctp_assoc_request() should also be called.
      
      v1->v2:
        - fix some typo and grammar errors, noticed by Ondrej.
      
      Fixes: 72e89f50 ("security: Add support for SCTP security hooks")
      Reported-by: default avatarPrashanth Prahlad <pprahlad@redhat.com>
      Reviewed-by: default avatarRichard Haines <richard_c_haines@btinternet.com>
      Tested-by: default avatarRichard Haines <richard_c_haines@btinternet.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e215dab1
    • Xin Long's avatar
      security: pass asoc to sctp_assoc_request and sctp_sk_clone · c081d53f
      Xin Long authored
      This patch is to move secid and peer_secid from endpoint to association,
      and pass asoc to sctp_assoc_request and sctp_sk_clone instead of ep. As
      ep is the local endpoint and asoc represents a connection, and in SCTP
      one sk/ep could have multiple asoc/connection, saving secid/peer_secid
      for new asoc will overwrite the old asoc's.
      
      Note that since asoc can be passed as NULL, security_sctp_assoc_request()
      is moved to the place right after the new_asoc is created in
      sctp_sf_do_5_1B_init() and sctp_sf_do_unexpected_init().
      
      v1->v2:
        - fix the description of selinux_netlbl_skbuff_setsid(), as Jakub noticed.
        - fix the annotation in selinux_sctp_assoc_request(), as Richard Noticed.
      
      Fixes: 72e89f50 ("security: Add support for SCTP security hooks")
      Reported-by: default avatarPrashanth Prahlad <pprahlad@redhat.com>
      Reviewed-by: default avatarRichard Haines <richard_c_haines@btinternet.com>
      Tested-by: default avatarRichard Haines <richard_c_haines@btinternet.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c081d53f
    • David S. Miller's avatar
      Merge branch 'kselftests-net-missing' · 843c3cbb
      David S. Miller authored
      Hangbin Liu says:
      
      ====================
      kselftests/net: add missed tests to Makefile
      
      When generating the selftest to another folder, some tests are missing
      as they are not added in Makefile. e.g.
      
        make -C tools/testing/selftests/ install \
            TARGETS="net" INSTALL_PATH=/tmp/kselftests
      
      These pathset add them separately to make the Fixes tags less. It would
      also make the stable tree or downstream backport easier.
      
      If you think there is no need to add the Fixes tag for this minor issue.
      I can repost a new patch and merge all the fixes together.
      
      Thanks
      
      v3: no update, just rebase to latest net tree.
      v2: move toeplitz.sh/toeplitz_client.sh under TEST_PROGS_EXTENDED.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      843c3cbb
    • Hangbin Liu's avatar
      kselftests/net: add missed toeplitz.sh/toeplitz_client.sh to Makefile · 17b67370
      Hangbin Liu authored
      When generating the selftests to another folder, the toeplitz.sh
      and toeplitz_client.sh are missing as they are not in Makefile, e.g.
      
        make -C tools/testing/selftests/ install \
            TARGETS="net" INSTALL_PATH=/tmp/kselftests
      
      Making them under TEST_PROGS_EXTENDED as they test NIC hardware features
      and are not intended to be run from kselftests.
      
      Fixes: 5ebfb4cc ("selftests/net: toeplitz test")
      Reviewed-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      17b67370
    • Hangbin Liu's avatar
      kselftests/net: add missed vrf_strict_mode_test.sh test to Makefile · 8883deb5
      Hangbin Liu authored
      When generating the selftests to another folder, the
      vrf_strict_mode_test.sh test will miss as it is not in Makefile, e.g.
      
        make -C tools/testing/selftests/ install \
            TARGETS="net" INSTALL_PATH=/tmp/kselftests
      
      Fixes: 8735e6ea ("selftests: add selftest for the VRF strict mode")
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8883deb5
    • Hangbin Liu's avatar
      kselftests/net: add missed SRv6 tests · 653e7f19
      Hangbin Liu authored
      When generating the selftests to another folder, the SRv6 tests are
      missing as they are not in Makefile, e.g.
      
        make -C tools/testing/selftests/ install \
            TARGETS="net" INSTALL_PATH=/tmp/kselftests
      
      Fixes: 03a0b567 ("selftests: seg6: add selftest for SRv6 End.DT46 Behavior")
      Fixes: 2195444e ("selftests: add selftest for the SRv6 End.DT4 behavior")
      Fixes: 2bc03553 ("selftests: add selftest for the SRv6 End.DT6 (VRF) behavior")
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      653e7f19
    • Hangbin Liu's avatar
      kselftests/net: add missed setup_loopback.sh/setup_veth.sh to Makefile · b99ac184
      Hangbin Liu authored
      When generating the selftests to another folder, the include file
      setup_loopback.sh/setup_veth.sh for gro.sh/gre_gro.sh are missing as
      they are not in Makefile, e.g.
      
        make -C tools/testing/selftests/ install \
            TARGETS="net" INSTALL_PATH=/tmp/kselftests
      
      Fixes: 7d157501 ("selftests/net: GRO coalesce test")
      Fixes: 9af771d2 ("selftests/net: allow GRO coalesce test on veth")
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b99ac184
    • Hangbin Liu's avatar
      kselftests/net: add missed icmp.sh test to Makefile · ca3676f9
      Hangbin Liu authored
      When generating the selftests to another folder, the icmp.sh test will
      miss as it is not in Makefile, e.g.
      
        make -C tools/testing/selftests/ install \
            TARGETS="net" INSTALL_PATH=/tmp/kselftests
      
      Fixes: 7e9838b7 ("selftests/net: Add icmp.sh for testing ICMP dummy address responses")
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ca3676f9
    • Jiapeng Chong's avatar
      amt: Remove duplicate include · a4414341
      Jiapeng Chong authored
      Clean up the following includecheck warning:
      
      ./drivers/net/amt.c: net/protocol.h is included more than once.
      Reported-by: default avatarAbaci Robot <abaci@linux.alibaba.com>
      Signed-off-by: default avatarJiapeng Chong <jiapeng.chong@linux.alibaba.com>
      Reviewed-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a4414341
    • Yang Yingliang's avatar
      amt: fix error return code in amt_init() · db243434
      Yang Yingliang authored
      Return error code when alloc_workqueue()
      fails in amt_init().
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Signed-off-by: default avatarYang Yingliang <yangyingliang@huawei.com>
      Reviewed-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Link: https://lore.kernel.org/r/20211102130353.1666999-1-yangyingliang@huawei.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      db243434
    • Shay Agroskin's avatar
      MAINTAINERS: Update ENA maintainers information · 18635d52
      Shay Agroskin authored
      The ENA driver is no longer maintained by Netanel and Guy
      Signed-off-by: default avatarShay Agroskin <shayagr@amazon.com>
      Link: https://lore.kernel.org/r/20211102110358.193920-1-shayagr@amazon.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      18635d52
    • Eric Dumazet's avatar
      net: add and use skb_unclone_keeptruesize() helper · c4777efa
      Eric Dumazet authored
      While commit 097b9146 ("net: fix up truesize of cloned
      skb in skb_prepare_for_shift()") fixed immediate issues found
      when KFENCE was enabled/tested, there are still similar issues,
      when tcp_trim_head() hits KFENCE while the master skb
      is cloned.
      
      This happens under heavy networking TX workloads,
      when the TX completion might be delayed after incoming ACK.
      
      This patch fixes the WARNING in sk_stream_kill_queues
      when sk->sk_mem_queued/sk->sk_forward_alloc are not zero.
      
      Fixes: d3fb45f3 ("mm, kfence: insert KFENCE hooks for SLAB")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarMarco Elver <elver@google.com>
      Link: https://lore.kernel.org/r/20211102004555.1359210-1-eric.dumazet@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      c4777efa
    • Geert Uytterhoeven's avatar
      net: marvell: prestera: Add explicit padding · 236f57fe
      Geert Uytterhoeven authored
      On m68k:
      
          In function ‘prestera_hw_build_tests’,
      	inlined from ‘prestera_hw_switch_init’ at drivers/net/ethernet/marvell/prestera/prestera_hw.c:788:2:
          ././include/linux/compiler_types.h:335:38: error: call to ‘__compiletime_assert_345’ declared with attribute error: BUILD_BUG_ON failed: sizeof(struct prestera_msg_switch_attr_req) != 16
          ...
      
      The driver assumes structure members are naturally aligned, but does not
      add explicit padding, thus breaking architectures where integral values
      are not always naturally aligned (e.g. on m68k, __alignof(int) is 2, not
      4).
      
      Fixes: bb5dbf2c ("net: marvell: prestera: add firmware v4.0 support")
      Signed-off-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Reviewed-by: default avatarArnd Bergmann <arnd@arndb.de>
      Link: https://lore.kernel.org/r/20211102082433.3820514-1-geert@linux-m68k.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      236f57fe
    • Wan Jiabing's avatar
      bnxt_en: avoid newline at end of message in NL_SET_ERR_MSG_MOD · 6ab9f57a
      Wan Jiabing authored
      Fix following coccicheck warning:
      ./drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c:446:8-56: WARNING
      avoid newline at end of message in NL_SET_ERR_MSG_MOD.
      Signed-off-by: default avatarWan Jiabing <wanjiabing@vivo.com>
      Link: https://lore.kernel.org/r/20211102020312.16567-1-wanjiabing@vivo.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      6ab9f57a
    • Jakub Kicinski's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 71229d04
      Jakub Kicinski authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS fixes for net
      
      1) Fix mac address UAF reported by KASAN in nfnetlink_queue,
         from Florian Westphal.
      
      2) Autoload genetlink IPVS on demand, from Thomas Weissschuh.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
        ipvs: autoload ipvs on genl access
        netfilter: nfnetlink_queue: fix OOB when mac header was cleared
      ====================
      
      Link: https://lore.kernel.org/r/20211101221528.236114-1-pablo@netfilter.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      71229d04
    • Maxim Kiselev's avatar
      net: davinci_emac: Fix interrupt pacing disable · d52bcb47
      Maxim Kiselev authored
      This patch allows to use 0 for `coal->rx_coalesce_usecs` param to
      disable rx irq coalescing.
      
      Previously we could enable rx irq coalescing via ethtool
      (For ex: `ethtool -C eth0 rx-usecs 2000`) but we couldn't disable
      it because this part rejects 0 value:
      
             if (!coal->rx_coalesce_usecs)
                     return -EINVAL;
      
      Fixes: 84da2658 ("TI DaVinci EMAC : Implement interrupt pacing functionality.")
      Signed-off-by: default avatarMaxim Kiselev <bigunclemax@gmail.com>
      Reviewed-by: default avatarGrygorii Strashko <grygorii.strashko@ti.com>
      Link: https://lore.kernel.org/r/20211101152343.4193233-1-bigunclemax@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      d52bcb47
    • Yuiko Oshino's avatar
      net: phy: microchip_t1: add lan87xx_config_rgmii_delay for lan87xx phy · 26499499
      Yuiko Oshino authored
      Add a function to initialize phy rgmii delay according to phydev->interface.
      Signed-off-by: default avatarYuiko Oshino <yuiko.oshino@microchip.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20211101162119.29275-1-yuiko.oshino@microchip.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      26499499
  2. 02 Nov, 2021 7 commits
    • Linus Torvalds's avatar
      Merge tag 'x86_core_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · cc0356d6
      Linus Torvalds authored
      Pull x86 core updates from Borislav Petkov:
      
       - Do not #GP on userspace use of CLI/STI but pretend it was a NOP to
         keep old userspace from breaking. Adjust the corresponding iopl
         selftest to that.
      
       - Improve stack overflow warnings to say which stack got overflowed and
         raise the exception stack sizes to 2 pages since overflowing the
         single page of exception stack is very easy to do nowadays with all
         the tracing machinery enabled. With that, rip out the custom mapping
         of AMD SEV's too.
      
       - A bunch of changes in preparation for FGKASLR like supporting more
         than 64K section headers in the relocs tool, correct ORC lookup table
         size to cover the whole kernel .text and other adjustments.
      
      * tag 'x86_core_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        selftests/x86/iopl: Adjust to the faked iopl CLI/STI usage
        vmlinux.lds.h: Have ORC lookup cover entire _etext - _stext
        x86/boot/compressed: Avoid duplicate malloc() implementations
        x86/boot: Allow a "silent" kaslr random byte fetch
        x86/tools/relocs: Support >64K section headers
        x86/sev: Make the #VC exception stacks part of the default stacks storage
        x86: Increase exception stack sizes
        x86/mm/64: Improve stack overflow warnings
        x86/iopl: Fake iopl(3) CLI/STI usage
      cc0356d6
    • Linus Torvalds's avatar
      Merge tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next · fc02cb2b
      Linus Torvalds authored
      Pull networking updates from Jakub Kicinski:
       "Core:
      
         - Remove socket skb caches
      
         - Add a SO_RESERVE_MEM socket op to forward allocate buffer space and
           avoid memory accounting overhead on each message sent
      
         - Introduce managed neighbor entries - added by control plane and
           resolved by the kernel for use in acceleration paths (BPF / XDP
           right now, HW offload users will benefit as well)
      
         - Make neighbor eviction on link down controllable by userspace to
           work around WiFi networks with bad roaming implementations
      
         - vrf: Rework interaction with netfilter/conntrack
      
         - fq_codel: implement L4S style ce_threshold_ect1 marking
      
         - sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
      
        BPF:
      
         - Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
           as implemented in LLVM14
      
         - Introduce bpf_get_branch_snapshot() to capture Last Branch Records
      
         - Implement variadic trace_printk helper
      
         - Add a new Bloomfilter map type
      
         - Track <8-byte scalar spill and refill
      
         - Access hw timestamp through BPF's __sk_buff
      
         - Disallow unprivileged BPF by default
      
         - Document BPF licensing
      
        Netfilter:
      
         - Introduce egress hook for looking at raw outgoing packets
      
         - Allow matching on and modifying inner headers / payload data
      
         - Add NFT_META_IFTYPE to match on the interface type either from
           ingress or egress
      
        Protocols:
      
         - Multi-Path TCP:
            - increase default max additional subflows to 2
            - rework forward memory allocation
            - add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS
      
         - MCTP flow support allowing lower layer drivers to configure msg
           muxing as needed
      
         - Automatic Multicast Tunneling (AMT) driver based on RFC7450
      
         - HSR support the redbox supervision frames (IEC-62439-3:2018)
      
         - Support for the ip6ip6 encapsulation of IOAM
      
         - Netlink interface for CAN-FD's Transmitter Delay Compensation
      
         - Support SMC-Rv2 eliminating the current same-subnet restriction, by
           exploiting the UDP encapsulation feature of RoCE adapters
      
         - TLS: add SM4 GCM/CCM crypto support
      
         - Bluetooth: initial support for link quality and audio/codec offload
      
        Driver APIs:
      
         - Add a batched interface for RX buffer allocation in AF_XDP buffer
           pool
      
         - ethtool: Add ability to control transceiver modules' power mode
      
         - phy: Introduce supported interfaces bitmap to express MAC
           capabilities and simplify PHY code
      
         - Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks
      
        New drivers:
      
         - WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)
      
         - Ethernet driver for ASIX AX88796C SPI device (x88796c)
      
        Drivers:
      
         - Broadcom PHYs
            - support 72165, 7712 16nm PHYs
            - support IDDQ-SR for additional power savings
      
         - PHY support for QCA8081, QCA9561 PHYs
      
         - NXP DPAA2: support for IRQ coalescing
      
         - NXP Ethernet (enetc): support for software TCP segmentation
      
         - Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
           Gigabit-capable IP found on RZ/G2L SoC
      
         - Intel 100G Ethernet
            - support for eswitch offload of TC/OvS flow API, including
              offload of GRE, VxLAN, Geneve tunneling
            - support application device queues - ability to assign Rx and Tx
              queues to application threads
            - PTP and PPS (pulse-per-second) extensions
      
         - Broadcom Ethernet (bnxt)
            - devlink health reporting and device reload extensions
      
         - Mellanox Ethernet (mlx5)
            - offload macvlan interfaces
            - support HW offload of TC rules involving OVS internal ports
            - support HW-GRO and header/data split
            - support application device queues
      
         - Marvell OcteonTx2:
            - add XDP support for PF
            - add PTP support for VF
      
         - Qualcomm Ethernet switch (qca8k): support for QCA8328
      
         - Realtek Ethernet DSA switch (rtl8366rb)
            - support bridge offload
            - support STP, fast aging, disabling address learning
            - support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch
      
         - Mellanox Ethernet/IB switch (mlxsw)
            - multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
            - offload root TBF qdisc as port shaper
            - support multiple routing interface MAC address prefixes
            - support for IP-in-IP with IPv6 underlay
      
         - MediaTek WiFi (mt76)
            - mt7921 - ASPM, 6GHz, SDIO and testmode support
            - mt7915 - LED and TWT support
      
         - Qualcomm WiFi (ath11k)
            - include channel rx and tx time in survey dump statistics
            - support for 80P80 and 160 MHz bandwidths
            - support channel 2 in 6 GHz band
            - spectral scan support for QCN9074
            - support for rx decapsulation offload (data frames in 802.3
              format)
      
         - Qualcomm phone SoC WiFi (wcn36xx)
            - enable Idle Mode Power Save (IMPS) to reduce power consumption
              during idle
      
         - Bluetooth driver support for MediaTek MT7922 and MT7921
      
         - Enable support for AOSP Bluetooth extension in Qualcomm WCN399x and
           Realtek 8822C/8852A
      
         - Microsoft vNIC driver (mana)
            - support hibernation and kexec
      
         - Google vNIC driver (gve)
            - support for jumbo frames
            - implement Rx page reuse
      
        Refactor:
      
         - Make all writes to netdev->dev_addr go thru helpers, so that we can
           add this address to the address rbtree and handle the updates
      
         - Various TCP cleanups and optimizations including improvements to
           CPU cache use
      
         - Simplify the gnet_stats, Qdisc stats' handling and remove
           qdisc->running sequence counter
      
         - Driver changes and API updates to address devlink locking
           deficiencies"
      
      * tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2122 commits)
        Revert "net: avoid double accounting for pure zerocopy skbs"
        selftests: net: add arp_ndisc_evict_nocarrier
        net: ndisc: introduce ndisc_evict_nocarrier sysctl parameter
        net: arp: introduce arp_evict_nocarrier sysctl parameter
        libbpf: Deprecate AF_XDP support
        kbuild: Unify options for BTF generation for vmlinux and modules
        selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
        bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
        bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
        net: vmxnet3: remove multiple false checks in vmxnet3_ethtool.c
        net: avoid double accounting for pure zerocopy skbs
        tcp: rename sk_wmem_free_skb
        netdevsim: fix uninit value in nsim_drv_configure_vfs()
        selftests/bpf: Fix also no-alu32 strobemeta selftest
        bpf: Add missing map_delete_elem method to bloom filter map
        selftests/bpf: Add bloom map success test for userspace calls
        bpf: Add alignment padding for "map_extra" + consolidate holes
        bpf: Bloom filter map naming fixups
        selftests/bpf: Add test cases for struct_ops prog
        bpf: Add dummy BPF STRUCT_OPS for test purpose
        ...
      fc02cb2b
    • Jakub Kicinski's avatar
      Revert "net: avoid double accounting for pure zerocopy skbs" · 84882cf7
      Jakub Kicinski authored
      This reverts commit f1a456f8.
      
        WARNING: CPU: 1 PID: 6819 at net/core/skbuff.c:5429 skb_try_coalesce+0x78b/0x7e0
        CPU: 1 PID: 6819 Comm: xxxxxxx Kdump: loaded Tainted: G S                5.15.0-04194-gd852503f7711 #16
        RIP: 0010:skb_try_coalesce+0x78b/0x7e0
        Code: e8 2a bf 41 ff 44 8b b3 bc 00 00 00 48 8b 7c 24 30 e8 19 c0 41 ff 44 89 f0 48 03 83 c0 00 00 00 48 89 44 24 40 e9 47 fb ff ff <0f> 0b e9 ca fc ff ff 4c 8d 70 ff 48 83 c0 07 48 89 44 24 38 e9 61
        RSP: 0018:ffff88881f449688 EFLAGS: 00010282
        RAX: 00000000fffffe96 RBX: ffff8881566e4460 RCX: ffffffff82079f7e
        RDX: 0000000000000003 RSI: dffffc0000000000 RDI: ffff8881566e47b0
        RBP: ffff8881566e46e0 R08: ffffed102619235d R09: ffffed102619235d
        R10: ffff888130c91ae3 R11: ffffed102619235c R12: ffff88881f4498a0
        R13: 0000000000000056 R14: 0000000000000009 R15: ffff888130c91ac0
        FS:  00007fec2cbb9700(0000) GS:ffff88881f440000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007fec1b060d80 CR3: 00000003acf94005 CR4: 00000000003706e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         <IRQ>
         tcp_try_coalesce+0xeb/0x290
         ? tcp_parse_options+0x610/0x610
         ? mark_held_locks+0x79/0xa0
         tcp_queue_rcv+0x69/0x2f0
         tcp_rcv_established+0xa49/0xd40
         ? tcp_data_queue+0x18a0/0x18a0
         tcp_v6_do_rcv+0x1c9/0x880
         ? rt6_mtu_change_route+0x100/0x100
         tcp_v6_rcv+0x1624/0x1830
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      84882cf7
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · bfc484fe
      Linus Torvalds authored
      Pull crypto updates from Herbert Xu:
       "API:
      
         - Delay boot-up self-test for built-in algorithms
      
        Algorithms:
      
         - Remove fallback path on arm64 as SIMD now runs with softirq off
      
        Drivers:
      
         - Add Keem Bay OCS ECC Driver"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (61 commits)
        crypto: testmgr - fix wrong key length for pkcs1pad
        crypto: pcrypt - Delay write to padata->info
        crypto: ccp - Make use of the helper macro kthread_run()
        crypto: sa2ul - Use the defined variable to clean code
        crypto: s5p-sss - Add error handling in s5p_aes_probe()
        crypto: keembay-ocs-ecc - Add Keem Bay OCS ECC Driver
        dt-bindings: crypto: Add Keem Bay ECC bindings
        crypto: ecc - Export additional helper functions
        crypto: ecc - Move ecc.h to include/crypto/internal
        crypto: engine - Add KPP Support to Crypto Engine
        crypto: api - Do not create test larvals if manager is disabled
        crypto: tcrypt - fix skcipher multi-buffer tests for 1420B blocks
        hwrng: s390 - replace snprintf in show functions with sysfs_emit
        crypto: octeontx2 - set assoclen in aead_do_fallback()
        crypto: ccp - Fix whitespace in sev_cmd_buffer_len()
        hwrng: mtk - Force runtime pm ops for sleep ops
        crypto: testmgr - Only disable migration in crypto_disable_simd_for_test()
        crypto: qat - share adf_enable_pf2vf_comms() from adf_pf2vf_msg.c
        crypto: qat - extract send and wait from adf_vf2pf_request_version()
        crypto: qat - add VF and PF wrappers to common send function
        ...
      bfc484fe
    • Linus Torvalds's avatar
      Merge tag 'audit-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit · d2fac0af
      Linus Torvalds authored
      Pull audit updates from Paul Moore:
       "Add some additional audit logging to capture the openat2() syscall
        open_how struct info.
      
        Previous variations of the open()/openat() syscalls allowed audit
        admins to inspect the syscall args to get the information contained in
        the new open_how struct used in openat2()"
      
      * tag 'audit-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
        audit: return early if the filter rule has a lower priority
        audit: add OPENAT2 record to list "how" info
        audit: add support for the openat2 syscall
        audit: replace magic audit syscall class numbers with macros
        lsm_audit: avoid overloading the "key" audit field
        audit: Convert to SPDX identifier
        audit: rename struct node to struct audit_node to prevent future name collisions
      d2fac0af
    • Linus Torvalds's avatar
      Merge tag 'selinux-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux · cdab10bf
      Linus Torvalds authored
      Pull selinux updates from Paul Moore:
      
       - Add LSM/SELinux/Smack controls and auditing for io-uring.
      
         As usual, the individual commit descriptions have more detail, but we
         were basically missing two things which we're adding here:
      
            + establishment of a proper audit context so that auditing of
              io-uring ops works similarly to how it does for syscalls (with
              some io-uring additions because io-uring ops are *not* syscalls)
      
            + additional LSM hooks to enable access control points for some of
              the more unusual io-uring features, e.g. credential overrides.
      
         The additional audit callouts and LSM hooks were done in conjunction
         with the io-uring folks, based on conversations and RFC patches
         earlier in the year.
      
       - Fixup the binder credential handling so that the proper credentials
         are used in the LSM hooks; the commit description and the code
         comment which is removed in these patches are helpful to understand
         the background and why this is the proper fix.
      
       - Enable SELinux genfscon policy support for securityfs, allowing
         improved SELinux filesystem labeling for other subsystems which make
         use of securityfs, e.g. IMA.
      
      * tag 'selinux-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
        security: Return xattr name from security_dentry_init_security()
        selinux: fix a sock regression in selinux_ip_postroute_compat()
        binder: use cred instead of task for getsecid
        binder: use cred instead of task for selinux checks
        binder: use euid from cred instead of using task
        LSM: Avoid warnings about potentially unused hook variables
        selinux: fix all of the W=1 build warnings
        selinux: make better use of the nf_hook_state passed to the NF hooks
        selinux: fix race condition when computing ocontext SIDs
        selinux: remove unneeded ipv6 hook wrappers
        selinux: remove the SELinux lockdown implementation
        selinux: enable genfscon labeling for securityfs
        Smack: Brutalist io_uring support
        selinux: add support for the io_uring access controls
        lsm,io_uring: add LSM hooks to io_uring
        io_uring: convert io_uring to the secure anon inode interface
        fs: add anon_inode_getfile_secure() similar to anon_inode_getfd_secure()
        audit: add filtering for io_uring records
        audit,io_uring,io-wq: add some basic audit support to io_uring
        audit: prepare audit_context for use in calling contexts beyond syscalls
      cdab10bf
    • Linus Torvalds's avatar
      Merge tag 'rcu.2021.11.01a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu · 6fedc280
      Linus Torvalds authored
      Pull RCU updates from Paul McKenney:
      
       - Miscellaneous fixes
      
       - Torture-test updates for smp_call_function(), most notably improved
         checking of module parameters.
      
       - Tasks-trace RCU updates that fix a number of rare but important
         race-condition bugs.
      
       - Other torture-test updates, most notably better checking of module
         parameters. In addition, rcutorture may once again be run on
         CONFIG_PREEMPT_RT kernels.
      
       - Torture-test scripting updates, most notably specifying the new
         CONFIG_KCSAN_STRICT kconfig option rather than maintaining an
         ever-changing list of individual KCSAN kconfig options.
      
      * tag 'rcu.2021.11.01a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (46 commits)
        rcu: Fix rcu_dynticks_curr_cpu_in_eqs() vs noinstr
        rcu: Always inline rcu_dynticks_task*_{enter,exit}()
        torture: Make kvm-remote.sh print size of downloaded tarball
        torture: Allot 1G of memory for scftorture runs
        tools/rcu: Add an extract-stall script
        scftorture: Warn on individual scf_torture_init() error conditions
        scftorture: Count reschedule IPIs
        scftorture: Account for weight_resched when checking for all zeroes
        scftorture: Shut down if nonsensical arguments given
        scftorture: Allow zero weight to exclude an smp_call_function*() category
        rcu: Avoid unneeded function call in rcu_read_unlock()
        rcu-tasks: Update comments to cond_resched_tasks_rcu_qs()
        rcu-tasks: Fix IPI failure handling in trc_wait_for_one_reader
        rcu-tasks: Fix read-side primitives comment for call_rcu_tasks_trace
        rcu-tasks: Clarify read side section info for rcu_tasks_rude GP primitives
        rcu-tasks: Correct comparisons for CPU numbers in show_stalled_task_trace
        rcu-tasks: Correct firstreport usage in check_all_holdout_tasks_trace
        rcu-tasks: Fix s/rcu_add_holdout/trc_add_holdout/ typo in comment
        rcu-tasks: Move RTGS_WAIT_CBS to beginning of rcu_tasks_kthread() loop
        rcu-tasks: Fix s/instruction/instructions/ typo in comment
        ...
      6fedc280