1. 02 Apr, 2020 40 commits
    • Wen Xiong's avatar
      scsi: ipr: Fix softlockup when rescanning devices in petitboot · 29aacd43
      Wen Xiong authored
      [ Upstream commit 394b6171 ]
      
      When trying to rescan disks in petitboot shell, we hit the following
      softlockup stacktrace:
      
      Kernel panic - not syncing: System is deadlocked on memory
      [  241.223394] CPU: 32 PID: 693 Comm: sh Not tainted 5.4.16-openpower1 #1
      [  241.223406] Call Trace:
      [  241.223415] [c0000003f07c3180] [c000000000493fc4] dump_stack+0xa4/0xd8 (unreliable)
      [  241.223432] [c0000003f07c31c0] [c00000000007d4ac] panic+0x148/0x3cc
      [  241.223446] [c0000003f07c3260] [c000000000114b10] out_of_memory+0x468/0x4c4
      [  241.223461] [c0000003f07c3300] [c0000000001472b0] __alloc_pages_slowpath+0x594/0x6d8
      [  241.223476] [c0000003f07c3420] [c00000000014757c] __alloc_pages_nodemask+0x188/0x1a4
      [  241.223492] [c0000003f07c34a0] [c000000000153e10] alloc_pages_current+0xcc/0xd8
      [  241.223508] [c0000003f07c34e0] [c0000000001577ac] alloc_slab_page+0x30/0x98
      [  241.223524] [c0000003f07c3520] [c0000000001597fc] new_slab+0x138/0x40c
      [  241.223538] [c0000003f07c35f0] [c00000000015b204] ___slab_alloc+0x1e4/0x404
      [  241.223552] [c0000003f07c36c0] [c00000000015b450] __slab_alloc+0x2c/0x48
      [  241.223566] [c0000003f07c36f0] [c00000000015b754] kmem_cache_alloc_node+0x9c/0x1b4
      [  241.223582] [c0000003f07c3760] [c000000000218c48] blk_alloc_queue_node+0x34/0x270
      [  241.223599] [c0000003f07c37b0] [c000000000226574] blk_mq_init_queue+0x2c/0x78
      [  241.223615] [c0000003f07c37e0] [c0000000002ff710] scsi_mq_alloc_queue+0x28/0x70
      [  241.223631] [c0000003f07c3810] [c0000000003005b8] scsi_alloc_sdev+0x184/0x264
      [  241.223647] [c0000003f07c38a0] [c000000000300ba0] scsi_probe_and_add_lun+0x288/0xa3c
      [  241.223663] [c0000003f07c3a00] [c000000000301768] __scsi_scan_target+0xcc/0x478
      [  241.223679] [c0000003f07c3b20] [c000000000301c64] scsi_scan_channel.part.9+0x74/0x7c
      [  241.223696] [c0000003f07c3b70] [c000000000301df4] scsi_scan_host_selected+0xe0/0x158
      [  241.223712] [c0000003f07c3bd0] [c000000000303f04] store_scan+0x104/0x114
      [  241.223727] [c0000003f07c3cb0] [c0000000002d5ac4] dev_attr_store+0x30/0x4c
      [  241.223741] [c0000003f07c3cd0] [c0000000001dbc34] sysfs_kf_write+0x64/0x78
      [  241.223756] [c0000003f07c3cf0] [c0000000001da858] kernfs_fop_write+0x170/0x1b8
      [  241.223773] [c0000003f07c3d40] [c0000000001621fc] __vfs_write+0x34/0x60
      [  241.223787] [c0000003f07c3d60] [c000000000163c2c] vfs_write+0xa8/0xcc
      [  241.223802] [c0000003f07c3db0] [c000000000163df4] ksys_write+0x70/0xbc
      [  241.223816] [c0000003f07c3e20] [c00000000000b40c] system_call+0x5c/0x68
      
      As a part of the scan process Linux will allocate and configure a
      scsi_device for each target to be scanned. If the device is not present,
      then the scsi_device is torn down. As a part of scsi_device teardown a
      workqueue item will be scheduled and the lockups we see are because there
      are 250k workqueue items to be processed.  Accoding to the specification of
      SIS-64 sas controller, max_channel should be decreased on SIS-64 adapters
      to 4.
      
      The patch fixes softlockup issue.
      
      Thanks for Oliver Halloran's help with debugging and explanation!
      
      Link: https://lore.kernel.org/r/1583510248-23672-1-git-send-email-wenxiong@linux.vnet.ibm.comSigned-off-by: default avatarWen Xiong <wenxiong@linux.vnet.ibm.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      29aacd43
    • Julian Wiedmann's avatar
      s390/qeth: handle error when backing RX buffer · 547d6d43
      Julian Wiedmann authored
      [ Upstream commit 17413852 ]
      
      qeth_init_qdio_queues() fills the RX ring with an initial set of
      RX buffers. If qeth_init_input_buffer() fails to back one of the RX
      buffers with memory, we need to bail out and report the error.
      
      Fixes: 4a71df50 ("qeth: new qeth device driver")
      Signed-off-by: default avatarJulian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      547d6d43
    • Madalin Bucur's avatar
      fsl/fman: detect FMan erratum A050385 · 7deaf533
      Madalin Bucur authored
      [ Upstream commit b281f7b9 ]
      
      Detect the presence of the A050385 erratum.
      Signed-off-by: default avatarMadalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7deaf533
    • Madalin Bucur's avatar
      arm64: dts: ls1043a: FMan erratum A050385 · 63592540
      Madalin Bucur authored
      [ Upstream commit b54d3900 ]
      
      The LS1043A SoC is affected by the A050385 erratum stating that
      FMAN DMA read or writes under heavy traffic load may cause FMAN
      internal resource leak thus stopping further packet processing.
      Signed-off-by: default avatarMadalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      63592540
    • Madalin Bucur's avatar
      dt-bindings: net: FMan erratum A050385 · fbc835b0
      Madalin Bucur authored
      [ Upstream commit 26d5bb9e ]
      
      FMAN DMA read or writes under heavy traffic load may cause FMAN
      internal resource leak; thus stopping further packet processing.
      
      The FMAN internal queue can overflow when FMAN splits single
      read or write transactions into multiple smaller transactions
      such that more than 17 AXI transactions are in flight from FMAN
      to interconnect. When the FMAN internal queue overflows, it can
      stall further packet processing. The issue can occur with any one
      of the following three conditions:
      
        1. FMAN AXI transaction crosses 4K address boundary (Errata
           A010022)
        2. FMAN DMA address for an AXI transaction is not 16 byte
           aligned, i.e. the last 4 bits of an address are non-zero
        3. Scatter Gather (SG) frames have more than one SG buffer in
           the SG list and any one of the buffers, except the last
           buffer in the SG list has data size that is not a multiple
           of 16 bytes, i.e., other than 16, 32, 48, 64, etc.
      
      With any one of the above three conditions present, there is
      likelihood of stalled FMAN packet processing, especially under
      stress with multiple ports injecting line-rate traffic.
      
      To avoid situations that stall FMAN packet processing, all of the
      above three conditions must be avoided; therefore, configure the
      system with the following rules:
      
        1. Frame buffers must not span a 4KB address boundary, unless
           the frame start address is 256 byte aligned
        2. All FMAN DMA start addresses (for example, BMAN buffer
           address, FD[address] + FD[offset]) are 16B aligned
        3. SG table and buffer addresses are 16B aligned and the size
           of SG buffers are multiple of 16 bytes, except for the last
           SG buffer that can be of any size.
      
      Additional workaround notes:
      - Address alignment of 64 bytes is recommended for maximally
      efficient system bus transactions (although 16 byte alignment is
      sufficient to avoid the stall condition)
      - To support frame sizes that are larger than 4K bytes, there are
      two options:
        1. Large single buffer frames that span a 4KB page boundary can
           be converted into SG frames to avoid transaction splits at
           the 4KB boundary,
        2. Align the large single buffer to 256B address boundaries,
           ensure that the frame address plus offset is 256B aligned.
      - If software generated SG frames have buffers that are unaligned
      and with random non-multiple of 16 byte lengths, before
      transmitting such frames via FMAN, frames will need to be copied
      into a new single buffer or multiple buffer SG frame that is
      compliant with the three rules listed above.
      Signed-off-by: default avatarMadalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      fbc835b0
    • Tycho Andersen's avatar
      cgroup1: don't call release_agent when it is "" · 5a8a6943
      Tycho Andersen authored
      [ Upstream commit 2e5383d7 ]
      
      Older (and maybe current) versions of systemd set release_agent to "" when
      shutting down, but do not set notify_on_release to 0.
      
      Since 64e90a8a ("Introduce STATIC_USERMODEHELPER to mediate
      call_usermodehelper()"), we filter out such calls when the user mode helper
      path is "". However, when used in conjunction with an actual (i.e. non "")
      STATIC_USERMODEHELPER, the path is never "", so the real usermode helper
      will be called with argv[0] == "".
      
      Let's avoid this by not invoking the release_agent when it is "".
      Signed-off-by: default avatarTycho Andersen <tycho@tycho.ws>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5a8a6943
    • Dajun Jin's avatar
      drivers/of/of_mdio.c:fix of_mdiobus_register() · 1bea2330
      Dajun Jin authored
      [ Upstream commit 209c65b6 ]
      
      When registers a phy_device successful, should terminate the loop
      or the phy_device would be registered in other addr. If there are
      multiple PHYs without reg properties, it will go wrong.
      Signed-off-by: default avatarDajun Jin <adajunjin@gmail.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1bea2330
    • Mike Gilbert's avatar
      cpupower: avoid multiple definition with gcc -fno-common · 87639a60
      Mike Gilbert authored
      [ Upstream commit 2de7fb60 ]
      
      Building cpupower with -fno-common in CFLAGS results in errors due to
      multiple definitions of the 'cpu_count' and 'start_time' variables.
      
      ./utils/idle_monitor/snb_idle.o:./utils/idle_monitor/cpupower-monitor.h:28:
      multiple definition of `cpu_count';
      ./utils/idle_monitor/nhm_idle.o:./utils/idle_monitor/cpupower-monitor.h:28:
      first defined here
      ...
      ./utils/idle_monitor/cpuidle_sysfs.o:./utils/idle_monitor/cpuidle_sysfs.c:22:
      multiple definition of `start_time';
      ./utils/idle_monitor/amd_fam14h_idle.o:./utils/idle_monitor/amd_fam14h_idle.c:85:
      first defined here
      
      The -fno-common option will be enabled by default in GCC 10.
      
      Bug: https://bugs.gentoo.org/707462Signed-off-by: default avatarMike Gilbert <floppym@gentoo.org>
      Signed-off-by: default avatarShuah Khan <skhan@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      87639a60
    • Scott Mayhew's avatar
      nfs: add minor version to nfs_server_key for fscache · 968d4ec0
      Scott Mayhew authored
      [ Upstream commit 55dee1bc ]
      
      An NFS client that mounts multiple exports from the same NFS
      server with higher NFSv4 versions disabled (i.e. 4.2) and without
      forcing a specific NFS version results in fscache index cookie
      collisions and the following messages:
      [  570.004348] FS-Cache: Duplicate cookie detected
      
      Each nfs_client structure should have its own fscache index cookie,
      so add the minorversion to nfs_server_key.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=200145Signed-off-by: default avatarScott Mayhew <smayhew@redhat.com>
      Signed-off-by: default avatarDave Wysochanski <dwysocha@redhat.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      968d4ec0
    • Vasily Averin's avatar
      cgroup-v1: cgroup_pidlist_next should update position index · 967e9746
      Vasily Averin authored
      [ Upstream commit db8dd969 ]
      
      if seq_file .next fuction does not change position index,
      read after some lseek can generate unexpected output.
      
       # mount | grep cgroup
       # dd if=/mnt/cgroup.procs bs=1  # normal output
      ...
      1294
      1295
      1296
      1304
      1382
      584+0 records in
      584+0 records out
      584 bytes copied
      
      dd: /mnt/cgroup.procs: cannot skip to specified offset
      83  <<< generates end of last line
      1383  <<< ... and whole last line once again
      0+1 records in
      0+1 records out
      8 bytes copied
      
      dd: /mnt/cgroup.procs: cannot skip to specified offset
      1386  <<< generates last line anyway
      0+1 records in
      0+1 records out
      5 bytes copied
      
      https://bugzilla.kernel.org/show_bug.cgi?id=206283Signed-off-by: default avatarVasily Averin <vvs@virtuozzo.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      967e9746
    • Taehee Yoo's avatar
      hsr: set .netnsok flag · 85eaea5f
      Taehee Yoo authored
      [ Upstream commit 09e91dbe ]
      
      The hsr module has been supporting the list and status command.
      (HSR_C_GET_NODE_LIST and HSR_C_GET_NODE_STATUS)
      These commands send node information to the user-space via generic netlink.
      But, in the non-init_net namespace, these commands are not allowed
      because .netnsok flag is false.
      So, there is no way to get node information in the non-init_net namespace.
      
      Fixes: f421436a ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      85eaea5f
    • Taehee Yoo's avatar
      hsr: add restart routine into hsr_get_node_list() · 0884e787
      Taehee Yoo authored
      [ Upstream commit ca19c70f ]
      
      The hsr_get_node_list() is to send node addresses to the userspace.
      If there are so many nodes, it could fail because of buffer size.
      In order to avoid this failure, the restart routine is added.
      
      Fixes: f421436a ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0884e787
    • Taehee Yoo's avatar
      hsr: use rcu_read_lock() in hsr_get_node_{list/status}() · d45f603e
      Taehee Yoo authored
      [ Upstream commit 173756b8 ]
      
      hsr_get_node_{list/status}() are not under rtnl_lock() because
      they are callback functions of generic netlink.
      But they use __dev_get_by_index() without rtnl_lock().
      So, it would use unsafe data.
      In order to fix it, rcu_read_lock() and dev_get_by_index_rcu()
      are used instead of __dev_get_by_index().
      
      Fixes: f421436a ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d45f603e
    • Taehee Yoo's avatar
      vxlan: check return value of gro_cells_init() · facf9c7e
      Taehee Yoo authored
      [ Upstream commit 384d91c2 ]
      
      gro_cells_init() returns error if memory allocation is failed.
      But the vxlan module doesn't check the return value of gro_cells_init().
      
      Fixes: 58ce31cc ("vxlan: GRO support at tunnel layer")`
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      facf9c7e
    • Eric Dumazet's avatar
      tcp: repair: fix TCP_QUEUE_SEQ implementation · 58b501cc
      Eric Dumazet authored
      [ Upstream commit 6cd6cbf5 ]
      
      When application uses TCP_QUEUE_SEQ socket option to
      change tp->rcv_next, we must also update tp->copied_seq.
      
      Otherwise, stuff relying on tcp_inq() being precise can
      eventually be confused.
      
      For example, tcp_zerocopy_receive() might crash because
      it does not expect tcp_recv_skb() to return NULL.
      
      We could add tests in various places to fix the issue,
      or simply make sure tcp_inq() wont return a random value,
      and leave fast path as it is.
      
      Note that this fixes ioctl(fd, SIOCINQ, &val) at the same
      time.
      
      Fixes: ee995283 ("tcp: Initial repair mode")
      Fixes: 05255b82 ("tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      58b501cc
    • Heiner Kallweit's avatar
      r8169: re-enable MSI on RTL8168c · 87559662
      Heiner Kallweit authored
      [ Upstream commit f13bc681 ]
      
      The original change fixed an issue on RTL8168b by mimicking the vendor
      driver behavior to disable MSI on chip versions before RTL8168d.
      This however now caused an issue on a system with RTL8168c, see [0].
      Therefore leave MSI disabled on RTL8168b, but re-enable it on RTL8168c.
      
      [0] https://bugzilla.redhat.com/show_bug.cgi?id=1792839
      
      Fixes: 003bd5b4 ("r8169: don't use MSI before RTL8168d")
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      87559662
    • Rayagonda Kokatanur's avatar
      net: phy: mdio-mux-bcm-iproc: check clk_prepare_enable() return value · 39c6f2be
      Rayagonda Kokatanur authored
      [ Upstream commit 872307ab ]
      
      Check clk_prepare_enable() return value.
      
      Fixes: 2c723044 ("net: phy: Add pm support to Broadcom iProc mdio mux driver")
      Signed-off-by: default avatarRayagonda Kokatanur <rayagonda.kokatanur@broadcom.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      39c6f2be
    • René van Dorst's avatar
      net: dsa: mt7530: Change the LINK bit to reflect the link status · 579efdbb
      René van Dorst authored
      [ Upstream commit 22259471 ]
      
      Andrew reported:
      
      After a number of network port link up/down changes, sometimes the switch
      port gets stuck in a state where it thinks it is still transmitting packets
      but the cpu port is not actually transmitting anymore. In this state you
      will see a message on the console
      "mtk_soc_eth 1e100000.ethernet eth0: transmit timed out" and the Tx counter
      in ifconfig will be incrementing on virtual port, but not incrementing on
      cpu port.
      
      The issue is that MAC TX/RX status has no impact on the link status or
      queue manager of the switch. So the queue manager just queues up packets
      of a disabled port and sends out pause frames when the queue is full.
      
      Change the LINK bit to reflect the link status.
      
      Fixes: b8f126a8 ("net-next: dsa: add dsa support for Mediatek MT7530 switch")
      Reported-by: default avatarAndrew Smith <andrew.smith@digi.com>
      Signed-off-by: default avatarRené van Dorst <opensource@vdorst.com>
      Reviewed-by: default avatarVivien Didelot <vivien.didelot@gmail.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      579efdbb
    • Petr Machata's avatar
      net: ip_gre: Accept IFLA_INFO_DATA-less configuration · f5ebb2dd
      Petr Machata authored
      [ Upstream commit 32ca98fe ]
      
      The fix referenced below causes a crash when an ERSPAN tunnel is created
      without passing IFLA_INFO_DATA. Fix by validating passed-in data in the
      same way as ipgre does.
      
      Fixes: e1f8f78f ("net: ip_gre: Separate ERSPAN newlink / changelink callbacks")
      Reported-by: syzbot+1b4ebf4dae4e510dd219@syzkaller.appspotmail.com
      Signed-off-by: default avatarPetr Machata <petrm@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f5ebb2dd
    • Petr Machata's avatar
      net: ip_gre: Separate ERSPAN newlink / changelink callbacks · 54266b26
      Petr Machata authored
      [ Upstream commit e1f8f78f ]
      
      ERSPAN shares most of the code path with GRE and gretap code. While that
      helps keep the code compact, it is also error prone. Currently a broken
      userspace can turn a gretap tunnel into a de facto ERSPAN one by passing
      IFLA_GRE_ERSPAN_VER. There has been a similar issue in ip6gretap in the
      past.
      
      To prevent these problems in future, split the newlink and changelink code
      paths. Split the ERSPAN code out of ipgre_netlink_parms() into a new
      function erspan_netlink_parms(). Extract a piece of common logic from
      ipgre_newlink() and ipgre_changelink() into ipgre_newlink_encap_setup().
      Add erspan_newlink() and erspan_changelink().
      
      Fixes: 84e54fe0 ("gre: introduce native tunnel support for ERSPAN")
      Signed-off-by: default avatarPetr Machata <petrm@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      54266b26
    • Vasundhara Volam's avatar
      bnxt_en: Reset rings if ring reservation fails during open() · 2cbbedac
      Vasundhara Volam authored
      [ Upstream commit 5d765a5e ]
      
      If ring counts are not reset when ring reservation fails,
      bnxt_init_dflt_ring_mode() will not be called again to reinitialise
      IRQs when open() is called and results in system crash as napi will
      also be not initialised. This patch fixes it by resetting the ring
      counts.
      
      Fixes: 47558acd ("bnxt_en: Reserve rings at driver open if none was reserved at probe time.")
      Signed-off-by: default avatarVasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2cbbedac
    • Edwin Peer's avatar
      bnxt_en: fix memory leaks in bnxt_dcbnl_ieee_getets() · 867c079e
      Edwin Peer authored
      [ Upstream commit 62d4073e ]
      
      The allocated ieee_ets structure goes out of scope without being freed,
      leaking memory. Appropriate result codes should be returned so that
      callers do not rely on invalid data passed by reference.
      
      Also cache the ETS config retrieved from the device so that it doesn't
      need to be freed. The balance of the code was clearly written with the
      intent of having the results of querying the hardware cached in the
      device structure. The commensurate store was evidently missed though.
      
      Fixes: 7df4ae9f ("bnxt_en: Implement DCBNL to support host-based DCBX.")
      Signed-off-by: default avatarEdwin Peer <edwin.peer@broadcom.com>
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      867c079e
    • Oliver Hartkopp's avatar
      slcan: not call free_netdev before rtnl_unlock in slcan_open · 297e87cf
      Oliver Hartkopp authored
      [ Upstream commit 2091a3d4 ]
      
      As the description before netdev_run_todo, we cannot call free_netdev
      before rtnl_unlock, fix it by reorder the code.
      
      This patch is a 1:1 copy of upstream slip.c commit f596c870
      ("slip: not call free_netdev before rtnl_unlock in slip_open").
      Reported-by: default avataryangerkun <yangerkun@huawei.com>
      Signed-off-by: default avatarOliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      297e87cf
    • Dan Carpenter's avatar
      NFC: fdp: Fix a signedness bug in fdp_nci_send_patch() · b9ac8105
      Dan Carpenter authored
      [ Upstream commit 0dcdf9f6 ]
      
      The nci_conn_max_data_pkt_payload_size() function sometimes returns
      -EPROTO so "max_size" needs to be signed for the error handling to
      work.  We can make "payload_size" an int as well.
      
      Fixes: a06347c0 ("NFC: Add Intel Fields Peak NFC solution driver")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b9ac8105
    • Emil Renner Berthing's avatar
      net: stmmac: dwmac-rk: fix error path in rk_gmac_probe · 47e36be1
      Emil Renner Berthing authored
      [ Upstream commit 9de9aa48 ]
      
      Make sure we clean up devicetree related configuration
      also when clock init fails.
      
      Fixes: fecd4d7e ("net: stmmac: dwmac-rk: Add integrated PHY support")
      Signed-off-by: default avatarEmil Renner Berthing <kernel@esmil.dk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      47e36be1
    • Cong Wang's avatar
      net_sched: keep alloc_hash updated after hash allocation · 557d015f
      Cong Wang authored
      [ Upstream commit 0d1c3530 ]
      
      In commit 599be01e ("net_sched: fix an OOB access in cls_tcindex")
      I moved cp->hash calculation before the first
      tcindex_alloc_perfect_hash(), but cp->alloc_hash is left untouched.
      This difference could lead to another out of bound access.
      
      cp->alloc_hash should always be the size allocated, we should
      update it after this tcindex_alloc_perfect_hash().
      
      Reported-and-tested-by: syzbot+dcc34d54d68ef7d2d53d@syzkaller.appspotmail.com
      Reported-and-tested-by: syzbot+c72da7b9ed57cde6fca2@syzkaller.appspotmail.com
      Fixes: 599be01e ("net_sched: fix an OOB access in cls_tcindex")
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      557d015f
    • Cong Wang's avatar
      net_sched: cls_route: remove the right filter from hashtable · ea3d6652
      Cong Wang authored
      [ Upstream commit ef299cc3 ]
      
      route4_change() allocates a new filter and copies values from
      the old one. After the new filter is inserted into the hash
      table, the old filter should be removed and freed, as the final
      step of the update.
      
      However, the current code mistakenly removes the new one. This
      looks apparently wrong to me, and it causes double "free" and
      use-after-free too, as reported by syzbot.
      
      Reported-and-tested-by: syzbot+f9b32aaacd60305d9687@syzkaller.appspotmail.com
      Reported-and-tested-by: syzbot+2f8c233f131943d6056d@syzkaller.appspotmail.com
      Reported-and-tested-by: syzbot+9c2df9fd5e9445b74e01@syzkaller.appspotmail.com
      Fixes: 1109c005 ("net: sched: RCU cls_route")
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ea3d6652
    • Pawel Dembicki's avatar
      net: qmi_wwan: add support for ASKEY WWHC050 · efec582a
      Pawel Dembicki authored
      [ Upstream commit 12a5ba5a ]
      
      ASKEY WWHC050 is a mcie LTE modem.
      The oem configuration states:
      
      T:  Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=480  MxCh= 0
      D:  Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
      P:  Vendor=1690 ProdID=7588 Rev=ff.ff
      S:  Manufacturer=Android
      S:  Product=Android
      S:  SerialNumber=813f0eef6e6e
      C:* #Ifs= 6 Cfg#= 1 Atr=80 MxPwr=500mA
      I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
      E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
      E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      E:  Ad=84(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
      E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      E:  Ad=86(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
      E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
      E:  Ad=88(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
      E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      I:* If#= 5 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=(none)
      E:  Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      E:  Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=125us
      
      Tested on openwrt distribution.
      Signed-off-by: default avatarCezary Jackiewicz <cezary@eko.one.pl>
      Signed-off-by: default avatarPawel Dembicki <paweldembicki@gmail.com>
      Acked-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      efec582a
    • Willem de Bruijn's avatar
      net/packet: tpacket_rcv: avoid a producer race condition · 6fb0e438
      Willem de Bruijn authored
      [ Upstream commit 61fad681 ]
      
      PACKET_RX_RING can cause multiple writers to access the same slot if a
      fast writer wraps the ring while a slow writer is still copying. This
      is particularly likely with few, large, slots (e.g., GSO packets).
      
      Synchronize kernel thread ownership of rx ring slots with a bitmap.
      
      Writers acquire a slot race-free by testing tp_status TP_STATUS_KERNEL
      while holding the sk receive queue lock. They release this lock before
      copying and set tp_status to TP_STATUS_USER to release to userspace
      when done. During copying, another writer may take the lock, also see
      TP_STATUS_KERNEL, and start writing to the same slot.
      
      Introduce a new rx_owner_map bitmap with a bit per slot. To acquire a
      slot, test and set with the lock held. To release race-free, update
      tp_status and owner bit as a transaction, so take the lock again.
      
      This is the one of a variety of discussed options (see Link below):
      
      * instead of a shadow ring, embed the data in the slot itself, such as
      in tp_padding. But any test for this field may match a value left by
      userspace, causing deadlock.
      
      * avoid the lock on release. This leaves a small race if releasing the
      shadow slot before setting TP_STATUS_USER. The below reproducer showed
      that this race is not academic. If releasing the slot after tp_status,
      the race is more subtle. See the first link for details.
      
      * add a new tp_status TP_KERNEL_OWNED to avoid the transactional store
      of two fields. But, legacy applications may interpret all non-zero
      tp_status as owned by the user. As libpcap does. So this is possible
      only opt-in by newer processes. It can be added as an optional mode.
      
      * embed the struct at the tail of pg_vec to avoid extra allocation.
      The implementation proved no less complex than a separate field.
      
      The additional locking cost on release adds contention, no different
      than scaling on multicore or multiqueue h/w. In practice, below
      reproducer nor small packet tcpdump showed a noticeable change in
      perf report in cycles spent in spinlock. Where contention is
      problematic, packet sockets support mitigation through PACKET_FANOUT.
      And we can consider adding opt-in state TP_KERNEL_OWNED.
      
      Easy to reproduce by running multiple netperf or similar TCP_STREAM
      flows concurrently with `tcpdump -B 129 -n greater 60000`.
      
      Based on an earlier patchset by Jon Rosen. See links below.
      
      I believe this issue goes back to the introduction of tpacket_rcv,
      which predates git history.
      
      Link: https://www.mail-archive.com/netdev@vger.kernel.org/msg237222.htmlSuggested-by: default avatarJon Rosen <jrosen@cisco.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarJon Rosen <jrosen@cisco.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6fb0e438
    • Jisheng Zhang's avatar
      net: mvneta: Fix the case where the last poll did not process all rx · b411ce50
      Jisheng Zhang authored
      [ Upstream commit 065fd83e ]
      
      For the case where the last mvneta_poll did not process all
      RX packets, we need to xor the pp->cause_rx_tx or port->cause_rx_tx
      before claculating the rx_queue.
      
      Fixes: 2dcf75e2 ("net: mvneta: Associate RX queues with each CPU")
      Signed-off-by: default avatarJisheng Zhang <Jisheng.Zhang@synaptics.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b411ce50
    • Florian Fainelli's avatar
      net: dsa: Fix duplicate frames flooded by learning · e90e9226
      Florian Fainelli authored
      [ Upstream commit 0e62f543 ]
      
      When both the switch and the bridge are learning about new addresses,
      switch ports attached to the bridge would see duplicate ARP frames
      because both entities would attempt to send them.
      
      Fixes: 5037d532 ("net: dsa: add Broadcom tag RX/TX handler")
      Reported-by: default avatarMaxime Bizon <mbizon@freebox.fr>
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: default avatarVivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e90e9226
    • Zh-yuan Ye's avatar
      net: cbs: Fix software cbs to consider packet sending time · c94fbe28
      Zh-yuan Ye authored
      [ Upstream commit 961d0e5b ]
      
      Currently the software CBS does not consider the packet sending time
      when depleting the credits. It caused the throughput to be
      Idleslope[kbps] * (Port transmit rate[kbps] / |Sendslope[kbps]|) where
      Idleslope * (Port transmit rate / (Idleslope + |Sendslope|)) = Idleslope
      is expected. In order to fix the issue above, this patch takes the time
      when the packet sending completes into account by moving the anchor time
      variable "last" ahead to the send completion time upon transmission and
      adding wait when the next dequeue request comes before the send
      completion time of the previous packet.
      
      changelog:
      V2->V3:
       - remove unnecessary whitespace cleanup
       - add the checks if port_rate is 0 before division
      
      V1->V2:
       - combine variable "send_completed" into "last"
       - add the comment for estimate of the packet sending
      
      Fixes: 585d763a ("net/sched: Introduce Credit Based Shaper (CBS) qdisc")
      Signed-off-by: default avatarZh-yuan Ye <ye.zh-yuan@socionext.com>
      Reviewed-by: default avatarVinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c94fbe28
    • Ido Schimmel's avatar
      mlxsw: spectrum_mr: Fix list iteration in error path · b371fdcd
      Ido Schimmel authored
      [ Upstream commit f6bf1baf ]
      
      list_for_each_entry_from_reverse() iterates backwards over the list from
      the current position, but in the error path we should start from the
      previous position.
      
      Fix this by using list_for_each_entry_continue_reverse() instead.
      
      This suppresses the following error from coccinelle:
      
      drivers/net/ethernet/mellanox/mlxsw//spectrum_mr.c:655:34-38: ERROR:
      invalid reference to the index variable of the iterator on line 636
      
      Fixes: c011ec1b ("mlxsw: spectrum: Add the multicast routing offloading logic")
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Reviewed-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b371fdcd
    • Willem de Bruijn's avatar
      macsec: restrict to ethernet devices · 1e62437e
      Willem de Bruijn authored
      [ Upstream commit b06d072c ]
      
      Only attach macsec to ethernet devices.
      
      Syzbot was able to trigger a KMSAN warning in macsec_handle_frame
      by attaching to a phonet device.
      
      Macvlan has a similar check in macvlan_port_create.
      
      v1->v2
        - fix commit message typo
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e62437e
    • Taehee Yoo's avatar
      hsr: fix general protection fault in hsr_addr_is_self() · b1ab6a51
      Taehee Yoo authored
      [ Upstream commit 3a303cfd ]
      
      The port->hsr is used in the hsr_handle_frame(), which is a
      callback of rx_handler.
      hsr master and slaves are initialized in hsr_add_port().
      This function initializes several pointers, which includes port->hsr after
      registering rx_handler.
      So, in the rx_handler routine, un-initialized pointer would be used.
      In order to fix this, pointers should be initialized before
      registering rx_handler.
      
      Test commands:
          ip netns del left
          ip netns del right
          modprobe -rv veth
          modprobe -rv hsr
          killall ping
          modprobe hsr
          ip netns add left
          ip netns add right
          ip link add veth0 type veth peer name veth1
          ip link add veth2 type veth peer name veth3
          ip link add veth4 type veth peer name veth5
          ip link set veth1 netns left
          ip link set veth3 netns right
          ip link set veth4 netns left
          ip link set veth5 netns right
          ip link set veth0 up
          ip link set veth2 up
          ip link set veth0 address fc:00:00:00:00:01
          ip link set veth2 address fc:00:00:00:00:02
          ip netns exec left ip link set veth1 up
          ip netns exec left ip link set veth4 up
          ip netns exec right ip link set veth3 up
          ip netns exec right ip link set veth5 up
          ip link add hsr0 type hsr slave1 veth0 slave2 veth2
          ip a a 192.168.100.1/24 dev hsr0
          ip link set hsr0 up
          ip netns exec left ip link add hsr1 type hsr slave1 veth1 slave2 veth4
          ip netns exec left ip a a 192.168.100.2/24 dev hsr1
          ip netns exec left ip link set hsr1 up
          ip netns exec left ip n a 192.168.100.1 dev hsr1 lladdr \
      	    fc:00:00:00:00:01 nud permanent
          ip netns exec left ip n r 192.168.100.1 dev hsr1 lladdr \
      	    fc:00:00:00:00:01 nud permanent
          for i in {1..100}
          do
              ip netns exec left ping 192.168.100.1 &
          done
          ip netns exec left hping3 192.168.100.1 -2 --flood &
          ip netns exec right ip link add hsr2 type hsr slave1 veth3 slave2 veth5
          ip netns exec right ip a a 192.168.100.3/24 dev hsr2
          ip netns exec right ip link set hsr2 up
          ip netns exec right ip n a 192.168.100.1 dev hsr2 lladdr \
      	    fc:00:00:00:00:02 nud permanent
          ip netns exec right ip n r 192.168.100.1 dev hsr2 lladdr \
      	    fc:00:00:00:00:02 nud permanent
          for i in {1..100}
          do
              ip netns exec right ping 192.168.100.1 &
          done
          ip netns exec right hping3 192.168.100.1 -2 --flood &
          while :
          do
              ip link add hsr0 type hsr slave1 veth0 slave2 veth2
      	ip a a 192.168.100.1/24 dev hsr0
      	ip link set hsr0 up
      	ip link del hsr0
          done
      
      Splat looks like:
      [  120.954938][    C0] general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1]I
      [  120.957761][    C0] KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
      [  120.959064][    C0] CPU: 0 PID: 1511 Comm: hping3 Not tainted 5.6.0-rc5+ #460
      [  120.960054][    C0] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [  120.962261][    C0] RIP: 0010:hsr_addr_is_self+0x65/0x2a0 [hsr]
      [  120.963149][    C0] Code: 44 24 18 70 73 2f c0 48 c1 eb 03 48 8d 04 13 c7 00 f1 f1 f1 f1 c7 40 04 00 f2 f2 f2 4
      [  120.966277][    C0] RSP: 0018:ffff8880d9c09af0 EFLAGS: 00010206
      [  120.967293][    C0] RAX: 0000000000000006 RBX: 1ffff1101b38135f RCX: 0000000000000000
      [  120.968516][    C0] RDX: dffffc0000000000 RSI: ffff8880d17cb208 RDI: 0000000000000000
      [  120.969718][    C0] RBP: 0000000000000030 R08: ffffed101b3c0e3c R09: 0000000000000001
      [  120.972203][    C0] R10: 0000000000000001 R11: ffffed101b3c0e3b R12: 0000000000000000
      [  120.973379][    C0] R13: ffff8880aaf80100 R14: ffff8880aaf800f2 R15: ffff8880aaf80040
      [  120.974410][    C0] FS:  00007f58e693f740(0000) GS:ffff8880d9c00000(0000) knlGS:0000000000000000
      [  120.979794][    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  120.980773][    C0] CR2: 00007ffcb8b38f29 CR3: 00000000afe8e001 CR4: 00000000000606f0
      [  120.981945][    C0] Call Trace:
      [  120.982411][    C0]  <IRQ>
      [  120.982848][    C0]  ? hsr_add_node+0x8c0/0x8c0 [hsr]
      [  120.983522][    C0]  ? rcu_read_lock_held+0x90/0xa0
      [  120.984159][    C0]  ? rcu_read_lock_sched_held+0xc0/0xc0
      [  120.984944][    C0]  hsr_handle_frame+0x1db/0x4e0 [hsr]
      [  120.985597][    C0]  ? hsr_nl_nodedown+0x2b0/0x2b0 [hsr]
      [  120.986289][    C0]  __netif_receive_skb_core+0x6bf/0x3170
      [  120.992513][    C0]  ? check_chain_key+0x236/0x5d0
      [  120.993223][    C0]  ? do_xdp_generic+0x1460/0x1460
      [  120.993875][    C0]  ? register_lock_class+0x14d0/0x14d0
      [  120.994609][    C0]  ? __netif_receive_skb_one_core+0x8d/0x160
      [  120.995377][    C0]  __netif_receive_skb_one_core+0x8d/0x160
      [  120.996204][    C0]  ? __netif_receive_skb_core+0x3170/0x3170
      [ ... ]
      
      Reported-by: syzbot+fcf5dd39282ceb27108d@syzkaller.appspotmail.com
      Fixes: c5a75911 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.")
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b1ab6a51
    • Florian Westphal's avatar
      geneve: move debug check after netdev unregister · 2c1a05e9
      Florian Westphal authored
      [ Upstream commit 0fda7600 ]
      
      The debug check must be done after unregister_netdevice_many() call --
      the list_del() for this is done inside .ndo_stop.
      
      Fixes: 2843a253 ("geneve: speedup geneve tunnels dismantle")
      Reported-and-tested-by: <syzbot+68a8ed58e3d17c700de5@syzkaller.appspotmail.com>
      Cc: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2c1a05e9
    • Lyude Paul's avatar
      Revert "drm/dp_mst: Skip validating ports during destruction, just ref" · 013b1465
      Lyude Paul authored
      commit 9765635b upstream.
      
      This reverts commit:
      
      c54c7374 ("drm/dp_mst: Skip validating ports during destruction, just ref")
      
      ugh.
      
      In drm_dp_destroy_connector_work(), we have a pretty good chance of
      freeing the actual struct drm_dp_mst_port. However, after destroying
      things we send a hotplug through (*mgr->cbs->hotplug)(mgr) which is
      where the problems start.
      
      For i915, this calls all the way down to the fbcon probing helpers,
      which start trying to access the port in a modeset.
      
      [   45.062001] ==================================================================
      [   45.062112] BUG: KASAN: use-after-free in ex_handler_refcount+0x146/0x180
      [   45.062196] Write of size 4 at addr ffff8882b4b70968 by task kworker/3:1/53
      
      [   45.062325] CPU: 3 PID: 53 Comm: kworker/3:1 Kdump: loaded Tainted: G           O      4.20.0-rc4Lyude-Test+ #3
      [   45.062442] Hardware name: LENOVO 20BWS1KY00/20BWS1KY00, BIOS JBET71WW (1.35 ) 09/14/2018
      [   45.062554] Workqueue: events drm_dp_destroy_connector_work [drm_kms_helper]
      [   45.062641] Call Trace:
      [   45.062685]  dump_stack+0xbd/0x15a
      [   45.062735]  ? dump_stack_print_info.cold.0+0x1b/0x1b
      [   45.062801]  ? printk+0x9f/0xc5
      [   45.062847]  ? kmsg_dump_rewind_nolock+0xe4/0xe4
      [   45.062909]  ? ex_handler_refcount+0x146/0x180
      [   45.062970]  print_address_description+0x71/0x239
      [   45.063036]  ? ex_handler_refcount+0x146/0x180
      [   45.063095]  kasan_report.cold.5+0x242/0x30b
      [   45.063155]  __asan_report_store4_noabort+0x1c/0x20
      [   45.063313]  ex_handler_refcount+0x146/0x180
      [   45.063371]  ? ex_handler_clear_fs+0xb0/0xb0
      [   45.063428]  fixup_exception+0x98/0xd7
      [   45.063484]  ? raw_notifier_call_chain+0x20/0x20
      [   45.063548]  do_trap+0x6d/0x210
      [   45.063605]  ? _GLOBAL__sub_I_65535_1_drm_dp_aux_unregister_devnode+0x2f/0x1c6 [drm_kms_helper]
      [   45.063732]  do_error_trap+0xc0/0x170
      [   45.063802]  ? _GLOBAL__sub_I_65535_1_drm_dp_aux_unregister_devnode+0x2f/0x1c6 [drm_kms_helper]
      [   45.063929]  do_invalid_op+0x3b/0x50
      [   45.063997]  ? _GLOBAL__sub_I_65535_1_drm_dp_aux_unregister_devnode+0x2f/0x1c6 [drm_kms_helper]
      [   45.064103]  invalid_op+0x14/0x20
      [   45.064162] RIP: 0010:_GLOBAL__sub_I_65535_1_drm_dp_aux_unregister_devnode+0x2f/0x1c6 [drm_kms_helper]
      [   45.064274] Code: 00 48 c7 c7 80 fe 53 a0 48 89 e5 e8 5b 6f 26 e1 5d c3 48 8d 0e 0f 0b 48 8d 0b 0f 0b 48 8d 0f 0f 0b 48 8d 0f 0f 0b 49 8d 4d 00 <0f> 0b 49 8d 0e 0f 0b 48 8d 08 0f 0b 49 8d 4d 00 0f 0b 48 8d 0b 0f
      [   45.064569] RSP: 0018:ffff8882b789ee10 EFLAGS: 00010282
      [   45.064637] RAX: ffff8882af47ae70 RBX: ffff8882af47aa60 RCX: ffff8882b4b70968
      [   45.064723] RDX: ffff8882af47ae70 RSI: 0000000000000008 RDI: ffff8882b788bdb8
      [   45.064808] RBP: ffff8882b789ee28 R08: ffffed1056f13db4 R09: ffffed1056f13db3
      [   45.064894] R10: ffffed1056f13db3 R11: ffff8882b789ed9f R12: ffff8882af47ad28
      [   45.064980] R13: ffff8882b4b70968 R14: ffff8882acd86728 R15: ffff8882b4b75dc8
      [   45.065084]  drm_dp_mst_reset_vcpi_slots+0x12/0x80 [drm_kms_helper]
      [   45.065225]  intel_mst_disable_dp+0xda/0x180 [i915]
      [   45.065361]  intel_encoders_disable.isra.107+0x197/0x310 [i915]
      [   45.065498]  haswell_crtc_disable+0xbe/0x400 [i915]
      [   45.065622]  ? i9xx_disable_plane+0x1c0/0x3e0 [i915]
      [   45.065750]  intel_atomic_commit_tail+0x74e/0x3e60 [i915]
      [   45.065884]  ? intel_pre_plane_update+0xbc0/0xbc0 [i915]
      [   45.065968]  ? drm_atomic_helper_swap_state+0x88b/0x1d90 [drm_kms_helper]
      [   45.066054]  ? kasan_check_write+0x14/0x20
      [   45.066165]  ? i915_gem_track_fb+0x13a/0x330 [i915]
      [   45.066277]  ? i915_sw_fence_complete+0xe9/0x140 [i915]
      [   45.066406]  ? __i915_sw_fence_complete+0xc50/0xc50 [i915]
      [   45.066540]  intel_atomic_commit+0x72e/0xef0 [i915]
      [   45.066635]  ? drm_dev_dbg+0x200/0x200 [drm]
      [   45.066764]  ? intel_atomic_commit_tail+0x3e60/0x3e60 [i915]
      [   45.066898]  ? intel_atomic_commit_tail+0x3e60/0x3e60 [i915]
      [   45.067001]  drm_atomic_commit+0xc4/0xf0 [drm]
      [   45.067074]  restore_fbdev_mode_atomic+0x562/0x780 [drm_kms_helper]
      [   45.067166]  ? drm_fb_helper_debug_leave+0x690/0x690 [drm_kms_helper]
      [   45.067249]  ? kasan_check_read+0x11/0x20
      [   45.067324]  restore_fbdev_mode+0x127/0x4b0 [drm_kms_helper]
      [   45.067364]  ? kasan_check_read+0x11/0x20
      [   45.067406]  drm_fb_helper_restore_fbdev_mode_unlocked+0x164/0x200 [drm_kms_helper]
      [   45.067462]  ? drm_fb_helper_hotplug_event+0x30/0x30 [drm_kms_helper]
      [   45.067508]  ? kasan_check_write+0x14/0x20
      [   45.070360]  ? mutex_unlock+0x22/0x40
      [   45.073748]  drm_fb_helper_set_par+0xb2/0xf0 [drm_kms_helper]
      [   45.075846]  drm_fb_helper_hotplug_event.part.33+0x1cd/0x290 [drm_kms_helper]
      [   45.078088]  drm_fb_helper_hotplug_event+0x1c/0x30 [drm_kms_helper]
      [   45.082614]  intel_fbdev_output_poll_changed+0x9f/0x140 [i915]
      [   45.087069]  drm_kms_helper_hotplug_event+0x67/0x90 [drm_kms_helper]
      [   45.089319]  intel_dp_mst_hotplug+0x37/0x50 [i915]
      [   45.091496]  drm_dp_destroy_connector_work+0x510/0x6f0 [drm_kms_helper]
      [   45.093675]  ? drm_dp_update_payload_part1+0x1220/0x1220 [drm_kms_helper]
      [   45.095851]  ? kasan_check_write+0x14/0x20
      [   45.098473]  ? kasan_check_read+0x11/0x20
      [   45.101155]  ? strscpy+0x17c/0x530
      [   45.103808]  ? __switch_to_asm+0x34/0x70
      [   45.106456]  ? syscall_return_via_sysret+0xf/0x7f
      [   45.109711]  ? read_word_at_a_time+0x20/0x20
      [   45.113138]  ? __switch_to_asm+0x40/0x70
      [   45.116529]  ? __switch_to_asm+0x34/0x70
      [   45.119891]  ? __switch_to_asm+0x40/0x70
      [   45.123224]  ? __switch_to_asm+0x34/0x70
      [   45.126540]  ? __switch_to_asm+0x34/0x70
      [   45.129824]  process_one_work+0x88d/0x15d0
      [   45.133172]  ? pool_mayday_timeout+0x850/0x850
      [   45.136459]  ? pci_mmcfg_check_reserved+0x110/0x128
      [   45.139739]  ? wake_q_add+0xb0/0xb0
      [   45.143010]  ? check_preempt_wakeup+0x652/0x1050
      [   45.146304]  ? worker_enter_idle+0x29e/0x740
      [   45.149589]  ? __schedule+0x1ec0/0x1ec0
      [   45.152937]  ? kasan_check_read+0x11/0x20
      [   45.156179]  ? _raw_spin_lock_irq+0xa3/0x130
      [   45.159382]  ? _raw_read_unlock_irqrestore+0x30/0x30
      [   45.162542]  ? kasan_check_write+0x14/0x20
      [   45.165657]  worker_thread+0x1a5/0x1470
      [   45.168725]  ? set_load_weight+0x2e0/0x2e0
      [   45.171755]  ? process_one_work+0x15d0/0x15d0
      [   45.174806]  ? __switch_to_asm+0x34/0x70
      [   45.177645]  ? __switch_to_asm+0x40/0x70
      [   45.180323]  ? __switch_to_asm+0x34/0x70
      [   45.182936]  ? __switch_to_asm+0x40/0x70
      [   45.185539]  ? __switch_to_asm+0x34/0x70
      [   45.188100]  ? __switch_to_asm+0x40/0x70
      [   45.190628]  ? __schedule+0x7d4/0x1ec0
      [   45.193143]  ? save_stack+0xa9/0xd0
      [   45.195632]  ? kasan_check_write+0x10/0x20
      [   45.198162]  ? kasan_kmalloc+0xc4/0xe0
      [   45.200609]  ? kmem_cache_alloc_trace+0xdd/0x190
      [   45.203046]  ? kthread+0x9f/0x3b0
      [   45.205470]  ? ret_from_fork+0x35/0x40
      [   45.207876]  ? unwind_next_frame+0x43/0x50
      [   45.210273]  ? __save_stack_trace+0x82/0x100
      [   45.212658]  ? deactivate_slab.isra.67+0x3d4/0x580
      [   45.215026]  ? default_wake_function+0x35/0x50
      [   45.217399]  ? kasan_check_read+0x11/0x20
      [   45.219825]  ? _raw_spin_lock_irqsave+0xae/0x140
      [   45.222174]  ? __lock_text_start+0x8/0x8
      [   45.224521]  ? replenish_dl_entity.cold.62+0x4f/0x4f
      [   45.226868]  ? __kthread_parkme+0x87/0xf0
      [   45.229200]  kthread+0x2f7/0x3b0
      [   45.231557]  ? process_one_work+0x15d0/0x15d0
      [   45.233923]  ? kthread_park+0x120/0x120
      [   45.236249]  ret_from_fork+0x35/0x40
      
      [   45.240875] Allocated by task 242:
      [   45.243136]  save_stack+0x43/0xd0
      [   45.245385]  kasan_kmalloc+0xc4/0xe0
      [   45.247597]  kmem_cache_alloc_trace+0xdd/0x190
      [   45.249793]  drm_dp_add_port+0x1e0/0x2170 [drm_kms_helper]
      [   45.252000]  drm_dp_send_link_address+0x4a7/0x740 [drm_kms_helper]
      [   45.254389]  drm_dp_check_and_send_link_address+0x1a7/0x210 [drm_kms_helper]
      [   45.256803]  drm_dp_mst_link_probe_work+0x6f/0xb0 [drm_kms_helper]
      [   45.259200]  process_one_work+0x88d/0x15d0
      [   45.261597]  worker_thread+0x1a5/0x1470
      [   45.264038]  kthread+0x2f7/0x3b0
      [   45.266371]  ret_from_fork+0x35/0x40
      
      [   45.270937] Freed by task 53:
      [   45.273170]  save_stack+0x43/0xd0
      [   45.275382]  __kasan_slab_free+0x139/0x190
      [   45.277604]  kasan_slab_free+0xe/0x10
      [   45.279826]  kfree+0x99/0x1b0
      [   45.282044]  drm_dp_free_mst_port+0x4a/0x60 [drm_kms_helper]
      [   45.284330]  drm_dp_destroy_connector_work+0x43e/0x6f0 [drm_kms_helper]
      [   45.286660]  process_one_work+0x88d/0x15d0
      [   45.288934]  worker_thread+0x1a5/0x1470
      [   45.291231]  kthread+0x2f7/0x3b0
      [   45.293547]  ret_from_fork+0x35/0x40
      
      [   45.298206] The buggy address belongs to the object at ffff8882b4b70968
                      which belongs to the cache kmalloc-2k of size 2048
      [   45.303047] The buggy address is located 0 bytes inside of
                      2048-byte region [ffff8882b4b70968, ffff8882b4b71168)
      [   45.308010] The buggy address belongs to the page:
      [   45.310477] page:ffffea000ad2dc00 count:1 mapcount:0 mapping:ffff8882c080cf40 index:0x0 compound_mapcount: 0
      [   45.313051] flags: 0x8000000000010200(slab|head)
      [   45.315635] raw: 8000000000010200 ffffea000aac2808 ffffea000abe8608 ffff8882c080cf40
      [   45.318300] raw: 0000000000000000 00000000000d000d 00000001ffffffff 0000000000000000
      [   45.320966] page dumped because: kasan: bad access detected
      
      [   45.326312] Memory state around the buggy address:
      [   45.329085]  ffff8882b4b70800: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   45.331845]  ffff8882b4b70880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   45.334584] >ffff8882b4b70900: fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb
      [   45.337302]                                                           ^
      [   45.340061]  ffff8882b4b70980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   45.342910]  ffff8882b4b70a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   45.345748] ==================================================================
      
      So, this definitely isn't a fix that we want. This being said; there's
      no real easy fix for this problem because of some of the catch-22's of
      the MST helpers current design. For starters; we always need to validate
      a port with drm_dp_get_validated_port_ref(), but validation relies on
      the lifetime of the port in the actual topology. So once the port is
      gone, it can't be validated again.
      
      If we were to try to make the payload helpers not use port validation,
      then we'd cause another problem: if the port isn't validated, it could
      be freed and we'd just start causing more KASAN issues. There are
      already hacks that attempt to workaround this in
      drm_dp_mst_destroy_connector_work() by re-initializing the kref so that
      it can be used again and it's memory can be freed once the VCPI helpers
      finish removing the port's respective payloads. But none of these really
      do anything helpful since the port still can't be validated since it's
      gone from the topology. Also, that workaround is immensely confusing to
      read through.
      
      What really needs to be done in order to fix this is to teach DRM how to
      track the lifetime of the structs for MST ports and branch devices
      separately from their lifetime in the actual topology. Simply put; this
      means having two different krefs-one that removes the port/branch device
      from the topology, and one that finally calls kfree(). This would let us
      simplify things, since we'd now be able to keep ports around without
      having to keep them in the topology at the same time, which is exactly
      what we need in order to teach our VCPI helpers to only validate ports
      when it's actually necessary without running the risk of trying to use
      unallocated memory.
      
      Such a fix is on it's way, but for now let's play it safe and just
      revert this. If this bug has been around for well over a year, we can
      wait a little while to get an actual proper fix here.
      Signed-off-by: default avatarLyude Paul <lyude@redhat.com>
      Fixes: c54c7374 ("drm/dp_mst: Skip validating ports during destruction, just ref")
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Sean Paul <sean@poorly.run>
      Cc: Jerry Zuo <Jerry.Zuo@amd.com>
      Cc: Harry Wentland <Harry.Wentland@amd.com>
      Cc: stable@vger.kernel.org # v4.6+
      Acked-by: default avatarSean Paul <sean@poorly.run>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181128210005.24434-1-lyude@redhat.com
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      013b1465
    • Ulf Hansson's avatar
      mmc: sdhci-tegra: Fix busy detection by enabling MMC_CAP_NEED_RSP_BUSY · ff6471fe
      Ulf Hansson authored
      [ Upstream commit d2f8bfa4 ]
      
      It has turned out that the sdhci-tegra controller requires the R1B response,
      for commands that has this response associated with them. So, converting
      from an R1B to an R1 response for a CMD6 for example, leads to problems
      with the HW busy detection support.
      
      Fix this by informing the mmc core about the requirement, via setting the
      host cap, MMC_CAP_NEED_RSP_BUSY.
      Reported-by: default avatarBitan Biswas <bbiswas@nvidia.com>
      Reported-by: default avatarPeter Geis <pgwipeout@gmail.com>
      Suggested-by: default avatarSowjanya Komatineni <skomatineni@nvidia.com>
      Cc: <stable@vger.kernel.org>
      Tested-by: default avatarSowjanya Komatineni <skomatineni@nvidia.com>
      Tested-By: default avatarPeter Geis <pgwipeout@gmail.com>
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ff6471fe
    • Ulf Hansson's avatar
      mmc: sdhci-omap: Fix busy detection by enabling MMC_CAP_NEED_RSP_BUSY · 23161bed
      Ulf Hansson authored
      [ Upstream commit 055e0483 ]
      
      It has turned out that the sdhci-omap controller requires the R1B response,
      for commands that has this response associated with them. So, converting
      from an R1B to an R1 response for a CMD6 for example, leads to problems
      with the HW busy detection support.
      
      Fix this by informing the mmc core about the requirement, via setting the
      host cap, MMC_CAP_NEED_RSP_BUSY.
      Reported-by: default avatarNaresh Kamboju <naresh.kamboju@linaro.org>
      Reported-by: default avatarAnders Roxell <anders.roxell@linaro.org>
      Reported-by: default avatarFaiz Abbas <faiz_abbas@ti.com>
      Cc: <stable@vger.kernel.org>
      Tested-by: default avatarAnders Roxell <anders.roxell@linaro.org>
      Tested-by: default avatarFaiz Abbas <faiz_abbas@ti.com>
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      23161bed
    • Ulf Hansson's avatar
      mmc: core: Respect MMC_CAP_NEED_RSP_BUSY for eMMC sleep command · d091259b
      Ulf Hansson authored
      [ Upstream commit 18d20046 ]
      
      The busy timeout for the CMD5 to put the eMMC into sleep state, is specific
      to the card. Potentially the timeout may exceed the host->max_busy_timeout.
      If that becomes the case, mmc_sleep() converts from using an R1B response
      to an R1 response, as to prevent the host from doing HW busy detection.
      
      However, it has turned out that some hosts requires an R1B response no
      matter what, so let's respect that via checking MMC_CAP_NEED_RSP_BUSY. Note
      that, if the R1B gets enforced, the host becomes fully responsible of
      managing the needed busy timeout, in one way or the other.
      Suggested-by: default avatarSowjanya Komatineni <skomatineni@nvidia.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200311092036.16084-1-ulf.hansson@linaro.orgSigned-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d091259b