1. 13 Mar, 2020 11 commits
    • Merge tag 'fuse-fixes-5.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · 7e6d869f
      Linus Torvalds authored
      Pull fuse fix from Miklos Szeredi:
       "Fix an Oops introduced in v5.4"
      
      * tag 'fuse-fixes-5.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
        fuse: fix stack use after return
    • Merge tag 'ovl-fixes-5.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs · 2af82177
      Linus Torvalds authored
      Pull overlayfs fixes from Miklos Szeredi:
       "Fix three bugs introduced in this cycle"
      
      * tag 'ovl-fixes-5.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
        ovl: fix lockdep warning for async write
        ovl: fix some xino configurations
        ovl: fix lock in ovl_llseek()
    • Merge tag 'pm-5.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 78511edc
      Linus Torvalds authored
      Pull power management fix from Rafael Wysocki:
       "Fix cpupower utility build failures with -fno-common enabled (Mike
        Gilbert)"
      
      * tag 'pm-5.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpupower: avoid multiple definition with gcc -fno-common
    • Merge tag 'io_uring-5.6-2020-03-13' of git://git.kernel.dk/linux-block · 5007928e
      Linus Torvalds authored
      Pull io_uring fix from Jens Axboe:
       "Just a single fix here, improving the RCU callback ordering from last
        week. After a bit more perusing by Paul, he poked a hole in the
        original"
      
      * tag 'io_uring-5.6-2020-03-13' of git://git.kernel.dk/linux-block:
        io_uring: ensure RCU callback ordering with rcu_barrier()
    • Merge tag 'block-5.6-2020-03-13' of git://git.kernel.dk/linux-block · 17829c5a
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A few fixes that should go into this release. This contains:
      
         - Fix for a corruption issue with the s390 dasd driver (Stefan)
      
         - Fixup/improvement for the flush insertion change that we had in
           this series (Ming)
      
         - Fix for the partition support for host aware zoned devices
           (Shin'ichiro)
      
         - Fix incorrect blk-iocost comparison (Tejun)
      
        The diffstat looks large, but that's a) mostly dasd, and b) the flush
        fix from Ming adds a big comment"
      
      * tag 'block-5.6-2020-03-13' of git://git.kernel.dk/linux-block:
        block: Fix partition support for host aware zoned block devices
        blk-mq: insert flush request to the front of dispatch queue
        s390/dasd: fix data corruption for thin provisioned devices
        blk-iocost: fix incorrect vtime comparison in iocg_is_idle()
    • Merge tag 'mmc-v5.6-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · d3656129
      Linus Torvalds authored
      Pull MMC fixes from Ulf Hansson:
       "MMC core:
      
         - Fix HW busy detection support for host controllers requiring the
           MMC_RSP_BUSY response flag (R1B) to be set for the command. In
           particular for CMD6 (eMMC), erase/trim/discard (SD/eMMC) and CMD5
           (eMMC sleep).
      
        MMC host:
      
         - sdhci-omap|tegra: Fix support for HW busy detection"
      
      * tag 'mmc-v5.6-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: core: Respect MMC_CAP_NEED_RSP_BUSY for eMMC sleep command
        mmc: sdhci-tegra: Fix busy detection by enabling MMC_CAP_NEED_RSP_BUSY
        mmc: sdhci-omap: Fix busy detection by enabling MMC_CAP_NEED_RSP_BUSY
        mmc: core: Respect MMC_CAP_NEED_RSP_BUSY for erase/trim/discard
        mmc: core: Allow host controllers to require R1B for CMD6
    • afs: Use kfree_rcu() instead of casting kfree() to rcu_callback_t · ddd2b85f
      Jann Horn authored
      afs_put_addrlist() casts kfree() to rcu_callback_t. Apart from being wrong
      in theory, this might also blow up when people start enforcing function
      types via compiler instrumentation, and it means the rcu_head has to be
      first in struct afs_addr_list.
      
      Use kfree_rcu() instead, it's simpler and more correct.
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
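
      The layout constraint mentioned in this commit message can be illustrated with a small userspace sketch. It is not the afs code or the kernel RCU API: `struct cb_head`, `struct addr_list`, and `run_deferred_free()` are hypothetical stand-ins. The point it shows is that a correctly typed callback recovers the enclosing object from the member offset, so the callback head does not have to be the first field, whereas a callback that is just a cast of a free function only works if the head sits at offset 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct rcu_head. */
struct cb_head {
    void (*func)(struct cb_head *head);
};

/* Hypothetical object; the callback head is deliberately NOT first. */
struct addr_list {
    int nr_addrs;
    struct cb_head rcu;
};

static int freed_count;

/* Correctly typed callback: step back from the member to the object. */
static void addr_list_free_cb(struct cb_head *head)
{
    struct addr_list *alist =
        (struct addr_list *)((char *)head - offsetof(struct addr_list, rcu));

    free(alist);
    freed_count++;
}

/* Stand-in for deferred freeing; a real RCU would wait for a grace period. */
static void run_deferred_free(struct addr_list *alist)
{
    alist->rcu.func = addr_list_free_cb;
    alist->rcu.func(&alist->rcu);
}

/* Allocates one object, frees it via the callback, returns the free count. */
static int demo_free_with_offset(void)
{
    struct addr_list *alist = malloc(sizeof(*alist));

    assert(alist != NULL);
    alist->nr_addrs = 3;
    run_deferred_free(alist);
    return freed_count;
}
```

      Passing `&alist->rcu` straight to `free()` here would hand the allocator a pointer into the middle of the object, which is why the cast-based approach pins the head to offset 0.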
    • ovl: fix lockdep warning for async write · c8536804
      Miklos Szeredi authored
      Lockdep reports "WARNING: lock held when returning to user space!" due to
      async write holding freeze lock over the write.  Apparently aio.c already
      deals with this by lying to lockdep about the state of the lock.
      
      Do the same here.  No need to check for S_IFREG() here since these file ops
      are regular-only.
      
      Reported-by: syzbot+9331a354f4f624a52a55@syzkaller.appspotmail.com
      Fixes: 2406a307 ("ovl: implement async IO routines")
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: fix some xino configurations · 53afcd31
      Amir Goldstein authored
      Fix up two bugs in the conversion to xino_mode:
      1. xino=off does not always end up in disabled mode
      2. xino=auto on 32bit arch should end up in disabled mode
      
      Take a proactive approach to disabling xino on 32bit kernel:
      1. Disable XINO_AUTO config during build time
      2. Disable xino with a warning on mount time
      
      As a by product, xino=on on 32bit arch also ends up in disabled mode.
      We never intended to enable xino on 32bit arch and this will make the
      rest of the logic simpler.
      
      Fixes: 0f831ec8 ("ovl: simplify ovl_same_sb() helper")
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • Merge tag 'drm-fixes-2020-03-13' of git://anongit.freedesktop.org/drm/drm · 0d81a3f2
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "It's a bit quieter, probably not as much as it could be.
      
        There is one large regression fix in here from Lyude for displayport
        bandwidth calculations, there've been reports of multi-monitor in
        docks not working since -rc1 and this has been tested to fix those.
      
        Otherwise it's a bunch of i915 (with some GVT fixes), a set of amdgpu
        watermark + bios fixes, and an exynos iommu cleanup fix.
      
        core:
         - DP MST bandwidth regression fix.
      
        i915:
         - hard lockup fix
         - GVT fixes
         - 32-bit alignment issue fix
         - timeline wait fixes
         - cacheline_retire and free
      
        amdgpu:
         - Update the display watermark bounding box for navi14
         - Fix fetching vbios directly from rom on vega20/arcturus
         - Navi and renoir watermark fixes
      
        exynos:
         - iommu object cleanup fix"
      
      * tag 'drm-fixes-2020-03-13' of git://anongit.freedesktop.org/drm/drm:
        drm/dp_mst: Rewrite and fix bandwidth limit checks
        drm/dp_mst: Reprobe path resources in CSN handler
        drm/dp_mst: Use full_pbn instead of available_pbn for bandwidth checks
        drm/dp_mst: Rename drm_dp_mst_is_dp_mst_end_device() to be less redundant
        drm/i915: Defer semaphore priority bumping to a workqueue
        drm/i915/gt: Close race between cacheline_retire and free
        drm/i915/execlists: Enable timeslice on partial virtual engine dequeue
        drm/i915: be more solid in checking the alignment
        drm/i915/gvt: Fix dma-buf display blur issue on CFL
        drm/i915: Return early for await_start on same timeline
        drm/i915: Actually emit the await_start
        drm/amdgpu/powerplay: nv1x, renior copy dcn clock settings of watermark to smu during boot up
        drm/exynos: Fix cleanup of IOMMU related objects
        drm/amdgpu: correct ROM_INDEX/DATA offset for VEGA20
        drm/amd/display: update soc bb for nv14
        drm/i915/gvt: Fix emulated vbt size issue
        drm/i915/gvt: Fix unnecessary schedule timer when no vGPU exits
    • Merge tag 'topic/mst-bw-check-fixes-for-airlied-2020-03-12-2' of... · 16b78f05
      Dave Airlie authored
      Merge tag 'topic/mst-bw-check-fixes-for-airlied-2020-03-12-2' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
      
      UAPI Changes: None
      
      Cross-subsystem Changes: None
      
      Core Changes: Fixed regressions introduced by commit cd82d82c
      ("drm/dp_mst: Add branch bandwidth validation to MST atomic check"),
      which would cause us to:
      
      * Calculate the available bandwidth on an MST topology incorrectly, and
        as a result reject most display configurations that would try to enable
        more than one sink on a topology
      * Occasionally expose MST connectors to userspace before finishing
        probing their PBN capabilities, resulting in us rejecting display
        configurations because we assumed briefly that no bandwidth was
        available
      
      Driver Changes: None
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Lyude Paul <lyude@redhat.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/bf16ee577567beed91c86b7d9cda3ec2e8c50a71.camel@redhat.com
  2. 12 Mar, 2020 29 commits
    • Merge tag 'drm-intel-fixes-2020-03-12' of... · f31d83f0
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2020-03-12' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
      
      drm/i915 fixes for v5.6-rc6:
      - hard lockup fix
      - GVT fixes
      - 32-bit alignment issue fix
      - timeline wait fixes
      - cacheline_retire and free
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/87lfo6ksvw.fsf@intel.com
    • Merge tag 'amd-drm-fixes-5.6-2020-03-11' of... · d9443265
      Dave Airlie authored
      Merge tag 'amd-drm-fixes-5.6-2020-03-11' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
      
      amd-drm-fixes-5.6-2020-03-11:
      
      amdgpu:
      - Update the display watermark bounding box for navi14
      - Fix fetching vbios directly from rom on vega20/arcturus
      - Navi and renoir watermark fixes
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Alex Deucher <alexdeucher@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200312020924.4161-1-alexander.deucher@amd.com
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 1b51f694
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "It looks like a decent sized set of fixes, but a lot of these are one
        liner off-by-one and similar type changes:
      
         1) Fix netlink header pointer to calculate bad attribute offset
            reported to user. From Pablo Neira Ayuso.
      
         2) Don't double clear PHY interrupts when ->did_interrupt is set,
            from Heiner Kallweit.
      
         3) Add missing validation of various (devlink, nl802154, fib, etc.)
            attributes, from Jakub Kicinski.
      
         4) Missing *pos increments in various netfilter seq_next ops, from
            Vasily Averin.
      
         5) Missing break in of_mdiobus_register() loop, from Dajun Jin.
      
         6) Don't double bump tx_dropped in veth driver, from Jiang Lidong.
      
         7) Work around FMAN erratum A050385, from Madalin Bucur.
      
         8) Make sure ARP header is pulled early enough in bonding driver,
            from Eric Dumazet.
      
         9) Do a cond_resched() during multicast processing of ipvlan and
            macvlan, from Mahesh Bandewar.
      
        10) Don't attach cgroups to unrelated sockets when in interrupt
            context, from Shakeel Butt.
      
        11) Fix tpacket ring state management when encountering unknown GSO
            types. From Willem de Bruijn.
      
        12) Fix MDIO bus PHY resume by checking mdio_bus_phy_may_suspend()
            only in the suspend context. From Heiner Kallweit"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (112 commits)
        net: systemport: fix index check to avoid an array out of bounds access
        tc-testing: add ETS scheduler to tdc build configuration
        net: phy: fix MDIO bus PM PHY resuming
        net: hns3: clear port base VLAN when unload PF
        net: hns3: fix RMW issue for VLAN filter switch
        net: hns3: fix VF VLAN table entries inconsistent issue
        net: hns3: fix "tc qdisc del" failed issue
        taprio: Fix sending packets without dequeueing them
        net: mvmdio: avoid error message for optional IRQ
        net: dsa: mv88e6xxx: Add missing mask of ATU occupancy register
        net: memcg: fix lockdep splat in inet_csk_accept()
        s390/qeth: implement smarter resizing of the RX buffer pool
        s390/qeth: refactor buffer pool code
        s390/qeth: use page pointers to manage RX buffer pool
        seg6: fix SRv6 L2 tunnels to use IANA-assigned protocol number
        net: dsa: Don't instantiate phylink for CPU/DSA ports unless needed
        net/packet: tpacket_rcv: do not increment ring index on drop
        sxgbe: Fix off by one in samsung driver strncpy size arg
        net: caif: Add lockdep expression to RCU traversal primitive
        MAINTAINERS: remove Sathya Perla as Emulex NIC maintainer
        ...
    • drm/dp_mst: Rewrite and fix bandwidth limit checks · 047d4cd2
      Lyude Paul authored
      Sigh, this is mostly my fault for not giving commit cd82d82c
      ("drm/dp_mst: Add branch bandwidth validation to MST atomic check")
      enough scrutiny during review. The way we're checking bandwidth
      limitations here is mostly wrong:
      
      For starters, drm_dp_mst_atomic_check_bw_limit() determines the
      pbn_limit of a branch by simply scanning each port on the current branch
      device, then uses the last non-zero full_pbn value that it finds. It
      then counts the sum of the PBN used on each branch device for that
      level, and compares against the full_pbn value it found before.
      
      This is wrong because ports can and will have different PBN limitations
      on many hubs, especially since a number of DisplayPort hubs out there
      will be clever and only use the smallest link rate required for each
      downstream sink - potentially giving every port a different full_pbn
      value depending on what link rate it's trained at. This means with our
      current code, which max PBN value we end up with is not well defined.
      
      Additionally, we also need to remember when checking bandwidth
      limitations that the top-most device in any MST topology is a branch
      device, not a port. This means that the first level of a topology
      doesn't technically have a full_pbn value that needs to be checked.
      Instead, we should assume that so long as our VCPI allocations fit we're
      within the bandwidth limitations of the primary MSTB.
      
      We do however, want to check full_pbn on every port including those of
      the primary MSTB. However, it's important to keep in mind that this
      value represents the minimum link rate /between a port's sink or mstb,
      and the mstb itself/. A quick diagram to explain:
      
                                      MSTB #1
                                     /       \
                                    /         \
                                 Port #1    Port #2
             full_pbn for Port #1 → |          | ← full_pbn for Port #2
                                 Sink #1    MSTB #2
                                               |
                                             etc...
      
      Note that in the above diagram, the combined PBN from all VCPI
      allocations on said hub should not exceed the full_pbn value of port #2,
      and the display configuration on sink #1 should not exceed the full_pbn
      value of port #1. However, port #1 and port #2 can otherwise consume as
      much bandwidth as they want so long as their VCPI allocations still fit.
      
      And finally - our current bandwidth checking code also makes the mistake
      of not checking whether something is an end device or not before trying
      to traverse down it.
      
      So, let's fix it by rewriting our bandwidth checking helpers. We split
      the function into one part for handling branches which simply adds up
      the total PBN on each branch and returns it, and one for checking each
      port to ensure we're not going over its PBN limit. Phew.
      
      This should fix regressions seen, where we erroneously reject display
      configurations due to thinking they're going over our bandwidth limits
      when they're not.
      
      Changes since v1:
      * Took an even closer look at how PBN limitations are supposed to be
        handled, and did some experimenting with Sean Paul. Ended up rewriting
        these helpers again, but this time they should actually be correct!
      Changes since v2:
      * Small indenting fix
      * Fix pbn_used check in drm_dp_mst_atomic_check_port_bw_limit()
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Fixes: cd82d82c ("drm/dp_mst: Add branch bandwidth validation to MST atomic check")
      Cc: Sean Paul <seanpaul@google.com>
      Acked-by: Alex Deucher <alexander.deucher@amd.com>
      Reviewed-by: Mikita Lipski <mikita.lipski@amd.com>
      Tested-by: Hans de Goede <hdegoede@redhat.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200309210131.1497545-1-lyude@redhat.com
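
      The two-helper split described in this commit message can be sketched in userspace C. This is a simplified model under stated assumptions, not the DRM implementation: `struct port`, `struct branch`, `check_port()`, and `check_branch()` are hypothetical names, and VCPI allocations are reduced to a single integer per end device. The branch helper only sums the PBN used below it, and the port helper compares that usage against the port's own `full_pbn`.

```c
#include <errno.h>
#include <stddef.h>

struct branch;

/* Hypothetical model of an MST port. */
struct port {
    int full_pbn;         /* limit of the link hanging off this port */
    int vcpi_pbn;         /* allocation when an end device is attached */
    struct branch *mstb;  /* non-NULL when a branch device is attached */
};

/* Hypothetical model of an MST branch device. */
struct branch {
    struct port *ports;
    size_t num_ports;
};

static int check_branch(const struct branch *b);

/* Returns the PBN used through this port, or -ENOSPC if over its limit. */
static int check_port(const struct port *p)
{
    /* Only traverse downwards when the port is not an end device. */
    int pbn_used = p->mstb ? check_branch(p->mstb) : p->vcpi_pbn;

    if (pbn_used < 0)
        return pbn_used;
    if (pbn_used > p->full_pbn)
        return -ENOSPC;
    return pbn_used;
}

/* Adds up the PBN used by every port; a branch has no limit of its own. */
static int check_branch(const struct branch *b)
{
    int total = 0;

    for (size_t i = 0; i < b->num_ports; i++) {
        int used = check_port(&b->ports[i]);

        if (used < 0)
            return used;
        total += used;
    }
    return total;
}
```

      In the diagram's terms, everything below MSTB #2 is charged against the `full_pbn` of port #2, while the top-level branch itself is never compared against a limit.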
    • drm/dp_mst: Reprobe path resources in CSN handler · 87212b51
      Lyude Paul authored
      We used to punt off reprobing path resources to the link address probe
      work, but now that we handle CSNs asynchronously from the driver's HPD
      handling we can do whatever the heck we want from the CSN!
      
      So, reprobe the path resources from drm_dp_mst_handle_conn_stat(). Also,
      get rid of the path resource reprobing code in
      drm_dp_check_and_send_link_address() since it's needlessly complicated
      when we already reprobe path resources from
      drm_dp_handle_link_address_port(). And finally, teach
      drm_dp_send_enum_path_resources() to return 1 on PBN changes so we know
      if we need to send another hotplug or not.
      
      This fixes issues where we've indicated to userspace that a port has
      just been connected, before we actually probed its available PBN -
      something that results in unexpected atomic check failures.
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Fixes: cd82d82c ("drm/dp_mst: Add branch bandwidth validation to MST atomic check")
      Cc: Mikita Lipski <mikita.lipski@amd.com>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Sean Paul <sean@poorly.run>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200306234623.547525-4-lyude@redhat.com
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Tested-by: Hans de Goede <hdegoede@redhat.com>
    • drm/dp_mst: Use full_pbn instead of available_pbn for bandwidth checks · fcf46380
      Lyude Paul authored
      DisplayPort specifications are fun. For a while, it's been really
      unclear to us what available_pbn actually does. There's a somewhat vague
      explanation in the DisplayPort spec (starting from 1.2) that partially
      explains it:
      
        The minimum payload bandwidth number supported by the path. Each node
        updates this number with its available payload bandwidth number if its
        payload bandwidth number is less than that in the Message Transaction
        reply.
      
      So, it sounds like available_pbn represents the smallest link rate in
      use between the source and the branch device. Cool, so full_pbn is just
      the highest possible PBN that the branch device supports right?
      
      Well, we assumed that for quite a while until Sean Paul noticed that on
      some MST hubs, available_pbn will actually get set to 0 whenever there's
      any active payloads on the respective branch device. This caused quite a
      bit of confusion since clearing the payload ID table would end up fixing
      the available_pbn value.
      
      So, we just went with that until commit cd82d82c ("drm/dp_mst: Add
      branch bandwidth validation to MST atomic check") started breaking
      people's setups due to us getting erroneous available_pbn values. So, we
      did some more digging and got confused until we finally looked at the
      definition for full_pbn:
      
        The bandwidth of the link at the trained link rate and lane count
        between the DP Source device and the DP Sink device with no time slots
        allocated to VC Payloads, represented as a Payload Bandwidth Number. As
        with the Available_Payload_Bandwidth_Number, this number is determined
        by the link with the lowest lane count and link rate.
      
      That's what we get for not reading specs closely enough, hehe. So, since
      full_pbn is definitely what we want for doing bandwidth restriction
      checks - let's start using that instead and ignore available_pbn
      entirely.
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Fixes: cd82d82c ("drm/dp_mst: Add branch bandwidth validation to MST atomic check")
      Cc: Mikita Lipski <mikita.lipski@amd.com>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Sean Paul <sean@poorly.run>
      Reviewed-by: Mikita Lipski <mikita.lipski@amd.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200306234623.547525-3-lyude@redhat.com
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Tested-by: Hans de Goede <hdegoede@redhat.com>
    • drm/dp_mst: Rename drm_dp_mst_is_dp_mst_end_device() to be less redundant · b2feb1d6
      Lyude Paul authored
      It's already prefixed by dp_mst, so we don't really need to repeat
      ourselves here. One of the changes I should have picked up originally
      when reviewing MST DSC support.
      
      There should be no functional changes here
      
      Cc: Mikita Lipski <mikita.lipski@amd.com>
      Cc: Sean Paul <seanpaul@google.com>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Tested-by: Hans de Goede <hdegoede@redhat.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200306234623.547525-2-lyude@redhat.com
    • Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 807f030b
      Linus Torvalds authored
      Pull vfs fixes from Al Viro:
       "A couple of fixes for old crap in ->atomic_open() instances"
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        cifs_atomic_open(): fix double-put on late allocation failure
        gfs2_atomic_open(): fix O_EXCL|O_CREAT handling on cold dcache
    • net: systemport: fix index check to avoid an array out of bounds access · c0368595
      Colin Ian King authored
      Currently the bounds check on index is off by one and can lead to
      an out of bounds access on array priv->filters_loc when index is
      RXCHK_BRCM_TAG_MAX.
      
      Fixes: bb9051a2 ("net: systemport: Add support for WAKE_FILTER")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
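
      The off-by-one described above can be shown with a minimal sketch. It assumes the faulty check had the common `index > MAX` shape; `FILTER_MAX`, `index_ok_buggy()`, and `index_ok_fixed()` are hypothetical names used only for illustration, not the driver's identifiers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the array size (RXCHK_BRCM_TAG_MAX in the driver). */
#define FILTER_MAX 8

/* Assumed shape of the faulty check: lets index == FILTER_MAX through. */
static bool index_ok_buggy(size_t index)
{
    return !(index > FILTER_MAX);
}

/* Corrected check: valid indices are 0 .. FILTER_MAX - 1. */
static bool index_ok_fixed(size_t index)
{
    return !(index >= FILTER_MAX);
}
```

      With an array of `FILTER_MAX` elements, only the second helper rejects the index one past the last element.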
    • tc-testing: add ETS scheduler to tdc build configuration · 9d0e0cd9
      Davide Caratti authored
      add CONFIG_NET_SCH_ETS to 'config', otherwise test suites using this file
      to perform a full tdc run will encounter the following warning:
      
        ok 645 e90e - Add ETS qdisc using bands # skipped - "-----> teardown stage" did not complete successfully
      
      Fixes: 82c664b6 ("selftests: qdiscs: Add test coverage for ETS Qdisc")
      Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: fix MDIO bus PM PHY resuming · 611d779a
      Heiner Kallweit authored
      So far we have the unfortunate situation that mdio_bus_phy_may_suspend()
      is called in suspend AND resume path, assuming that function result is
      the same. After the original change this is no longer the case,
      resulting in broken resume as reported by Geert.
      
      To fix this call mdio_bus_phy_may_suspend() in the suspend path only,
      and let the phy_device store the info whether it was suspended by
      MDIO bus PM.
      
      Fixes: 503ba7c6 ("net: phy: Avoid multiple suspends")
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
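
      The approach described in this commit message, evaluating the "may suspend" condition once and remembering the outcome on the device, can be sketched as follows. This is a userspace model, not the phylib code: `struct phy_dev`, its fields, `bus_suspend()`, and `bus_resume()` are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical model of a PHY device under bus power management. */
struct phy_dev {
    bool may_suspend;       /* result of the policy check; can change over time */
    bool suspended;         /* current power state */
    bool suspended_by_bus;  /* remembered decision made at suspend time */
};

/* The policy check is evaluated in the suspend path only. */
static void bus_suspend(struct phy_dev *dev)
{
    if (!dev->may_suspend)
        return;
    dev->suspended = true;
    dev->suspended_by_bus = true;
}

/* Resume relies on the stored flag instead of re-evaluating the policy. */
static void bus_resume(struct phy_dev *dev)
{
    if (!dev->suspended_by_bus)
        return;
    dev->suspended = false;
    dev->suspended_by_bus = false;
}
```

      Because the resume path never consults `may_suspend`, a device that was suspended is resumed even if the policy result has changed in the meantime.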
    • cifs_atomic_open(): fix double-put on late allocation failure · d9a9f484
      Al Viro authored
      several iterations of ->atomic_open() calling conventions ago, we
      used to need fput() if ->atomic_open() failed at some point after
      successful finish_open().  Now (since 2016) it's not needed -
      struct file carries enough state to make fput() work regardless
      of the point in struct file lifecycle and discarding it on
      failure exits in open() got unified.  Unfortunately, I'd missed
      the fact that we had an instance of ->atomic_open() (cifs one)
      that used to need that fput(), as well as the stale comment in
      finish_open() demanding such late failure handling.  Trivially
      fixed...
      
      Fixes: fe9ec829 "do_last(): take fput() on error after opening to out:"
      Cc: stable@kernel.org # v4.7+
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • gfs2_atomic_open(): fix O_EXCL|O_CREAT handling on cold dcache · 21039132
      Al Viro authored
      with the way fs/namei.c:do_last() had been done, ->atomic_open()
      instances needed to recognize the case when existing file got
      found with O_EXCL|O_CREAT, either by falling back to finish_no_open()
      or failing themselves.  gfs2 one didn't.
      
      Fixes: 6d4ade98 (GFS2: Add atomic_open support)
      Cc: stable@kernel.org # v3.11
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Merge branch 'hns3-fixes' · e4792ffe
      David S. Miller authored
      Huazhong Tan says:
      
      ====================
      net: hns3: fixes for -net
      
      This series includes several bugfixes for the HNS3 ethernet driver.
      
      [patch 1] fixes a "tc qdisc del" failure.
      [patch 2] fixes an inconsistency between the SW and HW VLAN tables.
      [patch 3] fixes a RMW issue related to VLAN filter switch.
      [patch 4] clears port based VLAN when unloading PF.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: clear port base VLAN when unload PF · 59359fc8
      Jian Shen authored
      Currently, the PF fails to clear the port-based VLAN for a VF on
      unload, so the VLAN ID remains in the VLAN table. This patch
      fixes that.
      
      Fixes: 92f11ea1 ("net: hns3: fix set port based VLAN issue for VF")
      Signed-off-by: Jian Shen <shenjian15@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: fix RMW issue for VLAN filter switch · 903b85d3
      Jian Shen authored
      According to the user manual, the ingress and egress VLAN filter
      are configured at the same time. Currently, hclge_init_vlan_config()
      and hclge_set_vlan_spoofchk() will both change the VLAN filter
      switch. So it's necessary to read the old configuration before
      modifying it.
      
      Fixes: 22044f95 ("net: hns3: add support for spoof check setting")
      Signed-off-by: Jian Shen <shenjian15@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
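
      The read-modify-write pattern this commit message refers to can be sketched generically. This is not the hns3 code: `hw_filter_reg`, the `FILTER_*` bit names, and both helpers are hypothetical, and a plain variable stands in for the hardware configuration. The sketch contrasts a blind write, which discards bits set by another caller, with a write that reads the old configuration first.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout of a shared filter-switch setting. */
#define FILTER_INGRESS 0x01u
#define FILTER_EGRESS  0x02u

/* A plain variable standing in for the hardware configuration. */
static uint8_t hw_filter_reg;

/* Blind write: overwrites whatever another caller configured earlier. */
static void set_filter_blind(uint8_t bits, bool enable)
{
    hw_filter_reg = enable ? bits : 0;
}

/* Read-modify-write: read the old value, change only the requested bits. */
static void set_filter_rmw(uint8_t bits, bool enable)
{
    uint8_t cfg = hw_filter_reg;

    if (enable)
        cfg |= bits;
    else
        cfg &= (uint8_t)~bits;
    hw_filter_reg = cfg;
}
```

      When two code paths each own some of the bits, only the second helper leaves the other path's bits untouched.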
    • net: hns3: fix VF VLAN table entries inconsistent issue · 23b4201d
      Jian Shen authored
      Currently, if a VF is loaded on the host side, the host doesn't
      clear the VF's VLAN table entries when the VF is removed. As a
      result, when a reset and an SR-IOV disable happen at the same time,
      the VLAN device over the VF is removed but its VLAN table entries
      remain in hardware.

      This patch fixes it by asking the PF to clear the VLAN table entries
      for the VF when the VF is removed. It also clears the VLAN table full
      bit after the VF's VLAN table entries have been cleared.
      
      Fixes: c6075b19 ("net: hns3: Record VF vlan tables")
      Signed-off-by: Jian Shen <shenjian15@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: fix "tc qdisc del" failed issue · 5eb01ddf
      Yonglong Liu authored
      The HNS3 driver supports configuring the number of TCs and the
      TC-to-priority map via the "tc" tool. Deleting the rule fails,
      however, because the HNS3 driver needs at least one TC but the "tc"
      tool sets the TC number to zero on delete.
      
      This patch makes sure that the TC number is at least one.
      
      Fixes: 30d240df ("net: hns3: Add mqprio hardware offload support in hns3 driver")
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • taprio: Fix sending packets without dequeueing them · b09fe70e
      Vinicius Costa Gomes authored
      There was a bug that was causing packets to be sent to the driver
      without first calling dequeue() on the "child" qdisc. And the KASAN
      report below shows that sending a packet without calling dequeue()
      leads to bad results.
      
      The problem is that when checking the last qdisc "child" we do not set
      the returned skb to NULL, which can cause it to be sent to the driver,
      and so after the skb is sent, it may be freed, and in some situations a
      reference to it may still be in the child qdisc, because it was never
      dequeued.
      
      The crash log looks like this:
      
      [   19.937538] ==================================================================
      [   19.938300] BUG: KASAN: use-after-free in taprio_dequeue_soft+0x620/0x780
      [   19.938968] Read of size 4 at addr ffff8881128628cc by task swapper/1/0
      [   19.939612]
      [   19.939772] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc3+ #97
      [   19.940397] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qe4
      [   19.941523] Call Trace:
      [   19.941774]  <IRQ>
      [   19.941985]  dump_stack+0x97/0xe0
      [   19.942323]  print_address_description.constprop.0+0x3b/0x60
      [   19.942884]  ? taprio_dequeue_soft+0x620/0x780
      [   19.943325]  ? taprio_dequeue_soft+0x620/0x780
      [   19.943767]  __kasan_report.cold+0x1a/0x32
      [   19.944173]  ? taprio_dequeue_soft+0x620/0x780
      [   19.944612]  kasan_report+0xe/0x20
      [   19.944954]  taprio_dequeue_soft+0x620/0x780
      [   19.945380]  __qdisc_run+0x164/0x18d0
      [   19.945749]  net_tx_action+0x2c4/0x730
      [   19.946124]  __do_softirq+0x268/0x7bc
      [   19.946491]  irq_exit+0x17d/0x1b0
      [   19.946824]  smp_apic_timer_interrupt+0xeb/0x380
      [   19.947280]  apic_timer_interrupt+0xf/0x20
      [   19.947687]  </IRQ>
      [   19.947912] RIP: 0010:default_idle+0x2d/0x2d0
      [   19.948345] Code: 00 00 41 56 41 55 65 44 8b 2d 3f 8d 7c 7c 41 54 55 53 0f 1f 44 00 00 e8 b1 b2 c5 fd e9 07 00 3
      [   19.950166] RSP: 0018:ffff88811a3efda0 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
      [   19.950909] RAX: 0000000080000000 RBX: ffff88811a3a9600 RCX: ffffffff8385327e
      [   19.951608] RDX: 1ffff110234752c0 RSI: 0000000000000000 RDI: ffffffff8385262f
      [   19.952309] RBP: ffffed10234752c0 R08: 0000000000000001 R09: ffffed10234752c1
      [   19.953009] R10: ffffed10234752c0 R11: ffff88811a3a9607 R12: 0000000000000001
      [   19.953709] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
      [   19.954408]  ? default_idle_call+0x2e/0x70
      [   19.954816]  ? default_idle+0x1f/0x2d0
      [   19.955192]  default_idle_call+0x5e/0x70
      [   19.955584]  do_idle+0x3d4/0x500
      [   19.955909]  ? arch_cpu_idle_exit+0x40/0x40
      [   19.956325]  ? _raw_spin_unlock_irqrestore+0x23/0x30
      [   19.956829]  ? trace_hardirqs_on+0x30/0x160
      [   19.957242]  cpu_startup_entry+0x19/0x20
      [   19.957633]  start_secondary+0x2a6/0x380
      [   19.958026]  ? set_cpu_sibling_map+0x18b0/0x18b0
      [   19.958486]  secondary_startup_64+0xa4/0xb0
      [   19.958921]
      [   19.959078] Allocated by task 33:
      [   19.959412]  save_stack+0x1b/0x80
      [   19.959747]  __kasan_kmalloc.constprop.0+0xc2/0xd0
      [   19.960222]  kmem_cache_alloc+0xe4/0x230
      [   19.960617]  __alloc_skb+0x91/0x510
      [   19.960967]  ndisc_alloc_skb+0x133/0x330
      [   19.961358]  ndisc_send_ns+0x134/0x810
      [   19.961735]  addrconf_dad_work+0xad5/0xf80
      [   19.962144]  process_one_work+0x78e/0x13a0
      [   19.962551]  worker_thread+0x8f/0xfa0
      [   19.962919]  kthread+0x2ba/0x3b0
      [   19.963242]  ret_from_fork+0x3a/0x50
      [   19.963596]
      [   19.963753] Freed by task 33:
      [   19.964055]  save_stack+0x1b/0x80
      [   19.964386]  __kasan_slab_free+0x12f/0x180
      [   19.964830]  kmem_cache_free+0x80/0x290
      [   19.965231]  ip6_mc_input+0x38a/0x4d0
      [   19.965617]  ipv6_rcv+0x1a4/0x1d0
      [   19.965948]  __netif_receive_skb_one_core+0xf2/0x180
      [   19.966437]  netif_receive_skb+0x8c/0x3c0
      [   19.966846]  br_handle_frame_finish+0x779/0x1310
      [   19.967302]  br_handle_frame+0x42a/0x830
      [   19.967694]  __netif_receive_skb_core+0xf0e/0x2a90
      [   19.968167]  __netif_receive_skb_one_core+0x96/0x180
      [   19.968658]  process_backlog+0x198/0x650
      [   19.969047]  net_rx_action+0x2fa/0xaa0
      [   19.969420]  __do_softirq+0x268/0x7bc
      [   19.969785]
      [   19.969940] The buggy address belongs to the object at ffff888112862840
      [   19.969940]  which belongs to the cache skbuff_head_cache of size 224
      [   19.971202] The buggy address is located 140 bytes inside of
      [   19.971202]  224-byte region [ffff888112862840, ffff888112862920)
      [   19.972344] The buggy address belongs to the page:
      [   19.972820] page:ffffea00044a1800 refcount:1 mapcount:0 mapping:ffff88811a2bd1c0 index:0xffff8881128625c0 compo0
      [   19.973930] flags: 0x8000000000010200(slab|head)
      [   19.974388] raw: 8000000000010200 ffff88811a2ed650 ffff88811a2ed650 ffff88811a2bd1c0
      [   19.975151] raw: ffff8881128625c0 0000000000190013 00000001ffffffff 0000000000000000
      [   19.975915] page dumped because: kasan: bad access detected
      [   19.976461] page_owner tracks the page as allocated
      [   19.976946] page last allocated via order 2, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NO)
      [   19.978332]  prep_new_page+0x24b/0x330
      [   19.978707]  get_page_from_freelist+0x2057/0x2c90
      [   19.979170]  __alloc_pages_nodemask+0x218/0x590
      [   19.979619]  new_slab+0x9d/0x300
      [   19.979948]  ___slab_alloc.constprop.0+0x2f9/0x6f0
      [   19.980421]  __slab_alloc.constprop.0+0x30/0x60
      [   19.980870]  kmem_cache_alloc+0x201/0x230
      [   19.981269]  __alloc_skb+0x91/0x510
      [   19.981620]  alloc_skb_with_frags+0x78/0x4a0
      [   19.982043]  sock_alloc_send_pskb+0x5eb/0x750
      [   19.982476]  unix_stream_sendmsg+0x399/0x7f0
      [   19.982904]  sock_sendmsg+0xe2/0x110
      [   19.983262]  ____sys_sendmsg+0x4de/0x6d0
      [   19.983660]  ___sys_sendmsg+0xe4/0x160
      [   19.984032]  __sys_sendmsg+0xab/0x130
      [   19.984396]  do_syscall_64+0xe7/0xae0
      [   19.984761] page last free stack trace:
      [   19.985142]  __free_pages_ok+0x432/0xbc0
      [   19.985533]  qlist_free_all+0x56/0xc0
      [   19.985907]  quarantine_reduce+0x149/0x170
      [   19.986315]  __kasan_kmalloc.constprop.0+0x9e/0xd0
      [   19.986791]  kmem_cache_alloc+0xe4/0x230
      [   19.987182]  prepare_creds+0x24/0x440
      [   19.987548]  do_faccessat+0x80/0x590
      [   19.987906]  do_syscall_64+0xe7/0xae0
      [   19.988276]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [   19.988775]
      [   19.988930] Memory state around the buggy address:
      [   19.989402]  ffff888112862780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   19.990111]  ffff888112862800: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
      [   19.990822] >ffff888112862880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [   19.991529]                                               ^
      [   19.992081]  ffff888112862900: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
      [   19.992796]  ffff888112862980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      
      Fixes: 5a781ccb ("tc: Add support for configuring the taprio scheduler")
      Reported-by: Michael Schmidt <michael.schmidt@eti.uni-siegen.de>
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Acked-by: Andre Guedes <andre.guedes@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b09fe70e
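
The taprio bug described above (a peeked packet pointer leaking out of the loop when the last child is skipped) can be modeled in userspace. This is a simplified sketch, not the scheduler's code: `struct toy_child`, the `gate_open` flag and all function names are assumptions made for illustration.

```c
#include <stddef.h>

struct toy_pkt { int id; };

/* Toy child queue that holds at most one packet. */
struct toy_child {
    struct toy_pkt *head; /* NULL when the queue is empty */
    int gate_open;        /* non-zero if sending is allowed right now */
};

static struct toy_pkt *toy_peek(struct toy_child *c)
{
    return c->head;
}

static struct toy_pkt *toy_dequeue(struct toy_child *c)
{
    struct toy_pkt *p = c->head;

    c->head = NULL;
    return p;
}

/* Return a packet that was really dequeued, or NULL if none may be sent. */
static struct toy_pkt *toy_sched_dequeue(struct toy_child *kids, size_t n)
{
    struct toy_pkt *skb = NULL;

    for (size_t i = 0; i < n; i++) {
        skb = toy_peek(&kids[i]);
        if (!skb)
            continue;
        if (!kids[i].gate_open) {
            /*
             * The packet was only peeked, never dequeued. Without
             * this reset, a skipped last child would leave a stale
             * pointer in skb and the caller would send it.
             */
            skb = NULL;
            continue;
        }
        return toy_dequeue(&kids[i]);
    }
    return skb;
}
```

With the reset in place, the function can only return a pointer that has been removed from its child queue, so the queue never keeps a reference to a packet that was handed to the caller.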
    • Linus Torvalds's avatar
      Merge tag 'for-linus-5.6-2' of git://github.com/cminyard/linux-ipmi · 3cc6e2c5
      Linus Torvalds authored
      Pull IPMI fix from Corey Minyard:
       "Fix a message spew on some system
      
        The call to platform_get_irq() was changed to print a log if the
        interrupt was not available, and that was causing bogus messages to
        spew out for the IPMI driver. People have requested that this get in
        to 5.6 so I'm sending it along"
      
      * tag 'for-linus-5.6-2' of git://github.com/cminyard/linux-ipmi:
        ipmi_si: Avoid spurious errors for optional IRQs
      3cc6e2c5
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 2644bc85
      Linus Torvalds authored
      Pull crypto fix from Herbert Xu:
       "Fix a build problem with x86/curve25519"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: x86/curve25519 - support assemblers with no adx support
      2644bc85
    • Amir Goldstein's avatar
      ovl: fix lock in ovl_llseek() · 531d3040
      Amir Goldstein authored
      ovl_inode_lock() is interruptible. When inode_lock() in ovl_llseek()
      was replaced with ovl_inode_lock(), we did not add a check for error.
      
      Fix this by making ovl_inode_lock() uninterruptible and change the
      existing call sites to use an _interruptible variant.
      
      Reported-by: syzbot+66a9752fa927f745385e@syzkaller.appspotmail.com
      Fixes: b1f9d385 ("ovl: use ovl_inode_lock in ovl_llseek()")
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      531d3040
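
The difference between the two lock variants in the ovl_llseek() fix above can be sketched with a toy lock. Everything here is hypothetical (`struct toy_lock`, the `signal_pending` flag, the function names); it only models the idea that an interruptible lock can fail and must be checked, while a plain lock always succeeds.

```c
#include <errno.h>

struct toy_lock {
    int held;           /* non-zero while the lock is taken */
    int signal_pending; /* models a signal waiting for the task */
};

/* Uninterruptible variant: always takes the lock, cannot fail. */
static void toy_lock_plain(struct toy_lock *l)
{
    l->held = 1;
}

/* Interruptible variant: may fail, so the caller must check the result. */
static int toy_lock_interruptible(struct toy_lock *l)
{
    if (l->signal_pending)
        return -EINTR;
    l->held = 1;
    return 0;
}

static void toy_unlock(struct toy_lock *l)
{
    l->held = 0;
}

/* A seek-like caller with no error path for the lock uses the plain variant. */
static long toy_seek(struct toy_lock *l, long *pos, long offset)
{
    toy_lock_plain(l);
    *pos = offset;
    toy_unlock(l);
    return *pos;
}

/* A caller that can propagate errors uses the interruptible variant. */
static int toy_interruptible_op(struct toy_lock *l)
{
    int err = toy_lock_interruptible(l);

    if (err)
        return err; /* lock was not taken, so do not unlock */
    toy_unlock(l);
    return 0;
}
```

The design choice mirrored here is that the default name is the one that cannot fail, so a call site that forgets an error check is still correct.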
    • Shin'ichiro Kawasaki's avatar
      block: Fix partition support for host aware zoned block devices · b53df2e7
      Shin'ichiro Kawasaki authored
      Commit b7205307 ("block: allow partitions on host aware zone
      devices") introduced the helper function disk_has_partitions() to check
      if a given disk has valid partitions. However, since this function's
      result depends directly on the disk partition table length rather than
      the actual existence of valid partitions in the table, it returns true
      even after all partitions are removed from the disk. For host aware
      zoned block devices, this results in zone management support being kept
      disabled even after removing all partitions.
      
      Fix this by changing disk_has_partitions() to walk through the partition
      table entries and return true if and only if a valid non-zero size
      partition is found.
      
      Fixes: b7205307 ("block: allow partitions on host aware zone devices")
      Cc: stable@vger.kernel.org # 5.5
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      b53df2e7
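
The change to disk_has_partitions() described above (walk the table instead of trusting its length) can be illustrated with a small userspace model. The types and the name `toy_disk_has_partitions` are hypothetical, and the sketch assumes that slot 0 stands for the whole device and is skipped.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_part {
    unsigned long long nr_sects; /* partition size in sectors */
};

struct toy_part_tbl {
    size_t len;             /* number of slots, used or not */
    struct toy_part **part; /* NULL slots are unused */
};

/*
 * Return true only if some slot holds a partition of non-zero size.
 * Assumption of this sketch: slot 0 represents the whole device, so
 * the walk starts at slot 1. The table length alone says nothing,
 * because slots stay allocated after partitions are deleted.
 */
static bool toy_disk_has_partitions(const struct toy_part_tbl *tbl)
{
    for (size_t i = 1; i < tbl->len; i++) {
        if (tbl->part[i] && tbl->part[i]->nr_sects)
            return true;
    }
    return false;
}
```

A table with three slots and no live entries past slot 0 reports no partitions here, whereas a length-based check (`len > 1`) would report true.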
    • Ming Lei's avatar
      blk-mq: insert flush request to the front of dispatch queue · cc3200ea
      Ming Lei authored
      Commit 01e99aec ("blk-mq: insert passthrough request into
      hctx->dispatch directly") may add flush requests to the tail of the
      dispatch queue, because it applies the 'add_head' parameter of
      blk_mq_sched_insert_request.
      
      This turns out to cause a performance regression on NCQ controllers,
      because flush is a non-NCQ command and can't be queued while any NCQ
      command is in flight. Adding the flush rq to the front of hctx->dispatch
      is more likely to add latency to the flush rq than adding it to the tail
      of the dispatch queue, because of S_SCHED_RESTART; the chance of flush
      merging then increases, and fewer flush requests may be issued to the
      controller.
      
      So always insert flush requests at the front of the dispatch queue, just
      as before commit 01e99aec ("blk-mq: insert passthrough request into
      hctx->dispatch directly") was applied.
      
      Cc: Damien Le Moal <Damien.LeMoal@wdc.com>
      Cc: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Fixes: 01e99aec ("blk-mq: insert passthrough request into hctx->dispatch directly")
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      cc3200ea
    • Stefan Haberland's avatar
      s390/dasd: fix data corruption for thin provisioned devices · 5e6bdd37
      Stefan Haberland authored
      Devices are formatted in multiples of tracks.
      For an Extent Space Efficient (ESE) volume we get errors when accessing
      unformatted tracks. In this case the driver either formats the track on
      the fly for write requests or returns zero data for read requests.
      
      In case a request spans multiple tracks, the indication of an unformatted
      track presented for the first track is incorrectly applied to all tracks
      covered by the request. As a result, tracks containing data will be handled
      as empty, resulting in zero data being returned on read, or overwriting
      existing data with zero on write.
      
      Fix by determining the track that gets the NRF error.
      For write requests, only format the track that is known to be unformatted.
      For read requests, all tracks before it have returned valid data and should
      not be touched.
      All tracks after the unformatted track might be formatted or not. Those are
      returned to the block layer to build a new request.
      
      When using alias devices there is a chance that multiple write requests
      trigger a format of the same track which might lead to data loss. Ensure
      that a track is formatted only once by maintaining a list of currently
      processed tracks.
      
      Fixes: 5e2b17e7 ("s390/dasd: Add dynamic formatting support for ESE volumes")
      Cc: stable@vger.kernel.org # 5.3+
      Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
      Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
      Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      5e6bdd37
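
The "format a track only once" rule from the dasd fix above can be sketched as a small list of tracks that are currently being formatted. This is a single-threaded userspace model with hypothetical names and a fixed capacity; a real driver would also need locking around the list, which is left out here.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_INFLIGHT 8

/* Tracks that currently have a format operation in flight. */
struct toy_format_list {
    unsigned int track[TOY_MAX_INFLIGHT];
    size_t count;
};

/*
 * Try to claim a track for formatting. Returns false if another
 * request already claimed it (or if the toy list is full), in which
 * case the caller must not start a second format of that track.
 */
static bool toy_format_claim(struct toy_format_list *l, unsigned int track)
{
    for (size_t i = 0; i < l->count; i++) {
        if (l->track[i] == track)
            return false;
    }
    if (l->count == TOY_MAX_INFLIGHT)
        return false;
    l->track[l->count++] = track;
    return true;
}

/* Drop the claim once the format has completed. */
static void toy_format_release(struct toy_format_list *l, unsigned int track)
{
    for (size_t i = 0; i < l->count; i++) {
        if (l->track[i] == track) {
            /* Fill the hole with the last entry to keep the list dense. */
            l->track[i] = l->track[--l->count];
            return;
        }
    }
}
```

Two write requests that hit the same unformatted track would both call the claim function; only the first succeeds, so the second cannot wipe data written after the first format.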
    • Ulf Hansson's avatar
      mmc: core: Respect MMC_CAP_NEED_RSP_BUSY for eMMC sleep command · 18d20046
      Ulf Hansson authored
      The busy timeout for CMD5, which puts the eMMC into sleep state, is specific
      to the card. The timeout may exceed host->max_busy_timeout. In that case,
      mmc_sleep() switches from an R1B response to an R1 response, so as to
      prevent the host from doing HW busy detection.
      
      However, it has turned out that some hosts require an R1B response no
      matter what, so let's respect that by checking MMC_CAP_NEED_RSP_BUSY. Note
      that if R1B is enforced, the host becomes fully responsible for managing
      the needed busy timeout, one way or another.
      
      Suggested-by: Sowjanya Komatineni <skomatineni@nvidia.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200311092036.16084-1-ulf.hansson@linaro.org
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      18d20046
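
The response-type decision described in the eMMC sleep commit above can be written down as a small pure function. This is a sketch under stated assumptions, not the mmc core code: the enum, the function name and the boolean parameter are hypothetical, and a `max_busy_timeout_ms` of zero is assumed to mean that the host reports no limit.

```c
#include <stdbool.h>

enum toy_resp {
    TOY_RESP_R1,  /* no hardware busy detection */
    TOY_RESP_R1B, /* response with busy signalling */
};

/*
 * Pick the response type for the sleep command.
 * - Hosts that always need R1B get it, whatever the timeout is.
 * - Otherwise fall back to R1 when the card's timeout exceeds what
 *   the host can handle (assumption: 0 means "no limit reported").
 */
static enum toy_resp toy_sleep_resp(unsigned int timeout_ms,
                                    unsigned int max_busy_timeout_ms,
                                    bool need_rsp_busy)
{
    if (need_rsp_busy)
        return TOY_RESP_R1B;
    if (max_busy_timeout_ms && timeout_ms > max_busy_timeout_ms)
        return TOY_RESP_R1;
    return TOY_RESP_R1B;
}
```

In the first branch the host, not the core, is left to cope with a timeout larger than it advertises, which matches the note in the commit message.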
    • Chris Packham's avatar
      net: mvmdio: avoid error message for optional IRQ · e1f550dc
      Chris Packham authored
      Per the dt-binding the interrupt is optional so use
      platform_get_irq_optional() instead of platform_get_irq(). Since
      commit 7723f4c5 ("driver core: platform: Add an error message to
      platform_get_irq*()") platform_get_irq() produces an error message
      
        orion-mdio f1072004.mdio: IRQ index 0 not found
      
      which is perfectly normal if one hasn't specified the optional property
      in the device tree.
      
      Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e1f550dc
    • Andrew Lunn's avatar
      net: dsa: mv88e6xxx: Add missing mask of ATU occupancy register · 012fc745
      Andrew Lunn authored
      Only the bottom 12 bits contain the ATU bin occupancy statistics. The
      upper bits need masking off.
      
      Fixes: e0c69ca7 ("net: dsa: mv88e6xxx: Add ATU occupancy via devlink resources")
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      012fc745
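
The masking described in the mv88e6xxx commit above amounts to keeping the bottom 12 bits of a 16-bit register value. A minimal sketch follows; the macro and function names are hypothetical and are not the driver's identifiers.

```c
#include <stdint.h>

/* Bottom 12 bits hold the bin occupancy; the upper 4 bits are unrelated. */
#define TOY_ATU_OCCUPANCY_MASK 0x0fffu

static uint16_t toy_atu_bin_occupancy(uint16_t reg)
{
    return (uint16_t)(reg & TOY_ATU_OCCUPANCY_MASK);
}
```

Without the mask, any set upper bit would inflate the reported occupancy by at least 4096 entries.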
    • Eric Dumazet's avatar
      net: memcg: fix lockdep splat in inet_csk_accept() · 06669ea3
      Eric Dumazet authored
      Locking newsk while still holding the listener lock triggered
      a lockdep splat [1]
      
      We can simply move the memcg code after we release the listener lock,
      as this can also help if multiple threads are sharing a common listener.
      
      Also fix a typo while reading socket sk_rmem_alloc.
      
      [1]
      WARNING: possible recursive locking detected
      5.6.0-rc3-syzkaller #0 Not tainted
      --------------------------------------------
      syz-executor598/9524 is trying to acquire lock:
      ffff88808b5b8b90 (sk_lock-AF_INET6){+.+.}, at: lock_sock include/net/sock.h:1541 [inline]
      ffff88808b5b8b90 (sk_lock-AF_INET6){+.+.}, at: inet_csk_accept+0x69f/0xd30 net/ipv4/inet_connection_sock.c:492
      
      but task is already holding lock:
      ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: lock_sock include/net/sock.h:1541 [inline]
      ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: inet_csk_accept+0x8d/0xd30 net/ipv4/inet_connection_sock.c:445
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(sk_lock-AF_INET6);
        lock(sk_lock-AF_INET6);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      1 lock held by syz-executor598/9524:
       #0: ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: lock_sock include/net/sock.h:1541 [inline]
       #0: ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: inet_csk_accept+0x8d/0xd30 net/ipv4/inet_connection_sock.c:445
      
      stack backtrace:
      CPU: 0 PID: 9524 Comm: syz-executor598 Not tainted 5.6.0-rc3-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x188/0x20d lib/dump_stack.c:118
       print_deadlock_bug kernel/locking/lockdep.c:2370 [inline]
       check_deadlock kernel/locking/lockdep.c:2411 [inline]
       validate_chain kernel/locking/lockdep.c:2954 [inline]
       __lock_acquire.cold+0x114/0x288 kernel/locking/lockdep.c:3954
       lock_acquire+0x197/0x420 kernel/locking/lockdep.c:4484
       lock_sock_nested+0xc5/0x110 net/core/sock.c:2947
       lock_sock include/net/sock.h:1541 [inline]
       inet_csk_accept+0x69f/0xd30 net/ipv4/inet_connection_sock.c:492
       inet_accept+0xe9/0x7c0 net/ipv4/af_inet.c:734
       __sys_accept4_file+0x3ac/0x5b0 net/socket.c:1758
       __sys_accept4+0x53/0x90 net/socket.c:1809
       __do_sys_accept4 net/socket.c:1821 [inline]
       __se_sys_accept4 net/socket.c:1818 [inline]
       __x64_sys_accept4+0x93/0xf0 net/socket.c:1818
       do_syscall_64+0xf6/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x4445c9
      Code: e8 0c 0d 03 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007ffc35b37608 EFLAGS: 00000246 ORIG_RAX: 0000000000000120
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004445c9
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
      RBP: 0000000000000000 R08: 0000000000306777 R09: 0000000000306777
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 00000000004053d0 R14: 0000000000000000 R15: 0000000000000000
      
      Fixes: d752a498 ("net: memcg: late association of sock to memcg")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      06669ea3