1. 20 Feb, 2022 2 commits
    • Merge tag 'pidfd.v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · c1034d24
      Linus Torvalds authored
      Pull pidfd fix from Christian Brauner:
       "This fixes a problem reported by lockdep when installing a pidfd via
        fd_install() with siglock and the tasklist write lock held in
        copy_process() when calling clone()/clone3() with CLONE_PIDFD.
      
        Originally a pidfd was created prior to holding any of these locks but
        this required a call to ksys_close(). So quite some time ago in
        6fd2fe49 ("copy_process(): don't use ksys_close() on cleanups") we
        switched to a get_unused_fd_flags() + fd_install() model.
      
        As part of that we moved fd_install() as late as possible. This was
        done for two main reasons. First, because we needed to ensure that we
        call fd_install() past the point of no return as once that's called
        the fd is live in the task's file table. Second, because we tried to
        ensure that the fd is visible in /proc/<pid>/fd/<pidfd> right when the
        task is visible.
      
        This fix moves the fd_install() to an even later point which means
        that a task will be visible in proc while the pidfd isn't yet under
        /proc/<pid>/fd/<pidfd>.
      
        While this is a user visible change it's very unlikely that this will
        have any impact. Nobody should be relying on that and if they do we
        need to come up with something better but again, it's doubtful this is
        relevant"
      
      * tag 'pidfd.v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        copy_process(): Move fd_install() out of sighand->siglock critical section
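The reserve-then-install model described in the pull message above can be sketched in userspace C. This is only an illustrative model, not kernel code: `struct fd_table`, `reserve_fd()`, `unreserve_fd()`, `install_fd()` and `fd_visible()` are hypothetical names that loosely mirror the roles of get_unused_fd_flags(), put_unused_fd() and fd_install().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 8

/* Hypothetical stand-in for a task's file table. */
struct fd_table {
	bool reserved[TABLE_SIZE];     /* slot number has been handed out */
	const void *file[TABLE_SIZE];  /* non-NULL once installed (visible) */
};

/* Role of get_unused_fd_flags(): reserve a number, publish nothing. */
static int reserve_fd(struct fd_table *t)
{
	for (int fd = 0; fd < TABLE_SIZE; fd++) {
		if (!t->reserved[fd]) {
			t->reserved[fd] = true;
			return fd;
		}
	}
	return -1;
}

/* Role of put_unused_fd(): undo a reservation on an error path. */
static void unreserve_fd(struct fd_table *t, int fd)
{
	assert(fd >= 0 && fd < TABLE_SIZE && t->file[fd] == NULL);
	t->reserved[fd] = false;
}

/* Role of fd_install(): publish the file; there is no rollback after this. */
static void install_fd(struct fd_table *t, int fd, const void *file)
{
	assert(fd >= 0 && fd < TABLE_SIZE && t->reserved[fd]);
	t->file[fd] = file;
}

/* True once the descriptor would be observable by other code. */
static bool fd_visible(const struct fd_table *t, int fd)
{
	return fd >= 0 && fd < TABLE_SIZE && t->file[fd] != NULL;
}
```

In this model a reserved slot can still be released cheaply, which is why the install step is the one that has to sit past the point of no return; how late it can be moved is the trade-off the pull message discusses.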
    • Merge branch 'ucount-rlimit-fixes-for-v5.17' of... · 2d3409eb
      Linus Torvalds authored
      Merge branch 'ucount-rlimit-fixes-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
      
      Pull ucounts fixes from Eric Biederman:
       "Michal Koutný recently found some bugs in the enforcement of
        RLIMIT_NPROC in the recent ucount rlimit implementation.
      
        In this set of patches I have developed a very conservative approach
        changing only what is necessary to fix the bugs that I can see
        clearly. Cleanups and anything that is making the code more consistent
        can follow after we have the code working as it has historically.
      
        The problem is not so much inconsistencies (although those exist) but
        that it is very difficult to figure out what the code should be doing
        in the case of RLIMIT_NPROC.
      
        All other rlimits are only enforced where the resource is acquired
        (allocated). RLIMIT_NPROC by necessity needs to be enforced in an
        additional location, and our current implementation stumbled its way
        into that implementation"
      
      * 'ucount-rlimit-fixes-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
        ucounts: Handle wrapping in is_ucounts_overlimit
        ucounts: Move RLIMIT_NPROC handling after set_user
        ucounts: Base set_cred_ucounts changes on the real user
        ucounts: Enforce RLIMIT_NPROC not RLIMIT_NPROC+1
        rlimit: Fix RLIMIT_NPROC enforcement failure caused by capability calls in set_user
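Two of the patch titles above, the wrapping fix and the "not RLIMIT_NPROC+1" fix, describe classic limit-check pitfalls. The log does not show the kernel code, so the following is a minimal sketch under stated assumptions: `over_limit()` and `may_create_process()` are hypothetical helpers, the counter is assumed to be a signed `long` that turns negative when it wraps, and the check is assumed to run after the new process has been counted.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: 'count' is a signed counter that may have gone
 * negative after wrapping; 'rlimit' is the unsigned resource limit.
 */
static bool over_limit(long count, unsigned long rlimit)
{
	/* Clamp the limit so the signed comparison below is meaningful. */
	long max = rlimit > (unsigned long)LONG_MAX ? LONG_MAX : (long)rlimit;

	/* A negative count can only come from wrap-around: treat it as over. */
	return count < 0 || count > max;
}

/*
 * Hypothetical admission check, evaluated after the new process has
 * already been counted, so 'count > limit' admits exactly 'limit'.
 */
static bool may_create_process(long count_after_inc, unsigned long rlimit)
{
	return !over_limit(count_after_inc, rlimit);
}
```

If the same `>` comparison were applied before the increment instead, one extra process would slip through, which is the kind of off-by-one the "RLIMIT_NPROC+1" title refers to.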
  2. 19 Feb, 2022 3 commits
  3. 18 Feb, 2022 7 commits
    • Merge branch 'acpi-processor' · 82926564
      Rafael J. Wysocki authored
      Merge fix for a recent boot lockup regression on 32-bit ThinkPad T40.
      
      * acpi-processor:
        ACPI: processor: idle: fix lockup regression on 32-bit ThinkPad T40
    • Merge tag 'mtd/fixes-for-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux · 7993e65f
      Linus Torvalds authored
      Pull MTD fixes from Miquel Raynal:
       "MTD changes:
      
         - Qcom:
            - Don't print error message on -EPROBE_DEFER
            - Fix kernel panic on skipped partition
            - Fix missing free for pparts in cleanup
      
         - phram: Prevent divide by zero bug in phram_setup()
      
        Raw NAND controller changes:
      
         - ingenic: Fix missing put_device in ingenic_ecc_get
      
         - qcom: Fix clock sequencing in qcom_nandc_probe()
      
         - omap2: Prevent invalid configuration and build error
      
         - gpmi: Don't leak PM reference in error path
      
         - brcmnand: Fix incorrect sub-page ECC status"
      
      * tag 'mtd/fixes-for-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
        mtd: rawnand: brcmnand: Fixed incorrect sub-page ECC status
        mtd: rawnand: gpmi: don't leak PM reference in error path
        mtd: phram: Prevent divide by zero bug in phram_setup()
        mtd: rawnand: omap2: Prevent invalid configuration and build error
        mtd: parsers: qcom: Fix missing free for pparts in cleanup
        mtd: parsers: qcom: Fix kernel panic on skipped partition
        mtd: parsers: qcom: Don't print error message on -EPROBE_DEFER
        mtd: rawnand: qcom: Fix clock sequencing in qcom_nandc_probe()
        mtd: rawnand: ingenic: Fix missing put_device in ingenic_ecc_get
    • Merge tag 'block-5.17-2022-02-17' of git://git.kernel.dk/linux-block · b9889768
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - Surprise removal fix (Christoph)
      
       - Ensure that pages are zeroed before submitted for userspace IO
         (Haimin)
      
       - Fix blk-wbt accounting issue with BFQ (Laibin)
      
       - Use bsize for discard granularity in loop (Ming)
      
       - Fix missing zone handling in blk_complete_request() (Pankaj)
      
      * tag 'block-5.17-2022-02-17' of git://git.kernel.dk/linux-block:
        block/wbt: fix negative inflight counter when remove scsi device
        block: fix surprise removal for drivers calling blk_set_queue_dying
        block-map: add __GFP_ZERO flag for alloc_page in function bio_copy_kern
        block: loop:use kstatfs.f_bsize of backing file to set discard granularity
        block: Add handling for zone append command in blk_complete_request
    • Merge tag 'sound-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 2848551b
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A collection of small patches, mostly for old and new regressions and
        device-specific fixes.
      
         - Regression fixes regarding ALSA core SG-buffer helpers
      
         - Regression fix for Realtek HD-audio mutex deadlock
      
         - Regression fix for USB-audio PM resume error
      
         - More coverage of ASoC core control API notification fixes
      
         - Old regression fixes for HD-audio probe mask
      
         - Fixes for ASoC Realtek codec work handling
      
         - Other device-specific quirks / fixes"
      
      * tag 'sound-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (24 commits)
        ASoC: intel: skylake: Set max DMA segment size
        ASoC: SOF: hda: Set max DMA segment size
        ALSA: hda: Set max DMA segment size
        ALSA: hda/realtek: Fix deadlock by COEF mutex
        ALSA: usb-audio: Don't abort resume upon errors
        ALSA: hda: Fix missing codec probe on Shenker Dock 15
        ALSA: hda: Fix regression on forced probe mask option
        ALSA: hda/realtek: Add quirk for Legion Y9000X 2019
        ALSA: usb-audio: revert to IMPLICIT_FB_FIXED_DEV for M-Audio FastTrack Ultra
        ASoC: wm_adsp: Correct control read size when parsing compressed buffer
        ASoC: qcom: Actually clear DMA interrupt register for HDMI
        ALSA: memalloc: invalidate SG pages before sync
        ALSA: memalloc: Fix dma_need_sync() checks
        MAINTAINERS: update cros_ec_codec maintainers
        ASoC: rt5682: do not block workqueue if card is unbound
        ASoC: rt5668: do not block workqueue if card is unbound
        ASoC: rt5682s: do not block workqueue if card is unbound
        ASoC: tas2770: Insert post reset delay
        ASoC: Revert "ASoC: mediatek: Check for error clk pointer"
        ASoC: amd: acp: Set gpio_spkr_en to None for max speaker amplifer in machine driver
        ...
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 45a98a71
      Linus Torvalds authored
      Pull arm64 fix from Catalin Marinas:
       "Fix wrong branch label in the EL2 GICv3 initialisation code"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: Correct wrong label in macro __init_el2_gicv3
    • Merge tag 'powerpc-5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · ea4b3d29
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Fix boot failure on 603 with DEBUG_PAGEALLOC and KFENCE
      
       - Fix 32-bit build with newer binutils that rejects 'ptesync' etc
      
      Thanks to Anders Roxell, Christophe Leroy, and Maxime Bizon.
      
      * tag 'powerpc-5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/lib/sstep: fix 'ptesync' build error
        powerpc/603: Fix boot failure with DEBUG_PAGEALLOC and KFENCE
    • Merge tag '5.17-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 7476b043
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "Six small smb3 client fixes, three for stable:
      
         - fix for snapshot mount option
      
         - two ACL related fixes
      
         - use after free race fix
      
         - fix for confusing warning message logged with older dialects"
      
      * tag '5.17-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: fix confusing unneeded warning message on smb2.1 and earlier
        cifs: modefromsids must add an ACE for authenticated users
        cifs: fix double free race when mount fails in cifs_get_root()
        cifs: do not use uninitialized data in the owner/group sid
        cifs: fix set of group SID via NTSD xattrs
        smb3: fix snapshot mount option
  4. 17 Feb, 2022 28 commits
    • Merge tag 'linux-kselftest-fixes-5.17-rc5' of... · 9195e5e0
      Linus Torvalds authored
      Merge tag 'linux-kselftest-fixes-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull Kselftest fixes from Shuah Khan:
       "Fixes to ftrace, exec, and seccomp tests build, run-time and install
        bugs. These bugs are in the way of running the tests"
      
      * tag 'linux-kselftest-fixes-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        selftests/ftrace: Do not trace do_softirq because of PREEMPT_RT
        selftests/seccomp: Fix seccomp failure by adding missing headers
        selftests/exec: Add non-regular to TEST_GEN_PROGS
    • Merge tag 'drm-fixes-2022-02-18' of git://anongit.freedesktop.org/drm/drm · b3d971ec
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Regular fixes for rc5, nothing really stands out, mostly some amdgpu
        and i915 fixes with mediatek, radeon and some misc fixes.
      
        cma-helper:
         - set VM_DONTEXPAND
      
        atomic:
         - error handling fix
      
        mediatek:
         - fix probe defer loop with external bridge
      
        amdgpu:
         - Stable pstate clock fixes for Dimgrey Cavefish and Beige Goby
         - S0ix SDMA fix
         - Yellow Carp GPU reset fix
      
        radeon:
         - Backlight fix for iMac 12,1
      
        i915:
         - GVT kerneldoc cleanup.
         - GVT Kconfig should depend on X86
         - Prevent out of range access in SWSCI display code
         - Fix mbus join and dbuf slice config lookup
         - Fix inverted priority selection in the TTM backend
         - Fix FBC plane end Y offset check"
      
      * tag 'drm-fixes-2022-02-18' of git://anongit.freedesktop.org/drm/drm:
        drm/atomic: Don't pollute crtc_state->mode_blob with error pointers
        drm/radeon: Fix backlight control on iMac 12,1
        drm/amd/pm: correct the sequence of sending gpu reset msg
        drm/amdgpu: skipping SDMA hw_init and hw_fini for S0ix.
        drm/amd/pm: correct UMD pstate clocks for Dimgrey Cavefish and Beige Goby
        drm/i915/fbc: Fix the plane end Y offset check
        drm/i915/opregion: check port number bounds for SWSCI display power state
        drm/i915/ttm: tweak priority hint selection
        drm/i915: Fix mbus join config lookup
        drm/i915: Fix dbuf slice config lookup
        drm/cma-helper: Set VM_DONTEXPAND for mmap
        drm/mediatek: mtk_dsi: Avoid EPROBE_DEFER loop with external bridge
        drm/i915/gvt: Make DRM_I915_GVT depend on X86
        drm/i915/gvt: clean up kernel-doc in gtt.c
    • Merge tag 'drm-intel-fixes-2022-02-17' of... · 5666b610
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2022-02-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
      
      - GVT kerneldoc cleanup. (Randy Dunlap)
      - GVT Kconfig should depend on X86. (Siva Mullati)
      - Prevent out of range access in SWSCI display code. (Jani Nikula)
      - Fix mbus join and dbuf slice config lookup. (Ville Syrjälä)
      - Fix inverted priority selection in the TTM backend. (Matthew Auld)
      - Fix FBC plane end Y offset check. (Ville Syrjälä)
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/Yg4lA6k8+xp8u3aB@tursulin-mobl2
    • Merge tag 'drm-misc-fixes-2022-02-17' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes · babb1fc3
      Dave Airlie authored
       * drm/cma-helper: Set VM_DONTEXPAND
       * drm/atomic: Fix error handling in drm_atomic_set_mode_for_crtc()
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Thomas Zimmermann <tzimmermann@suse.de>
      Link: https://patchwork.freedesktop.org/patch/msgid/Yg4mzQALMX69UmA3@linux-uq9g
    • Merge tag 'net-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 8b97cae3
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from wireless and netfilter.
      
        Current release - regressions:
      
         - dsa: lantiq_gswip: fix use after free in gswip_remove()
      
         - smc: avoid overwriting the copies of clcsock callback functions
      
        Current release - new code bugs:
      
         - iwlwifi:
            - fix use-after-free when no FW is present
            - mei: fix the pskb_may_pull check in ipv4
            - mei: retry mapping the shared area
            - mvm: don't feed the hardware RFKILL into iwlmei
      
        Previous releases - regressions:
      
         - ipv6: mcast: use rcu-safe version of ipv6_get_lladdr()
      
         - tipc: fix wrong publisher node address in link publications
      
         - iwlwifi: mvm: don't send SAR GEO command for 3160 devices, avoid FW
           assertion
      
         - bgmac: make idm and nicpm resource optional again
      
         - atl1c: fix tx timeout after link flap
      
        Previous releases - always broken:
      
         - vsock: remove vsock from connected table when connect is
           interrupted by a signal
      
         - ping: change destination interface checks to match raw sockets
      
         - crypto: af_alg - get rid of alg_memory_allocated to avoid confusing
           semantics (and null-deref) after SO_RESERVE_MEM was added
      
         - ipv6: make exclusive flowlabel checks per-netns
      
         - bonding: force carrier update when releasing slave
      
         - sched: limit TC_ACT_REPEAT loops
      
         - bridge: multicast: notify switchdev driver whenever MC processing
           gets disabled because of max entries reached
      
         - wifi: brcmfmac: fix crash in brcm_alt_fw_path when WLAN not found
      
         - iwlwifi: fix locking when "HW not ready"
      
         - phy: mediatek: remove PHY mode check on MT7531
      
         - dsa: mv88e6xxx: flush switchdev FDB workqueue before removing VLAN
      
         - dsa: lan9303:
            - fix polarity of reset during probe
            - fix accelerated VLAN handling"
      
      * tag 'net-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits)
        bonding: force carrier update when releasing slave
        nfp: flower: netdev offload check for ip6gretap
        ipv6: fix data-race in fib6_info_hw_flags_set / fib6_purge_rt
        ipv4: fix data races in fib_alias_hw_flags_set
        net: dsa: lan9303: add VLAN IDs to master device
        net: dsa: lan9303: handle hwaccel VLAN tags
        vsock: remove vsock from connected table when connect is interrupted by a signal
        Revert "net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname"
        ping: fix the dif and sdif check in ping_lookup
        net: usb: cdc_mbim: avoid altsetting toggling for Telit FN990
        net: sched: limit TC_ACT_REPEAT loops
        tipc: fix wrong notification node addresses
        net: dsa: lantiq_gswip: fix use after free in gswip_remove()
        ipv6: per-netns exclusive flowlabel checks
        net: bridge: multicast: notify switchdev driver whenever MC processing gets disabled
        CDC-NCM: avoid overflow in sanity checking
        mctp: fix use after free
        net: mscc: ocelot: fix use-after-free in ocelot_vlan_del()
        bonding: fix data-races around agg_select_timer
        dpaa2-eth: Initialize mutex used in one step timestamping path
        ...
    • bonding: force carrier update when releasing slave · a6ab75ce
      Zhang Changzhong authored
      In __bond_release_one(), bond_set_carrier() is only called when bond
      device has no slave. Therefore, if we remove the up slave from a master
      with two slaves and keep the down slave, the master will remain up.
      
      Fix this by moving bond_set_carrier() out of if (!bond_has_slaves(bond))
      statement.
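The effect of moving the carrier update out of the `if` can be illustrated with a small userspace model. This is a sketch, not the driver code: `struct bond`, `set_carrier()` and `release_one()` are simplified, hypothetical stand-ins, and the `fixed` flag exists only to compare the behaviour before and after the change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLAVES 4

/* Hypothetical, heavily simplified bond: only the link state per slave. */
struct bond {
	bool slave_up[MAX_SLAVES];
	size_t nslaves;
	bool carrier;
};

/* Carrier is up if at least one remaining slave is up. */
static void set_carrier(struct bond *b)
{
	b->carrier = false;
	for (size_t i = 0; i < b->nslaves; i++)
		if (b->slave_up[i])
			b->carrier = true;
}

/* Remove slave 'idx'; 'fixed' selects pre-fix or post-fix behaviour. */
static void release_one(struct bond *b, size_t idx, bool fixed)
{
	assert(idx < b->nslaves);
	for (size_t i = idx; i + 1 < b->nslaves; i++)
		b->slave_up[i] = b->slave_up[i + 1];
	b->nslaves--;

	/* Pre-fix: recompute only when no slave is left. Post-fix: always. */
	if (fixed || b->nslaves == 0)
		set_carrier(b);
}
```

With one down slave and one up slave, releasing the up slave leaves `carrier` stale at `true` in the pre-fix path and correctly `false` in the post-fix path, matching the scenario in the commit message.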
      
      Reproducer:
      $ insmod bonding.ko mode=0 miimon=100 max_bonds=2
      $ ifconfig bond0 up
      $ ifenslave bond0 eth0 eth1
      $ ifconfig eth0 down
      $ ifenslave -d bond0 eth1
      $ cat /proc/net/bonding/bond0
      
      Fixes: ff59c456 ("[PATCH] bonding: support carrier state for master")
      Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
      Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
      Link: https://lore.kernel.org/r/1645021088-38370-1-git-send-email-zhangchangzhong@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • fs/file_table: fix adding missing kmemleak_not_leak() · a3580ac9
      Luis Chamberlain authored
      Commit b42bc9a3 ("Fix regression due to "fs: move binfmt_misc sysctl
      to its own file") fixed a regression, however it failed to add a
      kmemleak_not_leak().
      
      Fixes: b42bc9a3 ("Fix regression due to "fs: move binfmt_misc sysctl to its own file")
      Reported-by: Tong Zhang <ztong0001@gmail.com>
      Cc: Tong Zhang <ztong0001@gmail.com>
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'perf-tools-fixes-for-v5.17-2022-02-17' of... · 2dd3a8a1
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v5.17-2022-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull perf tools fixes from Arnaldo Carvalho de Melo:
      
       - Fix corrupt inject files when only last branch option is enabled with
         ARM CoreSight ETM
      
       - Fix use-after-free for realloc(..., 0) in libsubcmd, found by gcc 12
      
       - Defer freeing string after possible strlen() on it in the BPF loader,
         found by gcc 12
      
       - Avoid early exit in 'perf trace' due to SIGCHLD from non-workload
         processes
      
       - Fix arm64 perf_event_attr 'perf test's wrt --call-graph
         initialization
      
       - Fix libperf 32-bit build for 'perf test' wrt uint64_t printf
      
       - Fix perf_cpu_map__for_each_cpu macro in libperf, providing access to
         the CPU iterator
      
       - Sync linux/perf_event.h UAPI with the kernel sources
      
       - Update Jiri Olsa's email address in MAINTAINERS
      
      * tag 'perf-tools-fixes-for-v5.17-2022-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
        perf bpf: Defer freeing string after possible strlen() on it
        perf test: Fix arm64 perf_event_attr tests wrt --call-graph initialization
        libsubcmd: Fix use-after-free for realloc(..., 0)
        libperf: Fix perf_cpu_map__for_each_cpu macro
        perf cs-etm: Fix corrupt inject files when only last branch option is enabled
        perf cs-etm: No-op refactor of synth opt usage
        libperf: Fix 32-bit build for tests uint64_t printf
        tools headers UAPI: Sync linux/perf_event.h with the kernel sources
        perf trace: Avoid early exit due SIGCHLD from non-workload processes
        MAINTAINERS: Update Jiri's email address
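The realloc(..., 0) item in the list above is worth a short illustration. The log does not include the libsubcmd code, so this is a sketch of the general hazard and one common way around it; `xrealloc_nonzero()` is a hypothetical name, not the function in the tree.

```c
#include <stdlib.h>

/*
 * Hypothetical wrapper. realloc(ptr, 0) may free ptr and return NULL,
 * so a fallback that retries with the old ptr afterwards would touch
 * freed memory. Bumping a zero size to 1 up front avoids that call.
 */
static void *xrealloc_nonzero(void *ptr, size_t size)
{
	if (size == 0)
		size = 1;
	return realloc(ptr, size);  /* NULL now only signals allocation failure */
}
```

A caller can then treat a NULL return uniformly as an allocation failure, with the original pointer still valid in that case.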
    • Merge tag 'modules-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux · edbd6c62
      Linus Torvalds authored
      Pull module fix from Luis Chamberlain:
       "Fixes module decompression when CONFIG_SYSFS=n
      
        The only fix trickled down for v5.17-rc cycle so far is the fix for
        module decompression when CONFIG_SYSFS=n. This was reported through
        0-day"
      
      * tag 'modules-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
        module: fix building with sysfs disabled
    • nfp: flower: netdev offload check for ip6gretap · 7dbcda58
      Danie du Toit authored
      IPv6 GRE tunnels are not being offloaded, this is caused by a missing
      netdev offload check. The functionality of IPv6 GRE tunnel offloading
      was previously added but this check was not included. Adding the
      ip6gretap check allows IPv6 GRE tunnels to be offloaded correctly.
      
      Fixes: f7536ffb ("nfp: flower: Allow ipv6gretap interface for offloading")
      Signed-off-by: Danie du Toit <danie.dutoit@corigine.com>
      Signed-off-by: Louis Peens <louis.peens@corigine.com>
      Signed-off-by: Simon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20220217124820.40436-1-louis.peens@corigine.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ipv6: fix data-race in fib6_info_hw_flags_set / fib6_purge_rt · d95d6320
      Eric Dumazet authored
      Because fib6_info_hw_flags_set() is called without any synchronization,
      all accesses to f6i->offload, f6i->trap and f6i->offload_failed
      need some basic protection like READ_ONCE()/WRITE_ONCE().
      
      BUG: KCSAN: data-race in fib6_info_hw_flags_set / fib6_purge_rt
      
      read to 0xffff8881087d5886 of 1 bytes by task 13953 on cpu 0:
       fib6_drop_pcpu_from net/ipv6/ip6_fib.c:1007 [inline]
       fib6_purge_rt+0x4f/0x580 net/ipv6/ip6_fib.c:1033
       fib6_del_route net/ipv6/ip6_fib.c:1983 [inline]
       fib6_del+0x696/0x890 net/ipv6/ip6_fib.c:2028
       __ip6_del_rt net/ipv6/route.c:3876 [inline]
       ip6_del_rt+0x83/0x140 net/ipv6/route.c:3891
       __ipv6_dev_ac_dec+0x2b5/0x370 net/ipv6/anycast.c:374
       ipv6_dev_ac_dec net/ipv6/anycast.c:387 [inline]
       __ipv6_sock_ac_close+0x141/0x200 net/ipv6/anycast.c:207
       ipv6_sock_ac_close+0x79/0x90 net/ipv6/anycast.c:220
       inet6_release+0x32/0x50 net/ipv6/af_inet6.c:476
       __sock_release net/socket.c:650 [inline]
       sock_close+0x6c/0x150 net/socket.c:1318
       __fput+0x295/0x520 fs/file_table.c:280
       ____fput+0x11/0x20 fs/file_table.c:313
       task_work_run+0x8e/0x110 kernel/task_work.c:164
       tracehook_notify_resume include/linux/tracehook.h:189 [inline]
       exit_to_user_mode_loop kernel/entry/common.c:175 [inline]
       exit_to_user_mode_prepare+0x160/0x190 kernel/entry/common.c:207
       __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
       syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:300
       do_syscall_64+0x50/0xd0 arch/x86/entry/common.c:86
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      write to 0xffff8881087d5886 of 1 bytes by task 1912 on cpu 1:
       fib6_info_hw_flags_set+0x155/0x3b0 net/ipv6/route.c:6230
       nsim_fib6_rt_hw_flags_set drivers/net/netdevsim/fib.c:668 [inline]
       nsim_fib6_rt_add drivers/net/netdevsim/fib.c:691 [inline]
       nsim_fib6_rt_insert drivers/net/netdevsim/fib.c:756 [inline]
       nsim_fib6_event drivers/net/netdevsim/fib.c:853 [inline]
       nsim_fib_event drivers/net/netdevsim/fib.c:886 [inline]
       nsim_fib_event_work+0x284f/0x2cf0 drivers/net/netdevsim/fib.c:1477
       process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
       worker_thread+0x616/0xa70 kernel/workqueue.c:2454
       kthread+0x2c7/0x2e0 kernel/kthread.c:327
       ret_from_fork+0x1f/0x30
      
      value changed: 0x22 -> 0x2a
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 1912 Comm: kworker/1:3 Not tainted 5.16.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: events nsim_fib_event_work
      
      Fixes: 0c5fcf9e ("IPv6: Add "offload failed" indication to routes")
      Fixes: bb3c4ab9 ("ipv6: Add "offload" and "trap" indications to routes")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Amit Cohen <amcohen@nvidia.com>
      Cc: Ido Schimmel <idosch@nvidia.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Link: https://lore.kernel.org/r/20220216173217.3792411-2-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ipv4: fix data races in fib_alias_hw_flags_set · 9fcf986c
      Eric Dumazet authored
      fib_alias_hw_flags_set() can be used by concurrent threads,
      and is only RCU protected.
      
      We need to annotate accesses to the following fields of struct fib_alias:
      
          offload, trap, offload_failed
      
      Because of READ_ONCE()/WRITE_ONCE() limitations, make these
      fields u8.
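The reason for switching to u8 fields can be shown with a small userspace sketch. The macros below are simplified stand-ins for the kernel's READ_ONCE()/WRITE_ONCE() (they rely on the GCC/Clang `__typeof__` extension), and `struct hw_flags` and `hw_flags_set()` are hypothetical names rather than the kernel's struct fib_alias code.

```c
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel macros (GCC/Clang only). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/*
 * Hypothetical struct. If these were bit-fields ('uint8_t offload:1;'),
 * taking &(x) inside the macros would not compile, and a plain store to
 * one flag would rewrite the byte shared with its neighbours.
 */
struct hw_flags {
	uint8_t offload;
	uint8_t trap;
	uint8_t offload_failed;
};

static void hw_flags_set(struct hw_flags *f, uint8_t offload,
			 uint8_t trap, uint8_t offload_failed)
{
	/* Skip the stores when nothing changes, using marked reads. */
	if (READ_ONCE(f->offload) == offload &&
	    READ_ONCE(f->trap) == trap &&
	    READ_ONCE(f->offload_failed) == offload_failed)
		return;

	/* Each flag lives in its own byte, so each store is independent. */
	WRITE_ONCE(f->offload, offload);
	WRITE_ONCE(f->trap, trap);
	WRITE_ONCE(f->offload_failed, offload_failed);
}
```

These macros only prevent the compiler from tearing, fusing or re-reading the marked accesses; they are not a substitute for locking where several fields must change together.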
      
      BUG: KCSAN: data-race in fib_alias_hw_flags_set / fib_alias_hw_flags_set
      
      read to 0xffff888134224a6a of 1 bytes by task 2013 on cpu 1:
       fib_alias_hw_flags_set+0x28a/0x470 net/ipv4/fib_trie.c:1050
       nsim_fib4_rt_hw_flags_set drivers/net/netdevsim/fib.c:350 [inline]
       nsim_fib4_rt_add drivers/net/netdevsim/fib.c:367 [inline]
       nsim_fib4_rt_insert drivers/net/netdevsim/fib.c:429 [inline]
       nsim_fib4_event drivers/net/netdevsim/fib.c:461 [inline]
       nsim_fib_event drivers/net/netdevsim/fib.c:881 [inline]
       nsim_fib_event_work+0x1852/0x2cf0 drivers/net/netdevsim/fib.c:1477
       process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
       process_scheduled_works kernel/workqueue.c:2370 [inline]
       worker_thread+0x7df/0xa70 kernel/workqueue.c:2456
       kthread+0x1bf/0x1e0 kernel/kthread.c:377
       ret_from_fork+0x1f/0x30
      
      write to 0xffff888134224a6a of 1 bytes by task 4872 on cpu 0:
       fib_alias_hw_flags_set+0x2d5/0x470 net/ipv4/fib_trie.c:1054
       nsim_fib4_rt_hw_flags_set drivers/net/netdevsim/fib.c:350 [inline]
       nsim_fib4_rt_add drivers/net/netdevsim/fib.c:367 [inline]
       nsim_fib4_rt_insert drivers/net/netdevsim/fib.c:429 [inline]
       nsim_fib4_event drivers/net/netdevsim/fib.c:461 [inline]
       nsim_fib_event drivers/net/netdevsim/fib.c:881 [inline]
       nsim_fib_event_work+0x1852/0x2cf0 drivers/net/netdevsim/fib.c:1477
       process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
       process_scheduled_works kernel/workqueue.c:2370 [inline]
       worker_thread+0x7df/0xa70 kernel/workqueue.c:2456
       kthread+0x1bf/0x1e0 kernel/kthread.c:377
       ret_from_fork+0x1f/0x30
      
      value changed: 0x00 -> 0x02
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 4872 Comm: kworker/0:0 Not tainted 5.17.0-rc3-syzkaller-00188-g1d41d2e8-dirty #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: events nsim_fib_event_work
      
      Fixes: 90b93f1b ("ipv4: Add "offload" and "trap" indications to routes")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Link: https://lore.kernel.org/r/20220216173217.3792411-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: lan9303: add VLAN IDs to master device · 430065e2
      Mans Rullgard authored
      If the master device does VLAN filtering, the IDs used by the switch
      must be added for any frames to be received.  Do this in the
      port_enable() function, and remove them in port_disable().
      
      Fixes: a1292595 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303")
      Signed-off-by: Mans Rullgard <mans@mansr.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20220216204818.28746-1-mans@mansr.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: lan9303: handle hwaccel VLAN tags · 017b355b
      Mans Rullgard authored
      Check for a hwaccel VLAN tag on rx and use it if present.  Otherwise,
      use __skb_vlan_pop() like the other tag parsers do.  This fixes the case
      where the VLAN tag has already been consumed by the master.
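The "use the hwaccel tag if present, otherwise parse it from the frame" logic can be sketched as follows. This is a userspace model under assumptions: `struct rx_frame` and `rx_get_vlan_tci()` are hypothetical stand-ins for the skb handling, and the in-band branch simply reads a standard 802.1Q header rather than calling __skb_vlan_pop().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_P_8021Q 0x8100

/* Hypothetical stand-in for the parts of an skb that matter here. */
struct rx_frame {
	bool hwaccel_tag_present;  /* tag already stripped by the master */
	uint16_t hwaccel_tci;
	const uint8_t *data;       /* starts at the destination MAC */
	size_t len;
};

/* Returns true and stores the 16-bit TCI if a VLAN tag is found. */
static bool rx_get_vlan_tci(const struct rx_frame *f, uint16_t *tci)
{
	/* Prefer the out-of-band tag: the bytes are no longer in the frame. */
	if (f->hwaccel_tag_present) {
		*tci = f->hwaccel_tci;
		return true;
	}

	/* In-band 802.1Q tag: TPID at offset 12, TCI at offset 14. */
	if (f->len >= 16 &&
	    (((uint16_t)f->data[12] << 8) | f->data[13]) == ETH_P_8021Q) {
		*tci = (uint16_t)(((uint16_t)f->data[14] << 8) | f->data[15]);
		return true;
	}
	return false;
}
```

Checking the out-of-band tag first matters because a parser that only looks at the frame bytes would miss a tag the master device has already consumed, which is the case the commit fixes.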
      
      Fixes: a1292595 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303")
      Signed-off-by: Mans Rullgard <mans@mansr.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20220216124634.23123-1-mans@mansr.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • mm: don't try to NUMA-migrate COW pages that have other uses · 80d47f5d
      Linus Torvalds authored
      Oded Gabbay reports that enabling NUMA balancing causes corruption with
      his Gaudi accelerator test load:
      
       "All the details are in the bug, but the bottom line is that somehow,
        this patch causes corruption when the numa balancing feature is
        enabled AND we don't use process affinity AND we use GUP to pin pages
        so our accelerator can DMA to/from system memory.
      
        Either disabling numa balancing, using process affinity to bind to
        specific numa-node or reverting this patch causes the bug to
        disappear"
      
      and Oded bisected the issue to commit 09854ba9 ("mm: do_wp_page()
      simplification").
      
      Now, the NUMA balancing shouldn't actually be changing the writability
      of a page, and as such shouldn't matter for COW.  But it appears it
      does.  Suspicious.
      
      However, regardless of that, the condition for enabling NUMA faults in
      change_pte_range() is nonsensical.  It uses "page_mapcount(page)" to
      decide if a COW page should be NUMA-protected or not, and that makes
      absolutely no sense.
      
      The number of mappings a page has is irrelevant: not only does GUP get a
      reference to a page as in Oded's case, but the other mappings might be
      paged out and the only reference to them would be in the page count.
      
      Since we should never try to NUMA-balance a page that we can't move
      anyway due to other references, just fix the code to use 'page_count()'.
      Oded confirms that that fixes his issue.
      
      Now, this does imply that something in NUMA balancing ends up changing
      page protections (other than the obvious one of making the page
      inaccessible to get the NUMA faulting information).  Otherwise the COW
      simplification wouldn't matter - since doing the GUP on the page would
      make sure it's writable.
      
      The cause of that permission change would be good to figure out too,
      since it clearly results in spurious COW events - but fixing the
      nonsensical test that just happened to work before is obviously the
      CorrectThing(tm) to do regardless.
      
      Fixes: 09854ba9 ("mm: do_wp_page() simplification")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=215616
      Link: https://lore.kernel.org/all/CAFCwf10eNmwq2wD71xjUhqkvv5+_pJMR1nPug2RqNDcFT4H86Q@mail.gmail.com/
      Reported-and-tested-by: Oded Gabbay <oded.gabbay@gmail.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Peter Xu <peterx@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      80d47f5d
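The difference between the two tests in the NUMA commit above can be illustrated with a minimal Python model. This is a sketch, not kernel code: the `Page` class and the two `skip_numa_protect_*` function names are hypothetical, and the model only captures the idea that a GUP pin raises the reference count without adding a mapping.

```python
from dataclasses import dataclass


@dataclass
class Page:
    mapcount: int  # number of page-table mappings of the page
    refcount: int  # all references, including GUP pins (hypothetical field name)


def skip_numa_protect_old(page: Page) -> bool:
    """Old test: skip NUMA protection only when the page is mapped more than once."""
    return page.mapcount != 1


def skip_numa_protect_new(page: Page) -> bool:
    """New test: skip NUMA protection whenever anything else holds a reference."""
    return page.refcount != 1


# A COW page mapped once but pinned by a device through GUP.
pinned = Page(mapcount=1, refcount=2)
# A COW page mapped once with no other users.
private = Page(mapcount=1, refcount=1)
```

In this model the old test would still NUMA-protect the pinned page, while the new test leaves it alone; a page with no other users is treated the same by both.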
    • Seth Forshee's avatar
      vsock: remove vsock from connected table when connect is interrupted by a signal · b9208492
      Seth Forshee authored
      vsock_connect() expects that the socket could already be in the
      TCP_ESTABLISHED state when the connecting task wakes up with a signal
      pending. If this happens the socket will be in the connected table, and
      it is not removed when the socket state is reset. In this situation it's
      common for the process to retry connect(), and if the connection is
      successful the socket will be added to the connected table a second
      time, corrupting the list.
      
      Prevent this by calling vsock_remove_connected() if a signal is received
      while waiting for a connection. This is harmless if the socket is not in
      the connected table, and if it is in the table then removing it will
      prevent list corruption from a double add.
      
      Note for backporting: this patch requires d5afa82c ("vsock: correct
      removal of socket from the list"), which is in all current stable trees
      except 4.9.y.
      
      Fixes: d021c344 ("VSOCK: Introduce VM Sockets")
      Signed-off-by: Seth Forshee <sforshee@digitalocean.com>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Link: https://lore.kernel.org/r/20220217141312.2297547-1-sforshee@digitalocean.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b9208492
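The double add described in the vsock commit above can be sketched with a small Python model. Everything here is an assumption for illustration: `ConnectedTable` stands in for the connected table, and `connect_interrupted_then_retry` replays the sequence "connection established, signal arrives, connect() retried". It does not model the kernel's list structure, only the fact that an unconditional remove before the retry prevents a second entry.

```python
class ConnectedTable:
    """Hypothetical stand-in for the vsock connected table."""

    def __init__(self):
        self._entries = []

    def insert(self, sock):
        self._entries.append(sock)

    def remove(self, sock):
        # Harmless when the socket is not in the table.
        if sock in self._entries:
            self._entries.remove(sock)

    def count(self, sock):
        return self._entries.count(sock)


def connect_interrupted_then_retry(remove_on_signal: bool) -> int:
    """Return how many times the socket ends up in the table."""
    table = ConnectedTable()
    sock = object()
    table.insert(sock)      # connection completes just before the signal is seen
    if remove_on_signal:
        table.remove(sock)  # the fix: drop the entry when the wait is interrupted
    table.insert(sock)      # the process retries connect() and succeeds
    return table.count(sock)
```

Calling `connect_interrupted_then_retry(False)` leaves two entries for the same socket, which is the corruption case; with `True` there is exactly one.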
    • Jonas Gorski's avatar
      Revert "net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname" · 6aba04ee
      Jonas Gorski authored
      This reverts commit 3710e809.
      
      Since idm_base and nicpm_base are still optional resources not present
      on all platforms, this breaks the driver for everything except Northstar
      2 (which has both).
      
      The same change was already reverted once with 755f5738 ("net:
      broadcom: fix a mistake about ioremap resource").
      
      So let's do it again.
      
      Fixes: 3710e809 ("net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname")
      Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
      [florian: Added comments to explain the resources are optional]
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20220216184634.2032460-1-f.fainelli@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      6aba04ee
    • Eric W. Biederman's avatar
      ucounts: Handle wrapping in is_ucounts_overlimit · 0cbae9e2
      Eric W. Biederman authored
      While examining is_ucounts_overlimit and reading the various messages
      I realized that is_ucounts_overlimit fails to deal with counts that
      may have wrapped.
      
      Being wrapped should be a transitory state for counts and they should
      never be wrapped for long, but it can happen so handle it.
      
      Cc: stable@vger.kernel.org
      Fixes: 21d1c5e3 ("Reimplement RLIMIT_NPROC on top of ucounts")
      Link: https://lkml.kernel.org/r/20220216155832.680775-5-ebiederm@xmission.com
      Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      0cbae9e2
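The wrap handling described in the is_ucounts_overlimit commit above can be sketched as follows. This is a minimal Python model under stated assumptions, not the kernel function: `is_overlimit`, `to_signed_long`, the 64-bit `long`, and the flat list of counts with a single shared limit are all hypothetical simplifications. The point it shows is that a counter that reads back negative is treated as over the limit instead of being compared naively.

```python
LONG_MAX = 2**63 - 1  # assumption: 64-bit signed long


def to_signed_long(raw: int) -> int:
    """Interpret a raw 64-bit counter value as a signed long (two's complement)."""
    raw &= 2**64 - 1
    return raw - 2**64 if raw > LONG_MAX else raw


def is_overlimit(counts, rlimit: int) -> bool:
    """Return True if any count exceeds the limit or has wrapped negative.

    `counts` is a list of raw counter values, one per level checked
    (a hypothetical flattening; the model uses one limit for all levels).
    """
    limit = min(rlimit, LONG_MAX)  # clamp very large limits to the signed range
    for raw in counts:
        val = to_signed_long(raw)
        if val < 0 or val > limit:
            return True
    return False
```

With an effectively unlimited `rlimit`, a naive unsigned comparison would never report a wrapped counter; in this sketch `is_overlimit([2**63 + 1], 2**64 - 1)` reports it because the value is negative once read as signed.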
    • Eric W. Biederman's avatar
      ucounts: Move RLIMIT_NPROC handling after set_user · c923a8e7
      Eric W. Biederman authored
      During set*id(), which cred->ucounts to charge the current process
      to is not known until after set_cred_ucounts.  So move the
      RLIMIT_NPROC checking into a new helper flag_nproc_exceeded and call
      flag_nproc_exceeded after set_cred_ucounts.
      
      This is very much an arbitrary subset of the places where we currently
      change the RLIMIT_NPROC accounting, designed to preserve the existing
      logic.
      
      Fixing the existing logic will be the subject of another series of
      changes.
      
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20220216155832.680775-4-ebiederm@xmission.com
      Fixes: 21d1c5e3 ("Reimplement RLIMIT_NPROC on top of ucounts")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      c923a8e7
    • Eric W. Biederman's avatar
      ucounts: Base set_cred_ucounts changes on the real user · a55d0729
      Eric W. Biederman authored
      Michal Koutný <mkoutny@suse.com> wrote:
      > Tasks are associated to multiple users at once. Historically and as per
      > setrlimit(2) RLIMIT_NPROC is enforced based on real user ID.
      >
      > The commit 21d1c5e3 ("Reimplement RLIMIT_NPROC on top of ucounts")
      > made the accounting structure "indexed" by euid and hence potentially
      > account tasks differently.
      >
      > The effective user ID may be different e.g. for setuid programs but
      > those are exec'd into already existing task (i.e. below limit), so
      > different accounting is moot.
      >
      > Some special setresuid(2) users may notice the difference, justifying
      > this fix.
      
      I looked at cred->ucounts and it is only used for rlimit operations
      that were previously stored in cred->user.  That makes the fact that
      cred->ucounts can refer to a different user from cred->user a bug,
      affecting all uses of cred->ulimit, not just RLIMIT_NPROC.
      
      Fix set_cred_ucounts to always use the real uid not the effective uid.
      
      Further simplify set_cred_ucounts by noticing that set_cred_ucounts
      somehow retained a draft version of the check to see if alloc_ucounts
      was needed that checks the new->user and new->user_ns against the
      current_real_cred().  Remove that draft version of the check.
      
      All that matters for setting the cred->ucounts are the user_ns and uid
      fields in the cred.
      
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20220207121800.5079-4-mkoutny@suse.com
      Link: https://lkml.kernel.org/r/20220216155832.680775-3-ebiederm@xmission.com
      Reported-by: Michal Koutný <mkoutny@suse.com>
      Reviewed-by: Michal Koutný <mkoutny@suse.com>
      Fixes: 21d1c5e3 ("Reimplement RLIMIT_NPROC on top of ucounts")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      a55d0729
    • Eric W. Biederman's avatar
      ucounts: Enforce RLIMIT_NPROC not RLIMIT_NPROC+1 · 8f2f9c4d
      Eric W. Biederman authored
      Michal Koutný <mkoutny@suse.com> wrote:
      
      > It was reported that v5.14 behaves differently when enforcing
      > RLIMIT_NPROC limit, namely, it allows one more task than previously.
      > This is consequence of the commit 21d1c5e3 ("Reimplement
      > RLIMIT_NPROC on top of ucounts") that missed the sharpness of
      > equality in the forking path.
      
      This can be fixed either by fixing the test or by moving the increment
      to be before the test.  Fix it by moving copy_creds, which contains
      the increment, before is_ucounts_overlimit.
      
      In the case of CLONE_NEWUSER the ucounts in the task_cred changes.
      The function is_ucounts_overlimit needs to use the final version of
      the ucounts for the new process.  Which means moving the
      is_ucounts_overlimit test after copy_creds is necessary.
      
      Both the test in fork and the test in set_user were semantically
      changed when the code moved to ucounts.  The change of the test in
      fork was bad because it was before the increment.  The test in
      set_user was wrong and the change to ucounts fixed it.  So this
      fix only restores the old behavior in one location, not two.
      
      Link: https://lkml.kernel.org/r/20220204181144.24462-1-mkoutny@suse.com
      Link: https://lkml.kernel.org/r/20220216155832.680775-2-ebiederm@xmission.com
      Cc: stable@vger.kernel.org
      Reported-by: Michal Koutný <mkoutny@suse.com>
      Reviewed-by: Michal Koutný <mkoutny@suse.com>
      Fixes: 21d1c5e3 ("Reimplement RLIMIT_NPROC on top of ucounts")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      8f2f9c4d
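The off-by-one in the RLIMIT_NPROC commit above comes purely from the order of the test and the increment. A minimal Python sketch, assuming a strict "count > limit" test and a hypothetical helper name `spawn_until_refused`, shows how many tasks exist when the first fork is refused under each ordering:

```python
def spawn_until_refused(limit: int, increment_first: bool) -> int:
    """Fork repeatedly and return the task count when a fork is first refused."""
    count = 0
    while True:
        if increment_first:
            count += 1          # charge the new task first (copy_creds-style)
            if count > limit:   # the test sees the count including the new task
                count -= 1      # undo the charge for the refused task
                return count
        else:
            if count > limit:   # the test sees the count without the new task
                return count
            count += 1
```

With `limit=3`, testing before the increment lets a fourth task through, while incrementing first stops at three. The starting count of zero and the rollback on refusal are simplifications of the model, not statements about the kernel's cleanup path.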
    • Eric W. Biederman's avatar
      rlimit: Fix RLIMIT_NPROC enforcement failure caused by capability calls in set_user · c16bdeb5
      Eric W. Biederman authored
      Solar Designer <solar@openwall.com> wrote:
      > I'm not aware of anyone actually running into this issue and reporting
      > it.  The systems that I personally know use suexec along with rlimits
      > still run older/distro kernels, so would not yet be affected.
      >
      > So my mention was based on my understanding of how suexec works, and
      > code review.  Specifically, Apache httpd has the setting RLimitNPROC,
      > which makes it set RLIMIT_NPROC:
      >
      > https://httpd.apache.org/docs/2.4/mod/core.html#rlimitnproc
      >
      > The above documentation for it includes:
      >
      > "This applies to processes forked from Apache httpd children servicing
      > requests, not the Apache httpd children themselves. This includes CGI
      > scripts and SSI exec commands, but not any processes forked from the
      > Apache httpd parent, such as piped logs."
      >
      > In code, there are:
      >
      > ./modules/generators/mod_cgid.c:        ( (cgid_req.limits.limit_nproc_set) && ((rc = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC,
      > ./modules/generators/mod_cgi.c:        ((rc = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC,
      > ./modules/filters/mod_ext_filter.c:    rv = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC, conf->limit_nproc);
      >
      > For example, in mod_cgi.c this is in run_cgi_child().
      >
      > I think this means an httpd child sets RLIMIT_NPROC shortly before it
      > execs suexec, which is a SUID root program.  suexec then switches to the
      > target user and execs the CGI script.
      >
      > Before 2863643f, the setuid() in suexec would set the flag, and the
      > target user's process count would be checked against RLIMIT_NPROC on
      > execve().  After 2863643f, the setuid() in suexec wouldn't set the
      > flag because setuid() is (naturally) called when the process is still
      > running as root (thus, has those limits bypass capabilities), and
      > accordingly execve() would not check the target user's process count
      > against RLIMIT_NPROC.
      
      In commit 2863643f ("set_user: add capability check when
      rlimit(RLIMIT_NPROC) exceeds") capable calls were added to set_user to
      make it more consistent with fork.  Unfortunately because of call site
      differences those capable calls were checking the credentials of the
      user before set*id() instead of after set*id().
      
      This breaks enforcement of RLIMIT_NPROC for applications that set the
      rlimit and then call set*id() while holding a full set of
      capabilities.  The capabilities are only changed in the new credential
      in security_task_fix_setuid().
      
      The code in apache suexec appears to follow this pattern.
      
      Commit 909cc4ae ("[PATCH] Fix two bugs with process limits
      (RLIMIT_NPROC)") where this check was added describes the target of this
      capability check as:
      
        2/ When a root-owned process (e.g. cgiwrap) sets up process limits and then
            calls setuid, the setuid should fail if the user would then be running
            more than rlim_cur[RLIMIT_NPROC] processes, but it doesn't.  This patch
            adds an appropriate test.  With this patch, and per-user process limit
            imposed in cgiwrap really works.
      
      So the original use case of this check also appears to match the broken
      pattern.
      
      Restore the enforcement of RLIMIT_NPROC by removing the bad capable
      checks added in set_user.  This unfortunately restores the
      inconsistent state the code has been in for the last 11 years, but
      dealing with the inconsistencies looks like a larger problem.
      
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/all/20210907213042.GA22626@openwall.com/
      Link: https://lkml.kernel.org/r/20220212221412.GA29214@openwall.com
      Link: https://lkml.kernel.org/r/20220216155832.680775-1-ebiederm@xmission.com
      Fixes: 2863643f ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds")
      History-Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
      Reviewed-by: Solar Designer <solar@openwall.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      c16bdeb5
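The call-site problem in the set_user commit above can be sketched in a few lines of Python. This is an illustration under assumptions, not the kernel logic: the function names, the boolean `caller_has_caps`, and the strict `>` comparison are all hypothetical. It models only the point that the capability check looked at the credentials held before set*id(), so a root-owned wrapper always passed it.

```python
def nproc_flag_with_capable_check(target_count: int, rlimit: int,
                                  caller_has_caps: bool) -> bool:
    """Behaviour with the capable checks: capabilities of the pre-set*id()
    credentials suppress the 'limit exceeded' flag."""
    return target_count > rlimit and not caller_has_caps


def nproc_flag_without_capable_check(target_count: int, rlimit: int) -> bool:
    """Behaviour after removing the checks: only the target user's process
    count and the limit decide whether the flag is set."""
    return target_count > rlimit


# A suexec-like wrapper: still root (full capabilities) when it calls setuid(),
# switching to a user who already runs more processes than the limit allows.
target_count, rlimit = 12, 10
```

For the wrapper case, the first function never raises the flag, so the later execve() has nothing to enforce; the second function raises it.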
    • Xin Long's avatar
      ping: fix the dif and sdif check in ping_lookup · 35a79e64
      Xin Long authored
      When 'ping' is switched to use a PING socket instead of a RAW socket by:
      
         # sysctl -w net.ipv4.ping_group_range="0 100"
      
      There is another regression when matching sk_bound_dev_if and dif:
      the RAW socket uses inet_iif() while the PING socket lookup uses
      skb->dev->ifindex, so the commands below fail:
      
        # ip link add dummy0 type dummy
        # ip link set dummy0 up
        # ip addr add 192.168.111.1/24 dev dummy0
        # ping -I dummy0 192.168.111.1 -c1
      
      The issue was also reported on:
      
        https://github.com/iputils/iputils/issues/104
      
      But it was fixed in iputils in the wrong way, by not binding to the
      device when the destination IP is on that device, which causes some
      kselftests to fail, as Jianlin noticed.
      
      This patch is to use inet(6)_iif and inet(6)_sdif to get dif and
      sdif for PING socket, and keep consistent with RAW socket.
      
      Fixes: c319b4d7 ("net: ipv4: add IPPROTO_ICMP socket kind")
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      35a79e64
    • Laibin Qiu's avatar
      block/wbt: fix negative inflight counter when remove scsi device · e92bc4cd
      Laibin Qiu authored
      We now disable wbt by setting WBT_STATE_OFF_DEFAULT in
      wbt_disable_default() when the elevator is switched to bfq. When
      the scsi device is removed, wbt is enabled again by
      wbt_enable_default(). If the enable state flips between wbt_wait()
      and wbt_track() while a write request is being submitted, the
      inflight accounting goes wrong.
      
      The following is the scenario that triggered the problem.
      
      T1                          T2                           T3
                                  elevator_switch_mq
                                  bfq_init_queue
                                  wbt_disable_default <= Set
                                  rwb->enable_state (OFF)
      submit_bio
      blk_mq_make_request
      rq_qos_throttle
      <= rwb->enable_state (OFF)
                                                               scsi_remove_device
                                                               sd_remove
                                                               del_gendisk
                                                               blk_unregister_queue
                                                               elv_unregister_queue
                                                               wbt_enable_default
                                                               <= Set rwb->enable_state (ON)
      rq_qos_track
      <= rwb->enable_state (ON)
      ^^^^^^ this request is marked WBT_TRACKED without an inflight increment,
      so wbt_done() drops rqw->inflight to -1, which triggers an IO hang.
      
      Fix this by moving wbt_enable_default() from elv_unregister to
      bfq_exit_queue(), so that wbt is only re-enabled when bfq exits.
      
      Fixes: 76a80408 ("blk-wbt: make sure throttle is enabled properly")
      
      Remove oneline stale comment, and kill one oneshot local variable.
      Signed-off-by: Ming Lei <ming.lei@rehdat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/linux-block/20211214133103.551813-1-qiulaibin@huawei.com/
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      e92bc4cd
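The interleaving in the wbt commit above can be replayed with a small Python model. It is a sketch under assumptions: the `Wbt` and `Request` classes and the three functions are hypothetical stand-ins for the throttle, track, and done steps, and the model keeps only the enable flag, the tracked mark, and the inflight counter.

```python
class Wbt:
    def __init__(self, enabled: bool):
        self.enabled = enabled
        self.inflight = 0


class Request:
    def __init__(self):
        self.tracked = False


def throttle(wbt: Wbt, rq: Request) -> None:
    """Throttle step: account the request as inflight only while wbt is enabled."""
    if wbt.enabled:
        wbt.inflight += 1


def track(wbt: Wbt, rq: Request) -> None:
    """Track step: mark the request as tracked only while wbt is enabled."""
    if wbt.enabled:
        rq.tracked = True


def done(wbt: Wbt, rq: Request) -> None:
    """Completion step: a tracked request gives back its inflight slot."""
    if rq.tracked:
        wbt.inflight -= 1


def replay(enable_between_steps: bool) -> int:
    wbt = Wbt(enabled=False)     # T2: switching to bfq disabled wbt
    rq = Request()
    throttle(wbt, rq)            # T1: submission sees wbt off, no inflight add
    if enable_between_steps:
        wbt.enabled = True       # T3: device removal re-enables wbt
    track(wbt, rq)               # T1: request may now be marked tracked
    done(wbt, rq)
    return wbt.inflight
```

`replay(True)` ends with the counter at -1, matching the scenario in the commit message, while `replay(False)` leaves it at 0.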
    • Christoph Hellwig's avatar
      block: fix surprise removal for drivers calling blk_set_queue_dying · 7a5428dc
      Christoph Hellwig authored
      Various block drivers call blk_set_queue_dying to mark a disk as dead due
      to surprise removal events, but since commit 8e141f9e that doesn't
      work given that the GD_DEAD flag needs to be set to stop I/O.
      
      Replace the driver calls to blk_set_queue_dying with a new (and properly
      documented) blk_mark_disk_dead API, and fold blk_set_queue_dying into the
      only remaining caller.
      
      Fixes: 8e141f9e ("block: drain file system I/O on del_gendisk")
      Reported-by: Markus Blöchl <markus.bloechl@ipetronik.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Link: https://lore.kernel.org/r/20220217075231.1140-1-hch@lst.de
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      7a5428dc
    • Haimin Zhang's avatar
      block-map: add __GFP_ZERO flag for alloc_page in function bio_copy_kern · cc8f7fe1
      Haimin Zhang authored
      Add __GFP_ZERO flag for alloc_page in function bio_copy_kern to initialize
      the buffer of a bio.
      Signed-off-by: Haimin Zhang <tcs.kernel@gmail.com>
      Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20220216084038.15635-1-tcs.kernel@gmail.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      cc8f7fe1
    • Daniele Palmas's avatar
      net: usb: cdc_mbim: avoid altsetting toggling for Telit FN990 · 21e8a963
      Daniele Palmas authored
      Add quirk CDC_MBIM_FLAG_AVOID_ALTSETTING_TOGGLE for Telit FN990
      0x1071 composition in order to avoid bind error.
      Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      21e8a963
    • Arnaldo Carvalho de Melo's avatar
      perf bpf: Defer freeing string after possible strlen() on it · 31ded153
      Arnaldo Carvalho de Melo authored
      This was detected by the gcc in Fedora Rawhide:
      
        50    11.01 fedora:rawhide                : FAIL gcc version 12.0.1 20220205 (Red Hat 12.0.1-0) (GCC)
              inlined from 'bpf__config_obj' at util/bpf-loader.c:1242:9:
          util/bpf-loader.c:1225:34: error: pointer 'map_opt' may be used after 'free' [-Werror=use-after-free]
           1225 |                 *key_scan_pos += strlen(map_opt);
                |                                  ^~~~~~~~~~~~~~~
          util/bpf-loader.c:1223:9: note: call to 'free' here
           1223 |         free(map_name);
                |         ^~~~~~~~~~~~~~
          cc1: all warnings being treated as errors
      
      So do the calculations on the pointer before freeing it.
      
      Fixes: 04f9bf2b ("perf bpf-loader: Add missing '*' for key_scan_pos")
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang ShaoBo <bobo.shaobowang@huawei.com>
      Link: https://lore.kernel.org/lkml/Yg1VtQxKrPpS3uNA@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      31ded153