1. 13 Jun, 2020 3 commits
    • Merge tag 'ras-core-2020-06-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a9429089
      Linus Torvalds authored
      Pull x86 RAS updates from Thomas Gleixner:
       "RAS updates from Borislav Petkov:
      
         - Unmap a whole guest page if an MCE is encountered in it to avoid
           follow-on MCEs leading to the guest crashing, by Tony Luck.
      
           This change collided with the entry changes and the merge
           resolution would have been rather unpleasant. To avoid that the
           entry branch was merged in before applying this. The resulting code
           did not change over the rebase.
      
         - AMD MCE error thresholding machinery cleanup and hotplug
           sanitization, by Thomas Gleixner.
      
         - Change the MCE notifiers to denote whether they have handled the
           error and not break the chain early by returning NOTIFY_STOP, thus
           giving the opportunity for the later handlers in the chain to see
           it. By Tony Luck.
      
         - Add AMD family 0x17, models 0x60-6f support, by Alexander Monakov.
      
         - Last but not least, the usual round of fixes and improvements"
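
The notifier change described above can be pictured with a small userspace model. This is only a sketch: the `sketch_` names, the flag bits and the return code are invented for illustration and are not the kernel's definitions. It shows handlers recording "handled" in a bitmask while always letting the chain continue, so later handlers still see the event.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits and return code, named for this sketch only. */
#define SKETCH_HANDLED_CEC   (1u << 0)
#define SKETCH_HANDLED_EDAC  (1u << 1)
#define SKETCH_NOTIFY_OK     0

struct sketch_mce {
    uint64_t addr;    /* address the error was reported for */
    uint32_t kflags;  /* one bit per handler that dealt with the error */
};

typedef int (*sketch_notifier_fn)(struct sketch_mce *m);

/* Each handler records that it handled the event, then lets the chain go on. */
static int sketch_cec_notifier(struct sketch_mce *m)
{
    m->kflags |= SKETCH_HANDLED_CEC;
    return SKETCH_NOTIFY_OK;
}

static int sketch_edac_notifier(struct sketch_mce *m)
{
    m->kflags |= SKETCH_HANDLED_EDAC;
    return SKETCH_NOTIFY_OK;
}

/* Walk every handler in order; returns how many were invoked. */
static size_t sketch_run_chain(struct sketch_mce *m,
                               const sketch_notifier_fn *chain, size_t n)
{
    size_t called = 0;

    assert(m != NULL && chain != NULL);
    for (size_t i = 0; i < n; i++) {
        chain[i](m);   /* no handler can stop the walk early */
        called++;
    }
    return called;
}

/* A logger at the end of the chain only has to report unhandled errors. */
static int sketch_needs_default_log(const struct sketch_mce *m)
{
    return m->kflags == 0;
}
```

Calling `sketch_run_chain()` with both handlers leaves both bits set in `kflags`, so `sketch_needs_default_log()` returns 0 afterwards.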
      
      * tag 'ras-core-2020-06-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
        x86/mce/dev-mcelog: Fix -Wstringop-truncation warning about strncpy()
        x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned
        EDAC/amd64: Add AMD family 17h model 60h PCI IDs
        hwmon: (k10temp) Add AMD family 17h model 60h PCI match
        x86/amd_nb: Add AMD family 17h model 60h PCI IDs
        x86/mcelog: Add compat_ioctl for 32-bit mcelog support
        x86/mce: Drop bogus comment about mce.kflags
        x86/mce: Fixup exception only for the correct MCEs
        EDAC: Drop the EDAC report status checks
        x86/mce: Add mce=print_all option
        x86/mce: Change default MCE logger to check mce->kflags
        x86/mce: Fix all mce notifiers to update the mce->kflags bitmask
        x86/mce: Add a struct mce.kflags field
        x86/mce: Convert the CEC to use the MCE notifier
        x86/mce: Rename "first" function as "early"
        x86/mce/amd, edac: Remove report_gart_errors
        x86/mce/amd: Make threshold bank setting hotplug robust
        x86/mce/amd: Cleanup threshold device remove path
        x86/mce/amd: Straighten CPU hotplug path
        x86/mce/amd: Sanitize thresholding device creation hotplug path
        ...
    • Merge tag 'x86-entry-2020-06-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 076f14be
      Linus Torvalds authored
      Pull x86 entry updates from Thomas Gleixner:
       "The x86 entry, exception and interrupt code rework
      
        This all started about 6 months ago with the attempt to move the Posix
        CPU timer heavy lifting out of the timer interrupt code and just have
        lockless quick checks in that code path. Trivial 5 patches.
      
        This unearthed an inconsistency in the KVM handling of task work and
        the review requested to move all of this into generic code so other
        architectures can share.
      
        Valid request and solved with another 25 patches but those unearthed
        inconsistencies vs. RCU and instrumentation.
      
        Digging into this made it obvious that there are quite some
        inconsistencies vs. instrumentation in general. The int3 text poke
        handling in particular was completely unprotected and with the batched
        update of trace events even more likely to expose to endless int3
        recursion.
      
        In parallel the RCU implications of instrumenting fragile entry code
        came up in several discussions.
      
        The conclusion of the x86 maintainer team was to go all the way and
        make the protection against any form of instrumentation of fragile and
        dangerous code paths enforceable and verifiable by tooling.
      
        A first batch of preparatory work hit mainline with commit
        d5f744f9 ("Pull x86 entry code updates from Thomas Gleixner")
      
        That (almost) full solution introduced a new code section
        '.noinstr.text' into which all code which needs to be protected from
        instrumentation of all sorts goes. Any call into instrumentable
        code out of this section has to be annotated. objtool has support to
        validate this.
      
        Kprobes now excludes this section fully which also prevents BPF from
        fiddling with it and all 'noinstr' annotated functions also keep
        ftrace off. The section, kprobes and objtool changes are already
        merged.
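
As a rough userspace illustration of the section idea, the sketch below places one function in a section named '.noinstr.text' and forces a helper inline. The `sketch_` macros are invented here and are not the kernel's definitions; the attribute spelling assumes GCC or Clang on an ELF target and degrades to nothing elsewhere. The kernel's real annotation for calls out of the section is only hinted at in comments.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch-only macros. The section name comes from the text above; the
 * attribute spelling is an assumption about the toolchain.
 */
#if defined(__ELF__) && defined(__GNUC__)
#define sketch_noinstr \
    __attribute__((noinline, no_instrument_function, section(".noinstr.text")))
#else
#define sketch_noinstr
#endif

#if defined(__GNUC__)
#define sketch_always_inline static inline __attribute__((always_inline))
#else
#define sketch_always_inline static inline
#endif

/* Helper forced inline so no out-of-line, instrumentable copy is called. */
sketch_always_inline int sketch_read_state(const int *state)
{
    return *state;
}

/* Stand-in for ordinary code that tracing tools are allowed to hook. */
static int sketch_instrumentable_work(int v)
{
    return v + 1;
}

/* Lives in the protected section when the toolchain supports it. */
static sketch_noinstr int sketch_entry(const int *state)
{
    int v;

    assert(state != NULL);
    v = sketch_read_state(state);   /* inlined, stays inside the section */

    /* a call out of the section: this is where an annotation would go */
    v = sketch_instrumentable_work(v);
    return v;
}
```

Calling `sketch_entry()` on a variable holding 41 returns 42. On an ELF build, `objdump -h` on the resulting object should list a `.noinstr.text` section.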
      
        The major changes coming with this are:
      
          - Preparatory cleanups
      
          - Annotating of relevant functions to move them into the
            noinstr.text section or enforcing inlining by marking them
            __always_inline so the compiler cannot misplace or instrument
            them.
      
          - Splitting and simplifying the idtentry macro maze so that it is
            now clearly separated into simple exception entries and the more
            interesting ones which use interrupt stacks and have the paranoid
            handling vs. CR3 and GS.
      
          - Move quite some of the low level ASM functionality into C code:
      
             - enter_from and exit to user space handling. The ASM code now
               calls into C after doing the really necessary ASM handling and
               the return path goes back out without bells and whistles in
               ASM.
      
             - exception entry/exit got the equivalent treatment
      
             - move all IRQ tracepoints from ASM to C so they can be placed as
               appropriate which is especially important for the int3
               recursion issue.
      
          - Consolidate the declaration and definition of entry points between
            32 and 64 bit. They share a common header and macros now.
      
          - Remove the extra device interrupt entry maze and just use the
            regular exception entry code.
      
          - All ASM entry points except NMI are now generated from the shared
            header file and the corresponding macros in the 32 and 64 bit
            entry ASM.
      
          - The C code entry points are consolidated as well with the help of
            DEFINE_IDTENTRY*() macros. This allows to ensure at one central
            point that all corresponding entry points share the same
            semantics. The actual function body for most entry points is in an
            instrumentable and sane state.
      
            There are special macros for the more sensitive entry points, e.g.
            INT3 and of course the nasty paranoid #NMI, #MCE, #DB and #DF.
            They allow to put the whole entry instrumentation and RCU handling
            into safe places instead of the previous pray that it is correct
            approach.
      
          - The INT3 text poke handling is now completely isolated and the
            recursion issue banned. Aside of the entry rework this required
            other isolation work, e.g. the ability to force inline bsearch.
      
          - Prevent #DB on fragile entry code, entry relevant memory and
            disable it on NMI, #MC entry, which allowed to get rid of the
            nested #DB IST stack shifting hackery.
      
          - A few other cleanups and enhancements which have been made
            possible through this and already merged changes, e.g.
            consolidating and further restricting the IDT code so the IDT
            table becomes RO after init which removes yet another popular
            attack vector
      
          - About 680 lines of ASM maze are gone.
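
The idea of defining entry points through one central macro can be sketched as follows. This is a hypothetical userspace analogue, not the kernel's DEFINE_IDTENTRY*() implementation: `SKETCH_DEFINE_ENTRY`, the counters and the `int *regs` argument are all invented for illustration. The point is that every handler defined through the macro gets the same enter/exit handling around its body.

```c
#include <assert.h>
#include <stddef.h>

/* Counters standing in for the entry/exit bookkeeping (RCU, tracing, ...). */
static int sketch_enter_calls;
static int sketch_exit_calls;

static void sketch_enter(void) { sketch_enter_calls++; }
static void sketch_exit(void)  { sketch_exit_calls++; }

/*
 * Hypothetical macro: declares the handler body, emits the common wrapper,
 * then opens the body definition. Semantics are fixed in this one place.
 */
#define SKETCH_DEFINE_ENTRY(name)                          \
    static void sketch_body_##name(int *regs);             \
    static void name(int *regs)                            \
    {                                                      \
        assert(regs != NULL);                              \
        sketch_enter();                                    \
        sketch_body_##name(regs);                          \
        sketch_exit();                                     \
    }                                                      \
    static void sketch_body_##name(int *regs)

/* Two entry points; only the bodies differ. */
SKETCH_DEFINE_ENTRY(sketch_exc_divide_error)
{
    *regs = 1;  /* pretend the fault was handled */
}

SKETCH_DEFINE_ENTRY(sketch_exc_overflow)
{
    *regs = 2;
}
```

After calling both handlers once, `sketch_enter_calls` and `sketch_exit_calls` are both 2, which is the kind of invariant a central definition makes easy to check.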
      
        There are a few open issues:
      
         - An escape out of the noinstr section in the MCE handler which needs
           some more thought but under the aspect that MCE is a complete
           trainwreck by design and the probability to survive it is low, this
           was not high on the priority list.
      
         - Paravirtualization
      
           When PV is enabled then objtool complains about a bunch of indirect
           calls out of the noinstr section. There are a few straight forward
           ways to fix this, but the other issues vs. general correctness were
           more pressing than parawitz.
      
         - KVM
      
           KVM is inconsistent as well. Patches have been posted, but they
           have not yet been commented on or picked up by the KVM folks.
      
         - IDLE
      
           Pretty much the same problems can be found in the low level idle
           code especially the parts where RCU stopped watching. This was
           beyond the scope of the more obvious and exposable problems and is
           on the todo list.
      
        The lesson learned from this brain melting exercise to morph the
        evolved code base into something which can be validated and understood
        is that once again the violation of the most important engineering
        principle "correctness first" has caused quite a few people to spend
        valuable time on problems which could have been avoided in the first
        place. The "features first" tinkering mindset really has to stop.
      
        With that I want to say thanks to everyone involved in contributing to
        this effort. Special thanks go to the following people (alphabetical
        order): Alexandre Chartre, Andy Lutomirski, Borislav Petkov, Brian
        Gerst, Frederic Weisbecker, Josh Poimboeuf, Juergen Gross, Lai
        Jiangshan, Marco Elver, Paolo Bonzini, Paul McKenney, Peter Zijlstra,
        Vitaly Kuznetsov, and Will Deacon"
      
      * tag 'x86-entry-2020-06-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (142 commits)
        x86/entry: Force rcu_irq_enter() when in idle task
        x86/entry: Make NMI use IDTENTRY_RAW
        x86/entry: Treat BUG/WARN as NMI-like entries
        x86/entry: Unbreak __irqentry_text_start/end magic
        x86/entry: __always_inline CR2 for noinstr
        lockdep: __always_inline more for noinstr
        x86/entry: Re-order #DB handler to avoid *SAN instrumentation
        x86/entry: __always_inline arch_atomic_* for noinstr
        x86/entry: __always_inline irqflags for noinstr
        x86/entry: __always_inline debugreg for noinstr
        x86/idt: Consolidate idt functionality
        x86/idt: Cleanup trap_init()
        x86/idt: Use proper constants for table size
        x86/idt: Add comments about early #PF handling
        x86/idt: Mark init only functions __init
        x86/entry: Rename trace_hardirqs_off_prepare()
        x86/entry: Clarify irq_{enter,exit}_rcu()
        x86/entry: Remove DBn stacks
        x86/entry: Remove debug IDT frobbing
        x86/entry: Optimize local_db_save() for virt
        ...
    • Merge tag 'notifications-20200601' of... · 6c329784
      Linus Torvalds authored
      Merge tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
      
      Pull notification queue from David Howells:
       "This adds a general notification queue concept and adds an event
        source for keys/keyrings, such as linking and unlinking keys and
        changing their attributes.
      
        Thanks to Debarshi Ray, we do have a pull request to use this to fix a
        problem with gnome-online-accounts - as mentioned last time:
      
           https://gitlab.gnome.org/GNOME/gnome-online-accounts/merge_requests/47
      
        Without this, g-o-a has to constantly poll a keyring-based kerberos
        cache to find out if kinit has changed anything.
      
        [ There are other notifications pending: mount/sb fsinfo notifications
          for libmount that Karel Zak and Ian Kent have been working on, and
          Christian Brauner would like to use them in lxc, but let's see how
          this one works first ]
      
        LSM hooks are included:
      
         - A set of hooks are provided that allow an LSM to rule on whether or
           not a watch may be set. Each of these hooks takes a different
           "watched object" parameter, so they're not really shareable. The
           LSM should use current's credentials. [Wanted by SELinux & Smack]
      
         - A hook is provided to allow an LSM to rule on whether or not a
           particular message may be posted to a particular queue. This is
           given the credentials from the event generator (which may be the
           system) and the watch setter. [Wanted by Smack]
      
        I've provided SELinux and Smack with implementations of some of these
        hooks.
      
        WHY
        ===
      
        Key/keyring notifications are desirable because if you have your
        kerberos tickets in a file/directory, your Gnome desktop will monitor
        that using something like fanotify and tell you if your credentials
        cache changes.
      
        However, we also have the ability to cache your kerberos tickets in
        the session, user or persistent keyring so that it isn't left around
        on disk across a reboot or logout. Keyrings, however, cannot currently
        be monitored asynchronously, so the desktop has to poll for it - not
        so good on a laptop. This facility will allow the desktop to avoid the
        need to poll.
      
        DESIGN DECISIONS
        ================
      
         - The notification queue is built on top of a standard pipe. Messages
           are effectively spliced in. The pipe is opened with a special flag:
      
              pipe2(fds, O_NOTIFICATION_PIPE);
      
           The special flag has the same value as O_EXCL (which doesn't seem
           like it will ever be applicable in this context)[?]. It is given up
           front to make it a lot easier to prohibit splice&co from accessing
           the pipe.
      
           [?] Should this be done some other way?  I'd rather not use up a new
               O_* flag if I can avoid it - should I add a pipe3() system call
               instead?
      
           The pipe is then configured::
      
              ioctl(fds[1], IOC_WATCH_QUEUE_SET_SIZE, queue_depth);
              ioctl(fds[1], IOC_WATCH_QUEUE_SET_FILTER, &filter);
      
           Messages are then read out of the pipe using read().
      
         - It should be possible to allow write() to insert data into the
           notification pipes too, but this is currently disabled as the
           kernel has to be able to insert messages into the pipe *without*
           holding pipe->mutex and the code to make this work needs careful
           auditing.
      
         - sendfile(), splice() and vmsplice() are disabled on notification
           pipes because of the pipe->mutex issue and also because they
           sometimes want to revert what they just did - but one or more
           notification messages might've been interleaved in the ring.
      
         - The kernel inserts messages with the wait queue spinlock held. This
           means that pipe_read() and pipe_write() have to take the spinlock
           to update the queue pointers.
      
         - Records in the buffer are binary, typed and have a length so that
           they can be of varying size.
      
           This allows multiple heterogeneous sources to share a common
           buffer; there are 16 million types available, of which I've used
           just a few, so there is scope for others to be used. Tags may be
           specified when a watchpoint is created to help distinguish the
           sources.
      
         - Records are filterable as types have up to 256 subtypes that can be
           individually filtered. Other filtration is also available.
      
         - Notification pipes don't interfere with each other; each may be
           bound to a different set of watches. Any particular notification
           will be copied to all the queues that are currently watching for it
           - and only those that are watching for it.
      
         - When recording a notification, the kernel will not sleep, but will
           rather mark a queue as having lost a message if there's
           insufficient space. read() will fabricate a loss notification
           message at an appropriate point later.
      
         - The notification pipe is created and then watchpoints are attached
           to it, using one of:
      
              keyctl_watch_key(KEY_SPEC_SESSION_KEYRING, fds[1], 0x01);
              watch_mount(AT_FDCWD, "/", 0, fd, 0x02);
              watch_sb(AT_FDCWD, "/mnt", 0, fd, 0x03);
      
           where in each case, fd indicates the queue and the number after is
           a tag between 0 and 255.
      
         - Watches are removed if either the notification pipe is destroyed or
           the watched object is destroyed. In the latter case, a message will
           be generated indicating the enforced watch removal.
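
To make the record format more concrete, here is a minimal parser sketch for a buffer of typed, variable-length records. The field widths follow the text (about 16 million types, 256 subtypes, tags 0 to 255), but the exact bit positions, the 8-byte header size and all `sketch_` names are assumptions made for this example rather than the uapi definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decoded view of one record header (layout assumed for this sketch). */
struct sketch_record {
    uint32_t type;     /* 0 .. 2^24 - 1 */
    uint32_t subtype;  /* 0 .. 255 */
    uint32_t len;      /* total record length in bytes, header included */
    uint32_t tag;      /* 0 .. 255, chosen when the watch was set */
};

#define SKETCH_HDR_SIZE 8u

/* Decode one record header at buf; returns 0 on success, -1 if malformed. */
static int sketch_parse_record(const uint8_t *buf, size_t avail,
                               struct sketch_record *out)
{
    uint32_t w0, w1;

    assert(buf != NULL && out != NULL);
    if (avail < SKETCH_HDR_SIZE)
        return -1;
    memcpy(&w0, buf, 4);
    memcpy(&w1, buf + 4, 4);
    out->type    = w0 & 0x00ffffffu;
    out->subtype = w0 >> 24;
    out->len     = w1 & 0x7fu;
    out->tag     = (w1 >> 8) & 0xffu;
    if (out->len < SKETCH_HDR_SIZE || out->len > avail)
        return -1;
    return 0;
}

/* Count the records in a buffer, stepping from one to the next by length. */
static int sketch_count_records(const uint8_t *buf, size_t n)
{
    int count = 0;
    struct sketch_record r;

    while (n > 0) {
        if (sketch_parse_record(buf, n, &r) != 0)
            return -1;
        buf += r.len;
        n -= r.len;
        count++;
    }
    return count;
}

/* Build a header so the parser can be exercised without a kernel. */
static void sketch_pack_header(uint8_t *buf, uint32_t type, uint32_t subtype,
                               uint32_t len, uint32_t tag)
{
    uint32_t w0 = (type & 0x00ffffffu) | ((subtype & 0xffu) << 24);
    uint32_t w1 = (len & 0x7fu) | ((tag & 0xffu) << 8);

    memcpy(buf, &w0, 4);
    memcpy(buf + 4, &w1, 4);
}
```

Because each record carries its own length, a reader can skip record types it does not understand, which is what lets heterogeneous sources share one buffer. The tag lets the reader tell apart watches that deliver the same record type.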
      
        Things I want to avoid:
      
         - Introducing features that make the core VFS dependent on the
           network stack or networking namespaces (ie. usage of netlink).
      
         - Dumping all this stuff into dmesg and having a daemon that sits
           there parsing the output and distributing it as this then puts the
           responsibility for security into userspace and makes handling
           namespaces tricky. Further, dmesg might not exist or might be
           inaccessible inside a container.
      
         - Letting users see events they shouldn't be able to see.
      
        TESTING AND MANPAGES
        ====================
      
         - The keyutils tree has a pipe-watch branch that has keyctl commands
           for making use of notifications. Proposed manual pages can also be
           found on this branch, though a couple of them really need to go to
           the main manpages repository instead.
      
           If the kernel supports the watching of keys, then running "make
           test" on that branch will cause the testing infrastructure to spawn
           a monitoring process on the side that monitors a notifications pipe
           for all the key/keyring changes induced by the tests and they'll
           all be checked off to make sure they happened.
      
              https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log/?h=pipe-watch
      
         - A test program is provided (samples/watch_queue/watch_test) that
           can be used to monitor for keyrings, mount and superblock events.
           Information on the notifications is simply logged to stdout"
      
      * tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        smack: Implement the watch_key and post_notification hooks
        selinux: Implement the watch_key security hook
        keys: Make the KEY_NEED_* perms an enum rather than a mask
        pipe: Add notification lossage handling
        pipe: Allow buffers to be marked read-whole-or-error for notifications
        Add sample notification program
        watch_queue: Add a key/keyring notification facility
        security: Add hooks to rule on setting a watch
        pipe: Add general notification queue support
        pipe: Add O_NOTIFICATION_PIPE
        security: Add a hook for the point of notification insertion
        uapi: General notification queue definitions
  2. 12 Jun, 2020 26 commits
    • Merge tag 'thermal-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux · df2fbf5b
      Linus Torvalds authored
      Pull thermal updates from Daniel Lezcano:
      
       - Add the hwmon support on the i.MX SC (Anson Huang)
      
       - Thermal framework cleanups (self-encapsulation, pointless stubs,
         private structures) (Daniel Lezcano)
      
       - Use the PM QoS frequency changes for the devfreq cooling device
         (Matthias Kaehlcke)
      
       - Remove duplicate error messages from platform_get_irq() error
         handling (Markus Elfring)
      
       - Add support for the bandgap sensors (Keerthy)
      
       - Statically initialize .get_mode/.set_mode ops (Andrzej Pietrasiewicz)
      
       - Add Renesas R-Car maintainer entry (Niklas Söderlund)
      
       - Fix error checking after calling ti_bandgap_get_sensor_data() for the
         TI SoC thermal (Sudip Mukherjee)
      
       - Add latency constraint for the idle injection, the DT binding and
         change the registering function (Daniel Lezcano)
      
       - Convert the thermal framework binding to the Yaml schema (Amit
         Kucheria)
      
       - Replace zero-length array with flexible-array on i.MX 8MM (Gustavo A.
         R. Silva)
      
       - Thermal framework cleanups (alphabetic order for headers, replace
         module.h by export.h, make file naming consistent) (Amit Kucheria)
      
       - Merge tsens-common into the tsens driver (Amit Kucheria)
      
       - Fix platform dependency for the Qoriq driver (Geert Uytterhoeven)
      
       - Clean up the rcar_thermal_update_temp() function in the rcar thermal
         driver (Niklas Söderlund)
      
       - Fix the TMSAR register for the TMUv2 on the Qoriq platform (Yuantian
         Tang)
      
       - Export GDDV, OEM vendor variables, and don't require IDSP for the
         int340x thermal driver - trivial conflicts fixed (Matthew Garrett)
      
      * tag 'thermal-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (48 commits)
        thermal/int340x_thermal: Don't require IDSP to exist
        thermal/int340x_thermal: Export OEM vendor variables
        thermal/int340x_thermal: Export GDDV
        thermal: qoriq: Update the settings for TMUv2
        thermal: rcar_thermal: Clean up rcar_thermal_update_temp()
        thermal: qoriq: Add platform dependencies
        drivers: thermal: tsens: Merge tsens-common.c into tsens.c
        thermal/of: Rename of-thermal.c
        thermal/governors: Prefix all source files with gov_
        thermal/drivers/user_space: Sort headers alphabetically
        thermal/drivers/of-thermal: Sort headers alphabetically
        thermal/drivers/cpufreq_cooling: Replace module.h with export.h
        thermal/drivers/cpufreq_cooling: Sort headers alphabetically
        thermal/drivers/clock_cooling: Include export.h
        thermal/drivers/clock_cooling: Sort headers alphabetically
        thermal/drivers/thermal_hwmon: Include export.h
        thermal/drivers/thermal_hwmon: Sort headers alphabetically
        thermal/drivers/thermal_helpers: Include export.h
        thermal/drivers/thermal_helpers: Sort headers alphabetically
        thermal/core: Replace module.h with export.h
        ...
    • Merge branch 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace · 44ebe016
      Linus Torvalds authored
      Pull proc fix from Eric Biederman:
       "Much to my surprise syzbot found a very old bug in proc that the
        recent changes made easier to reproduce. This bug is subtle enough it
        looks like it fooled everyone who should know better"
      
      * 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
        proc: Use new_inode not new_inode_pseudo
    • x86/entry: Force rcu_irq_enter() when in idle task · 0bf3924b
      Thomas Gleixner authored
      The idea of conditionally calling into rcu_irq_enter() only when RCU is
      not watching turned out to be not completely thought through.
      
      Paul noticed occasional premature end of grace periods in RCU torture
      testing. Bisection led to the commit which made the invocation of
      rcu_irq_enter() conditional on !rcu_is_watching().
      
      It turned out that this conditional breaks RCU assumptions about the idle
      task when the scheduler tick happens to be a nested interrupt. Nested
      interrupts can happen when the first interrupt invokes softirq processing
      on return which enables interrupts.
      
      If that nested tick interrupt does not invoke rcu_irq_enter() then the
      RCU's irq-nesting checks will believe that this interrupt came directly
      from idle, which will cause RCU to report a quiescent state.  Because this
      interrupt instead came from a softirq handler which might have been
      executing an RCU read-side critical section, this can cause the grace
      period to end prematurely.
      
      Change the condition from !rcu_is_watching() to is_idle_task(current) which
      enforces that interrupts in the idle task unconditionally invoke
      rcu_irq_enter() independent of the RCU state.
      
      This is also correct vs. user mode entries in NOHZ full scenarios because
      user mode entries bring RCU out of EQS and force the RCU irq nesting state
      accounting to nested. As only the first interrupt can enter from user mode
      a nested tick interrupt will enter from kernel mode and as the nesting
      state accounting is forced to nesting it will not do anything stupid even
      if rcu_irq_enter() has not been invoked.
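
The difference between the two conditions can be seen in a toy model of the nested tick scenario. This is a sketch only: the struct, its fields and the `sketch_` functions are invented for illustration, and the stand-in for rcu_irq_enter() just bumps a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy per-CPU state; field names are invented for this sketch. */
struct sketch_cpu {
    bool in_idle_task;   /* current is the idle task */
    bool rcu_watching;   /* RCU is currently watching this CPU */
    int  irq_nesting;    /* interrupt nesting depth as RCU accounts it */
};

/* Stand-in for rcu_irq_enter(): RCU starts watching and counts one level. */
static void sketch_rcu_irq_enter(struct sketch_cpu *c)
{
    c->rcu_watching = true;
    c->irq_nesting++;
}

/* Old condition: only tell RCU when it was not watching yet. */
static void sketch_irq_entry_old(struct sketch_cpu *c)
{
    if (!c->rcu_watching)
        sketch_rcu_irq_enter(c);
}

/* New condition: always tell RCU when the interrupt hit the idle task. */
static void sketch_irq_entry_new(struct sketch_cpu *c)
{
    if (c->in_idle_task)
        sketch_rcu_irq_enter(c);
}

/*
 * A first interrupt arrives in idle, then a tick nests inside it while
 * softirq processing has interrupts enabled. Returns the depth RCU recorded.
 */
static int sketch_nested_tick_depth(void (*entry)(struct sketch_cpu *))
{
    struct sketch_cpu c = { .in_idle_task = true, .rcu_watching = false,
                            .irq_nesting = 0 };

    assert(entry != NULL);
    entry(&c);   /* outer interrupt */
    entry(&c);   /* nested tick interrupt */
    return c.irq_nesting;
}
```

With the old condition the model records a depth of 1 for the nested tick, so the tick looks as if it came directly from idle. With the new condition it records 2.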
      
      Fixes: 3eeec385 ("x86/entry: Provide idtentry_entry/exit_cond_rcu()")
      Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: "Paul E. McKenney" <paulmck@kernel.org>
      Reviewed-by: "Paul E. McKenney" <paulmck@kernel.org>
      Acked-by: Andy Lutomirski <luto@kernel.org>
      Acked-by: Frederic Weisbecker <frederic@kernel.org>
      Link: https://lkml.kernel.org/r/87wo4cxubv.fsf@nanos.tec.linutronix.de
    • Merge tag 'pwm/for-5.8-rc1' of... · 9433a51e
      Linus Torvalds authored
      Merge tag 'pwm/for-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
      
      Pull pwm updates from Thierry Reding:
       "Nothing too exciting for this cycle. A couple of fixes across the
        board, and Lee volunteered to help with patch review"
      
      * tag 'pwm/for-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
        pwm: Add missing "CONFIG_" prefix
        MAINTAINERS: Add Lee Jones as reviewer for the PWM subsystem
        pwm: imx27: Fix rounding behavior
        pwm: rockchip: Simplify rockchip_pwm_get_state()
        pwm: img: Call pm_runtime_put() in pm_runtime_get_sync() failed case
        pwm: tegra: Support dynamic clock frequency configuration
        pwm: jz4740: Add support for the JZ4725B
        pwm: jz4740: Make PWM start with the active part
        pwm: jz4740: Enhance precision in calculation of duty cycle
        pwm: jz4740: Drop dependency on MACH_INGENIC
        pwm: lpss: Fix get_state runtime-pm reference handling
        pwm: sun4i: Support direct clock output on Allwinner A64
        pwm: Add support for Azoteq IQS620A PWM generator
        dt-bindings: pwm: rcar: add r8a77961 support
        pwm: Add missing '\n' in log messages
    • Merge tag 'iommu-drivers-move-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 8f02f363
      Linus Torvalds authored
      Pull iommu driver directory structure cleanup from Joerg Roedel:
       "Move the Intel and AMD IOMMU drivers into their own subdirectory.
      
        Both drivers consist of several files by now and giving them their own
        directory unclutters the IOMMU top-level directory a bit"
      
      * tag 'iommu-drivers-move-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/vt-d: Move Intel IOMMU driver into subdirectory
        iommu/amd: Move AMD IOMMU driver into subdirectory
    • Merge tag 'printk-for-5.8-kdb-nmi' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux · 5c2fb57a
      Linus Torvalds authored
      Pull printk fix from Petr Mladek:
       "One more printk change for 5.8: make sure that messages printed from
        KDB context are redirected to KDB console handlers. It did not work
        when KDB interrupted NMI or printk_safe contexts.
      
        Arm people started hitting this problem more often recently. I forgot
        to add the fix into the previous pull request by mistake"
      
      * tag 'printk-for-5.8-kdb-nmi' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
        printk/kdb: Redirect printk messages into kdb in any context
    • proc: Use new_inode not new_inode_pseudo · ef1548ad
      Eric W. Biederman authored
      Recently syzbot reported that unmounting proc when there is an ongoing
      inotify watch on the root directory of proc could result in a use
      after free when the watch is removed after the unmount of proc
      when the watcher exits.
      
      Commit 69879c01 ("proc: Remove the now unnecessary internal mount
      of proc") made it easier to unmount proc and allowed syzbot to see the
      problem, but looking at the code it has been around for a long time.
      
      Looking at the code the fsnotify watch should have been removed by
      fsnotify_sb_delete in generic_shutdown_super.  Unfortunately the inode
      was allocated with new_inode_pseudo instead of new_inode so the inode
      was not on the sb->s_inodes list.  Which prevented
      fsnotify_unmount_inodes from finding the inode and removing the watch
      as well as made it so the "VFS: Busy inodes after unmount" warning
      could not find the inodes to warn about them.
      
      Make all of the inodes in proc visible to generic_shutdown_super,
      and fsnotify_sb_delete by using new_inode instead of new_inode_pseudo.
      The only functional difference is that new_inode places the inodes
      on the sb->s_inodes list.
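
A toy model may help show why list membership matters here. The sketch below is not kernel code: the `sketch_` types and functions are simplified stand-ins, and the "watch" is just a boolean. It models an unmount path that can only clean up inodes it can reach from the superblock's list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_inode {
    bool has_watch;              /* an inotify-style watch is attached */
    struct sketch_inode *next;   /* link on the superblock's inode list */
};

struct sketch_sb {
    struct sketch_inode *s_inodes;  /* inodes the shutdown path can find */
};

/* Models new_inode_pseudo(): the inode is set up but never listed. */
static void sketch_new_inode_pseudo(struct sketch_sb *sb, struct sketch_inode *i)
{
    (void)sb;
    i->has_watch = false;
    i->next = NULL;
}

/* Models new_inode(): same setup, plus the inode goes on the sb list. */
static void sketch_new_inode(struct sketch_sb *sb, struct sketch_inode *i)
{
    sketch_new_inode_pseudo(sb, i);
    i->next = sb->s_inodes;
    sb->s_inodes = i;
}

/* Models the unmount path: drop watches on every inode reachable from sb. */
static int sketch_shutdown_super(struct sketch_sb *sb)
{
    int removed = 0;

    assert(sb != NULL);
    for (struct sketch_inode *i = sb->s_inodes; i; i = i->next) {
        if (i->has_watch) {
            i->has_watch = false;
            removed++;
        }
    }
    return removed;
}
```

In this model a watch on an unlisted inode survives `sketch_shutdown_super()`, which corresponds to the stale watch that is later removed after the unmount.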
      
      I wrote a small test program and I can verify that without changes it
      can trigger this issue, and by replacing new_inode_pseudo with
      new_inode the issue goes away.
      
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/000000000000d788c905a7dfa3f4@google.com
      Reported-by: syzbot+7d2debdcdb3cb93c1e5e@syzkaller.appspotmail.com
      Fixes: 0097875b ("proc: Implement /proc/thread-self to point at the directory of the current thread")
      Fixes: 021ada7d ("procfs: switch /proc/self away from proc_dir_entry")
      Fixes: 51f0885e ("vfs,proc: guarantee unique inodes in /proc")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    • Merge tag 'integrity-v5.8-fix' of... · 923ea163
      Linus Torvalds authored
      Merge tag 'integrity-v5.8-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
      
      Pull integrity fix from Mimi Zohar:
       "ima mprotect performance fix"
      
      * tag 'integrity-v5.8-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
        ima: fix mprotect checking
    • Merge tag 'devicetree-fixes-for-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · 4071b856
      Linus Torvalds authored
      Pull Devicetree fixes from Rob Herring:
      
       - Another round of whack-a-mole removing 'allOf', redundant cases of
         'maxItems' and incorrect 'reg' sizes
      
       - Fix support for yaml.h in non-standard paths
      
      * tag 'devicetree-fixes-for-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        dt-bindings: Remove redundant 'maxItems'
        dt-bindings: Fix more incorrect 'reg' property sizes in examples
        dt-bindings: phy: qcom: Fix missing 'ranges' and example addresses
        dt-bindings: Remove more cases of 'allOf' containing a '$ref'
        scripts/dtc: use pkg-config to include <yaml.h> in non-standard path
      4071b856
    • Linus Torvalds's avatar
      Merge tag 'nios2-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2 · 7de26c41
      Linus Torvalds authored
      Pull nios2 update from Ley Foon Tan:
       "Mark expected switch fall-through in signal handling"
      
      * tag 'nios2-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
        nios2: signal: Mark expected switch fall-through
      7de26c41
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 52cd0d97
      Linus Torvalds authored
      Pull more KVM updates from Paolo Bonzini:
       "The guest side of the asynchronous page fault work has been delayed to
        5.9 in order to sync with Thomas's interrupt entry rework, but here's
        the rest of the KVM updates for this merge window.
      
        MIPS:
         - Loongson port
      
        PPC:
         - Fixes
      
        ARM:
         - Fixes
      
        x86:
         - KVM_SET_USER_MEMORY_REGION optimizations
         - Fixes
         - Selftest fixes"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (62 commits)
        KVM: x86: do not pass poisoned hva to __kvm_set_memory_region
        KVM: selftests: fix sync_with_host() in smm_test
        KVM: async_pf: Inject 'page ready' event only if 'page not present' was previously injected
        KVM: async_pf: Cleanup kvm_setup_async_pf()
        kvm: i8254: remove redundant assignment to pointer s
        KVM: x86: respect singlestep when emulating instruction
        KVM: selftests: Don't probe KVM_CAP_HYPERV_ENLIGHTENED_VMCS when nested VMX is unsupported
        KVM: selftests: do not substitute SVM/VMX check with KVM_CAP_NESTED_STATE check
        KVM: nVMX: Consult only the "basic" exit reason when routing nested exit
        KVM: arm64: Move hyp_symbol_addr() to kvm_asm.h
        KVM: arm64: Synchronize sysreg state on injecting an AArch32 exception
        KVM: arm64: Make vcpu_cp1x() work on Big Endian hosts
        KVM: arm64: Remove host_cpu_context member from vcpu structure
        KVM: arm64: Stop sparse from moaning at __hyp_this_cpu_ptr
        KVM: arm64: Handle PtrAuth traps early
        KVM: x86: Unexport x86_fpu_cache and make it static
        KVM: selftests: Ignore KVM 5-level paging support for VM_MODE_PXXV48_4K
        KVM: arm64: Save the host's PtrAuth keys in non-preemptible context
        KVM: arm64: Stop save/restoring ACTLR_EL1
        KVM: arm64: Add emulation for 32bit guests accessing ACTLR2
        ...
      52cd0d97
    • Linus Torvalds's avatar
      Merge tag 'for-linus-5.8b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · d2d5439d
      Linus Torvalds authored
      Pull xen updates from Juergen Gross:
      
       - several smaller cleanups
      
       - a fix for a Xen guest regression with CPU offlining
      
       - a small fix in the xen pvcalls backend driver
      
       - an update of MAINTAINERS
      
      * tag 'for-linus-5.8b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        MAINTAINERS: Update PARAVIRT_OPS_INTERFACE and VMWARE_HYPERVISOR_INTERFACE
        xen/pci: Get rid of verbose_request and use dev_dbg() instead
        xenbus: Use dev_printk() when possible
        xen-pciback: Use dev_printk() when possible
        xen: enable BALLOON_MEMORY_HOTPLUG by default
        xen: expand BALLOON_MEMORY_HOTPLUG description
        xen/pvcalls: Make pvcalls_back_global static
        xen/cpuhotplug: Fix initial CPU offlining for PV(H) guests
        xen-platform: Constify dev_pm_ops
        xen/pvcalls-back: test for errors when calling backend_connect()
      d2d5439d
    • Rob Herring's avatar
      8440d4a7
    • Rob Herring's avatar
      dt-bindings: Remove redundant 'maxItems' · 44761570
      Rob Herring authored
      There's no need to specify 'maxItems' with the same value as the number
      of entries in 'items'. A meta-schema update will catch future cases.
      
      Cc: Shawn Guo <shawnguo@kernel.org>
      Cc: Sascha Hauer <s.hauer@pengutronix.de>
      Cc: Anson Huang <Anson.Huang@nxp.com>
      Cc: linux-clk@vger.kernel.org
      Cc: linux-pwm@vger.kernel.org
      Cc: linux-usb@vger.kernel.org
      Reviewed-by: Stephen Boyd <sboyd@kernel.org> # clk
      Acked-by: Thierry Reding <thierry.reding@gmail.com>
      Signed-off-by: Rob Herring <robh@kernel.org>
      44761570
    • Mimi Zohar's avatar
      ima: fix mprotect checking · 4235b1a4
      Mimi Zohar authored
      Make sure IMA is enabled before checking mprotect change.  Addresses
      report of a 3.7% regression of boot-time.dhcp.
      
      Fixes: 8eb613c0 ("ima: verify mprotect change is consistent with mmap policy")
      Reported-by: kernel test robot <rong.a.chen@intel.com>
      Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
      Tested-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
      4235b1a4
    • Thomas Gleixner's avatar
      x86/entry: Make NMI use IDTENTRY_RAW · 71ed49d8
      Thomas Gleixner authored
      For no reason other than beginning brainmelt, IDTENTRY_NMI was mapped to
      IDTENTRY_IST.
      
      This is not a problem on 64bit because the IST default entry point maps to
      IDTENTRY_RAW which does not do any entry handling. The surplus function
      declaration for the noist C entry point is unused and as there is no ASM
      code emitted for NMI this went unnoticed.
      
      On 32bit IDTENTRY_IST maps to a regular IDTENTRY which does the normal
      entry handling. That is clearly the wrong thing to do for NMI.
      
      Map it to IDTENTRY_RAW to unbreak it. The IDTENTRY_NMI mapping needs to
      stay to avoid emitting ASM code.
      
      Fixes: 6271fef0 ("x86/entry: Convert NMI to IDTENTRY_NMI")
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Debugged-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/CA+G9fYvF3cyrY+-iw_SZtpN-i2qA2BruHg4M=QYECU2-dNdsMw@mail.gmail.com
      71ed49d8
    • Andy Lutomirski's avatar
      x86/entry: Treat BUG/WARN as NMI-like entries · 15a416e8
      Andy Lutomirski authored
      BUG/WARN are cleverly optimized using UD2 to handle the BUG/WARN out of
      line in an exception fixup.
      
      But if BUG or WARN is issued in a funny RCU context, then the
      idtentry_enter...() path might helpfully WARN that the RCU context is
      invalid, which results in infinite recursion.
      
      Split the BUG/WARN handling into an nmi_enter()/nmi_exit() path in
      exc_invalid_op() to increase the chance to survive the experience.
      
      [ tglx: Make the declaration match the implementation ]
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/f8fe40e0088749734b4435b554f73eee53dcf7a8.1591932307.git.luto@kernel.org
      15a416e8
    • Ley Foon Tan's avatar
      nios2: signal: Mark expected switch fall-through · 6b57fa4d
      Ley Foon Tan authored
      Mark switch cases where we are expecting to fall through.
      
      Fix the following warning through the use of the new
      pseudo-keyword fallthrough;
      
      arch/nios2/kernel/signal.c:254:12: warning: this statement may fall through [-Wimplicit-fallthrough=]
        254 |    restart = -2;
            |    ~~~~~~~~^~~~
      arch/nios2/kernel/signal.c:255:3: note: here
        255 |   case ERESTARTNOHAND:
            |   ^~~~
      Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
      6b57fa4d
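
The fall-through marking described in the commit above can be sketched in user space as follows. This is a minimal illustration, not the nios2 code: the `fallthrough` macro here is a stand-in for the kernel's definition, and `demo_retval` and `demo_restart_level()` are hypothetical names that only mirror the `restart = -2;` line quoted in the warning.

```c
/*
 * Stand-in for the kernel's `fallthrough` pseudo-keyword (assumption:
 * in the kernel it comes from the compiler attribute headers; here we
 * define it ourselves for a user-space build).
 */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallthrough */
#endif

/* Hypothetical return codes, loosely modelled on syscall restart values. */
enum demo_retval {
	DEMO_RESTART_BLOCK,
	DEMO_RESTART_NOHAND,
	DEMO_OTHER
};

/* Hypothetical helper: one case deliberately continues into the next. */
int demo_restart_level(enum demo_retval retval)
{
	int restart = 0;

	switch (retval) {
	case DEMO_RESTART_BLOCK:
		restart = -2;
		fallthrough;	/* intentional: also run the NOHAND branch */
	case DEMO_RESTART_NOHAND:
		restart++;
		break;
	default:
		break;
	}
	return restart;
}
```

With the explicit marker, a build using -Wimplicit-fallthrough no longer reports the first case, while an accidental missing `break` elsewhere would still be flagged.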
    • Linus Torvalds's avatar
      Merge tag 'locking-kcsan-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b791d1bd
      Linus Torvalds authored
      Pull the Kernel Concurrency Sanitizer from Thomas Gleixner:
       "The Kernel Concurrency Sanitizer (KCSAN) is a dynamic race detector,
        which relies on compile-time instrumentation, and uses a
        watchpoint-based sampling approach to detect races.
      
        The feature was under development for quite some time and has already
        found legitimate bugs.
      
        Unfortunately it comes with a limitation, which was only understood
        late in the development cycle:
      
           It requires an up to date CLANG-11 compiler
      
        CLANG-11 is not yet released (scheduled for June), but it's the only
        compiler today which handles the kernel requirements and especially
        the annotations of functions to exclude them from KCSAN
        instrumentation correctly.
      
        These annotations really need to work so that low level entry code and
        especially int3 text poke handling can be completely isolated.
      
        A detailed discussion of the requirements and compiler issues can be
        found here:
      
          https://lore.kernel.org/lkml/CANpmjNMTsY_8241bS7=XAfqvZHFLrVEkv_uM4aDUWE_kh3Rvbw@mail.gmail.com/
      
        We came to the conclusion that trying to work around compiler
        limitations and bugs again would end up in a major trainwreck, so
        requiring a working compiler seemed to be the best choice.
      
        For Continuous Integration purposes the compiler restriction is
        manageable and that's where most xxSAN reports come from.
      
        For a change this limitation might make GCC people actually look at
        their bugs. Some issues with CSAN in GCC are 7 years old and one has
        been 'fixed' 3 years ago with a half-baked solution which 'solved' the
        reported issue but not the underlying problem.
      
        The KCSAN developers are also pondering using a GCC plugin to become
        independent, but that's not something which will show up in a few
        days.
      
        Blocking KCSAN until widespread compiler support is available is not
        a really good alternative because the continuous growth of lockless
        optimizations in the kernel demands proper tooling support"
      
      * tag 'locking-kcsan-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (76 commits)
        compiler_types.h, kasan: Use __SANITIZE_ADDRESS__ instead of CONFIG_KASAN to decide inlining
        compiler.h: Move function attributes to compiler_types.h
        compiler.h: Avoid nested statement expression in data_race()
        compiler.h: Remove data_race() and unnecessary checks from {READ,WRITE}_ONCE()
        kcsan: Update Documentation to change supported compilers
        kcsan: Remove 'noinline' from __no_kcsan_or_inline
        kcsan: Pass option tsan-instrument-read-before-write to Clang
        kcsan: Support distinguishing volatile accesses
        kcsan: Restrict supported compilers
        kcsan: Avoid inserting __tsan_func_entry/exit if possible
        ubsan, kcsan: Don't combine sanitizer with kcov on clang
        objtool, kcsan: Add kcsan_disable_current() and kcsan_enable_current_nowarn()
        kcsan: Add __kcsan_{enable,disable}_current() variants
        checkpatch: Warn about data_race() without comment
        kcsan: Use GFP_ATOMIC under spin lock
        Improve KCSAN documentation a bit
        kcsan: Make reporting aware of KCSAN tests
        kcsan: Fix function matching in report
        kcsan: Change data_race() to no longer require marking racing accesses
        kcsan: Move kcsan_{disable,enable}_current() to kcsan-checks.h
        ...
      b791d1bd
    • Linus Torvalds's avatar
      Merge tag 'locking-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9716e57a
      Linus Torvalds authored
      Pull atomics rework from Thomas Gleixner:
       "Peter Zijlstra's rework of atomics and fallbacks. This solves two
        problems:
      
         1) Compilers uninline small atomic_* static inline functions which
            can expose them to instrumentation.
      
         2) The instrumentation of atomic primitives was done at the
            architecture level while composites or fallbacks were provided at
            the generic level. As a result there are no uninstrumented
            variants of the fallbacks.
      
        Both issues were in the way of fully isolating fragile entry code
        paths and especially the text poke int3 handler which is prone to an
        endless recursion problem when anything in that code path is about to
        be instrumented. This was always a problem, but got elevated due to
        the new batch mode updates of tracing.
      
        The solution is to mark the functions __always_inline and to flip the
        fallback and instrumentation so the non-instrumented variants are at
        the architecture level and the instrumentation is done in generic
        code.
      
        The latter introduces another fallback variant which will go away once
        all architectures have been moved over to arch_atomic_*"
      
      * tag 'locking-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/atomics: Flip fallbacks and instrumentation
        asm-generic/atomic: Use __always_inline for fallback wrappers
      9716e57a
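
The "flip" described in the merge message above can be sketched in user space as follows. This is only an illustration under stated assumptions: every `demo_*` name is hypothetical, the instrumentation hook is a plain counter standing in for sanitizer callbacks, and the raw operation uses the GCC/Clang `__atomic_add_fetch` builtin rather than any kernel primitive.

```c
#include <stddef.h>

/* Force inlining so the raw op can never become an instrumentable call. */
#define demo_always_inline static inline __attribute__((__always_inline__))

typedef struct { int counter; } demo_atomic_t;

/* Counts how many accesses went through the instrumented wrapper. */
static unsigned long demo_instrumented_accesses;

/* Hypothetical hook standing in for sanitizer instrumentation. */
demo_always_inline void demo_instrument_rw(const volatile void *addr,
					   size_t size)
{
	(void)addr;
	(void)size;
	demo_instrumented_accesses++;
}

/* "Architecture level": the raw, non-instrumented primitive. */
demo_always_inline int demo_arch_atomic_add_return(int i, demo_atomic_t *v)
{
	return __atomic_add_fetch(&v->counter, i, __ATOMIC_SEQ_CST);
}

/* "Generic level": instrumentation wrapped around the raw primitive. */
demo_always_inline int demo_atomic_add_return(int i, demo_atomic_t *v)
{
	demo_instrument_rw(v, sizeof(*v));
	return demo_arch_atomic_add_return(i, v);
}
```

In this layout, code that must stay free of instrumentation calls the `demo_arch_*` variant directly, while everything else goes through the wrapper and is counted by the hook.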
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · b1a62749
      Linus Torvalds authored
      Pull updates from Andrew Morton:
       "A few fixes and stragglers.
      
        Subsystems affected by this patch series: mm/memory-failure, ocfs2,
        lib/lzo, misc"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        amdgpu: a NULL ->mm does not mean a thread is a kthread
        lib/lzo: fix ambiguous encoding bug in lzo-rle
        ocfs2: fix build failure when TCP/IP is disabled
        mm/memory-failure: send SIGBUS(BUS_MCEERR_AR) only to current thread
        mm/memory-failure: prioritize prctl(PR_MCE_KILL) over vm.memory_failure_early_kill
      b1a62749
    • Christoph Hellwig's avatar
      amdgpu: a NULL ->mm does not mean a thread is a kthread · 8449d150
      Christoph Hellwig authored
      Use the proper API instead.
      
      Fixes: 70539bd7 ("drm/amd: Update MEC HQD loading code for KFD")
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Jens Axboe <axboe@kernel.dk>
      Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Reviewed-by: Jens Axboe <axboe@kernel.dk>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
      Cc: Zhi Wang <zhi.a.wang@intel.com>
      Cc: Felipe Balbi <balbi@kernel.org>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Jason Wang <jasowang@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Link: http://lkml.kernel.org/r/20200404094101.672954-1-hch@lst.de
      Link: http://lkml.kernel.org/r/20200404094101.672954-2-hch@lst.de
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8449d150
    • Dave Rodgman's avatar
      lib/lzo: fix ambiguous encoding bug in lzo-rle · b5265c81
      Dave Rodgman authored
      In some rare cases, for input data over 32 KB, lzo-rle could encode two
      different inputs to the same compressed representation, so that
      decompression is then ambiguous (i.e.  data may be corrupted - although
      zram is not affected because it operates over 4 KB pages).
      
      This modifies the compressor without changing the decompressor or the
      bitstream format, such that:
      
       - there is no change to how data produced by the old compressor is
         decompressed
      
       - an old decompressor will correctly decode data from the updated
         compressor
      
       - performance and compression ratio are not affected
      
       - we avoid introducing a new bitstream format
      
      In testing over 12.8M real-world files totalling 903 GB, three files
      were affected by this bug.  I also constructed 37M semi-random 64 KB
      files totalling 2.27 TB, and saw no affected files.  Finally I tested
      over files constructed to contain each of the ~1024 possible bad input
      sequences; for all of these cases, updated lzo-rle worked correctly.
      
      There is no significant impact to performance or compression ratio.
      Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Dave Rodgman <dave.rodgman@arm.com>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Chao Yu <yuchao0@huawei.com>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20200507100203.29785-1-dave.rodgman@arm.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b5265c81
    • Tom Seewald's avatar
      ocfs2: fix build failure when TCP/IP is disabled · fce1affe
      Tom Seewald authored
      After commit 12abc5ee ("tcp: add tcp_sock_set_nodelay") and commit
      c488aead ("tcp: add tcp_sock_set_user_timeout"), building the kernel
      with OCFS2_FS=y but without INET=y causes it to fail with:
      
        ld: fs/ocfs2/cluster/tcp.o: in function `o2net_accept_many':
        tcp.c:(.text+0x21b1): undefined reference to `tcp_sock_set_nodelay'
        ld: tcp.c:(.text+0x21c1): undefined reference to `tcp_sock_set_user_timeout'
        ld: fs/ocfs2/cluster/tcp.o: in function `o2net_start_connect':
        tcp.c:(.text+0x2633): undefined reference to `tcp_sock_set_nodelay'
        ld: tcp.c:(.text+0x2643): undefined reference to `tcp_sock_set_user_timeout'
      
      This is due to tcp_sock_set_nodelay() and tcp_sock_set_user_timeout()
      being declared in linux/tcp.h and defined in net/ipv4/tcp.c, which
      depend on TCP/IP being enabled.
      
      To fix this, make OCFS2_FS depend on INET=y which already requires
      NET=y.
      
      Fixes: 12abc5ee ("tcp: add tcp_sock_set_nodelay")
      Fixes: c488aead ("tcp: add tcp_sock_set_user_timeout")
      Signed-off-by: Tom Seewald <tseewald@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Acked-by: Christoph Hellwig <hch@lst.de>
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Link: http://lkml.kernel.org/r/20200606190827.23954-1-tseewald@gmail.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fce1affe
    • Naoya Horiguchi's avatar
      mm/memory-failure: send SIGBUS(BUS_MCEERR_AR) only to current thread · 03151c6e
      Naoya Horiguchi authored
      An Action Required memory error should happen only when a processor is
      about to access corrupted memory, so it is synchronous and only
      affects the current process/thread.
      
      Recently commit 872e9a20 ("mm, memory_failure: don't send
      BUS_MCEERR_AO for action required error") fixed the issue that Action
      Required memory errors could unnecessarily send SIGBUS to the processes
      which share the error memory.  But we still have another issue: SIGBUS
      could be sent to the wrong thread.
      
      This is because collect_procs() and task_early_kill() fail to add the
      current process to the "to-kill" list.  This patch fixes that.  With
      this fix, SIGBUS(BUS_MCEERR_AR) is never sent to a non-current
      process/thread.
      Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Tony Luck <tony.luck@intel.com>
      Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
      Link: http://lkml.kernel.org/r/1591321039-22141-3-git-send-email-naoya.horiguchi@nec.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      03151c6e
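
From the receiving side, the two SIGBUS flavours named in the commit above can be told apart by `si_code`. The sketch below is a user-space illustration only, assuming Linux and glibc: the `demo_*` names are hypothetical, and the fallback values for `BUS_MCEERR_AR` and `BUS_MCEERR_AO` are used only if the headers do not provide them.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>

/* Fallbacks in case the libc headers do not expose these si_code values. */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif
#ifndef BUS_MCEERR_AO
#define BUS_MCEERR_AO 5
#endif

enum demo_mce_kind {
	DEMO_MCE_NONE,			/* ordinary SIGBUS, not a memory error */
	DEMO_MCE_ACTION_REQUIRED,	/* synchronous: this thread hit the page */
	DEMO_MCE_ACTION_OPTIONAL	/* asynchronous: early notification */
};

/* Map a SIGBUS si_code to the kind of memory error it reports. */
enum demo_mce_kind demo_classify_sigbus(int si_code)
{
	switch (si_code) {
	case BUS_MCEERR_AR:
		return DEMO_MCE_ACTION_REQUIRED;
	case BUS_MCEERR_AO:
		return DEMO_MCE_ACTION_OPTIONAL;
	default:
		return DEMO_MCE_NONE;
	}
}

/* Last classification seen by the handler (async-signal-safe type). */
static volatile sig_atomic_t demo_last_kind;

/*
 * Handler sketch: it only records the kind. A real Action Required
 * handler must not simply return, since the faulting access would be
 * retried; recovery policy is application specific and not shown.
 */
static void demo_sigbus_handler(int sig, siginfo_t *info, void *ucontext)
{
	(void)sig;
	(void)ucontext;
	demo_last_kind = demo_classify_sigbus(info->si_code);
}

/* Install the handler with SA_SIGINFO so si_code is available. */
int demo_install_sigbus_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = demo_sigbus_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGBUS, &sa, NULL);
}
```

Because an Action Required error is synchronous, a handler like this is expected to run on the thread that touched the poisoned page, which is the behaviour the commit aims to guarantee.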
    • Naoya Horiguchi's avatar
      mm/memory-failure: prioritize prctl(PR_MCE_KILL) over vm.memory_failure_early_kill · 4e018b45
      Naoya Horiguchi authored
      Patch series "hwpoison: fixes signaling on memory error"
      
      This is a small patchset to solve issues in memory error handler to send
      SIGBUS to proper process/thread as expected in configuration.  Please
      see descriptions in individual patches for more details.
      
      This patch (of 2):
      
      Early-kill policy is controlled from two types of settings, one is
      per-process setting prctl(PR_MCE_KILL) and the other is system-wide
      setting vm.memory_failure_early_kill.  Users expect per-process setting
      to override system-wide setting as many other settings do, but
      early-kill setting doesn't work as such.
      
      For example, if a system configures vm.memory_failure_early_kill to 1
      (enabled), a process receives SIGBUS even if it's configured to
      explicitly disable PF_MCE_KILL by prctl().  That's not desirable for
      applications with their own policies.
      
      This patch changes the priority of these two types of settings by
      checking sysctl_memory_failure_early_kill only when a given process
      has the default kill policy.
      
      Note that this patch is solving a thread choice issue too.
      
      Originally, collect_procs() always chooses the main thread when
      vm.memory_failure_early_kill is 1, even if the process has a dedicated
      thread for memory error handling.  SIGBUS should be sent to the
      dedicated thread if early-kill is enabled via
      vm.memory_failure_early_kill as we are doing for PR_MCE_KILL_EARLY
      processes.
      Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
      Link: http://lkml.kernel.org/r/1591321039-22141-1-git-send-email-naoya.horiguchi@nec.com
      Link: http://lkml.kernel.org/r/1591321039-22141-2-git-send-email-naoya.horiguchi@nec.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4e018b45
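
The precedence described in the commit above (per-process setting first, system-wide sysctl only for processes left at the default) can be modelled as a small decision function. This is a sketch of the rule as stated in the message, not the kernel implementation; `demo_kill_policy` and `demo_wants_early_kill()` are hypothetical names.

```c
#include <stdbool.h>

/* Per-process policy, as a process might select it via prctl(PR_MCE_KILL). */
enum demo_kill_policy {
	DEMO_POLICY_DEFAULT,	/* no explicit per-process choice */
	DEMO_POLICY_EARLY,	/* process asked for early kill */
	DEMO_POLICY_LATE	/* process asked for late kill */
};

/*
 * Decide whether a process should get an early SIGBUS.
 * An explicit per-process policy always wins; the system-wide
 * setting is consulted only when the process kept the default.
 */
bool demo_wants_early_kill(enum demo_kill_policy per_process,
			   bool sysctl_early_kill)
{
	switch (per_process) {
	case DEMO_POLICY_EARLY:
		return true;
	case DEMO_POLICY_LATE:
		return false;
	default:
		return sysctl_early_kill;
	}
}
```

Under this model, a process that opted for late kill is not signalled early even when the system-wide setting is enabled, which is the case the commit message calls out.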
  3. 11 Jun, 2020 11 commits
    • Linus Torvalds's avatar
      Merge tag 'io_uring-5.8-2020-06-11' of git://git.kernel.dk/linux-block · b961f8dc
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "A few late stragglers in here. In particular:
      
         - Validate full range for provided buffers (Bijan)
      
         - Fix bad use of kfree() in buffer registration failure (Denis)
      
         - Don't allow close of ring itself, it's not fully safe. Making it
           fully safe would require making the system call more expensive,
           which isn't worth it.
      
         - Buffer selection fix
      
         - Regression fix for O_NONBLOCK retry
      
         - Make IORING_OP_ACCEPT honor O_NONBLOCK (Jiufei)
      
         - Restrict opcode handling for SQ/IOPOLL (Pavel)
      
         - io-wq work handling cleanups and improvements (Pavel, Xiaoguang)
      
         - IOPOLL race fix (Xiaoguang)"
      
      * tag 'io_uring-5.8-2020-06-11' of git://git.kernel.dk/linux-block:
        io_uring: fix io_kiocb.flags modification race in IOPOLL mode
        io_uring: check file O_NONBLOCK state for accept
        io_uring: avoid unnecessary io_wq_work copy for fast poll feature
        io_uring: avoid whole io_wq_work copy for requests completed inline
        io_uring: allow O_NONBLOCK async retry
        io_wq: add per-wq work handler instead of per work
        io_uring: don't arm a timeout through work.func
        io_uring: remove custom ->func handlers
        io_uring: don't derive close state from ->func
        io_uring: use kvfree() in io_sqe_buffer_register()
        io_uring: validate the full range of provided buffers for access
        io_uring: re-set iov base/len for buffer select retry
        io_uring: move send/recv IOPOLL check into prep
        io_uring: deduplicate io_openat{,2}_prep()
        io_uring: do build_open_how() only once
        io_uring: fix {SQ,IO}POLL with unsupported opcodes
        io_uring: disallow close of ring itself
      b961f8dc
    • Linus Torvalds's avatar
      Merge tag 'block-5.8-2020-06-11' of git://git.kernel.dk/linux-block · a58dfea2
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Some followup fixes for this merge window. In particular:
      
         - Seqcount write missing preemption disable for stats (Ahmed)
      
         - blktrace fixes (Chaitanya)
      
         - Redundant initializations (Colin)
      
         - Various small NVMe fixes (Chaitanya, Christoph, Daniel, Max,
           Niklas, Rikard)
      
         - loop flag bug regression fix (Martijn)
      
         - blk-mq tagging fixes (Christoph, Ming)"
      
      * tag 'block-5.8-2020-06-11' of git://git.kernel.dk/linux-block:
        umem: remove redundant initialization of variable ret
        pktcdvd: remove redundant initialization of variable ret
        nvmet: fail outstanding host posted AEN req
        nvme-pci: use simple suspend when a HMB is enabled
        nvme-fc: don't call nvme_cleanup_cmd() for AENs
        nvmet-tcp: constify nvmet_tcp_ops
        nvme-tcp: constify nvme_tcp_mq_ops and nvme_tcp_admin_mq_ops
        nvme: do not call del_gendisk() on a disk that was never added
        blk-mq: fix blk_mq_all_tag_iter
        blk-mq: split out a __blk_mq_get_driver_tag helper
        blktrace: fix endianness for blk_log_remap()
        blktrace: fix endianness in get_pdu_int()
        blktrace: use errno instead of bi_status
        block: nr_sects_write(): Disable preemption on seqcount write
        block: remove the error argument to the block_bio_complete tracepoint
        loop: Fix wrong masking of status flags
        block/bio-integrity: don't free 'buf' if bio_integrity_add_page() failed
      a58dfea2
    • David Howells's avatar
      afs: Fix afs_store_data() to set mtime in new operation descriptor · b3597945
      David Howells authored
      Fix afs_store_data() so that it sets the mtime in the new operation
      descriptor otherwise the mtime on the server gets set to 0 when a write is
      stored to the server.
      
      Fixes: e49c7b2f ("afs: Build an abstraction around an "operation" concept")
      Reported-by: Dave Botsch <botsch@cnf.cornell.edu>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b3597945
    • Linus Torvalds's avatar
      Merge tag 'x86-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6a45a658
      Linus Torvalds authored
      Pull more x86 updates from Thomas Gleixner:
       "A set of fixes and updates for x86:
      
         - Unbreak paravirt VDSO clocks.
      
           While the VDSO code was moved into lib for sharing a subtle check
           for the validity of paravirt clocks got replaced. While the
           replacement works perfectly fine for bare metal as the update of
           the VDSO clock mode is synchronous, it fails for paravirt clocks
           because the hypervisor can invalidate them asynchronously.
      
           Bring it back as an optional function so it does not inflict this
           on architectures which are free of PV damage.
      
         - Fix the jiffies to jiffies64 mapping on 64bit so it does not
           trigger an ODR violation on newer compilers
      
         - Three fixes for the SSBD and *IB* speculation mitigation maze to
           ensure consistency, not disabling of some *IB* variants wrongly and
           to prevent a rogue cross process shutdown of SSBD. All marked for
           stable.
      
         - Add yet more CPU models to the splitlock detection capable list
           !@#%$!
      
         - Bring the pr_info() back which tells that TSC deadline timer is
           enabled.
      
         - Reboot quirk for MacBook6,1"
      
      * tag 'x86-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/vdso: Unbreak paravirt VDSO clocks
        lib/vdso: Provide sanity check for cycles (again)
        clocksource: Remove obsolete ifdef
        x86_64: Fix jiffies ODR violation
        x86/speculation: PR_SPEC_FORCE_DISABLE enforcement for indirect branches.
        x86/speculation: Prevent rogue cross-process SSBD shutdown
        x86/speculation: Avoid force-disabling IBPB based on STIBP and enhanced IBRS.
        x86/cpu: Add Sapphire Rapids CPU model number
        x86/split_lock: Add Icelake microserver and Tigerlake CPU models
        x86/apic: Make TSC deadline timer detection message visible
        x86/reboot/quirks: Add MacBook6,1 reboot quirk
      6a45a658
    • Linus Torvalds's avatar
      Merge tag 'timers-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 92ac9712
      Linus Torvalds authored
      Pull timer fix from Thomas Gleixner:
       "A small fix for the VDSO code to force inline
        __cvdso_clock_gettime_common() so the compiler
        can't generate horrible code"
      
      * tag 'timers-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        lib/vdso: Force inlining of __cvdso_clock_gettime_common()
      92ac9712
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 623f6dc5
      Linus Torvalds authored
      Merge some more updates from Andrew Morton:
      
       - various hotfixes and minor things
      
       - hch's use_mm/unuse_mm cleanups
      
      Subsystems affected by this patch series: mm/hugetlb, scripts, kcov,
      lib, nilfs, checkpatch, lib, mm/debug, ocfs2, lib, misc.
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        kernel: set USER_DS in kthread_use_mm
        kernel: better document the use_mm/unuse_mm API contract
        kernel: move use_mm/unuse_mm to kthread.c
        stacktrace: cleanup inconsistent variable type
        lib: test get_count_order/long in test_bitops.c
        mm: add comments on pglist_data zones
        ocfs2: fix spelling mistake and grammar
        mm/debug_vm_pgtable: fix kernel crash by checking for THP support
        lib: fix bitmap_parse() on 64-bit big endian archs
        checkpatch: correct check for kernel parameters doc
        nilfs2: fix null pointer dereference at nilfs_segctor_do_construct()
        lib/lz4/lz4_decompress.c: document deliberate use of `&'
        kcov: check kcov_softirq in kcov_remote_stop()
        scripts/spelling: add a few more typos
        khugepaged: selftests: fix timeout condition in wait_for_scan()
    • dt-bindings: Fix more incorrect 'reg' property sizes in examples · 0db958b6
      Rob Herring authored
      The examples template is a 'simple-bus' with a size of 1 cell for
      #address-cells and #size-cells. The schema was only checking the entries
      had between 2 and 4 cells which really only errors on I2C or SPI type
      devices with a single cell.
      
      The easiest fix in most cases is to change the 'reg' property to 1 cell
      for address and size.
      
      Cc: "Heiko Stübner" <heiko@sntech.de>
      Cc: Ezequiel Garcia <ezequiel@collabora.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Philipp Zabel <p.zabel@pengutronix.de>
      Cc: Miquel Raynal <miquel.raynal@bootlin.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
      Cc: Kishon Vijay Abraham I <kishon@ti.com>
      Cc: Vinod Koul <vkoul@kernel.org>
      Cc: Liam Girdwood <lgirdwood@gmail.com>
      Cc: linux-rockchip@lists.infradead.org
      Cc: linux-media@vger.kernel.org
      Cc: linux-mtd@lists.infradead.org
      Cc: netdev@vger.kernel.org
      Cc: alsa-devel@alsa-project.org
      Acked-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Rob Herring <robh@kernel.org>
    • alpha: Fix build around srm_sysrq_reboot_op · 5cd221e8
      Joerg Roedel authored
      The patch introducing the struct was probably never compile tested,
      because it sets a handler with a wrong function signature. Wrap the
      handler into a function with the correct signature to fix the build.
      
      Fixes: 0f1c9688 ("tty/sysrq: alpha: export and use __sysrq_get_key_op()")
      Cc: Emil Velikov <emil.l.velikov@gmail.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
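
      The kind of fix described in this commit, wrapping a routine in a
      function whose signature matches what an ops table expects, can be
      sketched in user space as follows. The struct and function names here
      are hypothetical and only illustrate the pattern; this is not the
      actual alpha code.

```c
/* Hypothetical ops-table entry whose handler takes an int key. */
struct demo_key_op {
    void (*handler)(int key);
    const char *help_msg;
};

/* Counter so the demo can observe that the routine ran. */
static int demo_halt_calls;

/* Existing routine with a different signature: it takes no arguments,
 * so it cannot be assigned to .handler directly. */
static void demo_machine_halt(void)
{
    demo_halt_calls++;
}

/* Wrapper with the signature the table expects; the key is unused. */
static void demo_reboot_handler(int key)
{
    (void)key;
    demo_machine_halt();
}

static const struct demo_key_op demo_reboot_op = {
    .handler  = demo_reboot_handler,
    .help_msg = "reboot(b)",
};
```

      Calling demo_reboot_op.handler('b') then reaches demo_machine_halt()
      through a correctly typed function pointer.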
    • Merge tag 'riscv-for-linus-5.8-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · cd16ed33
      Linus Torvalds authored
      Pull more RISC-V updates from Palmer Dabbelt:
      
       - Kconfig select statements are now sorted alphanumerically
      
       - first-level interrupts are now handled via a full irqchip driver
      
       - CPU hotplug is fixed
      
       - vDSO calls now use the common vDSO infrastructure
      
      * tag 'riscv-for-linus-5.8-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: set the permission of vdso_data to read-only
        riscv: use vDSO common flow to reduce the latency of the time-related functions
        riscv: fix build warning of missing prototypes
        RISC-V: Don't mark init section as non-executable
        RISC-V: Force select RISCV_INTC for CONFIG_RISCV
        RISC-V: Remove do_IRQ() function
        clocksource/drivers/timer-riscv: Use per-CPU timer interrupt
        irqchip: RISC-V per-HART local interrupt controller driver
        RISC-V: Rename and move plic_find_hart_id() to arch directory
        RISC-V: self-contained IPI handling routine
        RISC-V: Sort select statements alphanumerically
    • Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 55d728b2
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "arm64 fixes that came in during the merge window.
      
        There will probably be more to come, but it doesn't seem like it's
        worth me sitting on these in the meantime.
      
         - Fix SCS debug check to report max stack usage in bytes as advertised
      
         - Fix typo: CONFIG_FTRACE_WITH_REGS => CONFIG_DYNAMIC_FTRACE_WITH_REGS
      
         - Fix incorrect mask in HiSilicon L3C perf PMU driver
      
         - Fix compat vDSO compilation under some toolchain configurations
      
         - Fix false UBSAN warning from ACPI IORT parsing code
      
         - Fix booting under bootloaders that ignore TEXT_OFFSET
      
         - Annotate debug initcall function with '__init'"
      
      * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: warn on incorrect placement of the kernel by the bootloader
        arm64: acpi: fix UBSAN warning
        arm64: vdso32: add CONFIG_THUMB2_COMPAT_VDSO
        drivers/perf: hisi: Fix wrong value for all counters enable
        arm64: ftrace: Change CONFIG_FTRACE_WITH_REGS to CONFIG_DYNAMIC_FTRACE_WITH_REGS
        arm64: debug: mark a function as __init to save some memory
        scs: Report SCS usage in bytes rather than number of entries
    • Merge tag 'm68knommu-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu · d3ea6934
      Linus Torvalds authored
      Pull m68knommu updates from Greg Ungerer:
      
       - cast cleanups in the user access macros

       - fix for a memory leak in the PCI probing error path
      
       - update of a defconfig
      
      * tag 'm68knommu-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
        m68k,nommu: fix implicit cast from __user in __{get,put}_user_asm()
        m68k,nommu: add missing __user in uaccess' __ptr() macro
        m68k: Drop CONFIG_MTD_M25P80 in stmark2_defconfig
        m68k/PCI: Fix a memory leak in an error handling path