1. 19 Apr, 2019 2 commits
    • rseq: Remove superfluous rseq_len from task_struct · 83b0b15b
      Mathieu Desnoyers authored
      The rseq system call, when invoked with flags of "0" or
      "RSEQ_FLAG_UNREGISTER" values, expects the rseq_len parameter to
      be equal to sizeof(struct rseq), which is fixed-size and fixed-layout,
      specified in uapi linux/rseq.h.
      
      Expecting a fixed size for rseq_len is a design choice that ensures
      multiple libraries and applications defining __rseq_abi in the same
      process agree on its exact size.
      
      Considering that this size is and will always be the same value, there
      is no point in saving this value within task_struct rseq_len. Remove
      this field from task_struct.
      
      No change in functionality intended.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ben Maurer <bmaurer@fb.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Lameter <cl@linux.com>
      Cc: Dave Watson <davejwatson@fb.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Joel Fernandes <joelaf@google.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-api@vger.kernel.org
      Link: http://lkml.kernel.org/r/20190305194755.2602-3-mathieu.desnoyers@efficios.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
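
      The fixed-size check described in this commit message can be sketched
      in user space as follows. This is only an illustration: the struct
      rseq_sketch layout and the check_rseq_len() helper are hypothetical
      stand-ins, not the kernel's code or the uapi definition.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the fixed-size, fixed-layout struct rseq.
 * The authoritative definition is the one in uapi linux/rseq.h.
 */
struct rseq_sketch {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
	uint32_t flags;
} __attribute__((aligned(32)));

/*
 * Accept a registration or unregistration request only when the caller
 * passes exactly sizeof(struct rseq_sketch). Because the only accepted
 * value is a compile-time constant, nothing needs to be stored per task.
 */
int check_rseq_len(uint32_t rseq_len)
{
	if (rseq_len != sizeof(struct rseq_sketch))
		return -EINVAL;
	return 0;
}
```

      Since every successful call must have passed the same constant, a
      later unregistration can compare against sizeof() again instead of a
      saved per-task length.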
    • rseq: Clean up comments by reflecting removal of event counter · bff9504b
      Mathieu Desnoyers authored
      The "event counter" was removed from rseq before it was merged upstream.
      However, a few comments in the source code still refer to it. Adapt the
      comments to match reality.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ben Maurer <bmaurer@fb.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Lameter <cl@linux.com>
      Cc: Dave Watson <davejwatson@fb.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Joel Fernandes <joelaf@google.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-api@vger.kernel.org
      Link: http://lkml.kernel.org/r/20190305194755.2602-2-mathieu.desnoyers@efficios.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 17 Apr, 2019 16 commits
    • Merge tag '5.1-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · e53f31bf
      Linus Torvalds authored
      Pull smb3 fixes from Steve French:
       "Five small SMB3 fixes, all also for stable - an important fix for an
        oplock (lease) bug, a handle leak, and three bugs spotted by KASAN"
      
      * tag '5.1-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        CIFS: keep FileInfo handle live during oplock break
        cifs: fix handle leak in smb2_query_symlink()
        cifs: Fix lease buffer length error
        cifs: Fix use-after-free in SMB2_read
        cifs: Fix use-after-free in SMB2_write
    • Merge tag 'for-linus-5.1-2' of git://github.com/cminyard/linux-ipmi · fe5cdef2
      Linus Torvalds authored
      Pull IPMI fixes from Corey Minyard:
       "Fixes for some bugs caused by recent changes. One crash if you feed bad
        data to the module parameters, one BUG that sometimes occurs when a
        user closes the connection, and one bug that causes the driver to not
        work if the configuration information only comes in from SMBIOS"
      
      * tag 'for-linus-5.1-2' of git://github.com/cminyard/linux-ipmi:
        ipmi: fix sleep-in-atomic in free_user at cleanup SRCU user->release_barrier
        ipmi: ipmi_si_hardcode.c: init si_type array to fix a crash
        ipmi: Fix failure on SMBIOS specified devices
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 2a3a028f
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Handle init flow failures properly in iwlwifi driver, from Shahar S
          Matityahu.
      
       2) mac80211 TXQs need to be unscheduled on powersave start, from Felix
          Fietkau.
      
       3) SKB memory accounting fix in A-MSDU aggregation, from Felix Fietkau.
      
       4) Increase RCU lock hold time in mlx5 FPGA code, from Saeed Mahameed.
      
       5) Avoid checksum complete with XDP in mlx5, also from Saeed.
      
       6) Fix netdev feature clobbering in ibmvnic driver, from Thomas Falcon.
      
       7) Partial sent TLS record leak fix from Jakub Kicinski.
      
       8) Reject zero size iova range in vhost, from Jason Wang.
      
       9) Allow pending work to complete before clcsock release from Karsten
          Graul.
      
      10) Fix XDP handling max MTU in thunderx, from Matteo Croce.
      
      11) A lot of protocols look at the sa_family field of a sockaddr before
          validating its length is large enough, from Tetsuo Handa.
      
      12) Don't write to free'd pointer in qede ptp error path, from Colin Ian
          King.
      
      13) Have to recompile IP options in ipv4_link_failure because it can be
          invoked from ARP, from Stephen Suryaputra.
      
      14) Doorbell handling fixes in qed from Denis Bolotin.
      
      15) Revert net-sysfs kobject register leak fix, it causes new problems.
          From Wang Hai.
      
      16) Spectre v1 fix in ATM code, from Gustavo A. R. Silva.
      
      17) Fix put of BROPT_VLAN_STATS_PER_PORT in bridging code, from Nikolay
          Aleksandrov.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (111 commits)
        socket: fix compat SO_RCVTIMEO_NEW/SO_SNDTIMEO_NEW
        tcp: tcp_grow_window() needs to respect tcp_space()
        ocelot: Clean up stats update deferred work
        ocelot: Don't sleep in atomic context (irqs_disabled())
        net: bridge: fix netlink export of vlan_stats_per_port option
        qed: fix spelling mistake "faspath" -> "fastpath"
        tipc: set sysctl_tipc_rmem and named_timeout right range
        tipc: fix link established but not in session
        net: Fix missing meta data in skb with vlan packet
        net: atm: Fix potential Spectre v1 vulnerabilities
        net/core: work around section mismatch warning for ptp_classifier
        net: bridge: fix per-port af_packet sockets
        bnx2x: fix spelling mistake "dicline" -> "decline"
        route: Avoid crash from dereferencing NULL rt->from
        MAINTAINERS: normalize Woojung Huh's email address
        bonding: fix event handling for stacked bonds
        Revert "net-sysfs: Fix memory leak in netdev_register_kobject"
        rtnetlink: fix rtnl_valid_stats_req() nlmsg_len check
        qed: Fix the DORQ's attentions handling
        qed: Fix missing DORQ attentions
        ...
    • ipmi: fix sleep-in-atomic in free_user at cleanup SRCU user->release_barrier · 3b9a9072
      Corey Minyard authored
      free_user() could be called in atomic context.
      
      This patch pushes the free operation off into a workqueue.
      
      Example:
      
       BUG: sleeping function called from invalid context at kernel/workqueue.c:2856
       in_atomic(): 1, irqs_disabled(): 0, pid: 177, name: ksoftirqd/27
       CPU: 27 PID: 177 Comm: ksoftirqd/27 Not tainted 4.19.25-3 #1
       Hardware name: AIC 1S-HV26-08/MB-DPSB04-06, BIOS IVYBV060 10/21/2015
       Call Trace:
        dump_stack+0x5c/0x7b
        ___might_sleep+0xec/0x110
        __flush_work+0x48/0x1f0
        ? try_to_del_timer_sync+0x4d/0x80
        _cleanup_srcu_struct+0x104/0x140
        free_user+0x18/0x30 [ipmi_msghandler]
        ipmi_free_recv_msg+0x3a/0x50 [ipmi_msghandler]
        deliver_response+0xbd/0xd0 [ipmi_msghandler]
        deliver_local_response+0xe/0x30 [ipmi_msghandler]
        handle_one_recv_msg+0x163/0xc80 [ipmi_msghandler]
        ? dequeue_entity+0xa0/0x960
        handle_new_recv_msgs+0x15c/0x1f0 [ipmi_msghandler]
        tasklet_action_common.isra.22+0x103/0x120
        __do_softirq+0xf8/0x2d7
        run_ksoftirqd+0x26/0x50
        smpboot_thread_fn+0x11d/0x1e0
        kthread+0x103/0x140
        ? sort_range+0x20/0x20
        ? kthread_destroy_worker+0x40/0x40
        ret_from_fork+0x1f/0x40
      
      Fixes: 77f82696 ("ipmi: fix use-after-free of user->release_barrier.rda")
      Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
      Cc: stable@vger.kernel.org # 5.0
      Cc: Yang Yingliang <yangyingliang@huawei.com>
    • socket: fix compat SO_RCVTIMEO_NEW/SO_SNDTIMEO_NEW · e6986423
      Arnd Bergmann authored
      It looks like the new socket options only work correctly
      for native execution, but in case of compat mode fall back
      to the old behavior as we ignore the 'old_timeval' flag.
      
      Rework so we treat SO_RCVTIMEO_NEW/SO_SNDTIMEO_NEW the
      same way in compat and native 32-bit mode.
      
      Cc: Deepa Dinamani <deepa.kernel@gmail.com>
      Fixes: a9beb86a ("sock: Add SO_RCVTIMEO_NEW and SO_SNDTIMEO_NEW")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: tcp_grow_window() needs to respect tcp_space() · 50ce163a
      Eric Dumazet authored
      For some reason, tcp_grow_window() correctly tests if enough room
      is present before attempting to increase tp->rcv_ssthresh,
      but does not prevent it from growing past tcp_space().
      
      This is causing hard-to-debug issues, like failing
      the (__tcp_select_window(sk) >= tp->rcv_wnd) test
      in __tcp_ack_snd_check(), causing ACK delays and possibly
      slow flows.
      
      Depending on tcp_rmem[2], MTU, skb->len/skb->truesize ratio,
      we can see the problem happening on "netperf -t TCP_RR -- -r 2000,2000"
      after about 60 round trips, when the active side no longer sends
      immediate acks.
      
      This bug predates git history.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Wei Wang <weiwan@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
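
      The constraint named in this commit title can be illustrated with a
      small stand-alone sketch. The helper below and its parameter names are
      hypothetical; it only shows the idea of capping growth at the smaller
      of the window clamp and the available space, and is not the kernel's
      tcp_grow_window() code.

```c
/* Return the smaller of two ints. */
static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/*
 * Grow rcv_ssthresh by at most 'incr', but never past either the window
 * clamp or the currently available receive space ('space' stands in for
 * the value tcp_space() would report).
 */
int grow_rcv_ssthresh_sketch(int rcv_ssthresh, int window_clamp,
			     int space, int incr)
{
	int room = min_int(window_clamp, space) - rcv_ssthresh;

	if (room > 0 && incr > 0)
		rcv_ssthresh += min_int(room, incr);
	return rcv_ssthresh;
}
```

      With rcv_ssthresh = 1000, window_clamp = 5000, space = 1500 and
      incr = 2000, the sketch stops at 1500 rather than growing to the
      clamp.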
    • ocelot: Clean up stats update deferred work · 1e1caa97
      Claudiu Manoil authored
      This is preventive cleanup that may save trouble later.
      No need to cancel repeatedly queued work if code is properly
      refactored.
      Don't let the ethtool -s process interfere with the stat workqueue
      scheduling.
      Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ocelot: Don't sleep in atomic context (irqs_disabled()) · a8fd48b5
      Claudiu Manoil authored
      Preemption disabled at:
       [<ffff000008cabd54>] dev_set_rx_mode+0x1c/0x38
       Call trace:
       [<ffff00000808a5c0>] dump_backtrace+0x0/0x3d0
       [<ffff00000808a9a4>] show_stack+0x14/0x20
       [<ffff000008e6c0c0>] dump_stack+0xac/0xe4
       [<ffff0000080fe76c>] ___might_sleep+0x164/0x238
       [<ffff0000080fe890>] __might_sleep+0x50/0x88
       [<ffff0000082261e4>] kmem_cache_alloc+0x17c/0x1d0
       [<ffff000000ea0ae8>] ocelot_set_rx_mode+0x108/0x188 [mscc_ocelot_common]
       [<ffff000008cabcf0>] __dev_set_rx_mode+0x58/0xa0
       [<ffff000008cabd5c>] dev_set_rx_mode+0x24/0x38
      
      Fixes: a556c76a ("net: mscc: Add initial Ocelot switch support")
      Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: fix netlink export of vlan_stats_per_port option · 600bea7d
      Nikolay Aleksandrov authored
      Since the introduction of the vlan_stats_per_port option the netlink
      export of it has been broken since I made a typo and used the ifla
      attribute instead of the bridge option to retrieve its state.
      Sysfs export is fine, only netlink export has been affected.
      
      Fixes: 9163a0fc ("net: bridge: add support for per-port vlan stats")
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: fix spelling mistake "faspath" -> "fastpath" · 3321b6c2
      Colin Ian King authored
      There is a spelling mistake in a DP_INFO message, fix it.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: set sysctl_tipc_rmem and named_timeout right range · 4bcd4ec1
      Jie Liu authored
      We find that sysctl_tipc_rmem and named_timeout do not have the right minimum
      setting. sysctl_tipc_rmem should be larger than zero, like sysctl_tcp_rmem.
      And named_timeout, as a timeout setting, should not be less than zero.
      
      Fixes: cc79dd1b ("tipc: change socket buffer overflow control to respect sk_rcvbuf")
      Fixes: a5325ae5 ("tipc: add name distributor resiliency queue")
      Signed-off-by: Jie Liu <liujie165@huawei.com>
      Reported-by: Qiang Ning <ningqiang1@huawei.com>
      Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
      Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix link established but not in session · f7a93780
      Tuong Lien authored
      According to the link FSM, when a link endpoint got RESET_MSG (- a
      traditional one without the stopping bit) from its peer, it moves to
      PEER_RESET state and raises a LINK_DOWN event which then resets the
      link itself. Its state will become ESTABLISHING after the reset event
      and the link will be re-established soon after this endpoint starts to
      send ACTIVATE_MSG to the peer.
      
      There is no problem with this mechanism, however the link resetting has
      cleared the link 'in_session' flag (along with the other important link
      data such as: the link 'mtu') that was correctly set up at the 1st step
      (i.e. when this endpoint received the peer RESET_MSG). As a result, the
      link will become ESTABLISHED, but the 'in_session' flag is not set, and
      all STATE_MSG from its peer will be dropped at the link_validate_msg().
      This means the link is not synced and will sooner or later face a failure.
      
      Since the link reset action is obviously needed for a new link session
      (this is also true in the other situations), the problem here is that
      the link is re-established a bit too early when the link endpoints are
      not really in-sync yet. The commit forces a resync as already done in
      the previous commit 91986ee1 ("tipc: fix link session and
      re-establish issues") by simply varying the link 'peer_session' value
      at the link_reset().
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Fix missing meta data in skb with vlan packet · d85e8be2
      Yuya Kusakabe authored
      skb_reorder_vlan_header() should move XDP meta data with ethernet header
      if XDP meta data exists.
      
      Fixes: de8f3a83 ("bpf: add meta pointer for direct access")
      Signed-off-by: Yuya Kusakabe <yuya.kusakabe@gmail.com>
      Signed-off-by: Takeru Hayasaka <taketarou2@gmail.com>
      Co-developed-by: Takeru Hayasaka <taketarou2@gmail.com>
      Reviewed-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: atm: Fix potential Spectre v1 vulnerabilities · 899537b7
      Gustavo A. R. Silva authored
      arg is controlled by user-space, hence leading to a potential
      exploitation of the Spectre variant 1 vulnerability.
      
      This issue was detected with the help of Smatch:
      
      net/atm/lec.c:715 lec_mcast_attach() warn: potential spectre issue 'dev_lec' [r] (local cap)
      
      Fix this by sanitizing arg before using it to index dev_lec.
      
      Notice that given that speculation windows are large, the policy is
      to kill the speculation on the first load and not worry if it can be
      completed with a dependent load/store [1].
      
      [1] https://lore.kernel.org/lkml/20180423164740.GY17484@dhcp22.suse.cz/
      
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
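
      "Sanitizing" a user-controlled index, as described above, usually
      means forcing an out-of-range value to zero with branch-free
      arithmetic before the array load. The following user-space sketch
      mirrors that masking idea. The function names are hypothetical, it
      assumes an arithmetic right shift of negative values (as GCC and Clang
      provide), and, unlike the kernel's array_index_nospec() helper, it
      makes no claim to stop speculation on real hardware.

```c
#include <limits.h>

/*
 * Build a mask that is all ones when index < size and zero otherwise,
 * without a conditional branch. Assumes size <= LONG_MAX.
 */
unsigned long index_mask_sketch(unsigned long index, unsigned long size)
{
	/* The top bit of (index | (size - 1 - index)) is set only when
	 * index >= size, because the subtraction then wraps around. */
	long v = (long)(index | (size - 1UL - index));

	/* Assumption: >> on a negative long is an arithmetic shift. */
	return (unsigned long)(~v >> (sizeof(long) * CHAR_BIT - 1));
}

/* Clamp an untrusted index: in-range values pass, others become 0. */
unsigned long sanitize_index_sketch(unsigned long index, unsigned long size)
{
	return index & index_mask_sketch(index, size);
}
```

      A caller would still perform the ordinary bounds check first; the
      mask only ensures that a mispredicted check cannot be followed by a
      load from an attacker-chosen offset.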
    • net/core: work around section mismatch warning for ptp_classifier · ad910c7c
      Ard Biesheuvel authored
      The routine ptp_classifier_init() uses an initializer for an
      automatic struct type variable which refers to an __initdata
      symbol. This is perfectly legal, but may trigger a section
      mismatch warning when running the compiler in -fpic mode, due
      to the fact that the initializer may be emitted into an anonymous
      .data section that lacks the __init annotation. So work around it
      by using assignments instead.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: fix per-port af_packet sockets · 3b2e2904
      Nikolay Aleksandrov authored
      When the commit below was introduced it changed two visible things:
       - the skb was no longer passed through the protocol handlers with the
         original device
       - the skb was passed up the stack with skb->dev = bridge
      
      The first change broke af_packet sockets on bridge ports. For example we
      use them for hostapd which listens for ETH_P_PAE packets on the ports.
      We discussed two possible fixes:
       - create a clone and pass it through NF_HOOK(), act on the original skb
         based on the result
       - somehow signal to the caller from the okfn() that it was called,
         meaning the skb is ok to be passed, which this patch is trying to
         implement via returning 1 from the bridge link-local okfn()
      
      Note that we rely on the fact that NF_QUEUE/STOLEN would return 0 and
      drop/error would return < 0 thus the okfn() is called only when the
      return was 1, so we signal to the caller that it was called by preserving
      the return value from nf_hook().
      
      Fixes: 8626c56c ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict")
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 16 Apr, 2019 22 commits
    • ipmi: ipmi_si_hardcode.c: init si_type array to fix a crash · a885bcfd
      Tony Camuso authored
      The intended behavior of function ipmi_hardcode_init_one() is to default
      to kcs interface when no type argument is presented when initializing
      ipmi with hard coded addresses.
      
      However, the array of char pointers allocated on the stack by function
      ipmi_hardcode_init() was not initialized to zeroes, so it contained stack
      debris.
      
      Consequently, passing the cruft stored in this array to function
      ipmi_hardcode_init_one() caused a crash when it was unable to detect
      that the char * being passed was nonsense and tried to access the
      address specified by the bogus pointer.
      
      The fix is simply to initialize the si_type array to zeroes, so if
      there was no type argument given at the command line, function
      ipmi_hardcode_init_one() could properly default to the kcs interface.
      Signed-off-by: Tony Camuso <tcamuso@redhat.com>
      Message-Id: <1554837603-40299-1-git-send-email-tcamuso@redhat.com>
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
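
      The failure mode and the fix can be illustrated with a short sketch.
      The array size, the helper names, and the string "kcs" used as the
      default are illustrative assumptions; this is not the driver's code.

```c
#include <stddef.h>

#define MAX_HARDCODED_SKETCH 4	/* hypothetical array size */

/*
 * Fall back to the kcs interface when no type string was supplied.
 * This test is only meaningful if unset slots are really NULL.
 */
const char *pick_interface_type_sketch(const char *si_type)
{
	if (si_type == NULL || si_type[0] == '\0')
		return "kcs";
	return si_type;
}

/*
 * Model of the fixed initialization: "= { NULL }" zeroes every slot of
 * the on-stack array, so a slot with no parsed argument holds NULL
 * rather than leftover stack contents.
 */
const char *type_without_arguments_sketch(void)
{
	const char *si_type[MAX_HARDCODED_SKETCH] = { NULL };

	return pick_interface_type_sketch(si_type[0]);
}
```

      Without the initializer, si_type[0] would be indeterminate, and the
      NULL test in the helper could not tell a missing argument from a
      garbage pointer.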
    • ipmi: Fix failure on SMBIOS specified devices · bd2e98b3
      Corey Minyard authored
      An extra memset was put into a place that cleared the interface
      type.
      Reported-by: Tony Camuso <tcamuso@redhat.com>
      Fixes: 3cd83bac ("ipmi: Consolidate the adding of platform devices")
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
    • Merge tag 'riscv-for-linus-5.1-rc6' of... · 444fe991
      Linus Torvalds authored
      Merge tag 'riscv-for-linus-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
      
      Pull RISC-V fixes from Palmer Dabbelt:
       "This contains an assortment of RISC-V-related fixups that we found
        after rc4. They're all really unrelated:
      
         - The addition of a 32-bit defconfig, to emphasize testing the 32-bit
           port.
      
         - A device tree bindings patch, which is pre-work for some patches
           that target 5.2.
      
         - A fix to support booting on systems with more physical memory than
           the maximum supported by the kernel"
      
      * tag 'riscv-for-linus-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
        RISC-V: Fix Maximum Physical Memory 2GiB option for 64bit systems
        dt-bindings: clock: sifive: add FU540-C000 PRCI clock constants
        RISC-V: Add separate defconfig for 32bit systems
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · b5de3c50
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
       "5.1 keeps its reputation as a big bugfix release for KVM x86.
      
         - Fix for a memory leak introduced during the merge window
      
         - Fixes for nested VMX with ept=0
      
         - Fixes for AMD (APIC virtualization, NMI injection)
      
         - Fixes for Hyper-V under KVM and KVM under Hyper-V
      
         - Fixes for 32-bit SMM and tests for SMM virtualization
      
         - More array_index_nospec peppering"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits)
        KVM: x86: avoid misreporting level-triggered irqs as edge-triggered in tracing
        KVM: fix spectrev1 gadgets
        KVM: x86: fix warning Using plain integer as NULL pointer
        selftests: kvm: add a selftest for SMM
        selftests: kvm: fix for compilers that do not support -no-pie
        selftests: kvm/evmcs_test: complete I/O before migrating guest state
        KVM: x86: Always use 32-bit SMRAM save state for 32-bit kernels
        KVM: x86: Don't clear EFER during SMM transitions for 32-bit vCPU
        KVM: x86: clear SMM flags before loading state while leaving SMM
        KVM: x86: Open code kvm_set_hflags
        KVM: x86: Load SMRAM in a single shot when leaving SMM
        KVM: nVMX: Expose RDPMC-exiting only when guest supports PMU
        KVM: x86: Raise #GP when guest vCPU do not support PMU
        x86/kvm: move kvm_load/put_guest_xcr0 into atomic context
        KVM: x86: svm: make sure NMI is injected after nmi_singlestep
        svm/avic: Fix invalidate logical APIC id entry
        Revert "svm: Fix AVIC incomplete IPI emulation"
        kvm: mmu: Fix overflow on kvm mmu page limit calculation
        KVM: nVMX: always use early vmcs check when EPT is disabled
        KVM: nVMX: allow tests to use bad virtual-APIC page address
        ...
    • CIFS: keep FileInfo handle live during oplock break · b98749ca
      Aurelien Aptel authored
      In the oplock break handler, writing pending changes from pages puts
      the FileInfo handle. If the refcount reaches zero it closes the handle
      and waits for any oplock break handler to return, thus causing a deadlock.
      
      To prevent this situation:
      
      * We add a wait flag to cifsFileInfo_put() to decide whether we should
        wait for running/pending oplock break handlers
      
      * We keep an additional reference of the SMB FileInfo handle so that
        for the rest of the handler putting the handle won't close it.
        - The ref is bumped every time we queue the handler via the
          cifs_queue_oplock_break() helper.
        - The ref is decremented at the end of the handler
      
      This bug was triggered by xfstest 464.
      
      This is also an important fix for the various reports of
      oops in smb2_push_mandatory_locks.
      Signed-off-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
      CC: Stable <stable@vger.kernel.org>
    • cifs: fix handle leak in smb2_query_symlink() · e6d0fb7b
      Ronnie Sahlberg authored
      If we enter smb2_query_symlink() for something that is not a symlink
      and where the SMB2_open() would succeed we would never end up
      closing this handle and would thus leak a handle on the server.
      
      Fix this by immediately calling SMB2_close() on successful open.
      Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
    • cifs: Fix lease buffer length error · b57a55e2
      ZhangXiaoxu authored
      There is a KASAN slab-out-of-bounds:
      BUG: KASAN: slab-out-of-bounds in _copy_from_iter_full+0x783/0xaa0
      Read of size 80 at addr ffff88810c35e180 by task mount.cifs/539
      
      CPU: 1 PID: 539 Comm: mount.cifs Not tainted 4.19 #10
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
                  rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
      Call Trace:
       dump_stack+0xdd/0x12a
       print_address_description+0xa7/0x540
       kasan_report+0x1ff/0x550
       check_memory_region+0x2f1/0x310
       memcpy+0x2f/0x80
       _copy_from_iter_full+0x783/0xaa0
       tcp_sendmsg_locked+0x1840/0x4140
       tcp_sendmsg+0x37/0x60
       inet_sendmsg+0x18c/0x490
       sock_sendmsg+0xae/0x130
       smb_send_kvec+0x29c/0x520
       __smb_send_rqst+0x3ef/0xc60
       smb_send_rqst+0x25a/0x2e0
       compound_send_recv+0x9e8/0x2af0
       cifs_send_recv+0x24/0x30
       SMB2_open+0x35e/0x1620
       open_shroot+0x27b/0x490
       smb2_open_op_close+0x4e1/0x590
       smb2_query_path_info+0x2ac/0x650
       cifs_get_inode_info+0x1058/0x28f0
       cifs_root_iget+0x3bb/0xf80
       cifs_smb3_do_mount+0xe00/0x14c0
       cifs_do_mount+0x15/0x20
       mount_fs+0x5e/0x290
       vfs_kern_mount+0x88/0x460
       do_mount+0x398/0x31e0
       ksys_mount+0xc6/0x150
       __x64_sys_mount+0xea/0x190
       do_syscall_64+0x122/0x590
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      It can be reproduced by the following steps:
        1. samba configured with: server max protocol = SMB2_10
        2. mount -o vers=default
      
      When the mount version parameter is parsed, 'ops' and 'vals' are
      both set to smb30. If the negotiated result is smb21, only 'ops'
      is updated to smb21, while 'vals' remains smb30.
      When the lease context is added, iov_base is allocated with the
      smb21 ops, but iov_len is initialized with the smb30 value. Because
      iov_len is larger than the buffer behind iov_base, sending the
      message copies out of bounds.
      
      We need to keep 'ops' and 'vals' consistent.
      
      Fixes: 9764c02f ("SMB3: Add support for multidialect negotiate (SMB2.1 and later)")
      Fixes: d5c7076b ("smb3: add smb3.1.1 to default dialect list")
      Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      CC: Stable <stable@vger.kernel.org>
      Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
    • cifs: Fix use-after-free in SMB2_read · 088aaf17
      ZhangXiaoxu authored
      There is a KASAN use-after-free:
      BUG: KASAN: use-after-free in SMB2_read+0x1136/0x1190
      Read of size 8 at addr ffff8880b4e45e50 by task ln/1009
      
      Do not release 'req' early, because it is still used in the trace.
      
      Fixes: eccb4422 ("smb3: Add ftrace tracepoints for improved SMB3 debugging")
      Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      CC: Stable <stable@vger.kernel.org> 4.18+
      Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
    • cifs: Fix use-after-free in SMB2_write · 6a3eb336
      ZhangXiaoxu authored
      There is a KASAN use-after-free:
      BUG: KASAN: use-after-free in SMB2_write+0x1342/0x1580
      Read of size 8 at addr ffff8880b6a8e450 by task ln/4196
      
      Do not release 'req' early, because it is still used in the trace.
      
      Fixes: eccb4422 ("smb3: Add ftrace tracepoints for improved SMB3 debugging")
      Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      CC: Stable <stable@vger.kernel.org> 4.18+
      Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
    • KVM: x86: avoid misreporting level-triggered irqs as edge-triggered in tracing · 7a223e06
      Vitaly Kuznetsov authored
      In the __apic_accept_irq() interface, trig_mode is an int, and on some
      code paths it is set to a value that does not fit in a u8:
      
      kvm_apic_set_irq() extracts it from 'struct kvm_lapic_irq' where trig_mode
      is u16. This is done on purpose as e.g. kvm_set_msi_irq() sets it to
      (1 << 15) & e->msi.data
      
      kvm_apic_local_deliver sets it to reg & (1 << 15).
      
      Fix the immediate issue by making 'tm' into u16. We may also want to adjust
      __apic_accept_irq() interface and use proper sizes for vector, level,
      trig_mode but this is not urgent.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
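The truncation behind this fix can be shown with a short userspace C sketch. It assumes only what the message states: the trigger mode arrives as a value masked with (1 << 15). The helper names `extract_trig_mode()`, `is_level_via_u8()` and `is_level_via_u16()` are hypothetical and do not appear in KVM.

```c
#include <stdint.h>

/* Bit 15 carries the trigger mode in the values quoted above. */
#define TRIG_MODE_BIT (1 << 15)

/* Mimics "(1 << 15) & data": the result is either 0 or 0x8000. */
static int extract_trig_mode(uint32_t data)
{
    return (int)(data & TRIG_MODE_BIT);
}

/* A u8 keeps only bits 0-7, so bit 15 is lost and the result is always 0. */
static int is_level_via_u8(int trig_mode)
{
    uint8_t tm = (uint8_t)trig_mode;
    return tm != 0;
}

/* A u16 is wide enough to keep bit 15. */
static int is_level_via_u16(int trig_mode)
{
    uint16_t tm = (uint16_t)trig_mode;
    return tm != 0;
}
```

With a value that has bit 15 set, the u8 variant reports 0 (edge) while the u16 variant reports 1 (level), which matches the misreporting named in the commit title.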
    • KVM: fix spectrev1 gadgets · 1d487e9b
      Paolo Bonzini authored
      These were found with smatch, and then generalized when applicable.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
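The message does not say how the gadgets were closed. A common approach in the kernel is to clamp an index after its bounds check with array_index_nospec() from <linux/nospec.h>. The userspace sketch below only illustrates the branch-free masking idea behind that kind of helper. `index_mask()`, `clamp_index()` and `read_entry()` are hypothetical names, and plain C like this gives no real guarantee against speculation, since a compiler is free to transform it.

```c
#include <limits.h>

/* All-ones when index < size, zero otherwise, computed without a branch.
 * Assumes size is no larger than LONG_MAX. */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
    const unsigned int top = sizeof(unsigned long) * CHAR_BIT - 1;
    /* The top bit of (index | (size - 1 - index)) is set iff index >= size. */
    unsigned long out_of_range = (index | (size - 1UL - index)) >> top;

    return out_of_range - 1UL;   /* 0 - 1 = all ones, 1 - 1 = 0 */
}

/* In-range indexes pass through unchanged; out-of-range ones become 0. */
static unsigned long clamp_index(unsigned long index, unsigned long size)
{
    return index & index_mask(index, size);
}

/* Bounds check first, then clamp the index that is actually used. */
static int read_entry(const int *table, unsigned long size, unsigned long index)
{
    if (index >= size)
        return -1;
    return table[clamp_index(index, size)];
}
```

The point of the pattern is that the index used for the memory access is forced into range by data flow and not only by the preceding branch.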
    • KVM: x86: fix warning Using plain integer as NULL pointer · be43c440
      Hariprasad Kelam authored
      Pass NULL instead of 0 as the argument, which resolves the sparse warning below:
      
      arch/x86/kvm/x86.c:3096:61: warning: Using plain integer as NULL pointer
      Signed-off-by: Hariprasad Kelam <hariprasad.kelam@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • selftests: kvm: add a selftest for SMM · 79904c9d
      Vitaly Kuznetsov authored
      Add a simple test for SMM, based on VMX.  The test implements its own
      sync between the guest and the host, as using our ucall library seems to
      be too cumbersome: the SMI handler runs in real-address mode.
      
      This patch also fixes KVM_SET_NESTED_STATE to happen after
      KVM_SET_VCPU_EVENTS, in fact it places it last.  This is because
      KVM needs to know whether the processor is in SMM or not.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • selftests: kvm: fix for compilers that do not support -no-pie · c2390f16
      Paolo Bonzini authored
      -no-pie was added to GCC at the same time as their configuration option
      --enable-default-pie.  Compilers that were built before do not have
      -no-pie, but they also do not need it.  Detect the option at build
      time.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • selftests: kvm/evmcs_test: complete I/O before migrating guest state · c68c21ca
      Paolo Bonzini authored
      Starting state migration after an IO exit without first completing IO
      may result in test failures.  We already have two tests that need this
      (this patch in fact fixes evmcs_test, similar to what was fixed for
      state_test in commit 0f73bbc8, "KVM: selftests: complete IO before
      migrating guest state", 2019-03-13) and a third is coming.  So, move the
      code to vcpu_save_state, and while at it do not access register state
      until after I/O is complete.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Always use 32-bit SMRAM save state for 32-bit kernels · b68f3cc7
      Sean Christopherson authored
      Invoking the 64-bit variation on a 32-bit kernel will crash the guest,
      trigger a WARN, and/or lead to a buffer overrun in the host, e.g.
      rsm_load_state_64() writes r8-r15 unconditionally, but enum kvm_reg and
      thus x86_emulate_ctxt._regs only define r8-r15 for CONFIG_X86_64.
      
      KVM allows userspace to report long mode support via CPUID, even though
      the guest is all but guaranteed to crash if it actually tries to enable
      long mode.  But, a pure 32-bit guest that is ignorant of long mode will
      happily plod along.
      
      SMM complicates things as 64-bit CPUs use a different SMRAM save state
      area.  KVM handles this correctly for 64-bit kernels, e.g. uses the
      legacy save state map if userspace has hidden long mode from the guest,
      but doesn't fare well when userspace reports long mode support on a
      32-bit host kernel (32-bit KVM doesn't support 64-bit guests).
      
      Since the alternative is to crash the guest, e.g. by not loading state
      or explicitly requesting shutdown, unconditionally use the legacy SMRAM
      save state map for 32-bit KVM.  If a guest has managed to get far enough
      to handle SMIs when running under a weird/buggy userspace hypervisor,
      then don't deliberately crash the guest since there are no downsides
      (from KVM's perspective) to allow it to continue running.
      
      Fixes: 660a5d51 ("KVM: x86: save/load state on SMM switch")
      Cc: stable@vger.kernel.org
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Don't clear EFER during SMM transitions for 32-bit vCPU · 8f4dc2e7
      Sean Christopherson authored
      Neither AMD nor Intel CPUs have an EFER field in the legacy SMRAM save
      state area, i.e. don't save/restore EFER across SMM transitions.  KVM
      somewhat models this, e.g. doesn't clear EFER on entry to SMM if the
      guest doesn't support long mode.  But during RSM, KVM unconditionally
      clears EFER so that it can get back to pure 32-bit mode in order to
      start loading CRs with their actual non-SMM values.
      
      Clear EFER only when it will be written when loading the non-SMM state
      so as to preserve bits that can theoretically be set on 32-bit vCPUs,
      e.g. KVM always emulates EFER_SCE.
      
      And because CR4.PAE is cleared only to play nice with EFER, wrap that
      code in the long mode check as well.  Note, this may result in a
      compiler warning about cr4 being consumed uninitialized.  Re-read CR4
      even though it's technically unnecessary, as doing so allows for more
      readable code and RSM emulation is not a performance critical path.
      
      Fixes: 660a5d51 ("KVM: x86: save/load state on SMM switch")
      Cc: stable@vger.kernel.org
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: clear SMM flags before loading state while leaving SMM · 9ec19493
      Sean Christopherson authored
      RSM emulation is currently broken on VMX when the interrupted guest has
      CR4.VMXE=1.  Stop dancing around the issue of HF_SMM_MASK being set when
      loading SMSTATE into architectural state, e.g. by toggling it for
      problematic flows, and simply clear HF_SMM_MASK prior to loading
      architectural state (from SMRAM save state area).
      Reported-by: Jon Doron <arilou@gmail.com>
      Cc: Jim Mattson <jmattson@google.com>
      Cc: Liran Alon <liran.alon@oracle.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Fixes: 5bea5123 ("KVM: VMX: check nested state and CR4.VMXE against SMM")
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Open code kvm_set_hflags · c5833c7a
      Sean Christopherson authored
      Prepare for clearing HF_SMM_MASK prior to loading state from the SMRAM
      save state map, i.e. kvm_smm_changed() needs to be called after state
      has been loaded and so cannot be done automatically when setting
      hflags from RSM.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Load SMRAM in a single shot when leaving SMM · ed19321f
      Sean Christopherson authored
      RSM emulation is currently broken on VMX when the interrupted guest has
      CR4.VMXE=1.  Rather than dance around the issue of HF_SMM_MASK being set
      when loading SMSTATE into architectural state, ideally RSM emulation
      itself would be reworked to clear HF_SMM_MASK prior to loading non-SMM
      architectural state.
      
      Ostensibly, the only motivation for having HF_SMM_MASK set throughout
      the loading of state from the SMRAM save state area is so that the
      memory accesses from GET_SMSTATE() are tagged with role.smm.  Load
      all of the SMRAM save state area from guest memory at the beginning of
      RSM emulation, and load state from the buffer instead of reading guest
      memory one-by-one.
      
      This paves the way for clearing HF_SMM_MASK prior to loading state,
      and also aligns RSM with the enter_smm() behavior, which fills a
      buffer and writes SMRAM save state in a single go.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
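The "load once, then read from the buffer" approach described in this commit can be sketched in userspace C. This is an illustration under stated assumptions, not the KVM code: the 512-byte `SAVE_STATE_SIZE`, the flat `guest_mem` array standing in for guest memory, and the helpers `load_save_state()` and `get_field32()` are all hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define SAVE_STATE_SIZE 512  /* assumed size of the save state area */

/* Copy the whole save state area out of (simulated) guest memory once. */
static void load_save_state(uint8_t buf[SAVE_STATE_SIZE],
                            const uint8_t *guest_mem)
{
    memcpy(buf, guest_mem, SAVE_STATE_SIZE);
}

/* Read a 32-bit field from the local copy. memcpy avoids alignment
 * problems; the value is interpreted in host byte order. */
static uint32_t get_field32(const uint8_t buf[SAVE_STATE_SIZE], size_t offset)
{
    uint32_t val;

    memcpy(&val, buf + offset, sizeof(val));
    return val;
}
```

Once the copy is taken, field reads no longer touch guest memory, so they no longer depend on whatever mode flag governs those memory accesses. That independence is what lets a flag such as HF_SMM_MASK be cleared before state is loaded.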
    • KVM: nVMX: Expose RDPMC-exiting only when guest supports PMU · e51bfdb6
      Liran Alon authored
      The issue was discovered when running kvm-unit-tests on KVM running as
      L1 on top of Hyper-V.
      
      When the vmx_instruction_intercept unit test runs RDPMC to test
      RDPMC-exiting, the instruction is intercepted by L1 KVM, whose
      EXIT_REASON_RDPMC handler raises #GP because the vCPU exposed by
      Hyper-V doesn't support a PMU. The unit test instead expects the exit
      to be reflected to it as EXIT_REASON_RDPMC.
      
      The vmx_instruction_intercept unit test attempts to run RDPMC even
      though Hyper-V doesn't support a PMU because L1 exposes support for
      RDPMC-exiting to L2. It is reasonable to assume that RDPMC-exiting is
      supported only when the CPU supports a PMU to begin with.
      
      The issue can easily be simulated by modifying the
      vmx_instruction_intercept config in x86/unittests.cfg to run QEMU with
      "-cpu host,+vmx,-pmu" and running the unit test.
      
      To handle the issue, change KVM to expose RDPMC-exiting only when the
      guest supports a PMU.
      Reported-by: Saar Amar <saaramar@microsoft.com>
      Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Signed-off-by: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Raise #GP when guest vCPU do not support PMU · 672ff6cf
      Liran Alon authored
      Before this change, reading a VMware pseudo PMC succeeded even when
      the guest did not support a PMU. This can easily be seen by running the
      kvm-unit-tests vmware_backdoors test with the "-cpu host,-pmu" option.
      Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
      Signed-off-by: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>