1. 14 Jul, 2019 20 commits
  2. 10 Jul, 2019 20 commits
    • Greg Kroah-Hartman's avatar
      Linux 5.1.17 · 4b886fa2
      Greg Kroah-Hartman authored
      4b886fa2
    • Roman Bolshakov's avatar
      scsi: target/iblock: Fix overrun in WRITE SAME emulation · 10a57320
      Roman Bolshakov authored
      commit 5676234f upstream.
      
      WRITE SAME corrupts data on the block device behind iblock if the command
      is emulated. The emulation code issues (M - 1) * N times more bios than
      requested, where M is the number of 512 blocks per real block size and N is
      the NUMBER OF LOGICAL BLOCKS specified in WRITE SAME command. So, for a
      device with 4k blocks, 7 * N more LBAs gets written after the requested
      range.
      
      The issue happens because the number of 512 byte sectors to be written is
      decreased one by one while the real bios are typically from 1 to 8 512 byte
      sectors per bio.
      
      Fixes: c66ac9db ("[SCSI] target: Add LIO target core v4.0.0-rc6")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarRoman Bolshakov <r.bolshakov@yadro.com>
      Reviewed-by: default avatarBart Van Assche <bvanassche@acm.org>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      10a57320
    • Geert Uytterhoeven's avatar
      fs: VALIDATE_FS_PARSER should default to n · c43646a7
      Geert Uytterhoeven authored
      commit 75f2d86b upstream.
      
      CONFIG_VALIDATE_FS_PARSER is a debugging tool to check that the parser
      tables are vaguely sane.  It was set to default to 'Y' for the moment to
      catch errors in upcoming fs conversion development.
      
      Make sure it is not enabled by default in the final release of v5.1.
      
      Fixes: 31d921c7 ("vfs: Add configuration parser helpers")
      Signed-off-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c43646a7
    • Dan Carpenter's avatar
      dmaengine: jz4780: Fix an endian bug in IRQ handler · abedf70e
      Dan Carpenter authored
      commit 4c89cc73 upstream.
      
      The "pending" variable was a u32 but we cast it to an unsigned long
      pointer when we do the for_each_set_bit() loop.  The problem is that on
      big endian 64bit systems that results in an out of bounds read.
      
      Fixes: 4e4106f5 ("dmaengine: jz4780: Fix transfers being ACKed too soon")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarVinod Koul <vkoul@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      abedf70e
    • Robin Gong's avatar
      dmaengine: imx-sdma: remove BD_INTR for channel0 · 598ffd87
      Robin Gong authored
      commit 3f93a4f2 upstream.
      
      It is possible for an irq triggered by channel0 to be received later
      after clks are disabled once firmware loaded during sdma probe. If
      that happens then clearing them by writing to SDMA_H_INTR won't work
      and the kernel will hang processing infinite interrupts. Actually,
      don't need interrupt triggered on channel0 since it's pollling
      SDMA_H_STATSTOP to know channel0 done rather than interrupt in
      current code, just clear BD_INTR to disable channel0 interrupt to
      avoid the above case.
      This issue was brought by commit 1d069bfa ("dmaengine: imx-sdma:
      ack channel 0 IRQ in the interrupt handler") which didn't take care
      the above case.
      
      Fixes: 1d069bfa ("dmaengine: imx-sdma: ack channel 0 IRQ in the interrupt handler")
      Cc: stable@vger.kernel.org #5.0+
      Signed-off-by: default avatarRobin Gong <yibin.gong@nxp.com>
      Reported-by: default avatarSven Van Asbroeck <thesven73@gmail.com>
      Tested-by: default avatarSven Van Asbroeck <thesven73@gmail.com>
      Reviewed-by: default avatarMichael Olbrich <m.olbrich@pengutronix.de>
      Signed-off-by: default avatarVinod Koul <vkoul@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      598ffd87
    • Sricharan R's avatar
      dmaengine: qcom: bam_dma: Fix completed descriptors count · c68eb98a
      Sricharan R authored
      commit f6034225 upstream.
      
      One space is left unused in circular FIFO to differentiate
      'full' and 'empty' cases. So take that in to account while
      counting for the descriptors completed.
      
      Fixes the issue reported here,
      	https://lkml.org/lkml/2019/6/18/669
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarSrinivas Kandagatla <srinivas.kandagatla@linaro.org>
      Signed-off-by: default avatarSricharan R <sricharan@codeaurora.org>
      Tested-by: default avatarSrinivas Kandagatla <srinivas.kandagatla@linaro.org>
      Signed-off-by: default avatarVinod Koul <vkoul@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c68eb98a
    • Cedric Hombourger's avatar
      MIPS: have "plain" make calls build dtbs for selected platforms · b7948d98
      Cedric Hombourger authored
      commit 637dfa0f upstream.
      
      scripts/package/builddeb calls "make dtbs_install" after executing
      a plain make (i.e. no build targets specified). It will fail if dtbs
      were not built beforehand. Match the arm64 architecture where DTBs get
      built by the "all" target.
      Signed-off-by: default avatarCedric Hombourger <Cedric_Hombourger@mentor.com>
      [paul.burton@mips.com: s/builddep/builddeb]
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: stable@vger.kernel.org # v4.1+
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b7948d98
    • Dmitry Korotin's avatar
      MIPS: Add missing EHB in mtc0 -> mfc0 sequence. · 4b961552
      Dmitry Korotin authored
      commit 0b24cae4 upstream.
      
      Add a missing EHB (Execution Hazard Barrier) in mtc0 -> mfc0 sequence.
      Without this execution hazard barrier it's possible for the value read
      back from the KScratch register to be the value from before the mtc0.
      
      Reproducible on P5600 & P6600.
      
      The hazard is documented in the MIPS Architecture Reference Manual Vol.
      III: MIPS32/microMIPS32 Privileged Resource Architecture (MD00088), rev
      6.03 table 8.1 which includes:
      
         Producer | Consumer | Hazard
        ----------|----------|----------------------------
         mtc0     | mfc0     | any coprocessor 0 register
      Signed-off-by: default avatarDmitry Korotin <dkorotin@wavecomp.com>
      [paul.burton@mips.com:
        - Commit message tweaks.
        - Add Fixes tags.
        - Mark for stable back to v3.15 where P5600 support was introduced.]
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Fixes: 3d8bfdd0 ("MIPS: Use C0_KScratch (if present) to hold PGD pointer.")
      Fixes: 829dcc0a ("MIPS: Add MIPS P5600 probe support")
      Cc: linux-mips@vger.kernel.org
      Cc: stable@vger.kernel.org # v3.15+
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4b961552
    • Hauke Mehrtens's avatar
      MIPS: Fix bounds check virt_addr_valid · e395c337
      Hauke Mehrtens authored
      commit d6ed083f upstream.
      
      The bounds check used the uninitialized variable vaddr, it should use
      the given parameter kaddr instead. When using the uninitialized value
      the compiler assumed it to be 0 and optimized this function to just
      return 0 in all cases.
      
      This should make the function check the range of the given address and
      only do the page map check in case it is in the expected range of
      virtual addresses.
      
      Fixes: 074a1e11 ("MIPS: Bounds check virt_addr_valid")
      Cc: stable@vger.kernel.org # v4.12+
      Cc: Paul Burton <paul.burton@mips.com>
      Signed-off-by: default avatarHauke Mehrtens <hauke@hauke-m.de>
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Cc: ralf@linux-mips.org
      Cc: jhogan@kernel.org
      Cc: f4bug@amsat.org
      Cc: linux-mips@vger.kernel.org
      Cc: ysu@wavecomp.com
      Cc: jcristau@debian.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e395c337
    • Chuck Lever's avatar
      svcrdma: Ignore source port when computing DRC hash · 3774d334
      Chuck Lever authored
      commit 1e091c3b upstream.
      
      The DRC appears to be effectively empty after an RPC/RDMA transport
      reconnect. The problem is that each connection uses a different
      source port, which defeats the DRC hash.
      
      Clients always have to disconnect before they send retransmissions
      to reset the connection's credit accounting, thus every retransmit
      on NFS/RDMA will miss the DRC.
      
      An NFS/RDMA client's IP source port is meaningless for RDMA
      transports. The transport layer typically sets the source port value
      on the connection to a random ephemeral port. The server already
      ignores it for the "secure port" check. See commit 16e4d93f
      ("NFSD: Ignore client's source port on RDMA transports").
      
      The Linux NFS server's DRC resolves XID collisions from the same
      source IP address by using the checksum of the first 200 bytes of
      the RPC call header.
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Cc: stable@vger.kernel.org # v4.14+
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3774d334
    • Paul Menzel's avatar
      nfsd: Fix overflow causing non-working mounts on 1 TB machines · 94f536c7
      Paul Menzel authored
      commit 3b2d4dcf upstream.
      
      Since commit 10a68cdf (nfsd: fix performance-limiting session
      calculation) (Linux 5.1-rc1 and 4.19.31), shares from NFS servers with
      1 TB of memory cannot be mounted anymore. The mount just hangs on the
      client.
      
      The gist of commit 10a68cdf is the change below.
      
          -avail = clamp_t(int, avail, slotsize, avail/3);
          +avail = clamp_t(int, avail, slotsize, total_avail/3);
      
      Here are the macros.
      
          #define min_t(type, x, y)       __careful_cmp((type)(x), (type)(y), <)
          #define clamp_t(type, val, lo, hi) min_t(type, max_t(type, val, lo), hi)
      
      `total_avail` is 8,434,659,328 on the 1 TB machine. `clamp_t()` casts
      the values to `int`, which for 32-bit integers can only hold values
      −2,147,483,648 (−2^31) through 2,147,483,647 (2^31 − 1).
      
      `avail` (in the function signature) is just 65536, so that no overflow
      was happening. Before the commit the assignment would result in 21845,
      and `num = 4`.
      
      When using `total_avail`, it is causing the assignment to be
      18446744072226137429 (printed as %lu), and `num` is then 4164608182.
      
      My next guess is, that `nfsd_drc_mem_used` is then exceeded, and the
      server thinks there is no memory available any more for this client.
      
      Updating the arguments of `clamp_t()` and `min_t()` to `unsigned long`
      fixes the issue.
      
      Now, `avail = 65536` (before commit 10a68cdf `avail = 21845`), but
      `num = 4` remains the same.
      
      Fixes: c54f24e3 (nfsd: fix performance-limiting session calculation)
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaul Menzel <pmenzel@molgen.mpg.de>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      94f536c7
    • Wanpeng Li's avatar
      KVM: LAPIC: Fix pending interrupt in IRR blocked by software disable LAPIC · 03dbbec3
      Wanpeng Li authored
      commit bb34e690 upstream.
      
      Thomas reported that:
      
       | Background:
       |
       |    In preparation of supporting IPI shorthands I changed the CPU offline
       |    code to software disable the local APIC instead of just masking it.
       |    That's done by clearing the APIC_SPIV_APIC_ENABLED bit in the APIC_SPIV
       |    register.
       |
       | Failure:
       |
       |    When the CPU comes back online the startup code triggers occasionally
       |    the warning in apic_pending_intr_clear(). That complains that the IRRs
       |    are not empty.
       |
       |    The offending vector is the local APIC timer vector who's IRR bit is set
       |    and stays set.
       |
       | It took me quite some time to reproduce the issue locally, but now I can
       | see what happens.
       |
       | It requires apicv_enabled=0, i.e. full apic emulation. With apicv_enabled=1
       | (and hardware support) it behaves correctly.
       |
       | Here is the series of events:
       |
       |     Guest CPU
       |
       |     goes down
       |
       |       native_cpu_disable()
       |
       | 			apic_soft_disable();
       |
       |     play_dead()
       |
       |     ....
       |
       |     startup()
       |
       |       if (apic_enabled())
       |         apic_pending_intr_clear()	<- Not taken
       |
       |      enable APIC
       |
       |         apic_pending_intr_clear()	<- Triggers warning because IRR is stale
       |
       | When this happens then the deadline timer or the regular APIC timer -
       | happens with both, has fired shortly before the APIC is disabled, but the
       | interrupt was not serviced because the guest CPU was in an interrupt
       | disabled region at that point.
       |
       | The state of the timer vector ISR/IRR bits:
       |
       |     	     	       	        ISR     IRR
       | before apic_soft_disable()    0	      1
       | after apic_soft_disable()     0	      1
       |
       | On startup		      		 0	      1
       |
       | Now one would assume that the IRR is cleared after the INIT reset, but this
       | happens only on CPU0.
       |
       | Why?
       |
       | Because our CPU0 hotplug is just for testing to make sure nothing breaks
       | and goes through an NMI wakeup vehicle because INIT would send it through
       | the boots-trap code which is not really working if that CPU was not
       | physically unplugged.
       |
       | Now looking at a real world APIC the situation in that case is:
       |
       |     	     	       	      	ISR     IRR
       | before apic_soft_disable()    0	      1
       | after apic_soft_disable()     0	      1
       |
       | On startup		      		 0	      0
       |
       | Why?
       |
       | Once the dying CPU reenables interrupts the pending interrupt gets
       | delivered as a spurious interupt and then the state is clear.
       |
       | While that CPU0 hotplug test case is surely an esoteric issue, the APIC
       | emulation is still wrong, Even if the play_dead() code would not enable
       | interrupts then the pending IRR bit would turn into an ISR .. interrupt
       | when the APIC is reenabled on startup.
      
      From SDM 10.4.7.2 Local APIC State After It Has Been Software Disabled
      * Pending interrupts in the IRR and ISR registers are held and require
        masking or handling by the CPU.
      
      In Thomas's testing, hardware cpu will not respect soft disable LAPIC
      when IRR has already been set or APICv posted-interrupt is in flight,
      so we can skip soft disable APIC checking when clearing IRR and set ISR,
      continue to respect soft disable APIC when attempting to set IRR.
      Reported-by: default avatarRong Chen <rong.a.chen@intel.com>
      Reported-by: default avatarFeng Tang <feng.tang@intel.com>
      Reported-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Tested-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Rong Chen <rong.a.chen@intel.com>
      Cc: Feng Tang <feng.tang@intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      03dbbec3
    • Paolo Bonzini's avatar
      KVM: x86: degrade WARN to pr_warn_ratelimited · d2bacadc
      Paolo Bonzini authored
      commit 3f16a5c3 upstream.
      
      This warning can be triggered easily by userspace, so it should certainly not
      cause a panic if panic_on_warn is set.
      
      Reported-by: syzbot+c03f30b4f4c46bdf8575@syzkaller.appspotmail.com
      Suggested-by: default avatarAlexander Potapenko <glider@google.com>
      Acked-by: default avatarAlexander Potapenko <glider@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d2bacadc
    • Martin Schwidefsky's avatar
      s390/mm: fix pxd_bad with folded page tables · 06157573
      Martin Schwidefsky authored
      [ Upstream commit c9f62152 ]
      
      With git commit d1874a0c
      "s390/mm: make the pxd_offset functions more robust" and a 2-level page
      table it can now happen that pgd_bad() gets asked to verify a large
      segment table entry. If the entry is marked as dirty pgd_bad() will
      incorrectly return true.
      
      Change the pgd_bad(), p4d_bad(), pud_bad() and pmd_bad() functions to
      first verify the table type, return false if the table level is lower
      than what the function is suppossed to check, return true if the table
      level is too high, and otherwise check the relevant region and segment
      table bits. pmd_bad() has to check against ~SEGMENT_ENTRY_BITS for
      normal page table pointers or ~SEGMENT_ENTRY_BITS_LARGE for large
      segment table entries. Same for pud_bad() which has to check against
      ~_REGION_ENTRY_BITS or ~_REGION_ENTRY_BITS_LARGE.
      
      Fixes: d1874a0c ("s390/mm: make the pxd_offset functions more robust")
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      06157573
    • Linus Torvalds's avatar
      tty: rocket: fix incorrect forward declaration of 'rp_init()' · 3e9e67d0
      Linus Torvalds authored
      [ Upstream commit 423ea325 ]
      
      Make the forward declaration actually match the real function
      definition, something that previous versions of gcc had just ignored.
      
      This is another patch to fix new warnings from gcc-9 before I start the
      merge window pulls.  I don't want to miss legitimate new warnings just
      because my system update brought a new compiler with new warnings.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3e9e67d0
    • Nikolay Borisov's avatar
      btrfs: Ensure replaced device doesn't have pending chunk allocation · 723e3866
      Nikolay Borisov authored
      commit debd1c06 upstream.
      
      Recent FITRIM work, namely bbbf7243 ("btrfs: combine device update
      operations during transaction commit") combined the way certain
      operations are recoded in a transaction. As a result an ASSERT was added
      in dev_replace_finish to ensure the new code works correctly.
      Unfortunately I got reports that it's possible to trigger the assert,
      meaning that during a device replace it's possible to have an unfinished
      chunk allocation on the source device.
      
      This is supposed to be prevented by the fact that a transaction is
      committed before finishing the replace oepration and alter acquiring the
      chunk mutex. This is not sufficient since by the time the transaction is
      committed and the chunk mutex acquired it's possible to allocate a chunk
      depending on the workload being executed on the replaced device. This
      bug has been present ever since device replace was introduced but there
      was never code which checks for it.
      
      The correct way to fix is to ensure that there is no pending device
      modification operation when the chunk mutex is acquire and if there is
      repeat transaction commit. Unfortunately it's not possible to just
      exclude the source device from btrfs_fs_devices::dev_alloc_list since
      this causes ENOSPC to be hit in transaction commit.
      
      Fixing that in another way would need to add special cases to handle the
      last writes and forbid new ones. The looped transaction fix is more
      obvious, and can be easily backported. The runtime of dev-replace is
      long so there's no noticeable delay caused by that.
      Reported-by: default avatarDavid Sterba <dsterba@suse.com>
      Fixes: 391cd9df ("Btrfs: fix unprotected alloc list insertion during the finishing procedure of replace")
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      723e3866
    • Shakeel Butt's avatar
      mm/vmscan.c: prevent useless kswapd loops · d4ad26ed
      Shakeel Butt authored
      commit dffcac2c upstream.
      
      In production we have noticed hard lockups on large machines running
      large jobs due to kswaps hoarding lru lock within isolate_lru_pages when
      sc->reclaim_idx is 0 which is a small zone.  The lru was couple hundred
      GiBs and the condition (page_zonenum(page) > sc->reclaim_idx) in
      isolate_lru_pages() was basically skipping GiBs of pages while holding
      the LRU spinlock with interrupt disabled.
      
      On further inspection, it seems like there are two issues:
      
      (1) If kswapd on the return from balance_pgdat() could not sleep (i.e.
          node is still unbalanced), the classzone_idx is unintentionally set
          to 0 and the whole reclaim cycle of kswapd will try to reclaim only
          the lowest and smallest zone while traversing the whole memory.
      
      (2) Fundamentally isolate_lru_pages() is really bad when the
          allocation has woken kswapd for a smaller zone on a very large machine
          running very large jobs.  It can hoard the LRU spinlock while skipping
          over 100s of GiBs of pages.
      
      This patch only fixes (1).  (2) needs a more fundamental solution.  To
      fix (1), in the kswapd context, if pgdat->kswapd_classzone_idx is
      invalid use the classzone_idx of the previous kswapd loop otherwise use
      the one the waker has requested.
      
      Link: http://lkml.kernel.org/r/20190701201847.251028-1-shakeelb@google.com
      Fixes: e716f2eb ("mm, vmscan: prevent kswapd sleeping prematurely due to mismatched classzone_idx")
      Signed-off-by: default avatarShakeel Butt <shakeelb@google.com>
      Reviewed-by: default avatarYang Shi <yang.shi@linux.alibaba.com>
      Acked-by: default avatarMel Gorman <mgorman@techsingularity.net>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: Roman Gushchin <guro@fb.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d4ad26ed
    • Petr Mladek's avatar
      ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code() · d0a2a510
      Petr Mladek authored
      commit d5b844a2 upstream.
      
      The commit 9f255b63 ("module: Fix livepatch/ftrace module text
      permissions race") causes a possible deadlock between register_kprobe()
      and ftrace_run_update_code() when ftrace is using stop_machine().
      
      The existing dependency chain (in reverse order) is:
      
      -> #1 (text_mutex){+.+.}:
             validate_chain.isra.21+0xb32/0xd70
             __lock_acquire+0x4b8/0x928
             lock_acquire+0x102/0x230
             __mutex_lock+0x88/0x908
             mutex_lock_nested+0x32/0x40
             register_kprobe+0x254/0x658
             init_kprobes+0x11a/0x168
             do_one_initcall+0x70/0x318
             kernel_init_freeable+0x456/0x508
             kernel_init+0x22/0x150
             ret_from_fork+0x30/0x34
             kernel_thread_starter+0x0/0xc
      
      -> #0 (cpu_hotplug_lock.rw_sem){++++}:
             check_prev_add+0x90c/0xde0
             validate_chain.isra.21+0xb32/0xd70
             __lock_acquire+0x4b8/0x928
             lock_acquire+0x102/0x230
             cpus_read_lock+0x62/0xd0
             stop_machine+0x2e/0x60
             arch_ftrace_update_code+0x2e/0x40
             ftrace_run_update_code+0x40/0xa0
             ftrace_startup+0xb2/0x168
             register_ftrace_function+0x64/0x88
             klp_patch_object+0x1a2/0x290
             klp_enable_patch+0x554/0x980
             do_one_initcall+0x70/0x318
             do_init_module+0x6e/0x250
             load_module+0x1782/0x1990
             __s390x_sys_finit_module+0xaa/0xf0
             system_call+0xd8/0x2d0
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(text_mutex);
                                     lock(cpu_hotplug_lock.rw_sem);
                                     lock(text_mutex);
        lock(cpu_hotplug_lock.rw_sem);
      
      It is similar problem that has been solved by the commit 2d1e38f5
      ("kprobes: Cure hotplug lock ordering issues"). Many locks are involved.
      To be on the safe side, text_mutex must become a low level lock taken
      after cpu_hotplug_lock.rw_sem.
      
      This can't be achieved easily with the current ftrace design.
      For example, arm calls set_all_modules_text_rw() already in
      ftrace_arch_code_modify_prepare(), see arch/arm/kernel/ftrace.c.
      This functions is called:
      
        + outside stop_machine() from ftrace_run_update_code()
        + without stop_machine() from ftrace_module_enable()
      
      Fortunately, the problematic fix is needed only on x86_64. It is
      the only architecture that calls set_all_modules_text_rw()
      in ftrace path and supports livepatching at the same time.
      
      Therefore it is enough to move text_mutex handling from the generic
      kernel/trace/ftrace.c into arch/x86/kernel/ftrace.c:
      
         ftrace_arch_code_modify_prepare()
         ftrace_arch_code_modify_post_process()
      
      This patch basically reverts the ftrace part of the problematic
      commit 9f255b63 ("module: Fix livepatch/ftrace module
      text permissions race"). And provides x86_64 specific-fix.
      
      Some refactoring of the ftrace code will be needed when livepatching
      is implemented for arm or nds32. These architectures call
      set_all_modules_text_rw() and use stop_machine() at the same time.
      
      Link: http://lkml.kernel.org/r/20190627081334.12793-1-pmladek@suse.com
      
      Fixes: 9f255b63 ("module: Fix livepatch/ftrace module text permissions race")
      Acked-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reported-by: default avatarMiroslav Benes <mbenes@suse.cz>
      Reviewed-by: default avatarMiroslav Benes <mbenes@suse.cz>
      Reviewed-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      [
        As reviewed by Miroslav Benes <mbenes@suse.cz>, removed return value of
        ftrace_run_update_code() as it is a void function.
      ]
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d0a2a510
    • Robert Beckett's avatar
      drm/imx: only send event on crtc disable if kept disabled · 3d2a8000
      Robert Beckett authored
      commit 5aeab2bf upstream.
      
      The event will be sent as part of the vblank enable during the modeset
      if the crtc is not being kept disabled.
      
      Fixes: 5f2f9115 ("drm/imx: atomic phase 3 step 1: Use atomic configuration")
      Signed-off-by: default avatarRobert Beckett <bob.beckett@collabora.com>
      Reviewed-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: default avatarPhilipp Zabel <p.zabel@pengutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3d2a8000
    • Robert Beckett's avatar
      drm/imx: notify drm core before sending event during crtc disable · c61d1981
      Robert Beckett authored
      commit 78c68e8f upstream.
      
      Notify drm core before sending pending events during crtc disable.
      This fixes the first event after disable having an old stale timestamp
      by having drm_crtc_vblank_off update the timestamp to now.
      
      This was seen while debugging weston log message:
      Warning: computed repaint delay is insane: -8212 msec
      
      This occurred due to:
      1. driver starts up
      2. fbcon comes along and restores fbdev, enabling vblank
      3. vblank_disable_fn fires via timer disabling vblank, keeping vblank
      seq number and time set at current value
      (some time later)
      4. weston starts and does a modeset
      5. atomic commit disables crtc while it does the modeset
      6. ipu_crtc_atomic_disable sends vblank with old seq number and time
      
      Fixes: a4744786 ("drm/imx: fix crtc vblank state regression")
      Signed-off-by: default avatarRobert Beckett <bob.beckett@collabora.com>
      Reviewed-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: default avatarPhilipp Zabel <p.zabel@pengutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c61d1981