1. 19 Nov, 2014 40 commits
    • Johan Hovold's avatar
      pcmcia: at91_cf: fix deferred probe from __init · ae04c825
      Johan Hovold authored
      commit 16a7c7cf upstream.
      
      Move probe out of __init section and don't use platform_driver_probe
      which cannot be used with deferred probing.
      
      Since commit e9354576 ("gpiolib: Defer failed gpio requests by default")
      this driver might return -EPROBE_DEFER if a gpio_request fails.
      
      Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
      Signed-off-by: default avatarJohan Hovold <jhovold@gmail.com>
      Acked-by: default avatarNicolas Ferre <nicolas.ferre@atmel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      ae04c825
    • Dmitry Eremin-Solenikov's avatar
      Input: wm97xx - adapt parameters to tosa touchscreen. · 05b598ab
      Dmitry Eremin-Solenikov authored
      commit 859abd1d upstream.
      
      Sharp SL-6000 (tosa) touchscreen needs wider limits to properly map all
      points on the screen. Expand ranges in abs_x and abs_y arrays according
      to the touchscreen area.
      Signed-off-by: default avatarDmitry Eremin-Solenikov <dbaryshkov@gmail.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      05b598ab
    • Tobias Klauser's avatar
      Input: altera_ps2 - write to correct register when disabling interrupts · 937d4c93
      Tobias Klauser authored
      commit d0269b84 upstream.
      
      In altera_ps2_close, the data register (offset 0) is written instead of
      the control register (offset 4), leading to the RX interrupt not being
      disabled. Fix this by calling writel() with the offset for the proper
      register.
      Signed-off-by: default avatarTobias Klauser <tklauser@distanz.ch>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      937d4c93
    • Dan Carpenter's avatar
      [media] usbvision-video: two use after frees · 214c97a1
      Dan Carpenter authored
      commit 470a9147 upstream.
      
      The lock has been freed in usbvision_release() so there is no need to
      call mutex_unlock() here.
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarHans Verkuil <hans.verkuil@cisco.com>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab@osg.samsung.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      214c97a1
    • Micky Ching's avatar
      drivers/memstick/host/rtsx_pci_ms.c: add cancel_work when remove driver · fed533bb
      Micky Ching authored
      commit b6226b45 upstream.
      
      Add cancel_work_sync() in rtsx_pci_ms_drv_remove() to cancel pending
      request work when removing the driver.
      Signed-off-by: default avatarMicky Ching <micky_ching@realsil.com.cn>
      Cc: Samuel Ortiz <sameo@linux.intel.com> says:
      Cc: Maxim Levitsky <maximlevitsky@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Alex Dubov <oakad@yahoo.com>
      Cc: Roger Tseng <rogerable@realtek.com>
      Cc: Wei WANG <wei_wang@realsil.com.cn>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      fed533bb
    • Daniel Borkmann's avatar
      net: sctp: fix skb_over_panic when receiving malformed ASCONF chunks · bbd951a2
      Daniel Borkmann authored
      commit 9de7922b upstream.
      
      Commit 6f4c618d ("SCTP : Add paramters validity check for
      ASCONF chunk") added basic verification of ASCONF chunks, however,
      it is still possible to remotely crash a server by sending a
      special crafted ASCONF chunk, even up to pre 2.6.12 kernels:
      
      skb_over_panic: text:ffffffffa01ea1c3 len:31056 put:30768
       head:ffff88011bd81800 data:ffff88011bd81800 tail:0x7950
       end:0x440 dev:<NULL>
       ------------[ cut here ]------------
      kernel BUG at net/core/skbuff.c:129!
      [...]
      Call Trace:
       <IRQ>
       [<ffffffff8144fb1c>] skb_put+0x5c/0x70
       [<ffffffffa01ea1c3>] sctp_addto_chunk+0x63/0xd0 [sctp]
       [<ffffffffa01eadaf>] sctp_process_asconf+0x1af/0x540 [sctp]
       [<ffffffff8152d025>] ? _read_unlock_bh+0x15/0x20
       [<ffffffffa01e0038>] sctp_sf_do_asconf+0x168/0x240 [sctp]
       [<ffffffffa01e3751>] sctp_do_sm+0x71/0x1210 [sctp]
       [<ffffffff8147645d>] ? fib_rules_lookup+0xad/0xf0
       [<ffffffffa01e6b22>] ? sctp_cmp_addr_exact+0x32/0x40 [sctp]
       [<ffffffffa01e8393>] sctp_assoc_bh_rcv+0xd3/0x180 [sctp]
       [<ffffffffa01ee986>] sctp_inq_push+0x56/0x80 [sctp]
       [<ffffffffa01fcc42>] sctp_rcv+0x982/0xa10 [sctp]
       [<ffffffffa01d5123>] ? ipt_local_in_hook+0x23/0x28 [iptable_filter]
       [<ffffffff8148bdc9>] ? nf_iterate+0x69/0xb0
       [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
       [<ffffffff8148bf86>] ? nf_hook_slow+0x76/0x120
       [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
       [<ffffffff81496ded>] ip_local_deliver_finish+0xdd/0x2d0
       [<ffffffff81497078>] ip_local_deliver+0x98/0xa0
       [<ffffffff8149653d>] ip_rcv_finish+0x12d/0x440
       [<ffffffff81496ac5>] ip_rcv+0x275/0x350
       [<ffffffff8145c88b>] __netif_receive_skb+0x4ab/0x750
       [<ffffffff81460588>] netif_receive_skb+0x58/0x60
      
      This can be triggered e.g., through a simple scripted nmap
      connection scan injecting the chunk after the handshake, for
      example, ...
      
        -------------- INIT[ASCONF; ASCONF_ACK] ------------->
        <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
        -------------------- COOKIE-ECHO -------------------->
        <-------------------- COOKIE-ACK ---------------------
        ------------------ ASCONF; UNKNOWN ------------------>
      
      ... where ASCONF chunk of length 280 contains 2 parameters ...
      
        1) Add IP address parameter (param length: 16)
        2) Add/del IP address parameter (param length: 255)
      
      ... followed by an UNKNOWN chunk of e.g. 4 bytes. Here, the
      Address Parameter in the ASCONF chunk is even missing, too.
      This is just an example and similarly-crafted ASCONF chunks
      could be used just as well.
      
      The ASCONF chunk passes through sctp_verify_asconf() as all
      parameters passed sanity checks, and after walking, we ended
      up successfully at the chunk end boundary, and thus may invoke
      sctp_process_asconf(). Parameter walking is done with
      WORD_ROUND() to take padding into account.
      
      In sctp_process_asconf()'s TLV processing, we may fail in
      sctp_process_asconf_param() e.g., due to removal of the IP
      address that is also the source address of the packet containing
      the ASCONF chunk, and thus we need to add all TLVs after the
      failure to our ASCONF response to remote via helper function
      sctp_add_asconf_response(), which basically invokes a
      sctp_addto_chunk() adding the error parameters to the given
      skb.
      
      When walking to the next parameter this time, we proceed
      with ...
      
        length = ntohs(asconf_param->param_hdr.length);
        asconf_param = (void *)asconf_param + length;
      
      ... instead of the WORD_ROUND()'ed length, thus resulting here
      in an off-by-one that leads to reading the follow-up garbage
      parameter length of 12336, and thus throwing an skb_over_panic
      for the reply when trying to sctp_addto_chunk() next time,
      which implicitly calls the skb_put() with that length.
      
      Fix it by using sctp_walk_params() [ which is also used in
      INIT parameter processing ] macro in the verification *and*
      in ASCONF processing: it will make sure we don't spill over,
      that we walk parameters WORD_ROUND()'ed. Moreover, we're being
      more defensive and guard against unknown parameter types and
      missized addresses.
      
      Joint work with Vlad Yasevich.
      
      Fixes: b896b82b ("[SCTP] ADDIP: Support for processing incoming ASCONF_ACK chunks.")
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      bbd951a2
    • Daniel Borkmann's avatar
      net: sctp: fix panic on duplicate ASCONF chunks · a723db0b
      Daniel Borkmann authored
      commit b69040d8 upstream.
      
      When receiving a e.g. semi-good formed connection scan in the
      form of ...
      
        -------------- INIT[ASCONF; ASCONF_ACK] ------------->
        <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
        -------------------- COOKIE-ECHO -------------------->
        <-------------------- COOKIE-ACK ---------------------
        ---------------- ASCONF_a; ASCONF_b ----------------->
      
      ... where ASCONF_a equals ASCONF_b chunk (at least both serials
      need to be equal), we panic an SCTP server!
      
      The problem is that good-formed ASCONF chunks that we reply with
      ASCONF_ACK chunks are cached per serial. Thus, when we receive a
      same ASCONF chunk twice (e.g. through a lost ASCONF_ACK), we do
      not need to process them again on the server side (that was the
      idea, also proposed in the RFC). Instead, we know it was cached
      and we just resend the cached chunk instead. So far, so good.
      
      Where things get nasty is in SCTP's side effect interpreter, that
      is, sctp_cmd_interpreter():
      
      While incoming ASCONF_a (chunk = event_arg) is being marked
      !end_of_packet and !singleton, and we have an association context,
      we do not flush the outqueue the first time after processing the
      ASCONF_ACK singleton chunk via SCTP_CMD_REPLY. Instead, we keep it
      queued up, although we set local_cork to 1. Commit 2e3216cd
      changed the precedence, so that as long as we get bundled, incoming
      chunks we try possible bundling on outgoing queue as well. Before
      this commit, we would just flush the output queue.
      
      Now, while ASCONF_a's ASCONF_ACK sits in the corked outq, we
      continue to process the same ASCONF_b chunk from the packet. As
      we have cached the previous ASCONF_ACK, we find it, grab it and
      do another SCTP_CMD_REPLY command on it. So, effectively, we rip
      the chunk->list pointers and requeue the same ASCONF_ACK chunk
      another time. Since we process ASCONF_b, it's correctly marked
      with end_of_packet and we enforce an uncork, and thus flush, thus
      crashing the kernel.
      
      Fix it by testing if the ASCONF_ACK is currently pending and if
      that is the case, do not requeue it. When flushing the output
      queue we may relink the chunk for preparing an outgoing packet,
      but eventually unlink it when it's copied into the skb right
      before transmission.
      
      Joint work with Vlad Yasevich.
      
      Fixes: 2e3216cd ("sctp: Follow security requirement of responding with 1 packet")
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a723db0b
    • Daniel Borkmann's avatar
      net: sctp: fix remote memory pressure from excessive queueing · e4768414
      Daniel Borkmann authored
      commit 26b87c78 upstream.
      
      This scenario is not limited to ASCONF, just taken as one
      example triggering the issue. When receiving ASCONF probes
      in the form of ...
      
        -------------- INIT[ASCONF; ASCONF_ACK] ------------->
        <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
        -------------------- COOKIE-ECHO -------------------->
        <-------------------- COOKIE-ACK ---------------------
        ---- ASCONF_a; [ASCONF_b; ...; ASCONF_n;] JUNK ------>
        [...]
        ---- ASCONF_m; [ASCONF_o; ...; ASCONF_z;] JUNK ------>
      
      ... where ASCONF_a, ASCONF_b, ..., ASCONF_z are good-formed
      ASCONFs and have increasing serial numbers, we process such
      ASCONF chunk(s) marked with !end_of_packet and !singleton,
      since we have not yet reached the SCTP packet end. SCTP does
      only do verification on a chunk by chunk basis, as an SCTP
      packet is nothing more than just a container of a stream of
      chunks which it eats up one by one.
      
      We could run into the case that we receive a packet with a
      malformed tail, above marked as trailing JUNK. All previous
      chunks are here goodformed, so the stack will eat up all
      previous chunks up to this point. In case JUNK does not fit
      into a chunk header and there are no more other chunks in
      the input queue, or in case JUNK contains a garbage chunk
      header, but the encoded chunk length would exceed the skb
      tail, or we came here from an entirely different scenario
      and the chunk has pdiscard=1 mark (without having had a flush
      point), it will happen, that we will excessively queue up
      the association's output queue (a correct final chunk may
      then turn it into a response flood when flushing the
      queue ;)): I ran a simple script with incremental ASCONF
      serial numbers and could see the server side consuming
      excessive amount of RAM [before/after: up to 2GB and more].
      
      The issue at heart is that the chunk train basically ends
      with !end_of_packet and !singleton markers and since commit
      2e3216cd ("sctp: Follow security requirement of responding
      with 1 packet") therefore preventing an output queue flush
      point in sctp_do_sm() -> sctp_cmd_interpreter() on the input
      chunk (chunk = event_arg) even though local_cork is set,
      but its precedence has changed since then. In the normal
      case, the last chunk with end_of_packet=1 would trigger the
      queue flush to accommodate possible outgoing bundling.
      
      In the input queue, sctp_inq_pop() seems to do the right thing
      in terms of discarding invalid chunks. So, above JUNK will
      not enter the state machine and instead be released and exit
      the sctp_assoc_bh_rcv() chunk processing loop. It's simply
      the flush point being missing at loop exit. Adding a try-flush
      approach on the output queue might not work as the underlying
      infrastructure might be long gone at this point due to the
      side-effect interpreter run.
      
      One possibility, albeit a bit of a kludge, would be to defer
      invalid chunk freeing into the state machine in order to
      possibly trigger packet discards and thus indirectly a queue
      flush on error. It would surely be better to discard chunks
      as in the current, perhaps better controlled environment, but
      going back and forth, it's simply architecturally not possible.
      I tried various trailing JUNK attack cases and it seems to
      look good now.
      
      Joint work with Vlad Yasevich.
      
      Fixes: 2e3216cd ("sctp: Follow security requirement of responding with 1 packet")
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e4768414
    • Nadav Amit's avatar
      KVM: x86: Don't report guest userspace emulation error to userspace · 7e1ebf02
      Nadav Amit authored
      commit a2b9e6c1 upstream.
      
      Commit fc3a9157 ("KVM: X86: Don't report L2 emulation failures to
      user-space") disabled the reporting of L2 (nested guest) emulation failures to
      userspace due to race-condition between a vmexit and the instruction emulator.
      The same rational applies also to userspace applications that are permitted by
      the guest OS to access MMIO area or perform PIO.
      
      This patch extends the current behavior - of injecting a #UD instead of
      reporting it to userspace - also for guest userspace code.
      Signed-off-by: default avatarNadav Amit <namit@cs.technion.ac.il>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      7e1ebf02
    • Pranith Kumar's avatar
      rcu: Use rcu_gp_kthread_wake() to wake up grace period kthreads · 4ffc1f8b
      Pranith Kumar authored
      commit 2aa792e6 upstream.
      
      The rcu_gp_kthread_wake() function checks for three conditions before
      waking up grace period kthreads:
      
      *  Is the thread we are trying to wake up the current thread?
      *  Are the gp_flags zero? (all threads wait on non-zero gp_flags condition)
      *  Is there no thread created for this flavour, hence nothing to wake up?
      
      If any one of these condition is true, we do not call wake_up().
      It was found that there are quite a few avoidable wake ups both during
      idle time and under stress induced by rcutorture.
      
      Idle:
      
      Total:66000, unnecessary:66000, case1:61827, case2:66000, case3:0
      Total:68000, unnecessary:68000, case1:63696, case2:68000, case3:0
      
      rcutorture:
      
      Total:254000, unnecessary:254000, case1:199913, case2:254000, case3:0
      Total:256000, unnecessary:256000, case1:201784, case2:256000, case3:0
      
      Here case{1-3} are the cases listed above. We can avoid these wake
      ups by using rcu_gp_kthread_wake() to conditionally wake up the grace
      period kthreads.
      
      There is a comment about an implied barrier supplied by the wake_up()
      logic.  This barrier is necessary for the awakened thread to see the
      updated ->gp_flags.  This flag is always being updated with the root node
      lock held. Also, the awakened thread tries to acquire the root node lock
      before reading ->gp_flags because of which there is proper ordering.
      
      Hence this commit tries to avoid calling wake_up() whenever we can by
      using rcu_gp_kthread_wake() function.
      Signed-off-by: default avatarPranith Kumar <bobby.prani@gmail.com>
      CC: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Kamal Mostafa <kamal@canonical.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      4ffc1f8b
    • Paul E. McKenney's avatar
      rcu: Make callers awaken grace-period kthread · 9b3128c4
      Paul E. McKenney authored
      commit 48a7639c upstream.
      
      The rcu_start_gp_advanced() function currently uses irq_work_queue()
      to defer wakeups of the RCU grace-period kthread.  This deferring
      is necessary to avoid RCU-scheduler deadlocks involving the rcu_node
      structure's lock, meaning that RCU cannot call any of the scheduler's
      wake-up functions while holding one of these locks.
      
      Unfortunately, the second and subsequent calls to irq_work_queue() are
      ignored, and the first call will be ignored (aside from queuing the work
      item) if the scheduler-clock tick is turned off.  This is OK for many
      uses, especially those where irq_work_queue() is called from an interrupt
      or softirq handler, because in those cases the scheduler-clock-tick state
      will be re-evaluated, which will turn the scheduler-clock tick back on.
      On the next tick, any deferred work will then be processed.
      
      However, this strategy does not always work for RCU, which can be invoked
      at process level from idle CPUs.  In this case, the tick might never
      be turned back on, indefinitely defering a grace-period start request.
      Note that the RCU CPU stall detector cannot see this condition, because
      there is no RCU grace period in progress.  Therefore, we can (and do!)
      see long tens-of-seconds stalls in grace-period handling.  In theory,
      we could see a full grace-period hang, but rcutorture testing to date
      has seen only the tens-of-seconds stalls.  Event tracing demonstrates
      that irq_work_queue() is being called repeatedly to no effect during
      these stalls: The "newreq" event appears repeatedly from a task that is
      not one of the grace-period kthreads.
      
      In theory, irq_work_queue() might be fixed to avoid this sort of issue,
      but RCU's requirements are unusual and it is quite straightforward to pass
      wake-up responsibility up through RCU's call chain, so that the wakeup
      happens when the offending locks are released.
      
      This commit therefore makes this change.  The rcu_start_gp_advanced(),
      rcu_start_future_gp(), rcu_accelerate_cbs(), rcu_advance_cbs(),
      __note_gp_changes(), and rcu_start_gp() functions now return a boolean
      which indicates when a wake-up is needed.  A new rcu_gp_kthread_wake()
      does the wakeup when it is necessary and safe to do so: No self-wakes,
      no wake-ups if the ->gp_flags field indicates there is no need (as in
      someone else did the wake-up before we got around to it), and no wake-ups
      before the grace-period kthread has been created.
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Reviewed-by: default avatarJosh Triplett <josh@joshtriplett.org>
      [ Pranith: backport to 3.13-stable: just rcu_gp_kthread_wake(),
        prereq for 2aa792e6 "rcu: Use rcu_gp_kthread_wake() to wake up grace
        period kthreads" ]
      Signed-off-by: default avatarPranith Kumar <bobby.prani@gmail.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      9b3128c4
    • Ben Dooks's avatar
      ARM: probes: fix instruction fetch order with <asm/opcodes.h> · 612d0786
      Ben Dooks authored
      commit 888be254 upstream.
      
      If we are running BE8, the data and instruction endianness do not
      match, so use <asm/opcodes.h> to correctly translate memory accesses
      into ARM instructions.
      Acked-by: default avatarJon Medhurst <tixy@linaro.org>
      Signed-off-by: default avatarBen Dooks <ben.dooks@codethink.co.uk>
      [taras.kondratiuk@linaro.org: fixed Thumb instruction fetch order]
      Signed-off-by: default avatarTaras Kondratiuk <taras.kondratiuk@linaro.org>
      [wangnan: backport to 3.10 and 3.14:
       - adjust context
       - backport all changes on arch/arm/kernel/probes.c to
         arch/arm/kernel/kprobes-common.c since we don't have
         commit c18377c3.
       - After the above adjustments, becomes same to Taras Kondratiuk's
         original patch:
           http://lists.linaro.org/pipermail/linaro-kernel/2014-January/010346.html
      ]
      Signed-off-by: default avatarWang Nan <wangnan0@huawei.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      612d0786
    • Pablo Neira's avatar
      netfilter: xt_bpf: add mising opaque struct sk_filter definition · afdb2106
      Pablo Neira authored
      commit e10038a8 upstream.
      
      This structure is not exposed to userspace, so fix this by defining
      struct sk_filter; so we skip the casting in kernelspace. This is safe
      since userspace has no way to lurk with that internal pointer.
      
      Fixes: e6f30c73 ("netfilter: x_tables: add xt_bpf match")
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      afdb2106
    • Houcheng Lin's avatar
      netfilter: nf_log: release skbuff on nlmsg put failure · 37bdac28
      Houcheng Lin authored
      commit b51d3fa3 upstream.
      
      The kernel should reserve enough room in the skb so that the DONE
      message can always be appended.  However, in case of e.g. new attribute
      erronously not being size-accounted for, __nfulnl_send() will still
      try to put next nlmsg into this full skbuf, causing the skb to be stuck
      forever and blocking delivery of further messages.
      
      Fix issue by releasing skb immediately after nlmsg_put error and
      WARN() so we can track down the cause of such size mismatch.
      
      [ fw@strlen.de: add tailroom/len info to WARN ]
      Signed-off-by: default avatarHoucheng Lin <houcheng@gmail.com>
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      37bdac28
    • Florian Westphal's avatar
      netfilter: nfnetlink_log: fix maximum packet length logged to userspace · 45ebf364
      Florian Westphal authored
      commit c1e7dc91 upstream.
      
      don't try to queue payloads > 0xffff - NLA_HDRLEN, it does not work.
      The nla length includes the size of the nla struct, so anything larger
      results in u16 integer overflow.
      
      This patch is similar to
      9cefbbc9 (netfilter: nfnetlink_queue: cleanup copy_range usage).
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      45ebf364
    • Florian Westphal's avatar
      netfilter: nf_log: account for size of NLMSG_DONE attribute · 13117284
      Florian Westphal authored
      commit 9dfa1dfe upstream.
      
      We currently neither account for the nlattr size, nor do we consider
      the size of the trailing NLMSG_DONE when allocating nlmsg skb.
      
      This can result in nflog to stop working, as __nfulnl_send() re-tries
      sending forever if it failed to append NLMSG_DONE (which will never
      work if buffer is not large enough).
      Reported-by: default avatarHoucheng Lin <houcheng@gmail.com>
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      13117284
    • Andrey Vagin's avatar
      ipc: always handle a new value of auto_msgmni · 04e7556e
      Andrey Vagin authored
      commit 1195d94e upstream.
      
      proc_dointvec_minmax() returns zero if a new value has been set.  So we
      don't need to check all charecters have been handled.
      
      Below you can find two examples.  In the new value has not been handled
      properly.
      
      $ strace ./a.out
      open("/proc/sys/kernel/auto_msgmni", O_WRONLY) = 3
      write(3, "0\n\0", 3)                    = 2
      close(3)                                = 0
      exit_group(0)
      $ cat /sys/kernel/debug/tracing/trace
      
      $strace ./a.out
      open("/proc/sys/kernel/auto_msgmni", O_WRONLY) = 3
      write(3, "0\n", 2)                      = 2
      close(3)                                = 0
      
      $ cat /sys/kernel/debug/tracing/trace
      a.out-697   [000] ....  3280.998235: unregister_ipcns_notifier <-proc_ipcauto_dointvec_minmax
      
      Fixes: 9eefe520 ("ipc: do not use a negative value to re-enable msgmni automatic recomputin")
      Signed-off-by: default avatarAndrey Vagin <avagin@openvz.org>
      Cc: Mathias Krause <minipli@googlemail.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      04e7556e
    • Bjorn Helgaas's avatar
      clocksource: Remove "weak" from clocksource_default_clock() declaration · 0f207b54
      Bjorn Helgaas authored
      commit 96a2adbc upstream.
      
      kernel/time/jiffies.c provides a default clocksource_default_clock()
      definition explicitly marked "weak".  arch/s390 provides its own definition
      intended to override the default, but the "weak" attribute on the
      declaration applied to the s390 definition as well, so the linker chose one
      based on link order (see 10629d71 ("PCI: Remove __weak annotation from
      pcibios_get_phb_of_node decl")).
      
      Remove the "weak" attribute from the clocksource_default_clock()
      declaration so we always prefer a non-weak definition over the weak one,
      independent of link order.
      
      Fixes: f1b82746 ("clocksource: Cleanup clocksource selection")
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Acked-by: default avatarJohn Stultz <john.stultz@linaro.org>
      Acked-by: default avatarIngo Molnar <mingo@kernel.org>
      CC: Daniel Lezcano <daniel.lezcano@linaro.org>
      CC: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      0f207b54
    • Bjorn Helgaas's avatar
      kgdb: Remove "weak" from kgdb_arch_pc() declaration · 33176fe6
      Bjorn Helgaas authored
      commit 107bcc6d upstream.
      
      kernel/debug/debug_core.c provides a default kgdb_arch_pc() definition
      explicitly marked "weak".  Several architectures provide their own
      definitions intended to override the default, but the "weak" attribute on
      the declaration applied to the arch definitions as well, so the linker
      chose one based on link order (see 10629d71 ("PCI: Remove __weak
      annotation from pcibios_get_phb_of_node decl")).
      
      Remove the "weak" attribute from the declaration so we always prefer a
      non-weak definition over the weak one, independent of link order.
      
      Fixes: 688b744d ("kgdb: fix signedness mixmatches, add statics, add declaration to header")
      Tested-by: Vineet Gupta <vgupta@synopsys.com>	# for ARC build
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: default avatarHarvey Harrison <harvey.harrison@gmail.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      33176fe6
    • Bjorn Helgaas's avatar
      vmcore: Remove "weak" from function declarations · c9813eda
      Bjorn Helgaas authored
      commit 5ab03ac5 upstream.
      
      For the following functions:
      
        elfcorehdr_alloc()
        elfcorehdr_free()
        elfcorehdr_read()
        elfcorehdr_read_notes()
        remap_oldmem_pfn_range()
      
      fs/proc/vmcore.c provides default definitions explicitly marked "weak".
      arch/s390 provides its own definitions intended to override the default
      ones, but the "weak" attribute on the declarations applied to the s390
      definitions as well, so the linker chose one based on link order (see
      10629d71 ("PCI: Remove __weak annotation from pcibios_get_phb_of_node
      decl")).
      
      Remove the "weak" attribute from the declarations so we always prefer a
      non-weak definition over the weak one, independent of link order.
      
      Fixes: be8a8d06 ("vmcore: introduce ELF header in new memory feature")
      Fixes: 9cb21813 ("vmcore: introduce remap_oldmem_pfn_range()")
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Acked-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Acked-by: default avatarVivek Goyal <vgoyal@redhat.com>
      CC: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      c9813eda
    • Trond Myklebust's avatar
      NFSv4.1: nfs41_clear_delegation_stateid shouldn't trust NFS_DELEGATED_STATE · eac34ad1
      Trond Myklebust authored
      commit 0c116cad upstream.
      
      This patch removes the assumption made previously, that we only need to
      check the delegation stateid when it matches the stateid on a cached
      open.
      
      If we believe that we hold a delegation for this file, then we must assume
      that its stateid may have been revoked or expired too. If we don't test it
      then our state recovery process may end up caching open/lock state in a
      situation where it should not.
      We therefore rename the function nfs41_clear_delegation_stateid as
      nfs41_check_delegation_stateid, and change it to always run through the
      delegation stateid test and recovery process as outlined in RFC5661.
      
      http://lkml.kernel.org/r/CAN-5tyHwG=Cn2Q9KsHWadewjpTTy_K26ee+UnSvHvG4192p-Xw@mail.gmail.comSigned-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      eac34ad1
    • Trond Myklebust's avatar
      NFSv4: Fix races between nfs_remove_bad_delegation() and delegation return · 1f093632
      Trond Myklebust authored
      commit 869f9dfa upstream.
      
      Any attempt to call nfs_remove_bad_delegation() while a delegation is being
      returned is currently a no-op. This means that we can end up looping
      forever in nfs_end_delegation_return() if something causes the delegation
      to be revoked.
      This patch adds a mechanism whereby the state recovery code can communicate
      to the delegation return code that the delegation is no longer valid and
      that it should not be used when reclaiming state.
      It also changes the return value for nfs4_handle_delegation_recall_error()
      to ensure that nfs_end_delegation_return() does not reattempt the lock
      reclaim before state recovery is done.
      
      http://lkml.kernel.org/r/CAN-5tyHwG=Cn2Q9KsHWadewjpTTy_K26ee+UnSvHvG4192p-Xw@mail.gmail.comSigned-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      1f093632
    • Jan Kara's avatar
      nfs: Fix use of uninitialized variable in nfs_getattr() · 77683fa6
      Jan Kara authored
      commit 16caf5b6 upstream.
      
      Variable 'err' needn't be initialized when nfs_getattr() uses it to
      check whether it should call generic_fillattr() or not. That can result
      in spurious error returns. Initialize 'err' properly.
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      77683fa6
    • Trond Myklebust's avatar
      NFS: Don't try to reclaim delegation open state if recovery failed · a78d2dde
      Trond Myklebust authored
      commit f8ebf7a8 upstream.
      
      If state recovery failed, then we should not attempt to reclaim delegated
      state.
      
      http://lkml.kernel.org/r/CAN-5tyHwG=Cn2Q9KsHWadewjpTTy_K26ee+UnSvHvG4192p-Xw@mail.gmail.comSigned-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a78d2dde
    • Trond Myklebust's avatar
      NFSv4: Ensure that we remove NFSv4.0 delegations when state has expired · ae87d577
      Trond Myklebust authored
      commit 4dfd4f7a upstream.
      
      NFSv4.0 does not have TEST_STATEID/FREE_STATEID functionality, so
      unlike NFSv4.1, the recovery procedure when stateids have expired or
      have been revoked requires us to just forget the delegation.
      
      http://lkml.kernel.org/r/CAN-5tyHwG=Cn2Q9KsHWadewjpTTy_K26ee+UnSvHvG4192p-Xw@mail.gmail.comSigned-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      ae87d577
    • Pali Rohár's avatar
      Input: alps - ignore bad data on Dell Latitudes E6440 and E7440 · 4d735897
      Pali Rohár authored
      commit a7ef82ae upstream.
      
      Sometimes on Dell Latitude laptops psmouse/alps driver receive invalid ALPS
      protocol V3 packets with bit7 set in last byte. More often it can be
      reproduced on Dell Latitude E6440 or E7440 with closed lid and pushing
      cover above touchpad.
      
      If bit7 in last packet byte is set then it is not valid ALPS packet. I was
      told that ALPS devices never send these packets. It is not know yet who
      send those packets, it could be Dell EC, bug in BIOS and also bug in
      touchpad firmware...
      
      With this patch alps driver does not process those invalid packets, but
      instead of reporting PSMOUSE_BAD_DATA, getting into out of sync state,
      getting back in sync with the next byte and spam dmesg we return
      PSMOUSE_FULL_PACKET. If driver is truly out of sync we'll fail the checks
      on the next byte and report PSMOUSE_BAD_DATA then.
      Signed-off-by: default avatarPali Rohár <pali.rohar@gmail.com>
      Tested-by: default avatarPali Rohár <pali.rohar@gmail.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      4d735897
    • Pali Rohár's avatar
      Input: alps - allow up to 2 invalid packets without resetting device · 335d898e
      Pali Rohár authored
      commit 9d720b34 upstream.
      
      On some Dell Latitude laptops ALPS device or Dell EC send one invalid byte
      in 6 bytes ALPS packet. In this case psmouse driver enter out of sync
      state. It looks like that all other bytes in packets are valid and also
      device working properly. So there is no need to do full device reset, just
      need to wait for byte which match condition for first byte (start of
      packet). Because ALPS packets are bigger (6 or 8 bytes) default limit is
      small.
      
      This patch increase number of invalid bytes to size of 2 ALPS packets which
      psmouse driver can drop before do full reset.
      
      Resetting ALPS devices take some time and when doing reset on some Dell
      laptops touchpad, trackstick and also keyboard do not respond. So it is
      better to do it only if really necessary.
      Signed-off-by: default avatarPali Rohár <pali.rohar@gmail.com>
      Tested-by: default avatarPali Rohár <pali.rohar@gmail.com>
      Reviewed-by: default avatarHans de Goede <hdegoede@redhat.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      335d898e
    • Pali Rohár's avatar
      Input: alps - ignore potential bare packets when device is out of sync · 92e2e274
      Pali Rohár authored
      commit 4ab8f7f3 upstream.
      
      5th and 6th byte of ALPS trackstick V3 protocol match condition for first
      byte of PS/2 3 bytes packet. When driver enters out of sync state and ALPS
      trackstick is sending data then driver match 5th, 6th and next 1st bytes as
      PS/2.
      
      It basically means if user is using trackstick when driver is in out of
      sync state driver will never resync. Processing these bytes as 3 bytes PS/2
      data cause total mess (random cursor movements, random clicks) and make
      trackstick unusable until psmouse driver decide to do full device reset.
      
      Lot of users reported problems with ALPS devices on Dell Latitude E6440,
      E6540 and E7440 laptops. ALPS device or Dell EC for unknown reason send
      some invalid ALPS PS/2 bytes which cause driver out of sync. It looks like
      that i8042 and psmouse/alps driver always receive group of 6 bytes packets
      so there are no missing bytes and no bytes were inserted between valid
      ones.
      
      This patch does not fix root of problem with ALPS devices found in Dell
      Latitude laptops but it does not allow to process some (invalid)
      subsequence of 6 bytes ALPS packets as 3 bytes PS/2 when driver is out of
      sync.
      
      So with this patch trackstick input device does not report bogus data when
      also driver is out of sync, so trackstick should be usable on those
      machines.
      Signed-off-by: default avatarPali Rohár <pali.rohar@gmail.com>
      Tested-by: default avatarPali Rohár <pali.rohar@gmail.com>
      Reviewed-by: default avatarHans de Goede <hdegoede@redhat.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      92e2e274
    • Heinz Mauelshagen's avatar
      dm raid: ensure superblock's size matches device's logical block size · e90b799e
      Heinz Mauelshagen authored
      commit 40d43c4b upstream.
      
      The dm-raid superblock (struct dm_raid_superblock) is padded to 512
      bytes and that size is being used to read it in from the metadata
      device into one preallocated page.
      
      Reading or writing this on a 512-byte sector device works fine but on
      a 4096-byte sector device this fails.
      
      Set the dm-raid superblock's size to the logical block size of the
      metadata device, because IO at that size is guaranteed too work.  Also
      add a size check to avoid silent partial metadata loss in case the
      superblock should ever grow past the logical block size or PAGE_SIZE.
      
      [includes pointer math fix from Dan Carpenter]
      Reported-by: default avatar"Liuhua Wang" <lwang@suse.com>
      Signed-off-by: default avatarHeinz Mauelshagen <heinzm@redhat.com>
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e90b799e
    • Joe Thornber's avatar
      dm btree: fix a recursion depth bug in btree walking code · d0960e21
      Joe Thornber authored
      commit 9b460d36 upstream.
      
      The walk code was using a 'ro_spine' to hold it's locked btree nodes.
      But this data structure is designed for the rolling lock scheme, and
      as such automatically unlocks blocks that are two steps up the call
      chain.  This is not suitable for the simple recursive walk algorithm,
      which retraces its steps.
      
      This code is only used by the persistent array code, which in turn is
      only used by dm-cache.  In order to trigger it you need to have a
      mapping tree that is more than 2 levels deep; which equates to 8-16
      million cache blocks.  For instance a 4T ssd with a very small block
      size of 32k only just triggers this bug.
      
      The fix just places the locked blocks on the stack, and stops using
      the ro_spine altogether.
      Signed-off-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      d0960e21
    • Mikulas Patocka's avatar
      dm bufio: change __GFP_IO to __GFP_FS in shrinker callbacks · d1bb12f6
      Mikulas Patocka authored
      commit 9d28eb12 upstream.
      
      The shrinker uses gfp flags to indicate what kind of operation can the
      driver wait for. If __GFP_IO flag is present, the driver can wait for
      block I/O operations, if __GFP_FS flag is present, the driver can wait on
      operations involving the filesystem.
      
      dm-bufio tested for __GFP_IO. However, dm-bufio can run on a loop block
      device that makes calls into the filesystem. If __GFP_IO is present and
      __GFP_FS isn't, dm-bufio could still block on filesystem operations if it
      runs on a loop block device.
      
      The change from __GFP_IO to __GFP_FS supposedly fixes one observed (though
      unreproducible) deadlock involving dm-bufio and loop device.
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      d1bb12f6
    • Jan Kara's avatar
      block: Fix computation of merged request priority · a39c8649
      Jan Kara authored
      commit ece9c72a upstream.
      
      Priority of a merged request is computed by ioprio_best(). If one of the
      requests has undefined priority (IOPRIO_CLASS_NONE) and another request
      has priority from IOPRIO_CLASS_BE, the function will return the
      undefined priority which is wrong. Fix the function to properly return
      priority of a request with the defined priority.
      
      Fixes: d58cdfb8Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Reviewed-by: default avatarJeff Moyer <jmoyer@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a39c8649
    • Helge Deller's avatar
      parisc: Use compat layer for msgctl, shmat, shmctl and semtimedop syscalls · e29a7bc4
      Helge Deller authored
      commit 2fe749f5 upstream.
      
      Switch over the msgctl, shmat, shmctl and semtimedop syscalls to use the compat
      layer. The problem was found with the debian procenv package, which called
      	shmctl(0, SHM_INFO, &info);
      in which the shmctl syscall then overwrote parts of the surrounding areas on
      the stack on which the info variable was stored and thus lead to a segfault
      later on.
      
      Additionally fix the definition of struct shminfo64 to use unsigned longs like
      the other architectures. This has no impact on userspace since we only have a
      32bit userspace up to now.
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Cc: John David Anglin <dave.anglin@bell.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e29a7bc4
    • Christoph Hellwig's avatar
      scsi: only re-lock door after EH on devices that were reset · 7a6e7a36
      Christoph Hellwig authored
      commit 48379270 upstream.
      
      Setups that use the blk-mq I/O path can lock up if a host with a single
      device that has its door locked enters EH.  Make sure to only send the
      command to re-lock the door to devices that actually were reset and thus
      might have lost their state.  Otherwise the EH code might be get blocked
      on blk_get_request as all requests for non-reset devices might be in use.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reported-by: default avatarMeelis Roos <meelis.roos@ut.ee>
      Tested-by: default avatarMeelis Roos <meelis.roos@ut.ee>
      Reviewed-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      7a6e7a36
    • Peng Tao's avatar
      nfs: fix pnfs direct write memory leak · d0c8187c
      Peng Tao authored
      commit 8c393f9a upstream.
      
      For pNFS direct writes, layout driver may dynamically allocate ds_cinfo.buckets.
      So we need to take care to free them when freeing dreq.
      
      Ideally this needs to be done inside layout driver where ds_cinfo.buckets
      are allocated. But buckets are attached to dreq and reused across LD IO iterations.
      So I feel it's OK to free them in the generic layer.
      Signed-off-by: default avatarPeng Tao <tao.peng@primarydata.com>
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      d0c8187c
    • Stefan Richter's avatar
      firewire: cdev: prevent kernel stack leaking into ioctl arguments · 9fb5f68e
      Stefan Richter authored
      commit eaca2d8e upstream.
      
      Found by the UC-KLEE tool:  A user could supply less input to
      firewire-cdev ioctls than write- or write/read-type ioctl handlers
      expect.  The handlers used data from uninitialized kernel stack then.
      
      This could partially leak back to the user if the kernel subsequently
      generated fw_cdev_event_'s (to be read from the firewire-cdev fd)
      which notably would contain the _u64 closure field which many of the
      ioctl argument structures contain.
      
      The fact that the handlers would act on random garbage input is a
      lesser issue since all handlers must check their input anyway.
      
      The fix simply always null-initializes the entire ioctl argument buffer
      regardless of the actual length of expected user input.  That is, a
      runtime overhead of memset(..., 40) is added to each firewirew-cdev
      ioctl() call.  [Comment from Clemens Ladisch:  This part of the stack is
      most likely to be already in the cache.]
      
      Remarks:
        - There was never any leak from kernel stack to the ioctl output
          buffer itself.  IOW, it was not possible to read kernel stack by a
          read-type or write/read-type ioctl alone; the leak could at most
          happen in combination with read()ing subsequent event data.
        - The actual expected minimum user input of each ioctl from
          include/uapi/linux/firewire-cdev.h is, in bytes:
          [0x00] = 32, [0x05] =  4, [0x0a] = 16, [0x0f] = 20, [0x14] = 16,
          [0x01] = 36, [0x06] = 20, [0x0b] =  4, [0x10] = 20, [0x15] = 20,
          [0x02] = 20, [0x07] =  4, [0x0c] =  0, [0x11] =  0, [0x16] =  8,
          [0x03] =  4, [0x08] = 24, [0x0d] = 20, [0x12] = 36, [0x17] = 12,
          [0x04] = 20, [0x09] = 24, [0x0e] =  4, [0x13] = 40, [0x18] =  4.
      Reported-by: default avatarDavid Ramos <daramos@stanford.edu>
      Signed-off-by: default avatarStefan Richter <stefanr@s5r6.in-berlin.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      9fb5f68e
    • Kyle McMartin's avatar
      arm64: __clear_user: handle exceptions on strb · beb762ba
      Kyle McMartin authored
      commit 97fc1543 upstream.
      
      ARM64 currently doesn't fix up faults on the single-byte (strb) case of
      __clear_user... which means that we can cause a nasty kernel panic as an
      ordinary user with any multiple PAGE_SIZE+1 read from /dev/zero.
      i.e.: dd if=/dev/zero of=foo ibs=1 count=1 (or ibs=65537, etc.)
      
      This is a pretty obscure bug in the general case since we'll only
      __do_kernel_fault (since there's no extable entry for pc) if the
      mmap_sem is contended. However, with CONFIG_DEBUG_VM enabled, we'll
      always fault.
      
      if (!down_read_trylock(&mm->mmap_sem)) {
      	if (!user_mode(regs) && !search_exception_tables(regs->pc))
      		goto no_context;
      retry:
      	down_read(&mm->mmap_sem);
      } else {
      	/*
      	 * The above down_read_trylock() might have succeeded in
      	 * which
      	 * case, we'll have missed the might_sleep() from
      	 * down_read().
      	 */
      	might_sleep();
      	if (!user_mode(regs) && !search_exception_tables(regs->pc))
      		goto no_context;
      }
      
      Fix that by adding an extable entry for the strb instruction, since it
      touches user memory, similar to the other stores in __clear_user.
      Signed-off-by: default avatarKyle McMartin <kyle@redhat.com>
      Reported-by: default avatarMiloš Prchlík <mprchlik@redhat.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      beb762ba
    • Joe Thornber's avatar
      dm thin: grab a virtual cell before looking up the mapping · 2c49ae54
      Joe Thornber authored
      commit c822ed96 upstream.
      
      Avoids normal IO racing with discard.
      Signed-off-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      2c49ae54
    • Will Deacon's avatar
      ARM: 8191/1: decompressor: ensure I-side picks up relocated code · f0a72a0e
      Will Deacon authored
      commit 238962ac upstream.
      
      To speed up decompression, the decompressor sets up a flat, cacheable
      mapping of memory. However, when there is insufficient space to hold
      the page tables for this mapping, we don't bother to enable the caches
      and subsequently skip all the cache maintenance hooks.
      
      Skipping the cache maintenance before jumping to the relocated code
      allows the processor to predict the branch and populate the I-cache
      with stale data before the relocation loop has completed (since a
      bootloader may have SCTLR.I set, which permits normal, cacheable
      instruction fetches regardless of SCTLR.M).
      
      This patch moves the cache maintenance check into the maintenance
      routines themselves, allowing the v6/v7 versions to invalidate the
      I-cache regardless of the MMU state.
      Reported-by: default avatarMarc Carino <marc.ceeeee@gmail.com>
      Tested-by: default avatarJulien Grall <julien.grall@linaro.org>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      f0a72a0e
    • Nathan Lynch's avatar
      ARM: 8198/1: make kuser helpers depend on MMU · d8149279
      Nathan Lynch authored
      commit 08b964ff upstream.
      
      The kuser helpers page is not set up on non-MMU systems, so it does
      not make sense to allow CONFIG_KUSER_HELPERS to be enabled when
      CONFIG_MMU=n.  Allowing it to be set on !MMU results in an oops in
      set_tls (used in execve and the arm_syscall trap handler):
      
      Unhandled exception: IPSR = 00000005 LR = fffffff1
      CPU: 0 PID: 1 Comm: swapper Not tainted 3.18.0-rc1-00041-ga30465a #216
      task: 8b838000 ti: 8b82a000 task.ti: 8b82a000
      PC is at flush_thread+0x32/0x40
      LR is at flush_thread+0x21/0x40
      pc : [<8f00157a>]    lr : [<8f001569>]    psr: 4100000b
      sp : 8b82be20  ip : 00000000  fp : 8b83c000
      r10: 00000001  r9 : 88018c84  r8 : 8bb85000
      r7 : 8b838000  r6 : 00000000  r5 : 8bb77400  r4 : 8b82a000
      r3 : ffff0ff0  r2 : 8b82a000  r1 : 00000000  r0 : 88020354
      xPSR: 4100000b
      CPU: 0 PID: 1 Comm: swapper Not tainted 3.18.0-rc1-00041-ga30465a #216
      [<8f002bc1>] (unwind_backtrace) from [<8f002033>] (show_stack+0xb/0xc)
      [<8f002033>] (show_stack) from [<8f00265b>] (__invalid_entry+0x4b/0x4c)
      
      As best I can tell this issue existed for the set_tls ARM syscall
      before commit fbfb872f "ARM: 8148/1: flush TLS and thumbee
      register state during exec" consolidated the TLS manipulation code
      into the set_tls helper function, but now that we're using it to flush
      register state during execve, !MMU users encounter the oops at the
      first exec.
      
      Prevent CONFIG_MMU=n configurations from enabling
      CONFIG_KUSER_HELPERS.
      
      Fixes: fbfb872f (ARM: 8148/1: flush TLS and thumbee register state during exec)
      Signed-off-by: default avatarNathan Lynch <nathan_lynch@mentor.com>
      Reported-by: default avatarStefan Agner <stefan@agner.ch>
      Acked-by: default avatarUwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      d8149279