1. 27 Jun, 2017 1 commit
  2. 22 Jun, 2017 2 commits
  3. 21 Jun, 2017 1 commit
    • Hugh Dickins's avatar
      mm: larger stack guard gap, between vmas · 1ad9a25d
      Hugh Dickins authored
      commit 1be7107f upstream.
      
      Stack guard page is a useful feature to reduce a risk of stack smashing
      into a different mapping. We have been using a single page gap which
      is sufficient to prevent having stack adjacent to a different mapping.
      But this seems to be insufficient in the light of the stack usage in
      userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
      used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
      which is 256kB or stack strings with MAX_ARG_STRLEN.
      
      This will become especially dangerous for suid binaries and the default
      no limit for the stack size limit because those applications can be
      tricked to consume a large portion of the stack and a single glibc call
      could jump over the guard page. These attacks are not theoretical,
      unfortunatelly.
      
      Make those attacks less probable by increasing the stack guard gap
      to 1MB (on systems with 4k pages; but make it depend on the page size
      because systems with larger base pages might cap stack allocations in
      the PAGE_SIZE units) which should cover larger alloca() and VLA stack
      allocations. It is obviously not a full fix because the problem is
      somehow inherent, but it should reduce attack space a lot.
      
      One could argue that the gap size should be configurable from userspace,
      but that can be done later when somebody finds that the new 1MB is wrong
      for some special case applications.  For now, add a kernel command line
      option (stack_guard_gap) to specify the stack gap size (in page units).
      
      Implementation wise, first delete all the old code for stack guard page:
      because although we could get away with accounting one extra page in a
      stack vma, accounting a larger gap can break userspace - case in point,
      a program run with "ulimit -S -v 20000" failed when the 1MB gap was
      counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
      and strict non-overcommit mode.
      
      Instead of keeping gap inside the stack vma, maintain the stack guard
      gap as a gap between vmas: using vm_start_gap() in place of vm_start
      (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
      places which need to respect the gap - mainly arch_get_unmapped_area(),
      and and the vma tree's subtree_gap support for that.
      Original-patch-by: default avatarOleg Nesterov <oleg@redhat.com>
      Original-patch-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      [wt: backport to 4.11: adjust context]
      [wt: backport to 4.9: adjust context ; kernel doc was not in admin-guide]
      [wt: backport to 4.4: adjust context ; drop ppc hugetlb_radix changes]
      [wt: backport to 3.18: adjust context ; no FOLL_POPULATE ;
           s390 uses generic arch_get_unmapped_area()]
      [wt: backport to 3.16: adjust context]
      [wt: backport to 3.10: adjust context ; code logic in PARISC's
           arch_get_unmapped_area() wasn't found ; code inserted into
           expand_upwards() and expand_downwards() runs under anon_vma lock;
           changes for gup.c:faultin_page go to memory.c:__get_user_pages();
           included Hugh Dickins' fixes]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      1ad9a25d
  4. 20 Jun, 2017 36 commits
    • Hector Marco-Gisbert's avatar
      x86/mm/32: Enable full randomization on i386 and X86_32 · a5eec86e
      Hector Marco-Gisbert authored
      commit 8b8addf8 upstream.
      
      Currently on i386 and on X86_64 when emulating X86_32 in legacy mode, only
      the stack and the executable are randomized but not other mmapped files
      (libraries, vDSO, etc.). This patch enables randomization for the
      libraries, vDSO and mmap requests on i386 and in X86_32 in legacy mode.
      
      By default on i386 there are 8 bits for the randomization of the libraries,
      vDSO and mmaps which only uses 1MB of VA.
      
      This patch preserves the original randomness, using 1MB of VA out of 3GB or
      4GB. We think that 1MB out of 3GB is not a big cost for having the ASLR.
      
      The first obvious security benefit is that all objects are randomized (not
      only the stack and the executable) in legacy mode which highly increases
      the ASLR effectiveness, otherwise the attackers may use these
      non-randomized areas. But also sensitive setuid/setgid applications are
      more secure because currently, attackers can disable the randomization of
      these applications by setting the ulimit stack to "unlimited". This is a
      very old and widely known trick to disable the ASLR in i386 which has been
      allowed for too long.
      
      Another trick used to disable the ASLR was to set the ADDR_NO_RANDOMIZE
      personality flag, but fortunately this doesn't work on setuid/setgid
      applications because there is security checks which clear Security-relevant
      flags.
      
      This patch always randomizes the mmap_legacy_base address, removing the
      possibility to disable the ASLR by setting the stack to "unlimited".
      Signed-off-by: default avatarHector Marco-Gisbert <hecmargi@upv.es>
      Acked-by: default avatarIsmael Ripoll Ripoll <iripoll@upv.es>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Acked-by: default avatarArjan van de Ven <arjan@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: akpm@linux-foundation.org
      Cc: kees Cook <keescook@chromium.org>
      Link: http://lkml.kernel.org/r/1457639460-5242-1-git-send-email-hecmargi@upv.esSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      a5eec86e
    • Kees Cook's avatar
      x86: standardize mmap_rnd() usage · 2d49a81a
      Kees Cook authored
      commit 82168140 upstream.
      
      In preparation for splitting out ET_DYN ASLR, this refactors the use of
      mmap_rnd() to be used similarly to arm, and extracts the checking of
      PF_RANDOMIZE.
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Reviewed-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      2d49a81a
    • Jamie Bainbridge's avatar
      ipv6: check raw payload size correctly in ioctl · 36805c51
      Jamie Bainbridge authored
      commit 105f5528 upstream.
      
      In situations where an skb is paged, the transport header pointer and
      tail pointer can be the same because the skb contents are in frags.
      
      This results in ioctl(SIOCINQ/FIONREAD) incorrectly returning a
      length of 0 when the length to receive is actually greater than zero.
      
      skb->len is already correctly set in ip6_input_finish() with
      pskb_pull(), so use skb->len as it always returns the correct result
      for both linear and paged data.
      Signed-off-by: default avatarJamie Bainbridge <jbainbri@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      36805c51
    • Sergey Senozhatsky's avatar
      printk: use rcuidle console tracepoint · d46354fc
      Sergey Senozhatsky authored
      commit fc98c3c8 upstream.
      
      Use rcuidle console tracepoint because, apparently, it may be issued
      from an idle CPU:
      
        hw-breakpoint: Failed to enable monitor mode on CPU 0.
        hw-breakpoint: CPU 0 failed to disable vector catch
      
        ===============================
        [ ERR: suspicious RCU usage.  ]
        4.10.0-rc8-next-20170215+ #119 Not tainted
        -------------------------------
        ./include/trace/events/printk.h:32 suspicious rcu_dereference_check() usage!
      
        other info that might help us debug this:
      
        RCU used illegally from idle CPU!
        rcu_scheduler_active = 2, debug_locks = 0
        RCU used illegally from extended quiescent state!
        2 locks held by swapper/0/0:
         #0:  (cpu_pm_notifier_lock){......}, at: [<c0237e2c>] cpu_pm_exit+0x10/0x54
         #1:  (console_lock){+.+.+.}, at: [<c01ab350>] vprintk_emit+0x264/0x474
      
        stack backtrace:
        CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc8-next-20170215+ #119
        Hardware name: Generic OMAP4 (Flattened Device Tree)
          console_unlock
          vprintk_emit
          vprintk_default
          printk
          reset_ctrl_regs
          dbg_cpu_pm_notify
          notifier_call_chain
          cpu_pm_exit
          omap_enter_idle_coupled
          cpuidle_enter_state
          cpuidle_enter_state_coupled
          do_idle
          cpu_startup_entry
          start_kernel
      
      This RCU warning, however, is suppressed by lockdep_off() in printk().
      lockdep_off() increments the ->lockdep_recursion counter and thus
      disables RCU_LOCKDEP_WARN() and debug_lockdep_rcu_enabled(), which want
      lockdep to be enabled "current->lockdep_recursion == 0".
      
      Link: http://lkml.kernel.org/r/20170217015932.11898-1-sergey.senozhatsky@gmail.comSigned-off-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reported-by: default avatarTony Lindgren <tony@atomide.com>
      Tested-by: default avatarTony Lindgren <tony@atomide.com>
      Acked-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Lindgren <tony@atomide.com>
      Cc: Russell King <rmk@armlinux.org.uk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [wt: changes are in kernel/printk.c in 3.10]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      d46354fc
    • Willem de Bruijn's avatar
      tun: read vnet_hdr_sz once · 302c74b1
      Willem de Bruijn authored
      commit e1edab87 upstream.
      
      When IFF_VNET_HDR is enabled, a virtio_net header must precede data.
      Data length is verified to be greater than or equal to expected header
      length tun->vnet_hdr_sz before copying.
      
      Read this value once and cache locally, as it can be updated between
      the test and use (TOCTOU).
      
      [js] we have TUN_VNET_HDR in 3.12
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      CC: Eric Dumazet <edumazet@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      [wt: s/READ_ONCE/ACCESS_ONCE]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      302c74b1
    • Jim Mattson's avatar
      kvm: nVMX: Allow L1 to intercept software exceptions (#BP and #OF) · 58e4633a
      Jim Mattson authored
      commit ef85b673 upstream.
      
      When L2 exits to L0 due to "exception or NMI", software exceptions
      (#BP and #OF) for which L1 has requested an intercept should be
      handled by L1 rather than L0. Previously, only hardware exceptions
      were forwarded to L1.
      Signed-off-by: default avatarJim Mattson <jmattson@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      58e4633a
    • Josh Poimboeuf's avatar
      ftrace/x86: Fix triple fault with graph tracing and suspend-to-ram · b3d2d8af
      Josh Poimboeuf authored
      commit 34a477e5 upstream.
      
      On x86-32, with CONFIG_FIRMWARE and multiple CPUs, if you enable function
      graph tracing and then suspend to RAM, it will triple fault and reboot when
      it resumes.
      
      The first fault happens when booting a secondary CPU:
      
      startup_32_smp()
        load_ucode_ap()
          prepare_ftrace_return()
            ftrace_graph_is_dead()
              (accesses 'kill_ftrace_graph')
      
      The early head_32.S code calls into load_ucode_ap(), which has an an
      ftrace hook, so it calls prepare_ftrace_return(), which calls
      ftrace_graph_is_dead(), which tries to access the global
      'kill_ftrace_graph' variable with a virtual address, causing a fault
      because the CPU is still in real mode.
      
      The fix is to add a check in prepare_ftrace_return() to make sure it's
      running in protected mode before continuing.  The check makes sure the
      stack pointer is a virtual kernel address.  It's a bit of a hack, but
      it's not very intrusive and it works well enough.
      
      For reference, here are a few other (more difficult) ways this could
      have potentially been fixed:
      
      - Move startup_32_smp()'s call to load_ucode_ap() down to *after* paging
        is enabled.  (No idea what that would break.)
      
      - Track down load_ucode_ap()'s entire callee tree and mark all the
        functions 'notrace'.  (Probably not realistic.)
      
      - Pause graph tracing in ftrace_suspend_notifier_call() or bringup_cpu()
        or __cpu_up(), and ensure that the pause facility can be queried from
        real mode.
      Reported-by: default avatarPaul Menzel <pmenzel@molgen.mpg.de>
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Tested-by: default avatarPaul Menzel <pmenzel@molgen.mpg.de>
      Reviewed-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: "Rafael J . Wysocki" <rjw@rjwysocki.net>
      Cc: linux-acpi@vger.kernel.org
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Len Brown <lenb@kernel.org>
      Link: http://lkml.kernel.org/r/5c1272269a580660703ed2eccf44308e790c7a98.1492123841.git.jpoimboe@redhat.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      b3d2d8af
    • J. Bruce Fields's avatar
      nfsd: check for oversized NFSv2/v3 arguments · 21039053
      J. Bruce Fields authored
      commit e6838a29 upstream.
      
      A client can append random data to the end of an NFSv2 or NFSv3 RPC call
      without our complaining; we'll just stop parsing at the end of the
      expected data and ignore the rest.
      
      Encoded arguments and replies are stored together in an array of pages,
      and if a call is too large it could leave inadequate space for the
      reply.  This is normally OK because NFS RPC's typically have either
      short arguments and long replies (like READ) or long arguments and short
      replies (like WRITE).  But a client that sends an incorrectly long reply
      can violate those assumptions.  This was observed to cause crashes.
      
      Also, several operations increment rq_next_page in the decode routine
      before checking the argument size, which can leave rq_next_page pointing
      well past the end of the page array, causing trouble later in
      svc_free_pages.
      
      So, following a suggestion from Neil Brown, add a central check to
      enforce our expectation that no NFSv2/v3 call has both a large call and
      a large reply.
      
      As followup we may also want to rewrite the encoding routines to check
      more carefully that they aren't running off the end of the page array.
      
      We may also consider rejecting calls that have any extra garbage
      appended.  That would be safer, and within our rights by spec, but given
      the age of our server and the NFS protocol, and the fact that we've
      never enforced this before, we may need to balance that against the
      possibility of breaking some oddball client.
      Reported-by: default avatarTuomas Haanpää <thaan@synopsys.com>
      Reported-by: default avatarAri Kauppi <ari@synopsys.com>
      Reviewed-by: default avatarNeilBrown <neilb@suse.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      21039053
    • Al Viro's avatar
      p9_client_readdir() fix · 37a23f22
      Al Viro authored
      commit 71d6ad08 upstream.
      
      Don't assume that server is sane and won't return more data than
      asked for.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      37a23f22
    • Stefano Stabellini's avatar
      xen/x86: don't lose event interrupts · 1e46601d
      Stefano Stabellini authored
      commit c06b6d70 upstream.
      
      On slow platforms with unreliable TSC, such as QEMU emulated machines,
      it is possible for the kernel to request the next event in the past. In
      that case, in the current implementation of xen_vcpuop_clockevent, we
      simply return -ETIME. To be precise the Xen returns -ETIME and we pass
      it on. However the result of this is a missed event, which simply causes
      the kernel to hang.
      
      Instead it is better to always ask the hypervisor for a timer event,
      even if the timeout is in the past. That way there are no lost
      interrupts and the kernel survives. To do that, remove the
      VCPU_SSHOTTMR_future flag.
      Signed-off-by: default avatarStefano Stabellini <sstabellini@kernel.org>
      Acked-by: default avatarJuergen Gross <jgross@suse.com>
      Cc: Julia Lawall <julia.lawall@lip6.fr>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      1e46601d
    • santosh.shilimkar@oracle.com's avatar
      RDS: Fix the atomicity for congestion map update · 403642a0
      santosh.shilimkar@oracle.com authored
      commit e47db94e upstream.
      
      Two different threads with different rds sockets may be in
      rds_recv_rcvbuf_delta() via receive path. If their ports
      both map to the same word in the congestion map, then
      using non-atomic ops to update it could cause the map to
      be incorrect. Lets use atomics to avoid such an issue.
      
      Full credit to Wengang <wen.gang.wang@oracle.com> for
      finding the issue, analysing it and also pointing out
      to offending code with spin lock based fix.
      Reviewed-by: default avatarLeon Romanovsky <leon@leon.nu>
      Signed-off-by: default avatarWengang Wang <wen.gang.wang@oracle.com>
      Signed-off-by: default avatarSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Julia Lawall <julia.lawall@lip6.fr>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      403642a0
    • Corey Minyard's avatar
      MIPS: Fix crash registers on non-crashing CPUs · 32b0616e
      Corey Minyard authored
      commit c80e1b62 upstream.
      
      As part of handling a crash on an SMP system, an IPI is send to
      all other CPUs to save their current registers and stop.  It was
      using task_pt_regs(current) to get the registers, but that will
      only be accurate if the CPU was interrupted running in userland.
      Instead allow the architecture to pass in the registers (all
      pass NULL now, but allow for the future) and then use get_irq_regs()
      which should be accurate as we are in an interrupt.  Fall back to
      task_pt_regs(current) if nothing else is available.
      Signed-off-by: default avatarCorey Minyard <cminyard@mvista.com>
      Cc: David Daney <ddaney@caviumnetworks.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/13050/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Cc: Julia Lawall <julia.lawall@lip6.fr>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      32b0616e
    • Nikolay Aleksandrov's avatar
      ip6mr: fix notification device destruction · ab93db27
      Nikolay Aleksandrov authored
      commit 723b929c upstream.
      
      Andrey Konovalov reported a BUG caused by the ip6mr code which is caused
      because we call unregister_netdevice_many for a device that is already
      being destroyed. In IPv4's ipmr that has been resolved by two commits
      long time ago by introducing the "notify" parameter to the delete
      function and avoiding the unregister when called from a notifier, so
      let's do the same for ip6mr.
      
      The trace from Andrey:
      ------------[ cut here ]------------
      kernel BUG at net/core/dev.c:6813!
      invalid opcode: 0000 [#1] SMP KASAN
      Modules linked in:
      CPU: 1 PID: 1165 Comm: kworker/u4:3 Not tainted 4.11.0-rc7+ #251
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
      01/01/2011
      Workqueue: netns cleanup_net
      task: ffff880069208000 task.stack: ffff8800692d8000
      RIP: 0010:rollback_registered_many+0x348/0xeb0 net/core/dev.c:6813
      RSP: 0018:ffff8800692de7f0 EFLAGS: 00010297
      RAX: ffff880069208000 RBX: 0000000000000002 RCX: 0000000000000001
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88006af90569
      RBP: ffff8800692de9f0 R08: ffff8800692dec60 R09: 0000000000000000
      R10: 0000000000000006 R11: 0000000000000000 R12: ffff88006af90070
      R13: ffff8800692debf0 R14: dffffc0000000000 R15: ffff88006af90000
      FS:  0000000000000000(0000) GS:ffff88006cb00000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fe7e897d870 CR3: 00000000657e7000 CR4: 00000000000006e0
      Call Trace:
       unregister_netdevice_many.part.105+0x87/0x440 net/core/dev.c:7881
       unregister_netdevice_many+0xc8/0x120 net/core/dev.c:7880
       ip6mr_device_event+0x362/0x3f0 net/ipv6/ip6mr.c:1346
       notifier_call_chain+0x145/0x2f0 kernel/notifier.c:93
       __raw_notifier_call_chain kernel/notifier.c:394
       raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
       call_netdevice_notifiers_info+0x51/0x90 net/core/dev.c:1647
       call_netdevice_notifiers net/core/dev.c:1663
       rollback_registered_many+0x919/0xeb0 net/core/dev.c:6841
       unregister_netdevice_many.part.105+0x87/0x440 net/core/dev.c:7881
       unregister_netdevice_many net/core/dev.c:7880
       default_device_exit_batch+0x4fa/0x640 net/core/dev.c:8333
       ops_exit_list.isra.4+0x100/0x150 net/core/net_namespace.c:144
       cleanup_net+0x5a8/0xb40 net/core/net_namespace.c:463
       process_one_work+0xc04/0x1c10 kernel/workqueue.c:2097
       worker_thread+0x223/0x19c0 kernel/workqueue.c:2231
       kthread+0x35e/0x430 kernel/kthread.c:231
       ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
      Code: 3c 32 00 0f 85 70 0b 00 00 48 b8 00 02 00 00 00 00 ad de 49 89
      47 78 e9 93 fe ff ff 49 8d 57 70 49 8d 5f 78 eb 9e e8 88 7a 14 fe <0f>
      0b 48 8b 9d 28 fe ff ff e8 7a 7a 14 fe 48 b8 00 00 00 00 00
      RIP: rollback_registered_many+0x348/0xeb0 RSP: ffff8800692de7f0
      ---[ end trace e0b29c57e9b3292c ]---
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Tested-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      ab93db27
    • Xin Long's avatar
      sctp: listen on the sock only when it's state is listening or closed · 075de215
      Xin Long authored
      commit 34b2789f upstream.
      
      Now sctp doesn't check sock's state before listening on it. It could
      even cause changing a sock with any state to become a listening sock
      when doing sctp_listen.
      
      This patch is to fix it by checking sock's state in sctp_listen, so
      that it will listen on the sock with right state.
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Tested-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      075de215
    • Eric Dumazet's avatar
      net: neigh: guard against NULL solicit() method · 4d1b81c2
      Eric Dumazet authored
      commit 48481c8f upstream.
      
      Dmitry posted a nice reproducer of a bug triggering in neigh_probe()
      when dereferencing a NULL neigh->ops->solicit method.
      
      This can happen for arp_direct_ops/ndisc_direct_ops and similar,
      which can be used for NUD_NOARP neighbours (created when dev->header_ops
      is NULL). Admin can then force changing nud_state to some other state
      that would fire neigh timer.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      4d1b81c2
    • Arnd Bergmann's avatar
      gfs2: avoid uninitialized variable warning · c1b42042
      Arnd Bergmann authored
      commit 67893f12 upstream.
      
      We get a bogus warning about a potential uninitialized variable
      use in gfs2, because the compiler does not figure out that we
      never use the leaf number if get_leaf_nr() returns an error:
      
      fs/gfs2/dir.c: In function 'get_first_leaf':
      fs/gfs2/dir.c:802:9: warning: 'leaf_no' may be used uninitialized in this function [-Wmaybe-uninitialized]
      fs/gfs2/dir.c: In function 'dir_split_leaf':
      fs/gfs2/dir.c:1021:8: warning: 'leaf_no' may be used uninitialized in this function [-Wmaybe-uninitialized]
      
      Changing the 'if (!error)' to 'if (!IS_ERR_VALUE(error))' is
      sufficient to let gcc understand that this is exactly the same
      condition as in IS_ERR() so it can optimize the code path enough
      to understand it.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarBob Peterson <rpeterso@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      c1b42042
    • Arnd Bergmann's avatar
      hostap: avoid uninitialized variable use in hfa384x_get_rid · f6d81f27
      Arnd Bergmann authored
      commit 48dc5fb3 upstream.
      
      The driver reads a value from hfa384x_from_bap(), which may fail,
      and then assigns the value to a local variable. gcc detects that
      in in the failure case, the 'rlen' variable now contains
      uninitialized data:
      
      In file included from ../drivers/net/wireless/intersil/hostap/hostap_pci.c:220:0:
      drivers/net/wireless/intersil/hostap/hostap_hw.c: In function 'hfa384x_get_rid':
      drivers/net/wireless/intersil/hostap/hostap_hw.c:842:5: warning: 'rec' may be used uninitialized in this function [-Wmaybe-uninitialized]
        if (le16_to_cpu(rec.len) == 0) {
      
      This restructures the function as suggested by Russell King, to
      make it more readable and get more reliable error handling, by
      handling each failure mode using a goto.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      f6d81f27
    • Arnd Bergmann's avatar
      tty: nozomi: avoid a harmless gcc warning · 89986693
      Arnd Bergmann authored
      commit a4f642a8 upstream.
      
      The nozomi wireless data driver has its own helper function to
      transfer data from a FIFO, doing an extra byte swap on big-endian
      architectures, presumably to bring the data back into byte-serial
      order after readw() or readl() perform their implicit byteswap.
      
      This helper function is used in the receive_data() function to
      first read the length into a 32-bit variable, which causes
      a compile-time warning:
      
      drivers/tty/nozomi.c: In function 'receive_data':
      drivers/tty/nozomi.c:857:9: warning: 'size' may be used uninitialized in this function [-Wmaybe-uninitialized]
      
      The problem is that gcc is unsure whether the data was actually
      read or not. We know that it is at this point, so we can replace
      it with a single readl() to shut up that warning.
      
      I am leaving the byteswap in there, to preserve the existing
      behavior, even though this seems fishy: Reading the length of
      the data into a cpu-endian variable should normally not use
      a second byteswap on big-endian systems, unless the hardware
      is aware of the CPU endianess.
      
      There appears to be a lot more confusion about endianess in this
      driver, so it probably has not worked on big-endian systems in
      a long time, if ever, and I have no way to test it. It's well
      possible that this driver has not been used by anyone in a while,
      the last patch that looks like it was tested on the hardware is
      from 2008.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      89986693
    • Andrey Konovalov's avatar
      net/packet: fix overflow in check for tp_reserve · a8069aee
      Andrey Konovalov authored
      commit bcc5364b upstream.
      
      When calculating po->tp_hdrlen + po->tp_reserve the result can overflow.
      
      Fix by checking that tp_reserve <= INT_MAX on assign.
      Signed-off-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      a8069aee
    • Andrey Konovalov's avatar
      net/packet: fix overflow in check for tp_frame_nr · 492980ca
      Andrey Konovalov authored
      commit 8f8d28e4 upstream.
      
      When calculating rb->frames_per_block * req->tp_block_nr the result
      can overflow.
      
      Add a check that tp_block_size * tp_block_nr <= UINT_MAX.
      
      Since frames_per_block <= tp_block_size, the expression would
      never overflow.
      Signed-off-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      492980ca
    • Michael Ellerman's avatar
      powerpc: Reject binutils 2.24 when building little endian · 187d3b32
      Michael Ellerman authored
      commit 60e065f7 upstream.
      
      There is a bug in binutils 2.24 which causes miscompilation if we're
      building little endian and using weak symbols (which the kernel does).
      
      It is fixed in binutils commit 57fa7b8c7e59 "Correct elf_merge_st_other
      arguments for weak symbols", which is in binutils 2.25 and has been
      backported to the binutils 2.24 branch and has been picked up by most
      distros it seems.
      
      However if we're running stock 2.24 (no extra version) then the bug is
      present, so check for that and bail.
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      187d3b32
    • Yazen Ghannam's avatar
      x86/mce/AMD: Give a name to MCA bank 3 when accessed with legacy MSRs · 2c6896ab
      Yazen Ghannam authored
      commit 29f72ce3 upstream.
      
      MCA bank 3 is reserved on systems pre-Fam17h, so it didn't have a name.
      However, MCA bank 3 is defined on Fam17h systems and can be accessed
      using legacy MSRs. Without a name we get a stack trace on Fam17h systems
      when trying to register sysfs files for bank 3 on kernels that don't
      recognize Scalable MCA.
      
      Call MCA bank 3 "decode_unit" since this is what it represents on
      Fam17h. This will allow kernels without SMCA support to see this bank on
      Fam17h+ and prevent the stack trace. This will not affect older systems
      since this bank is reserved on them, i.e. it'll be ignored.
      
      Tested on AMD Fam15h and Fam17h systems.
      
        WARNING: CPU: 26 PID: 1 at lib/kobject.c:210 kobject_add_internal
        kobject: (ffff88085bb256c0): attempted to be registered with empty name!
        ...
        Call Trace:
         kobject_add_internal
         kobject_add
         kobject_create_and_add
         threshold_create_device
         threshold_init_device
      Signed-off-by: default avatarYazen Ghannam <yazen.ghannam@amd.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Link: http://lkml.kernel.org/r/1490102285-3659-1-git-send-email-Yazen.Ghannam@amd.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      2c6896ab
    • Sebastian Siewior's avatar
      ubi/upd: Always flush after prepared for an update · e8983321
      Sebastian Siewior authored
      commit 9cd9a21c upstream.
      
      In commit 6afaf8a4 ("UBI: flush wl before clearing update marker") I
      managed to trigger and fix a similar bug. Now here is another version of
      which I assumed it wouldn't matter back then but it turns out UBI has a
      check for it and will error out like this:
      
      |ubi0 warning: validate_vid_hdr: inconsistent used_ebs
      |ubi0 error: validate_vid_hdr: inconsistent VID header at PEB 592
      
      All you need to trigger this is? "ubiupdatevol /dev/ubi0_0 file" + a
      powercut in the middle of the operation.
      ubi_start_update() sets the update-marker and puts all EBs on the erase
      list. After that userland can proceed to write new data while the old EB
      aren't erased completely. A powercut at this point is usually not that
      much of a tragedy. UBI won't give read access to the static volume
      because it has the update marker. It will most likely set the corrupted
      flag because it misses some EBs.
      So we are all good. Unless the size of the image that has been written
      differs from the old image in the magnitude of at least one EB. In that
      case UBI will find two different values for `used_ebs' and refuse to
      attach the image with the error message mentioned above.
      
      So in order not to get in the situation, the patch will ensure that we
      wait until everything is removed before it tries to write any data.
      The alternative would be to detect such a case and remove all EBs at the
      attached time after we processed the volume-table and see the
      update-marker set. The patch looks bigger and I doubt it is worth it
      since usually the write() will wait from time to time for a new EB since
      usually there not that many spare EB that can be used.
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarRichard Weinberger <richard@nod.at>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      e8983321
    • Vitaly Kuznetsov's avatar
      Drivers: hv: get rid of timeout in vmbus_open() · 0854b58f
      Vitaly Kuznetsov authored
      commit 396e287f upstream.
      
      vmbus_teardown_gpadl() can result in infinite wait when it is called on 5
      second timeout in vmbus_open(). The issue is caused by the fact that gpadl
      teardown operation won't ever succeed for an opened channel and the timeout
      isn't always enough. As a guest, we can always trust the host to respond to
      our request (and there is nothing we can do if it doesn't).
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarK. Y. Srinivasan <kys@microsoft.com>
      Signed-off-by: default avatarSumit Semwal <sumit.semwal@linaro.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      0854b58f
    • Vitaly Kuznetsov's avatar
      Drivers: hv: don't leak memory in vmbus_establish_gpadl() · 0c792ee1
      Vitaly Kuznetsov authored
      commit 7cc80c98 upstream.
      
      In some cases create_gpadl_header() allocates submessages but we never
      free them.
      
      [sumits] Note for stable:
      Upstream commit 4d637632:
      (Drivers: hv: get rid of redundant messagecount in create_gpadl_header())
      changes the list usage to initialize list header in all cases; that patch
      isn't added to stable, so the current patch is modified a little bit from
      the upstream commit to check if the list is valid or not.
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarK. Y. Srinivasan <kys@microsoft.com>
      Signed-off-by: default avatarSumit Semwal <sumit.semwal@linaro.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      0c792ee1
    • Mantas M's avatar
      net: ipv6: check route protocol when deleting routes · 3fa16561
      Mantas M authored
      commit c2ed1880 upstream.
      
      The protocol field is checked when deleting IPv4 routes, but ignored for
      IPv6, which causes problems with routing daemons accidentally deleting
      externally set routes (observed by multiple bird6 users).
      
      This can be verified using `ip -6 route del <prefix> proto something`.
      Signed-off-by: default avatarMantas Mikulėnas <grawity@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      3fa16561
    • Ben Hutchings's avatar
      catc: Use heap buffer for memory size test · 6b7c1525
      Ben Hutchings authored
      commit 2d6a0e9d upstream.
      
      Allocating USB buffers on the stack is not portable, and no longer
      works on x86_64 (with VMAP_STACK enabled as per default).
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Brad Spengler <spender@grsecurity.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      6b7c1525
    • Ben Hutchings's avatar
      catc: Combine failure cleanup code in catc_probe() · f3487f84
      Ben Hutchings authored
      commit d4114914 upstream.
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      f3487f84
    • Omar Sandoval's avatar
      virtio-console: avoid DMA from stack · fad53245
      Omar Sandoval authored
      commit c4baad50 upstream.
      
      put_chars() stuffs the buffer it gets into an sg, but that buffer may be
      on the stack. This breaks with CONFIG_VMAP_STACK=y (for me, it
      manifested as printks getting turned into NUL bytes).
      Signed-off-by: default avatarOmar Sandoval <osandov@fb.com>
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Reviewed-by: default avatarAmit Shah <amit.shah@redhat.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Brad Spengler <spender@grsecurity.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      fad53245
    • Kees Cook's avatar
      mm: Tighten x86 /dev/mem with zeroing reads · 081fb3dc
      Kees Cook authored
      commit a4866aa8 upstream.
      
      Under CONFIG_STRICT_DEVMEM, reading System RAM through /dev/mem is
      disallowed. However, on x86, the first 1MB was always allowed for BIOS
      and similar things, regardless of it actually being System RAM. It was
      possible for heap to end up getting allocated in low 1MB RAM, and then
      read by things like x86info or dd, which would trip hardened usercopy:
      
      usercopy: kernel memory exposure attempt detected from ffff880000090000 (dma-kmalloc-256) (4096 bytes)
      
      This changes the x86 exception for the low 1MB by reading back zeros for
      System RAM areas instead of blindly allowing them. More work is needed to
      extend this to mmap, but currently mmap doesn't go through usercopy, so
      hardened usercopy won't Oops the kernel.
      Reported-by: default avatarTommi Rantala <tommi.t.rantala@nokia.com>
      Tested-by: default avatarTommi Rantala <tommi.t.rantala@nokia.com>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Brad Spengler <spender@grsecurity.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      081fb3dc
    • Lee, Chun-Yi's avatar
      platform/x86: acer-wmi: setup accelerometer when ACPI device was found · c9b40c2b
      Lee, Chun-Yi authored
      commit f9ac89f5 upstream.
      
      The 98d610c3 patch was introduced since v4.11-rc1 that it causes
      that the accelerometer input device will not be created on workable
      machines because the HID string comparing logic is wrong.
      
      And, the patch doesn't prevent that the accelerometer input device
      be created on the machines that have no BST0001. That's because
      the acpi_get_devices() returns success even it didn't find any
      match device.
      
      This patch fixed the HID string comparing logic of BST0001 device.
      And, it also makes sure that the acpi_get_devices() returns
      acpi_handle for BST0001.
      
      Fixes: 98d610c3 ("acer-wmi: setup accelerometer when machine has appropriate notify event")
      Reference: https://bugzilla.kernel.org/show_bug.cgi?id=193761Reported-by: default avatarSamuel Sieb <samuel-kbugs@sieb.net>
      Signed-off-by: default avatar"Lee, Chun-Yi" <jlee@suse.com>
      Signed-off-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      c9b40c2b
    • Chun-Yi Lee's avatar
      platform/x86: acer-wmi: setup accelerometer when machine has appropriate notify event · 9b2b8b09
      Chun-Yi Lee authored
      commit 98d610c3 upstream.
      
      The accelerometer event relies on the ACERWMID_EVENT_GUID notify.
      So, this patch changes the codes to setup accelerometer input device
      when detected ACERWMID_EVENT_GUID. It avoids that the accel input
      device created on every Acer machines.
      
      In addition, patch adds a clearly parsing logic of accelerometer hid
      to acer_wmi_get_handle_cb callback function. It is positive matching
      the "SENR" name with "BST0001" device to avoid non-supported hardware.
      Reported-by: default avatarBjørn Mork <bjorn@mork.no>
      Cc: Darren Hart <dvhart@infradead.org>
      Signed-off-by: default avatarChun-Yi Lee <jlee@suse.com>
      [andy: slightly massage commit message]
      Signed-off-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      9b2b8b09
    • Max Bires's avatar
      char: lack of bool string made CONFIG_DEVPORT always on · c0dd8fff
      Max Bires authored
      commit f2cfa58b upstream.
      
      Without a bool string present, using "# CONFIG_DEVPORT is not set" in
      defconfig files would not actually unset devport. This esnured that
      /dev/port was always on, but there are reasons a user may wish to
      disable it (smaller kernel, attack surface reduction) if it's not being
      used. Adding a message here in order to make this user visible.
      Signed-off-by: default avatarMax Bires <jbires@google.com>
      Acked-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      c0dd8fff
    • Juergen Gross's avatar
      xen, fbfront: fix connecting to backend · 3fa1aeb6
      Juergen Gross authored
      commit 9121b15b upstream.
      
      Connecting to the backend isn't working reliably in xen-fbfront: in
      case XenbusStateInitWait of the backend has been missed the backend
      transition to XenbusStateConnected will trigger the connected state
      only without doing the actions required when the backend has
      connected.
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarBartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      3fa1aeb6
    • Nicholas Bellinger's avatar
      iscsi-target: Drop work-around for legacy GlobalSAN initiator · 66940417
      Nicholas Bellinger authored
      commit 1c99de98 upstream.
      
      Once upon a time back in 2009, a work-around was added to support
      the GlobalSAN iSCSI initiator v3.3 for MacOSX, which during login
      did not propose nor respond to MaxBurstLength, FirstBurstLength,
      DefaultTime2Wait and DefaultTime2Retain keys.
      
      The work-around in iscsi_check_proposer_for_optional_reply()
      allowed the missing keys to be proposed, but did not require
      waiting for a response before moving to full feature phase
      operation.  This allowed GlobalSAN v3.3 to work out-of-the
      box, and for many years we didn't run into login interopt
      issues with any other initiators..
      
      Until recently, when Martin tried a QLogic 57840S iSCSI Offload
      HBA on Windows 2016 which completed login, but subsequently
      failed with:
      
          Got unknown iSCSI OpCode: 0x43
      
      The issue was QLogic MSFT side did not propose DefaultTime2Wait +
      DefaultTime2Retain, so LIO proposes them itself, and immediately
      transitions to full feature phase because of the GlobalSAN hack.
      However, the QLogic MSFT side still attempts to respond to
      DefaultTime2Retain + DefaultTime2Wait, even though LIO has set
      ISCSI_FLAG_LOGIN_NEXT_STAGE3 + ISCSI_FLAG_LOGIN_TRANSIT
      in last login response.
      
      So while the QLogic MSFT side should have been proposing these
      two keys to start, it was doing the correct thing per RFC-3720
      attempting to respond to proposed keys before transitioning to
      full feature phase.
      
      All that said, recent versions of GlobalSAN iSCSI (v5.3.0.541)
      does correctly propose the four keys during login, making the
      original work-around moot.
      
      So in order to allow QLogic MSFT to run unmodified as-is, go
      ahead and drop this long standing work-around.
      Reported-by: default avatarMartin Svec <martin.svec@zoner.cz>
      Cc: Martin Svec <martin.svec@zoner.cz>
      Cc: Himanshu Madhani <Himanshu.Madhani@cavium.com>
      Cc: Arun Easi <arun.easi@cavium.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      66940417
    • Nicholas Bellinger's avatar
      iscsi-target: Fix TMR reference leak during session shutdown · e71bfbe7
      Nicholas Bellinger authored
      commit efb2ea77 upstream.
      
      This patch fixes a iscsi-target specific TMR reference leak
      during session shutdown, that could occur when a TMR was
      quiesced before the hand-off back to iscsi-target code
      via transport_cmd_check_stop_to_fabric().
      
      The reference leak happens because iscsit_free_cmd() was
      incorrectly skipping the final target_put_sess_cmd() for
      TMRs when transport_generic_free_cmd() returned zero because
      the se_cmd->cmd_kref did not reach zero, due to the missing
      se_cmd assignment in original code.
      
      The result was iscsi_cmd and it's associated se_cmd memory
      would be freed once se_sess->sess_cmd_map where released,
      but the associated se_tmr_req was leaked and remained part
      of se_device->dev_tmr_list.
      
      This bug would manfiest itself as kernel paging request
      OOPsen in core_tmr_lun_reset(), when a left-over se_tmr_req
      attempted to dereference it's se_cmd pointer that had
      already been released during normal session shutdown.
      
      To address this bug, go ahead and treat ISCSI_OP_SCSI_CMD
      and ISCSI_OP_SCSI_TMFUNC the same when there is an extra
      se_cmd->cmd_kref to drop in iscsit_free_cmd(), and use
      op_scsi to signal __iscsit_free_cmd() when the former
      needs to clear any further iscsi related I/O state.
      Reported-by: default avatarRob Millner <rlm@daterainc.com>
      Cc: Rob Millner <rlm@daterainc.com>
      Reported-by: default avatarChu Yuan Lin <cyl@datera.io>
      Cc: Chu Yuan Lin <cyl@datera.io>
      Tested-by: default avatarChu Yuan Lin <cyl@datera.io>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      e71bfbe7