1. 07 Jun, 2017 40 commits
    • Aleksa Sarai's avatar
      fs: exec: apply CLOEXEC before changing dumpable task flags · 43fe8188
      Aleksa Sarai authored
      commit 613cc2b6 upstream.
      
      If you have a process that has set itself to be non-dumpable, and it
      then undergoes exec(2), any CLOEXEC file descriptors it has open are
      "exposed" during a race window between the dumpable flags of the process
      being reset for exec(2) and CLOEXEC being applied to the file
      descriptors. This can be exploited by a process by attempting to access
      /proc/<pid>/fd/... during this window, without requiring CAP_SYS_PTRACE.
      
      The race in question is after set_dumpable has been (for get_link,
      though the trace is basically the same for readlink):
      
      [vfs]
      -> proc_pid_link_inode_operations.get_link
         -> proc_pid_get_link
            -> proc_fd_access_allowed
               -> ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS);
      
      Which will return 0, during the race window and CLOEXEC file descriptors
      will still be open during this window because do_close_on_exec has not
      been called yet. As a result, the ordering of these calls should be
      reversed to avoid this race window.
      
      This is of particular concern to container runtimes, where joining a
      PID namespace with file descriptors referring to the host filesystem
      can result in security issues (since PRCTL_SET_DUMPABLE doesn't protect
      against access of CLOEXEC file descriptors -- file descriptors which may
      reference filesystem objects the container shouldn't have access to).
      
      Cc: dev@opencontainers.org
      Reported-by: default avatarMichael Crosby <crosbymichael@gmail.com>
      Signed-off-by: default avatarAleksa Sarai <asarai@suse.de>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      43fe8188
    • Dave Jones's avatar
      ipv6: handle -EFAULT from skb_copy_bits · 8676954b
      Dave Jones authored
      commit a98f9175 upstream.
      
      By setting certain socket options on ipv6 raw sockets, we can confuse the
      length calculation in rawv6_push_pending_frames triggering a BUG_ON.
      
      RIP: 0010:[<ffffffff817c6390>] [<ffffffff817c6390>] rawv6_sendmsg+0xc30/0xc40
      RSP: 0018:ffff881f6c4a7c18  EFLAGS: 00010282
      RAX: 00000000fffffff2 RBX: ffff881f6c681680 RCX: 0000000000000002
      RDX: ffff881f6c4a7cf8 RSI: 0000000000000030 RDI: ffff881fed0f6a00
      RBP: ffff881f6c4a7da8 R08: 0000000000000000 R09: 0000000000000009
      R10: ffff881fed0f6a00 R11: 0000000000000009 R12: 0000000000000030
      R13: ffff881fed0f6a00 R14: ffff881fee39ba00 R15: ffff881fefa93a80
      
      Call Trace:
       [<ffffffff8118ba23>] ? unmap_page_range+0x693/0x830
       [<ffffffff81772697>] inet_sendmsg+0x67/0xa0
       [<ffffffff816d93f8>] sock_sendmsg+0x38/0x50
       [<ffffffff816d982f>] SYSC_sendto+0xef/0x170
       [<ffffffff816da27e>] SyS_sendto+0xe/0x10
       [<ffffffff81002910>] do_syscall_64+0x50/0xa0
       [<ffffffff817f7cbc>] entry_SYSCALL64_slow_path+0x25/0x25
      
      Handle by jumping to the failure path if skb_copy_bits gets an EFAULT.
      
      Reproducer:
      
      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      #include <unistd.h>
      #include <sys/types.h>
      #include <sys/socket.h>
      #include <netinet/in.h>
      
      #define LEN 504
      
      int main(int argc, char* argv[])
      {
      	int fd;
      	int zero = 0;
      	char buf[LEN];
      
      	memset(buf, 0, LEN);
      
      	fd = socket(AF_INET6, SOCK_RAW, 7);
      
      	setsockopt(fd, SOL_IPV6, IPV6_CHECKSUM, &zero, 4);
      	setsockopt(fd, SOL_IPV6, IPV6_DSTOPTS, &buf, LEN);
      
      	sendto(fd, buf, 1, 0, (struct sockaddr *) buf, 110);
      }
      Signed-off-by: default avatarDave Jones <davej@codemonkey.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      8676954b
    • Alexander Popov's avatar
      tty: n_hdlc: get rid of racy n_hdlc.tbuf · 030e0b58
      Alexander Popov authored
      commit 82f2341c upstream.
      
      Currently N_HDLC line discipline uses a self-made singly linked list for
      data buffers and has n_hdlc.tbuf pointer for buffer retransmitting after
      an error.
      
      The commit be10eb75
      ("tty: n_hdlc add buffer flushing") introduced racy access to n_hdlc.tbuf.
      After tx error concurrent flush_tx_queue() and n_hdlc_send_frames() can put
      one data buffer to tx_free_buf_list twice. That causes double free in
      n_hdlc_release().
      
      Let's use standard kernel linked list and get rid of n_hdlc.tbuf:
      in case of tx error put current data buffer after the head of tx_buf_list.
      Signed-off-by: default avatarAlexander Popov <alex.popov@linux.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      030e0b58
    • Jiri Slaby's avatar
      TTY: n_hdlc, fix lockdep false positive · a9511e79
      Jiri Slaby authored
      commit e9b736d8 upstream.
      
      The class of 4 n_hdls buf locks is the same because a single function
      n_hdlc_buf_list_init is used to init all the locks. But since
      flush_tx_queue takes n_hdlc->tx_buf_list.spinlock and then calls
      n_hdlc_buf_put which takes n_hdlc->tx_free_buf_list.spinlock, lockdep
      emits a warning:
      =============================================
      [ INFO: possible recursive locking detected ]
      4.3.0-25.g91e30a7-default #1 Not tainted
      ---------------------------------------------
      a.out/1248 is trying to acquire lock:
       (&(&list->spinlock)->rlock){......}, at: [<ffffffffa01fd020>] n_hdlc_buf_put+0x20/0x60 [n_hdlc]
      
      but task is already holding lock:
       (&(&list->spinlock)->rlock){......}, at: [<ffffffffa01fdc07>] n_hdlc_tty_ioctl+0x127/0x1d0 [n_hdlc]
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&(&list->spinlock)->rlock);
        lock(&(&list->spinlock)->rlock);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      2 locks held by a.out/1248:
       #0:  (&tty->ldisc_sem){++++++}, at: [<ffffffff814c9eb0>] tty_ldisc_ref_wait+0x20/0x50
       #1:  (&(&list->spinlock)->rlock){......}, at: [<ffffffffa01fdc07>] n_hdlc_tty_ioctl+0x127/0x1d0 [n_hdlc]
      ...
      Call Trace:
      ...
       [<ffffffff81738fd0>] _raw_spin_lock_irqsave+0x50/0x70
       [<ffffffffa01fd020>] n_hdlc_buf_put+0x20/0x60 [n_hdlc]
       [<ffffffffa01fdc24>] n_hdlc_tty_ioctl+0x144/0x1d0 [n_hdlc]
       [<ffffffff814c25c1>] tty_ioctl+0x3f1/0xe40
      ...
      
      Fix it by initializing the spin_locks separately. This removes also
      reduntand memset of a freshly kzallocated space.
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      a9511e79
    • David Hildenbrand's avatar
      KVM: kvm_io_bus_unregister_dev() should never fail · 211bcbcc
      David Hildenbrand authored
      commit 90db1043 upstream.
      
      No caller currently checks the return value of
      kvm_io_bus_unregister_dev(). This is evil, as all callers silently go on
      freeing their device. A stale reference will remain in the io_bus,
      getting at least used again, when the iobus gets teared down on
      kvm_destroy_vm() - leading to use after free errors.
      
      There is nothing the callers could do, except retrying over and over
      again.
      
      So let's simply remove the bus altogether, print an error and make
      sure no one can access this broken bus again (returning -ENOMEM on any
      attempt to access it).
      
      Fixes: e93f8a0f ("KVM: convert io_bus to SRCU")
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Reviewed-by: default avatarCornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: default avatarDavid Hildenbrand <david@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      [wt: no kvm_io_bus_read_cookie in 3.10, slightly different constructs]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      211bcbcc
    • Amos Kong's avatar
      kvm: exclude ioeventfd from counting kvm_io_range limit · d8abb8b8
      Amos Kong authored
      commit 6ea34c9b upstream.
      
      We can easily reach the 1000 limit by start VM with a couple
      hundred I/O devices (multifunction=on). The hardcode limit
      already been adjusted 3 times (6 ~ 200 ~ 300 ~ 1000).
      
      In userspace, we already have maximum file descriptor to
      limit ioeventfd count. But kvm_io_bus devices also are used
      for pit, pic, ioapic, coalesced_mmio. They couldn't be limited
      by maximum file descriptor.
      
      Currently only ioeventfds take too much kvm_io_bus devices,
      so just exclude it from counting kvm_io_range limit.
      
      Also fixed one indent issue in kvm_host.h
      Signed-off-by: default avatarAmos Kong <akong@redhat.com>
      Reviewed-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: default avatarGleb Natapov <gleb@redhat.com>
      [wt: next patch depends on this one]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      d8abb8b8
    • Peter Xu's avatar
      KVM: x86: clear bus pointer when destroyed · dd8db853
      Peter Xu authored
      commit df630b8c upstream.
      
      When releasing the bus, let's clear the bus pointers to mark it out. If
      any further device unregister happens on this bus, we know that we're
      done if we found the bus being released already.
      Signed-off-by: default avatarPeter Xu <peterx@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      dd8db853
    • Marcelo Ricardo Leitner's avatar
      sctp: deny peeloff operation on asocs with threads sleeping on it · 620d73a9
      Marcelo Ricardo Leitner authored
      commit dfcb9f4f upstream.
      
      commit 2dcab598 ("sctp: avoid BUG_ON on sctp_wait_for_sndbuf")
      attempted to avoid a BUG_ON call when the association being used for a
      sendmsg() is blocked waiting for more sndbuf and another thread did a
      peeloff operation on such asoc, moving it to another socket.
      
      As Ben Hutchings noticed, then in such case it would return without
      locking back the socket and would cause two unlocks in a row.
      
      Further analysis also revealed that it could allow a double free if the
      application managed to peeloff the asoc that is created during the
      sendmsg call, because then sctp_sendmsg() would try to free the asoc
      that was created only for that call.
      
      This patch takes another approach. It will deny the peeloff operation
      if there is a thread sleeping on the asoc, so this situation doesn't
      exist anymore. This avoids the issues described above and also honors
      the syscalls that are already being handled (it can be multiple sendmsg
      calls).
      
      Joint work with Xin Long.
      
      Fixes: 2dcab598 ("sctp: avoid BUG_ON on sctp_wait_for_sndbuf")
      Cc: Alexander Popov <alex.popov@linux.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      620d73a9
    • Marcelo Ricardo Leitner's avatar
      sctp: avoid BUG_ON on sctp_wait_for_sndbuf · 1f7cb732
      Marcelo Ricardo Leitner authored
      commit 2dcab598 upstream.
      
      Alexander Popov reported that an application may trigger a BUG_ON in
      sctp_wait_for_sndbuf if the socket tx buffer is full, a thread is
      waiting on it to queue more data and meanwhile another thread peels off
      the association being used by the first thread.
      
      This patch replaces the BUG_ON call with a proper error handling. It
      will return -EPIPE to the original sendmsg call, similarly to what would
      have been done if the association wasn't found in the first place.
      Acked-by: default avatarAlexander Popov <alex.popov@linux.com>
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Reviewed-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      1f7cb732
    • Li RongQing's avatar
      ipv6: fix the use of pcpu_tstats in ip6_tunnel · abcfcd04
      Li RongQing authored
      commit abb6013c upstream.
      
      when read/write the 64bit data, the correct lock should be hold.
      
      Fixes: 87b6d218 ("tunnel: implement 64 bits statistics")
      
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarLi RongQing <roy.qing.li@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      abcfcd04
    • Dan Carpenter's avatar
      ipv6: pointer math error in ip6_tnl_parse_tlv_enc_lim() · e246c098
      Dan Carpenter authored
      commit 63117f09 upstream.
      
      Casting is a high precedence operation but "off" and "i" are in terms of
      bytes so we need to have some parenthesis here.
      
      Fixes: fbfa743a ("ipv6: fix ip6_tnl_parse_tlv_enc_lim()")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      e246c098
    • Eric Dumazet's avatar
      ipv6: fix ip6_tnl_parse_tlv_enc_lim() · 72bbf335
      Eric Dumazet authored
      commit fbfa743a upstream.
      
      This function suffers from multiple issues.
      
      First one is that pskb_may_pull() may reallocate skb->head,
      so the 'raw' pointer needs either to be reloaded or not used at all.
      
      Second issue is that NEXTHDR_DEST handling does not validate
      that the options are present in skb->data, so we might read
      garbage or access non existent memory.
      
      With help from Willem de Bruijn.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDmitry Vyukov  <dvyukov@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      72bbf335
    • Takashi Iwai's avatar
      xc2028: Fix use-after-free bug properly · 3bdb1571
      Takashi Iwai authored
      commit 22a1e778 upstream.
      
      The commit 8dfbcc43 ("[media] xc2028: avoid use after free") tried
      to address the reported use-after-free by clearing the reference.
      
      However, it's clearing the wrong pointer; it sets NULL to
      priv->ctrl.fname, but it's anyway overwritten by the next line
      memcpy(&priv->ctrl, p, sizeof(priv->ctrl)).
      
      OTOH, the actual code accessing the freed string is the strcmp() call
      with priv->fname:
      	if (!firmware_name[0] && p->fname &&
      	    priv->fname && strcmp(p->fname, priv->fname))
      		free_firmware(priv);
      
      where priv->fname points to the previous file name, and this was
      already freed by kfree().
      
      For fixing the bug properly, this patch does the following:
      
      - Keep the copy of firmware file name in only priv->fname,
        priv->ctrl.fname isn't changed;
      - The allocation is done only when the firmware gets loaded;
      - The kfree() is called in free_firmware() commonly
      
      Fixes: commit 8dfbcc43 ('[media] xc2028: avoid use after free')
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab@s-opensource.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      3bdb1571
    • Dan Carpenter's avatar
      xc2028: unlock on error in xc2028_set_config() · d9a854e5
      Dan Carpenter authored
      commit 210bd104 upstream.
      
      We have to unlock before returning -ENOMEM.
      
      Fixes: 8dfbcc43 ('[media] xc2028: avoid use after free')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab@osg.samsung.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      d9a854e5
    • Mauro Carvalho Chehab's avatar
      xc2028: avoid use after free · bb4884f2
      Mauro Carvalho Chehab authored
      commit 8dfbcc43 upstream.
      
      If struct xc2028_config is passed without a firmware name,
      the following trouble may happen:
      
      [11009.907205] xc2028 5-0061: type set to XCeive xc2028/xc3028 tuner
      [11009.907491] ==================================================================
      [11009.907750] BUG: KASAN: use-after-free in strcmp+0x96/0xb0 at addr ffff8803bd78ab40
      [11009.907992] Read of size 1 by task modprobe/28992
      [11009.907994] =============================================================================
      [11009.907997] BUG kmalloc-16 (Tainted: G        W      ): kasan: bad access detected
      [11009.907999] -----------------------------------------------------------------------------
      
      [11009.908008] INFO: Allocated in xhci_urb_enqueue+0x214/0x14c0 [xhci_hcd] age=0 cpu=3 pid=28992
      [11009.908012] 	___slab_alloc+0x581/0x5b0
      [11009.908014] 	__slab_alloc+0x51/0x90
      [11009.908017] 	__kmalloc+0x27b/0x350
      [11009.908022] 	xhci_urb_enqueue+0x214/0x14c0 [xhci_hcd]
      [11009.908026] 	usb_hcd_submit_urb+0x1e8/0x1c60
      [11009.908029] 	usb_submit_urb+0xb0e/0x1200
      [11009.908032] 	usb_serial_generic_write_start+0xb6/0x4c0
      [11009.908035] 	usb_serial_generic_write+0x92/0xc0
      [11009.908039] 	usb_console_write+0x38a/0x560
      [11009.908045] 	call_console_drivers.constprop.14+0x1ee/0x2c0
      [11009.908051] 	console_unlock+0x40d/0x900
      [11009.908056] 	vprintk_emit+0x4b4/0x830
      [11009.908061] 	vprintk_default+0x1f/0x30
      [11009.908064] 	printk+0x99/0xb5
      [11009.908067] 	kasan_report_error+0x10a/0x550
      [11009.908070] 	__asan_report_load1_noabort+0x43/0x50
      [11009.908074] INFO: Freed in xc2028_set_config+0x90/0x630 [tuner_xc2028] age=1 cpu=3 pid=28992
      [11009.908077] 	__slab_free+0x2ec/0x460
      [11009.908080] 	kfree+0x266/0x280
      [11009.908083] 	xc2028_set_config+0x90/0x630 [tuner_xc2028]
      [11009.908086] 	xc2028_attach+0x310/0x8a0 [tuner_xc2028]
      [11009.908090] 	em28xx_attach_xc3028.constprop.7+0x1f9/0x30d [em28xx_dvb]
      [11009.908094] 	em28xx_dvb_init.part.3+0x8e4/0x5cf4 [em28xx_dvb]
      [11009.908098] 	em28xx_dvb_init+0x81/0x8a [em28xx_dvb]
      [11009.908101] 	em28xx_register_extension+0xd9/0x190 [em28xx]
      [11009.908105] 	em28xx_dvb_register+0x10/0x1000 [em28xx_dvb]
      [11009.908108] 	do_one_initcall+0x141/0x300
      [11009.908111] 	do_init_module+0x1d0/0x5ad
      [11009.908114] 	load_module+0x6666/0x9ba0
      [11009.908117] 	SyS_finit_module+0x108/0x130
      [11009.908120] 	entry_SYSCALL_64_fastpath+0x16/0x76
      [11009.908123] INFO: Slab 0xffffea000ef5e280 objects=25 used=25 fp=0x          (null) flags=0x2ffff8000004080
      [11009.908126] INFO: Object 0xffff8803bd78ab40 @offset=2880 fp=0x0000000000000001
      
      [11009.908130] Bytes b4 ffff8803bd78ab30: 01 00 00 00 2a 07 00 00 9d 28 00 00 01 00 00 00  ....*....(......
      [11009.908133] Object ffff8803bd78ab40: 01 00 00 00 00 00 00 00 b0 1d c3 6a 00 88 ff ff  ...........j....
      [11009.908137] CPU: 3 PID: 28992 Comm: modprobe Tainted: G    B   W       4.5.0-rc1+ #43
      [11009.908140] Hardware name:                  /NUC5i7RYB, BIOS RYBDWi35.86A.0350.2015.0812.1722 08/12/2015
      [11009.908142]  ffff8803bd78a000 ffff8802c273f1b8 ffffffff81932007 ffff8803c6407a80
      [11009.908148]  ffff8802c273f1e8 ffffffff81556759 ffff8803c6407a80 ffffea000ef5e280
      [11009.908153]  ffff8803bd78ab40 dffffc0000000000 ffff8802c273f210 ffffffff8155ccb4
      [11009.908158] Call Trace:
      [11009.908162]  [<ffffffff81932007>] dump_stack+0x4b/0x64
      [11009.908165]  [<ffffffff81556759>] print_trailer+0xf9/0x150
      [11009.908168]  [<ffffffff8155ccb4>] object_err+0x34/0x40
      [11009.908171]  [<ffffffff8155f260>] kasan_report_error+0x230/0x550
      [11009.908175]  [<ffffffff81237d71>] ? trace_hardirqs_off_caller+0x21/0x290
      [11009.908179]  [<ffffffff8155e926>] ? kasan_unpoison_shadow+0x36/0x50
      [11009.908182]  [<ffffffff8155f5c3>] __asan_report_load1_noabort+0x43/0x50
      [11009.908185]  [<ffffffff8155ea00>] ? __asan_register_globals+0x50/0xa0
      [11009.908189]  [<ffffffff8194cea6>] ? strcmp+0x96/0xb0
      [11009.908192]  [<ffffffff8194cea6>] strcmp+0x96/0xb0
      [11009.908196]  [<ffffffffa13ba4ac>] xc2028_set_config+0x15c/0x630 [tuner_xc2028]
      [11009.908200]  [<ffffffffa13bac90>] xc2028_attach+0x310/0x8a0 [tuner_xc2028]
      [11009.908203]  [<ffffffff8155ea78>] ? memset+0x28/0x30
      [11009.908206]  [<ffffffffa13ba980>] ? xc2028_set_config+0x630/0x630 [tuner_xc2028]
      [11009.908211]  [<ffffffffa157a59a>] em28xx_attach_xc3028.constprop.7+0x1f9/0x30d [em28xx_dvb]
      [11009.908215]  [<ffffffffa157aa2a>] ? em28xx_dvb_init.part.3+0x37c/0x5cf4 [em28xx_dvb]
      [11009.908219]  [<ffffffffa157a3a1>] ? hauppauge_hvr930c_init+0x487/0x487 [em28xx_dvb]
      [11009.908222]  [<ffffffffa01795ac>] ? lgdt330x_attach+0x1cc/0x370 [lgdt330x]
      [11009.908226]  [<ffffffffa01793e0>] ? i2c_read_demod_bytes.isra.2+0x210/0x210 [lgdt330x]
      [11009.908230]  [<ffffffff812e87d0>] ? ref_module.part.15+0x10/0x10
      [11009.908233]  [<ffffffff812e56e0>] ? module_assert_mutex_or_preempt+0x80/0x80
      [11009.908238]  [<ffffffffa157af92>] em28xx_dvb_init.part.3+0x8e4/0x5cf4 [em28xx_dvb]
      [11009.908242]  [<ffffffffa157a6ae>] ? em28xx_attach_xc3028.constprop.7+0x30d/0x30d [em28xx_dvb]
      [11009.908245]  [<ffffffff8195222d>] ? string+0x14d/0x1f0
      [11009.908249]  [<ffffffff8195381f>] ? symbol_string+0xff/0x1a0
      [11009.908253]  [<ffffffff81953720>] ? uuid_string+0x6f0/0x6f0
      [11009.908257]  [<ffffffff811a775e>] ? __kernel_text_address+0x7e/0xa0
      [11009.908260]  [<ffffffff8104b02f>] ? print_context_stack+0x7f/0xf0
      [11009.908264]  [<ffffffff812e9846>] ? __module_address+0xb6/0x360
      [11009.908268]  [<ffffffff8137fdc9>] ? is_ftrace_trampoline+0x99/0xe0
      [11009.908271]  [<ffffffff811a775e>] ? __kernel_text_address+0x7e/0xa0
      [11009.908275]  [<ffffffff81240a70>] ? debug_check_no_locks_freed+0x290/0x290
      [11009.908278]  [<ffffffff8104a24b>] ? dump_trace+0x11b/0x300
      [11009.908282]  [<ffffffffa13e8143>] ? em28xx_register_extension+0x23/0x190 [em28xx]
      [11009.908285]  [<ffffffff81237d71>] ? trace_hardirqs_off_caller+0x21/0x290
      [11009.908289]  [<ffffffff8123ff56>] ? trace_hardirqs_on_caller+0x16/0x590
      [11009.908292]  [<ffffffff812404dd>] ? trace_hardirqs_on+0xd/0x10
      [11009.908296]  [<ffffffffa13e8143>] ? em28xx_register_extension+0x23/0x190 [em28xx]
      [11009.908299]  [<ffffffff822dcbb0>] ? mutex_trylock+0x400/0x400
      [11009.908302]  [<ffffffff810021a1>] ? do_one_initcall+0x131/0x300
      [11009.908306]  [<ffffffff81296dc7>] ? call_rcu_sched+0x17/0x20
      [11009.908309]  [<ffffffff8159e708>] ? put_object+0x48/0x70
      [11009.908314]  [<ffffffffa1579f11>] em28xx_dvb_init+0x81/0x8a [em28xx_dvb]
      [11009.908317]  [<ffffffffa13e81f9>] em28xx_register_extension+0xd9/0x190 [em28xx]
      [11009.908320]  [<ffffffffa0150000>] ? 0xffffffffa0150000
      [11009.908324]  [<ffffffffa0150010>] em28xx_dvb_register+0x10/0x1000 [em28xx_dvb]
      [11009.908327]  [<ffffffff810021b1>] do_one_initcall+0x141/0x300
      [11009.908330]  [<ffffffff81002070>] ? try_to_run_init_process+0x40/0x40
      [11009.908333]  [<ffffffff8123ff56>] ? trace_hardirqs_on_caller+0x16/0x590
      [11009.908337]  [<ffffffff8155e926>] ? kasan_unpoison_shadow+0x36/0x50
      [11009.908340]  [<ffffffff8155e926>] ? kasan_unpoison_shadow+0x36/0x50
      [11009.908343]  [<ffffffff8155e926>] ? kasan_unpoison_shadow+0x36/0x50
      [11009.908346]  [<ffffffff8155ea37>] ? __asan_register_globals+0x87/0xa0
      [11009.908350]  [<ffffffff8144da7b>] do_init_module+0x1d0/0x5ad
      [11009.908353]  [<ffffffff812f2626>] load_module+0x6666/0x9ba0
      [11009.908356]  [<ffffffff812e9c90>] ? symbol_put_addr+0x50/0x50
      [11009.908361]  [<ffffffffa1580037>] ? em28xx_dvb_init.part.3+0x5989/0x5cf4 [em28xx_dvb]
      [11009.908366]  [<ffffffff812ebfc0>] ? module_frob_arch_sections+0x20/0x20
      [11009.908369]  [<ffffffff815bc940>] ? open_exec+0x50/0x50
      [11009.908374]  [<ffffffff811671bb>] ? ns_capable+0x5b/0xd0
      [11009.908377]  [<ffffffff812f5e58>] SyS_finit_module+0x108/0x130
      [11009.908379]  [<ffffffff812f5d50>] ? SyS_init_module+0x1f0/0x1f0
      [11009.908383]  [<ffffffff81004044>] ? lockdep_sys_exit_thunk+0x12/0x14
      [11009.908394]  [<ffffffff822e6936>] entry_SYSCALL_64_fastpath+0x16/0x76
      [11009.908396] Memory state around the buggy address:
      [11009.908398]  ffff8803bd78aa00: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [11009.908401]  ffff8803bd78aa80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [11009.908403] >ffff8803bd78ab00: fc fc fc fc fc fc fc fc 00 00 fc fc fc fc fc fc
      [11009.908405]                                            ^
      [11009.908407]  ffff8803bd78ab80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [11009.908409]  ffff8803bd78ac00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [11009.908411] ==================================================================
      
      In order to avoid it, let's set the cached value of the firmware
      name to NULL after freeing it. While here, return an error if
      the memory allocation fails.
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab@osg.samsung.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      bb4884f2
    • Vitaly Kuznetsov's avatar
      Drivers: hv: avoid vfree() on crash · 9391c802
      Vitaly Kuznetsov authored
      commit a9f61ca7 upstream.
      
      When we crash from NMI context (e.g. after NMI injection from host when
      'sysctl -w kernel.unknown_nmi_panic=1' is set) we hit
      
          kernel BUG at mm/vmalloc.c:1530!
      
      as vfree() is denied. While the issue could be solved with in_nmi() check
      instead I opted for skipping vfree on all sorts of crashes to reduce the
      amount of work which can cause consequent crashes. We don't really need to
      free anything on crash.
      
      [js] no tsc and kexec in 3.12 yet
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarK. Y. Srinivasan <kys@microsoft.com>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      9391c802
    • Eric Dumazet's avatar
      can: Fix kernel panic at security_sock_rcv_skb · 801c8a02
      Eric Dumazet authored
      commit f1712c73 upstream.
      
      Zhang Yanmin reported crashes [1] and provided a patch adding a
      synchronize_rcu() call in can_rx_unregister()
      
      The main problem seems that the sockets themselves are not RCU
      protected.
      
      If CAN uses RCU for delivery, then sockets should be freed only after
      one RCU grace period.
      
      Recent kernels could use sock_set_flag(sk, SOCK_RCU_FREE), but let's
      ease stable backports with the following fix instead.
      
      [1]
      BUG: unable to handle kernel NULL pointer dereference at (null)
      IP: [<ffffffff81495e25>] selinux_socket_sock_rcv_skb+0x65/0x2a0
      
      Call Trace:
       <IRQ>
       [<ffffffff81485d8c>] security_sock_rcv_skb+0x4c/0x60
       [<ffffffff81d55771>] sk_filter+0x41/0x210
       [<ffffffff81d12913>] sock_queue_rcv_skb+0x53/0x3a0
       [<ffffffff81f0a2b3>] raw_rcv+0x2a3/0x3c0
       [<ffffffff81f06eab>] can_rcv_filter+0x12b/0x370
       [<ffffffff81f07af9>] can_receive+0xd9/0x120
       [<ffffffff81f07beb>] can_rcv+0xab/0x100
       [<ffffffff81d362ac>] __netif_receive_skb_core+0xd8c/0x11f0
       [<ffffffff81d36734>] __netif_receive_skb+0x24/0xb0
       [<ffffffff81d37f67>] process_backlog+0x127/0x280
       [<ffffffff81d36f7b>] net_rx_action+0x33b/0x4f0
       [<ffffffff810c88d4>] __do_softirq+0x184/0x440
       [<ffffffff81f9e86c>] do_softirq_own_stack+0x1c/0x30
       <EOI>
       [<ffffffff810c76fb>] do_softirq.part.18+0x3b/0x40
       [<ffffffff810c8bed>] do_softirq+0x1d/0x20
       [<ffffffff81d30085>] netif_rx_ni+0xe5/0x110
       [<ffffffff8199cc87>] slcan_receive_buf+0x507/0x520
       [<ffffffff8167ef7c>] flush_to_ldisc+0x21c/0x230
       [<ffffffff810e3baf>] process_one_work+0x24f/0x670
       [<ffffffff810e44ed>] worker_thread+0x9d/0x6f0
       [<ffffffff810e4450>] ? rescuer_thread+0x480/0x480
       [<ffffffff810ebafc>] kthread+0x12c/0x150
       [<ffffffff81f9ccef>] ret_from_fork+0x3f/0x70
      Reported-by: default avatarZhang Yanmin <yanmin.zhang@intel.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarOliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      801c8a02
    • Oliver O'Halloran's avatar
      mm/init: fix zone boundary creation · 414989d2
      Oliver O'Halloran authored
      commit 90cae1fe upstream.
      
      As a part of memory initialisation the architecture passes an array to
      free_area_init_nodes() which specifies the max PFN of each memory zone.
      This array is not necessarily monotonic (due to unused zones) so this
      array is parsed to build monotonic lists of the min and max PFN for each
      zone.  ZONE_MOVABLE is special cased here as its limits are managed by
      the mm subsystem rather than the architecture.  Unfortunately, this
      special casing is broken when ZONE_MOVABLE is the not the last zone in
      the zone list.  The core of the issue is:
      
      	if (i == ZONE_MOVABLE)
      		continue;
      	arch_zone_lowest_possible_pfn[i] =
      		arch_zone_highest_possible_pfn[i-1];
      
      As ZONE_MOVABLE is skipped the lowest_possible_pfn of the next zone will
      be set to zero.  This patch fixes this bug by adding explicitly tracking
      where the next zone should start rather than relying on the contents
      arch_zone_highest_possible_pfn[].
      
      Thie is low priority.  To get bitten by this you need to enable a zone
      that appears after ZONE_MOVABLE in the zone_type enum.  As far as I can
      tell this means running a kernel with ZONE_DEVICE or ZONE_CMA enabled,
      so I can't see this affecting too many people.
      
      I only noticed this because I've been fiddling with ZONE_DEVICE on
      powerpc and 4.6 broke my test kernel.  This bug, in conjunction with the
      changes in Taku Izumi's kernelcore=mirror patch (d91749c1) and
      powerpc being the odd architecture which initialises max_zone_pfn[] to
      ~0ul instead of 0 caused all of system memory to be placed into
      ZONE_DEVICE at boot, followed a panic since device memory cannot be used
      for kernel allocations.  I've already submitted a patch to fix the
      powerpc specific bits, but I figured this should be fixed too.
      
      Link: http://lkml.kernel.org/r/1462435033-15601-1-git-send-email-oohall@gmail.comSigned-off-by: default avatarOliver O'Halloran <oohall@gmail.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      414989d2
    • Alan Stern's avatar
      USB: dummy-hcd: fix bug in stop_activity (handle ep0) · 5765ee44
      Alan Stern authored
      commit bcdbeb84 upstream.
      
      The stop_activity() routine in dummy-hcd is supposed to unlink all
      active requests for every endpoint, among other things.  But it
      doesn't handle ep0.  As a result, fuzz testing can generate a WARNING
      like the following:
      
      WARNING: CPU: 0 PID: 4410 at drivers/usb/gadget/udc/dummy_hcd.c:672 dummy_free_request+0x153/0x170
      Modules linked in:
      CPU: 0 PID: 4410 Comm: syz-executor Not tainted 4.9.0-rc7+ #32
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
       ffff88006a64ed10 ffffffff81f96b8a ffffffff41b58ab3 1ffff1000d4c9d35
       ffffed000d4c9d2d ffff880065f8ac00 0000000041b58ab3 ffffffff8598b510
       ffffffff81f968f8 0000000041b58ab3 ffffffff859410e0 ffffffff813f0590
      Call Trace:
       [<     inline     >] __dump_stack lib/dump_stack.c:15
       [<ffffffff81f96b8a>] dump_stack+0x292/0x398 lib/dump_stack.c:51
       [<ffffffff812b808f>] __warn+0x19f/0x1e0 kernel/panic.c:550
       [<ffffffff812b831c>] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585
       [<ffffffff830fcb13>] dummy_free_request+0x153/0x170 drivers/usb/gadget/udc/dummy_hcd.c:672
       [<ffffffff830ed1b0>] usb_ep_free_request+0xc0/0x420 drivers/usb/gadget/udc/core.c:195
       [<ffffffff83225031>] gadgetfs_unbind+0x131/0x190 drivers/usb/gadget/legacy/inode.c:1612
       [<ffffffff830ebd8f>] usb_gadget_remove_driver+0x10f/0x2b0 drivers/usb/gadget/udc/core.c:1228
       [<ffffffff830ec084>] usb_gadget_unregister_driver+0x154/0x240 drivers/usb/gadget/udc/core.c:1357
      
      This patch fixes the problem by iterating over all the endpoints in
      the driver's ep array instead of iterating over the gadget's ep_list,
      which explicitly leaves out ep0.
      Signed-off-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      5765ee44
    • Alan Stern's avatar
      USB: fix problems with duplicate endpoint addresses · 15668b43
      Alan Stern authored
      commit 0a8fd134 upstream.
      
      When checking a new device's descriptors, the USB core does not check
      for duplicate endpoint addresses.  This can cause a problem when the
      sysfs files for those endpoints are created; trying to create multiple
      files with the same name will provoke a WARNING:
      
      WARNING: CPU: 2 PID: 865 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x8a/0xa0
      sysfs: cannot create duplicate filename
      '/devices/platform/dummy_hcd.0/usb2/2-1/2-1:64.0/ep_05'
      Kernel panic - not syncing: panic_on_warn set ...
      
      CPU: 2 PID: 865 Comm: kworker/2:1 Not tainted 4.9.0-rc7+ #34
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Workqueue: usb_hub_wq hub_event
       ffff88006bee64c8 ffffffff81f96b8a ffffffff00000001 1ffff1000d7dcc2c
       ffffed000d7dcc24 0000000000000001 0000000041b58ab3 ffffffff8598b510
       ffffffff81f968f8 ffffffff850fee20 ffffffff85cff020 dffffc0000000000
      Call Trace:
       [<     inline     >] __dump_stack lib/dump_stack.c:15
       [<ffffffff81f96b8a>] dump_stack+0x292/0x398 lib/dump_stack.c:51
       [<ffffffff8168c88e>] panic+0x1cb/0x3a9 kernel/panic.c:179
       [<ffffffff812b80b4>] __warn+0x1c4/0x1e0 kernel/panic.c:542
       [<ffffffff812b8195>] warn_slowpath_fmt+0xc5/0x110 kernel/panic.c:565
       [<ffffffff819e70ca>] sysfs_warn_dup+0x8a/0xa0 fs/sysfs/dir.c:30
       [<ffffffff819e7308>] sysfs_create_dir_ns+0x178/0x1d0 fs/sysfs/dir.c:59
       [<     inline     >] create_dir lib/kobject.c:71
       [<ffffffff81fa1b07>] kobject_add_internal+0x227/0xa60 lib/kobject.c:229
       [<     inline     >] kobject_add_varg lib/kobject.c:366
       [<ffffffff81fa2479>] kobject_add+0x139/0x220 lib/kobject.c:411
       [<ffffffff82737a63>] device_add+0x353/0x1660 drivers/base/core.c:1088
       [<ffffffff82738d8d>] device_register+0x1d/0x20 drivers/base/core.c:1206
       [<ffffffff82cb77d3>] usb_create_ep_devs+0x163/0x260 drivers/usb/core/endpoint.c:195
       [<ffffffff82c9f27b>] create_intf_ep_devs+0x13b/0x200 drivers/usb/core/message.c:1030
       [<ffffffff82ca39d3>] usb_set_configuration+0x1083/0x18d0 drivers/usb/core/message.c:1937
       [<ffffffff82cc9e2e>] generic_probe+0x6e/0xe0 drivers/usb/core/generic.c:172
       [<ffffffff82caa7fa>] usb_probe_device+0xaa/0xe0 drivers/usb/core/driver.c:263
      
      This patch prevents the problem by checking for duplicate endpoint
      addresses during enumeration and skipping any duplicates.
      Signed-off-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Tested-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      15668b43
    • Eric Dumazet's avatar
      ping: implement proper locking · 60f704bd
      Eric Dumazet authored
      commit 43a66845 upstream.
      
      We got a report of yet another bug in ping
      
      http://www.openwall.com/lists/oss-security/2017/03/24/6
      
      ->disconnect() is not called with socket lock held.
      
      Fix this by acquiring ping rwlock earlier.
      
      Thanks to Daniel, Alexander and Andrey for letting us know this problem.
      
      Fixes: c319b4d7 ("net: ipv4: add IPPROTO_ICMP socket kind")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDaniel Jiang <danieljiang0415@gmail.com>
      Reported-by: default avatarSolar Designer <solar@openwall.com>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [wt: the function is ping_v4_unhash() in 3.10]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      60f704bd
    • Johan Hovold's avatar
      USB: usbtmc: add missing endpoint sanity check · 3d46433d
      Johan Hovold authored
      commit 687e0687 upstream.
      
      USBTMC devices are required to have a bulk-in and a bulk-out endpoint,
      but the driver failed to verify this, something which could lead to the
      endpoint addresses being taken from uninitialised memory.
      
      Make sure to zero all private data as part of allocation, and add the
      missing endpoint sanity check.
      
      Note that this also addresses a more recently introduced issue, where
      the interrupt-in-presence flag would also be uninitialised whenever the
      optional interrupt-in endpoint is not present. This in turn could lead
      to an interrupt urb being allocated, initialised and submitted based on
      uninitialised values.
      
      Fixes: dbf3e7f6 ("Implement an ioctl to support the USMTMC-USB488 READ_STATUS_BYTE operation.")
      Fixes: 5b775f67 ("USB: add USB test and measurement class driver")
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      [ johan: backport to v4.4 ]
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      3d46433d
    • Arnaldo Carvalho de Melo's avatar
      perf trace: Use the syscall raw_syscalls:sys_enter timestamp · 8c11dcea
      Arnaldo Carvalho de Melo authored
      commit ecf1e225 upstream.
      
      Instead of the one when another syscall takes place while another is being
      processed (in another CPU, but we show it serialized, so need to "interrupt"
      the other), and also when finally showing the sys_enter + sys_exit + duration,
      where we were showing the sample->time for the sys_exit, duh.
      
      Before:
      
        # perf trace sleep 1
        <SNIP>
           0.373 (   0.001 ms): close(fd: 3                   ) = 0
        1000.626 (1000.211 ms): nanosleep(rqtp: 0x7ffd6ddddfb0) = 0
        1000.653 (   0.003 ms): close(fd: 1                   ) = 0
        1000.657 (   0.002 ms): close(fd: 2                   ) = 0
        1000.667 (   0.000 ms): exit_group(                   )
        #
      
      After:
      
        # perf trace sleep 1
        <SNIP>
           0.336 (   0.001 ms): close(fd: 3                   ) = 0
           0.373 (1000.086 ms): nanosleep(rqtp: 0x7ffe303e9550) = 0
        1000.481 (   0.002 ms): close(fd: 1                   ) = 0
        1000.485 (   0.001 ms): close(fd: 2                   ) = 0
        1000.494 (   0.000 ms): exit_group(                   )
      [root@jouet linux]#
      
      [js] no trace__printf_interrupted_entry in 3.12 yet
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-ecbzgmu2ni6glc6zkw8p1zmx@git.kernel.org
      Fixes: 752fde44 ("perf trace: Support interrupted syscalls")
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      [wt: 3.10 uses stdout instead of trace->output ;
           no trace__printf_interrupted_entry() function ]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      8c11dcea
    • Daniel Borkmann's avatar
      net: sctp: rework multihoming retransmission path selection to rfc4960 · 8ceaec02
      Daniel Borkmann authored
      commit 4c47af4d upstream.
      
      Problem statement: 1) both paths (primary path1 and alternate
      path2) are up after the association has been established i.e.,
      HB packets are normally exchanged, 2) path2 gets inactive after
      path_max_retrans * max_rto timed out (i.e. path2 is down completely),
      3) now, if a transmission times out on the only surviving/active
      path1 (any ~1sec network service impact could cause this like
      a channel bonding failover), then the retransmitted packets are
      sent over the inactive path2; this happens with partial failover
      and without it.
      
      Besides not being optimal in the above scenario, a small failure
      or timeout in the only existing path has the potential to cause
      long delays in the retransmission (depending on RTO_MAX) until
      the still active path is reselected. Further, when the T3-timeout
      occurs, we have active_patch == retrans_path, and even though the
      timeout occurred on the initial transmission of data, not a
      retransmit, we end up updating retransmit path.
      
      RFC4960, section 6.4. "Multi-Homed SCTP Endpoints" states under
      6.4.1. "Failover from an Inactive Destination Address" the
      following:
      
        Some of the transport addresses of a multi-homed SCTP endpoint
        may become inactive due to either the occurrence of certain
        error conditions (see Section 8.2) or adjustments from the
        SCTP user.
      
        When there is outbound data to send and the primary path
        becomes inactive (e.g., due to failures), or where the SCTP
        user explicitly requests to send data to an inactive
        destination transport address, before reporting an error to
        its ULP, the SCTP endpoint should try to send the data to an
        alternate __active__ destination transport address if one
        exists.
      
        When retransmitting data that timed out, if the endpoint is
        multihomed, it should consider each source-destination address
        pair in its retransmission selection policy. When retransmitting
        timed-out data, the endpoint should attempt to pick the most
        divergent source-destination pair from the original
        source-destination pair to which the packet was transmitted.
      
        Note: Rules for picking the most divergent source-destination
        pair are an implementation decision and are not specified
        within this document.
      
      So, we should first reconsider to take the current active
      retransmission transport if we cannot find an alternative
      active one. If all of that fails, we can still round robin
      through unkown, partial failover, and inactive ones in the
      hope to find something still suitable.
      
      Commit 4141ddc0 ("sctp: retran_path update bug fix") broke
      that behaviour by selecting the next inactive transport when
      no other active transport was found besides the current assoc's
      peer.retran_path. Before commit 4141ddc0, we would have
      traversed through the list until we reach our peer.retran_path
      again, and in case that is still in state SCTP_ACTIVE, we would
      take it and return. Only if that is not the case either, we
      take the next inactive transport.
      
      Besides all that, another issue is that transports in state
      SCTP_UNKNOWN could be preferred over transports in state
      SCTP_ACTIVE in case a SCTP_ACTIVE transport appears after
      SCTP_UNKNOWN in the transport list yielding a weaker transport
      state to be used in retransmission.
      
      This patch mostly reverts 4141ddc0, but also rewrites
      this function to introduce more clarity and strictness into
      the code. A strict priority of transport states is enforced
      in this patch, hence selection is active > unkown > partial
      failover > inactive.
      
      Fixes: 4141ddc0 ("sctp: retran_path update bug fix")
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Cc: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
      Acked-by: default avatarVlad Yasevich <yasevich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [wt: picked updated function from 3.12 except the debug statement]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      8ceaec02
    • Dan Carpenter's avatar
      Staging: vt6655-6: potential NULL dereference in hostap_disable_hostapd() · ad1336d2
      Dan Carpenter authored
      commit cb4855b4 upstream.
      
      We fixed this to use free_netdev() instead of kfree() but unfortunately
      free_netdev() doesn't accept NULL pointers.  Smatch complains about
      this, it's not something I discovered through testing.
      
      Fixes: 3030d40b ('staging: vt6655: use free_netdev instead of kfree')
      Fixes: 0a438d5b ('staging: vt6656: use free_netdev instead of kfree')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [wt: only vt6656 was converted to free_netdev in 3.10]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      ad1336d2
    • Herbert Xu's avatar
      tun: Fix TUN_PKT_STRIP setting · bda08328
      Herbert Xu authored
      commit 2eb783c4 upstream.
      
      We set the flag TUN_PKT_STRIP if the user buffer provided is too
      small to contain the entire packet plus meta-data.  However, this
      has been broken ever since we added GSO meta-data.  VLAN acceleration
      also has the same problem.
      
      This patch fixes this by taking both into account when setting the
      TUN_PKT_STRIP flag.
      
      The fact that this has been broken for six years without anyone
      realising means that nobody actually uses this flag.
      
      Fixes: f43798c2 ("tun: Allow GSO using virtio_net_hdr")
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [wt: no tuntap VLAN offloading in 3.10]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      bda08328
    • Vladimir Zapolskiy's avatar
      ARM: dts: imx31: fix AVIC base address · bfac3620
      Vladimir Zapolskiy authored
      commit af92305e upstream.
      
      On i.MX31 AVIC interrupt controller base address is at 0x68000000.
      
      The problem was shadowed by the AVIC driver, which takes the correct
      base address from a SoC specific header file.
      
      Fixes: d2a37b3d ("ARM i.MX31: Add devicetree support")
      Signed-off-by: default avatarVladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
      Reviewed-by: default avatarFabio Estevam <fabio.estevam@nxp.com>
      Signed-off-by: default avatarShawn Guo <shawnguo@kernel.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      bfac3620
    • Vladimir Zapolskiy's avatar
      ARM: dts: imx31: move CCM device node to AIPS2 bus devices · 3e721e29
      Vladimir Zapolskiy authored
      commit 1f87aee6 upstream.
      
      i.MX31 Clock Control Module controller is found on AIPS2 bus, move it
      there from SPBA bus to avoid a conflict of device IO space mismatch.
      
      Fixes: ef0e4a60 ("ARM: mx31: Replace clk_register_clkdev with clock DT lookup")
      Signed-off-by: default avatarVladimir Zapolskiy <vz@mleia.com>
      Acked-by: default avatarUwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Signed-off-by: default avatarShawn Guo <shawnguo@kernel.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      3e721e29
    • James Hogan's avatar
      MIPS: KGDB: Use kernel context for sleeping threads · d1f922a5
      James Hogan authored
      commit 162b270c upstream.
      
      KGDB is a kernel debug stub and it can't be used to debug userland as it
      can only safely access kernel memory.
      
      On MIPS however KGDB has always got the register state of sleeping
      processes from the userland register context at the beginning of the
      kernel stack. This is meaningless for kernel threads (which never enter
      userland), and for user threads it prevents the user seeing what it is
      doing while in the kernel:
      
      (gdb) info threads
        Id   Target Id         Frame
        ...
        3    Thread 2 (kthreadd) 0x0000000000000000 in ?? ()
        2    Thread 1 (init)   0x000000007705c4b4 in ?? ()
        1    Thread -2 (shadowCPU0) 0xffffffff8012524c in arch_kgdb_breakpoint () at arch/mips/kernel/kgdb.c:201
      
      Get the register state instead from the (partial) kernel register
      context stored in the task's thread_struct for resume() to restore. All
      threads now correctly appear to be in context_switch():
      
      (gdb) info threads
        Id   Target Id         Frame
        ...
        3    Thread 2 (kthreadd) context_switch (rq=<optimized out>, cookie=..., next=<optimized out>, prev=0x0) at kernel/sched/core.c:2903
        2    Thread 1 (init)   context_switch (rq=<optimized out>, cookie=..., next=<optimized out>, prev=0x0) at kernel/sched/core.c:2903
        1    Thread -2 (shadowCPU0) 0xffffffff8012524c in arch_kgdb_breakpoint () at arch/mips/kernel/kgdb.c:201
      
      Call clobbered registers which aren't saved and exception registers
      (BadVAddr & Cause) which can't be easily determined without stack
      unwinding are reported as 0. The PC is taken from the return address,
      such that the state presented matches that found immediately after
      returning from resume().
      
      Fixes: 88547001 ("[MIPS] kgdb: add arch support for the kernel's kgdb core")
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/15829/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      d1f922a5
    • Guillaume Nault's avatar
      l2tp: take reference on sessions being dumped · 43f8c8b0
      Guillaume Nault authored
      commit e08293a4 upstream.
      
      Take a reference on the sessions returned by l2tp_session_find_nth()
      (and rename it l2tp_session_get_nth() to reflect this change), so that
      caller is assured that the session isn't going to disappear while
      processing it.
      
      For procfs and debugfs handlers, the session is held in the .start()
      callback and dropped in .show(). Given that pppol2tp_seq_session_show()
      dereferences the associated PPPoL2TP socket and that
      l2tp_dfs_seq_session_show() might call pppol2tp_show(), we also need to
      call the session's .ref() callback to prevent the socket from going
      away from under us.
      
      Fixes: fd558d18 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
      Fixes: 0ad66140 ("l2tp: Add debugfs files for dumping l2tp debug info")
      Fixes: 309795f4 ("l2tp: Add netlink control API for L2TP")
      Signed-off-by: default avatarGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      43f8c8b0
    • Nathan Sullivan's avatar
      net: phy: handle state correctly in phy_stop_machine · 509fda47
      Nathan Sullivan authored
      commit 49d52e81 upstream.
      
      If the PHY is halted on stop, then do not set the state to PHY_UP.  This
      ensures the phy will be restarted later in phy_start when the machine is
      started again.
      
      Fixes: 00db8189 ("This patch adds a PHY Abstraction Layer to the Linux Kernel, enabling ethernet drivers to remain as ignorant as is reasonable of the connected PHY's design and operation details.")
      Signed-off-by: default avatarNathan Sullivan <nathan.sullivan@ni.com>
      Signed-off-by: default avatarBrad Mouring <brad.mouring@ni.com>
      Acked-by: default avatarXander Huff <xander.huff@ni.com>
      Acked-by: default avatarKyle Roeschley <kyle.roeschley@ni.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      509fda47
    • Hongxu Jia's avatar
      netfilter: arp_tables: fix invoking 32bit "iptable -P INPUT ACCEPT" failed in 64bit kernel · 9bc89352
      Hongxu Jia authored
      commit 17a49cd5 upstream.
      
      Since 09d96860 ("netfilter: x_tables: do compat validation via
      translate_table"), it used compatr structure to assign newinfo
      structure.  In translate_compat_table of ip_tables.c and ip6_tables.c,
      it used compatr->hook_entry to replace info->hook_entry and
      compatr->underflow to replace info->underflow, but not do the same
      replacement in arp_tables.c.
      
      It caused invoking 32-bit "arptbale -P INPUT ACCEPT" failed in 64bit
      kernel.
      --------------------------------------
      root@qemux86-64:~# arptables -P INPUT ACCEPT
      root@qemux86-64:~# arptables -P INPUT ACCEPT
      ERROR: Policy for `INPUT' offset 448 != underflow 0
      arptables: Incompatible with this kernel
      --------------------------------------
      
      Fixes: 09d96860 ("netfilter: x_tables: do compat validation via translate_table")
      Signed-off-by: default avatarHongxu Jia <hongxu.jia@windriver.com>
      Acked-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      9bc89352
    • Steven Rostedt (VMware)'s avatar
      ring-buffer: Have ring_buffer_iter_empty() return true when empty · 6d5f7804
      Steven Rostedt (VMware) authored
      commit 78f7a45d upstream.
      
      I noticed that reading the snapshot file when it is empty no longer gives a
      status. It suppose to show the status of the snapshot buffer as well as how
      to allocate and use it. For example:
      
       ># cat snapshot
       # tracer: nop
       #
       #
       # * Snapshot is allocated *
       #
       # Snapshot commands:
       # echo 0 > snapshot : Clears and frees snapshot buffer
       # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated.
       #                      Takes a snapshot of the main buffer.
       # echo 2 > snapshot : Clears snapshot buffer (but does not allocate or free)
       #                      (Doesn't have to be '2' works with any number that
       #                       is not a '0' or '1')
      
      But instead it just showed an empty buffer:
      
       ># cat snapshot
       # tracer: nop
       #
       # entries-in-buffer/entries-written: 0/0   #P:4
       #
       #                              _-----=> irqs-off
       #                             / _----=> need-resched
       #                            | / _---=> hardirq/softirq
       #                            || / _--=> preempt-depth
       #                            ||| /     delay
       #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
       #              | |       |   ||||       |         |
      
      What happened was that it was using the ring_buffer_iter_empty() function to
      see if it was empty, and if it was, it showed the status. But that function
      was returning false when it was empty. The reason was that the iter header
      page was on the reader page, and the reader page was empty, but so was the
      buffer itself. The check only tested to see if the iter was on the commit
      page, but the commit page was no longer pointing to the reader page, but as
      all pages were empty, the buffer is also.
      
      Fixes: 651e22f2 ("ring-buffer: Always reset iterator to reader page")
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      6d5f7804
    • Steven Rostedt (VMware)'s avatar
      tracing: Allocate the snapshot buffer before enabling probe · 95948b1e
      Steven Rostedt (VMware) authored
      commit df62db5b upstream.
      
      Currently the snapshot trigger enables the probe and then allocates the
      snapshot. If the probe triggers before the allocation, it could cause the
      snapshot to fail and turn tracing off. It's best to allocate the snapshot
      buffer first, and then enable the trigger. If something goes wrong in the
      enabling of the trigger, the snapshot buffer is still allocated, but it can
      also be freed by the user by writting zero into the snapshot buffer file.
      
      Also add a check of the return status of alloc_snapshot().
      
      Fixes: 77fd5c15 ("tracing: Add snapshot trigger to function probes")
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      95948b1e
    • Ben Hutchings's avatar
      rtl8150: Use heap buffers for all register access · 57420afa
      Ben Hutchings authored
      commit 7926aff5 upstream.
      
      Allocating USB buffers on the stack is not portable, and no longer
      works on x86_64 (with VMAP_STACK enabled as per default).
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Brad Spengler <spender@grsecurity.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      57420afa
    • Ben Hutchings's avatar
      pegasus: Use heap buffers for all register access · 6a1d611e
      Ben Hutchings authored
      commit 5593523f upstream.
      
      Allocating USB buffers on the stack is not portable, and no longer
      works on x86_64 (with VMAP_STACK enabled as per default).
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      References: https://bugs.debian.org/852556Reported-by: default avatarLisandro Damián Nicanor Pérez Meyer <lisandro@debian.org>
      Tested-by: default avatarLisandro Damián Nicanor Pérez Meyer <lisandro@debian.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Brad Spengler <spender@grsecurity.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      6a1d611e
    • Benjamin Herrenschmidt's avatar
      powerpc: Disable HFSCR[TM] if TM is not supported · 797971a0
      Benjamin Herrenschmidt authored
      commit 7ed23e1b upstream.
      
      On Power8 & Power9 the early CPU inititialisation in __init_HFSCR()
      turns on HFSCR[TM] (Hypervisor Facility Status and Control Register
      [Transactional Memory]), but that doesn't take into account that TM
      might be disabled by CPU features, or disabled by the kernel being built
      with CONFIG_PPC_TRANSACTIONAL_MEM=n.
      
      So later in boot, when we have setup the CPU features, clear HSCR[TM] if
      the TM CPU feature has been disabled. We use CPU_FTR_TM_COMP to account
      for the CONFIG_PPC_TRANSACTIONAL_MEM=n case.
      
      Without this a KVM guest might try use TM, even if told not to, and
      cause an oops in the host kernel. Typically the oops is seen in
      __kvmppc_vcore_entry() and may or may not be fatal to the host, but is
      always bad news.
      
      In practice all shipping CPU revisions do support TM, and all host
      kernels we are aware of build with TM support enabled, so no one should
      actually be able to hit this in the wild.
      
      Fixes: 2a3563b0 ("powerpc: Setup in HFSCR for POWER8")
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Tested-by: default avatarSam Bobroff <sam.bobroff@au1.ibm.com>
      [mpe: Rewrite change log with input from Sam, add Fixes/stable]
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      [sb: Backported to linux-4.4.y: adjusted context]
      Signed-off-by: default avatarSam Bobroff <sam.bobroff@au1.ibm.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      797971a0
    • Geert Uytterhoeven's avatar
      char: Drop bogus dependency of DEVPORT on !M68K · 6800d881
      Geert Uytterhoeven authored
      commit 309124e2 upstream.
      
      According to full-history-linux commit d3794f4fa7c3edc3 ("[PATCH] M68k
      update (part 25)"), port operations are allowed on m68k if CONFIG_ISA is
      defined.
      
      However, commit 153dcc54 ("[PATCH] mem driver: fix conditional
      on isa i/o support") accidentally changed an "||" into an "&&",
      disabling it completely on m68k. This logic was retained when
      introducing the DEVPORT symbol in commit 4f911d64 ("Make
      /dev/port conditional on config symbol").
      
      Drop the bogus dependency on !M68K to fix this.
      
      Fixes: 153dcc54 ("[PATCH] mem driver: fix conditional on isa i/o support")
      Signed-off-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Tested-by: default avatarAl Stone <ahs3@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      6800d881
    • Jack Morgenstein's avatar
      net/mlx4_core: Fix racy CQ (Completion Queue) free · d5efe5c2
      Jack Morgenstein authored
      commit 291c566a upstream.
      
      In function mlx4_cq_completion() and mlx4_cq_event(), the
      radix_tree_lookup requires a rcu_read_lock.
      This is mandatory: if another core frees the CQ, it could
      run the radix_tree_node_rcu_free() call_rcu() callback while
      its being used by the radix tree lookup function.
      
      Additionally, in function mlx4_cq_event(), since we are adding
      the rcu lock around the radix-tree lookup, we no longer need to take
      the spinlock. Also, the synchronize_irq() call for the async event
      eliminates the need for incrementing the cq reference count in
      mlx4_cq_event().
      
      Other changes:
      1. In function mlx4_cq_free(), replace spin_lock_irq with spin_lock:
         we no longer take this spinlock in the interrupt context.
         The spinlock here, therefore, simply protects against different
         threads simultaneously invoking mlx4_cq_free() for different cq's.
      
      2. In function mlx4_cq_free(), we move the radix tree delete to before
         the synchronize_irq() calls. This guarantees that we will not
         access this cq during any subsequent interrupts, and therefore can
         safely free the CQ after the synchronize_irq calls. The rcu_read_lock
         in the interrupt handlers only needs to protect against corrupting the
         radix tree; the interrupt handlers may access the cq outside the
         rcu_read_lock due to the synchronize_irq calls which protect against
         premature freeing of the cq.
      
      3. In function mlx4_cq_event(), we change the mlx_warn message to mlx4_dbg.
      
      4. We leave the cq reference count mechanism in place, because it is
         still needed for the cq completion tasklet mechanism.
      
      Fixes: 6d90aa5c ("net/mlx4_core: Make sure there are no pending async events when freeing CQ")
      Fixes: 225c7b1f ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
      Signed-off-by: default avatarJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: default avatarMatan Barak <matanb@mellanox.com>
      Signed-off-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSumit Semwal <sumit.semwal@linaro.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      d5efe5c2
    • Eugenia Emantayev's avatar
      net/mlx4_en: Fix bad WQE issue · 2b3d0c06
      Eugenia Emantayev authored
      commit 6496bbf0 upstream.
      
      Single send WQE in RX buffer should be stamped with software
      ownership in order to prevent the flow of QP in error in FW
      once UPDATE_QP is called.
      
      Fixes: 9f519f68 ('mlx4_en: Not using Shared Receive Queues')
      Signed-off-by: default avatarEugenia Emantayev <eugenia@mellanox.com>
      Signed-off-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSumit Semwal <sumit.semwal@linaro.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      2b3d0c06