1. 13 Nov, 2015 10 commits
    • Joe Perches's avatar
      ethtool: Use kcalloc instead of kmalloc for ethtool_get_strings · e2fd4b16
      Joe Perches authored
      [ Upstream commit 077cb37f ]
      
      It seems that kernel memory can leak into userspace by a
      kmalloc, ethtool_get_strings, then copy_to_user sequence.
      
      Avoid this by using kcalloc to zero fill the copied buffer.
      Signed-off-by: default avatarJoe Perches <joe@perches.com>
      Acked-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      e2fd4b16
    • Guillaume Nault's avatar
      ppp: don't override sk->sk_state in pppoe_flush_dev() · d2c7ba29
      Guillaume Nault authored
      [ Upstream commit e6740165 ]
      
      Since commit 2b018d57 ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release"),
      pppoe_release() calls dev_put(po->pppoe_dev) if sk is in the
      PPPOX_ZOMBIE state. But pppoe_flush_dev() can set sk->sk_state to
      PPPOX_ZOMBIE _and_ reset po->pppoe_dev to NULL. This leads to the
      following oops:
      
      [  570.140800] BUG: unable to handle kernel NULL pointer dereference at 00000000000004e0
      [  570.142931] IP: [<ffffffffa018c701>] pppoe_release+0x50/0x101 [pppoe]
      [  570.144601] PGD 3d119067 PUD 3dbc1067 PMD 0
      [  570.144601] Oops: 0000 [#1] SMP
      [  570.144601] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel pppoe pppox ppp_generic slhc loop crc32c_intel ghash_clmulni_intel jitterentropy_rng sha256_generic hmac drbg ansi_cprng aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper acpi_cpufreq evdev serio_raw processor button ext4 crc16 mbcache jbd2 virtio_net virtio_blk virtio_pci virtio_ring virtio
      [  570.144601] CPU: 1 PID: 15738 Comm: ppp-apitest Not tainted 4.2.0 #1
      [  570.144601] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
      [  570.144601] task: ffff88003d30d600 ti: ffff880036b60000 task.ti: ffff880036b60000
      [  570.144601] RIP: 0010:[<ffffffffa018c701>]  [<ffffffffa018c701>] pppoe_release+0x50/0x101 [pppoe]
      [  570.144601] RSP: 0018:ffff880036b63e08  EFLAGS: 00010202
      [  570.144601] RAX: 0000000000000000 RBX: ffff880034340000 RCX: 0000000000000206
      [  570.144601] RDX: 0000000000000006 RSI: ffff88003d30dd20 RDI: ffff88003d30dd20
      [  570.144601] RBP: ffff880036b63e28 R08: 0000000000000001 R09: 0000000000000000
      [  570.144601] R10: 00007ffee9b50420 R11: ffff880034340078 R12: ffff8800387ec780
      [  570.144601] R13: ffff8800387ec7b0 R14: ffff88003e222aa0 R15: ffff8800387ec7b0
      [  570.144601] FS:  00007f5672f48700(0000) GS:ffff88003fc80000(0000) knlGS:0000000000000000
      [  570.144601] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  570.144601] CR2: 00000000000004e0 CR3: 0000000037f7e000 CR4: 00000000000406a0
      [  570.144601] Stack:
      [  570.144601]  ffffffffa018f240 ffff8800387ec780 ffffffffa018f240 ffff8800387ec7b0
      [  570.144601]  ffff880036b63e48 ffffffff812caabe ffff880039e4e000 0000000000000008
      [  570.144601]  ffff880036b63e58 ffffffff812cabad ffff880036b63ea8 ffffffff811347f5
      [  570.144601] Call Trace:
      [  570.144601]  [<ffffffff812caabe>] sock_release+0x1a/0x75
      [  570.144601]  [<ffffffff812cabad>] sock_close+0xd/0x11
      [  570.144601]  [<ffffffff811347f5>] __fput+0xff/0x1a5
      [  570.144601]  [<ffffffff811348cb>] ____fput+0x9/0xb
      [  570.144601]  [<ffffffff81056682>] task_work_run+0x66/0x90
      [  570.144601]  [<ffffffff8100189e>] prepare_exit_to_usermode+0x8c/0xa7
      [  570.144601]  [<ffffffff81001a26>] syscall_return_slowpath+0x16d/0x19b
      [  570.144601]  [<ffffffff813babb1>] int_ret_from_sys_call+0x25/0x9f
      [  570.144601] Code: 48 8b 83 c8 01 00 00 a8 01 74 12 48 89 df e8 8b 27 14 e1 b8 f7 ff ff ff e9 b7 00 00 00 8a 43 12 a8 0b 74 1c 48 8b 83 a8 04 00 00 <48> 8b 80 e0 04 00 00 65 ff 08 48 c7 83 a8 04 00 00 00 00 00 00
      [  570.144601] RIP  [<ffffffffa018c701>] pppoe_release+0x50/0x101 [pppoe]
      [  570.144601]  RSP <ffff880036b63e08>
      [  570.144601] CR2: 00000000000004e0
      [  570.200518] ---[ end trace 46956baf17349563 ]---
      
      pppoe_flush_dev() has no reason to override sk->sk_state with
      PPPOX_ZOMBIE. pppox_unbind_sock() already sets sk->sk_state to
      PPPOX_DEAD, which is the correct state given that sk is unbound and
      po->pppoe_dev is NULL.
      
      Fixes: 2b018d57 ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release")
      Tested-by: default avatarOleksii Berezhniak <core@irc.lg.ua>
      Signed-off-by: default avatarGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      d2c7ba29
    • Eric Dumazet's avatar
      net: add pfmemalloc check in sk_add_backlog() · d48fa362
      Eric Dumazet authored
      [ Upstream commit c7c49b8f ]
      
      Greg reported crashes hitting the following check in __sk_backlog_rcv()
      
      	BUG_ON(!sock_flag(sk, SOCK_MEMALLOC));
      
      The pfmemalloc bit is currently checked in sk_filter().
      
      This works correctly for TCP, because sk_filter() is ran in
      tcp_v[46]_rcv() before hitting the prequeue or backlog checks.
      
      For UDP or other protocols, this does not work, because the sk_filter()
      is ran from sock_queue_rcv_skb(), which might be called _after_ backlog
      queuing if socket is owned by user by the time packet is processed by
      softirq handler.
      
      Fixes: b4b9e355 ("netvm: set PF_MEMALLOC as appropriate during SKB processing")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarGreg Thelen <gthelen@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      d48fa362
    • Pravin B Shelar's avatar
      skbuff: Fix skb checksum partial check. · f688246d
      Pravin B Shelar authored
      [ Upstream commit 31b33dfb ]
      
      Earlier patch 6ae459bd tried to detect void ckecksum partial
      skb by comparing pull length to checksum offset. But it does
      not work for all cases since checksum-offset depends on
      updates to skb->data.
      
      Following patch fixes it by validating checksum start offset
      after skb-data pointer is updated. Negative value of checksum
      offset start means there is no need to checksum.
      
      Fixes: 6ae459bd ("skbuff: Fix skb checksum flag on skb pull")
      Reported-by: default avatarAndrew Vagin <avagin@odin.com>
      Signed-off-by: default avatarPravin B Shelar <pshelar@nicira.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      f688246d
    • Pravin B Shelar's avatar
      skbuff: Fix skb checksum flag on skb pull · 6532c6b2
      Pravin B Shelar authored
      [ Upstream commit 6ae459bd ]
      
      VXLAN device can receive skb with checksum partial. But the checksum
      offset could be in outer header which is pulled on receive. This results
      in negative checksum offset for the skb. Such skb can cause the assert
      failure in skb_checksum_help(). Following patch fixes the bug by setting
      checksum-none while pulling outer header.
      
      Following is the kernel panic msg from old kernel hitting the bug.
      
      ------------[ cut here ]------------
      kernel BUG at net/core/dev.c:1906!
      RIP: 0010:[<ffffffff81518034>] skb_checksum_help+0x144/0x150
      Call Trace:
      <IRQ>
      [<ffffffffa0164c28>] queue_userspace_packet+0x408/0x470 [openvswitch]
      [<ffffffffa016614d>] ovs_dp_upcall+0x5d/0x60 [openvswitch]
      [<ffffffffa0166236>] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch]
      [<ffffffffa016629b>] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch]
      [<ffffffffa016c51a>] ovs_vport_receive+0x2a/0x30 [openvswitch]
      [<ffffffffa0171383>] vxlan_rcv+0x53/0x60 [openvswitch]
      [<ffffffffa01734cb>] vxlan_udp_encap_recv+0x8b/0xf0 [openvswitch]
      [<ffffffff8157addc>] udp_queue_rcv_skb+0x2dc/0x3b0
      [<ffffffff8157b56f>] __udp4_lib_rcv+0x1cf/0x6c0
      [<ffffffff8157ba7a>] udp_rcv+0x1a/0x20
      [<ffffffff8154fdbd>] ip_local_deliver_finish+0xdd/0x280
      [<ffffffff81550128>] ip_local_deliver+0x88/0x90
      [<ffffffff8154fa7d>] ip_rcv_finish+0x10d/0x370
      [<ffffffff81550365>] ip_rcv+0x235/0x300
      [<ffffffff8151ba1d>] __netif_receive_skb+0x55d/0x620
      [<ffffffff8151c360>] netif_receive_skb+0x80/0x90
      [<ffffffff81459935>] virtnet_poll+0x555/0x6f0
      [<ffffffff8151cd04>] net_rx_action+0x134/0x290
      [<ffffffff810683d8>] __do_softirq+0xa8/0x210
      [<ffffffff8162fe6c>] call_softirq+0x1c/0x30
      [<ffffffff810161a5>] do_softirq+0x65/0xa0
      [<ffffffff810687be>] irq_exit+0x8e/0xb0
      [<ffffffff81630733>] do_IRQ+0x63/0xe0
      [<ffffffff81625f2e>] common_interrupt+0x6e/0x6e
      Reported-by: default avatarAnupam Chanda <achanda@vmware.com>
      Signed-off-by: default avatarPravin B Shelar <pshelar@nicira.com>
      Acked-by: default avatarTom Herbert <tom@herbertland.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      6532c6b2
    • Andrey Vagin's avatar
      net/unix: fix logic about sk_peek_offset · 74587138
      Andrey Vagin authored
      [ Upstream commit e9193d60 ]
      
      Now send with MSG_PEEK can return data from multiple SKBs.
      
      Unfortunately we take into account the peek offset for each skb,
      that is wrong. We need to apply the peek offset only once.
      
      In addition, the peek offset should be used only if MSG_PEEK is set.
      
      Cc: "David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING
      Cc: Eric Dumazet <edumazet@google.com> (commit_signer:1/14=7%)
      Cc: Aaron Conole <aconole@bytheb.org>
      Fixes: 9f389e35 ("af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag")
      Signed-off-by: default avatarAndrey Vagin <avagin@openvz.org>
      Tested-by: default avatarAaron Conole <aconole@bytheb.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      74587138
    • Aaron Conole's avatar
      af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag · 3427bee1
      Aaron Conole authored
      [ Upstream commit 9f389e35 ]
      
      AF_UNIX sockets now return multiple skbs from recv() when MSG_PEEK flag
      is set.
      
      This is referenced in kernel bugzilla #12323 @
      https://bugzilla.kernel.org/show_bug.cgi?id=12323
      
      As described both in the BZ and lkml thread @
      http://lkml.org/lkml/2008/1/8/444 calling recv() with MSG_PEEK on an
      AF_UNIX socket only reads a single skb, where the desired effect is
      to return as much skb data has been queued, until hitting the recv
      buffer size (whichever comes first).
      
      The modified MSG_PEEK path will now move to the next skb in the tree
      and jump to the again: label, rather than following the natural loop
      structure. This requires duplicating some of the loop head actions.
      
      This was tested using the python socketpair python code attached to
      the bugzilla issue.
      Signed-off-by: default avatarAaron Conole <aconole@bytheb.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      3427bee1
    • Aaron Conole's avatar
      af_unix: Convert the unix_sk macro to an inline function for type safety · 0a97833a
      Aaron Conole authored
      [ Upstream commit 4613012d ]
      
      As suggested by Eric Dumazet this change replaces the
      #define with a static inline function to enjoy
      complaints by the compiler when misusing the API.
      Signed-off-by: default avatarAaron Conole <aconole@bytheb.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      0a97833a
    • Alexander Couzens's avatar
      l2tp: protect tunnel->del_work by ref_count · 3ebbc131
      Alexander Couzens authored
      [ Upstream commit 06a15f51 ]
      
      There is a small chance that tunnel_free() is called before tunnel->del_work scheduled
      resulting in a zero pointer dereference.
      Signed-off-by: default avatarAlexander Couzens <lynxis@fe80.eu>
      Acked-by: default avatarJames Chapman <jchapman@katalix.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      3ebbc131
    • Jan Kara's avatar
      jbd2: avoid infinite loop when destroying aborted journal · ffb04e43
      Jan Kara authored
      [ Upstream commit 841df7df ]
      
      Commit 6f6a6fda "jbd2: fix ocfs2 corrupt when updating journal
      superblock fails" changed jbd2_cleanup_journal_tail() to return EIO
      when the journal is aborted. That makes logic in
      jbd2_log_do_checkpoint() bail out which is fine, except that
      jbd2_journal_destroy() expects jbd2_log_do_checkpoint() to always make
      a progress in cleaning the journal. Without it jbd2_journal_destroy()
      just loops in an infinite loop.
      
      Fix jbd2_journal_destroy() to cleanup journal checkpoint lists of
      jbd2_log_do_checkpoint() fails with error.
      Reported-by: default avatarEryu Guan <guaneryu@gmail.com>
      Tested-by: default avatarEryu Guan <guaneryu@gmail.com>
      Fixes: 6f6a6fdaSigned-off-by: default avatarJan Kara <jack@suse.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      ffb04e43
  2. 31 Oct, 2015 3 commits
    • Sasha Levin's avatar
      Linux 3.18.24 · b1240304
      Sasha Levin authored
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      b1240304
    • Kosuke Tatsukawa's avatar
      tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c · b008351f
      Kosuke Tatsukawa authored
      [ Upstream commit e81107d4 ]
      
      My colleague ran into a program stall on a x86_64 server, where
      n_tty_read() was waiting for data even if there was data in the buffer
      in the pty.  kernel stack for the stuck process looks like below.
       #0 [ffff88303d107b58] __schedule at ffffffff815c4b20
       #1 [ffff88303d107bd0] schedule at ffffffff815c513e
       #2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818
       #3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2
       #4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23
       #5 [ffff88303d107dd0] tty_read at ffffffff81368013
       #6 [ffff88303d107e20] __vfs_read at ffffffff811a3704
       #7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57
       #8 [ffff88303d107f00] sys_read at ffffffff811a4306
       #9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7
      
      There seems to be two problems causing this issue.
      
      First, in drivers/tty/n_tty.c, __receive_buf() stores the data and
      updates ldata->commit_head using smp_store_release() and then checks
      the wait queue using waitqueue_active().  However, since there is no
      memory barrier, __receive_buf() could return without calling
      wake_up_interactive_poll(), and at the same time, n_tty_read() could
      start to wait in wait_woken() as in the following chart.
      
              __receive_buf()                         n_tty_read()
      ------------------------------------------------------------------------
      if (waitqueue_active(&tty->read_wait))
      /* Memory operations issued after the
         RELEASE may be completed before the
         RELEASE operation has completed */
                                              add_wait_queue(&tty->read_wait, &wait);
                                              ...
                                              if (!input_available_p(tty, 0)) {
      smp_store_release(&ldata->commit_head,
                        ldata->read_head);
                                              ...
                                              timeout = wait_woken(&wait,
                                                TASK_INTERRUPTIBLE, timeout);
      ------------------------------------------------------------------------
      
      The second problem is that n_tty_read() also lacks a memory barrier
      call and could also cause __receive_buf() to return without calling
      wake_up_interactive_poll(), and n_tty_read() to wait in wait_woken()
      as in the chart below.
      
              __receive_buf()                         n_tty_read()
      ------------------------------------------------------------------------
                                              spin_lock_irqsave(&q->lock, flags);
                                              /* from add_wait_queue() */
                                              ...
                                              if (!input_available_p(tty, 0)) {
                                              /* Memory operations issued after the
                                                 RELEASE may be completed before the
                                                 RELEASE operation has completed */
      smp_store_release(&ldata->commit_head,
                        ldata->read_head);
      if (waitqueue_active(&tty->read_wait))
                                              __add_wait_queue(q, wait);
                                              spin_unlock_irqrestore(&q->lock,flags);
                                              /* from add_wait_queue() */
                                              ...
                                              timeout = wait_woken(&wait,
                                                TASK_INTERRUPTIBLE, timeout);
      ------------------------------------------------------------------------
      
      There are also other places in drivers/tty/n_tty.c which have similar
      calls to waitqueue_active(), so instead of adding many memory barrier
      calls, this patch simply removes the call to waitqueue_active(),
      leaving just wake_up*() behind.
      
      This fixes both problems because, even though the memory access before
      or after the spinlocks in both wake_up*() and add_wait_queue() can
      sneak into the critical section, it cannot go past it and the critical
      section assures that they will be serialized (please see "INTER-CPU
      ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a
      better explanation).  Moreover, the resulting code is much simpler.
      
      Latency measurement using a ping-pong test over a pty doesn't show any
      visible performance drop.
      Signed-off-by: default avatarKosuke Tatsukawa <tatsu@ab.jp.nec.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      b008351f
    • Sasha Levin's avatar
      Revert "tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c" · 80e5c4dd
      Sasha Levin authored
      This reverts commit af32cc7b.
      
      The commit was incorrectly backported and was causing hangs.
      Reported-by: default avatarCorey Wright <undefined@pobox.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      80e5c4dd
  3. 29 Oct, 2015 1 commit
  4. 28 Oct, 2015 26 commits