1. 13 Nov, 2015 40 commits
    • netlink: Trim skb to alloc size to avoid MSG_TRUNC · 337dbfd5
      Arad, Ronen authored
      [ Upstream commit db65a3aa ]
      
      netlink_dump() allocates skb based on the calculated min_dump_alloc or
      a per socket max_recvmsg_len.
      min_alloc_size is maximum space required for any single netdev
      attributes as calculated by rtnl_calcit().
      max_recvmsg_len tracks the user provided buffer to netlink_recvmsg.
      It is capped at 16KiB.
      The intention is to avoid small allocations and to minimize the number
      of calls required to obtain dump information for all net devices.
      
      netlink_dump packs as many small messages as can fit within an skb
      that was sized for the largest single netdev information. The actual
      space available within an skb is larger than what is requested. It can
      be much larger, up to nearly 2x when allocations are aligned to the
      next power of 2.
      
      Allowing netlink_dump to use all the space available within the
      allocated skb increases the buffer size a user has to provide to avoid
      truncation (i.e. MSG_TRUNC flag set).
      
      It was observed that with many VLANs configured on at least one netdev,
      a larger buffer of near 64KiB was necessary to avoid "Message truncated"
      error in "ip link" or "bridge [-c[ompressvlans]] vlan show" when
      min_alloc_size was only a little over 32KiB.
      
      This patch trims the skb to the allocated size in order to allow the
      user to avoid truncation with a more reasonable buffer size.
      Signed-off-by: Ronen Arad <ronen.arad@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
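The sizing problem described in the netlink commit above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `roundup_pow2`, `usable_untrimmed`, and `usable_trimmed` are hypothetical helpers, the allocator is assumed to round every request up to the next power of two (as the commit message suggests), and skb bookkeeping overhead is ignored.

```c
#include <stddef.h>

/* Hypothetical model: round a request up to the next power of two. */
size_t roundup_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/* Space a dump could fill if it used everything the allocator handed back. */
size_t usable_untrimmed(size_t requested)
{
    return roundup_pow2(requested);
}

/* Space a dump may fill once the buffer is trimmed to the requested size. */
size_t usable_trimmed(size_t requested)
{
    return requested;
}
```

In this model a 33 KiB request yields 64 KiB of usable space when untrimmed, which is consistent with the near-64KiB user buffer mentioned in the commit message, while the trimmed variant stays at 33 KiB.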
    • ethtool: Use kcalloc instead of kmalloc for ethtool_get_strings · fbf85150
      Joe Perches authored
      [ Upstream commit 077cb37f ]
      
      It seems that kernel memory can leak into userspace by a
      kmalloc, ethtool_get_strings, then copy_to_user sequence.
      
      Avoid this by using kcalloc to zero fill the copied buffer.
      Signed-off-by: Joe Perches <joe@perches.com>
      Acked-by: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
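A userspace analogue of the ethtool fix above: when a producer fills only part of each fixed-size slot, a zero-filling allocator guarantees that the unwritten bytes carry no stale data. This is a sketch, not the kernel code; `build_string_table` and `STR_LEN` are hypothetical names, and `calloc()` stands in for `kcalloc()`.

```c
#include <stdlib.h>
#include <string.h>

#define STR_LEN 32 /* hypothetical fixed slot size per string */

/*
 * Build a table of `count` fixed-size string slots. calloc() zero-fills the
 * whole table, so bytes the loop never writes cannot expose old memory.
 * Returns NULL on allocation failure; the caller frees the result.
 */
char *build_string_table(const char *const *names, size_t count)
{
    char *table = calloc(count, STR_LEN);
    size_t i;

    if (!table)
        return NULL;

    for (i = 0; i < count; i++) {
        size_t len = strlen(names[i]);

        if (len > STR_LEN - 1)
            len = STR_LEN - 1;
        /* Only `len` bytes are written; the rest of the slot stays zero. */
        memcpy(table + i * STR_LEN, names[i], len);
    }
    return table;
}
```

With `malloc()` in place of `calloc()`, the tail of each slot would hold whatever the heap previously contained, which is the leak the commit describes.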
    • ovs: do not allocate memory from offline numa node · 2fc2a320
      Konstantin Khlebnikov authored
      [ Upstream commit 598c12d0 ]
      
      When openvswitch tries to allocate memory from offline numa node 0:
      stats = kmem_cache_alloc_node(flow_stats_cache, GFP_KERNEL | __GFP_ZERO, 0)
      It catches VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES || !node_online(nid))
      [ replaced with VM_WARN_ON(!node_online(nid)) recently ] in linux/gfp.h
      This patch disables numa affinity in this case.
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • ppp: don't override sk->sk_state in pppoe_flush_dev() · 0dded495
      Guillaume Nault authored
      [ Upstream commit e6740165 ]
      
      Since commit 2b018d57 ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release"),
      pppoe_release() calls dev_put(po->pppoe_dev) if sk is in the
      PPPOX_ZOMBIE state. But pppoe_flush_dev() can set sk->sk_state to
      PPPOX_ZOMBIE _and_ reset po->pppoe_dev to NULL. This leads to the
      following oops:
      
      [  570.140800] BUG: unable to handle kernel NULL pointer dereference at 00000000000004e0
      [  570.142931] IP: [<ffffffffa018c701>] pppoe_release+0x50/0x101 [pppoe]
      [  570.144601] PGD 3d119067 PUD 3dbc1067 PMD 0
      [  570.144601] Oops: 0000 [#1] SMP
      [  570.144601] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel pppoe pppox ppp_generic slhc loop crc32c_intel ghash_clmulni_intel jitterentropy_rng sha256_generic hmac drbg ansi_cprng aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper acpi_cpufreq evdev serio_raw processor button ext4 crc16 mbcache jbd2 virtio_net virtio_blk virtio_pci virtio_ring virtio
      [  570.144601] CPU: 1 PID: 15738 Comm: ppp-apitest Not tainted 4.2.0 #1
      [  570.144601] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
      [  570.144601] task: ffff88003d30d600 ti: ffff880036b60000 task.ti: ffff880036b60000
      [  570.144601] RIP: 0010:[<ffffffffa018c701>]  [<ffffffffa018c701>] pppoe_release+0x50/0x101 [pppoe]
      [  570.144601] RSP: 0018:ffff880036b63e08  EFLAGS: 00010202
      [  570.144601] RAX: 0000000000000000 RBX: ffff880034340000 RCX: 0000000000000206
      [  570.144601] RDX: 0000000000000006 RSI: ffff88003d30dd20 RDI: ffff88003d30dd20
      [  570.144601] RBP: ffff880036b63e28 R08: 0000000000000001 R09: 0000000000000000
      [  570.144601] R10: 00007ffee9b50420 R11: ffff880034340078 R12: ffff8800387ec780
      [  570.144601] R13: ffff8800387ec7b0 R14: ffff88003e222aa0 R15: ffff8800387ec7b0
      [  570.144601] FS:  00007f5672f48700(0000) GS:ffff88003fc80000(0000) knlGS:0000000000000000
      [  570.144601] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  570.144601] CR2: 00000000000004e0 CR3: 0000000037f7e000 CR4: 00000000000406a0
      [  570.144601] Stack:
      [  570.144601]  ffffffffa018f240 ffff8800387ec780 ffffffffa018f240 ffff8800387ec7b0
      [  570.144601]  ffff880036b63e48 ffffffff812caabe ffff880039e4e000 0000000000000008
      [  570.144601]  ffff880036b63e58 ffffffff812cabad ffff880036b63ea8 ffffffff811347f5
      [  570.144601] Call Trace:
      [  570.144601]  [<ffffffff812caabe>] sock_release+0x1a/0x75
      [  570.144601]  [<ffffffff812cabad>] sock_close+0xd/0x11
      [  570.144601]  [<ffffffff811347f5>] __fput+0xff/0x1a5
      [  570.144601]  [<ffffffff811348cb>] ____fput+0x9/0xb
      [  570.144601]  [<ffffffff81056682>] task_work_run+0x66/0x90
      [  570.144601]  [<ffffffff8100189e>] prepare_exit_to_usermode+0x8c/0xa7
      [  570.144601]  [<ffffffff81001a26>] syscall_return_slowpath+0x16d/0x19b
      [  570.144601]  [<ffffffff813babb1>] int_ret_from_sys_call+0x25/0x9f
      [  570.144601] Code: 48 8b 83 c8 01 00 00 a8 01 74 12 48 89 df e8 8b 27 14 e1 b8 f7 ff ff ff e9 b7 00 00 00 8a 43 12 a8 0b 74 1c 48 8b 83 a8 04 00 00 <48> 8b 80 e0 04 00 00 65 ff 08 48 c7 83 a8 04 00 00 00 00 00 00
      [  570.144601] RIP  [<ffffffffa018c701>] pppoe_release+0x50/0x101 [pppoe]
      [  570.144601]  RSP <ffff880036b63e08>
      [  570.144601] CR2: 00000000000004e0
      [  570.200518] ---[ end trace 46956baf17349563 ]---
      
      pppoe_flush_dev() has no reason to override sk->sk_state with
      PPPOX_ZOMBIE. pppox_unbind_sock() already sets sk->sk_state to
      PPPOX_DEAD, which is the correct state given that sk is unbound and
      po->pppoe_dev is NULL.
      
      Fixes: 2b018d57 ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release")
      Tested-by: Oleksii Berezhniak <core@irc.lg.ua>
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • net: add pfmemalloc check in sk_add_backlog() · bcbe1f3a
      Eric Dumazet authored
      [ Upstream commit c7c49b8f ]
      
      Greg reported crashes hitting the following check in __sk_backlog_rcv()
      
      	BUG_ON(!sock_flag(sk, SOCK_MEMALLOC));
      
      The pfmemalloc bit is currently checked in sk_filter().
      
      This works correctly for TCP, because sk_filter() is run in
      tcp_v[46]_rcv() before hitting the prequeue or backlog checks.
      
      For UDP or other protocols, this does not work, because sk_filter()
      is run from sock_queue_rcv_skb(), which might be called _after_ backlog
      queuing if the socket is owned by the user by the time the packet is
      processed by the softirq handler.
      
      Fixes: b4b9e355 ("netvm: set PF_MEMALLOC as appropriate during SKB processing")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Greg Thelen <gthelen@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
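The idea behind the pfmemalloc commit above, checking at backlog-queue time rather than relying on a later filter step, can be modelled in a few lines. This is a hedged sketch: `model_skb`, `model_sock`, and `model_add_backlog` are hypothetical stand-ins, and returning `-ENOMEM` for the rejected case is an assumption of this example.

```c
#include <errno.h>
#include <stdbool.h>

struct model_skb {
    bool pfmemalloc;     /* buffer came from emergency reserves */
};

struct model_sock {
    bool memalloc;       /* socket is allowed to use emergency reserves */
    int backlog_len;     /* number of packets queued on the backlog */
};

/* Returns 0 if the packet was queued, -ENOMEM if it must be dropped. */
int model_add_backlog(struct model_sock *sk, const struct model_skb *skb)
{
    /* Reject before queuing, so later processing never sees the mismatch. */
    if (skb->pfmemalloc && !sk->memalloc)
        return -ENOMEM;

    sk->backlog_len++;
    return 0;
}
```

Because the check happens before the packet reaches the backlog, it holds for every protocol regardless of where that protocol runs its filter.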
    • skbuff: Fix skb checksum partial check. · 38043da1
      Pravin B Shelar authored
      [ Upstream commit 31b33dfb ]
      
      Earlier patch 6ae459bd tried to detect void checksum partial
      skb by comparing pull length to checksum offset. But it does
      not work for all cases since checksum-offset depends on
      updates to skb->data.
      
      This patch fixes it by validating the checksum start offset
      after the skb->data pointer is updated. A negative checksum
      start offset means there is no need to checksum.
      
      Fixes: 6ae459bd ("skbuff: Fix skb checksum flag on skb pull")
      Reported-by: Andrew Vagin <avagin@odin.com>
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
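The check described in the commit above, validating the checksum start offset only after the data pointer has moved, might be modelled as follows. All names here (`model_skb`, `model_pull`, `model_csum_start_offset`, the `MODEL_CSUM_*` values) are hypothetical; the sketch tracks offsets from the buffer head rather than real pointers.

```c
#include <stddef.h>

enum model_csum { MODEL_CSUM_NONE, MODEL_CSUM_PARTIAL };

struct model_skb {
    ptrdiff_t data_off;    /* current data start, as offset from buffer head */
    ptrdiff_t csum_start;  /* checksum start, as offset from buffer head */
    enum model_csum ip_summed;
};

/* Checksum start relative to the current data pointer. */
ptrdiff_t model_csum_start_offset(const struct model_skb *skb)
{
    return skb->csum_start - skb->data_off;
}

/* Pull `len` bytes of header, then re-validate the checksum state. */
void model_pull(struct model_skb *skb, ptrdiff_t len)
{
    skb->data_off += len;

    /* A negative offset means the checksum region was pulled away. */
    if (skb->ip_summed == MODEL_CSUM_PARTIAL &&
        model_csum_start_offset(skb) < 0)
        skb->ip_summed = MODEL_CSUM_NONE;
}
```

Comparing the pull length against the offset before updating the data pointer would mix two different reference points, which is why the check is placed after the update in this sketch.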
    • skbuff: Fix skb checksum flag on skb pull · 7ed2ca12
      Pravin B Shelar authored
      [ Upstream commit 6ae459bd ]
      
      VXLAN device can receive skb with checksum partial. But the checksum
      offset could be in outer header which is pulled on receive. This results
      in negative checksum offset for the skb. Such skb can cause the assert
      failure in skb_checksum_help(). Following patch fixes the bug by setting
      checksum-none while pulling outer header.
      
      Following is the kernel panic msg from old kernel hitting the bug.
      
      ------------[ cut here ]------------
      kernel BUG at net/core/dev.c:1906!
      RIP: 0010:[<ffffffff81518034>] skb_checksum_help+0x144/0x150
      Call Trace:
      <IRQ>
      [<ffffffffa0164c28>] queue_userspace_packet+0x408/0x470 [openvswitch]
      [<ffffffffa016614d>] ovs_dp_upcall+0x5d/0x60 [openvswitch]
      [<ffffffffa0166236>] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch]
      [<ffffffffa016629b>] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch]
      [<ffffffffa016c51a>] ovs_vport_receive+0x2a/0x30 [openvswitch]
      [<ffffffffa0171383>] vxlan_rcv+0x53/0x60 [openvswitch]
      [<ffffffffa01734cb>] vxlan_udp_encap_recv+0x8b/0xf0 [openvswitch]
      [<ffffffff8157addc>] udp_queue_rcv_skb+0x2dc/0x3b0
      [<ffffffff8157b56f>] __udp4_lib_rcv+0x1cf/0x6c0
      [<ffffffff8157ba7a>] udp_rcv+0x1a/0x20
      [<ffffffff8154fdbd>] ip_local_deliver_finish+0xdd/0x280
      [<ffffffff81550128>] ip_local_deliver+0x88/0x90
      [<ffffffff8154fa7d>] ip_rcv_finish+0x10d/0x370
      [<ffffffff81550365>] ip_rcv+0x235/0x300
      [<ffffffff8151ba1d>] __netif_receive_skb+0x55d/0x620
      [<ffffffff8151c360>] netif_receive_skb+0x80/0x90
      [<ffffffff81459935>] virtnet_poll+0x555/0x6f0
      [<ffffffff8151cd04>] net_rx_action+0x134/0x290
      [<ffffffff810683d8>] __do_softirq+0xa8/0x210
      [<ffffffff8162fe6c>] call_softirq+0x1c/0x30
      [<ffffffff810161a5>] do_softirq+0x65/0xa0
      [<ffffffff810687be>] irq_exit+0x8e/0xb0
      [<ffffffff81630733>] do_IRQ+0x63/0xe0
      [<ffffffff81625f2e>] common_interrupt+0x6e/0x6e
      Reported-by: Anupam Chanda <achanda@vmware.com>
      Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
      Acked-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • net/unix: fix logic about sk_peek_offset · af9d87a1
      Andrey Vagin authored
      [ Upstream commit e9193d60 ]
      
      Now recv with MSG_PEEK can return data from multiple SKBs.
      
      Unfortunately we take into account the peek offset for each skb,
      which is wrong. We need to apply the peek offset only once.
      
      In addition, the peek offset should be used only if MSG_PEEK is set.
      
      Cc: "David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING
      Cc: Eric Dumazet <edumazet@google.com> (commit_signer:1/14=7%)
      Cc: Aaron Conole <aconole@bytheb.org>
      Fixes: 9f389e35 ("af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag")
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Tested-by: Aaron Conole <aconole@bytheb.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
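The "apply the peek offset only once" rule from the commit above can be shown with a userspace model of a queue of buffers. This is a sketch, not the af_unix code: `struct chunk` and `peek_chunks` are hypothetical, and a caller that is not peeking is assumed to pass a `peek_off` of 0.

```c
#include <stddef.h>
#include <string.h>

struct chunk {
    const char *data;
    size_t len;
};

/*
 * Copy up to `cap` bytes from the queued chunks into `out`, skipping the
 * first `peek_off` bytes of the combined stream. Returns bytes copied.
 */
size_t peek_chunks(const struct chunk *q, size_t n, size_t peek_off,
                   char *out, size_t cap)
{
    size_t skip = peek_off, copied = 0, i;

    for (i = 0; i < n && copied < cap; i++) {
        size_t avail, take;

        if (skip >= q[i].len) {      /* chunk lies wholly before the offset */
            skip -= q[i].len;
            continue;
        }
        avail = q[i].len - skip;
        take = avail < cap - copied ? avail : cap - copied;
        memcpy(out + copied, q[i].data + skip, take);
        copied += take;
        skip = 0;                    /* offset consumed; never reapplied */
    }
    return copied;
}
```

Resetting `skip` to zero after the first copy is the key step: reapplying the full offset to each chunk would drop bytes from the start of every later buffer.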
    • af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag · 3d10783a
      Aaron Conole authored
      [ Upstream commit 9f389e35 ]
      
      AF_UNIX sockets now return multiple skbs from recv() when MSG_PEEK flag
      is set.
      
      This is referenced in kernel bugzilla #12323 @
      https://bugzilla.kernel.org/show_bug.cgi?id=12323
      
      As described both in the BZ and lkml thread @
      http://lkml.org/lkml/2008/1/8/444 calling recv() with MSG_PEEK on an
      AF_UNIX socket only reads a single skb, where the desired effect is
      to return as much skb data as has been queued, until hitting the recv
      buffer size (whichever comes first).
      
      The modified MSG_PEEK path will now move to the next skb in the tree
      and jump to the again: label, rather than following the natural loop
      structure. This requires duplicating some of the loop head actions.
      
      This was tested using the Python socketpair code attached to
      the bugzilla issue.
      Signed-off-by: Aaron Conole <aconole@bytheb.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • af_unix: Convert the unix_sk macro to an inline function for type safety · 2ffdfb97
      Aaron Conole authored
      [ Upstream commit 4613012d ]
      
      As suggested by Eric Dumazet this change replaces the
      #define with a static inline function to enjoy
      complaints by the compiler when misusing the API.
      Signed-off-by: Aaron Conole <aconole@bytheb.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
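The type-safety argument in the commit above is easy to see side by side. The following is a minimal sketch with hypothetical `model_` types and names, not the kernel's definitions.

```c
struct model_sock {
    int state;
};

struct model_unix_sock {
    struct model_sock sk;   /* must stay the first member for the cast below */
    int peer_id;
};

/* Macro form: casts any pointer type without complaint. */
#define MODEL_UNIX_SK_MACRO(ptr) ((struct model_unix_sock *)(ptr))

/* Inline form: the parameter type lets the compiler check the argument. */
static inline struct model_unix_sock *model_unix_sk(struct model_sock *sk)
{
    return (struct model_unix_sock *)sk;
}
```

Passing, say, an `int *` to `MODEL_UNIX_SK_MACRO` compiles silently, whereas passing it to `model_unix_sk` draws an incompatible-pointer-type diagnostic from the compiler.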
    • l2tp: protect tunnel->del_work by ref_count · 806b1faa
      Alexander Couzens authored
      [ Upstream commit 06a15f51 ]
      
      There is a small chance that tunnel_free() is called before
      tunnel->del_work is scheduled, resulting in a NULL pointer dereference.
      Signed-off-by: Alexander Couzens <lynxis@fe80.eu>
      Acked-by: James Chapman <jchapman@katalix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • pinctrl: imx25: ensure that a pin with id i is at position i in the info array · 2f1b4ff3
      Uwe Kleine-König authored
      commit 9911a2d5 upstream.
      
      The code in pinctrl-imx.c only works correctly if in the
      imx_pinctrl_soc_info passed to imx_pinctrl_probe we have:
      
      	info->pins[i].number = i
      	conf_reg(info->pins[i]) = 4 * i
      
      (with conf_reg(pin) being the offset of the pin's configuration
      register).
      
      When the imx25 specific part was introduced in b4a87c9b ("pinctrl:
      pinctrl-imx: add imx25 pinctrl driver") we had:
      
      	info->pins[i].number = i + 1
      	conf_reg(info->pins[i]) = 4 * i
      
      . Commit 34027ca2 ("pinctrl: imx25: fix numbering for pins") tried
      to fix that but made the situation:
      
      	info->pins[i-1].number = i
      	conf_reg(info->pins[i-1]) = 4 * i
      
      which is hardly better but fixed the error seen back then.
      
      So insert another reserved entry in the array to finally yield:
      
      	info->pins[i].number = i
      	conf_reg(info->pins[i]) = 4 * i
      
      Fixes: 34027ca2 ("pinctrl: imx25: fix numbering for pins")
      Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
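The invariant stated in the pinctrl commit above (pin `i` sits at index `i`, with its configuration register at offset `4 * i`) lends itself to a mechanical check. In this sketch `struct model_pin` and `first_misplaced_pin` are hypothetical, and storing `conf_reg` as a field is a simplification made for the example.

```c
#include <stddef.h>

struct model_pin {
    unsigned int number;    /* pin id */
    unsigned int conf_reg;  /* offset of the pin's configuration register */
};

/* Returns the index of the first misplaced pin, or -1 if the table is consistent. */
long first_misplaced_pin(const struct model_pin *pins, size_t npins)
{
    size_t i;

    for (i = 0; i < npins; i++) {
        if (pins[i].number != i || pins[i].conf_reg != 4 * i)
            return (long)i;
    }
    return -1;
}
```

A table laid out like the original imx25 one, where `pins[i].number` is `i + 1`, fails this check at index 0.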
    • i2c: designware: Do not use parameters from ACPI on Dell Inspiron 7348 · 2a50314b
      Mika Westerberg authored
      commit 56d4b8a2 upstream.
      
      ACPI SSCN/FMCN methods were originally added because then the platform can
      provide the most accurate HCNT/LCNT values to the driver. However, this
      seems not to be true for Dell Inspiron 7348 where using these causes the
      touchpad to fail in boot:
      
        i2c_hid i2c-DLL0675:00: failed to retrieve report from device.
        i2c_designware INT3433:00: i2c_dw_handle_tx_abort: lost arbitration
        i2c_hid i2c-DLL0675:00: failed to retrieve report from device.
        i2c_designware INT3433:00: controller timed out
      
      The values received from ACPI are (in fast mode):
      
        HCNT: 72
        LCNT: 160
      
      this translates to following timings (input clock is 100MHz on Broadwell):
      
        tHIGH: 720 ns (spec min 600 ns)
        tLOW: 1600 ns (spec min 1300 ns)
        Bus period: 2920 ns (assuming 300 ns tf and tr)
        Bus speed: 342.5 kHz
      
      Both tHIGH and tLOW are within the I2C specification.
      
      The calculated values when ACPI parameters are not used are (in fast mode):
      
        HCNT: 87
        LCNT: 159
      
      which translates to:
      
        tHIGH: 870 ns (spec min 600 ns)
        tLOW: 1590 ns (spec min 1300 ns)
        Bus period 3060 ns (assuming 300 ns tf and tr)
        Bus speed 326.8 kHz
      
      These values are also within the I2C specification.
      
      Since both ACPI and calculated values meet the I2C specification timing
      requirements it is hard to say why the touchpad does not function properly
      with the ACPI values except that the bus speed is higher in this case (but
      still well below the max 400kHz).
      
      Solve this by adding a DMI quirk to the driver that disables using ACPI
      parameters on this particular machine.
      Reported-by: Pavel Roskin <plroskin@gmail.com>
      Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Tested-by: Pavel Roskin <plroskin@gmail.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
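The timing figures quoted in the i2c commit above follow from simple arithmetic on the counter values. The sketch below reproduces them under the commit's stated assumptions (100 MHz input clock, 300 ns each for tf and tr); the function names are hypothetical, and any hardware-specific counter offsets are deliberately ignored.

```c
/* Convert a counter value to nanoseconds for an input clock given in MHz. */
double cnt_to_ns(unsigned int cnt, double clk_mhz)
{
    return cnt * 1000.0 / clk_mhz;
}

/* Bus period: tHIGH + tLOW plus the assumed fall and rise times. */
double bus_period_ns(unsigned int hcnt, unsigned int lcnt, double clk_mhz,
                     double tf_ns, double tr_ns)
{
    return cnt_to_ns(hcnt, clk_mhz) + cnt_to_ns(lcnt, clk_mhz) + tf_ns + tr_ns;
}

/* Bus speed in kHz for a period given in nanoseconds. */
double bus_speed_khz(double period_ns)
{
    return 1e6 / period_ns;
}
```

For the ACPI values (HCNT 72, LCNT 160) this gives 720 + 1600 + 300 + 300 = 2920 ns, or about 342.5 kHz; for the calculated values (HCNT 87, LCNT 159) it gives 3060 ns, or about 326.8 kHz.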
    • memcg: convert threshold to bytes · 92105129
      Shaohua Li authored
      commit 424cdc14 upstream.
      
      page_counter_memparse() returns pages for the threshold, while
      mem_cgroup_usage() returns bytes for memory usage.  Convert the
      threshold to bytes.
      
      Fixes: 3e32cb2e ("memcg: rename cgroup_event to mem_cgroup_event").
      Signed-off-by: Shaohua Li <shli@fb.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
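The unit mismatch fixed by the memcg commit above is the classic pages-versus-bytes comparison. A minimal sketch, assuming 4 KiB pages and using hypothetical names (`MODEL_PAGE_SIZE`, `pages_to_bytes`, `threshold_crossed`):

```c
#define MODEL_PAGE_SIZE 4096ULL /* assumption: 4 KiB pages */

/* Convert a page count to bytes. */
unsigned long long pages_to_bytes(unsigned long pages)
{
    return pages * MODEL_PAGE_SIZE;
}

/* Compare like with like: usage and threshold are both in bytes here. */
int threshold_crossed(unsigned long long usage_bytes,
                      unsigned long threshold_pages)
{
    return usage_bytes >= pages_to_bytes(threshold_pages);
}
```

Without the conversion, a usage of 512 KiB (524288 bytes) would compare as larger than a 256-page (1 MiB) threshold, so the event would fire far too early.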
    • i2c: designware-platdrv: enable RuntimePM before registering to the core · 1458488d
      Wolfram Sang authored
      commit 36d48fb5 upstream.
      
      The core may register clients attached to this master which may use
      functionality from the master. So, RuntimePM must be enabled before, otherwise
      this will fail.
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      [ kamal: backport to 3.19-stable: context ]
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • i2c: s3c2410: enable RuntimePM before registering to the core · f417480f
      Wolfram Sang authored
      commit eadd709f upstream.
      
      The core may register clients attached to this master which may use
      functionality from the master. So, RuntimePM must be enabled before, otherwise
      this will fail. While here, move drvdata, too.
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Acked-by: Kukjin Kim <kgene@kernel.org>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • i2c: rcar: enable RuntimePM before registering to the core · 1884c3f9
      Wolfram Sang authored
      commit 4f7effdd upstream.
      
      The core may register clients attached to this master which may use
      functionality from the master. So, RuntimePM must be enabled before, otherwise
      this will fail. While here, move drvdata, too.
      Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • drm/dp/mst: make mst i2c transfer code more robust. · 676a3557
      Dave Airlie authored
      commit ae491542 upstream.
      
      This zeroes the msg so no random stack data ends up getting
      sent. It also limits the function to accepting no more than 4
      i2c msgs.
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • btrfs: fix use after free iterating extrefs · 5b0798a4
      Chris Mason authored
      commit dc6c5fb3 upstream.
      
      The code for btrfs inode-resolve has never worked properly for
      files with enough hard links to trigger extrefs.  It was trying to
      get the leaf out of a path after freeing the path:
      
      	btrfs_release_path(path);
      	leaf = path->nodes[0];
      	item_size = btrfs_item_size_nr(leaf, slot);
      
      The fix here is to use the extent buffer we cloned just a little higher
      up to avoid deadlocks caused by using the leaf in the path.
      Signed-off-by: Chris Mason <clm@fb.com>
      cc: Mark Fasheh <mfasheh@suse.de>
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: Mark Fasheh <mfasheh@suse.de>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • btrfs: check unsupported filters in balance arguments · 210f1fe5
      David Sterba authored
      commit 8eb93459 upstream.
      
      We don't verify that all the balance filter arguments supplemented by
      the flags are actually known to the kernel. Thus we let it silently pass
      and do nothing.
      
      At the moment this means only the 'limit' filter, but we're going to add
      a few more soon so it's better to have that fixed. This should also go to
      older stable kernels so that it works with newer userspace tools.
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
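Rejecting unknown flag bits, as the btrfs commit above describes, is commonly done by masking against the set of known flags. The sketch below shows that pattern; the `MODEL_FILTER_*` names and bit positions are hypothetical and do not reflect the real btrfs flag layout, and returning `-EINVAL` is an assumption of this example.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical filter bits, for illustration only. */
#define MODEL_FILTER_PROFILES (1ULL << 0)
#define MODEL_FILTER_USAGE    (1ULL << 1)
#define MODEL_FILTER_LIMIT    (1ULL << 5)

#define MODEL_FILTER_KNOWN \
    (MODEL_FILTER_PROFILES | MODEL_FILTER_USAGE | MODEL_FILTER_LIMIT)

/* Returns 0 if every requested filter is known, -EINVAL otherwise. */
int check_filter_flags(uint64_t flags)
{
    if (flags & ~MODEL_FILTER_KNOWN)
        return -EINVAL;
    return 0;
}
```

Failing loudly on an unknown bit lets newer userspace detect that a filter is unsupported instead of having it silently ignored.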
    • dm thin: fix missing pool reference count decrement in pool_ctr error path · c7c26f3c
      Mike Snitzer authored
      commit ba30670f upstream.
      
      Fixes: ac8c3f3d ("dm thin: generate event when metadata threshold passed")
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • crypto: ahash - ensure statesize is non-zero · f56048b2
      Russell King authored
      commit 8996eafd upstream.
      
      Unlike shash algorithms, ahash drivers must implement export
      and import as their descriptors may contain hardware state and
      cannot be exported as is.  Unfortunately some ahash drivers did
      not provide them and end up causing crashes with algif_hash.
      
      This patch adds a check to prevent these drivers from registering
      ahash algorithms until they are fixed.
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • arm64: errata: use KBUILD_CFLAGS_MODULE for erratum #843419 · e9d1ff7a
      Will Deacon authored
      commit b6dd8e07 upstream.
      
      Commit df057cc7 ("arm64: errata: add module build workaround for
      erratum #843419") sets CFLAGS_MODULE to ensure that the large memory
      model is used by the compiler when building kernel modules.
      
      However, CFLAGS_MODULE is an environment variable and intended to be
      overridden on the command line, which appears to be the case with the
      Ubuntu kernel packaging system, so use KBUILD_CFLAGS_MODULE instead.
      
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Fixes: df057cc7 ("arm64: errata: add module build workaround for erratum #843419")
      Reported-by: Dann Frazier <dann.frazier@canonical.com>
      Tested-by: Dann Frazier <dann.frazier@canonical.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • drm/nouveau/fbcon: take runpm reference when userspace has an open fd · d436606d
      Ben Skeggs authored
      commit f231976c upstream.
      
      We need to do this in order to prevent accesses to the device while it's
      powered down.  Userspace may have an mmap of the fb, and there's no good
      way (that I know of) to prevent it from touching the device otherwise.
      
      This fixes some nasty races between runpm and plymouth on some systems,
      which result in the GPU getting very upset and hanging the boot.
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • drm: Fix locking for sysfs dpms file · e99852cc
      Daniel Vetter authored
      commit 621bd0f6 upstream.
      
      With atomic drivers we need to make sure that (at least in general)
      property reads hold the right locks. But the legacy dpms property is
      special and can be read locklessly. Since userspace loves to just
      randomly look at that all the time (like with "status") do that.
      
      To make it clear that we play tricks use the READ_ONCE compiler
      barrier (and also for paranoia).
      
      Note that there's not really anything bad going on since even with the
      new atomic paths we eventually end up not chasing any pointers (and
      hence possibly freed memory and other fun stuff). The locking WARNING
      has been added in
      
      commit 88a48e29
      Author: Rob Clark <robdclark@gmail.com>
      Date:   Thu Dec 18 16:01:50 2014 -0500
      
          drm: add atomic properties
      
      but since drivers are converting not everyone will have seen this from
      the start.
      
      Jens reported this and submitted a patch to just grab the
      mode_config.connection_mutex, but we can do a bit better.
      
      v2: Remove unused variables I failed to git add for real.
      
      Reference: http://mid.gmane.org/20150928194822.GA3930@kernel.dk
      Reported-by: Jens Axboe <axboe@fb.com>
      Tested-by: Jens Axboe <axboe@fb.com>
      Cc: Rob Clark <robdclark@gmail.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • crypto: sparc - initialize blkcipher.ivsize · 8db78e8e
      Dave Kleikamp authored
      commit a66d7f72 upstream.
      
      Some of the crypto algorithms write to the initialization vector,
      but no space has been allocated for it. This clobbers adjacent memory.
      Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • cxl: Fix number of allocated pages in SPA · 27b08ad4
      Christophe Lombard authored
      commit 4108efb0 upstream.
      
      The scheduled process area is currently allocated before assigning the
      correct maximum processes to the AFU, which will mean we only ever
      allocate a fixed number of pages for the scheduled process area. This
      will limit us to 958 processes with 2 x 64K pages. If we try to use more
      processes than that we'd probably overrun the buffer and corrupt memory
      or crash.
      
      AFUs that require three or more interrupts per process will not be
      affected as they are already limited to fewer processes than that, but we
      could hit it on an AFU that requires 0, 1 or 2 interrupts per process,
      or when using 4K pages.
      
      This patch moves the initialisation of the num_procs to before the SPA
      allocation so that enough pages will be allocated for the number of
      processes that the AFU supports.
      Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
      Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
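The ordering issue in the cxl commit above comes down to sizing the area from the process count rather than the other way round. The sketch below is an illustration only: the capacity formula in `spa_max_procs` is an assumption chosen because it reproduces the 958 figure quoted for 2 x 64K pages, and both function names are hypothetical.

```c
#include <stddef.h>

/*
 * Assumed capacity formula: how many processes fit in an area of
 * `spa_size` bytes. Yields 958 for 2 x 64K pages, as quoted above.
 * Only meaningful for sizes of at least one 4K page.
 */
size_t spa_max_procs(size_t spa_size)
{
    return ((spa_size / 8) - 96) / 17;
}

/* Smallest page order whose area can hold `num_procs` processes. */
unsigned int spa_order_for(size_t num_procs, size_t page_size)
{
    unsigned int order = 0;

    while (spa_max_procs(page_size << order) < num_procs)
        order++;
    return order;
}
```

If `num_procs` is still at its default when the order is computed, the area stays at a fixed size; computing it after the real process count is known lets the loop grow the allocation as needed.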
    • drm/radeon: add pm sysfs files late · e1da3f14
      Alex Deucher authored
      commit 51a4726b upstream.
      
      They were added relatively early in the driver init process
      which meant that in some cases the driver was not finished
      initializing before external tools tried to use them which
      could result in a crash depending on the timing.
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • workqueue: make sure delayed work run in local cpu · c1562d49
      Shaohua Li authored
      commit 874bbfe6 upstream.
      
      My system keeps crashing with the message below. vmstat_update() schedules a
      delayed work on the current cpu and expects the work to run on that cpu.
      schedule_delayed_work() is expected to make delayed work run on the local cpu.
      The problem is that the timer can be migrated with NO_HZ. __queue_work() queues
      the work in the timer handler, which could run on a different cpu than the one
      where the delayed work was scheduled. The end result is that the delayed work
      runs on a different cpu. The patch makes __queue_delayed_work() record the local
      cpu earlier. With the change, where the timer runs no longer affects where the
      work runs.
      
      [   28.010131] ------------[ cut here ]------------
      [   28.010609] kernel BUG at ../mm/vmstat.c:1392!
      [   28.011099] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
      [   28.011860] Modules linked in:
      [   28.012245] CPU: 0 PID: 289 Comm: kworker/0:3 Tainted: G        W       4.3.0-rc3+ #634
      [   28.013065] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153802- 04/01/2014
      [   28.014160] Workqueue: events vmstat_update
      [   28.014571] task: ffff880117682580 ti: ffff8800ba428000 task.ti: ffff8800ba428000
      [   28.015445] RIP: 0010:[<ffffffff8115f921>]  [<ffffffff8115f921>] vmstat_update+0x31/0x80
      [   28.016282] RSP: 0018:ffff8800ba42fd80  EFLAGS: 00010297
      [   28.016812] RAX: 0000000000000000 RBX: ffff88011a858dc0 RCX: 0000000000000000
      [   28.017585] RDX: ffff880117682580 RSI: ffffffff81f14d8c RDI: ffffffff81f4df8d
      [   28.018366] RBP: ffff8800ba42fd90 R08: 0000000000000001 R09: 0000000000000000
      [   28.019169] R10: 0000000000000000 R11: 0000000000000121 R12: ffff8800baa9f640
      [   28.019947] R13: ffff88011a81e340 R14: ffff88011a823700 R15: 0000000000000000
      [   28.020071] FS:  0000000000000000(0000) GS:ffff88011a800000(0000) knlGS:0000000000000000
      [   28.020071] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [   28.020071] CR2: 00007ff6144b01d0 CR3: 00000000b8e93000 CR4: 00000000000006f0
      [   28.020071] Stack:
      [   28.020071]  ffff88011a858dc0 ffff8800baa9f640 ffff8800ba42fe00 ffffffff8106bd88
      [   28.020071]  ffffffff8106bd0b 0000000000000096 0000000000000000 ffffffff82f9b1e8
      [   28.020071]  ffffffff829f0b10 0000000000000000 ffffffff81f18460 ffff88011a81e340
      [   28.020071] Call Trace:
      [   28.020071]  [<ffffffff8106bd88>] process_one_work+0x1c8/0x540
      [   28.020071]  [<ffffffff8106bd0b>] ? process_one_work+0x14b/0x540
      [   28.020071]  [<ffffffff8106c214>] worker_thread+0x114/0x460
      [   28.020071]  [<ffffffff8106c100>] ? process_one_work+0x540/0x540
      [   28.020071]  [<ffffffff81071bf8>] kthread+0xf8/0x110
      [   28.020071]  [<ffffffff81071b00>] ? kthread_create_on_node+0x200/0x200
      [   28.020071]  [<ffffffff81a6522f>] ret_from_fork+0x3f/0x70
      [   28.020071]  [<ffffffff81071b00>] ? kthread_create_on_node+0x200/0x200
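
      The idea of recording the local cpu at queue time, rather than letting
      the timer handler's cpu decide, can be sketched as a small user-space
      simulation. All names here (CPU_UNBOUND_SIM, delayed_work_sim and the
      three functions) are hypothetical stand-ins, not the kernel's code.

```c
#define CPU_UNBOUND_SIM (-1)   /* hypothetical marker: "no cpu chosen yet" */

struct delayed_work_sim {
    int cpu;                   /* cpu recorded when the work was queued */
};

/* Before the fix: an unbound request stays undecided until the timer fires. */
static void queue_delayed_old(struct delayed_work_sim *dw, int req_cpu,
                              int local_cpu)
{
    (void)local_cpu;
    dw->cpu = req_cpu;
}

/* After the fix: an unbound request is resolved to the local cpu up front. */
static void queue_delayed_new(struct delayed_work_sim *dw, int req_cpu,
                              int local_cpu)
{
    dw->cpu = (req_cpu == CPU_UNBOUND_SIM) ? local_cpu : req_cpu;
}

/* Timer handler: runs on timer_cpu, which may differ from the queueing cpu
 * when the timer has been migrated. Returns the cpu the work ends up on. */
static int timer_fires(const struct delayed_work_sim *dw, int timer_cpu)
{
    return (dw->cpu == CPU_UNBOUND_SIM) ? timer_cpu : dw->cpu;
}
```

      With the old variant, a work item queued on cpu 0 whose timer fires on
      cpu 2 runs on cpu 2; with the new variant it runs on cpu 0.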
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      c1562d49
    • Christoph Hellwig's avatar
      3w-9xxx: don't unmap bounce buffered commands · 04f0566e
      Christoph Hellwig authored
      commit 15e3d5a2 upstream.
      
      3w controllers don't DMA map small single SGL entry commands but instead
      bounce buffer them.  Add a helper to identify these commands and don't
      call scsi_dma_unmap for them.
      
      Based on an earlier patch from James Bottomley.
      
      Fixes: 118c85 ("3w-9xxx: fix command completion race")
      Reported-by: Tóth Attila <atoth@atoth.sote.hu>
      Tested-by: Tóth Attila <atoth@atoth.sote.hu>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Adam Radford <aradford@gmail.com>
      Signed-off-by: James Bottomley <JBottomley@Odin.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      04f0566e
    • Joe Thornber's avatar
      dm cache: fix NULL pointer when switching from cleaner policy · 39556d9f
      Joe Thornber authored
      commit 2bffa150 upstream.
      
      The cleaner policy doesn't make use of the per cache block hint space in
      the metadata (unlike the other policies).  When switching from the
      cleaner policy to mq or smq a NULL pointer crash (in dm_tm_new_block)
      was observed.  The crash was caused by bugs in dm-cache-metadata.c
      when trying to skip creation of the hint btree.
      
      The minimal fix is to change the hint size for the cleaner policy to
      4 bytes (the only hint size supported).
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      39556d9f
    • Peter Zijlstra's avatar
      sched/core: Fix TASK_DEAD race in finish_task_switch() · f5d359c1
      Peter Zijlstra authored
      commit 95913d97 upstream.
      
      So the problem this patch is trying to address is as follows:
      
              CPU0                            CPU1
      
              context_switch(A, B)
                                              ttwu(A)
                                                LOCK A->pi_lock
                                                A->on_cpu == 0
              finish_task_switch(A)
                prev_state = A->state  <-.
                WMB                      |
                A->on_cpu = 0;           |
                UNLOCK rq0->lock         |
                                         |    context_switch(C, A)
                                         `--  A->state = TASK_DEAD
                prev_state == TASK_DEAD
                  put_task_struct(A)
                                              context_switch(A, C)
                                              finish_task_switch(A)
                                                A->state == TASK_DEAD
                                                  put_task_struct(A)
      
      The argument being that the WMB will allow the load of A->state on CPU0
      to cross over and observe CPU1's store of A->state, which will then
      result in a double-drop and use-after-free.
      
      Now the comment states (and this was true once upon a long time ago)
      that we need to observe A->state while holding rq->lock because that
      will order us against the wakeup; however the wakeup will not in fact
      acquire (that) rq->lock; it takes A->pi_lock these days.
      
      We can obviously fix this by upgrading the WMB to an MB, but that is
      expensive, so we'd rather avoid that.
      
      The alternative this patch takes is: smp_store_release(&A->on_cpu, 0),
      which avoids the MB on some archs, but not important ones like ARM.
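
      As a rough user-space analogue, the ordering the patch relies on can be
      sketched with C11 atomics: a release store to on_cpu keeps the earlier
      load of the task state from being reordered after it, and the waker
      pairs it with an acquire load. The struct, constant and function names
      are hypothetical; this is not the scheduler's code.

```c
#include <stdatomic.h>

#define TASK_DEAD_SIM 64          /* hypothetical stand-in for TASK_DEAD */

struct task_sim {
    long state;
    atomic_int on_cpu;
};

/* Switching-away side: sample the state first, then publish on_cpu = 0 with
 * release semantics so the load above cannot move past the store. */
static long finish_task_switch_sim(struct task_sim *prev)
{
    long prev_state = prev->state;

    atomic_store_explicit(&prev->on_cpu, 0, memory_order_release);
    return prev_state;
}

/* Waker side: an acquire load of on_cpu pairs with the release store, so
 * anything the waker writes after seeing 0 is ordered after the load of
 * state on the other side. */
static int task_still_on_cpu(struct task_sim *p)
{
    return atomic_load_explicit(&p->on_cpu, memory_order_acquire);
}
```

      The caller would drop its task reference only when the returned
      prev_state equals the dead state, which is now guaranteed to be the
      value sampled before on_cpu was cleared.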
      Reported-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Cc: manfred@colorfullife.com
      Cc: will.deacon@arm.com
      Fixes: e4a52bcb ("sched: Remove rq->lock from the first half of ttwu()")
      Link: http://lkml.kernel.org/r/20150929124509.GG3816@twins.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      f5d359c1
    • Andreas Dannenberg's avatar
      ASoC: tas2552: fix dBscale-min declaration · 4307f310
      Andreas Dannenberg authored
      commit e2600460 upstream.
      
      The minimum volume level for the TAS2552 (control register value 0x00)
      is -7dB; however, the driver declares it as -0.07dB.
      
      Running amixer before the patch reports:
      dBscale-min=-0.07dB,step=1.00dB,mute=0
      
      Running amixer with the patch applied reports:
      dBscale-min=-7.00dB,step=1.00dB,mute=0
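
      The two amixer readings are consistent with TLV dB scale values being
      expressed in units of 0.01 dB, so a declared minimum of -7 reads as
      -0.07dB and -700 reads as -7.00dB. A minimal sketch of that conversion
      follows; format_tlv_db is a hypothetical helper, not part of ALSA.

```c
#include <stdio.h>
#include <stdlib.h>

/* Render a value given in hundredths of a dB the way the amixer output
 * above shows it, e.g. -700 -> "-7.00dB". */
static void format_tlv_db(int centi_db, char *buf, size_t len)
{
    int mag = abs(centi_db);

    snprintf(buf, len, "%s%d.%02ddB",
             centi_db < 0 ? "-" : "", mag / 100, mag % 100);
}
```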
      Signed-off-by: Andreas Dannenberg <dannenberg@ti.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      4307f310
    • Peter Ujfalusi's avatar
      ASoC: tas2552: Correct the Speaker Driver Playback Volume (PGA_GAIN) · 7003f993
      Peter Ujfalusi authored
      commit dd6ae3bc upstream.
      
      The last parameter for DECLARE_TLV_DB_SCALE() tells whether the gain will
      be muted when it is set to raw 0. In this case it is not muted.
      PGA_GAIN occupies bits 0-4 of the register. Fix the offset in the
      SOC_SINGLE_TLV() for this.
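
      Why the offset matters can be shown with a plain bit-field extraction.
      The helper below is a hypothetical illustration; the shift of 1 used in
      the usage note is an arbitrary wrong value for comparison, not the
      value the driver previously used.

```c
/* Extract a field that is `width` bits wide and starts at bit `shift`. */
static unsigned int reg_field(unsigned int reg_val, unsigned int shift,
                              unsigned int width)
{
    unsigned int mask = (1u << width) - 1u;

    return (reg_val >> shift) & mask;
}
```

      For a raw register value of 0x35, reg_field(0x35, 0, 5) yields 0x15,
      the 5-bit gain in bits 0-4, whereas a shift of 1 would yield 0x1a and
      so report a different gain for the same register contents.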
      Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      7003f993
    • Mark Salyzyn's avatar
      arm64: readahead: fault retry breaks mmap file read random detection · b6a8d33e
      Mark Salyzyn authored
      commit 569ba74a upstream.
      
      This is the arm64 portion of commit 45cac65b ("readahead: fault
      retry breaks mmap file read random detection"), which was absent from
      the initial port and has since gone unnoticed. The original commit says:
      
      > .fault now can retry.  The retry can break state machine of .fault.  In
      > filemap_fault, if page is miss, ra->mmap_miss is increased.  In the second
      > try, since the page is in page cache now, ra->mmap_miss is decreased.  And
      > these are done in one fault, so we can't detect random mmap file access.
      >
      > Add a new flag to indicate .fault is tried once.  In the second try, skip
      > ra->mmap_miss decreasing.  The filemap_fault state machine is ok with it.
      
      With this change, Mark reports that:
      
      > Random read improves by 250%, sequential read improves by 40%, and
      > random write by 400% to an eMMC device with dm crypto wrapped around it.
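
      The effect described in the quoted commit can be sketched as a toy
      model of the miss counter: each random fault misses once and is then
      retried against a page that is by then in the page cache. The struct
      and function names are hypothetical, and the model ignores everything
      in filemap_fault except the counter.

```c
struct ra_sim {
    unsigned int mmap_miss;    /* stand-in for ra->mmap_miss */
};

/* One random-access fault that is retried once.
 * honor_tried != 0 models the new flag: the retry skips the decrement. */
static void random_fault_sim(struct ra_sim *ra, int honor_tried)
{
    /* First try: the page is not in the page cache. */
    ra->mmap_miss++;

    /* Second try: the page is now cached, which looks like a hit. */
    if (!honor_tried && ra->mmap_miss > 0)
        ra->mmap_miss--;
}
```

      Without the flag the counter returns to 0 after every fault, so random
      access is never detected; with it, the counter grows by one per fault.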
      
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: Mark Salyzyn <salyzyn@android.com>
      Signed-off-by: Riley Andrews <riandrews@android.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      b6a8d33e
    • Takashi Iwai's avatar
      ALSA: synth: Fix conflicting OSS device registration on AWE32 · 11abcc76
      Takashi Iwai authored
      commit 225db576 upstream.
      
      When OSS emulation is loaded on an ISA SB AWE32 chip, we now get kernel
      warnings like:
        WARNING: CPU: 0 PID: 2791 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x51/0x80()
        sysfs: cannot create duplicate filename '/devices/isa/sbawe.0/sound/card0/seq-oss-0-0'
      
      It's because both emux synth and opl3 drivers try to register their
      OSS device object with the same static index number 0.  This hasn't
      been a big problem until the recent rewrite of device management code
      (that exposes sysfs at the same time), but it's been an obvious bug.
      
      This patch works around it simply by using a different index number for
      the emux synth object.  There may be a more elegant fix, but this is
      enough for now, as this code is rarely touched anyway.
      Reported-and-tested-by: Michael Shell <list1@michaelshell.org>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      11abcc76
    • covici@ccs.covici.com's avatar
      staging: speakup: fix speakup-r regression · 92cc10f2
      covici@ccs.covici.com authored
      commit b1d562ac upstream.
      
      Here is a patch to make speakup-r work again.
      
      It broke in 3.6 due to commit 4369c64c
      ("Input: Send events one packet at a time").
      
      The problem was that the fakekey.c routine to fake a down arrow no
      longer functioned properly; adding an input_sync() call fixed it.
      
      Fixes: 4369c64c
      Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
      Signed-off-by: John Covici <covici@ccs.covici.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      92cc10f2
    • Jann Horn's avatar
      drivers/tty: require read access for controlling terminal · 0fac33a1
      Jann Horn authored
      commit 0c556271 upstream.
      
      This is mostly a hardening fix, given that write-only access to other
      users' ttys is usually only given through setgid tty executables.
      Signed-off-by: Jann Horn <jann@thejh.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      0fac33a1
    • Mans Rullgard's avatar
      serial: 8250: add uart_config entry for PORT_RT2880 · f24a06bf
      Mans Rullgard authored
      commit 3c5a0357 upstream.
      
      This adds an entry to the uart_config table for PORT_RT2880
      enabling rx/tx FIFOs.  The UART is actually a Palmchip BK-3103
      which is found in several devices from Alchemy/RMI, Ralink, and
      Sigma Designs.
      Signed-off-by: Mans Rullgard <mans@mansr.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      [ kamal: backport to 3.19-stable: file rename ]
      f24a06bf
    • Kosuke Tatsukawa's avatar
      tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c · 90d3c8a8
      Kosuke Tatsukawa authored
      commit e81107d4 upstream.
      
      My colleague ran into a program stall on an x86_64 server, where
      n_tty_read() was waiting for data even though there was data in the
      pty buffer.  The kernel stack for the stuck process looks like this:
       #0 [ffff88303d107b58] __schedule at ffffffff815c4b20
       #1 [ffff88303d107bd0] schedule at ffffffff815c513e
       #2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818
       #3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2
       #4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23
       #5 [ffff88303d107dd0] tty_read at ffffffff81368013
       #6 [ffff88303d107e20] __vfs_read at ffffffff811a3704
       #7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57
       #8 [ffff88303d107f00] sys_read at ffffffff811a4306
       #9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7
      
      There seem to be two problems causing this issue.
      
      First, in drivers/tty/n_tty.c, __receive_buf() stores the data and
      updates ldata->commit_head using smp_store_release() and then checks
      the wait queue using waitqueue_active().  However, since there is no
      memory barrier, __receive_buf() could return without calling
      wake_up_interruptible_poll(), and at the same time, n_tty_read() could
      start to wait in wait_woken() as in the following chart.
      
              __receive_buf()                         n_tty_read()
      ------------------------------------------------------------------------
      if (waitqueue_active(&tty->read_wait))
      /* Memory operations issued after the
         RELEASE may be completed before the
         RELEASE operation has completed */
                                              add_wait_queue(&tty->read_wait, &wait);
                                              ...
                                              if (!input_available_p(tty, 0)) {
      smp_store_release(&ldata->commit_head,
                        ldata->read_head);
                                              ...
                                              timeout = wait_woken(&wait,
                                                TASK_INTERRUPTIBLE, timeout);
      ------------------------------------------------------------------------
      
      The second problem is that n_tty_read() also lacks a memory barrier
      call and could also cause __receive_buf() to return without calling
      wake_up_interruptible_poll(), and n_tty_read() to wait in wait_woken()
      as in the chart below.
      
              __receive_buf()                         n_tty_read()
      ------------------------------------------------------------------------
                                              spin_lock_irqsave(&q->lock, flags);
                                              /* from add_wait_queue() */
                                              ...
                                              if (!input_available_p(tty, 0)) {
                                              /* Memory operations issued after the
                                                 RELEASE may be completed before the
                                                 RELEASE operation has completed */
      smp_store_release(&ldata->commit_head,
                        ldata->read_head);
      if (waitqueue_active(&tty->read_wait))
                                              __add_wait_queue(q, wait);
                                              spin_unlock_irqrestore(&q->lock,flags);
                                              /* from add_wait_queue() */
                                              ...
                                              timeout = wait_woken(&wait,
                                                TASK_INTERRUPTIBLE, timeout);
      ------------------------------------------------------------------------
      
      There are also other places in drivers/tty/n_tty.c which have similar
      calls to waitqueue_active(), so instead of adding many memory barrier
      calls, this patch simply removes the call to waitqueue_active(),
      leaving just wake_up*() behind.
      
      This fixes both problems because, even though the memory access before
      or after the spinlocks in both wake_up*() and add_wait_queue() can
      sneak into the critical section, it cannot go past it and the critical
      section assures that they will be serialized (please see "INTER-CPU
      ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a
      better explanation).  Moreover, the resulting code is much simpler.
      
      Latency measurement using a ping-pong test over a pty doesn't show any
      visible performance drop.
      Signed-off-by: Kosuke Tatsukawa <tatsu@ab.jp.nec.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
      90d3c8a8