1. 24 Dec, 2015 6 commits
    • Rob Gardner's avatar
      sparc64: fix FP corruption in user copy functions · a7c5724b
      Rob Gardner authored
      Short story: Exception handlers used by some copy_to_user() and
      copy_from_user() functions do not diligently clean up floating point
      register usage, and this can result in a user process seeing invalid
      values in floating point registers. This sometimes makes the process
      fail.
      
      Long story: Several cpu-specific (NG4, NG2, U1, U3) memcpy functions
      use floating point registers and VIS alignaddr/faligndata to
      accelerate data copying when source and dest addresses don't align
      well. Linux uses a lazy scheme for saving floating point registers; It
      is not done upon entering the kernel since it's a very expensive
      operation. Rather, it is done only when needed. If the kernel ends up
      not using FP regs during the course of some trap or system call, then
      it can return to user space without saving or restoring them.
      
      The various memcpy functions begin their FP code with VISEntry (or a
      variation thereof), which saves the FP regs. They conclude their FP
      code with VISExit (or a variation) which essentially marks the FP regs
      "clean", ie, they contain no unsaved values. fprs.FPRS_FEF is turned
      off so that a lazy restore will be triggered when/if the user process
      accesses floating point regs again.
      
      The bug is that the user copy variants of memcpy, copy_from_user() and
      copy_to_user(), employ an exception handling mechanism to detect faults
      when accessing user space addresses, and when this handler is invoked,
      an immediate return from the function is forced, and VISExit is not
      executed, thus leaving the fprs register in an indeterminate state,
      but often with fprs.FPRS_FEF set and one or more dirty bits. This
      results in a return to user space with invalid values in the FP regs,
      and since fprs.FPRS_FEF is on, no lazy restore occurs.
      
      This bug affects copy_to_user() and copy_from_user() for NG4, NG2,
      U3, and U1. All are fixed by using a new exception handler for those
      loads and stores that are done during the time between VISEnter and
      VISExit.
      
      n.b. In NG4memcpy, the problematic code can be triggered by a copy
      size greater than 128 bytes and an unaligned source address.  This bug
      is known to be the cause of random user process memory corruptions
      while perf is running with the callgraph option (ie, perf record -g).
      This occurs because perf uses copy_from_user() to read user stacks,
      and may fault when it follows a stack frame pointer off to an
      invalid page. Validation checks on the stack address just obscure
      the underlying problem.
      Signed-off-by: default avatarRob Gardner <rob.gardner@oracle.com>
      Signed-off-by: default avatarDave Aldridge <david.j.aldridge@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a7c5724b
    • Rob Gardner's avatar
      sparc64: Perf should save/restore fault info · 83352694
      Rob Gardner authored
      There have been several reports of random processes being killed with
      a bus error or segfault during userspace stack walking in perf.  One
      of the root causes of this problem is an asynchronous modification to
      thread_info fault_address and fault_code, which stems from a perf
      counter interrupt arriving during kernel processing of a "benign"
      fault, such as a TSB miss. Since perf_callchain_user() invokes
      copy_from_user() to read user stacks, a fault is not only possible,
      but probable. Validity checks on the stack address merely cover up the
      problem and reduce its frequency.
      
      The solution here is to save and restore fault_address and fault_code
      in perf_callchain_user() so that the benign fault handler is not
      disturbed by a perf interrupt.
      Signed-off-by: default avatarRob Gardner <rob.gardner@oracle.com>
      Signed-off-by: default avatarDave Aldridge <david.j.aldridge@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      83352694
    • Rob Gardner's avatar
      sparc64: Ensure perf can access user stacks · 3f74306a
      Rob Gardner authored
      When an interrupt (such as a perf counter interrupt) is delivered
      while executing in user space, the trap entry code puts ASI_AIUS in
      %asi so that copy_from_user() and copy_to_user() will access the
      correct memory. But if a perf counter interrupt is delivered while the
      cpu is already executing in kernel space, then the trap entry code
      will put ASI_P in %asi, and this will prevent copy_from_user() from
      reading any useful stack data in either of the perf_callchain_user_X
      functions, and thus no user callgraph data will be collected for this
      sample period. An additional problem is that a fault is guaranteed
      to occur, and though it will be silently covered up, it wastes time
      and could perturb state.
      
      In perf_callchain_user(), we ensure that %asi contains ASI_AIUS
      because we know for a fact that the subsequent calls to
      copy_from_user() are intended to read the user's stack.
      
      [ Use get_fs()/set_fs() -DaveM ]
      Signed-off-by: default avatarRob Gardner <rob.gardner@oracle.com>
      Signed-off-by: default avatarDave Aldridge <david.j.aldridge@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3f74306a
    • Rob Gardner's avatar
      sparc64: Don't set %pil in rtrap_nmi too early · 1ca04a4c
      Rob Gardner authored
      Commit 28a1f533 delays setting %pil to avoid potential
      hardirq stack overflow in the common rtrap_irq path.
      Setting %pil also needs to be delayed in the rtrap_nmi
      path for the same reason.
      Signed-off-by: default avatarRob Gardner <rob.gardner@oracle.com>
      Signed-off-by: default avatarDave Aldridge <david.j.aldridge@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1ca04a4c
    • Khalid Aziz's avatar
      sparc64: Add ADI capability to cpu capabilities · 82924e54
      Khalid Aziz authored
      Add ADI (Application Data Integrity) capability to cpu capabilities list.
      ADI capability allows virtual addresses to be encoded with a tag in
      bits 63-60. This tag serves as an access control key for the regions
      of virtual address with ADI enabled and a key set on them. Hypervisor
      encodes this capability as "adp" in "hwcap-list" property in machine
      description.
      Signed-off-by: default avatarKhalid Aziz <khalid.aziz@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      82924e54
    • Aya Mahfouz's avatar
      tty: serial: constify sunhv_ops structs · 01fd3c27
      Aya Mahfouz authored
      Constifies sunhv_ops structures in tty's serial
      driver since they are not modified after their
      initialization.
      
      Detected and found using Coccinelle.
      Suggested-by: default avatarJulia Lawall <Julia.Lawall@lip6.fr>
      Signed-off-by: default avatarAya Mahfouz <mahfouz.saif.elyazal@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      01fd3c27
  2. 23 Dec, 2015 1 commit
  3. 17 Dec, 2015 9 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 73796d8b
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix uninitialized variable warnings in nfnetlink_queue, a lot of
          people reported this...  From Arnd Bergmann.
      
       2) Don't init mutex twice in i40e driver, from Jesse Brandeburg.
      
       3) Fix spurious EBUSY in rhashtable, from Herbert Xu.
      
       4) Missing DMA unmaps in mvpp2 driver, from Marcin Wojtas.
      
       5) Fix race with work structure access in pppoe driver causing
          corruptions, from Guillaume Nault.
      
       6) Fix OOPS due to sh_eth_rx() not checking whether netdev_alloc_skb()
          actually succeeded or not, from Sergei Shtylyov.
      
       7) Don't lose flags when settifn IFA_F_OPTIMISTIC in ipv6 code, from
          Bjørn Mork.
      
       8) VXLAN_HD_RCO defined incorrectly, fix from Jiri Benc.
      
       9) Fix clock source used for cookies in SCTP, from Marcelo Ricardo
          Leitner.
      
      10) aurora driver needs HAS_DMA dependency, from Geert Uytterhoeven.
      
      11) ndo_fill_metadata_dst op of vxlan has to handle ipv6 tunneling
          properly as well, from Jiri Benc.
      
      12) Handle request sockets properly in xfrm layer, from Eric Dumazet.
      
      13) Double stats update in ipv6 geneve transmit path, fix from Pravin B
          Shelar.
      
      14) sk->sk_policy[] needs RCU protection, and as a result
          xfrm_policy_destroy() needs to free policies using an RCU grace
          period, from Eric Dumazet.
      
      15) SCTP needs to clone ipv6 tx options in order to avoid use after
          free, from Eric Dumazet.
      
      16) Missing kbuild export if ila.h, from Stephen Hemminger.
      
      17) Missing mdiobus_alloc() return value checking in mdio-mux.c, from
          Tobias Klauser.
      
      18) Validate protocol value range in ->create() methods, from Hannes
          Frederic Sowa.
      
      19) Fix early socket demux races that result in illegal dst reuse, from
          Eric Dumazet.
      
      20) Validate socket address length in pptp code, from WANG Cong.
      
      21) skb_reorder_vlan_header() uses incorrect offset and can corrupt
          packets, from Vlad Yasevich.
      
      22) Fix memory leaks in nl80211 registry code, from Ola Olsson.
      
      23) Timeout loop count handing fixes in mISDN, xgbe, qlge, sfc, and
          qlcnic.  From Dan Carpenter.
      
      24) msg.msg_iocb needs to be cleared in recvfrom() otherwise, for
          example, AF_ALG will interpret it as an async call.  From Tadeusz
          Struk.
      
      25) inetpeer_set_addr_v4 forgets to initialize the 'vif' field, from
          Eric Dumazet.
      
      26) rhashtable enforces the minimum table size not early enough,
          breaking how we calculate the per-cpu lock allocations.  From
          Herbert Xu.
      
      27) Fix FCC port lockup in 82xx driver, from Martin Roth.
      
      28) FOU sockets need to be freed using RCU, from Hannes Frederic Sowa.
      
      29) Fix out-of-bounds access in __skb_complete_tx_timestamp() and
          sock_setsockopt() wrt.  timestamp handling.  From WANG Cong.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (117 commits)
        net: check both type and procotol for tcp sockets
        drivers: net: xgene: fix Tx flow control
        tcp: restore fastopen with no data in SYN packet
        af_unix: Revert 'lock_interruptible' in stream receive code
        fou: clean up socket with kfree_rcu
        82xx: FCC: Fixing a bug causing to FCC port lock-up
        gianfar: Don't enable RX Filer if not supported
        net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration
        rhashtable: Fix walker list corruption
        rhashtable: Enforce minimum size on initial hash table
        inet: tcp: fix inetpeer_set_addr_v4()
        ipv6: automatically enable stable privacy mode if stable_secret set
        net: fix uninitialized variable issue
        bluetooth: Validate socket address length in sco_sock_bind().
        net_sched: make qdisc_tree_decrease_qlen() work for non mq
        ser_gigaset: remove unnecessary kfree() calls from release method
        ser_gigaset: fix deallocation of platform device structure
        ser_gigaset: turn nonsense checks into WARN_ON
        ser_gigaset: fix up NULL checks
        qlcnic: fix a timeout loop
        ...
      73796d8b
    • WANG Cong's avatar
      net: check both type and procotol for tcp sockets · ac5cc977
      WANG Cong authored
      Dmitry reported the following out-of-bound access:
      
      Call Trace:
       [<ffffffff816cec2e>] __asan_report_load4_noabort+0x3e/0x40
      mm/kasan/report.c:294
       [<ffffffff84affb14>] sock_setsockopt+0x1284/0x13d0 net/core/sock.c:880
       [<     inline     >] SYSC_setsockopt net/socket.c:1746
       [<ffffffff84aed7ee>] SyS_setsockopt+0x1fe/0x240 net/socket.c:1729
       [<ffffffff85c18c76>] entry_SYSCALL_64_fastpath+0x16/0x7a
      arch/x86/entry/entry_64.S:185
      
      This is because we mistake a raw socket as a tcp socket.
      We should check both sk->sk_type and sk->sk_protocol to ensure
      it is a tcp socket.
      
      Willem points out __skb_complete_tx_timestamp() needs to fix as well.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ac5cc977
    • Iyappan Subramanian's avatar
      drivers: net: xgene: fix Tx flow control · 67894eec
      Iyappan Subramanian authored
      Currently the Tx flow control is based on reading the hardware state,
      which is not accurate since it may not reflect the descriptors that
      are not yet reached the memory.
      
      To accurately control the Tx flow, changing it to be software based.
      Signed-off-by: default avatarIyappan Subramanian <isubramanian@apm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      67894eec
    • Eric Dumazet's avatar
      tcp: restore fastopen with no data in SYN packet · 07e100f9
      Eric Dumazet authored
      Yuchung tracked a regression caused by commit 57be5bda ("ip: convert
      tcp_sendmsg() to iov_iter primitives") for TCP Fast Open.
      
      Some Fast Open users do not actually add any data in the SYN packet.
      
      Fixes: 57be5bda ("ip: convert tcp_sendmsg() to iov_iter primitives")
      Reported-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Acked-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      07e100f9
    • Rainer Weikusat's avatar
      af_unix: Revert 'lock_interruptible' in stream receive code · 3822b5c2
      Rainer Weikusat authored
      With b3ca9b02, the AF_UNIX SOCK_STREAM
      receive code was changed from using mutex_lock(&u->readlock) to
      mutex_lock_interruptible(&u->readlock) to prevent signals from being
      delayed for an indefinite time if a thread sleeping on the mutex
      happened to be selected for handling the signal. But this was never a
      problem with the stream receive code (as opposed to its datagram
      counterpart) as that never went to sleep waiting for new messages with the
      mutex held and thus, wouldn't cause secondary readers to block on the
      mutex waiting for the sleeping primary reader. As the interruptible
      locking makes the code more complicated in exchange for no benefit,
      change it back to using mutex_lock.
      Signed-off-by: default avatarRainer Weikusat <rweikusat@mobileactivedefense.com>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3822b5c2
    • Linus Torvalds's avatar
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · ce42af94
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Some i915 fixes, one omap fix, one core regression fix.
      
        Not even enough fixes for a twelve days of xmas song, which seemms
        good"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm: Don't overwrite UNVERFIED mode status to OK
        drm/omap: fix fbdev pix format to support all platforms
        drm/i915: Do a better job at disabling primary plane in the noatomic case.
        drm/i915/skl: Double RC6 WRL always on
        drm/i915/skl: Disable coarse power gating up until F0
        drm/i915: Remove incorrect warning in context cleanup
      ce42af94
    • Will Deacon's avatar
      locking/osq: Fix ordering of node initialisation in osq_lock · b4b29f94
      Will Deacon authored
      The Cavium guys reported a soft lockup on their arm64 machine, caused by
      commit c55a6ffa ("locking/osq: Relax atomic semantics"):
      
          mutex_optimistic_spin+0x9c/0x1d0
          __mutex_lock_slowpath+0x44/0x158
          mutex_lock+0x54/0x58
          kernfs_iop_permission+0x38/0x70
          __inode_permission+0x88/0xd8
          inode_permission+0x30/0x6c
          link_path_walk+0x68/0x4d4
          path_openat+0xb4/0x2bc
          do_filp_open+0x74/0xd0
          do_sys_open+0x14c/0x228
          SyS_openat+0x3c/0x48
          el0_svc_naked+0x24/0x28
      
      This is because in osq_lock we initialise the node for the current CPU:
      
          node->locked = 0;
          node->next = NULL;
          node->cpu = curr;
      
      and then publish the current CPU in the lock tail:
      
          old = atomic_xchg_acquire(&lock->tail, curr);
      
      Once the update to lock->tail is visible to another CPU, the node is
      then live and can be both read and updated by concurrent lockers.
      
      Unfortunately, the ACQUIRE semantics of the xchg operation mean that
      there is no guarantee the contents of the node will be visible before
      lock tail is updated.  This can lead to lock corruption when, for
      example, a concurrent locker races to set the next field.
      
      Fixes: c55a6ffa ("locking/osq: Relax atomic semantics"):
      Reported-by: default avatarDavid Daney <ddaney@caviumnetworks.com>
      Reported-by: default avatarAndrew Pinski <andrew.pinski@caviumnetworks.com>
      Tested-by: default avatarAndrew Pinski <andrew.pinski@caviumnetworks.com>
      Acked-by: default avatarDavidlohr Bueso <dave@stgolabs.net>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1449856001-21177-1-git-send-email-will.deacon@arm.comSigned-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b4b29f94
    • Linus Torvalds's avatar
      Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · d7637d01
      Linus Torvalds authored
      Pull libnvdimm fixes from Dan Williams:
      
       - Two bug fixes for misuse of PAGE_MASK in scatterlist and dma-debug.
         These are tagged for -stable.  The scatterlist impact is potentially
        corrupted dma addresses on HIGHMEM enabled platforms.
      
       - A minor locking fix for the NFIT hot-add implementation that is new
         in 4.4-rc.  This would only trigger in the case a hot-add raced
         driver removal.
      
      * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        dma-debug: Fix dma_debug_entry offset calculation
        Revert "scatterlist: use sg_phys()"
        nfit: acpi_nfit_notify(): Do not leave device locked
      d7637d01
    • Hannes Frederic Sowa's avatar
      fou: clean up socket with kfree_rcu · 3036facb
      Hannes Frederic Sowa authored
      fou->udp_offloads is managed by RCU. As it is actually included inside
      the fou sockets, we cannot let the memory go out of scope before a grace
      period. We either can synchronize_rcu or switch over to kfree_rcu to
      manage the sockets. kfree_rcu seems appropriate as it is used by vxlan
      and geneve.
      
      Fixes: 23461551 ("fou: Support for foo-over-udp RX path")
      Cc: Tom Herbert <tom@herbertland.com>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3036facb
  4. 16 Dec, 2015 10 commits
  5. 15 Dec, 2015 14 commits