1. 15 Apr, 2015 26 commits
    • IB/ipoib: Use one linear skb in RX flow · a44878d1
      Erez Shitrit authored
      The current code in the RX flow uses two sg entries for each incoming
      packet: the first for the IB headers and the second for the rest of
      the data. That costs two dma map/unmap operations and two allocations
      per packet, plus a few more actions in the data path.
      
      Use a single linear skb for each incoming packet, holding both the IB
      headers and the payload. That reduces packet processing in the data
      path (only one skb, no frags since the first frag was not used anyway,
      fewer memory allocations) and the dma handling (one dma map/unmap per
      incoming packet instead of two).
      
      After commit 73d3fe6d ("gro: fix aggregation for skb using frag_list") from
      Eric Dumazet, we will get full aggregation for large packets.
      
      Bandwidth tests run over the card's numa node with
      "netperf -H 1.1.1.3 -T -t TCP_STREAM" show ~12Gb/s before this change
      and ~16Gb/s after it on my setup (Mellanox's ConnectX3).
      Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/ipoib: drop mcast_mutex usage · 1c0453d6
      Doug Ledford authored
      We needed the mcast_mutex when we had to prevent the join completion
      callback from having the value it stored in mcast->mc overwritten
      by a delayed return from ib_sa_join_multicast.  By storing the return
      of ib_sa_join_multicast in an intermediate variable, we prevent a
      delayed return from ib_sa_join_multicast overwriting the valid
      contents of mcast->mc, and we no longer need a mutex to force the
      join callback to run after the return of ib_sa_join_multicast.  This
      allows us to do away with the mutex entirely and protect our critical
      sections with just a spinlock instead.  This is highly desirable
      as there were some places where we couldn't use a mutex because the
      code was not allowed to sleep, and so we were using a mix of mutex
      and spinlock to protect what we needed to protect.  Now we only
      have a spinlock and the locking complexity is greatly reduced.
      Signed-off-by: Doug Ledford <dledford@redhat.com>
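The intermediate-variable idea in the commit above can be illustrated outside the kernel. What follows is a minimal single-threaded C sketch under stated assumptions, not the driver's code: `struct group`, `fake_join`, `join_done`, `join_naive` and `join_checked` are hypothetical names, and the `cb_ran` flag is an assumed way for the caller to learn that the callback has already stored its result. The real driver would make this decision while holding its spinlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the IPoIB driver. */
struct member { int id; };

struct group {
    struct member *mc;  /* published membership, NULL if none */
    bool cb_ran;        /* set once the completion callback has run */
};

static struct member the_member = { 1 };

/* Completion callback: stores the final state of the join in g->mc. */
static void join_done(struct group *g, struct member *m, int status)
{
    g->mc = (status == 0) ? m : NULL;
    g->cb_ran = true;
}

/* Simulated join. With early_cb set, the callback fires before the call
 * returns, which models the "delayed return" case from the commit text. */
static struct member *fake_join(struct group *g, int status, bool early_cb)
{
    if (early_cb)
        join_done(g, &the_member, status);
    return &the_member;
}

/* Naive caller: the late return value overwrites what the callback stored. */
static void join_naive(struct group *g, int status, bool early_cb)
{
    g->mc = fake_join(g, status, early_cb);
}

/* Caller with an intermediate variable: publish the return value only if
 * the callback has not already decided what g->mc should be. */
static void join_checked(struct group *g, int status, bool early_cb)
{
    struct member *tmp = fake_join(g, status, early_cb);

    if (!g->cb_ran)
        g->mc = tmp;
}
```

With a failing join and an early callback, `join_naive` leaves a stale pointer in `mc`, while `join_checked` keeps the NULL that the callback stored.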
    • IB/ipoib: deserialize multicast joins · d2fe937c
      Doug Ledford authored
      Allow the ipoib layer to attempt to join all outstanding multicast
      groups at once.  The ib_sa layer will serialize multiple attempts to
      join the same group, but will process attempts to join different groups
      in parallel.  Take advantage of that.
      
      In order to make this happen, change the mcast_join_thread to loop
      through all needed joins, sending a join request for each one that we
      still need to join.  There are a few special cases we handle though:
      
      1) Don't attempt to join anything but the broadcast group until the join
      of the broadcast group has succeeded.
      2) No longer restart the join task at the end of completion handling.
      If we completed successfully, we are done.  The join task now needs to
      be kicked by mcast_send, mcast_restart_task, or mcast_start_thread, and
      should not need to be started at any other time except when scheduling
      a backoff attempt to rejoin.
      3) No longer use separate join/completion routines for regular and
      sendonly joins, pass them all through the same routine and just do the
      right thing based on the SENDONLY join flag.
      4) Only try to join a SENDONLY join twice, then drop the packets and
      quit trying.  We leave the mcast group in the list so that if we get a
      new packet, all that we have to do is queue up the packet and restart
      the join task and it will automatically try to join twice and then
      either send or flush the queue again.
      Signed-off-by: Doug Ledford <dledford@redhat.com>
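The special cases listed in the commit above can be sketched as one pass of a join loop. This is a simplified userspace illustration, not the driver's `mcast_join_thread`: `struct group`, `join_pass` and `SENDONLY_MAX_ATTEMPTS` are hypothetical names, "sending a join" is reduced to bumping a counter, and the sketch models neither backoff scheduling nor resetting the attempt count when a new packet arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a multicast group entry. */
struct group {
    bool joined;    /* join has completed successfully */
    bool sendonly;  /* stands in for the SENDONLY join flag */
    int attempts;   /* join requests issued so far */
};

#define SENDONLY_MAX_ATTEMPTS 2

/* One pass over all groups; groups[0] is assumed to be the broadcast group.
 * Returns the number of join requests issued in this pass. */
static size_t join_pass(struct group *groups, size_t n)
{
    size_t sent = 0;

    if (n == 0)
        return 0;

    /* Case 1: nothing else is attempted until broadcast has joined. */
    if (!groups[0].joined) {
        groups[0].attempts++;
        return 1;
    }

    /* Issue every outstanding join in the same pass, not one at a time. */
    for (size_t i = 1; i < n; i++) {
        struct group *g = &groups[i];

        if (g->joined)
            continue;
        /* Case 4: a sendonly group is only tried twice; after that its
         * queued packets would be dropped and the entry left in the list. */
        if (g->sendonly && g->attempts >= SENDONLY_MAX_ATTEMPTS)
            continue;
        g->attempts++;
        sent++;
    }
    return sent;
}
```

Regular and sendonly groups go through the same routine here and differ only in the flag check, which mirrors case 3 of the commit message.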
    • IB/ipoib: fix MCAST_FLAG_BUSY usage · 69911416
      Doug Ledford authored
      Commit a9c8ba58 ("IPoIB: Fix usage of uninitialized multicast
      objects") added a new flag MCAST_JOIN_STARTED, but was not very strict
      in how it was used.  We didn't always initialize the completion struct
      before we set the flag, and we didn't always call complete on the
      completion struct from all paths that complete it.  And when we did
      complete it, sometimes we continued to touch the mcast entry after
      the completion, opening us up to possible use after free issues.
      
      This made it less than totally effective, and certainly made its use
      confusing.  And in the flush function we would use the presence of this
      flag to signal that we should wait on the completion struct, but we never
      cleared this flag, ever.
      
      In order to make things clearer and aid in resolving the rtnl deadlock
      bug I've been chasing, I cleaned this up a bit.
      
       1) Remove the MCAST_JOIN_STARTED flag entirely
       2) Change MCAST_FLAG_BUSY so it now only means a join is in-flight
       3) Test mcast->mc directly to see if we have completed
          ib_sa_join_multicast (using IS_ERR_OR_NULL)
       4) Make sure that before setting MCAST_FLAG_BUSY we always initialize
          the mcast->done completion struct
       5) Make sure that before calling complete(&mcast->done), we always clear
          the MCAST_FLAG_BUSY bit
       6) Take the mcast_mutex before we call ib_sa_join_multicast and also
          take the mutex in our join callback.  This forces
          ib_sa_join_multicast to return and set mcast->mc before we process
          the callback.  This way, our callback can safely clear mcast->mc
          if there is an error on the join and we will do the right thing as
          a result in mcast_dev_flush.
       7) Because we need the mutex to synchronize mcast->mc, we can no
          longer call mcast_sendonly_join directly from mcast_send and
          instead must add sendonly join processing to the mcast_join_task
       8) Make MCAST_RUN mean that we have a working mcast subsystem, not that
          we have a running task.  We know when we need to reschedule our
          join task thread and don't need a flag to tell us.
       9) Add a helper for rescheduling the join task thread
      
      A number of different races are resolved with these changes.  These
      races existed with the old MCAST_FLAG_BUSY usage, the
      MCAST_JOIN_STARTED flag was an attempt to address them, and while it
      helped, a determined effort could still trip things up.
      
      One race looks something like this:
      
      Thread 1                             Thread 2
      ib_sa_join_multicast (as part of running restart mcast task)
        alloc member
        call callback
                                           ifconfig ib0 down
                                           wait_for_completion
          callback call completes
                                           wait_for_completion in
                                           mcast_dev_flush completes
                                             mcast->mc is PTR_ERR_OR_NULL
                                             so we skip ib_sa_leave_multicast
          return from callback
        return from ib_sa_join_multicast
      set mcast->mc = return from ib_sa_join_multicast
      
      We now have a permanently unbalanced join/leave issue that trips up the
      refcounting in core/multicast.c
      
      Another like this:
      
      Thread 1                   Thread 2         Thread 3
      ib_sa_join_multicast
                                                  ifconfig ib0 down
                                                  priv->broadcast = NULL
                                 join_complete
                                                  wait_for_completion
                                 mcast->mc is not yet set, so don't clear
      return from ib_sa_join_multicast and set mcast->mc
                                 complete
                                 return -EAGAIN (making mcast->mc invalid)
                                                  call ib_sa_multicast_leave
                                                  on invalid mcast->mc, hang
                                                  forever
      
      By holding the mutex around ib_sa_join_multicast and taking the mutex
      early in the callback, we force mcast->mc to be valid at the time we
      run the callback.  This allows us to clear mcast->mc if there is an
      error and the join is going to fail.  We do this before we complete
      the mcast.  In this way, mcast_dev_flush always sees consistent state
      with regard to mcast->mc membership at the time that
      wait_for_completion() returns.
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/ipoib: No longer use flush as a parameter · efc82eee
      Doug Ledford authored
      Various places in the IPoIB code had a deadlock related to flushing
      the ipoib workqueue.  Now that we have per device workqueues and a
      specific flush workqueue, there is no longer a deadlock issue with
      flushing the device specific workqueues and we can do so unilaterally.
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/ipoib: Use dedicated workqueues per interface · 0b39578b
      Doug Ledford authored
      During my recent work on the rtnl lock deadlock in the IPoIB driver, I
      saw that even once I fixed the apparent races for a single device, as
      soon as that device had any children, new races popped up.  It turns
      out that this is because no matter how well we protect against races
      on a single device, the fact that all devices use the same workqueue,
      and flush_workqueue() flushes *everything* from that workqueue means
      that we would also have to prevent all races between different devices
      (for instance, ipoib_mcast_restart_task on interface ib0 can race with
      ipoib_mcast_flush_dev on interface ib0.8002, resulting in a deadlock on
      the rtnl_lock).
      
      There are several possible solutions to this problem:
      
      Make carrier_on_task and mcast_restart_task try to take the rtnl for
      some set period of time and if they fail, then bail.  This runs the
      real risk of dropping work on the floor, which can end up being its
      own separate kind of deadlock.
      
      Set some global flag in the driver that says some device is in the
      middle of going down, letting all tasks know to bail.  Again, this can
      drop work on the floor.
      
      Or the method this patch attempts to use, which is when we bring an
      interface up, create a workqueue specifically for that interface, so
      that when we take it back down, we are flushing only those tasks
      associated with our interface.  In addition, keep the global
      workqueue, but now limit it to only flush tasks.  In this way, the
      flush tasks can always flush the device specific work queues without
      having deadlock issues.
      Signed-off-by: Doug Ledford <dledford@redhat.com>
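The effect of giving each interface its own queue can be shown with a toy model. The sketch below is plain userspace C and only an illustration: `struct wq`, `wq_queue`, `wq_flush`, `struct iface` and `iface_down` are hypothetical names, and unlike the kernel's flush_workqueue(), which waits for queued work to finish, this toy flush simply runs the pending items inline.

```c
#include <assert.h>
#include <stddef.h>

#define WQ_MAX 8

typedef void (*work_fn)(void *arg);

struct work { work_fn fn; void *arg; };

/* Toy work queue: a fixed array of pending work items. */
struct wq {
    struct work items[WQ_MAX];
    size_t n;
};

/* Hypothetical per-interface state: each interface owns its own queue. */
struct iface {
    const char *name;
    struct wq wq;
    int joins_run;
};

/* Queue one item; returns 0 on success, -1 if the queue is full. */
static int wq_queue(struct wq *q, work_fn fn, void *arg)
{
    if (q->n == WQ_MAX)
        return -1;
    q->items[q->n++] = (struct work){ fn, arg };
    return 0;
}

/* Run and drain everything pending on this one queue only. */
static size_t wq_flush(struct wq *q)
{
    size_t ran = 0;

    for (size_t i = 0; i < q->n; i++) {
        q->items[i].fn(q->items[i].arg);
        ran++;
    }
    q->n = 0;
    return ran;
}

/* Example work item: counts how often it ran for its interface. */
static void join_work(void *arg)
{
    ((struct iface *)arg)->joins_run++;
}

/* Taking an interface down flushes only that interface's queue. */
static size_t iface_down(struct iface *dev)
{
    return wq_flush(&dev->wq);
}
```

Taking a child interface down in this model never touches work queued for its parent, which is the isolation the commit is after; with a single shared queue the same flush would have run both.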
    • IB/ipoib: Make the carrier_on_task race aware · 894021a7
      Doug Ledford authored
      We blindly assume that we can just take the rtnl lock and that will
      prevent races with downing this interface.  Unfortunately, that's not
      the case.  In ipoib_mcast_stop_thread() we will call flush_workqueue()
      in an attempt to clear out all remaining instances of ipoib_join_task.
      But, since this task is put on the same workqueue as the join task,
      the flush_workqueue waits on this thread too.  But this thread is
      deadlocked on the rtnl lock.  The better thing here is to use trylock
      and loop on that until we either get the lock or we see that
      FLAG_OPER_UP has been cleared, in which case we don't need to do
      anything anyway and we just return.
      
      While investigating which flag should be used, FLAG_ADMIN_UP or
      FLAG_OPER_UP, it was determined that FLAG_OPER_UP was the more
      appropriate flag to use.  However, there was a mix of these two flags in
      use in the existing code.  So while we check for that flag here as part
      of this race fix, also clean up the two places that had used the less
      appropriate flag for their tests.
      Signed-off-by: Doug Ledford <dledford@redhat.com>
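The trylock-and-bail loop described in the commit above has roughly the following shape. This is a userspace sketch using C11 atomics rather than kernel code: `big_lock` stands in for the rtnl lock, `oper_up` for FLAG_OPER_UP, and `big_trylock`, `big_unlock` and `lock_or_bail` are hypothetical helpers. A real task would presumably sleep or yield between attempts instead of spinning.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag big_lock = ATOMIC_FLAG_INIT;  /* stand-in for the rtnl lock */
static atomic_bool oper_up = true;               /* stand-in for FLAG_OPER_UP */

/* Try once to take the lock; never blocks. */
static bool big_trylock(void)
{
    return !atomic_flag_test_and_set(&big_lock);
}

static void big_unlock(void)
{
    atomic_flag_clear(&big_lock);
}

/* Loop on trylock until we either hold the lock (returns true) or see that
 * the interface is no longer operationally up (returns false, lock not
 * held). In the second case there is nothing left for the task to do. */
static bool lock_or_bail(void)
{
    while (!big_trylock()) {
        if (!atomic_load(&oper_up))
            return false;
        /* a real task would back off here before retrying */
    }
    return true;
}
```

Because the task never blocks on the lock, a flush that is waiting for it while holding the lock can still make progress once the flag is cleared.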
    • IB/ipoib: Consolidate rtnl_lock tasks in workqueue · c84ca6d2
      Doug Ledford authored
      The ipoib_mcast_flush_dev routine is called with the rtnl_lock held and
      needs to keep it held.  It also needs to call flush_workqueue() to flush
      out any outstanding work.  In the past, we've had to try and make sure
      that we didn't flush out any outstanding join completions because they
      also wanted to grab rtnl_lock() and that would deadlock.  It turns out
      that the only thing in the join completion handler that needs this lock
      can be safely moved to our carrier_on_task, thereby reducing the
      potential for the join completion code and the flush code to deadlock
      against each other.
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/ipoib: change init sequence ordering · be7aa663
      Doug Ledford authored
      In preparation for using per device work queues, we need to move the
      start of the neighbor thread task to after ipoib_ib_dev_init and move
      the destruction of the neighbor task to before ipoib_ib_dev_cleanup.
      Otherwise we will end up freeing our workqueue with work possibly
      still on it.
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/ipoib: factor out ah flushing · e135106f
      Doug Ledford authored
      Create ipoib_flush_ah and ipoib_stop_ah routines to use at
      appropriate times to flush out all remaining ah entries before we shut
      the device down.
      
      Because neighbors and mcast entries can each have a reference on any
      given ah, we must make sure to free all of those first before our ah
      will actually have a 0 refcount and be able to be reaped.
      
      This factoring is needed in preparation for having per-device work
      queues.  The original per-device workqueue code resulted in the following
      error message:
      
      <ibdev>: ib_dealloc_pd failed
      
      That error was tracked down to this issue.  With the changes to which
      workqueues were flushed when, there were no flushes of the per device
      workqueue after the last ah's were freed, resulting in an attempt to
      dealloc the pd with outstanding resources still allocated.  This code
      puts the explicit flushes in the needed places to avoid that problem.
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • Merge branch 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild · c841e12a
      Linus Torvalds authored
      Pull kconfig updates from Michal Marek:
       "Here is the kconfig stuff for v4.1-rc1:
      
         - fixes for mergeconfig (used by make kvmconfig/tinyconfig)
      
         - header cleanup
      
         - make -s *config is silent now"
      
      * 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
        kconfig: Do not print status messages in make -s mode
        kconfig: Simplify Makefile
        kbuild: add generic mergeconfig target, %.config
        merge_config.sh: rename MAKE to RUNMAKE
        merge_config.sh: improve indentation
        kbuild: mergeconfig: remove redundant $(objtree)
        kbuild: mergeconfig: move an error check to merge_config.sh
        kbuild: mergeconfig: fix "jobserver unavailable" warning
        kconfig: Remove unnecessary prototypes from headers
        kconfig: Remove dead code
        kconfig: Get rid of the P() macro in headers
        kconfig: fix a misspelling in scripts/kconfig/merge_config.sh
    • Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild · b422b758
      Linus Torvalds authored
      Pull kbuild updates from Michal Marek:
       "Here is the first round of kbuild changes for v4.1-rc1:
      
         - kallsyms fix for ARM and cleanup
      
         - make dep(end) removed (developers have no sense of nostalgia these
           days...)
      
         - include Makefiles by relative path
      
         - stop useless rebuilds of asm-offsets.h and bounds.h"
      
      * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
        Kbuild: kallsyms: drop special handling of pre-3.0 GCC symbols
        Kbuild: kallsyms: ignore veneers emitted by the ARM linker
        kbuild: ia64: use $(src)/Makefile.gate rather than particular path
        kbuild: include $(src)/Makefile rather than $(obj)/Makefile
        kbuild: use relative path more to include Makefile
        kbuild: use relative path to include Makefile
        kbuild: do not add $(bounds-file) and $(offsets-file) to targets
        kbuild: remove warning about "make depend"
        kbuild: Don't reset timestamps in include/generated if not needed
    • Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · d488d3a4
      Linus Torvalds authored
      Pull security subsystem updates from James Morris:
       "Highlights for this window:
      
         - improved AVC hashing for SELinux by John Brooks and Stephen Smalley
      
         - addition of an unconfined label to Smack
      
         - Smack documentation update
      
         - TPM driver updates"
      
      * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (28 commits)
        lsm: copy comm before calling audit_log to avoid race in string printing
        tomoyo: Do not generate empty policy files
        tomoyo: Use if_changed when generating builtin-policy.h
        tomoyo: Use bin2c to generate builtin-policy.h
        selinux: increase avtab max buckets
        selinux: Use a better hash function for avtab
        selinux: convert avtab hash table to flex_array
        selinux: reconcile security_netlbl_secattr_to_sid() and mls_import_netlbl_cat()
        selinux: remove unnecessary pointer reassignment
        Smack: Updates for Smack documentation
        tpm/st33zp24/spi: Add missing device table for spi phy.
        tpm/st33zp24: Add proper wait for ordinal duration in case of irq mode
        smack: Fix gcc warning from unused smack_syslog_lock mutex in smackfs.c
        Smack: Allow an unconfined label in bringup mode
        Smack: getting the Smack security context of keys
        Smack: Assign smack_known_web as default smk_in label for kernel thread's socket
        tpm/tpm_infineon: Use struct dev_pm_ops for power management
        MAINTAINERS: Add Jason as designated reviewer for TPM
        tpm: Update KConfig text to include TPM2.0 FIFO chips
        tpm/st33zp24/dts/st33zp24-spi: Add dts documentation for st33zp24 spi phy
        ...
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · cb906953
      Linus Torvalds authored
      Pull crypto update from Herbert Xu:
       "Here is the crypto update for 4.1:
      
        New interfaces:
         - user-space interface for AEAD
         - user-space interface for RNG (i.e., pseudo RNG)
      
        New hashes:
         - ARMv8 SHA1/256
         - ARMv8 AES
         - ARMv8 GHASH
         - ARM assembler and NEON SHA256
         - MIPS OCTEON SHA1/256/512
         - MIPS img-hash SHA1/256 and MD5
         - Power 8 VMX AES/CBC/CTR/GHASH
         - PPC assembler AES, SHA1/256 and MD5
         - Broadcom IPROC RNG driver
      
        Cleanups/fixes:
         - prevent internal helper algos from being exposed to user-space
         - merge common code from assembly/C SHA implementations
         - misc fixes"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (169 commits)
        crypto: arm - workaround for building with old binutils
        crypto: arm/sha256 - avoid sha256 code on ARMv7-M
        crypto: x86/sha512_ssse3 - move SHA-384/512 SSSE3 implementation to base layer
        crypto: x86/sha256_ssse3 - move SHA-224/256 SSSE3 implementation to base layer
        crypto: x86/sha1_ssse3 - move SHA-1 SSSE3 implementation to base layer
        crypto: arm64/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer
        crypto: arm64/sha1-ce - move SHA-1 ARMv8 implementation to base layer
        crypto: arm/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer
        crypto: arm/sha256 - move SHA-224/256 ASM/NEON implementation to base layer
        crypto: arm/sha1-ce - move SHA-1 ARMv8 implementation to base layer
        crypto: arm/sha1_neon - move SHA-1 NEON implementation to base layer
        crypto: arm/sha1 - move SHA-1 ARM asm implementation to base layer
        crypto: sha512-generic - move to generic glue implementation
        crypto: sha256-generic - move to generic glue implementation
        crypto: sha1-generic - move to generic glue implementation
        crypto: sha512 - implement base layer for SHA-512
        crypto: sha256 - implement base layer for SHA-256
        crypto: sha1 - implement base layer for SHA-1
        crypto: api - remove instance when test failed
        crypto: api - Move alg ref count init to crypto_check_alg
        ...
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next · 6c373ca8
      Linus Torvalds authored
      Pull networking updates from David Miller:
      
       1) Add BQL support to via-rhine, from Tino Reichardt.
      
       2) Integrate SWITCHDEV layer support into the DSA layer, so DSA drivers
          can support hw switch offloading.  From Florian Fainelli.
      
       3) Allow 'ip address' commands to initiate multicast group join/leave,
          from Madhu Challa.
      
       4) Many ipv4 FIB lookup optimizations from Alexander Duyck.
      
       5) Support EBPF in cls_bpf classifier and act_bpf action, from Daniel
          Borkmann.
      
       6) Remove the ugly compat support in ARP for ugly layers like ax25,
          rose, etc.  And use this to clean up the neigh layer, then use it to
          implement MPLS support.  All from Eric Biederman.
      
       7) Support L3 forwarding offloading in switches, from Scott Feldman.
      
       8) Collapse the LOCAL and MAIN ipv4 FIB tables when possible, to speed
          up route lookups even further.  From Alexander Duyck.
      
       9) Many improvements and bug fixes to the rhashtable implementation,
          from Herbert Xu and Thomas Graf.  In particular, in the case where
          an rhashtable user bulk adds a large number of items into an empty
          table, we expand the table much more sanely.
      
      10) Don't make the tcp_metrics hash table per-namespace, from Eric
          Biederman.
      
      11) Extend EBPF to access SKB fields, from Alexei Starovoitov.
      
      12) Split out new connection request sockets so that they can be
          established in the main hash table.  Much less false sharing since
          hash lookups go direct to the request sockets instead of having to
          go first to the listener then to the request socks hashed
          underneath.  From Eric Dumazet.
      
      13) Add async I/O support for crypto AF_ALG sockets, from Tadeusz Struk.
      
      14) Support stable privacy address generation for RFC7217 in IPV6.  From
          Hannes Frederic Sowa.
      
      15) Hash network namespace into IP frag IDs, also from Hannes Frederic
          Sowa.
      
      16) Convert PTP get/set methods to use 64-bit time, from Richard
          Cochran.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1816 commits)
        fm10k: Bump driver version to 0.15.2
        fm10k: corrected VF multicast update
        fm10k: mbx_update_max_size does not drop all oversized messages
        fm10k: reset head instead of calling update_max_size
        fm10k: renamed mbx_tx_dropped to mbx_tx_oversized
        fm10k: update xcast mode before synchronizing multicast addresses
        fm10k: start service timer on probe
        fm10k: fix function header comment
        fm10k: comment next_vf_mbx flow
        fm10k: don't handle mailbox events in iov_event path and always process mailbox
        fm10k: use separate workqueue for fm10k driver
        fm10k: Set PF queues to unlimited bandwidth during virtualization
        fm10k: expose tx_timeout_count as an ethtool stat
        fm10k: only increment tx_timeout_count in Tx hang path
        fm10k: remove extraneous "Reset interface" message
        fm10k: separate PF only stats so that VF does not display them
        fm10k: use hw->mac.max_queues for stats
        fm10k: only show actual queues, not the maximum in hardware
        fm10k: allow creation of VLAN on default vid
        fm10k: fix unused warnings
        ...
    • Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm · bb0fd7ab
      Linus Torvalds authored
      Pull ARM updates from Russell King:
       "Included in this update are both some long term fixes and some new
        features.
      
        Fixes:
      
         - An integer overflow in the calculation of ELF_ET_DYN_BASE.
      
         - Avoiding OOMs for high-order IOMMU allocations
      
         - SMP requires the data cache to be enabled for synchronisation
           primitives to work, so prevent the CPU_DCACHE_DISABLE option being
           visible on SMP builds.
      
         - A bug going back 10+ years in the noMMU ARM94* CPU support code,
           where it corrupts registers.  Found by folk getting Linux running
           on their cameras.
      
         - Versatile Express needs an errata workaround enabled for CPU
           hot-unplug to work.
      
        Features:
      
         - Clean up module linker by handling out of range relocations
           separately from relocation cases we don't handle.
      
         - Fix a long term bug in the pci_mmap_page_range() code, which we
           hope won't impact userspace (we hope there's no users of the
           existing broken interface.)
      
         - Don't map DMA coherent allocations when we don't have a MMU.
      
         - Drop experimental status for SMP_ON_UP.
      
         - Warn when DT doesn't specify ePAPR mandatory cache properties.
      
         - Add documentation concerning how we find the start of physical
           memory for AUTO_ZRELADDR kernels, detailing why we have chosen the
           mask and the implications of changing it.
      
         - Updates from Ard Biesheuvel to address some issues with large
           kernels (such as allyesconfig) failing to link.
      
         - Allow hibernation to work on modern (ARMv7) CPUs - this appears to
           have never worked in the past on these CPUs.
      
         - Enable IRQ_SHOW_LEVEL, which changes the /proc/interrupts output
           format (hopefully without userspace breaking...  let's hope that if
           it causes someone a problem, they tell us.)
      
         - Fix tegra-ahb DT offsets.
      
         - Rework ARM errata 643719 code (and the ARMv7 flush_cache_louis()/
           flush_dcache_all() code) to be more efficient, and enable this
           errata workaround by default for ARMv7+SMP CPUs.  This complements
           the Versatile Express fix above.
      
         - Rework ARMv7 context code for errata 430973, so that only Cortex A8
           CPUs are impacted by the branch target buffer flush when this
           errata is enabled.  Also update the help text to indicate that all
           r1p* A8 CPUs are impacted.
      
         - Switch ARM to the generic show_mem() implementation, it conveys all
           the information which we were already reporting.
      
         - Prevent slow timer sources being used for udelay() - timers running
           at less than 1MHz are not useful for this, and can cause udelay()
           to return immediately, without any wait.  Using such a slow timer
           is silly.
      
         - VDSO support for 32-bit ARM, mainly for gettimeofday() using the
           ARM architected timer.
      
         - Perf support for Scorpion performance monitoring units"
      
      vdso semantic conflict fixed up as per linux-next.
      
      * 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (52 commits)
        ARM: update errata 430973 documentation to cover Cortex A8 r1p*
        ARM: ensure delay timer has sufficient accuracy for delays
        ARM: switch to use the generic show_mem() implementation
        ARM: proc-v7: avoid errata 430973 workaround for non-Cortex A8 CPUs
        ARM: enable ARM errata 643719 workaround by default
        ARM: cache-v7: optimise test for Cortex A9 r0pX devices
        ARM: cache-v7: optimise branches in v7_flush_cache_louis
        ARM: cache-v7: consolidate initialisation of cache level index
        ARM: cache-v7: shift CLIDR to extract appropriate field before masking
        ARM: cache-v7: use movw/movt instructions
        ARM: allow 16-bit instructions in ALT_UP()
        ARM: proc-arm94*.S: fix setup function
        ARM: vexpress: fix CPU hotplug with CT9x4 tile.
        ARM: 8276/1: Make CPU_DCACHE_DISABLE depend on !SMP
        ARM: 8335/1: Documentation: DT bindings: Tegra AHB: document the legacy base address
        ARM: 8334/1: amba: tegra-ahb: detect and correct bogus base address
        ARM: 8333/1: amba: tegra-ahb: fix register offsets in the macros
        ARM: 8339/1: Enable CONFIG_GENERIC_IRQ_SHOW_LEVEL
        ARM: 8338/1: kexec: Relax SMP validation to improve DT compatibility
        ARM: 8337/1: mm: Do not invoke OOM for higher order IOMMU DMA allocations
        ...
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · bdfa54df
      Linus Torvalds authored
      Pull s390 updates from Martin Schwidefsky:
       "The major change in this merge is the removal of the support for
        31-bit kernels.  Naturally 31-bit user space will continue to work via
        the compat layer.
      
        And then some cleanup, some improvements and bug fixes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (23 commits)
        s390/smp: wait until secondaries are active & online
        s390/hibernate: fix save and restore of kernel text section
        s390/cacheinfo: add missing facility check
        s390/syscalls: simplify syscall_get_arch()
        s390/irq: enforce correct irqclass_sub_desc array size
        s390: remove "64" suffix from mem64.S and swsusp_asm64.S
        s390/ipl: cleanup macro usage
        s390/ipl: cleanup shutdown_action attributes
        s390/ipl: cleanup bin attr usage
        s390/uprobes: fix address space annotation
        s390: add missing arch_release_task_struct() declaration
        s390: make couple of functions and variables static
        s390/maccess: improve s390_kernel_write()
        s390/maccess: remove potentially broken probe_kernel_write()
        s390/watchdog: support for KVM hypervisors and delete pr_info messages
        s390/watchdog: enable KEEPALIVE for /dev/watchdog
        s390/dasd: remove setting of scheduler from driver
        s390/traps: panic() instead of die() on translation exception
        s390: remove test_facility(2) (== z/Architecture mode active) checks
        s390/cmpxchg: simplify cmpxchg_double
        ...
      bdfa54df
    • Linus Torvalds's avatar
      Merge tag 'pm+acpi-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 2481bc75
      Linus Torvalds authored
      Pull power management and ACPI updates from Rafael Wysocki:
       "These are mostly fixes and cleanups all over, although there are a few
        items that sort of fall into the new feature category.
      
        First off, we have new callbacks for PM domains that should help us to
        handle some issues related to device initialization in a better way.
      
        There also is some consolidation in the unified device properties API
        area allowing us to use that interface for accessing data coming from
        platform initialization code in addition to firmware-provided data.
      
        We have some new device/CPU IDs in a few drivers, support for new
        chips and a new cpufreq driver too.
      
        Specifics:
      
         - Generic PM domains support update including new PM domain callbacks
           to handle device initialization better (Russell King, Rafael J
           Wysocki, Kevin Hilman)
      
         - Unified device properties API update including a new mechanism for
           accessing data provided by platform initialization code (Rafael J
           Wysocki, Adrian Hunter)
      
         - ARM cpuidle update including ARM32/ARM64 handling consolidation
           (Daniel Lezcano)
      
         - intel_idle update including support for the Silvermont Core in the
           Baytrail SOC and for the Airmont Core in the Cherrytrail and
           Braswell SOCs (Len Brown, Mathias Krause)
      
         - New cpufreq driver for Hisilicon ACPU (Leo Yan)
      
         - intel_pstate update including support for the Knights Landing chip
           (Dasaratharaman Chandramouli, Kristen Carlson Accardi)
      
         - QorIQ cpufreq driver update (Tang Yuantian, Arnd Bergmann)
      
         - powernv cpufreq driver update (Shilpasri G Bhat)
      
         - devfreq update including Tegra support changes (Tomeu Vizoso,
           MyungJoo Ham, Chanwoo Choi)
      
         - powercap RAPL (Running-Average Power Limit) driver update including
           support for Intel Broadwell server chips (Jacob Pan, Mathias Krause)
      
         - ACPI device enumeration update related to the handling of the
           special PRP0001 device ID allowing DT-style 'compatible' property
           to be used for ACPI device identification (Rafael J Wysocki)
      
         - ACPI EC driver update including limited _DEP support (Lan Tianyu,
           Lv Zheng)
      
         - ACPI backlight driver update including a new mechanism to allow
           native backlight handling to be forced on non-Windows 8 systems and
           a new quirk for Lenovo Ideapad Z570 (Aaron Lu, Hans de Goede)
      
         - New Windows Vista compatibility quirk for Sony VGN-SR19XN (Chen Yu)
      
         - Assorted ACPI fixes and cleanups (Aaron Lu, Martin Kepplinger,
           Masanari Iida, Mika Westerberg, Nan Li, Rafael J Wysocki)
      
         - Fixes related to suspend-to-idle for the iTCO watchdog driver and
           the ACPI core system suspend/resume code (Rafael J Wysocki, Chen Yu)
      
         - PM tracing support for the suspend phase of system suspend/resume
           transitions (Zhonghui Fu)
      
         - Configurable delay for the system suspend/resume testing facility
           (Brian Norris)
      
         - PNP subsystem cleanups (Peter Huewe, Rafael J Wysocki)"
      
      * tag 'pm+acpi-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits)
        ACPI / scan: Fix NULL pointer dereference in acpi_companion_match()
        ACPI / scan: Rework modalias creation when "compatible" is present
        intel_idle: mark cpu id array as __initconst
        powercap / RAPL: mark rapl_ids array as __initconst
        powercap / RAPL: add ID for Broadwell server
        intel_pstate: Knights Landing support
        intel_pstate: remove MSR test
        cpufreq: fix qoriq uniprocessor build
        ACPI / scan: Take the PRP0001 position in the list of IDs into account
        ACPI / scan: Simplify acpi_match_device()
        ACPI / scan: Generalize of_compatible matching
        device property: Introduce firmware node type for platform data
        device property: Make it possible to use secondary firmware nodes
        PM / watchdog: iTCO: stop watchdog during system suspend
        cpufreq: hisilicon: add acpu driver
        ACPI / EC: Call acpi_walk_dep_device_list() after installing EC opregion handler
        cpufreq: powernv: Report cpu frequency throttling
        intel_idle: Add support for the Airmont Core in the Cherrytrail and Braswell SOCs
        intel_idle: Update support for Silvermont Core in Baytrail SOC
        PM / devfreq: tegra: Register governor on module init
        ...
      2481bc75
    • David S. Miller's avatar
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · 9f915141
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2015-04-14
      
      This series contains updates to fm10k only.
      
      Fixed transmit statistics, which were actually using values from the
      receive ring instead of the transmit ring.  Fixed up spelling mistakes
      in code comments and resolved unused argument warnings.  Added support
      for netconsole.  Fixed up statistics reporting so that we only report
      from actual queues and display PF-only stats for just the PF and not
      the VF.  Also fixed an issue where, when returning virtualization
      queues from the VF back to the PF, we were retaining the VF rate
      limiter.
      
      Fixed up the driver to use a separate workqueue, which helps reduce
      and stabilize latency between scheduling the work in our interrupt and
      actually performing the work.
      
      Fixed a bug where the VF tried to set a multicast address before
      requesting the required xcast mode.
      
      Fixed VF multicast update, since VFs were being improperly added to the
      switch's multicast group.  The error stems from the fact that incorrect
      arguments were passed to update_mc_addr().
      
      Thanks to Alex Duyck for the extensive review.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9f915141
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 8691c130
      Linus Torvalds authored
      Pull input subsystem updates from Dmitry Torokhov:
       "You will get the following new drivers:
      
         - Qualcomm PM8941 power key driver
         - ChipOne icn8318 touchscreen controller driver
         - Broadcom iProc touchscreen and keypad drivers
         - Semtech SX8654 I2C touchscreen controller driver
      
        ALPS driver now supports newer SS4 devices; Elantech got a fix that
        should make it work on some ASUS laptops; and a slew of other
        enhancements and random fixes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (51 commits)
        Input: alps - non interleaved V2 dualpoint has separate stick button bits
        Input: alps - fix touchpad buttons getting stuck when used with trackpoint
        Input: atkbd - document "no new force-release quirks" policy
        Input: ALPS - make alps_get_pkt_id_ss4_v2() and others static
        Input: ALPS - V7 devices can report 5-finger taps
        Input: ALPS - add support for SS4 touchpad devices
        Input: ALPS - refactor alps_set_abs_params_mt()
        Input: elantech - fix absolute mode setting on some ASUS laptops
        Input: atmel_mxt_ts - split out touchpad initialisation logic
        Input: atmel_mxt_ts - implement support for T100 touch object
        Input: cros_ec_keyb - fix clearing keyboard state on wakeup
        Input: gscps2 - drop pci_ids dependency
        Input: synaptics - allocate 3 slots to keep stability in image sensors
        Input: Revert "Revert "synaptics - use dmax in input_mt_assign_slots""
        Input: MT - make slot assignment work for overcovered solutions
        mfd: tc3589x: enforce device-tree only mode
        Input: tc3589x - localize platform data
        Input: tsc2007 - Convert msecs to jiffies only once
        Input: edt-ft5x06 - remove EV_SYN event report
        Input: edt-ft5x06 - allow to setting the maximum axes value through the DT
        ...
      8691c130
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · c3a416a6
      Linus Torvalds authored
      Pull i2c updates from Wolfram Sang:
       "Most notable:
      
         - introducing the i2c_quirk infrastructure.  Now, flaws of I2C
           controllers can be described and the core will check if the flaws
           collide with the messages to be sent
      
         - wait_for_completion return type cleanup series
      
         - new drivers for Digicolor, Netlogic XLP, Ingenic JZ4780
      
         - updates to the I2C slave framework which include API changes.  Its
           only user was updated, too.  Documentation was finally added
      
         - changed dynamic bus numbering for the DT case.  This could change
           bus numbers for users.  However, it fixes a collision where dynamic
           and static busses request the same id.
      
         - driver bugfixes, cleanups"
      
      * 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (52 commits)
        i2c: xlp9xx: Driver for Netlogic XLP9XX/5XX I2C controller
        of: Add vendor prefix 'netlogic'
        i2c: davinci: use ICPFUNC to toggle I2C as gpio for bus recovery
        i2c: davinci: use bus recovery infrastructure
        i2c: change input parameter to i2c_adapter for prepare/unprepare_recovery
        i2c: i2c-mux-gpio: remove error messages for probe deferrals
        i2c: jz4780: Add i2c bus controller driver for Ingenic JZ4780
        i2c: dln2: set the device tree node of the adapter
        i2c: davinci: fixup wait_for_completion_timeout handling
        i2c: mpc: Fix ISR return value
        i2c: slave-eeprom: add more info when to increase the pointer
        i2c: slave: add documentation for i2c-slave-eeprom
        Documentation: i2c: describe the new slave mode
        i2c: slave: rework the slave API
        i2c: add support for the Digicolor I2C controller
        i2c: busses with dynamic ids should start after fixed ids for DT
        of: base: add function to get highest id of an alias stem
        i2c: designware: Suppress error message if platform_get_irq() < 0
        i2c: mpc: assign the correct prescaler from SVR
        i2c: img-scb: fixup of wait_for_completion_timeout return handling
        ...
      c3a416a6
    • Linus Torvalds's avatar
      Merge tag 'vfio-v4.1-rc1' of git://github.com/awilliam/linux-vfio · 8c194f3b
      Linus Torvalds authored
      Pull VFIO updates from Alex Williamson:
      
       - VFIO platform bus driver support (Baptiste Reynal, Antonios Motakis,
         testing and review by Eric Auger)
      
       - Split VFIO irqfd support to separate module (Alex Williamson)
      
       - vfio-pci VGA arbiter client (Alex Williamson)
      
       - New vfio-pci.ids= module option (Alex Williamson)
      
       - vfio-pci D3 power state support for idle devices (Alex Williamson)
      
      * tag 'vfio-v4.1-rc1' of git://github.com/awilliam/linux-vfio: (30 commits)
        vfio-pci: Fix use after free
        vfio-pci: Move idle devices to D3hot power state
        vfio-pci: Remove warning if try-reset fails
        vfio-pci: Allow PCI IDs to be specified as module options
        vfio-pci: Add VGA arbiter client
        vfio-pci: Add module option to disable VGA region access
        vgaarb: Stub vga_set_legacy_decoding()
        vfio: Split virqfd into a separate module for vfio bus drivers
        vfio: virqfd_lock can be static
        vfio: put off the allocation of "minor" in vfio_create_group
        vfio/platform: implement IRQ masking/unmasking via an eventfd
        vfio: initialize the virqfd workqueue in VFIO generic code
        vfio: move eventfd support code for VFIO_PCI to a separate file
        vfio: pass an opaque pointer on virqfd initialization
        vfio: add local lock for virqfd instead of depending on VFIO PCI
        vfio: virqfd: rename vfio_pci_virqfd_init and vfio_pci_virqfd_exit
        vfio: add a vfio_ prefix to virqfd_enable and virqfd_disable and export
        vfio/platform: support for level sensitive interrupts
        vfio/platform: trigger an interrupt via eventfd
        vfio/platform: initial interrupts support code
        ...
      8c194f3b
    • Linus Torvalds's avatar
      Merge tag 'pinctrl-v4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 07e492eb
      Linus Torvalds authored
      Pull pincontrol updates from Linus Walleij:
       "This is the bulk of pin control changes for the v4.1 development
        cycle.  Nothing really exciting this time: we basically added a few
        new drivers and subdrivers and stabilized them in linux-next.  Some
        cleanups too.  With sunrisepoint Intel has a real fine fully featured
        pin control driver for contemporary hardware, and the AMD driver is
        also for large deployments.  Most of the others are ARM devices.
      
        New drivers:
          - Intel Sunrisepoint
          - AMD KERNCZ GPIO
          - Broadcom Cygnus IOMUX
      
        New subdrivers:
          - Marvell MVEBU Armada 39x SoCs
          - Samsung Exynos 5433
          - nVidia Tegra 210
          - Mediatek MT8135
          - Mediatek MT8173
          - AMLogic Meson8b
          - Qualcomm PM8916
      
        On top of this, there are cleanups and development history for the
        above drivers, as issues were fixed after merging"
      
      * tag 'pinctrl-v4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (71 commits)
        pinctrl: sirf: move sgpio lock into state container
        pinctrl: Add support for PM8916 GPIO's and MPP's
        pinctrl: bcm2835: Fix support for threaded level triggered IRQs
        sh-pfc: r8a7790: add EtherAVB pin groups
        pinctrl: Document "function" + "pins" pinmux binding
        pinctrl: intel: Add Intel Sunrisepoint pin controller and GPIO support
        pinctrl: fsl: imx: Check for 0 config register
        pinctrl: Add support for Meson8b
        documentation: Extend pinctrl docs for Meson8b
        pinctrl: Cleanup Meson8 driver
        Fix inconsistent spinlock of AMD GPIO driver which can be recognized by static analysis tool smatch. Declare constant Variables with Sparse's suggestion.
        pinctrl: at91: convert __raw to endian agnostic IO
        pinctrl: constify of_device_id array
        pinctrl: pinconf-generic: add dt node names to error messages
        pinctrl: pinconf-generic: scan also referenced phandle node
        pinctrl: mvebu: add suspend/resume support to Armada XP pinctrl driver
        pinctrl: st: Display pin's function when printing pinctrl debug information
        pinctrl: st: Show correct pin direction also in GPIO mode
        pinctrl: st: Supply a GPIO get_direction() call-back
        pinctrl: st: Move st_get_pio_control() further up the source file
        ...
      07e492eb
    • Linus Torvalds's avatar
      Merge tag 'backlight-for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight · b240452a
      Linus Torvalds authored
      Pull backlight updates from Lee Jones:
       "Changes to existing drivers:
      
         - Use of_get_child_by_name() instead of refcount; 88pm860x_bl
      
         - Terminate array with NULL element; da9052_bl"
      
      * tag 'backlight-for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
        backlight: da9052_bl: Terminate da9052_wled_ids array with empty element
        backlight: 88pm860x_bl: Use of_get_child_by_name() instead of refcount hack
      b240452a
    • Linus Torvalds's avatar
      Merge tag 'mfd-for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd · f0c1bc95
      Linus Torvalds authored
      Pull MFD updates from Lee Jones:
       "Changes to existing drivers:
      
         - Rename child driver [axp288_battery => axp288_fuel_gauge]; axp20x
         - Rename child driver [max77693-flash => max77693-led]; max77693
         - Error handling fixes; intel_soc_pmic
         - GPIO tweaking; intel_soc_pmic
         - Remove non-DT code; vexpress-sysreg, tc3589x
         - Remove unused/legacy code; ti_am335x_tscadc, rts5249, rtsx_gops, rtsx_pcr,
                                      rtc-s5m, sec-core, max77693, menelaus,
                                      wm5102-tables
         - Trivial fixups; rtsx_pci, da9150-core, sec-core, max7769, max77693,
                           mc13xxx-core, dln2, hi6421-pmic-core, rk808, twl4030-power,
                           lpc_ich, menelaus, twl6040
         - Update register/address values; rts5227, rts5249
         - DT and/or binding document fixups; arizona, da9150, mt6397, axp20x,
                                              qcom-rpm, qcom-spmi-pmic
         - Couple of trivial core Kconfig fixups
         - Remove use of seq_printf return value; ab8500-debugfs
         - Remove __exit markups; menelaus, tps65010
         - Fix platform-device name collisions; mfd-core
      
        New drivers/supported devices:
      
         - Add support for wm8280/wm8281 into arizona
         - Add support for COMe-cBL6 into kempld-core
         - Add support for rts524a and rts525a into rts5249
         - Add support for ipq8064 into qcom_rpm
         - Add support for extcon into axp20x
         - New MediaTek MT6397 PMIC driver
         - New Maxim MAX77843 PMIC driver
         - New Intel Quark X1000 I2C-GPIO driver
         - New Skyworks SKY81452 driver"
      
      * tag 'mfd-for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (76 commits)
        mfd: sec: Fix RTC alarm interrupt number on S2MPS11
        mfd: wm5102: Remove registers for output 3R from readable list
        mfd: tps65010: Remove incorrect __exit markups
        mfd: devicetree: bindings: Add Qualcomm RPM regulator subnodes
        mfd: axp20x: Add support for extcon cell
        mfd: lpc_ich: Sort IDs
        mfd: twl6040: Remove wrong and unneeded "platform:twl6040" modalias
        mfd: qcom-spmi-pmic: Add specific compatible strings for Qualcomm's SPMI PMIC's
        mfd: axp20x: Fix duplicate const for model names
        mfd: menelaus: Use macro for magic number
        mfd: menelaus: Drop support for SW controller VCORE
        mfd: menelaus: Delete omap_has_menelaus
        mfd: arizona: Correct type of gpio_defaults
        mfd: lpc_ich: Sort IDs
        mfd: Fix a typo in Kconfig
        mfd: qcom_rpm: Add support for IPQ8064
        mfd: devicetree: qcom_rpm: Document IPQ8064 resources
        mfd: core: Fix platform-device name collisions
        mfd: intel_quark_i2c_gpio: Don't crash if !DMI
        dt-bindings: Add vendor-prefix for X-Powers
        ...
      f0c1bc95
    • Richard Guy Briggs's avatar
      lsm: copy comm before calling audit_log to avoid race in string printing · 5deeb5ce
      Richard Guy Briggs authored
      When task->comm is passed directly to audit_log_untrustedstring() without
      taking a copy or holding the task_lock, a race can put a NUL (\0) in the
      middle of the output string.  That effectively truncates the rest of the
      report text after the comm= field in the audit log message, losing fields.
      
      Using get_task_comm() to get a copy while acquiring the task_lock to prevent
      this and to prevent the result from being a mixture of old and new values of
      comm would incur potentially unacceptable overhead, considering that the value
      can be influenced by userspace and therefore untrusted anyways.
      
      Copying the value before passing it to audit_log_untrustedstring() ensures that a
      local copy is used to calculate the length *and* subsequently printed.  Even if
      this value contains a mix of old and new values, it will only calculate and
      copy up to the first NUL, preventing the rest of the audit log message from
      being truncated.
      
      Use a second local copy of comm to avoid a race between the first and second
      calls to audit_log_untrustedstring() with comm.
      Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
      Signed-off-by: James Morris <james.l.morris@oracle.com>
      5deeb5ce
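The copy-before-log idea from the lsm commit above (5deeb5ce) can be sketched in user space. This is only an illustration under stated assumptions, not the kernel patch itself: `struct demo_task`, `snapshot_comm()`, `log_comm()` and the `pid=1` field are hypothetical names invented here, and `COMM_LEN` merely mirrors the kernel's `TASK_COMM_LEN` of 16. The point it tries to show is that the length calculation and the printing both read the same private buffer, so a concurrent rename cannot make them disagree.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define COMM_LEN 16 /* assumption: mirrors the kernel's TASK_COMM_LEN */

/* Hypothetical stand-in for a task whose name may be rewritten concurrently. */
struct demo_task {
    char comm[COMM_LEN];
};

/*
 * Take one snapshot of the name.  If a writer races with us the snapshot
 * may mix old and new bytes, but it is always NUL-terminated and it never
 * changes after this call returns.
 */
static void snapshot_comm(char dst[COMM_LEN], const struct demo_task *t)
{
    memcpy(dst, t->comm, COMM_LEN);
    dst[COMM_LEN - 1] = '\0';
}

/*
 * Hypothetical logger: measures and prints the same private buffer, so the
 * computed length and the bytes written always agree.  Returns the value
 * reported by snprintf() (the full message length), or 0 on error.
 */
static size_t log_comm(char *out, size_t outsz, const struct demo_task *t)
{
    char comm[COMM_LEN];
    int n;

    snapshot_comm(comm, t);
    n = snprintf(out, outsz, "comm=\"%.*s\" pid=1", (int)strlen(comm), comm);
    return n < 0 ? 0 : (size_t)n;
}
```

A caller would fill a `struct demo_task`, pass it to `log_comm()` with an output buffer, and get a record whose fields after `comm=` are intact even if the shared name changes between the length calculation and the print.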
  2. 14 Apr, 2015 14 commits