1. 08 Dec, 2015 3 commits
    • Merge branch 'for-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 5406812e
      Linus Torvalds authored
      Pull cgroup fixes from Tejun Heo:
       "More change than I'd have liked at this stage.  The pids controller
        and the changes made to cgroup core to support it introduced and
        revealed several important issues.
      
         - Assigning membership to a newly created task and migrating it can
           race leading to incorrect accounting.  Oleg fixed it by widening
           threadgroup synchronization.  It looks like we'll be able to merge
           it with a different percpu rwsem which is used in fork path making
           things simpler and cheaper.
      
         - The recent change to extend cgroup membership to zombies (so that
           pid accounting can extend till the pid is actually released) missed
           pinning the underlying data structures leading to use-after-free.
           Fixed.
      
         - v2 hierarchy was calling subsystem callbacks with the wrong target
           cgroup_subsys_state based on the incorrect assumption that they
           share the same target.  pids is the first controller affected by
           this.  Subsys callbacks updated so that they can deal with
           multi-target migrations"
      
      * 'for-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup_pids: don't account for the root cgroup
        cgroup: fix handling of multi-destination migration from subtree_control enabling
        cgroup_freezer: simplify propagation of CGROUP_FROZEN clearing in freezer_attach()
        cgroup: pids: kill pids_fork(), simplify pids_can_fork() and pids_cancel_fork()
        cgroup: pids: fix race between cgroup_post_fork() and cgroup_migrate()
        cgroup: make css_set pin its css's to avoid use-afer-free
        cgroup: fix cftype->file_offset handling
    • Merge branch 'for-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · 633bb738
      Linus Torvalds authored
      Pull libata fixes from Tejun Heo:
       "Nothing too interesting.  All are device specific additions and
        workarounds"
      
      * 'for-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
        ata/sata_fsl.c: add ATA_FLAG_NO_LOG_PAGE to blacklist the controller for log page reads
        libata-eh.c: Introduce new ata port flag for controller which lockup on read log page
        sata_sil: disable trim
        AHCI: Fix softreset failed issue of Port Multiplier
        sata/mvebu: use #ifdef around suspend/resume code
        ahci: Order SATA device IDs for codename Lewisburg
        ahci: Add Device ID for Intel Sunrise Point PCH
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 51825c8a
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "This tree includes four core perf fixes for misc bugs, three fixes to
        x86 PMU drivers, and two updates to old email addresses"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf: Do not send exit event twice
        perf/x86/intel: Fix INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_NA macro
        perf/x86/intel: Make L1D_PEND_MISS.FB_FULL not constrained on Haswell
        perf: Fix PERF_EVENT_IOC_PERIOD deadlock
        treewide: Remove old email address
        perf/x86: Fix LBR call stack save/restore
        perf: Update email address in MAINTAINERS
        perf/core: Robustify the perf_cgroup_from_task() RCU checks
        perf/core: Fix RCU problem with cgroup context switching code
  2. 07 Dec, 2015 13 commits
    • Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 62ea1ec5
      Linus Torvalds authored
      Pull virtio fixes from Michael Tsirkin:
       "This includes some fixes and cleanups in virtio and vhost code.
      
        Most notably, shadowing the index fixes the excessive cacheline
        bouncing observed on AMD platforms"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        virtio_ring: shadow available ring flags & index
        virtio: Do not drop __GFP_HIGH in alloc_indirect
        vhost: replace % with & on data path
        tools/virtio: fix byteswap logic
        tools/virtio: move list macro stubs
        virtio: fix memory leak of virtio ida cache layers
        vhost: relax log address alignment
        virtio-net: Stop doing DMA from the stack
    • Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · f41683a2
      Linus Torvalds authored
      Pull ext4 fixes from Ted Ts'o:
       "Ext4 bug fixes for v4.4, including fixes for post-2038 time encodings,
        some endian conversion problems with ext4 encryption, potential memory
        leaks after truncate in data=journal mode, and an ocfs2 regression
        caused by a jbd2 performance improvement"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        jbd2: fix null committed data return in undo_access
        ext4: add "static" to ext4_seq_##name##_fops struct
        ext4: fix an endianness bug in ext4_encrypted_follow_link()
        ext4: fix an endianness bug in ext4_encrypted_zeroout()
        jbd2: Fix unreclaimed pages after truncate in data=journal mode
        ext4: Fix handling of extended tv_sec
    • virtio_ring: shadow available ring flags & index · f277ec42
      Venkatesh Srinivas authored
      Improves cacheline transfer flow of available ring header.
      
      Virtqueues are implemented as a pair of rings, one producer->consumer
      avail ring and one consumer->producer used ring; preceding the
      avail ring in memory are two contiguous u16 fields -- avail->flags
      and avail->idx. A producer posts work by writing to avail->idx and
      a consumer reads avail->idx.
      
      The flags and idx fields only need to be written by a producer CPU
      and only read by a consumer CPU; when the producer and consumer are
      running on different CPUs and the virtio_ring code is structured to
      only have source writes/sink reads, we can continuously transfer the
      avail header cacheline between 'M' states between cores. This flow
      optimizes core -> core bandwidth on certain CPUs.
      
      (see: "Software Optimization Guide for AMD Family 15h Processors",
      Section 11.6; similar language appears in the 10h guide and should
      apply to CPUs w/ exclusive caches, using LLC as a transfer cache)
      
      Unfortunately the existing virtio_ring code issued reads to the
      avail->idx and read-modify-writes to avail->flags on the producer.
      
      This change shadows the flags and index fields in producer memory;
      the vring code now reads from the shadows and only ever writes to
      avail->flags and avail->idx, allowing the cacheline to transfer
      core -> core optimally.
      
      In a concurrent version of vring_bench, the time required for
      10,000,000 buffer checkout/returns was reduced by ~2% (average
      across many runs) on an AMD Piledriver (15h) CPU:
      
      (w/o shadowing):
       Performance counter stats for './vring_bench':
           5,451,082,016      L1-dcache-loads
           ...
             2.221477739 seconds time elapsed
      
      (w/ shadowing):
       Performance counter stats for './vring_bench':
           5,405,701,361      L1-dcache-loads
           ...
             2.168405376 seconds time elapsed
      
      The further away (in a NUMA sense) virtio producers and consumers are
      from each other, the more we expect to benefit. Physical implementations
      of virtio devices and implementations of virtio where the consumer polls
      vring avail indexes (vhost) should also benefit.
      Signed-off-by: Venkatesh Srinivas <venkateshs@google.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
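The shadowing idea described in this commit can be sketched in user-space C. This is a minimal illustration, not the virtio_ring implementation: the struct and function names (`avail_hdr`, `producer`, `producer_post`, `producer_set_flag`) are hypothetical, and the memory barriers a real ring needs are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Shared ring header: written by the producer, read by the consumer. */
struct avail_hdr {
    uint16_t flags;
    uint16_t idx;
};

/* Hypothetical producer state holding private shadow copies. */
struct producer {
    struct avail_hdr *shared; /* header that lives in the shared ring */
    uint16_t flags_shadow;    /* producer-private copy of shared->flags */
    uint16_t idx_shadow;      /* producer-private copy of shared->idx */
};

/* Publish one buffer: compute on the shadow, then store to shared memory. */
static void producer_post(struct producer *p)
{
    assert(p && p->shared);
    p->idx_shadow++;
    /* A real ring needs a write barrier before this store; omitted here. */
    p->shared->idx = p->idx_shadow;
}

/* Set a flag bit without a read-modify-write on the shared field. */
static void producer_set_flag(struct producer *p, uint16_t bit)
{
    assert(p && p->shared);
    p->flags_shadow |= bit;
    p->shared->flags = p->flags_shadow;
}
```

The producer never loads from `shared`, so in this sketch the shared header only ever sees stores from the producer side, which is the access pattern the commit message aims for.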
    • virtio: Do not drop __GFP_HIGH in alloc_indirect · 82107539
      Michal Hocko authored
      b92b1b89 ("virtio: force vring descriptors to be allocated from
      lowmem") tried to exclude highmem pages for descriptors so it cleared
      __GFP_HIGHMEM from a given gfp mask. The patch also cleared __GFP_HIGH
      which doesn't make much sense for this fix because __GFP_HIGH only
      controls access to memory reserves and it doesn't have any influence
      on the zone selection. Some of the call paths use GFP_ATOMIC and
      dropping __GFP_HIGH will reduce their chances of success because of the
      lack of access to memory reserves.
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Reviewed-by: Mel Gorman <mgorman@techsingularity.net>
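The difference between the two masks can be shown with a small sketch. The `DEMO_GFP_*` bit values and both function names below are assumptions chosen for illustration; they are not taken from the kernel headers or from the patch itself.

```c
#include <assert.h>

/* Illustrative bit values only; the kernel's real definitions may differ. */
#define DEMO_GFP_HIGHMEM 0x02u /* zone selector: highmem pages allowed */
#define DEMO_GFP_HIGH    0x20u /* access to memory reserves allowed */

/* Intended behaviour: exclude highmem, leave reserve access untouched. */
static unsigned int strip_highmem(unsigned int gfp)
{
    /* The two flags must be distinct bits for the masking to make sense. */
    assert((DEMO_GFP_HIGHMEM & DEMO_GFP_HIGH) == 0);
    return gfp & ~DEMO_GFP_HIGHMEM;
}

/* Behaviour the commit message objects to: reserve access is lost too. */
static unsigned int strip_highmem_and_high(unsigned int gfp)
{
    return gfp & ~(DEMO_GFP_HIGHMEM | DEMO_GFP_HIGH);
}
```

A caller that passes a mask with the reserve bit set keeps that bit with `strip_highmem()` and loses it with `strip_highmem_and_high()`, which is the regression for atomic callers that the message describes.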
    • vhost: replace % with & on data path · 5fba13b5
      Michael S. Tsirkin authored
      We know vring num is a power of 2, so use &
      to mask the high bits.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
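For a ring size that is a power of two, masking with `num - 1` gives the same result as the modulo. A minimal sketch of that equivalence follows; the helper name `ring_slot` is hypothetical and is not a function from the vhost code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: wrap a free-running index into a ring of num slots. */
static inline uint16_t ring_slot(uint16_t idx, uint16_t num)
{
    /* The mask form is only valid when num is a nonzero power of two. */
    assert(num != 0 && (num & (num - 1)) == 0);
    return (uint16_t)(idx & (num - 1));
}
```

With `num` equal to 256, an index of 258 maps to slot 2, the same value `258 % 256` produces, without a division on the data path.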
    • tools/virtio: fix byteswap logic · 55564a02
      Michael S. Tsirkin authored
      commit cf561f0d ("virtio: introduce
      virtio_is_little_endian() helper") changed byteswap logic to
      skip feature bit checks for LE platforms, but didn't
      update tools/virtio, so vring_bench started failing.
      
      Update the copy under tools/virtio/ (TODO: find a way to avoid this code
      duplication).
      
      Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • tools/virtio: move list macro stubs · 40c172e5
      Michael S. Tsirkin authored
      Makes them more generally available.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • virtio: fix memory leak of virtio ida cache layers · c13f99b7
      Suman Anna authored
      The virtio core uses a static ida named virtio_index_ida for
      assigning index numbers to virtio devices during registration.
      The ida core may allocate some internal idr cache layers and
      an ida bitmap upon any ida allocation, and all these layers are
      truly freed only upon the ida destruction. The virtio_index_ida
      is not destroyed at present, leading to a memory leak when using
      the virtio core as a module and at least one virtio device is
      registered and unregistered.
      
      Fix this by invoking ida_destroy() in the virtio core module
      exit.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Suman Anna <s-anna@ti.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost: relax log address alignment · d5424838
      Michael S. Tsirkin authored
      commit 5d9a07b0 ("vhost: relax used
      address alignment") fixed the alignment for the used virtual address,
      but not for the physical address used for logging.
      
      That's a mistake: alignment should clearly be the same for virtual and
      physical addresses.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • ata/sata_fsl.c: add ATA_FLAG_NO_LOG_PAGE to blacklist the controller for log page reads · 4f2568f5
      Andreas Werner authored
      Every attempt to issue a read log page command locks up the controller.
      The command is currently sent if the SATA device includes the devslp feature
      to read out the timing data.
      This attempt to read the data locks up the controller, and the device
      is not recognized correctly (failed to set xfermode) and cannot be accessed.
      
      This was found on Freescale P1013/P1022 and T4240 CPUs
      using an ATP IG mSATA 4GB with the devslp feature.
      
      fsl-sata ff718000.sata: Sata FSL Platform/CSB Driver init
      [    1.254195] scsi0 : sata_fsl
      [    1.256004] ata1: SATA max UDMA/133 irq 74
      [    1.370666] fsl-gianfar ethernet.3: enabled errata workarounds, flags: 0x4
      [    1.470671] fsl-gianfar ethernet.4: enabled errata workarounds, flags: 0x4
      [    1.775584] ata1: Signature Update detected @ 504 msecs
      [    1.947594] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
      [    1.948366] ata1.00: ATA-8: ATP IG mSATA, 20150311, max UDMA/133
      [    1.948371] ata1.00: 7732368 sectors, multi 0: LBA
      [    1.948843] ata1.00: failed to get Identify Device Data, Emask 0x1
      [    1.948857] ata1.00: failed to set xfermode (err_mask=0x40)
      [    7.467557] ata1: Signature Update detected @ 504 msecs
      [    7.639560] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
      [    7.651320] ata1.00: failed to get Identify Device Data, Emask 0x1
      [    7.651360] ata1.00: failed to set xfermode (err_mask=0x40)
      [    7.655628] ata1: limiting SATA link speed to 1.5 Gbps
      [    7.659458] ata1.00: limiting speed to UDMA/133:PIO3
      [   13.163554] ata1: Signature Update detected @ 504 msecs
      [   13.335558] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
      [   13.347298] ata1.00: failed to get Identify Device Data, Emask 0x1
      [   13.347334] ata1.00: failed to set xfermode (err_mask=0x40)
      [   13.351601] ata1.00: disabled
      [   13.353278] ata1: exception Emask 0x50 SAct 0x0 SErr 0x800 action 0x6 frozen t4
      [   13.359281] ata1: SError: { HostInt }
      [   13.361644] ata1: hard resetting link
      Signed-off-by: Andreas Werner <andreas.werner@men.de>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • libata-eh.c: Introduce new ata port flag for controller which lockup on read log page · ea013a9b
      Andreas Werner authored
      Some controllers lock up on ata_read_log_page().
      Add a new ata port flag, ATA_FLAG_NO_LOG_PAGE, which can be used
      to blacklist a controller.
      
      If this flag is set, any attempt to read a log page returns an error
      without actually issuing the command.
      Signed-off-by: Andreas Werner <andreas.werner@men.de>
      Signed-off-by: Tejun Heo <tj@kernel.org>
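The flag check described in this commit amounts to an early return before any command is issued. The sketch below models that pattern in plain C; `demo_port`, `DEMO_FLAG_NO_LOG_PAGE`, `demo_read_log_page`, the `-1` error value and the `commands_issued` counter are all hypothetical stand-ins, not libata's types, flag value or return codes.

```c
#include <assert.h>

/* Hypothetical flag bit standing in for ATA_FLAG_NO_LOG_PAGE. */
#define DEMO_FLAG_NO_LOG_PAGE (1u << 0)

/* Hypothetical port structure carrying only the flags word. */
struct demo_port {
    unsigned int flags;
};

/* Counts commands that would actually reach the controller. */
static int commands_issued;

/* Returns 0 on success, -1 if the port is blacklisted for log page reads. */
static int demo_read_log_page(const struct demo_port *port)
{
    assert(port);
    if (port->flags & DEMO_FLAG_NO_LOG_PAGE)
        return -1; /* fail early, nothing is sent to the hardware */
    commands_issued++; /* stand-in for issuing the real command */
    return 0;
}
```

A driver for an affected controller would set the flag once on its ports, and every later log page read would then fail without touching the hardware.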
    • Merge branch 'master' into for-4.4-fixes · 0b98f0c0
      Tejun Heo authored
      The following commit which went into mainline through networking tree
      
        3b13758f ("cgroups: Allow dynamically changing net_classid")
      
      conflicts in net/core/netclassid_cgroup.c with the following pending
      fix in cgroup/for-4.4-fixes.
      
        1f7dd3e5 ("cgroup: fix handling of multi-destination migration from subtree_control enabling")
      
      The former separates out update_classid() from cgrp_attach() and
      updates it to walk all fds of all tasks in the target css so that it
      can be used from both migration and config change paths.  The latter
      drops @css from cgrp_attach().
      
      Resolve the conflict by making cgrp_attach() call update_classid()
      with the css from the first task.  We can revive @tset walking in
      cgrp_attach() but given that net_cls is v1 only where there always is
      only one target css during migration, this is fine.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Nina Schiff <ninasc@fb.com>
    • virtio-net: Stop doing DMA from the stack · 2ac46030
      Michael S. Tsirkin authored
      Once virtio starts using the DMA API, we won't be able to safely DMA
      from the stack.  virtio-net does a couple of config DMA requests
      from small stack buffers -- switch to using dynamically-allocated
      memory.
      
      This should have no effect on any performance-critical code paths.
      Reported-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Tested-by: Andy Lutomirski <luto@kernel.org>
  3. 06 Dec, 2015 13 commits
    • Linux 4.4-rc4 · 527e9316
      Linus Torvalds authored
    • staging/lustre: remove IOC_LIBCFS_PING_TEST ioctl · d035e336
      James Simmons authored
      The ioctl IOC_LIBCFS_PING_TEST has not been used in ages.  The recent
      nidstring changes moved all the nidstring operations from libcfs
      to the LNet layer, but this ioctl code was still using a nidstring
      operation, which caused a circular dependency loop between libcfs and
      LNet.
      Signed-off-by: James Simmons <jsimmons@infradead.org>
      Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · d8cd93ea
      Linus Torvalds authored
      Pull vfs fixes from Al Viro:
       "A couple of fixes (-stable fodder) + dead code removal after the
        overlayfs fix.
      
        I agree that it's better to separate from the fix part to make
        backporting easier, but IMO it's not worth delaying said dead code
        removal until the next window"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        Don't reset ->total_link_count on nested calls of vfs_path_lookup()
        ovl: get rid of the dead code left from broken (and disabled) optimizations
        ovl: fix permission checking for setattr
    • Don't reset ->total_link_count on nested calls of vfs_path_lookup() · 2788cc47
      Al Viro authored
      we already zero it on outermost set_nameidata(), so initialization in
      path_init() is pointless and wrong.  The same DoS exists on pre-4.2
      kernels, but there a slightly different fix will be needed.
      
      Cc: stable@vger.kernel.org # v4.2
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Al Viro's avatar
    • ovl: fix permission checking for setattr · acff81ec
      Miklos Szeredi authored
      [Al Viro] The bug is in being too enthusiastic about optimizing ->setattr()
      away - instead of "copy verbatim with metadata" + "chmod/chown/utimes"
      (with the former being always safe and the latter failing in case of
      insufficient permissions) it tries to combine these two.  Note that copyup
      itself will have to do ->setattr() anyway; _that_ is where the elevated
      capabilities are right.  Having these two ->setattr() (one to set verbatim
      copy of metadata, another to do what overlayfs ->setattr() had been asked
      to do in the first place) combined is where it breaks.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · fb7b26e4
      Linus Torvalds authored
      Pull scheduler fixes from Thomas Gleixner:
       "This update contains the following changes:
      
         - Fix a signal handling regression in the bit wait functions.
      
         - Avoid false positive warnings in the wakeup path.
      
         - Initialize the scheduler root domain properly.
      
         - Handle gtime calculations in proc/$PID/stat properly.
      
         - Add more documentation for the barriers in try_to_wake_up().
      
         - Fix a subtle race in try_to_wake_up() which might cause a task to
           be scheduled on two cpus
      
         - Compile static helper function only when it is used"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/core: Fix an SMP ordering race in try_to_wake_up() vs. schedule()
        sched/core: Better document the try_to_wake_up() barriers
        sched/cputime: Fix invalid gtime in proc
        sched/core: Clear the root_domain cpumasks in init_rootdomain()
        sched/core: Remove false-positive warning from wake_up_process()
        sched/wait: Fix signal handling in bit wait helpers
        sched/rt: Hide the push_irq_work_func() declaration
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 69d2ca60
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "Another round of fixes for x86:
      
         - Move the initialization of the microcode driver to late_initcall to
           make sure everything that init function needs is available.
      
         - Make sure that lockdep knows about interrupts being off in the
           entry code before calling into c-code.
      
         - Undo the cpu hotplug init delay regression.
      
         - Use the proper conditionals in the mpx instruction decoder.
      
         - Fixup restart_syscall for x32 tasks.
      
         - Fix the hugepage regression on PAE kernels which was introduced
           with the latest PAT changes"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/signal: Fix restart_syscall number for x32 tasks
        x86/mpx: Fix instruction decoder condition
        x86/mm: Fix regression with huge pages on PAE
        x86 smpboot: Re-enable init_udelay=0 by default on modern CPUs
        x86/entry/64: Fix irqflag tracing wrt context tracking
        x86/microcode: Initialize the driver late when facilities are up
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 19190f5e
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "This is quite a bumper crop of fixes: three from Arnd correcting
        various build issues in some configurations, a lock recursion in
        qla2xxx.  Two potentially exploitable issues in hpsa and mvsas, a
        potential null deref in st, a revert of a bdi registration fix that
        turned out to cause even more problems, a set of fixes to allow people
        who only defined MPT2SAS to still work after the mpt2/mpt3sas merger
        and a couple of fixes for issues turned up by the hyper-v storvsc
        driver"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        mpt3sas: fix Kconfig dependency problem for mpt2sas back compatibility
        Revert "scsi: Fix a bdi reregistration race"
        mpt3sas: Add dummy Kconfig option for backwards compatibility
        Fix a memory leak in scsi_host_dev_release()
        block/sd: Fix device-imposed transfer length limits
        scsi_debug: fix prevent_allow+verify regressions
        MAINTAINERS: Add myself as co-maintainer of the SCSI subsystem.
        sd: Make discard granularity match logical block size when LBPRZ=1
        scsi: hpsa: select CONFIG_SCSI_SAS_ATTR
        scsi: advansys needs ISA dma api for ISA support
        scsi_sysfs: protect against double execution of __scsi_remove_device()
        st: fix potential null pointer dereference.
        scsi: report 'INQUIRY result too short' once per host
        advansys: fix big-endian builds
        qla2xxx: Fix rwlock recursion
        hpsa: logical vs bitwise AND typo
        mvsas: don't allow negative timeouts
        mpt3sas: Fix use sas_is_tlr_enabled API before enabling MPI2_SCSIIO_CONTROL_TLR_ON flag
    • perf: Do not send exit event twice · 4e93ad60
      Jiri Olsa authored
      In case we monitor events system wide, we get EXIT event
      (when configured) twice for each task that exited.
      
      Note doubled lines with same pid/tid in following example:
      
        $ sudo ./perf record -a
        ^C[ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.480 MB perf.data (2518 samples) ]
        $ sudo ./perf report -D | grep EXIT
      
        0 60290687567581 0x59910 [0x38]: PERF_RECORD_EXIT(1250:1250):(1250:1250)
        0 60290687568354 0x59948 [0x38]: PERF_RECORD_EXIT(1250:1250):(1250:1250)
        0 60290687988744 0x59ad8 [0x38]: PERF_RECORD_EXIT(1250:1250):(1250:1250)
        0 60290687989198 0x59b10 [0x38]: PERF_RECORD_EXIT(1250:1250):(1250:1250)
        1 60290692567895 0x62af0 [0x38]: PERF_RECORD_EXIT(1253:1253):(1253:1253)
        1 60290692568322 0x62b28 [0x38]: PERF_RECORD_EXIT(1253:1253):(1253:1253)
        2 60290692739276 0x69a18 [0x38]: PERF_RECORD_EXIT(1252:1252):(1252:1252)
        2 60290692739910 0x69a50 [0x38]: PERF_RECORD_EXIT(1252:1252):(1252:1252)
      
      The reason is that the cpu contexts are processed each time
      we call perf_event_task. I'm changing the perf_event_aux logic
      to serve task_ctx and cpu contexts separately, which ensures we
      don't get the EXIT event generated twice on the same cpu context.
      
      This does not affect other auxiliary events, as they don't
      use task_ctx at all.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1446649205-5822-1-git-send-email-jolsa@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86/intel: Fix INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_NA macro · 169b932a
      Jiri Olsa authored
      We need to add rest of the flags to the constraint mask
      instead of another INTEL_ARCH_EVENT_MASK, fixing a typo.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1447061071-28085-1-git-send-email-jolsa@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86/intel: Make L1D_PEND_MISS.FB_FULL not constrained on Haswell · e0fbac1c
      Yuanfang Chen authored
      There was a mistake in the Haswell constraints table.
      Signed-off-by: Yuanfang Chen <cheny@udel.edu>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1448384701-9110-1-git-send-email-cheny@udel.edu
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · a2dbb7b5
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "A bunch of changes across the board; the main thing is that some vblank
        fallout in radeon and nouveau required some work, but I think this
        should fix it all.  There is also one drm fix for an oops in vmwgfx
        with how we pass the drm master around.
      
        The rest is just some amdgpu, i915, imx and rockchip fixes.
      
        Probably more than I'd like at this point, but hopefully things settle
        down now"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (40 commits)
        drm/amdgpu: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v3)
        drm/radeon: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v2)
        drm/radeon: Retry DDC probing on DVI on failure if we got an HPD interrupt
        drm/amdgpu: add spin lock to protect freed list in vm (v2)
        drm/amdgpu: partially revert "drm/amdgpu: fix VM_CONTEXT*_PAGE_TABLE_END_ADDR" v2
        drm/amdgpu: take a BO reference for the user fence
        drm/amdgpu: take a BO reference in the display code
        drm/amdgpu: set snooped flags only on system addresses v2
        drm/nouveau: Fix pre-nv50 pageflip events (v4)
        drm: Fix an unwanted master inheritance v2
        drm/amdgpu: fix race condition in amd_sched_entity_push_job
        drm/amdgpu: add err check for pin userptr
        drm/i915: take a power domain reference while checking the HDMI live status
        drm/i915: add MISSING_CASE to a few port/aux power domain helpers
        drm/i915/ddi: fix intel_display_port_aux_power_domain() after HDMI detect
        drm/i915: Introduce a gmbus power domain
        drm/i915: Clean up AUX power domain handling
        drm/rockchip: Use CRTC vblank event interface
        drm/rockchip: Fix module autoload for OF platform driver
        drm/rockchip: vop: fix window origin calculation
        ...
  4. 05 Dec, 2015 4 commits
  5. 04 Dec, 2015 7 commits
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · 849ee3d4
      Linus Torvalds authored
      Pull Ceph fix from Sage Weil:
       "This addresses a refcounting bug that leads to a use-after-free"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
        rbd: don't put snap_context twice in rbd_queue_workfn()
    • drm/amdgpu: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v3) · 8e36f9d3
      Alex Deucher authored
      commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many
      vblanks were missed" introduced in Linux 4.4-rc1 makes the drm core
      more fragile to drivers which don't update hw vblank counters and
      vblank timestamps in sync with firing of the vblank irq and
      essentially at leading edge of vblank.
      
      This exposed a problem with radeon-kms/amdgpu-kms which do not
      satisfy above requirements:
      
      The vblank irq fires a few scanlines before start of vblank, but
      programmed pageflips complete at start of vblank and
      vblank timestamps update at start of vblank, whereas the
      hw vblank counter increments only later, at start of vsync.
      
      This leads to problems like off by one errors for vblank counter
      updates, vblank counters apparently going backwards or vblank
      timestamps apparently having time going backwards. The net result
      is stuttering of graphics in games, or little hangs, as well as
      total failure of timing sensitive applications.
      
      See bug #93147 for an example of the regression on Linux 4.4-rc:
      
      https://bugs.freedesktop.org/show_bug.cgi?id=93147
      
      This patch tries to align all above events better from the
      viewpoint of the drm core / of external callers to fix the problem:
      
      1. The apparent start of vblank is shifted a few scanlines earlier,
      so the vblank irq now always happens after start of this extended
      vblank interval and thereby drm_update_vblank_count() always samples
      the updated vblank count and timestamp of the new vblank interval.
      
      To achieve this, the reporting of scanout positions by
      radeon_get_crtc_scanoutpos() now operates as if the vblank starts
      radeon_crtc->lb_vblank_lead_lines before the real start of the hw
      vblank interval. This means that the vblank timestamps which are based
      on these scanout positions will now update at this earlier start of
      vblank.
      
      2. The driver->get_vblank_counter() function will bump the returned
      vblank count as read from the hw by +1 if the query happens after
      the shifted earlier start of the vblank, but before the real hw increment
      at start of vsync, so the counter appears to increment at start of vblank
      in sync with the timestamp update.
      
      3. Calls from vblank irq-context and regular non-irq calls are now
      treated identically, always simulating the shifted vblank start, to
      avoid inconsistent results for queries happening from vblank irq vs.
      happening from drm_vblank_enable() or vblank_disable_fn().
      
      4. The radeon_flip_work_func will delay mmio programming a pageflip until
      the start of the real vblank iff it happens to execute inside the shifted
      earlier start of the vblank, so pageflips now also appear to execute at
      start of the shifted vblank, in sync with vblank counter and timestamp
      updates. This is to avoid some races between updates of vblank count and
      timestamps that are used for swap scheduling and pageflip execution which
      could cause pageflips to execute before the scheduled target vblank.
      
      The lb_vblank_lead_lines "fudge" value is calculated as the size of
      the display controller's line buffer in scanlines for the given video
      mode: Vblank irqs are triggered by the line buffer logic when the line
      buffer refill for a video frame ends, i.e. when the line buffer source read
      position enters the hw vblank. This means that a vblank irq could fire at
      most as many scanlines before the current reported scanout position of the
      crtc timing generator as the number of scanlines the line buffer can
      maximally hold for a given video mode.
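
      As a rough illustration of the calculation described above, the sketch
      below derives a lead-line count from a line buffer size in pixels and
      the active width of the video mode. The function and parameter names
      are hypothetical and not taken from the driver; rounding up is an
      assumption made so that a partially filled scanline still counts.

```c
#include <assert.h>

/*
 * Hypothetical helper (not the driver's code): number of scanlines that a
 * line buffer holding lb_size_pixels pixels can contain for a video mode
 * that is hdisplay pixels wide. Rounds up (an assumption of this sketch).
 */
static unsigned int lb_vblank_lead_lines(unsigned int lb_size_pixels,
                                         unsigned int hdisplay)
{
    assert(hdisplay > 0); /* a mode always has a non-zero active width */
    return (lb_size_pixels + hdisplay - 1) / hdisplay;
}
```

      With a 5760-pixel buffer this would give 3 lead lines for both a
      1920-pixel-wide and a 2560-pixel-wide mode.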
      
      This patch has been successfully tested on a RV730 card with DCE-3 display
      engine and on an Evergreen card with DCE-4 display engine, in single-display
      and dual-display configuration, with different video modes.
      
      A similar patch is needed for amdgpu-kms to fix the same problem.
      
      Limitations:
      
      - Maybe replace the udelay() in the flip_work_func() by a suitable
        usleep_range() for a bit better efficiency? Will try that.
      
      - Line buffer sizes in pixels are hard-coded on < DCE-4 to a value
        I just guessed to be high enough to work ok, lacking info on the true
        sizes atm.
      
      Probably fixes: fdo#93147
      
      Port of Mario's radeon fix to amdgpu.
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      (v1) Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
      
      (v2) Refine amdgpu_flip_work_func() for better efficiency.
      
           In amdgpu_flip_work_func, replace the busy waiting udelay(5)
           with event lock held by a more performance and energy efficient
           usleep_range() until at least predicted true start of hw vblank,
           with some slack for scheduler happiness. Release the event lock
           during waits to not delay other outputs in doing their stuff, as
           the waiting can last up to 200 usecs in some cases.
      
           Also small fix to code comment and formatting in that function.
      
      (v2) Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
      
      (v3) Fix crash in crtc disabled case
      8e36f9d3
    • Linus Torvalds's avatar
      Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · fb39cbda
      Linus Torvalds authored
      Pull libnvdimm fixes from Dan Williams:
      
       - NFIT parsing regression fixes from Linda.  The nvdimm hot-add
         implementation merged in 4.4-rc1 interpreted the specification in a
         way that breaks actual HPE platforms.  We are also closing the loop
         with the ACPI Working Group to get this clarification added to the
         spec.
      
       - Andy pointed out that his laptop without nvdimm resources is loading
         the e820-nvdimm module by default, fix that up to only load the
         module when an e820-type-12 range is present.
      
      * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        nfit: Adjust for different _FIT and NFIT headers
        nfit: Fix the check for a successful NFIT merge
        nfit: Account for table size length variation
        libnvdimm, e820: skip module loading when no type-12
      fb39cbda
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · db281766
      Linus Torvalds authored
      Pull ARM KVM fixes from Paolo Bonzini:
      
       - a series of fixes to deal with the aliasing between the sp and xzr
         register
      
       - a fix for the cache flush fix that went in -rc3
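
      The sp/xzr aliasing mentioned in the first item comes from register
      number 31, which denotes the zero register in the data-transfer
      register field of the trapped accesses rather than the stack pointer.
      The sketch below shows one way accessors can honor that; the struct
      and function names are hypothetical and are not the helpers added by
      this series.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_GP_REGS 31u /* x0..x30 */
#define REG_ZR      31u /* encoding 31 is treated as the zero register here */

/* Hypothetical guest register file for this sketch. */
struct gp_regs {
    uint64_t x[NUM_GP_REGS];
    uint64_t sp;
};

/* Reads of register 31 yield zero instead of aliasing to sp. */
static uint64_t get_reg(const struct gp_regs *r, unsigned int n)
{
    assert(n <= REG_ZR);
    return n == REG_ZR ? 0 : r->x[n];
}

/* Writes to register 31 are discarded instead of clobbering sp. */
static void set_reg(struct gp_regs *r, unsigned int n, uint64_t val)
{
    assert(n <= REG_ZR);
    if (n != REG_ZR)
        r->x[n] = val;
}
```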
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        ARM/arm64: KVM: correct PTE uncachedness check
        arm64: KVM: Get rid of old vcpu_reg()
        arm64: KVM: Correctly handle zero register in system register accesses
        arm64: KVM: Remove const from struct sys_reg_params
        arm64: KVM: Correctly handle zero register during MMIO
      db281766
    • Mario Kleiner's avatar
      drm/radeon: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v2) · 5b5561b3
      Mario Kleiner authored
      commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many
      vblanks were missed" introduced in Linux 4.4-rc1 makes the drm core
      more fragile to drivers which don't update hw vblank counters and
      vblank timestamps in sync with firing of the vblank irq and
      essentially at leading edge of vblank.
      
      This exposed a problem with radeon-kms/amdgpu-kms which do not
      satisfy above requirements:
      
      The vblank irq fires a few scanlines before start of vblank, but
      programmed pageflips complete at start of vblank and
      vblank timestamps update at start of vblank, whereas the
      hw vblank counter increments only later, at start of vsync.
      
      This leads to problems like off by one errors for vblank counter
      updates, vblank counters apparently going backwards or vblank
      timestamps apparently having time going backwards. The net result
      is stuttering of graphics in games, or little hangs, as well as
      total failure of timing sensitive applications.
      
      See bug #93147 for an example of the regression on Linux 4.4-rc:
      
      https://bugs.freedesktop.org/show_bug.cgi?id=93147
      
      This patch tries to align all above events better from the
      viewpoint of the drm core / of external callers to fix the problem:
      
      1. The apparent start of vblank is shifted a few scanlines earlier,
      so the vblank irq now always happens after start of this extended
      vblank interval and thereby drm_update_vblank_count() always samples
      the updated vblank count and timestamp of the new vblank interval.
      
      To achieve this, the reporting of scanout positions by
      radeon_get_crtc_scanoutpos() now operates as if the vblank starts
      radeon_crtc->lb_vblank_lead_lines before the real start of the hw
      vblank interval. This means that the vblank timestamps which are based
      on these scanout positions will now update at this earlier start of
      vblank.
      
      2. The driver->get_vblank_counter() function will bump the returned
      vblank count as read from the hw by +1 if the query happens after
      the shifted earlier start of the vblank, but before the real hw increment
      at start of vsync, so the counter appears to increment at start of vblank
      in sync with the timestamp update.
      
      3. Calls from vblank irq-context and regular non-irq calls are now
      treated identically, always simulating the shifted vblank start, to
      avoid inconsistent results for queries happening from vblank irq vs.
      happening from drm_vblank_enable() or vblank_disable_fn().
      
      4. The radeon_flip_work_func will delay mmio programming a pageflip until
      the start of the real vblank iff it happens to execute inside the shifted
      earlier start of the vblank, so pageflips now also appear to execute at
      start of the shifted vblank, in sync with vblank counter and timestamp
      updates. This is to avoid some races between updates of vblank count and
      timestamps that are used for swap scheduling and pageflip execution which
      could cause pageflips to execute before the scheduled target vblank.
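
      Items 1 and 2 above can be sketched as follows. This is a simplified
      model under stated assumptions, not the driver code: scanline
      positions are plain non-negative line numbers, the shifted vblank
      start does not wrap around the frame, and all type and function names
      are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical timing description used only by this sketch. */
struct crtc_timing {
    int vbl_start;   /* first scanline of the real hw vblank */
    int vsync_start; /* scanline at which the hw counter increments */
    int vtotal;      /* total scanlines per frame */
    int lead_lines;  /* how far the apparent vblank start is shifted */
};

/* True between the shifted vblank start and the hw counter increment. */
static bool in_shifted_vblank_before_vsync(const struct crtc_timing *t,
                                           int vpos)
{
    assert(t->lead_lines >= 0 && t->vbl_start >= t->lead_lines);
    assert(t->vbl_start <= t->vsync_start && t->vsync_start < t->vtotal);
    return vpos >= t->vbl_start - t->lead_lines && vpos < t->vsync_start;
}

/*
 * Count to report: the hw value, bumped by one while the hw counter has
 * not yet caught up with the (shifted) start of vblank.
 */
static unsigned int reported_vblank_count(const struct crtc_timing *t,
                                          unsigned int hw_count, int vpos)
{
    return in_shifted_vblank_before_vsync(t, vpos) ? hw_count + 1 : hw_count;
}
```

      For a 1080-line mode with vsync at line 1084 and 3 lead lines, a query
      at line 1077 would report hw_count + 1, while a query at line 1084
      would report the already incremented hw value unchanged.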
      
      The lb_vblank_lead_lines "fudge" value is calculated as the size of
      the display controller's line buffer in scanlines for the given video
      mode: Vblank irqs are triggered by the line buffer logic when the line
      buffer refill for a video frame ends, i.e. when the line buffer source read
      position enters the hw vblank. This means that a vblank irq could fire at
      most as many scanlines before the current reported scanout position of the
      crtc timing generator as the number of scanlines the line buffer can
      maximally hold for a given video mode.
      
      This patch has been successfully tested on a RV730 card with DCE-3 display
      engine and on an Evergreen card with DCE-4 display engine, in single-display
      and dual-display configuration, with different video modes.
      
      A similar patch is needed for amdgpu-kms to fix the same problem.
      
      Limitations:
      
      - Line buffer sizes in pixels are hard-coded on < DCE-4 to a value
        I just guessed to be high enough to work ok, lacking info on the true
        sizes atm.
      
      Fixes: fdo#93147
      Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: Michel Dänzer <michel.daenzer@amd.com>
      Cc: Harry Wentland <Harry.Wentland@amd.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      
      (v1) Tested-by: Dave Witbrodt <dawitbro@sbcglobal.net>
      
      (v2) Refine radeon_flip_work_func() for better efficiency:
      
           In radeon_flip_work_func, replace the busy waiting udelay(5)
           with event lock held by a more performance and energy efficient
           usleep_range() until at least predicted true start of hw vblank,
           with some slack for scheduler happiness. Release the event lock
           during waits to not delay other outputs in doing their stuff, as
           the waiting can last up to 200 usecs in some cases.
      
           Retested on DCE-3 and DCE-4 to verify it still works nicely.
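
      The v2 change can be pictured as computing a sleep window instead of
      spinning. The sketch below is only an illustration: it assumes the
      remaining scanlines until the real vblank and the duration of one
      scanline are known, and it picks twice the minimum as the upper bound
      for the slack. Those choices and all names are hypothetical.

```c
/* Hypothetical sleep window in microseconds, e.g. for usleep_range(). */
struct sleep_window {
    unsigned int min_us;
    unsigned int max_us;
};

/*
 * Estimate how long to sleep until the real start of hw vblank, given the
 * number of scanlines still to go and the duration of one scanline.
 */
static struct sleep_window vblank_sleep_window(unsigned int lines_left,
                                               unsigned int line_dur_ns)
{
    struct sleep_window w;
    unsigned int line_us = line_dur_ns / 1000;

    if (line_us < 1)
        line_us = 1; /* never round a scanline down to zero microseconds */

    w.min_us = lines_left * line_us;
    w.max_us = 2 * w.min_us; /* slack for the scheduler (assumed factor) */
    return w;
}
```

      With 3 lines left and a 14800 ns scanline this yields a window of 42
      to 84 microseconds.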
      
      (v2) Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      5b5561b3
    • Lyude's avatar
      drm/radeon: Retry DDC probing on DVI on failure if we got an HPD interrupt · cb5d4166
      Lyude authored
      HPD signals on DVI ports can be fired off before the pins required for
      DDC probing actually make contact, due to the pins for HPD making
      contact first. This results in an HPD signal being asserted but DDC
      probing failing, resulting in hotplugging occasionally failing.
      
      This is somewhat rare on most cards (depending on what angle you plug
      the DVI connector in), but on some cards it happens constantly. The
      Radeon R5 on the machine used for testing this patch for instance, runs
      into this issue just about every time I try to hotplug a DVI monitor and
      as a result hotplugging almost never works.
      
      Rescheduling the hotplug work for a second when we run into an HPD
      signal with a failing DDC probe usually gives enough time for the rest
      of the connector's pins to make contact, and fixes this issue.
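
      The retry decision described above might look roughly like the sketch
      below. It is not the driver code: the names are hypothetical, and
      limiting the retry to a single attempt is an assumption of this
      sketch rather than something stated in this message.

```c
#include <stdbool.h>

#define DDC_RETRY_DELAY_MS 1000u /* "a second", as described above */

enum hotplug_action {
    HOTPLUG_CONNECTED,    /* DDC probe succeeded */
    HOTPLUG_DISCONNECTED, /* nothing detected, give up */
    HOTPLUG_RETRY         /* HPD seen but DDC failed: probe again later */
};

struct hotplug_decision {
    enum hotplug_action action;
    unsigned int delay_ms; /* only meaningful for HOTPLUG_RETRY */
};

/* Decide what to do after one DDC probe on a DVI connector. */
static struct hotplug_decision dvi_hotplug_decide(bool hpd_irq, bool ddc_ok,
                                                  bool already_retried)
{
    struct hotplug_decision d = { HOTPLUG_DISCONNECTED, 0 };

    if (ddc_ok) {
        d.action = HOTPLUG_CONNECTED;
    } else if (hpd_irq && !already_retried) {
        /* The HPD pins made contact before the DDC pins: wait and retry. */
        d.action = HOTPLUG_RETRY;
        d.delay_ms = DDC_RETRY_DELAY_MS;
    }
    return d;
}
```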
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Lyude <cpaul@redhat.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      cb5d4166
    • jimqu's avatar
      drm/amdgpu: add spin lock to protect freed list in vm (v2) · 81d75a30
      jimqu authored
      A protection fault occurs on the freed list during OCL tests.
      Add a spin lock to protect it.
      
      v2: drop changes in vm_fini
      Signed-off-by: JimQu <jim.qu@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      81d75a30