1. 13 Mar, 2015 10 commits
    • kasan, module, vmalloc: rework shadow allocation for modules · a5af5aa8
      Andrey Ryabinin authored
      The current approach to handling shadow memory for modules is broken.

      Shadow memory may be freed only after the memory it corresponds to is no
      longer used.  vfree() called from interrupt context may use the memory it
      is freeing to store a 'struct llist_node':
      
          void vfree(const void *addr)
          {
          ...
              if (unlikely(in_interrupt())) {
                  struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
                  if (llist_add((struct llist_node *)addr, &p->list))
                          schedule_work(&p->wq);
      
      Later this list node is used in free_work(), which actually frees the
      memory.  Currently, module_memfree() called in interrupt context frees the
      shadow before freeing the module's memory, which could provoke a kernel
      crash.
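
      The pattern described above can be sketched in user space.  This is a
      minimal illustration, not the kernel code: struct deferred_node,
      defer_free() and drain_deferred() are hypothetical names, and the sketch
      is single-threaded.  It shows only that the list node lives inside the
      buffer being freed, so the buffer (and anything describing it, such as
      shadow memory) has to stay valid until the deferred worker has run.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for 'struct llist_node'. */
struct deferred_node {
	struct deferred_node *next;
};

/* Head of the list of buffers waiting to be freed (single-threaded sketch). */
struct deferred_node *deferred_head;

/*
 * Queue 'addr' for freeing later.  The list node is written into the
 * buffer being freed, so the buffer must stay valid until drain_deferred().
 * Assumes the buffer is at least sizeof(struct deferred_node) bytes.
 */
void defer_free(void *addr)
{
	struct deferred_node *node = addr;

	node->next = deferred_head;
	deferred_head = node;
}

/* Free every queued buffer; returns how many buffers were freed. */
size_t drain_deferred(void)
{
	size_t freed = 0;

	while (deferred_head) {
		struct deferred_node *node = deferred_head;

		/* Read the node stored inside the buffer before freeing it. */
		deferred_head = node->next;
		free(node);
		freed++;
	}
	return freed;
}
```

      Between defer_free() and drain_deferred() the buffer is logically freed
      but still read and written, which is the window in which releasing its
      shadow too early would be a problem.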
      
      So shadow memory should be freed after the module's memory.  However, such
      a deallocation order could race with kasan_module_alloc() in
      module_alloc().

      Free the shadow right before releasing the vm area.  At this point the
      vfree()'d memory is no longer used, yet it is not available for other
      allocations.  A new VM_KASAN flag indicates that a vm area has dynamically
      allocated shadow memory, so kasan frees the shadow only if it was
      previously allocated.
      Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fanotify: fix event filtering with FAN_ONDIR set · b3c1030d
      Suzuki K. Poulose authored
      With FAN_ONDIR set, the user can end up getting events that it hasn't
      marked.  This was revealed by a fanotify04 testcase failure on
      Linux-4.0-rc1, and is a regression from 3.19 exposed by 66ba93c0
      ("fanotify: don't set FAN_ONDIR implicitly on a marks ignored mask").
      
         # /opt/ltp/testcases/bin/fanotify04
         [ ... ]
        fanotify04    7  TPASS  :  event generated properly for type 100000
        fanotify04    8  TFAIL  :  fanotify04.c:147: got unexpected event 30
        fanotify04    9  TPASS  :  No event as expected
      
      The testcase adds the following marks: FAN_OPEN | FAN_ONDIR for a
      fanotify on a dir.  It then does an open(), followed by a close(), of the
      directory and expects to see the event FAN_OPEN (0x20).  However,
      fanotify returns (FAN_OPEN | FAN_CLOSE_NOWRITE (0x10)).  This happens due
      to a flaw in the check for event_mask in fanotify_should_send_event(),
      which does:
      
      	if (event_mask & marks_mask & ~marks_ignored_mask)
      		return true;
      
      where, event_mask == (FAN_ONDIR | FAN_CLOSE_NOWRITE),
             marks_mask == (FAN_ONDIR | FAN_OPEN),
             marks_ignored_mask == 0
      
      Fix this by masking the outgoing events to the user, as we already take
      care of FAN_ONDIR and FAN_EVENT_ON_CHILD.
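
      The mask arithmetic above can be checked with a small standalone sketch.
      It is not the kernel patch: should_send_unmasked(), should_send_masked()
      and SKETCH_OUTGOING_EVENTS are hypothetical names, and the set of
      outgoing event bits is reduced to the two events discussed here.  The
      FAN_* values are repeated from the fanotify UAPI header so the example is
      self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as in the fanotify UAPI header, repeated for self-containment. */
#define FAN_CLOSE_NOWRITE 0x00000010u
#define FAN_OPEN          0x00000020u
#define FAN_ONDIR         0x40000000u

/* Hypothetical: the event bits that may be reported to the user here. */
#define SKETCH_OUTGOING_EVENTS (FAN_OPEN | FAN_CLOSE_NOWRITE)

/* The check quoted above: any common bit, including FAN_ONDIR, matches. */
bool should_send_unmasked(uint32_t event_mask, uint32_t marks_mask,
			  uint32_t marks_ignored_mask)
{
	return (event_mask & marks_mask & ~marks_ignored_mask) != 0;
}

/* Same check, but only bits that are real outgoing events can match. */
bool should_send_masked(uint32_t event_mask, uint32_t marks_mask,
			uint32_t marks_ignored_mask)
{
	return (event_mask & SKETCH_OUTGOING_EVENTS &
		marks_mask & ~marks_ignored_mask) != 0;
}
```

      With event_mask == (FAN_ONDIR | FAN_CLOSE_NOWRITE) and marks_mask ==
      (FAN_ONDIR | FAN_OPEN), the unmasked check matches on the shared
      FAN_ONDIR bit alone, while the masked check does not match; an event of
      (FAN_ONDIR | FAN_OPEN) still matches in both.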
      Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
      Tested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/nommu.c: export symbol max_mapnr · 5b8bf307
      gchen gchen authored
      Several modules may need max_mapnr, so export it.  The related error with
      allmodconfig under c6x:
      
        MODPOST 3327 modules
        ERROR: "max_mapnr" [fs/pstore/ramoops.ko] undefined!
        ERROR: "max_mapnr" [drivers/media/v4l2-core/videobuf2-dma-contig.ko] undefined!
      Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • arch/c6x/include/asm/pgtable.h: define dummy pgprot_writecombine for !MMU · 65b9ab88
      Chen Gang authored
      When !MMU, asm-generic does not define a default pgprot_writecombine, so
      c6x needs to define it itself.  The related error:
      
          CC [M]  fs/pstore/ram_core.o
        fs/pstore/ram_core.c: In function 'persistent_ram_vmap':
        fs/pstore/ram_core.c:399:10: error: implicit declaration of function 'pgprot_writecombine' [-Werror=implicit-function-declaration]
           prot = pgprot_writecombine(PAGE_KERNEL);
                  ^
        fs/pstore/ram_core.c:399:8: error: incompatible types when assigning to type 'pgprot_t {aka struct <anonymous>}' from type 'int'
           prot = pgprot_writecombine(PAGE_KERNEL);
                ^
      Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • nilfs2: fix deadlock of segment constructor during recovery · 283ee148
      Ryusuke Konishi authored
      According to a report from Yuxuan Shui, nilfs2 in kernel 3.19 got stuck
      during recovery at mount time.  The code path that caused the deadlock was
      as follows:
      
        nilfs_fill_super()
          load_nilfs()
            nilfs_salvage_orphan_logs()
              * Do roll-forwarding, attach segment constructor for recovery,
                and kick it.
      
              nilfs_segctor_thread()
                nilfs_segctor_thread_construct()
                 * A lock is held with nilfs_transaction_lock()
                   nilfs_segctor_do_construct()
                     nilfs_segctor_drop_written_files()
                       iput()
                         iput_final()
                           write_inode_now()
                             writeback_single_inode()
                               __writeback_single_inode()
                                 do_writepages()
                                   nilfs_writepage()
                                     nilfs_construct_dsync_segment()
                                       nilfs_transaction_lock() --> deadlock
      
      This can happen if commit 7ef3ff2f ("nilfs2: fix deadlock of segment
      constructor over I_SYNC flag") is applied and roll-forward recovery is
      performed at mount time.  Roll-forward recovery can happen if a datasync
      write is done and the file system crashes immediately afterwards.  For
      instance, we can reproduce the issue with the following steps:
      
       < nilfs2 is mounted on /nilfs (device: /dev/sdb1) >
       # dd if=/dev/zero of=/nilfs/test bs=4k count=1 && sync
       # dd if=/dev/zero of=/nilfs/test conv=notrunc oflag=dsync bs=4k count=1 && reboot -nfh
       < the system will immediately reboot >
       # mount -t nilfs2 /dev/sdb1 /nilfs
      
      The deadlock occurs because iput() can run the segment constructor through
      writeback_single_inode() if the MS_ACTIVE flag is not set in sb->s_flags.
      The above commit changed the segment constructor so that it calls iput()
      asynchronously for inodes with i_nlink == 0, but that change was
      imperfect.

      This fixes the remaining deadlock by deferring iput() in the segment
      constructor even when the mount has not finished, that is, when the
      MS_ACTIVE flag is not set.
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Reported-by: Yuxuan Shui <yshuiv7@gmail.com>
      Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: cma: fix CMA aligned offset calculation · 850fc430
      Danesh Petigara authored
      The CMA aligned offset calculation is incorrect for non-zero order_per_bit
      values.
      
      For example, if cma->order_per_bit=1, cma->base_pfn= 0x2f800000 and
      align_order=12, the function returns a value of 0x17c00 instead of 0x400.
      
      This patch fixes the CMA aligned offset calculation.
      
      The previous calculation was wrong and would return too-large values for
      the offset, so that when cma_alloc looks for free pages in the bitmap with
      the requested alignment > order_per_bit, it starts too far into the bitmap
      and so CMA allocations will fail despite there actually being plenty of
      free pages remaining.  It will also probably have the wrong alignment.
      With this change, we will get the correct offset into the bitmap.
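
      The figures in the example above can be reproduced with a short
      standalone sketch.  It is an illustration under assumptions, not the
      kernel source: offset_mixed_units() and offset_in_bits() are hypothetical
      names for one calculation that mixes page-frame and bitmap units and one
      that converts to bitmap units only at the end.  The numbers match the
      example only when the base is taken as page frame number 0x2f800 (that
      is, address 0x2f800000 with 4 KiB pages), which is an assumed reading of
      the base_pfn value quoted above.

```c
/* Round x up to the next multiple of a, where a is a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/*
 * Mixed units: the aligned value is a page frame number, but the value
 * subtracted from it is a bitmap index, so the result is far too large.
 */
unsigned long offset_mixed_units(unsigned long base_pfn,
				 unsigned int order_per_bit,
				 unsigned int align_order)
{
	unsigned long alignment;

	if (align_order <= order_per_bit)
		return 0;
	alignment = 1UL << (align_order - order_per_bit);
	return ALIGN_UP(base_pfn, alignment) - (base_pfn >> order_per_bit);
}

/*
 * Consistent units: compute the distance to the next aligned page frame,
 * then convert that distance from pages to bitmap bits.
 */
unsigned long offset_in_bits(unsigned long base_pfn,
			     unsigned int order_per_bit,
			     unsigned int align_order)
{
	if (align_order <= order_per_bit)
		return 0;
	return (ALIGN_UP(base_pfn, 1UL << align_order) - base_pfn)
		>> order_per_bit;
}
```

      With base_pfn = 0x2f800, order_per_bit = 1 and align_order = 12,
      offset_mixed_units() returns 0x17c00 and offset_in_bits() returns 0x400.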
      
      One affected user is powerpc KVM, which has kvm_cma->order_per_bit set to
      KVM_CMA_CHUNK_ORDER - PAGE_SHIFT, or 18 - 12 = 6.
      
      [gregory.0xf0@gmail.com: changelog additions]
      Signed-off-by: Danesh Petigara <dpetigara@broadcom.com>
      Reviewed-by: Gregory Fong <gregory.0xf0@gmail.com>
      Acked-by: Michal Nazarewicz <mina86@mina86.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, hugetlb: close race when setting PageTail for gigantic pages · 44fc8057
      David Rientjes authored
      Now that gigantic pages are dynamically allocatable, care must be taken to
      ensure that p->first_page is valid before setting PageTail.
      
      If this isn't done, then it is possible to race and have compound_head()
      return NULL.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, oom: do not fail __GFP_NOFAIL allocation if oom killer is disabled · e009d5dc
      Michal Hocko authored
      Tetsuo Handa has pointed out that __GFP_NOFAIL allocations might fail
      after the OOM killer is disabled if the allocation is performed by a
      kernel thread.  This behavior was introduced from the very beginning by
      7f33d49a ("mm, PM/Freezer: Disable OOM killer when tasks are frozen").
      This means that the basic contract for the allocation request is broken
      and the context requesting such an allocation might blow up unexpectedly.
      
      There are basically two ways forward.
      
      1) move oom_killer_disable after kernel threads are frozen.  This has a
         risk that the OOM victim wouldn't be able to finish because it would
         depend on an already frozen kernel thread.  This would be really tricky
         to debug.
      
      2) do not fail __GFP_NOFAIL allocations no matter what, and risk that
         freezable kernel threads will loop and fail the suspend.
         Incidental allocations after kernel threads are frozen will at least
         dump a warning - if we are lucky and the serial console is still active
         of course...

      This patch implements the latter option because it is safer.  We would see
      warnings rather than allocation failures for the kernel threads, which
      would blow up otherwise, and have a higher chance of identifying
      __GFP_NOFAIL users from deeper pm code.
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: David Rientjes <rientjes@gooogle.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drivers/rtc/rtc-s3c.c: add .needs_src_clk to s3c6410 RTC data · 8792f777
      Javier Martinez Canillas authored
      Commit df9e26d0 ("rtc: s3c: add support for RTC of Exynos3250 SoC")
      added an "rtc_src" DT property to specify the clock used as a source to
      the S3C real-time clock.
      
      Not all SoCs need this, so commit eaf3a659 ("drivers/rtc/rtc-s3c.c:
      fix initialization failure without rtc source clock") changed the driver
      to check the struct s3c_rtc_data .needs_src_clk field to conditionally
      grab the clock.

      But that commit didn't update the data for each IP version, so the RTC
      broke on the boards that need a source clock.  This is the case for at
      least Exynos5250 and Exynos5440, which use the s3c6410 RTC IP block.
      
      This commit fixes the S3C rtc on the Exynos5250 Snow and Exynos5420
      Peach Pit and Pi Chromebooks.
      Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Chanwoo Choi <cw00.choi@samsung.com>
      Cc: Doug Anderson <dianders@chromium.org>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: Kevin Hilman <khilman@linaro.org>
      Cc: Tyler Baker <tyler.baker@linaro.org>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: make append_dio an incompat feature · 18d585f0
      Mark Fasheh authored
      It turns out that making this feature ro_compat isn't quite enough to
      prevent accidental corruption on mount from older kernels.  Ocfs2 (like
      other file systems) will process orphaned inodes even when the user mounts
      in 'ro' mode.  So for the case of a filesystem not knowing the append_dio
      feature, mounting the filesystem could result in orphaned-for-dio files
      being deleted, which we clearly don't want.
      
      So instead, turn this into an incompat flag.
      
      Btw, this is kind of my fault - initially I asked that we add a flag to
      cover the feature and even suggested that we use an ro flag.  It wasn't
      until I was looking through our commits for v4.0-rc1 that I realized we
      actually want this to be incompat.
      Signed-off-by: Mark Fasheh <mfasheh@suse.de>
      Cc: Joseph Qi <joseph.qi@huawei.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 12 Mar, 2015 7 commits
    • Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 09d35919
      Linus Torvalds authored
      Pull i2c fix from Wolfram Sang:
       "An important bugfix for the I2C subsystem core"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        Revert "i2c: core: Dispose OF IRQ mapping at client removal time"
    • Merge tag 'pci-v4.0-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 91e9134e
      Linus Torvalds authored
      Pull PCI fixes from Bjorn Helgaas:
       "Here are a couple updates for v4.0.
      
        One fixes a config accessor problem on APM X-Gene that we introduced
        when switching to generic config accessors, and the other fixes an
        older read-past-end-of-buffer problem in sysfs.
      
        APM X-Gene host bridge driver
          - Add register offset to config space base address (Feng Kan)
      
        Miscellaneous
          - Don't read past the end of sysfs "driver_override" buffer (Sasha Levin)"
      
      * tag 'pci-v4.0-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        PCI: xgene: Add register offset to config space base address
        PCI: Don't read past the end of sysfs "driver_override" buffer
    • Merge tag 'microblaze-4.0-rc4' of git://git.monstr.eu/linux-2.6-microblaze · d3dd73fc
      Linus Torvalds authored
      Pull arch/microblaze fixes from Michal Simek:
       "Fix syscall error recovery.
      
        Two patches - one is just a preparation patch for the second, which
        fixes the problem with syscalls"
      
      * tag 'microblaze-4.0-rc4' of git://git.monstr.eu/linux-2.6-microblaze:
        microblaze: Fix syscall error recovery for invalid syscall IDs
        microblaze: Coding style cleanup
    • Merge tag 'nios2-fix-4.0-rc4' of git://git.rocketboards.org/linux-socfpga-next · 56275112
      Linus Torvalds authored
      Pull arch/nios2 fix from Ley Foon Tan:
       "Remove pt_regs from user header and use generic ucontext.h"
      
      * tag 'nios2-fix-4.0-rc4' of git://git.rocketboards.org/linux-socfpga-next:
        nios2: update pt_regs
    • mm: fix up numa read-only thread grouping logic · 53da3bc2
      Linus Torvalds authored
      Dave Chinner reported that commit 4d942466 ("mm: convert
      p[te|md]_mknonnuma and remaining page table manipulations") slowed down
      his xfsrepair test enormously.  In particular, it was using more system
      time due to extra TLB flushing.
      
      The ultimate reason turns out to be how the change to use the regular
      page table accessor functions broke the NUMA grouping logic.  The old
      special mknuma/mknonnuma code accessed the page table present bit and
      the magic NUMA bit directly, while the new code just changes the page
      protections using PROT_NONE and the regular vma protections.
      
      That sounds equivalent, and from a fault standpoint it really is, but a
      subtle side effect is that the *other* protection bits of the page table
      entries also change.  And the code to decide how to group the NUMA
      entries together used the writable bit to decide whether a particular
      page was likely to be shared read-only or not.
      
      And with the change to make the NUMA handling use the regular permission
      setting functions, that writable bit was basically always cleared for
      private mappings due to COW.  So even if the page actually ends up being
      written to in the end, the NUMA balancing would act as if it was always
      shared RO.
      
      This code is a heuristic anyway, so the fix - at least for now - is to
      instead check whether the page is dirty rather than writable.  The bit
      doesn't change with protection changes.
      
      NOTE! This also adds a FIXME comment to revisit this issue.
      
      Not only should we probably re-visit the whole "is this a shared
      read-only page" heuristic (we might want to take the vma permissions
      into account and base this more on those than the per-page ones, and
      also look at whether the particular access that triggers it is a write
      or not), but the whole COW issue shows that we should think about the
      NUMA fault handling some more.
      
      For example, maybe we should do the early-COW thing that a regular fault
      does.  Or maybe we should accept that while using the same bits as
      PROTNONE was a good thing (and got rid of the special NUMA bit), we
      might still want to just preserve the other protection bits across NUMA
      faulting.
      
      Those are bigger questions, left for later.  This just fixes up the
      heuristic so that it at least approximates working again.  More analysis
      and work needed.
      Reported-by: Dave Chinner <david@fromorbit.com>
      Tested-by: Mel Gorman <mgorman@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Revert "i2c: core: Dispose OF IRQ mapping at client removal time" · a4944572
      Jakub Kicinski authored
      This reverts commit e4df3a0b
      ("i2c: core: Dispose OF IRQ mapping at client removal time")
      
      Calling irq_dispose_mapping() will destroy the mapping and disassociate
      the IRQ from the IRQ chip to which it belongs. Keeping it is OK, because
      existing mappings are reused properly.
      
      Also, this commit breaks drivers using devm* for IRQ management on
      OF-based systems because devm* cleanup happens in device code, after
      bus's remove() method returns.
      Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
      Reported-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com>
      Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      [wsa: updated the commit message with findings from the other bug report]
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Cc: stable@kernel.org
      Fixes: e4df3a0b
    • nios2: update pt_regs · 92d5dd8c
      Chung-Ling Tang authored
      Signed-off-by: Chung-Ling Tang <cltang@codesourcery.com>
      Acked-by: Ley Foon Tan <lftan@altera.com>
  3. 11 Mar, 2015 2 commits
  4. 10 Mar, 2015 12 commits
    • Merge git://git.kernel.org/pub/scm/virt/kvm/kvm · affb8172
      Linus Torvalds authored
      Pull kvm/s390 bugfixes from Marcelo Tosatti.
      
      * git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: s390: non-LPAR case obsolete during facilities mask init
        KVM: s390: include guest facilities in kvm facility test
        KVM: s390: fix in memory copy of facility lists
        KVM: s390/cpacf: Fix kernel bug under z/VM
        KVM: s390/cpacf: Enable key wrapping by default
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · ec0e6bd3
      Linus Torvalds authored
      Pull s390 fixes from Martin Schwidefsky:
       "One performance optimization for page_clear and a couple of bug fixes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/mm: fix incorrect ASCE after crst_table_downgrade
        s390/ftrace: fix crashes when switching tracers / add notrace to cpu_relax()
        s390/pci: unify pci_iomap symbol exports
        s390/pci: fix [un]map_resources sequence
        s390: let the compiler do page clearing
        s390/pci: fix possible information leak in mmio syscall
        s390/dcss: array index 'i' is used before limits check.
        s390/scm_block: fix off by one during cluster reservation
        s390/jump label: improve and fix sanity check
        s390/jump label: add missing jump_label_apply_nops() call
    • Merge tag 'trace-fixes-v4.0-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · e7901af1
      Linus Torvalds authored
      Pull seq-buf/ftrace fixes from Steven Rostedt:
       "This includes fixes for seq_buf_bprintf() truncation issue.  It also
        contains fixes to ftrace when /proc/sys/kernel/ftrace_enabled and
        function tracing are started.  Doing the following causes some issues:
      
          # echo 0 > /proc/sys/kernel/ftrace_enabled
          # echo function_graph > /sys/kernel/debug/tracing/current_tracer
          # echo 1 > /proc/sys/kernel/ftrace_enabled
          # echo nop > /sys/kernel/debug/tracing/current_tracer
          # echo function_graph > /sys/kernel/debug/tracing/current_tracer
      
        As well as with function tracing too.  Pratyush Anand first reported
        this issue to me and supplied a patch.  When I tested this on my x86
        test box, it caused thousands of backtraces and warnings to appear in
        dmesg, which also caused a denial of service (a warning for every
        function that was listed).  I applied Pratyush's patch but it did not
        fix the issue for me.  I looked into it and found a slight problem
        with trampoline accounting.  I fixed it and sent Pratyush a patch, but
        he said that it did not fix the issue for him.
      
        I later learned that Pratyush was using an ARM64 server, and when I
        tested on my ARM board, I was able to reproduce the same issue as
        Pratyush.  After applying his patch, it fixed the problem.  The above
        test uncovered two different bugs, one in x86 and one in ARM and
        ARM64.  As this looked like it would affect PowerPC, I tested it on my
        PPC64 box.  It too broke, but neither the patch that fixed ARM or x86
        fixed this box (the changes were all in generic code!).  The above
        test uncovered two more bugs that affected PowerPC.  Again, the
        changes were only done to generic code.  It's the way the arch code
        expected things to be done that was different between the archs.  Some
        were more sensitive than others.
      
        The rest of this series fixes the PPC bugs as well"
      
      * tag 'trace-fixes-v4.0-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        ftrace: Fix ftrace enable ordering of sysctl ftrace_enabled
        ftrace: Fix en(dis)able graph caller when en(dis)abling record via sysctl
        ftrace: Clear REGS_EN and TRAMP_EN flags on disabling record via sysctl
        seq_buf: Fix seq_buf_bprintf() truncation
        seq_buf: Fix seq_buf_vprintf() truncation
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 36bef883
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) nft_compat accidentally truncates ethernet protocol to 8-bits, from
          Arturo Borrero.
      
       2) Memory leak in ip_vs_proc_conn(), from Julian Anastasov.
      
       3) Don't allow the space required for nftables rules to exceed the
          maximum value representable in the dlen field.  From Patrick
          McHardy.
      
       4) bcm63xx_enet can accidentally leave interrupts permanently disabled
          due to errors in the NAPI polling exit logic.  Fix from Nicolas
          Schichan.
      
       5) Fix OOPSes triggerable by the ping protocol module, due to missing
          address family validations etc.  From Lorenzo Colitti.
      
       6) Don't use RCU locking in sleepable context in team driver, from Jiri
          Pirko.
      
       7) xen-netback miscalculates statistic offset pointers when reporting
          the stats to userspace.  From David Vrabel.
      
       8) Fix a leak of up to 256 pages per VIF destroy in xen-netback, also
          from David Vrabel.
      
       9) ip_check_defrag() cannot assume that skb_network_offset() is zero,
          particularly when it is used by the AF_PACKET fanout defrag code.
          From Alexander Drozdov.
      
      10) gianfar driver doesn't query OF node names properly when trying to
          determine the number of hw queues available.  Fix it to explicitly
          check for OF nodes named queue-group.  From Tobias Waldekranz.
      
      11) MID field in macb driver should be 12 bits, not 16.  From Punnaiah
          Choudary Kalluri.
      
      12) Fix unintentional regression in traceroute due to timestamp socket
          option changes.  Empty ICMP payloads should be allowed in
          non-timestamp cases.  From Willem de Bruijn.
      
      13) When devices are unregistered, we have to get rid of AF_PACKET
          multicast list entries that point to it via ifindex.  Fix from
          Francesco Ruggeri.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits)
        tipc: fix bug in link failover handling
        net: delete stale packet_mclist entries
        net: macb: constify macb configuration data
        MAINTAINERS: add Marc Kleine-Budde as co maintainer for CAN networking layer
        MAINTAINERS: linux-can moved to github
        can: kvaser_usb: Read all messages in a bulk-in URB buffer
        can: kvaser_usb: Avoid double free on URB submission failures
        can: peak_usb: fix missing ctrlmode_ init for every dev
        can: add missing initialisations in CAN related skbuffs
        ip: fix error queue empty skb handling
        bgmac: Clean warning messages
        tcp: align tcp_xmit_size_goal() on tcp_tso_autosize()
        net: fec: fix unbalanced clk disable on driver unbind
        net: macb: Correct the MID field length value
        net: gianfar: correctly determine the number of queue groups
        ipv4: ip_check_defrag should not assume that skb_network_offset is zero
        net: bcmgenet: properly disable password matching
        net: eth: xgene: fix booting with devicetree
        bnx2x: Force fundamental reset for EEH recovery
        xen-netback: refactor xenvif_handle_frag_list()
        ...
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · e93df634
      Linus Torvalds authored
      Pull input subsystem fixes from Dmitry Torokhov:
       "Miscellaneous driver fixes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: psmouse - disable "palm detection" in the focaltech driver
        Input: psmouse - disable changing resolution/rate/scale for FocalTech
        Input: psmouse - ensure that focaltech reports consistent coordinates
        Input: psmouse - remove hardcoded touchpad size from the focaltech driver
        Input: tc3589x-keypad - set IRQF_ONESHOT flag to ensure IRQ request
        Input: ALPS - fix memory leak when detection fails
        Input: sun4i-ts - add thermal driver dependency
        Input: cyapa - remove superfluous type check in cyapa_gen5_read_idac_data()
        Input: cyapa - fix unaligned functions redefinition error
        Input: mma8450 - add parent device
    • Merge tag 'regulator-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator · 068c65c5
      Linus Torvalds authored
      Pull regulator fixes from Mark Brown:
       "A couple of driver specific fixes plus a fix for a regression in the
        core where the updates to use sysfs group registration were overly
        enthusiastic in eliding properties and removed some that had been
        previously present"
      
      * tag 'regulator-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
        regulator: Fix regression due to NULL constraints check
        regulator: rk808: Set the enable time for LDOs
        regulator: da9210: Mask all interrupt sources to deassert interrupt line
    • Linus Torvalds's avatar
      Merge tag 'spi-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · d08edd8f
      Linus Torvalds authored
      Pull spi fixes from Mark Brown:
       "A collection of driver specific fixes to which the usual comments
        about them being important if you see them mostly apply (except for
        the comment fix).  The pl022 one is particularly nasty for anyone
        affected by it"
      
      * tag 'spi-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: pl022: Fix race in giveback() leading to driver lock-up
        spi: dw-mid: avoid potential NULL dereference
        spi: img-spfi: Verify max spfi transfer length
        spi: fix a typo in comment.
        spi: atmel: Fix interrupt setup for PDC transfers
        spi: dw: revisit FIFO size detection again
        spi: dw-pci: correct number of chip selects
        drivers: spi: ti-qspi: wait for busy bit clear before data write/read
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · eca8dac4
      Linus Torvalds authored
      Pull tpm fixes from James Morris:
       "fixes for the TPM driver"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        tpm: fix call order in tpm-chip.c
        tpm/ibmvtpm: Additional LE support for tpm_ibmvtpm_send
    • Linus Torvalds's avatar
      Merge tag 'fbdev-fixes-4.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux · ecddad64
      Linus Torvalds authored
      Pull fbdev fixes from Tomi Valkeinen:
       - Fix regression with omapdss when using i2c displays
       - Fix possible null deref in fbmon
       - Check devm_kzalloc return value in AMBA CLCD
      
      * tag 'fbdev-fixes-4.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
        OMAPDSS: fix regression with display sysfs files
        video: fbdev: fix possible null dereference
        video: ARM CLCD: Add missing error check for devm_kzalloc
    • Linus Torvalds's avatar
      Merge branch 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · c0e99a71
      Linus Torvalds authored
      Pull cgroup fixes from Tejun Heo:
       "The cgroup iteration update two years ago and the recent cpuset
        restructuring introduced regressions in a subset of cpuset
        configurations.  Three patches to fix them.
      
        All are marked for -stable"
      
      * 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cpuset: Fix cpuset sched_relax_domain_level
        cpuset: fix a warning when clearing configured masks in old hierarchy
        cpuset: initialize effective masks when clone_children is enabled
    • Linus Torvalds's avatar
      Merge branch 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · f930713b
      Linus Torvalds authored
      Pull libata fixlet from Tejun Heo:
       "Speed limiting fix for sata_fsl"
      
      * 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
        sata-fsl: Apply link speed limits
    • Linus Torvalds's avatar
      Merge branch 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · b695f31f
      Linus Torvalds authored
      Pull workqueue fix from Tejun Heo:
       "One fix patch for a subtle livelock condition which can happen on
        PREEMPT_NONE kernels involving two racing cancel_work calls.  Whoever
        comes in second has to wait for the previous one to finish.  This
        was implemented by making the later one block on the same condition
        that the former would (work item completion) and then loop and
        retest; unfortunately, depending on the wake-up order, the later one
        could prevent the former one from finishing by busy looping on the CPU.
      
        This is fixed by implementing an explicit wait mechanism.  The work
        item might not belong anywhere at this point and there's a remote
        possibility of a thundering herd problem.  I originally tried to use
        bit_waitqueue but it didn't work for static work items in modules.
        It's currently using a single wait queue with a filtering wake-up
        function and exclusive wakeup.  If this ever becomes a problem, which
        is not very likely, we can try to figure out a way to piggyback on
        bit_waitqueue"
      
      * 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for PREEMPT_NONE
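
      The "single wait queue with a filtering wake-up function and exclusive
      wakeup" described above can be pictured with a small userspace model.
      This is only a sketch under stated assumptions: struct waiter,
      struct wait_queue, wq_add_waiter() and wq_wake_one() are hypothetical
      names, not kernel APIs, and no real blocking or locking is modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; names are illustrative, not kernel APIs. */
struct waiter {
    const void *work;   /* work item this waiter is blocked on */
    bool waiting;       /* still queued on the shared wait queue */
};

#define MAX_WAITERS 8

struct wait_queue {
    struct waiter slots[MAX_WAITERS];
    size_t count;
};

/* Queue a waiter for @work; returns its slot index or -1 when full. */
int wq_add_waiter(struct wait_queue *q, const void *work)
{
    if (q->count == MAX_WAITERS)
        return -1;
    q->slots[q->count].work = work;
    q->slots[q->count].waiting = true;
    return (int)q->count++;
}

/*
 * Wake at most one waiter whose key matches @work (filtering plus
 * exclusive wakeup). Waiters for other work items are left alone.
 * Returns the slot index woken, or -1 if nobody matched.
 */
int wq_wake_one(struct wait_queue *q, const void *work)
{
    for (size_t i = 0; i < q->count; i++) {
        if (q->slots[i].waiting && q->slots[i].work == work) {
            q->slots[i].waiting = false;
            return (int)i;
        }
    }
    return -1;
}
```

      Because all waiters share one queue, the filter keeps a completion of
      one work item from waking cancellers of unrelated items, and waking a
      single matching waiter limits the thundering-herd effect.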
  5. 09 Mar, 2015 9 commits
    • Jon Paul Maloy's avatar
      tipc: fix bug in link failover handling · e6441bae
      Jon Paul Maloy authored
      In commit c637c103
      ("tipc: resolve race problem at unicast message reception") we
      introduced a new mechanism for delivering buffers upwards from link
      to socket layer.
      
      That code contains a bug in how we handle the new link input queue
      during failover. When a link is reset, some of its users may be blocked
      because of congestion, and in order to resolve this, we add any pending
      wakeup pseudo messages to the link's input queue, and deliver them to
      the socket. This misses the case where the other, remaining link also
      may have congested users. Currently, the owner node's reference to the
      remaining link's input queue is unconditionally overwritten by the
      reset link's input queue. This has the effect that wakeup events from
      the remaining link may be unduly delayed (but not lost) for a
      potentially long period.
      
      We fix this by adding the pending events from the reset link to the
      input queue that is currently referenced by the node, whichever one
      it is.
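
      The idea of the fix can be sketched with simplified list types. The
      structures below (struct msg, struct queue, struct link, struct node)
      and the helper names are hypothetical stand-ins rather than the TIPC
      data structures; the point is only that pending wakeups are appended
      to the queue the node already references, and the node's reference is
      left untouched.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the link and node structures. */
struct msg { struct msg *next; int id; };
struct queue { struct msg *head, *tail; };
struct link { struct queue inputq; struct queue wakeupq; };
struct node { struct queue *inputq; /* queue currently delivered to sockets */ };

/* Append every message of @src to the tail of @dst and leave @src empty. */
void queue_splice_tail(struct queue *src, struct queue *dst)
{
    if (!src->head)
        return;
    if (dst->tail)
        dst->tail->next = src->head;
    else
        dst->head = src->head;
    dst->tail = src->tail;
    src->head = src->tail = NULL;
}

/*
 * On reset, move the link's pending wakeup messages to whichever input
 * queue the node references right now, instead of repointing the node
 * at the reset link's own queue.
 */
void link_reset_wakeups(struct node *n, struct link *reset)
{
    queue_splice_tail(&reset->wakeupq, n->inputq);
}
```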
      
      This commit should be applied to both net and net-next.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Francesco Ruggeri's avatar
      net: delete stale packet_mclist entries · 82f17091
      Francesco Ruggeri authored
      When an interface is deleted from a net namespace the ifindex in the
      corresponding entries in PF_PACKET sockets' mclists becomes stale.
      This can create inconsistencies if later an interface with the same ifindex
      is moved from a different namespace (not that unlikely since ifindexes are
      per-namespace).
      In particular we saw problems with dev->promiscuity, resulting
      in "promiscuity touches roof, set promiscuity failed. promiscuity
      feature of device might be broken" warnings and EOVERFLOW failures of
      setsockopt(PACKET_ADD_MEMBERSHIP).
      This patch deletes the mclist entries for interfaces that are deleted.
      Since this now causes setsockopt(PACKET_DROP_MEMBERSHIP) to fail with
      EADDRNOTAVAIL if called after the interface is deleted, also make
      packet_mc_drop not fail.
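
      A minimal sketch of the two behaviours described above, assuming a
      simple singly linked membership list. struct mc_entry, mc_add(),
      mc_purge_ifindex() and mc_drop() are hypothetical names chosen for
      illustration and do not mirror the af_packet implementation.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical, simplified membership entry; not the kernel's struct. */
struct mc_entry {
    struct mc_entry *next;
    int ifindex;
};

/* Prepend a membership for @ifindex; returns 0 or -ENOMEM. */
int mc_add(struct mc_entry **list, int ifindex)
{
    struct mc_entry *e = malloc(sizeof(*e));
    if (!e)
        return -ENOMEM;
    e->ifindex = ifindex;
    e->next = *list;
    *list = e;
    return 0;
}

/* Called when an interface goes away: unlink and free all of its entries. */
void mc_purge_ifindex(struct mc_entry **list, int ifindex)
{
    struct mc_entry **pp = list;
    while (*pp) {
        struct mc_entry *e = *pp;
        if (e->ifindex == ifindex) {
            *pp = e->next;
            free(e);
        } else {
            pp = &e->next;
        }
    }
}

/*
 * Drop one membership. A missing entry is not an error here, because the
 * interface may already have been deleted and its entries purged.
 */
int mc_drop(struct mc_entry **list, int ifindex)
{
    for (struct mc_entry **pp = list; *pp; pp = &(*pp)->next) {
        if ((*pp)->ifindex == ifindex) {
            struct mc_entry *e = *pp;
            *pp = e->next;
            free(e);
            return 0;
        }
    }
    return 0;
}
```

      Purging on deletion means a later interface that reuses the same
      ifindex starts with no inherited memberships.
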
      Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Josh Cartwright's avatar
      net: macb: constify macb configuration data · 0b2eb3e9
      Josh Cartwright authored
      The configurations are not modified by the driver.  Make them 'const' so
      that they may be placed in a read-only section.
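
      As a generic illustration of the pattern (not the macb driver's own
      definitions), a configuration table can be declared 'const' and
      passed around through pointer-to-const. The type and field names
      below are assumptions made up for the example.

```c
#include <stdint.h>

/* Hypothetical configuration type; field names are illustrative only. */
struct dev_config {
    uint32_t caps;
    unsigned int dma_burst_length;
};

/* 'const' lets the toolchain place the table in a read-only section. */
static const struct dev_config example_config = {
    .caps = 0x1,
    .dma_burst_length = 16,
};

/* Consumers take a pointer-to-const, so accidental writes fail to compile. */
unsigned int config_burst(const struct dev_config *cfg)
{
    return cfg->dma_burst_length;
}

/* Accessor returning the read-only table. */
const struct dev_config *get_example_config(void)
{
    return &example_config;
}
```
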
      Signed-off-by: Josh Cartwright <joshc@ni.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • David S. Miller's avatar
      Merge tag 'linux-can-fixes-for-4.0-20150309' of... · d0372504
      David S. Miller authored
      Merge tag 'linux-can-fixes-for-4.0-20150309' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can 2015-03-09
      
      This is a pull request for net/master for the 4.0 release cycle. It consists
      of 6 patches:
      
      A patch by Oliver Hartkopp fixes a long-standing bug in the infrastructure,
      which leads to skb_under_panics when CAN interfaces are used by AF_PACKET
      sockets, e.g. by dhclient. Stephane Grosjean contributes a patch for the
      peak_usb driver which adds a missing initialization. Two patches by Ahmed S.
      Darwish fix problems in the kvaser_usb driver. These are followed by two
      patches by me, updating the MAINTAINERS file.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Steven Rostedt (Red Hat)'s avatar
      ftrace: Fix ftrace enable ordering of sysctl ftrace_enabled · 524a3868
      Steven Rostedt (Red Hat) authored
      Some archs (specifically PowerPC) are sensitive to the ordering of
      enabling the calls to function tracing and setting the function
      that those calls should invoke.
      
      That is, update_ftrace_function() sets what function the ftrace_caller
      trampoline should call. Some archs require this to be set before
      calling ftrace_run_update_code().
      
      Another bug was discovered: ftrace_startup_sysctl() called
      ftrace_run_update_code() directly. If the function that the
      ftrace_caller trampoline calls has changed, it will not be updated.
      Instead, ftrace_startup_enable() should be called, because it tests
      whether the callback changed since the code was disabled and tells
      the arch to update appropriately. Most archs do not need this
      notification, but PowerPC does.
      
      The problem could be seen by the following commands:
      
       # echo 0 > /proc/sys/kernel/ftrace_enabled
       # echo function > /sys/kernel/debug/tracing/current_tracer
       # echo 1 > /proc/sys/kernel/ftrace_enabled
       # cat /sys/kernel/debug/tracing/trace
      
      The trace will show that function tracing was not active.
      
      Cc: stable@vger.kernel.org # 2.6.27+
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • Pratyush Anand's avatar
      ftrace: Fix en(dis)able graph caller when en(dis)abling record via sysctl · 1619dc3f
      Pratyush Anand authored
      When ftrace is enabled globally through the proc interface, we must check if
      ftrace_graph_active is set. If it is set, then we should also pass the
      FTRACE_START_FUNC_RET command to ftrace_run_update_code(). Similarly, when
      ftrace is disabled globally through the proc interface, we must check if
      ftrace_graph_active is set. If it is set, then we should also pass the
      FTRACE_STOP_FUNC_RET command to ftrace_run_update_code().
      
      Consider the following situation.
      
       # echo 0 > /proc/sys/kernel/ftrace_enabled
      
      After this ftrace_enabled = 0.
      
       # echo function_graph > /sys/kernel/debug/tracing/current_tracer
      
      Since ftrace_enabled = 0, ftrace_enable_ftrace_graph_caller() is never
      called.
      
       # echo 1 > /proc/sys/kernel/ftrace_enabled
      
      Now ftrace_enabled will be set to true, but still
      ftrace_enable_ftrace_graph_caller() will not be called, which is not
      desired.
      
      Further if we execute the following after this:
        # echo nop > /sys/kernel/debug/tracing/current_tracer
      
      Now since ftrace_enabled is set it will call
      ftrace_disable_ftrace_graph_caller(), which causes a kernel warning on
      the ARM platform.
      
      On the ARM platform, when ftrace_enable_ftrace_graph_caller() is called,
      it checks whether the old instruction is a nop or not. If it's not a nop,
      then it returns an error. If it is a nop then it replaces instruction at
      that address with a branch to ftrace_graph_caller.
      ftrace_disable_ftrace_graph_caller() behaves just the opposite. Therefore,
      if generic ftrace code ever calls either ftrace_enable_ftrace_graph_caller()
      or ftrace_disable_ftrace_graph_caller() twice in a row, the second call
      returns an error, which causes the generic ftrace code to raise a
      warning.
      
      Note, x86 does not have an issue with this because the architecture
      specific code for ftrace_enable_ftrace_graph_caller() and
      ftrace_disable_ftrace_graph_caller() does not check the previous state,
      and calling either of these functions twice in a row has no ill effect.
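
      The scenario above can be replayed with a small state model. This is
      a sketch under stated assumptions, not the kernel or ARM code: a
      single patch site stands in for the graph-caller call site, and
      struct ftrace_state, enable_graph_caller(), disable_graph_caller(),
      set_ftrace_enabled() and set_graph_tracer() are hypothetical names.
      The strict enable/disable helpers mimic an arch that checks the old
      instruction; set_ftrace_enabled() models the fixed sysctl path that
      also toggles the graph caller when the graph tracer is active.

```c
#include <stdbool.h>

/* Hypothetical model of one patch site; not the kernel's arch code. */
enum insn { INSN_NOP, INSN_BRANCH };

struct ftrace_state {
    enum insn graph_site;   /* instruction at the graph-caller call site */
    bool enabled;           /* models the ftrace_enabled sysctl */
    bool graph_active;      /* models ftrace_graph_active */
};

/* Strict arch behaviour: refuse to patch unless the old insn is a nop. */
int enable_graph_caller(struct ftrace_state *s)
{
    if (s->graph_site != INSN_NOP)
        return -1;
    s->graph_site = INSN_BRANCH;
    return 0;
}

/* Strict arch behaviour: refuse to patch unless the old insn is a branch. */
int disable_graph_caller(struct ftrace_state *s)
{
    if (s->graph_site != INSN_BRANCH)
        return -1;
    s->graph_site = INSN_NOP;
    return 0;
}

/* Sysctl write with the fix: keep the graph caller in step with the switch. */
int set_ftrace_enabled(struct ftrace_state *s, bool on)
{
    int ret = 0;

    if (on == s->enabled)
        return 0;
    s->enabled = on;
    if (s->graph_active)
        ret = on ? enable_graph_caller(s) : disable_graph_caller(s);
    return ret;
}

/* Selecting or leaving function_graph only patches while ftrace is enabled. */
int set_graph_tracer(struct ftrace_state *s, bool active)
{
    s->graph_active = active;
    if (!s->enabled)
        return 0;
    return active ? enable_graph_caller(s) : disable_graph_caller(s);
}
```

      Without the graph_active branch in set_ftrace_enabled(), re-enabling
      ftrace would leave the site as a nop, and the later switch to the nop
      tracer would call the strict disable helper on a nop and fail.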
      
      Link: http://lkml.kernel.org/r/e4fbe64cdac0dd0e86a3bf914b0f83c0b419f146.1425666454.git.panand@redhat.com
      
      Cc: stable@vger.kernel.org # 2.6.31+
      Signed-off-by: Pratyush Anand <panand@redhat.com>
      [
        removed extra if (ftrace_start_up) and defined ftrace_graph_active as 0
        if CONFIG_FUNCTION_GRAPH_TRACER is not set.
      ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • Steven Rostedt (Red Hat)'s avatar
      ftrace: Clear REGS_EN and TRAMP_EN flags on disabling record via sysctl · b24d443b
      Steven Rostedt (Red Hat) authored
      When /proc/sys/kernel/ftrace_enabled is set to zero, all function
      tracing is disabled. But the records that represent the functions
      still hold information about the ftrace_ops that are hooked to them.
      
      ftrace_ops may request "REGS" (have a full set of pt_regs passed to
      the callback), or "TRAMP" (the ops has its own trampoline to use).
      When the record is updated to represent the state of the ops hooked
      to it, it sets "REGS_EN" and/or "TRAMP_EN" to state that the callback
      points to the correct trampoline (REGS has its own trampoline).
      
      When ftrace_enabled is set to zero, all ftrace locations are a nop,
      so they do not point to any trampoline. But the _EN flags are still
      set. This can cause the accounting to go wrong when ftrace_enabled
      is cleared and an ops that has a trampoline is registered or unregistered.
      
      For example, the following will cause ftrace to crash:
      
       # echo function_graph > /sys/kernel/debug/tracing/current_tracer
       # echo 0 > /proc/sys/kernel/ftrace_enabled
       # echo nop > /sys/kernel/debug/tracing/current_tracer
       # echo 1 > /proc/sys/kernel/ftrace_enabled
       # echo function_graph > /sys/kernel/debug/tracing/current_tracer
      
      As function_graph uses a trampoline, when ftrace_enabled is set to zero
      the updates to the record are not done. When enabling function_graph
      again, the record will still have the TRAMP_EN flag set, and it will
      look for an op that has a trampoline other than the function_graph
      ops, and fail to find one.
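
      The flag handling can be illustrated with a simplified record. The
      flag names and bit values below are hypothetical and the real record
      layout differs; the sketch only shows that disabling a record should
      clear the "_EN" state bits while leaving the request bits in place.

```c
/* Hypothetical flag values; the real record layout differs. */
#define REC_ENABLED  (1u << 0)
#define REC_REGS     (1u << 1)
#define REC_REGS_EN  (1u << 2)
#define REC_TRAMP    (1u << 3)
#define REC_TRAMP_EN (1u << 4)

struct rec { unsigned int flags; };

/* Enabling a record: mirror each request flag into its _EN counterpart. */
void rec_enable(struct rec *r)
{
    r->flags |= REC_ENABLED;
    if (r->flags & REC_REGS)
        r->flags |= REC_REGS_EN;
    if (r->flags & REC_TRAMP)
        r->flags |= REC_TRAMP_EN;
}

/*
 * Disabling via the sysctl: the call site becomes a nop, so no trampoline
 * is in use and the _EN flags must be cleared along with ENABLED.
 */
void rec_disable(struct rec *r)
{
    r->flags &= ~(REC_ENABLED | REC_REGS_EN | REC_TRAMP_EN);
}
```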
      
      Cc: stable@vger.kernel.org # 3.17+
      Reported-by: Pratyush Anand <panand@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • James Morris's avatar
    • Marc Kleine-Budde's avatar
      MAINTAINERS: add Marc Kleine-Budde as co maintainer for CAN networking layer · f7214cf2
      Marc Kleine-Budde authored
      This patch adds Marc Kleine-Budde as a co maintainer for the CAN networking
      layer.
      Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>