1. 18 Feb, 2013 3 commits
    • cgroup: fix cgroup_rmdir() vs close(eventfd) race · 810cbee4
      Li Zefan authored
      commit 205a872b ("cgroup: fix lockdep
      warning for event_control") solved a deadlock by introducing a new
      bug.
      
      Moving cgrp->event_list to a temporary list doesn't mean you can traverse
      this list locklessly, because at the same time cgroup_event_wake() can
      be called and remove the event from the list. The result of this race
      is disastrous.
      
      We adopt the way the kvm irqfd code implements race-free event removal,
      which is now described in the comments in cgroup_event_wake().
      
      v3:
      - call eventfd_signal() regardless of whether it is eventfd close or
      cgroup removal that removes the cgroup event.
      Acked-by: Kirill A. Shutemov <kirill@shutemov.name>
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • cpuset: fix cpuset_print_task_mems_allowed() vs rename() race · 63f43f55
      Li Zefan authored
      rename() will change dentry->d_name. The result of this race can
      be worse than seeing a partially rewritten name: we might access
      a stale pointer, because rename() will re-allocate memory to hold
      a longer name.
      
      Accessing the name is safe under the protection of dentry->d_lock.
      
      v2: check NULL dentry before acquiring dentry lock.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org
    • cgroup: fix exit() vs rmdir() race · 71b5707e
      Li Zefan authored
      In cgroup_exit() put_css_set_taskexit() is called without any lock,
      which might lead to accessing a freed cgroup:
      
      thread1                           thread2
      ---------------------------------------------
      exit()
        cgroup_exit()
          put_css_set_taskexit()
            atomic_dec(cgrp->count);
                                         rmdir();
            /* not safe !! */
            check_for_release(cgrp);
      
      rcu_read_lock() can be used to make sure the cgroup is alive.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org
  2. 24 Jan, 2013 7 commits
  3. 23 Jan, 2013 1 commit
    • cgroup: fix bogus kernel warnings when cgroup_create() failed · 2739d3cc
      Li Zefan authored
      If cgroup_create() failed and cgroup_destroy_locked() is called to
      do cleanup, we'll see a bunch of warnings:
      
      cgroup_addrm_files: failed to remove 2MB.limit_in_bytes, err=-2
      cgroup_addrm_files: failed to remove 2MB.usage_in_bytes, err=-2
      cgroup_addrm_files: failed to remove 2MB.max_usage_in_bytes, err=-2
      cgroup_addrm_files: failed to remove 2MB.failcnt, err=-2
      cgroup_addrm_files: failed to remove prioidx, err=-2
      cgroup_addrm_files: failed to remove ifpriomap, err=-2
      ...
      
      We failed to remove those files, because cgroup_create() has failed
      before creating those cgroup files.
      
      To fix this, we simply don't warn if cgroup_rm_file() can't find the
      cft entry.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  4. 14 Jan, 2013 2 commits
    • cgroup: remove synchronize_rcu() from rebind_subsystems() · 130e3695
      Li Zefan authored
      Nothing's protected by RCU in rebind_subsystems(), and I can't think
      of a reason why it is needed.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • cgroup: remove synchronize_rcu() from cgroup_attach_{task|proc}() · 5d65bc0c
      Li Zefan authored
      These 2 synchronize_rcu()s make attaching a task to a cgroup
      quite slow, and it can't be ignored in some situations.
      
      A real case from Colin Cross: Android uses cgroups heavily to
      manage thread priorities, putting threads in a background group
      with reduced cpu.shares when they are not visible to the user,
      and in a foreground group when they are. Some RPCs from foreground
      threads to background threads will temporarily move the background
      thread into the foreground group for the duration of the RPC.
      This results in many calls to cgroup_attach_task.
      
      In cgroup_attach_task() it's task->cgroups that is protected by RCU,
      and put_css_set() calls kfree_rcu() to free it.
      
      If we remove this synchronize_rcu(), there can be threads in RCU-read
      sections accessing their old cgroup via current->cgroups with
      concurrent rmdir operation, but this is safe.
      
       # time for ((i=0; i<50; i++)) { echo $$ > /mnt/sub/tasks; echo $$ > /mnt/tasks; }
      
      real    0m2.524s
      user    0m0.008s
      sys     0m0.004s
      
      With this patch:
      
      real    0m0.004s
      user    0m0.004s
      sys     0m0.000s
      
      tj: These synchronize_rcu()s are utterly confused.  synchronize_rcu()
          necessarily has to come between two operations to guarantee that
          the changes made by the former operation are visible to all rcu
          readers before proceeding to the latter operation.  Here,
          synchronize_rcu() are at the end of attach operations with nothing
          beyond it.  Its only effect would be delaying completion of
          write(2) to sysfs tasks/procs files until all rcu readers see the
          change, which doesn't mean anything.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Colin Cross <ccross@google.com>
  5. 10 Jan, 2013 1 commit
  6. 08 Jan, 2013 1 commit
    • cgroups: fix cgroup_event_listener error handling · 799105d5
      Greg Thelen authored
      The error handling in cgroup_event_listener.c did not correctly deal
      with an error opening either <control_file> or
      cgroup.event_control.  Due to an uninitialized variable the program
      exit code was undefined if either of these opens failed.
      
      This patch simplifies and corrects cgroup_event_listener.c error
      handling by:
      1. using err*() rather than printf(),exit()
      2. depending on process exit to close open files
      
      With this patch failures always return non-zero error.
      Signed-off-by: Greg Thelen <gthelen@google.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
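A minimal sketch of the err()-based pattern that the cgroup_event_listener commit message describes, assuming a glibc or BSD <err.h>. The helper names open_or_die() and exit_status_of_open() are hypothetical and are not taken from cgroup_event_listener.c; the fork wrapper exists only so that the exit status can be observed.

```c
#include <err.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Open path read-only. On failure, err(3) prints the message plus
 * strerror(errno) to stderr and exits with status 1, so the exit
 * code is always defined and non-zero. */
int open_or_die(const char *path)
{
	int fd = open(path, O_RDONLY);

	if (fd == -1)
		err(1, "cannot open %s", path);
	return fd;
}

/* Hypothetical helper: run open_or_die() in a child process and
 * return the child's exit status (or -1 if that cannot be determined). */
int exit_status_of_open(const char *path)
{
	int status;
	pid_t pid = fork();

	if (pid == -1)
		return -1;
	if (pid == 0) {
		open_or_die(path);	/* the fd is closed implicitly at exit */
		_exit(0);
	}
	if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

Relying on process exit to close descriptors, as the commit message proposes, removes the cleanup paths where an uninitialized return value could otherwise leak into the exit code.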
  7. 07 Jan, 2013 3 commits
  8. 03 Jan, 2013 10 commits
  9. 02 Jan, 2013 8 commits
    • Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 5439ca6b
      Linus Torvalds authored
      Pull ext4 bug fixes from Ted Ts'o:
       "Various bug fixes for ext4.  Perhaps the most serious bug fixed is one
        which could cause file system corruptions when performing file punch
        operations."
      
      * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: avoid hang when mounting non-journal filesystems with orphan list
        ext4: lock i_mutex when truncating orphan inodes
        ext4: do not try to write superblock on ro remount w/o journal
        ext4: include journal blocks in df overhead calcs
        ext4: remove unaligned AIO warning printk
        ext4: fix an incorrect comment about i_mutex
        ext4: fix deadlock in journal_unmap_buffer()
        ext4: split off ext4_journalled_invalidatepage()
        jbd2: fix assertion failure in jbd2_journal_flush()
        ext4: check dioread_nolock on remount
        ext4: fix extent tree corruption caused by hole punch
    • mempolicy: remove arg from mpol_parse_str, mpol_to_str · a7a88b23
      Hugh Dickins authored
      Remove the unused argument (formerly no_context) from mpol_parse_str()
      and from mpol_to_str().
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tmpfs mempolicy: fix /proc/mounts corrupting memory · f2a07f40
      Hugh Dickins authored
      Recently I suggested using "mount -o remount,mpol=local /tmp" in NUMA
      mempolicy testing.  Very nasty.  Reading /proc/mounts, /proc/pid/mounts
      or /proc/pid/mountinfo may then corrupt one bit of kernel memory, often
      in a page table (causing "Bad swap" or "Bad page map" warning or "Bad
      pagetable" oops), sometimes in a vm_area_struct or rbnode or somewhere
      worse.  "mpol=prefer" and "mpol=prefer:Node" are equally toxic.
      
      Recent NUMA enhancements are not to blame: this dates back to 2.6.35,
      when commit e17f74af "mempolicy: don't call mpol_set_nodemask() when
      no_context" skipped mpol_parse_str()'s call to mpol_set_nodemask(),
      which used to initialize v.preferred_node, or set MPOL_F_LOCAL in flags.
      With slab poisoning, you can then rely on mpol_to_str() to set the bit
      for node 0x6b6b, probably in the next page above the caller's stack.
      
      mpol_parse_str() is only called from shmem_parse_options(): no_context
      is always true, so call it unused for now, and remove !no_context code.
      Set v.nodes or v.preferred_node or MPOL_F_LOCAL as mpol_to_str() might
      expect.  Then mpol_to_str() can ignore its no_context argument also,
      the mpol being appropriately initialized whether contextualized or not.
      Rename its no_context unused too, and let subsequent patch remove them
      (that's not needed for stable backporting, which would involve rejects).
      
      I don't understand why MPOL_LOCAL is described as a pseudo-policy:
      it's a reasonable policy which suffers from a confusing implementation
      in terms of MPOL_PREFERRED with MPOL_F_LOCAL.  I believe this would be
      much more robust if MPOL_LOCAL were recognized in switch statements
      throughout, MPOL_F_LOCAL deleted, and MPOL_PREFERRED use the (possibly
      empty) nodes mask like everyone else, instead of its preferred_node
      variant (I presume an optimization from the days before MPOL_LOCAL).
      But that would take me too long to get right and fully tested.
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: prevent missed events on EPOLL_CTL_MOD · 128dd175
      Eric Wong authored
      EPOLL_CTL_MOD sets the interest mask before calling f_op->poll() to
      ensure events are not missed.  Since the modifications to the interest
      mask are not protected by the same lock as ep_poll_callback, we need to
      ensure the change is visible to other CPUs calling ep_poll_callback.
      
      We also need to ensure f_op->poll() has an up-to-date view of past
      events which occurred before we modified the interest mask.  So this
      barrier also pairs with the barrier in wq_has_sleeper().
      
      This should guarantee either ep_poll_callback or f_op->poll() (or both)
      will notice the readiness of a recently-ready/modified item.
      
      This issue was encountered by Andreas Voellmy and Junchang(Jason) Wang in:
      http://thread.gmane.org/gmane.linux.kernel/1408782/
      Signed-off-by: Eric Wong <normalperson@yhbt.net>
      Cc: Hans Verkuil <hans.verkuil@cisco.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andreas Voellmy <andreas.voellmy@yale.edu>
      Tested-by: "Junchang(Jason) Wang" <junchang.wang@yale.edu>
      Cc: netdev@vger.kernel.org
      Cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
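The user-visible guarantee behind the EPOLL_CTL_MOD fix can be sketched from userspace. The example below is single-threaded, so it does not reproduce the cross-CPU race the barrier addresses; it only illustrates, under that simplification, that an event which became ready before the interest mask was modified is still reported afterwards. The function name mod_reports_pending_event() is hypothetical.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Returns 1 if an event that was already pending is reported after
 * EPOLL_CTL_MOD, 0 if it is not, and -1 on a setup error. */
int mod_reports_pending_event(void)
{
	int pipefd[2], epfd, n = -1;
	struct epoll_event ev = { .events = 0 }, out;

	if (pipe(pipefd) == -1)
		return -1;
	epfd = epoll_create1(0);
	if (epfd == -1)
		goto out_pipe;

	ev.data.fd = pipefd[0];
	/* Register with an empty interest mask: readability is not requested. */
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == -1)
		goto out_ep;
	/* The read end becomes readable before the mask is modified. */
	if (write(pipefd[1], "x", 1) != 1)
		goto out_ep;
	/* Modify the mask; the kernel re-polls the file as part of MOD. */
	ev.events = EPOLLIN;
	if (epoll_ctl(epfd, EPOLL_CTL_MOD, pipefd[0], &ev) == -1)
		goto out_ep;
	n = epoll_wait(epfd, &out, 1, 100);
	if (n == 1 && !(out.events & EPOLLIN))
		n = 0;
out_ep:
	close(epfd);
out_pipe:
	close(pipefd[0]);
	close(pipefd[1]);
	return n;
}
```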
    • watchdog: twl4030_wdt: add DT support · 8899b8d9
      Aaro Koskinen authored
      Add DT support for twl4030_wdt. This is needed to get twl4030_wdt to
      probe when booting with DT.
      Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
      Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
    • watchdog: omap_wdt: eliminate unused variable and a compiler warning · 412b3729
      Aaro Koskinen authored
      We forgot to delete this in the commit 4f4753d9 (watchdog: omap_wdt:
      convert to devm_ functions), and as a result the following compilation
      warning was introduced:
      
      drivers/watchdog/omap_wdt.c: In function 'omap_wdt_remove':
      drivers/watchdog/omap_wdt.c:299:19: warning: unused variable 'res' [-Wunused-variable]
      Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
      Reviewed-by: Paul Walmsley <paul@pwsan.com>
      Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
    • watchdog: da9055: Don't update wdt_dev->timeout in da9055_wdt_set_timeout error path · 98e4a293
      Axel Lin authored
      Otherwise, WDIOC_GETTIMEOUT returns wrong value if set_timeout fails.
      This patch also removes unnecessary ret variable in da9055_wdt_ping function.
      Signed-off-by: Axel Lin <axel.lin@ingics.com>
      Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
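The error-path rule from the da9055 set_timeout commit (do not record a new timeout unless the hardware accepted it) can be sketched in plain C. Everything here is a hypothetical stand-in: struct fake_wdt, fake_hw_write() and fake_wdt_set_timeout() are illustrative names and do not come from the da9055 driver or the watchdog core.

```c
#include <errno.h>

/* Hypothetical stand-in for a watchdog device. */
struct fake_wdt {
	unsigned int timeout;	/* value later reported back to userspace */
	int hw_fails;		/* non-zero makes the simulated write fail */
};

/* Simulated register write: returns 0 on success or a negative errno. */
static int fake_hw_write(struct fake_wdt *wdt, unsigned int timeout)
{
	(void)timeout;
	return wdt->hw_fails ? -EIO : 0;
}

int fake_wdt_set_timeout(struct fake_wdt *wdt, unsigned int timeout)
{
	int ret = fake_hw_write(wdt, timeout);

	if (ret < 0)
		return ret;		/* leave wdt->timeout untouched on failure */
	wdt->timeout = timeout;		/* commit only after the write succeeded */
	return 0;
}
```

With this ordering, a later query of the stored timeout keeps returning the last value the hardware actually accepted.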
    • watchdog: da9055: Fix invalid free of devm_ allocated data · ee8c94ad
      Axel Lin authored
      It is not required to free devm_ allocated data. Since kref_put
      needs a valid release function, da9055_wdt_release_resources()
      is not deleted.
      Signed-off-by: Axel Lin <axel.lin@ingics.com>
      Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
  10. 30 Dec, 2012 4 commits
    • Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux · 4a490b78
      Linus Torvalds authored
      Pull DRM update from Dave Airlie:
       "This is a bit larger due to me not bothering to do anything since
        before Xmas, and other people working too hard after I had clearly
        given up.
      
        It's got the 3 main x86 driver fixes pulls, and a bunch of tegra
        fixes, doesn't fix the Ironlake bug yet, but that does seem to be
        getting closer.
      
         - radeon: gpu reset fixes and userspace packet support
         - i915: watermark fixes, workarounds, i830/845 fix,
         - nouveau: nvd9/kepler microcode fixes, accel is now enabled and
           working, gk106 support
         - tegra: misc fixes."
      
      * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (34 commits)
        Revert "drm: tegra: protect DC register access with mutex"
        drm: tegra: program only one window during modeset
        drm: tegra: clean out old gem prototypes
        drm: tegra: remove redundant tegra2_tmds_config entry
        drm: tegra: protect DC register access with mutex
        drm: tegra: don't leave clients host1x member uninitialized
        drm: tegra: fix front_porch <-> back_porch mixup
        drm/nve0/graph: fix fuc, and enable acceleration on all known chipsets
        drm/nvc0/graph: fix fuc, and enable acceleration on GF119
        drm/nouveau/bios: cache ramcfg strap on later chipsets
        drm/nouveau/mxm: silence output if no bios data
        drm/nouveau/bios: parse/display extra version component
        drm/nouveau/bios: implement opcode 0xa9
        drm/nouveau/bios: update gpio parsing apis to match current design
        drm/nouveau: initial support for GK106
        drm/radeon: add WAIT_UNTIL to evergreen VM safe reg list
        drm/i915: disable shrinker lock stealing for create_mmap_offset
        drm/i915: optionally disable shrinker lock stealing
        drm/i915: fix flags in dma buf exporting
        drm/radeon: add support for MEM_WRITE packet
        ...
    • Merge tag 'omap-late-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 8d91a42e
      Linus Torvalds authored
      Pull late ARM cleanups for omap from Olof Johansson:
       "From Tony Lindgren:
      
        Here are a few more patches to finish the omap changes for multiplatform
        conversion that are not strictly fixes, but were too complex to do
        with the dependencies during the merge window.  Those are the move of
        serial-omap.h to platform_data, and the removal of remaining
        cpu_is_omap macro usage outside mach-omap2.
      
        Then there are several trivial fixes for typos and a few minimal
        omap2plus_defconfig updates."
      
      * tag 'omap-late-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        arch/arm/mach-omap2/dpll3xxx.c: drop if around WARN_ON
        OMAP2: Fix a typo - replace regist with register.
        ARM/omap: use module_platform_driver macro
        ARM: OMAP2+: PMU: Remove unused header
        ARM: OMAP4: remove duplicated include from omap_hwmod_44xx_data.c
        ARM: OMAP2+: omap2plus_defconfig: enable twl4030 SoC audio
        ARM: OMAP2+: omap2plus_defconfig: Add tps65217 support
        ARM: OMAP2+: enable devtmpfs and devtmpfs automount
        ARM: OMAP2+: omap_twl: Change TWL4030_MODULE_PM_RECEIVER to TWL_MODULE_PM_RECEIVER
        ARM: OMAP2+: Drop plat/cpu.h for omap2plus
        ARM: OMAP: Split fb.c to remove last remaining cpu_is_omap usage
        MAINTAINERS: Add an entry for omap related .dts files
    • Merge tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 4fe2dfab
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "It's been quiet over the holidays, but we have had a couple of trivial
        fixes coming in for the newly introduced sunxi platform; one to add it
        to the multiplatform defconfig for build coverage, and one fixup for
        device tree strings."
      
      * tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        sunxi: Change the machine compatible string.
        ARM: multi_v7_defconfig: Add ARCH_SUNXI
    • Revert "drm: tegra: protect DC register access with mutex" · d5757dbe
      Dave Airlie authored
      This reverts commit 83c0bcb6.
      
      Lucas pointed out this was a mistake, and I missed the discussion,
      so just revert it out to save a rebase.
      Signed-off-by: Dave Airlie <airlied@redhat.com>