1. 26 Jul, 2013 5 commits
  2. 25 Jul, 2013 3 commits
    • xfs: di_flushiter considered harmful · e1b4271a
      Dave Chinner authored
      When we made all inode updates transactional, we no longer needed
      the log recovery detection for inodes being newer on disk than the
      transaction being replayed - it was redundant as replay of the log
      would always result in the latest version of the inode being on
      disk. It was redundant, but left in place because it wasn't
      considered to be a problem.
      
      However, with the new "don't read inodes on create" optimisation,
      flushiter has come back to bite us. Essentially, the optimisation
      always initialises flushiter to zero in the create transaction,
      and so if we then crash and run recovery and the inode already on
      disk has a non-zero flushiter, recovery of that inode is skipped.
      As a result, log recovery does the wrong thing and we end up with a
      corrupt filesystem.
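
      The failure mode described above can be sketched as a small
      userspace model. This is an illustration only, not the kernel's
      recovery code: the function name skip_inode_replay() is
      hypothetical, and the 16-bit width of flushiter and the wrap
      handling are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: flushiter is a 16-bit counter. */
#define DI_MAX_FLUSH 0xffff

static_assert(DI_MAX_FLUSH == UINT16_MAX, "flushiter is modelled as 16 bits");

/*
 * Hypothetical model of the recovery check: returns true when replay of
 * the logged inode is skipped because the on-disk copy looks newer.
 */
bool skip_inode_replay(uint16_t log_flushiter, uint16_t disk_flushiter)
{
    /* The logged inode is at least as new as the disk copy: replay it. */
    if (log_flushiter >= disk_flushiter)
        return false;

    /* Treat a disk counter at the maximum as "about to wrap". */
    if (disk_flushiter == DI_MAX_FLUSH && log_flushiter < (DI_MAX_FLUSH >> 1))
        return false;

    /* Otherwise the disk copy is assumed newer and replay is skipped. */
    return true;
}
```

      With this model, a create transaction logged with flushiter 0 is
      skipped whenever the stale on-disk inode carries a small non-zero
      flushiter, which is the corruption case the message describes.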
      
      Because we have to support old kernel to new kernel upgrades, we
      can't just get rid of the flushiter support in log recovery as we
      might be upgrading from a kernel that doesn't have fully transactional
      inode updates.  Unfortunately, for v4 superblocks there is no way to
      guarantee that log recovery knows about this fact.
      
      We cannot add a new inode format flag to say it's a "special inode
      create" because it won't be understood by older kernels and so
      recovery could do the wrong thing on downgrade. We cannot specially
      detect the combination of zero mode/non-zero flushiter on disk to
      non-zero mode, zero flushiter in the log item during recovery
      because wrapping of the flushiter can result in false detection.
      
      Hence that makes this "don't use flushiter" optimisation limited to
      a disk format that guarantees that we don't need it. And that means
      the only fix here is to limit the "no read IO on create"
      optimisation to version 5 superblocks....
      Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      
      (cherry picked from commit e60896d8)
    • md/raid5: fix interaction of 'replace' and 'recovery'. · f94c0b66
      NeilBrown authored
      If a device in a RAID4/5/6 is being replaced while another is being
      recovered, then the writes to the replacement device currently don't
      happen, resulting in corruption when the replacement completes and the
      new drive takes over.
      
      This is because the replacement writes are only triggered when
      's.replacing' is set and not when the similar 's.sync' is set (which
      is the case during resync and recovery - it means all devices need to
      be read).
      
      So schedule those writes when s.sync is set as well.
      
      In this case we cannot use "STRIPE_INSYNC" to record that the
      replacement has happened as that is needed for recording that any
      parity calculation is complete.  So introduce STRIPE_REPLACED to
      record if the replacement has happened.
      
      For safety we should also check that STRIPE_COMPUTE_RUN is not set.
      This has a similar effect to the "s.locked == 0" test.  The latter
      ensures that no IO has been flagged but not started.  The former
      checks if any parity calculation has been flagged but not started.
      We must wait for both of these to complete before triggering the
      'replace'.
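
      The trigger condition described so far can be summarised as a
      single predicate. The sketch below is a simplified model rather
      than the driver code: the function name, the parameter layout and
      the flag values are hypothetical, and it reads the message as
      "trigger when either s.replacing or s.sync is set".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values standing in for the stripe state flags. */
#define STRIPE_COMPUTE_RUN_BIT 0x1u
#define STRIPE_REPLACED_BIT    0x2u

static_assert((STRIPE_COMPUTE_RUN_BIT & STRIPE_REPLACED_BIT) == 0,
              "flag bits must be distinct");

/*
 * Returns true when the replacement writes should be scheduled:
 *  - the stripe is being replaced or synced,
 *  - no IO is flagged but not yet started (locked == 0),
 *  - no parity calculation is flagged but not yet started,
 *  - the replacement has not already been recorded.
 */
bool should_trigger_replace(bool replacing, bool syncing, int locked,
                            unsigned int state)
{
    if (!replacing && !syncing)
        return false;
    if (locked != 0)
        return false;
    if (state & STRIPE_COMPUTE_RUN_BIT)
        return false;
    return !(state & STRIPE_REPLACED_BIT);
}
```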
      
      Add a similar test to the subsequent check for "are we finished yet".
      This possibly isn't needed (is subsumed in the STRIPE_INSYNC test),
      but it makes it more obvious that the REPLACE will happen before we
      think we are finished.
      
      Finally if a NeedReplace device is not UPTODATE then that is an
      error.  We really must trigger a warning.
      
      This bug was introduced in commit 9a3e1101
      (md/raid5:  detect and handle replacements during recovery.)
      which introduced replacement for raid5.
      That was in 3.3-rc3, so any stable kernel since then would benefit
      from this fix.
      
      Cc: stable@vger.kernel.org (3.3+)
      Reported-by: qindehua <13691222965@163.com>
      Tested-by: qindehua <qindehua@163.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md/raid10: remove use-after-free bug. · 0eb25bb0
      NeilBrown authored
      We always need to be careful when calling generic_make_request, as it
      can start a chain of events which might free something that we are
      using.
      
      Here is one place I wasn't careful enough.  If the wbio2 is not in
      use, then it might get freed at the first generic_make_request call.
      So perform all necessary tests first.
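
      The "test first, submit afterwards" pattern can be illustrated
      with a small userspace model. Everything here is hypothetical
      (struct parent, struct child, submit(), submit_writes()); it only
      mimics a submit call that may complete immediately and free the
      whole request, and is not the raid10 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct parent;

struct child {
    struct parent *owner;
    bool in_use;
};

struct parent {
    int pending;            /* number of in-use children still outstanding */
    struct child *first;
    struct child *second;   /* may be allocated but unused */
};

struct parent *make_parent(bool second_in_use)
{
    struct parent *p = malloc(sizeof(*p));
    assert(p);
    p->first = malloc(sizeof(*p->first));
    p->second = malloc(sizeof(*p->second));
    assert(p->first && p->second);
    p->first->owner = p;
    p->first->in_use = true;
    p->second->owner = p;
    p->second->in_use = second_in_use;
    p->pending = second_in_use ? 2 : 1;
    return p;
}

/* Completes immediately; the last completion frees the whole request. */
void submit(struct child *c)
{
    struct parent *p = c->owner;

    if (--p->pending == 0) {
        free(p->first);
        free(p->second);
        free(p);
    }
}

/* Returns the number of writes submitted. */
int submit_writes(struct parent *p)
{
    struct child *first = p->first;
    struct child *second = p->second;
    int submitted = 0;

    /* Perform every test on 'second' before the first submit call. */
    if (!second->in_use)
        second = NULL;

    submit(first);          /* may free p and an unused second child */
    submitted++;

    if (second) {           /* only a local pointer is consulted here */
        submit(second);
        submitted++;
    }
    return submitted;
}
```

      Moving the in_use test below the first submit() call would read
      freed memory whenever the second child is unused, which mirrors
      the bug described above.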
      
      This bug was introduced in 3.3-rc3 (24afd80d) and can cause an
      oops, so the fix is suitable for any -stable since then.
      
      Cc: stable@vger.kernel.org (3.3+)
      Signed-off-by: NeilBrown <neilb@suse.de>
  3. 24 Jul, 2013 29 commits
  4. 23 Jul, 2013 3 commits
    • Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux · a474902c
      Linus Torvalds authored
      Pull device tree bug fixes and maintainership updates from Grant Likely:
       "This branch contains a couple of minor bug fixes and documentation
        additions, but the bulk of it are several changes to the MAINTAINERS
        file regarding the subsystems I've been involved with"
      
      * tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux:
        of/irq: init struct resource to 0 in of_irq_to_resource()
        of/irq: Avoid calling list_first_entry() for empty list
        of: add vendor prefixes for hisilicon
        of: add vendor prefix for Qualcomm Atheros, Inc.
        MAINTAINERS: Fix incorrect status tag
        MAINTAINERS: Refactor device tree maintainership
        MAINTAINERS: Change device tree mailing list
        MAINTAINERS: Remove Grant Likely
    • EDAC: Fix lockdep splat · 88d84ac9
      Borislav Petkov authored
      Fix the following:
      
      BUG: key ffff88043bdd0330 not in .data!
      ------------[ cut here ]------------
      WARNING: at kernel/lockdep.c:2987 lockdep_init_map+0x565/0x5a0()
      DEBUG_LOCKS_WARN_ON(1)
      Modules linked in: glue_helper sb_edac(+) edac_core snd acpi_cpufreq lrw gf128mul ablk_helper iTCO_wdt evdev i2c_i801 dcdbas button cryptd pcspkr iTCO_vendor_support usb_common lpc_ich mfd_core soundcore mperf processor microcode
      CPU: 2 PID: 599 Comm: modprobe Not tainted 3.10.0 #1
      Hardware name: Dell Inc. Precision T3600/0PTTT9, BIOS A08 01/24/2013
       0000000000000009 ffff880439a1d920 ffffffff8160a9a9 ffff880439a1d958
       ffffffff8103d9e0 ffff88043af4a510 ffffffff81a16e11 0000000000000000
       ffff88043bdd0330 0000000000000000 ffff880439a1d9b8 ffffffff8103dacc
      Call Trace:
        dump_stack
        warn_slowpath_common
        warn_slowpath_fmt
        lockdep_init_map
        ? trace_hardirqs_on_caller
        ? trace_hardirqs_on
        debug_mutex_init
        __mutex_init
        bus_register
        edac_create_sysfs_mci_device
        edac_mc_add_mc
        sbridge_probe
        pci_device_probe
        driver_probe_device
        __driver_attach
        ? driver_probe_device
        bus_for_each_dev
        driver_attach
        bus_add_driver
        driver_register
        __pci_register_driver
        ? 0xffffffffa0010fff
        sbridge_init
        ? 0xffffffffa0010fff
        do_one_initcall
        load_module
        ? unset_module_init_ro_nx
        SyS_init_module
        tracesys
      ---[ end trace d24a70b0d3ddf733 ]---
      EDAC MC0: Giving out device to 'sbridge_edac.c' 'Sandy Bridge Socket#0': DEV 0000:3f:0e.0
      EDAC sbridge: Driver loaded.
      
      What happens is that bus_register needs a statically allocated lock_key
      because the latter is handed to lockdep. However, struct mem_ctl_info
      embeds struct bus_type (the whole struct, not a pointer to it) and the
      whole thing gets dynamically allocated.
      
      Fix this by using a statically allocated struct bus_type for the MC bus.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
      Cc: stable@kernel.org # v3.10
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • Merge branch 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · c2468d32
      Linus Torvalds authored
      Pull cgroup changes from Tejun Heo:
       "This contains two patches, both of which aren't fixes per-se but I
        think it'd be better to fast-track them.
      
        One removes bcache_subsys_id which was added without proper review
        through the block tree.  Fortunately, bcache cgroup code is
        unconditionally disabled, so this was never exposed to userland.  The
        cgroup subsys_id is removed.  Kent will remove the affected (disabled)
        code through bcache branch.
      
        The other simplifies task_cgroup_path_from_hierarchy().  The function
        doesn't currently have in-kernel users, but external code and
        development in progress depend on it, and making the function
        available for 3.11 would make things go smoother"
      
      * 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup: replace task_cgroup_path_from_hierarchy() with task_cgroup_path()
        cgroup: remove bcache_subsys_id which got added stealthily