1. 23 Feb, 2015 13 commits
    • x86/lib/clear_page_64.S: Convert to ALTERNATIVE_2 macro · 6620ef28
      Borislav Petkov authored
      Move clear_page() up so that we can get 2-byte forward JMPs when
      patching:
      
        apply_alternatives: feat: 3*32+16, old: (ffffffff8130adb0, len: 5), repl: (ffffffff81d0b859, len: 5)
        ffffffff8130adb0: alt_insn: 90 90 90 90 90
        recompute_jump: new_displ: 0x0000003e
        ffffffff81d0b859: rpl_insn: eb 3e 66 66 90
      
      even though the compiler generated 5-byte JMPs which we padded with 5
      NOPs.
      
      Also, make the REP_GOOD version be the default as the majority of
      machines set REP_GOOD. This way we get to save ourselves the JMP:
      
        old insn VA: 0xffffffff813038b0, CPU feat: X86_FEATURE_REP_GOOD, size: 5, padlen: 0
        clear_page:
      
        ffffffff813038b0 <clear_page>:
        ffffffff813038b0:       e9 0b 00 00 00          jmpq ffffffff813038c0
        repl insn: 0xffffffff81cf0e92, size: 0
      
        old insn VA: 0xffffffff813038b0, CPU feat: X86_FEATURE_ERMS, size: 5, padlen: 0
        clear_page:
      
        ffffffff813038b0 <clear_page>:
        ffffffff813038b0:       e9 0b 00 00 00          jmpq ffffffff813038c0
        repl insn: 0xffffffff81cf0e92, size: 5
         ffffffff81cf0e92:      e9 69 2a 61 ff          jmpq ffffffff81303900
      
        ffffffff813038b0 <clear_page>:
        ffffffff813038b0:       e9 69 2a 61 ff          jmpq ffffffff8091631e
      Signed-off-by: Borislav Petkov <bp@suse.de>
    • x86/entry_32: Convert X86_INVD_BUG to ALTERNATIVE macro · 8e65f6e0
      Borislav Petkov authored
      Booting a 486 kernel on an AMD guest with this patch applied, says:
      
        apply_alternatives: feat: 0*32+25, old: (c160a475, len: 5), repl: (c19557d4, len: 5)
        c160a475: alt_insn: 68 10 35 00 c1
        c19557d4: rpl_insn: 68 80 39 00 c1
      
      which is:
      
        old insn VA: 0xc160a475, CPU feat: X86_FEATURE_XMM, size: 5
        simd_coprocessor_error:
                 c160a475:      68 10 35 00 c1          push $0xc1003510 <do_general_protection>
        repl insn: 0xc19557d4, size: 5
                 c160a475:      68 80 39 00 c1          push $0xc1003980 <do_simd_coprocessor_error>
      Signed-off-by: Borislav Petkov <bp@suse.de>
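The `feat:` field in the debug lines quoted in these messages is printed as `word*32+bit`. A minimal Python sketch of that decoding follows. The lookup table holds only the three pairings that can be read off the dumps quoted on this page, and the helper names `parse_feat` and `feature_name` are hypothetical, not kernel functions.

```python
# Pairings read off the debug output quoted in the commit messages:
# (word, bit) -> feature name. Anything else is reported as "unknown".
KNOWN_FEATURES = {
    (0, 25): "X86_FEATURE_XMM",
    (3, 16): "X86_FEATURE_REP_GOOD",
    (3, 21): "X86_FEATURE_ALWAYS",
}


def parse_feat(text):
    """Split a 'W*32+B' string into (word, bit, flat bit index)."""
    word_part, bit_part = text.split("+")
    word = int(word_part.split("*")[0])
    bit = int(bit_part)
    return word, bit, word * 32 + bit


def feature_name(text):
    """Look up the name for a 'W*32+B' string in the small table above."""
    word, bit, _ = parse_feat(text)
    return KNOWN_FEATURES.get((word, bit), "unknown")


if __name__ == "__main__":
    for feat in ("0*32+25", "3*32+16", "3*32+21"):
        print(feat, parse_feat(feat)[2], feature_name(feat))
```

The flat index is what a bitmap lookup would use; the word/bit split mirrors how the value is printed.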
    • x86/smap: Use ALTERNATIVE macro · 669f8a90
      Borislav Petkov authored
      ... and drop unfolded version. No need for ASM_NOP3 anymore either as
      the alternatives do the proper padding at build time and insert proper
      NOPs at boot time.
      
      There should be no apparent operational change from this patch.
      Signed-off-by: Borislav Petkov <bp@suse.de>
    • x86/lib/copy_user_64.S: Convert to ALTERNATIVE_2 · de2ff888
      Borislav Petkov authored
      Use the asm macro and drop the locally grown version.
      Signed-off-by: Borislav Petkov <bp@suse.de>
    • x86/lib/copy_page_64.S: Use generic ALTERNATIVE macro · 090a3f61
      Borislav Petkov authored
      ... instead of the semi-version with the spelled out sections.
      
      What is more, make the REP_GOOD version be the default copy_page()
      version as the majority of the relevant x86 CPUs do set
      X86_FEATURE_REP_GOOD. Thus, copy_page gets compiled to:
      
        ffffffff8130af80 <copy_page>:
        ffffffff8130af80:       e9 0b 00 00 00          jmpq   ffffffff8130af90 <copy_page_regs>
        ffffffff8130af85:       b9 00 02 00 00          mov    $0x200,%ecx
        ffffffff8130af8a:       f3 48 a5                rep movsq %ds:(%rsi),%es:(%rdi)
        ffffffff8130af8d:       c3                      retq
        ffffffff8130af8e:       66 90                   xchg   %ax,%ax
      
        ffffffff8130af90 <copy_page_regs>:
        ...
      
      and after the alternatives have run, the JMP to the old, unrolled
      version gets NOPed out:
      
        ffffffff8130af80 <copy_page>:
        ffffffff8130af80:       66 66 90                xchg   %ax,%ax
        ffffffff8130af83:       66 90                   xchg   %ax,%ax
        ffffffff8130af85:       b9 00 02 00 00          mov    $0x200,%ecx
        ffffffff8130af8a:       f3 48 a5                rep movsq %ds:(%rsi),%es:(%rdi)
        ffffffff8130af8d:       c3                      retq
      
      On modern uarches, those NOPs are cheaper than the unconditional JMP
      that was there previously.
      Signed-off-by: Borislav Petkov <bp@suse.de>
    • x86/alternatives: Use optimized NOPs for padding · 4fd4b6e5
      Borislav Petkov authored
      Alternatives now allow for an empty old instruction. In this case we
      pad the space with NOPs at assembly time. However, the optimal, longer
      NOPs should be used instead. Do that at patching time by adding
      alt_instr.padlen-sized NOPs at the old instruction address.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Borislav Petkov <bp@suse.de>
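Filling a pad of `padlen` bytes with the longest available NOPs can be sketched as below. This is a Python model, not the kernel code. The table holds the Intel-recommended multi-byte NOP encodings; which table is actually chosen depends on the CPU (the debug output quoted in the clear_page commit pads with `66 66 90`, a different 3-byte encoding), so the table and the name `optimal_nops` are assumptions for illustration.

```python
# Intel-recommended multi-byte NOP encodings, indexed by length in bytes.
# Assumption: other CPUs may prefer different encodings of the same lengths.
NOPS = {
    1: bytes.fromhex("90"),
    2: bytes.fromhex("6690"),
    3: bytes.fromhex("0f1f00"),
    4: bytes.fromhex("0f1f4000"),
    5: bytes.fromhex("0f1f440000"),
    6: bytes.fromhex("660f1f440000"),
    7: bytes.fromhex("0f1f8000000000"),
    8: bytes.fromhex("0f1f840000000000"),
}
MAX_NOP = max(NOPS)


def optimal_nops(padlen):
    """Return padlen bytes of padding built from the longest NOPs first."""
    out = bytearray()
    while padlen > 0:
        n = min(padlen, MAX_NOP)  # take the biggest NOP that still fits
        out += NOPS[n]
        padlen -= n
    return bytes(out)


if __name__ == "__main__":
    for padlen in (1, 5, 11):
        print(padlen, optimal_nops(padlen).hex(" "))
```

Fewer, longer NOPs mean fewer instructions to decode than a run of single-byte `90`s covering the same space.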
    • x86/alternatives: Make JMPs more robust · 48c7a250
      Borislav Petkov authored
      Up until now we had to pay attention to how the relative offset of
      relative JMPs in alternatives gets computed so that the jump target
      is still correct. Or, as is the case for near CALLs (opcode e8), we
      still have to go and readjust the offset at patching time.
      
      What is more, the static_cpu_has_safe() facility had to forcefully
      generate 5-byte JMPs since we couldn't rely on the compiler to generate
      properly sized ones so we had to force the longest ones. Worse than
      that, sometimes it would generate a replacement JMP which is longer than
      the original one, thus overwriting the beginning of the next instruction
      at patching time.
      
      So, in order to alleviate all that and make using JMPs more
      straightforward, we pad the original instruction in an alternative
      block with NOPs at build time, should the replacement(s) be longer.
      This way, users of alternatives don't have to check that original and
      replacement instruction sizes are compatible: the assembler simply
      adds padding where needed and does nothing otherwise.
      
      As a second aspect, we go and recompute JMPs at patching time so that we
      can try to make 5-byte JMPs into two-byte ones if possible. If not, we
      still have to recompute the offsets as the replacement JMP gets put far
      away in the .altinstr_replacement section leading to a wrong offset if
      copied verbatim.
      
      For example, on a locally generated kernel image
      
        old insn VA: 0xffffffff810014bd, CPU feat: X86_FEATURE_ALWAYS, size: 2
        __switch_to:
         ffffffff810014bd:      eb 21                   jmp ffffffff810014e0
        repl insn: size: 5
        ffffffff81d0b23c:       e9 b1 62 2f ff          jmpq ffffffff810014f2
      
      gets corrected to a 2-byte JMP:
      
        apply_alternatives: feat: 3*32+21, old: (ffffffff810014bd, len: 2), repl: (ffffffff81d0b23c, len: 5)
        alt_insn: e9 b1 62 2f ff
        recompute_jumps: next_rip: ffffffff81d0b241, tgt_rip: ffffffff810014f2, new_displ: 0x00000033, ret len: 2
        converted to: eb 33 90 90 90
      
      and a 5-byte JMP:
      
        old insn VA: 0xffffffff81001516, CPU feat: X86_FEATURE_ALWAYS, size: 2
        __switch_to:
         ffffffff81001516:      eb 30                   jmp ffffffff81001548
        repl insn: size: 5
         ffffffff81d0b241:      e9 10 63 2f ff          jmpq ffffffff81001556
      
      gets shortened into a two-byte one:
      
        apply_alternatives: feat: 3*32+21, old: (ffffffff81001516, len: 2), repl: (ffffffff81d0b241, len: 5)
        alt_insn: e9 10 63 2f ff
        recompute_jumps: next_rip: ffffffff81d0b246, tgt_rip: ffffffff81001556, new_displ: 0x0000003e, ret len: 2
        converted to: eb 3e 90 90 90
      
      ... and so on.
      
      This leads to a net win of around
      
      40ish replacements * 3 bytes savings =~ 120 bytes of I$
      
      on an AMD guest, which saves some precious instruction cache
      bandwidth. The padding after the shorter 2-byte JMPs consists of
      single-byte NOPs, which smart microarchitectures discard at decode
      time, thus freeing up execution bandwidth.
      Signed-off-by: Borislav Petkov <bp@suse.de>
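The displacement arithmetic behind the two `__switch_to` examples above can be checked with a short Python model. It is only a sketch of the recomputation as described in the message, not the kernel's implementation: the function names are hypothetical, only `e9 rel32` replacements are handled, and the padding uses single-byte `90` NOPs as in the "converted to" lines.

```python
def jmp_target(insn_addr, insn):
    """Target of a 5-byte 'e9 rel32' JMP located at insn_addr."""
    assert len(insn) == 5 and insn[0] == 0xE9
    rel32 = int.from_bytes(insn[1:5], "little", signed=True)
    return insn_addr + 5 + rel32  # relative to the next instruction


def recompute_jump(old_addr, repl_addr, repl_insn):
    """Re-encode a replacement JMP so it is correct when placed at old_addr."""
    target = jmp_target(repl_addr, repl_insn)
    rel8 = target - (old_addr + 2)  # a short JMP is 2 bytes long
    if -128 <= rel8 <= 127:
        pad = len(repl_insn) - 2
        return bytes([0xEB, rel8 & 0xFF]) + b"\x90" * pad
    # Target too far for rel8: keep the 5-byte form with a fixed-up offset.
    rel32 = target - (old_addr + 5)
    return b"\xe9" + rel32.to_bytes(4, "little", signed=True)


if __name__ == "__main__":
    first = recompute_jump(0xFFFFFFFF810014BD, 0xFFFFFFFF81D0B23C,
                           bytes.fromhex("e9b1622fff"))
    second = recompute_jump(0xFFFFFFFF81001516, 0xFFFFFFFF81D0B241,
                            bytes.fromhex("e910632fff"))
    print(first.hex(" "))
    print(second.hex(" "))
```

Copying the five bytes verbatim would keep the old `rel32` but change the address it is relative to, which is why the offset has to be recomputed even when the JMP cannot be shortened.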
    • x86/alternatives: Add instruction padding · 4332195c
      Borislav Petkov authored
      Up until now we have always paid attention to make sure the length of
      the new instruction replacing the old one is less than or equal to
      the length of the old instruction. If the new instruction is longer, at
      the time it replaces the old instruction it will overwrite the beginning
      of the next instruction in the kernel image and cause your pants to
      catch fire.
      
      So instead of having to pay attention, teach the alternatives framework
      to pad shorter old instructions with NOPs at buildtime - but only in the
      case when
      
        len(old instruction(s)) < len(new instruction(s))
      
      and add nothing in the >= case. (In that case we do add_nops() when
      patching).
      
      This way the alternatives user shouldn't have to care about instruction
      sizes and simply use the macros.
      
      Add asm ALTERNATIVE* flavor macros too, while at it.
      
      Also, we need to save the pad length in a separate struct alt_instr
      member for NOP optimization and the way to do that reliably is to carry
      the pad length instead of trying to detect whether we're looking at
      single-byte NOPs or at pathological instruction offsets like e9 90 90 90
      90, for example, which is a valid instruction.
      
      Thanks to Michael Matz for the great help with toolchain questions.
      Signed-off-by: Borislav Petkov <bp@suse.de>
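The padding rule stated above is plain arithmetic: pad only when the old instruction is shorter than the longest replacement. A minimal Python sketch follows. In the kernel the assembler computes this at build time, so the function name `build_time_padlen` and its list-of-lengths interface are hypothetical.

```python
def build_time_padlen(old_len, repl_lens):
    """Number of NOP bytes to append to the old instruction.

    old_len   -- length in bytes of the original instruction(s)
    repl_lens -- lengths of the replacement(s); one entry for ALTERNATIVE,
                 two for ALTERNATIVE_2
    """
    longest = max(repl_lens, default=0)
    # Pad only in the len(old) < len(new) case, add nothing otherwise.
    return max(0, longest - old_len)


if __name__ == "__main__":
    # A 2-byte JMP with a 5-byte replacement needs 3 bytes of padding.
    print(build_time_padlen(2, [5]))
    # Same-sized or shorter replacements need none.
    print(build_time_padlen(5, [0, 5]))
```

Carrying this value explicitly is what lets the patching code tell real padding apart from instruction bytes that merely look like NOPs.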
    • x86/alternatives: Cleanup DPRINTK macro · db477a33
      Borislav Petkov authored
      Make it pass __func__ implicitly. Also, dump info about each replacing
      we're doing. Fixup comments and style while at it.
      Signed-off-by: Borislav Petkov <bp@suse.de>
    • x86/lib/copy_user_64.S: Remove FIX_ALIGNMENT define · 338ea555
      Borislav Petkov authored
      It is unconditionally enabled so remove it. No object file change.
      Signed-off-by: Borislav Petkov <bp@suse.de>
    • Linux 4.0-rc1 · c517d838
      Linus Torvalds authored
      .. after extensive statistical analysis of my G+ polling, I've come to
      the inescapable conclusion that internet polls are bad.
      
      Big surprise.
      
      But "Hurr durr I'ma sheep" trounced "I like online polls" by a 62-to-38%
      margin, in a poll that people weren't even supposed to participate in.
      Who can argue with solid numbers like that? 5,796 votes from people who
      can't even follow the most basic directions?
      
      In contrast, "v4.0" beat out "v3.20" by a slimmer margin of 56-to-44%,
      but with a total of 29,110 votes right now.
      
      Now, arguably, that vote spread is only about 3,200 votes, which is less
      than the almost six thousand votes that the "please ignore" poll got, so
      it could be considered noise.
      
      But hey, I asked, so I'll honor the votes.
    • Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · feaf2229
      Linus Torvalds authored
      Pull ext4 fixes from Ted Ts'o:
       "Ext4 bug fixes.
      
        We also reserved code points for encryption and read-only images (for
        which the implementation is mostly just the reserved code point for a
        read-only feature :-)"
      
      * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: fix indirect punch hole corruption
        ext4: ignore journal checksum on remount; don't fail
        ext4: remove duplicate remount check for JOURNAL_CHECKSUM change
        ext4: fix mmap data corruption in nodelalloc mode when blocksize < pagesize
        ext4: support read-only images
        ext4: change to use setup_timer() instead of init_timer()
        ext4: reserve codepoints used by the ext4 encryption feature
        jbd2: complain about descriptor block checksum errors
    • Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · be5e6616
      Linus Torvalds authored
      Pull more vfs updates from Al Viro:
       "Assorted stuff from this cycle.  The big ones here are multilayer
        overlayfs from Miklos and beginning of sorting ->d_inode accesses out
        from David"
      
      * 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (51 commits)
        autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation
        procfs: fix race between symlink removals and traversals
        debugfs: leave freeing a symlink body until inode eviction
        Documentation/filesystems/Locking: ->get_sb() is long gone
        trylock_super(): replacement for grab_super_passive()
        fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
        Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
        VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)
        SELinux: Use d_is_positive() rather than testing dentry->d_inode
        Smack: Use d_is_positive() rather than testing dentry->d_inode
        TOMOYO: Use d_is_dir() rather than d_inode and S_ISDIR()
        Apparmor: Use d_is_positive/negative() rather than testing dentry->d_inode
        Apparmor: mediated_filesystem() should use dentry->d_sb not inode->i_sb
        VFS: Split DCACHE_FILE_TYPE into regular and special types
        VFS: Add a fallthrough flag for marking virtual dentries
        VFS: Add a whiteout dentry type
        VFS: Introduce inode-getting helpers for layered/unioned fs environments
        Infiniband: Fix potential NULL d_inode dereference
        posix_acl: fix reference leaks in posix_acl_create
        autofs4: Wrong format for printing dentry
        ...
  2. 22 Feb, 2015 21 commits
    • Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm · 90c453ca
      Linus Torvalds authored
      Pull ARM fix from Russell King:
       "Just one fix this time around.  __iommu_alloc_buffer() can cause a
        BUG() if dma_alloc_coherent() is called with either __GFP_DMA32 or
        __GFP_HIGHMEM set.  The patch from Alexandre addresses this"
      
      * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
        ARM: 8305/1: DMA: Fix kzalloc flags in __iommu_alloc_buffer()
    • autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation · 0a280962
      Al Viro authored
      X-Coverup: just ask spender
      Cc: stable@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • procfs: fix race between symlink removals and traversals · 7e0e953b
      Al Viro authored
      use_pde()/unuse_pde() in ->follow_link()/->put_link() resp.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • debugfs: leave freeing a symlink body until inode eviction · 0db59e59
      Al Viro authored
      As it is, we have debugfs_remove() racing with symlink traversals.
      Supply ->evict_inode() and do freeing there - inode will remain
      pinned until we are done with the symlink body.
      
      And rip the idiocy with checking if dentry is positive right after
      we'd verified debugfs_positive(), which is a stronger check...
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Al Viro's avatar
      dca11178
    • trylock_super(): replacement for grab_super_passive() · eb6ef3df
      Konstantin Khlebnikov authored
      I've noticed significant lock contention in the memory reclaimer around
      sb_lock inside grab_super_passive(). grab_super_passive() is called from
      two places: the icache/dcache shrinkers (function super_cache_scan) and
      writeback (function __writeback_inodes_wb). Both are required for
      progress in the memory allocator.
      
      Grab_super_passive() acquires sb_lock to increment sb->s_count and check
      sb->s_instances. It seems sb->s_umount locked for read is enough here:
      super-block deactivation always runs under sb->s_umount locked for write.
      Protecting super-block itself isn't a problem: in super_cache_scan() sb
      is protected by shrinker_rwsem: it cannot be freed if its slab shrinkers
      are still active. Inside writeback super-block comes from inode from bdi
      writeback list under wb->list_lock.
      
      This patch removes locking sb_lock and checks s_instances under s_umount:
      generic_shutdown_super() unlinks it under sb->s_umount locked for write.
      New variant is called trylock_super() and since it only locks semaphore,
      callers must call up_read(&sb->s_umount) instead of drop_super(sb) when
      they're done.
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions · 54f2a2f4
      David Howells authored
      Fanotify probably doesn't want to watch autodirs so make it use d_can_lookup()
      rather than d_is_dir() when checking a dir watch and give an error on fake
      directories.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions · ce40fa78
      David Howells authored
      Fix up the following scripted S_ISDIR/S_ISREG/S_ISLNK conversions (or lack
      thereof) in cachefiles:
      
       (1) Cachefiles mostly wants to use d_can_lookup() rather than d_is_dir() as
           it doesn't want to deal with automounts in its cache.
      
       (2) Coccinelle didn't find S_IS* expressions in ASSERT() statements in
           cachefiles.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry) · e36cb0b8
      David Howells authored
      Convert the following where appropriate:
      
       (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).
      
       (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).
      
       (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry).  This is actually more
           complicated than it appears as some calls should be converted to
           d_can_lookup() instead.  The difference is whether the directory in
           question is a real dir with a ->lookup op or whether it's a fake dir with
           a ->d_automount op.
      
      In some circumstances, we can subsume checks for dentry->d_inode not being
      NULL into this, provided the code isn't in a filesystem that expects
      d_inode to be NULL if the dirent really *is* negative (i.e. if we're going to
      use d_inode() rather than d_backing_inode() to get the inode pointer).
      
      Note that the dentry type field may be set to something other than
      DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
      manages the fall-through from a negative dentry to a lower layer.  In such a
      case, the dentry type of the negative union dentry is set to the same as the
      type of the lower dentry.
      
      However, if you know d_inode is not NULL at the call site, then you can use
      the d_is_xxx() functions even in a filesystem.
      
      There is one further complication: a 0,0 chardev dentry may be labelled
      DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE.  Strictly, this was
      intended for special directory entry types that don't have attached inodes.
      
      The following perl+coccinelle script was used:
      
      use strict;
      
      my @callers;
      open(my $fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
          die "Can't grep for S_ISDIR and co. callers";
      @callers = <$fd>;
      close($fd);
      unless (@callers) {
          print "No matches\n";
          exit(0);
      }
      
      my @cocci = (
          '@@',
          'expression E;',
          '@@',
          '',
          '- S_ISLNK(E->d_inode->i_mode)',
          '+ d_is_symlink(E)',
          '',
          '@@',
          'expression E;',
          '@@',
          '',
          '- S_ISDIR(E->d_inode->i_mode)',
          '+ d_is_dir(E)',
          '',
          '@@',
          'expression E;',
          '@@',
          '',
          '- S_ISREG(E->d_inode->i_mode)',
          '+ d_is_reg(E)' );
      
      my $coccifile = "tmp.sp.cocci";
      open($fd, ">$coccifile") || die $coccifile;
      print($fd "$_\n") || die $coccifile foreach (@cocci);
      close($fd);
      
      foreach my $file (@callers) {
          chomp $file;
          print "Processing ", $file, "\n";
          system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
      	die "spatch failed";
      }
      
      [AV: overlayfs parts skipped]
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
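The d_is_dir() versus d_can_lookup() distinction described in item (3) above can be modelled in a few lines of Python. This is a toy model of the semantics as the message states them, not kernel code: the `DentryType` enum and its member names are hypothetical stand-ins for the dentry type field.

```python
from enum import Enum


class DentryType(Enum):
    """Hypothetical stand-in for the dentry type field."""
    MISS = "miss"
    DIRECTORY = "directory"  # real dir with a ->lookup op
    AUTODIR = "autodir"      # fake dir with a ->d_automount op
    REGULAR = "regular"
    SYMLINK = "symlink"


def d_can_lookup(dtype):
    """True only for real directories that can be looked up in."""
    return dtype is DentryType.DIRECTORY


def d_is_dir(dtype):
    """True for real directories and for fake (automount) directories."""
    return dtype in (DentryType.DIRECTORY, DentryType.AUTODIR)


if __name__ == "__main__":
    for dtype in DentryType:
        print(dtype.name, d_is_dir(dtype), d_can_lookup(dtype))
```

A caller that must not deal with automount points, as the fanotify and cachefiles fixups describe, would test `d_can_lookup()`; a caller that only asks "does this look like a directory" would test `d_is_dir()`.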
    • SELinux: Use d_is_positive() rather than testing dentry->d_inode · 2c616d4d
      David Howells authored
      Use d_is_positive() rather than testing dentry->d_inode in SELinux to get rid
      of direct references to d_inode outside of the VFS.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Smack: Use d_is_positive() rather than testing dentry->d_inode · 8802565b
      David Howells authored
      Use d_is_positive() rather than testing dentry->d_inode in Smack to get rid of
      direct references to d_inode outside of the VFS.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • TOMOYO: Use d_is_dir() rather than d_inode and S_ISDIR() · e656a8eb
      David Howells authored
      Use d_is_dir() rather than d_inode and S_ISDIR().  Note that this will include
      fake directories such as automount triggers.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Apparmor: Use d_is_positive/negative() rather than testing dentry->d_inode · 729b8a3d
      David Howells authored
      Use d_is_positive(dentry) or d_is_negative(dentry) rather than testing
      dentry->d_inode as the dentry may cover another layer that has an inode when
      the top layer doesn't or may hold a 0,0 chardev that's actually a whiteout.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Apparmor: mediated_filesystem() should use dentry->d_sb not inode->i_sb · 7ac2856d
      David Howells authored
      mediated_filesystem() should use dentry->d_sb not dentry->d_inode->i_sb and
      should avoid file_inode() also since it is really dealing with the path.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • VFS: Split DCACHE_FILE_TYPE into regular and special types · 44bdb5e5
      David Howells authored
      Split DCACHE_FILE_TYPE into DCACHE_REGULAR_TYPE (dentries representing regular
      files) and DCACHE_SPECIAL_TYPE (representing blockdev, chardev, FIFO and
      socket files).
      
      d_is_reg() and d_is_special() are added to detect these subtypes and
      d_is_file() is left as the union of the two.
      
      This allows a number of places that use S_ISREG(dentry->d_inode->i_mode) to
      use d_is_reg(dentry) instead.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • VFS: Add a fallthrough flag for marking virtual dentries · df1a085a
      David Howells authored
      Add a DCACHE_FALLTHRU flag to indicate that, in a layered filesystem, this is
      a virtual dentry that covers another one in a lower layer that should be used
      instead.  This may be recorded on medium if directory integration is stored
      there.
      
      The flag can be set with d_set_fallthru() and tested with d_is_fallthru().
      
      Original-author: Valerie Aurora <vaurora@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • VFS: Add a whiteout dentry type · e7f7d225
      David Howells authored
      Add DCACHE_WHITEOUT_TYPE and provide a d_is_whiteout() accessor function.  A
      d_is_miss() accessor is also added for ordinary cache misses and
      d_is_negative() is modified to indicate either an ordinary miss or an enforced
      miss (whiteout).
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
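The accessor relationships described in the DCACHE_FILE_TYPE split and whiteout messages above (d_is_file() as the union of regular and special, d_is_negative() as miss or whiteout) can be written down as a small Python model. It is a sketch of the stated semantics only: the `DentryType` enum is a hypothetical stand-in for the DCACHE_*_TYPE values and carries no real flag bits.

```python
from enum import Enum


class DentryType(Enum):
    """Hypothetical stand-in for the DCACHE_*_TYPE values named above."""
    MISS = "miss"          # ordinary cache miss
    WHITEOUT = "whiteout"  # enforced miss
    REGULAR = "regular"    # regular file
    SPECIAL = "special"    # blockdev, chardev, FIFO or socket


def d_is_reg(dtype):
    return dtype is DentryType.REGULAR


def d_is_special(dtype):
    return dtype is DentryType.SPECIAL


def d_is_file(dtype):
    """Union of the two file subtypes."""
    return d_is_reg(dtype) or d_is_special(dtype)


def d_is_miss(dtype):
    return dtype is DentryType.MISS


def d_is_whiteout(dtype):
    return dtype is DentryType.WHITEOUT


def d_is_negative(dtype):
    """Either an ordinary miss or an enforced miss (whiteout)."""
    return d_is_miss(dtype) or d_is_whiteout(dtype)


if __name__ == "__main__":
    for dtype in DentryType:
        print(dtype.name, d_is_file(dtype), d_is_negative(dtype))
```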
    • VFS: Introduce inode-getting helpers for layered/unioned fs environments · 155e35d4
      David Howells authored
      Introduce some functions for getting the inode (and also the dentry) in an
      environment where layered/unioned filesystems are in operation.
      
      The problem is that we have places where we need *both* the union dentry and
      the lower source or workspace inode or dentry available, but we can only have
      a handle on one of them.  Therefore we need to derive the handle to the other
      from that.
      
      The idea is to introduce an extra field in struct dentry that allows the union
      dentry to refer to and pin the lower dentry.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · a135c717
      Linus Torvalds authored
      Pull MIPS updates from Ralf Baechle:
       "This is the main pull request for MIPS:
      
         - a number of fixes that didn't make the 3.19 release.
      
         - a number of cleanups.
      
         - preliminary support for Cavium's Octeon 3 SOCs which feature up to
           48 MIPS64 R3 cores with FPU and hardware virtualization.
      
         - support for MIPS R6 processors.
      
           Revision 6 of the MIPS architecture is a major revision which does
           away with many of the original sins of the architecture, such as
           branch delay slots.  This and other changes in R6 require major
           changes throughout the entire MIPS core architecture code and
           make up the lion's share of this pull request.
      
         - finally some preparatory work for eXtended Physical Address
           support, which allows support of up to 40 bits of physical address
           space on 32-bit processors"
      
           [ Ahh, MIPS can't leave the PAE brain damage alone.  It's like
             every CPU architect has to make that mistake, but pee in the snow
             by changing the TLA.  But whether it's called PAE, LPAE or XPA,
             it's horrid crud   - Linus ]
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (114 commits)
        MIPS: sead3: Corrected get_c0_perfcount_int
        MIPS: mm: Remove dead macro definitions
        MIPS: OCTEON: irq: add CIB and other fixes
        MIPS: OCTEON: Don't do acknowledge operations for level triggered irqs.
        MIPS: OCTEON: More OCTEONIII support
        MIPS: OCTEON: Remove setting of processor specific CVMCTL icache bits.
        MIPS: OCTEON: Core-15169 Workaround and general CVMSEG cleanup.
        MIPS: OCTEON: Update octeon-model.h code for new SoCs.
        MIPS: OCTEON: Implement DCache errata workaround for all CN6XXX
        MIPS: OCTEON: Add little-endian support to asm/octeon/octeon.h
        MIPS: OCTEON: Implement the core-16057 workaround
        MIPS: OCTEON: Delete unused COP2 saving code
        MIPS: OCTEON: Use correct instruction to read 64-bit COP0 register
        MIPS: OCTEON: Save and restore CP2 SHA3 state
        MIPS: OCTEON: Fix FP context save.
        MIPS: OCTEON: Save/Restore wider multiply registers in OCTEON III CPUs
        MIPS: boot: Provide more uImage options
        MIPS: Remove unneeded #ifdef __KERNEL__ from asm/processor.h
        MIPS: ip22-gio: Remove legacy suspend/resume support
        mips: pci: Add ifdef around pci_proc_domain
        ...
    • Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 21770332
      Linus Torvalds authored
      Pull ARM SoC fixes from Arnd Bergmann:
       "A few fixes that came in too late to make it into the first set of
        pull requests but would still be nice to have in -rc1.
      
        The majority of these are trivial build fixes for bugs that I found
        myself using randconfig testing, and a set of two patches from Uwe to
        mark DT strings as 'const' where appropriate, to resolve inconsistent
        section attributes"
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        ARM: make of_device_ids const
        ARM: make arrays containing machine compatible strings const
        ARM: mm: Remove Kconfig symbol CACHE_PL310
        ARM: rockchip: force built-in regulator support for PM
        ARM: mvebu: build armada375-smp code conditionally
        ARM: sti: always enable RESET_CONTROLLER
        ARM: rockchip: make rockchip_suspend_init conditional
        ARM: ixp4xx: fix {in,out}s{bwl} data types
        ARM: prima2: do not select SMP_ON_UP
        ARM: at91: fix pm declarations
        ARM: davinci: multi-soc kernels require AUTO_ZRELADDR
        ARM: davinci: davinci_cfg_reg cannot be init
        ARM: BCM: put back ARCH_MULTI_V7 dependency for mobile
        ARM: vexpress: use ARM_CPU_SUSPEND if needed
        ARM: dts: add I2C device nodes for Broadcom Cygnus
        ARM: dts: BCM63xx: fix L2 cache properties
      21770332
    • Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · c8c6c9ba
      Linus Torvalds authored
      Pull misc SCSI patches from James Bottomley:
       "This is a short patch set representing a couple of leftovers from the
        merge window (debug removal and MAINTAINER changes).
      
        Plus one merge window regression (the local workqueue for hpsa) and a
        set of bug fixes for several issues (two for scsi-mq and the rest an
        assortment of long-standing stuff, all cc'd to stable)"
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        sg: fix EWOULDBLOCK errors with scsi-mq
        sg: fix unkillable I/O wait deadlock with scsi-mq
        sg: fix read() error reporting
        wd719x: add missing .module to wd719x_template
        hpsa: correct compiler warnings introduced by hpsa-add-local-workqueue patch
        fixed invalid assignment of 64bit mask to host dma_boundary for scatter gather segment boundary limit.
        fcoe: Transition maintainership to Vasu
        am53c974: remove left-over debugging code
      c8c6c9ba
  3. 21 Feb, 2015 6 commits
    • Merge tag 'xfs-pnfs-for-linus-3.20-rc1' of... · 93aaa830
      Linus Torvalds authored
      Merge tag 'xfs-pnfs-for-linus-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs
      
      Pull xfs pnfs block layout support from Dave Chinner:
       "This contains the changes to XFS needed to support the PNFS block
        layout server that you pulled in through Bruce's NFS server tree
        merge.
      
        I originally thought that I'd need to merge changes into the NFS
        server side, but Bruce had already picked them up and so this is
        purely changes to the fs/xfs/ codebase.
      
        Summary:
      
        This update contains the implementation of the PNFS server export
        methods that enable use of XFS filesystems as a block layout target"
      
      * tag 'xfs-pnfs-for-linus-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
        xfs: recall pNFS layouts on conflicting access
        xfs: implement pNFS export operations
      93aaa830
    • Merge tag 'nfs-for-3.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 24a52e41
      Linus Torvalds authored
      Pull more NFS client updates from Trond Myklebust:
       "Highlights include:
      
         - Fix a use-after-free in decode_cb_sequence_args()
         - Fix a compile error when #undef CONFIG_PROC_FS
         - NFSv4.1 backchannel spinlocking issue
         - Cleanups in the NFS unstable write code requested by Linus
         - NFSv4.1 fix issues when the server denies our backchannel request
         - Cleanups in create_session and bind_conn_to_session"
      
      * tag 'nfs-for-3.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        NFSv4.1: Clean up bind_conn_to_session
        NFSv4.1: Always set up a forward channel when binding the session
        NFSv4.1: Don't set up a backchannel if the server didn't agree to do so
        NFSv4.1: Clean up create_session
        pnfs: Refactor the *_layout_mark_request_commit to use pnfs_layout_mark_request_commit
        NFSv4: Kill unused nfs_inode->delegation_state field
        NFS: struct nfs_commit_info.lock must always point to inode->i_lock
        nfs: Can call nfs_clear_page_commit() instead
        nfs: Provide and use helper functions for marking a page as unstable
        SUNRPC: Always manipulate rpc_rqst::rq_bc_pa_list under xprt->bc_pa_lock
        SUNRPC: Fix a compile error when #undef CONFIG_PROC_FS
        NFSv4.1: Convert open-coded array allocation calls to kmalloc_array()
        NFSv4.1: Fix a kfree() of uninitialised pointers in decode_cb_sequence_args
      24a52e41
    • Merge tag 'pm+acpi-3.20-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · cd50b70c
      Linus Torvalds authored
      Pull one more batch of power management and ACPI updates from Rafael Wysocki:
       "These are mostly fixes on top of the previously merged recent PM and
        ACPI material.
      
        First, one commit that broke the ACPI LPSS (Low-Power Subsystem)
        driver on a Dell box is reverted and there are two stable-candidate
        fixes for that driver.  Another fix cleans up two recently added ACPI
        EC messages that look odd and the printk level of a noisy debug
        message in the core ACPI resources handling code is reduced.
      
        In addition to that we have two stable-candidate fixes for the s3c
        cpufreq driver, two cpuidle powernv driver updates related to Device
        Trees and a PNP subsystem cleanup that will allow us to get rid of
        some old ugliness going forward.  Also there is a new blacklist entry
        for the ACPI backlight code.
      
        Specifics:
      
         - Revert a recent ACPI LPSS driver commit that prevented the touchpad
           driver from loading on Dell XPS13 (Jarkko Nikula).
      
         - Make the ACPI LPSS driver disable the I2C controllers and deassert
            SPI host controller resets at startup on Intel BayTrail and
            Braswell SoCs, in case the platform firmware has left them in
            wrong states, which may then cause fatal controller driver
            failures during resume from hibernation (Mika Westerberg).
      
         - Make two recently added ACPI EC messages look better (Scot Doyle).
      
         - Reduce the printk level of a recently added debug message related
           to ACPI resources that may become noisy in some cases (Rafael J
           Wysocki).
      
         - Add a new ACPI backlight blacklist entry for Samsung Series 9
           (900X3C/900X3D/900X3E/900X4C/900X4D) laptops where the native
           backlight interface doesn't work while the ACPI based one does
           (Jens Reyer).
      
         - Make the PNP subsystem's core code use __request_region() followed
            by __release_region() instead of __check_region(), which will then
            allow us to get rid of the latter as it has no more users (Jakub
            Sitnicki).
      
         - Fix a build breakage and an issue with two __init functions that
           may be called after initialization in the s3c cpufreq driver (Arnd
           Bergmann).
      
         - Make the powernv cpuidle driver read target_residency values for
           idle states from a Device Tree (as we have the suitable DT bindings
           for that now) and improve the parsing of the powermgmt DT node in
           that driver (Preeti U Murthy)"
      
      * tag 'pm+acpi-3.20-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpuidle: powernv: Avoid endianness conversions while parsing DT
        cpufreq: s3c: remove last use of resume_clocks callback
        cpufreq: s3c: remove incorrect __init annotations
        ACPI / LPSS: Deassert resets for SPI host controllers on Braswell
        ACPI / LPSS: Always disable I2C host controllers
        ACPI / resources: Change pr_info() to pr_debug() for debug information
        ACPI / video: Disable native backlight on Samsung Series 9 laptops
        cpuidle: powernv: Read target_residency value of idle states from DT if available
        Revert "ACPI / LPSS: Remove non-existing clock control from Intel Lynxpoint I2C"
        ACPI / EC: Remove non-standard log emphasis
        PNP: Switch from __check_region() to __request_region()
      cd50b70c
    • Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 2bfedd1d
      Linus Torvalds authored
      Pull followup block layer updates from Jens Axboe:
       "Two things in this pull request:
      
         - A block throttle oops fix (marked for stable) from Thadeu.
      
         - The NVMe fixes/features queued up for 3.20, but merged later in the
            process.  From Keith.  We should have gotten this merged earlier;
            we're ironing out the kinks in the process.  Will be ready for the
            initial pull next series"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        blk-throttle: check stats_cpu before reading it from sysfs
        NVMe: Fix potential corruption on sync commands
        NVMe: Remove unused variables
        NVMe: Fix scsi mode select llbaa setting
        NVMe: Fix potential corruption during shutdown
        NVMe: Asynchronous controller probe
        NVMe: Register management handle under nvme class
        NVMe: Update SCSI Inquiry VPD 83h translation
        NVMe: Metadata format support
      2bfedd1d
    • Merge tag 'dm-3.20-changes-2' of... · a911dcdb
      Linus Torvalds authored
      Merge tag 'dm-3.20-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull more device mapper changes from Mike Snitzer:
      
      - Significant dm-crypt CPU scalability performance improvements thanks
        to changes that enable effective use of an unbound workqueue across
        all available CPUs.  A large battery of tests was performed to
        validate these changes; a summary of the results is available here:
        https://www.redhat.com/archives/dm-devel/2015-February/msg00106.html
      
      - A few additional stable fixes (to DM core, dm-snapshot and dm-mirror)
        and a small fix to the dm-space-map-disk.
      
      * tag 'dm-3.20-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm snapshot: fix a possible invalid memory access on unload
        dm: fix a race condition in dm_get_md
        dm crypt: sort writes
        dm crypt: add 'submit_from_crypt_cpus' option
        dm crypt: offload writes to thread
        dm crypt: remove unused io_pool and _crypt_io_pool
        dm crypt: avoid deadlock in mempools
        dm crypt: don't allocate pages for a partial request
        dm crypt: use unbound workqueue for request processing
        dm io: reject unsupported DISCARD requests with EOPNOTSUPP
        dm mirror: do not degrade the mirror on discard error
        dm space map disk: fix sm_disk_count_is_more_than_one()
      a911dcdb
    • Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending · e20d3ef5
      Linus Torvalds authored
      Pull SCSI target updates from Nicholas Bellinger:
       "The highlights this round include:
      
         - Update vhost-scsi to support F_ANY_LAYOUT using mm/iov_iter.c
           logic, and signal VERSION_1 support (MST + Viro + nab)
      
         - Fix iscsi/iser-target to remove problematic active_ts_set usage
           (Gavin Guo)
      
         - Update iscsi/iser-target to support multi-sequence sendtargets
           (Sagi)
      
         - Fix original PR_APTPL_BUF_LEN 8k size limitation (Martin Svec)
      
         - Add missing WRITE_SAME end-of-device sanity check (Bart)
      
         - Check for LBA + sectors wrap-around in sbc_parse_cdb() (nab)
      
         - Other various minor SPC/SBC compliance fixes based upon Ronnie
           Sahlberg test suite (nab)"
      
      * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (32 commits)
        target: Set LBPWS10 bit in Logical Block Provisioning EVPD
        target: Fail UNMAP when emulate_tpu=0
        target: Fail WRITE_SAME w/ UNMAP=1 when emulate_tpws=0
        target: Add sanity checks for DPO/FUA bit usage
        target: Perform PROTECT sanity checks for WRITE_SAME
        target: Fail I/O with PROTECT bit when protection is unsupported
        target: Check for LBA + sectors wrap-around in sbc_parse_cdb
        target: Add missing WRITE_SAME end-of-device sanity check
        iscsi-target: Avoid IN_LOGOUT failure case for iser-target
        target: Fix PR_APTPL_BUF_LEN buffer size limitation
        iscsi-target: Drop problematic active_ts_list usage
        iscsi/iser-target: Support multi-sequence sendtargets text response
        iser-target: Remove duplicate function names
        vhost/scsi: potential memory corruption
        vhost/scsi: Global tcm_vhost -> vhost_scsi rename
        vhost/scsi: Drop left-over scsi_tcq.h include
        vhost/scsi: Set VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits
        vhost/scsi: Add ANY_LAYOUT support in vhost_scsi_handle_vq
        vhost/scsi: Add ANY_LAYOUT iov -> sgl mapping prerequisites
        vhost/scsi: Change vhost_scsi_map_to_sgl to accept iov ptr + len
        ...
      e20d3ef5