1. 19 Apr, 2012 5 commits
    • Stuart Menefy's avatar
      sh: Fix error synchronising kernel page tables · 8d9a784d
      Stuart Menefy authored
      The problem is caused by the interaction of two features in the Linux
      memory management code.
      
      A processes address space is described by a struct mm_struct, and
      every thread has a pointer to the mm it should run in. The exception
      to this are kernel threads, which don't have an mm, and so borrow
      the mm from the last thread which ran. The system is bootstrapped
      by the initial kernel thread using init's mm (even though init hasn't
      been created yet, its mm is the static init_mm).
      
      The other feature is how the kernel handles the page table which
      describes the portion of the address space which is only visible when
      executing inside the kernel, and which is shared by all threads. On
      the SH4 the only portion of the kernel's address space which described
      using the page table is called P3, from 0xc0000000 to 0xdfffffff. This
      portion of the address space is divided into three:
        - mappings for dma_alloc_coherent()
        - mappings for vmalloc() and ioremap()
        - fixmap mappings, primarily used in copy_user_pages() to create
          kernel mappings of user pages with the correct cache colour.
      
      To optimise the TLB miss handler we don't want to add an additional
      condition which checks whether the faulting address is in the user or
      the kernel portion of the address space, and so all page tables have a
      common portion which describes the kernel part of the address
      space. As the SH4 uses a two level page table, only the kernel portion
      of first level page table (the pgd entries) is duplicated. These all
      point to the same second level entries (the pte's), and so no memory
      is wasted.
      
      The reference page table for the kernel is called the swapper_pg_dir,
      and when a new page table is created for a new process the kernel
      portion of the page table is copied from swapper_pg_dir. This works
      fine when changes only occur in the second level of the kernel's page
      table, or the first level entries are created before any new user
      processes. However if a change occurs to the first level of the page
      table, and there are existing processes which don't have this entry in
      their page table, this new entry needs to be added. This is done on
      demand, when the kernel accesses a P3 address which isn't mapped using
      the current page table, the code in vmalloc_fault() copies the entry
      from the reference page table (swapper_pg_dir) into the current
      processes page table.
      
      The bug which this patch addresses is that the code in vmalloc_fault()
      was not copying addresses which fell in the dma_alloc_coherent()
      portion of the address space, and it should have been copying any P3
      address.
      
      Why we hadn't seen this before, and what made this hard to reproduce,
      is that normally the kernel will have called dma_alloc_coherent(), and
      accessed the memory mapping created, before any user process
      runs. Typically drivers such as USB or SATA will have created and used
      mappings of this type during the kernel initialisation, when probing
      for the attached devices, before init runs. Ethernet is slightly
      different, as it normally only creates and accesses
      dma_alloc_coherent() mappings when the network is brought up, but if
      kernel level IP configuration is used this will also occur before any
      user space process runs. So the first reproduction of this problem
      which we saw was occurred when USB and SATA were removed from the
      kernel, and then bring up Ethernet from user space using ifconfig.
      I'd like to thank Joseph Bormolini who did the hard work reducing the
      problem to this simple to reproduce criteria.
      
      In your case the situation is slightly different, and turns out to
      depends on the exact kernel configuration (which we had) and your
      ramdisk contents (which we didn't - hence the need for some assumptions).
      
      In this case the problem is a side effect of kernel level module
      loading. Kernel subsystems sometimes trigger the load of kernel
      modules directly, for example the crypto subsystem tries to load the
      cryptomgr and MTD tries to load modules for Flash partitioning if
      these are not built into the kernel. This is done by the kernel
      creating a user process which runs insmod to try and load the
      appropriate module.
      
      In order for this to cause problems the system must be running with a
      initrd or initramfs, which contains an insmod executable - if the
      kernel can't find an insmod to run, no user process is created, and
      the problem doesn't occur.  If an insmod is found, a process is
      created to run it, which will inherit the kernel portion of the
      swapper_pg_dir first level page table. It doesn't matter whether the
      inmod is successful or not, but when the the kernel scheduler context
      switches back to the kernel initialisation thread, the insmod's mm is
      'borrowed' by the kernel thread, as it doesn't have an address space
      of its own. (Reference counting is used to ensure this mm is not
      destroyed, even though the user process which caused its creation may no
      longer exist.) If this address space doesn't have a first level page
      table entry for the consistent mappings, and a driver tries to access
      such a mapping, we are in the same situation as described above,
      except this time in a kernel thread rather than a user thread
      executing inside the kernel.
      
      See bugzilla: 15425, 15836, 15862, 16106, 16793
      Signed-off-by: default avatarStuart Menefy <stuart.menefy@st.com>
      Signed-off-by: default avatarPaul Mundt <lethal@linux-sh.org>
      8d9a784d
    • Linus Torvalds's avatar
    • Jonghwan Choi's avatar
      security: fix compile error in commoncap.c · 51b79bee
      Jonghwan Choi authored
      Add missing "personality.h"
      security/commoncap.c: In function 'cap_bprm_set_creds':
      security/commoncap.c:510: error: 'PER_CLEAR_ON_SETID' undeclared (first use in this function)
      security/commoncap.c:510: error: (Each undeclared identifier is reported only once
      security/commoncap.c:510: error: for each function it appears in.)
      Signed-off-by: default avatarJonghwan Choi <jhbird.choi@samsung.com>
      Acked-by: default avatarSerge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: default avatarJames Morris <james.l.morris@oracle.com>
      51b79bee
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · dbfad214
      Linus Torvalds authored
      Pull fuse updates from Miklos Szeredi.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
        fuse: use flexible array in fuse.h
        fuse: allow nanosecond granularity
        fuse: O_DIRECT support for files
        fuse: fix nlink after unlink
      dbfad214
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 743e89eb
      Linus Torvalds authored
      Pull s390 updates from Martin Schwidefsky:
       "A couple of bug fixes, one of them is a TLB flush fix.  Included as
        well is one small coding style patch and a patch to update the default
        configuration."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        [S390] Fix compile error in swab.h
        [S390] Fix stfle() lowcore protection problem
        [S390] cpum_cf: get rid of compile warnings
        [S390] irq: simple coding style change
        [S390] update default configuration
        [S390] fix tlb flushing for page table pages
        [S390] kernel: Use local_irq_save() for memcpy_real()
        [S390] s390/char/vmur.c: fix memory leak
        [S390] drivers/s390/block/dasd_eckd.c: add missing dasd_sfree_request
      743e89eb
  2. 18 Apr, 2012 8 commits
  3. 17 Apr, 2012 8 commits
    • Linus Torvalds's avatar
      Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 592fe898
      Linus Torvalds authored
      Pull ext4 regression fixes from Ted Ts'o:
       "This fixes a scalability problem reported by Andi Kleen and Tim Chen;
        they were quite secretive about the precise nature of their workload,
        but they later admitted that it only showed up when they were using a
        large sparse file, so the amount of data I/O that was needed was close
        to zero.
      
        I'm not sure how realistic this is and it's only a regression if you
        consider changes made since 2.6.39 to be a "regression" vis-a-vis the
        policy regarding post-merge window bug fixes, but Linus agreed it was
        worth fixing, so I'm including it in this pull request.
      
        This also fixes the journalled quota mount options, which I
        accidentally broke while I was cleaning up the mount option handling."
      
      * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: fix handling of journalled quota options
        ext4: address scalability issue by removing extent cache statistics
      592fe898
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · d44c6d4f
      Linus Torvalds authored
      Pull vfs fixes from Al Viro:
       "A bunch of endianness fixes and a couple of nfsd error value fixes.
      
        Speaking of endianness stuff, I'm rather tempted to slap
      
      	ccflags-y += -D__CHECK_ENDIAN__
      
        in fs/Makefile, if not making it default for the entire tree; nfsd
        regressions I've caught make one hell of a pile and we'd obviously
        benefit from having that kind of stuff caught earlier..."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        lockd: fix the endianness bug
        ocfs2: ->e_leaf_clusters endianness breakage
        ocfs2: ->rl_count endianness breakage
        ocfs: ->rl_used breakage on big-endian
        ocfs2: ->l_next_free_req breakage on big-endian
        btrfs: btrfs_root_readonly() broken on big-endian
        ext4: fix endianness breakage in ext4_split_extent_at()
        nfsd: fix compose_entry_fh() failure exits
        nfsd: fix error value on allocation failure in nfsd4_decode_test_stateid()
        nfsd: fix endianness breakage in TEST_STATEID handling
        nfsd: fix error values returned by nfsd4_lockt() when nfsd_open() fails
        nfsd: fix b0rken error value for setattr on read-only mount
      d44c6d4f
    • Linus Torvalds's avatar
      Merge git://git.samba.org/sfrench/cifs-2.6 · bc0cf58e
      Linus Torvalds authored
      Pull CIFS fixes from Steve French.
      
      * git://git.samba.org/sfrench/cifs-2.6:
        Fix number parsing in cifs_parse_mount_options
        Cleanup handling of NULL value passed for a mount option
      bc0cf58e
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4643b056
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar.
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86: Handle failures of parsing immediate operands in the instruction decoder
        perf archive: Correct cutting of symbolic link
        perf tools: Ignore auto-generated bison/flex files
        perf tools: Fix parsers' rules to dependencies
        perf tools: fix NO_GTK2 Makefile config error
        perf session: Skip event correctly for unknown id/machine
      4643b056
    • Linus Torvalds's avatar
      Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · cdd59830
      Linus Torvalds authored
      Pull virtio fixes from Michael S. Tsirkin:
       "Here are some virtio fixes for 3.4: a test build fix, a patch by Ren
        fixing naming for systems with a massive number of virtio blk devices,
        and balloon fixes for powerpc by David Gibson.
      
        There was some discussion about Ren's patch for virtio disc naming:
        some people wanted to move the legacy name mangling function to the
        block core.  But there's no concensus on that yet, and we can always
        deduplicate later.  Added comments in the hope that this will stop
        people from copying this legacy naming scheme into future drivers."
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        virtio_balloon: fix handling of PAGE_SIZE != 4k
        virtio_balloon: Fix endian bug
        virtio_blk: helper function to format disk names
        tools/virtio: fix up vhost/test module build
      cdd59830
    • Rafael J. Wysocki's avatar
      PCI: Retry BARs restoration for Type 0 headers only · a6cb9ee7
      Rafael J. Wysocki authored
      Some shortcomings introduced into pci_restore_state() by commit
      26f41062 ("PCI: check for pci bar restore completion and retry")
      have been fixed by recent commit ebfc5b80 ("PCI: Fix regression in
      pci_restore_state(), v3"), but that commit treats all PCI devices as
      those with Type 0 configuration headers.
      
      That is not entirely correct, because Type 1 and Type 2 headers have
      different layouts.  In particular, the area occupied by BARs in Type 0
      config headers contains the secondary status register in Type 1 ones and
      it doesn't make sense to retry the restoration of that register even if
      the value read back from it after a write is not the same as the written
      one (it very well may be different).
      
      For this reason, make pci_restore_state() only retry the restoration
      of BARs for Type 0 config headers.  This effectively makes it behave
      as before commit 26f41062 for all header types except for Type 0.
      Tested-by: default avatarMikko Vinni <mmvinni@yahoo.com>
      Signed-off-by: default avatarRafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a6cb9ee7
    • Randy Dunlap's avatar
      Documentation: maintainer change · 5191d566
      Randy Dunlap authored
      I'm dropping off as Documentation/ maintainer.
      Rob Landley has agreed to take it over.  Thanks, Rob.
      
      I'll still be around reviewing patches and testing.
      Signed-off-by: default avatarRandy Dunlap <rdunlap@xenotime.net>
      Acked-by: default avatarRob Landley <rob@landley.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5191d566
    • Luck, Tony's avatar
      ia64: fix futex_atomic_cmpxchg_inatomic() · c76f39bd
      Luck, Tony authored
      Michel Lespinasse cleaned up the futex calling conventions in commit
      37a9d912 ("futex: Sanitize cmpxchg_futex_value_locked API").
      
      But the ia64 implementation was subtly broken.  Gcc does not know that
      register "r8" will be updated by the fault handler if the cmpxchg
      instruction takes an exception.  So it feels safe in letting the
      initialization of r8 slide to after the cmpxchg.  Result: we always
      return 0 whether the user address faulted or not.
      
      Fix by moving the initialization of r8 into the __asm__ code so gcc
      won't move it.
      
      Reported-by: <emeric.maschino@gmail.com>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=42757
      Tested-by: <emeric.maschino@gmail.com>
      Acked-by: default avatarMichel Lespinasse <walken@google.com>
      Cc: stable@vger.kernel.org # v2.6.39+
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c76f39bd
  4. 16 Apr, 2012 7 commits
    • Theodore Ts'o's avatar
      ext4: fix handling of journalled quota options · 57f73c2c
      Theodore Ts'o authored
      Commit 26092bf5 broke handling of journalled quota mount options by
      trying to parse argument of every mount option as a number.  Fix this
      by dealing with the quota options before we call match_int().
      
      Thanks to Jan Kara for discovering this regression.
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      57f73c2c
    • Joe Perches's avatar
      checkpatch: revert --strict test for net/ and drivers/net block comment style · c06a9ebd
      Joe Perches authored
      Revert the --strict test for the old preferred block
      comment style in drivers/net and net/
      Reported-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJoe Perches <joe@perches.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c06a9ebd
    • Theodore Ts'o's avatar
      ext4: address scalability issue by removing extent cache statistics · 9cd70b34
      Theodore Ts'o authored
      Andi Kleen and Tim Chen have reported that under certain circumstances
      the extent cache statistics are causing scalability problems due to
      cache line bounces.
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      9cd70b34
    • Masami Hiramatsu's avatar
      x86: Handle failures of parsing immediate operands in the instruction decoder · 6c7b8e82
      Masami Hiramatsu authored
      This can happen if the instruction is much longer than the maximum length,
      or if insn->opnd_bytes is manually changed.
      
      This patch also fixes warnings from -Wswitch-default flag.
      Reported-by: default avatarPrashanth Nageshappa <prashanth@linux.vnet.ibm.com>
      Signed-off-by: default avatarMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
      Cc: Linux-mm <linux-mm@kvack.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: yrl.pp-manager.tt@hitachi.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120413032427.32577.42602.stgit@localhost.localdomainSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      6c7b8e82
    • Linus Torvalds's avatar
      Linux 3.4-rc3 · e816b57a
      Linus Torvalds authored
      e816b57a
    • Linus Torvalds's avatar
      Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm · 9a8e5d41
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
       "Nothing too disasterous, the biggest thing being the removal of the
        regulator support for vcore in the AMBA driver; only one SoC was using
        this and it got broken during the last merge window, which then
        started causing problems for other people.  Mutual agreement was
        reached for it to be removed."
      
      * 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
        ARM: 7386/1: jump_label: fixup for rename to static_key
        ARM: 7384/1: ThumbEE: Disable userspace TEEHBR access for !CONFIG_ARM_THUMBEE
        ARM: 7382/1: mm: truncate memory banks to fit in 4GB space for classic MMU
        ARM: 7359/2: smp_twd: Only wait for reprogramming on active cpus
        ARM: 7383/1: nommu: populate vectors page from paging_init
        ARM: 7381/1: nommu: fix typo in mm/Kconfig
        ARM: 7380/1: DT: do not add a zero-sized memory property
        ARM: 7379/1: DT: fix atags_to_fdt() second call site
        ARM: 7366/3: amba: Remove AMBA level regulator support
        ARM: 7377/1: vic: re-read status register before dispatching each IRQ handler
        ARM: 7368/1: fault.c: correct how the tsk->[maj|min]_flt gets incremented
      9a8e5d41
    • Linus Torvalds's avatar
      x86-32: fix up strncpy_from_user() sign error · 12e993b8
      Linus Torvalds authored
      The 'max' range needs to be unsigned, since the size of the user address
      space is bigger than 2GB.
      
      We know that 'count' is positive in 'long' (that is checked in the
      caller), so we will truncate 'max' down to something that fits in a
      signed long, but before we actually do that, that comparison needs to be
      done in unsigned.
      
      Bug introduced in commit 92ae03f2 ("x86: merge 32/64-bit versions of
      'strncpy_from_user()' and speed it up").  On x86-64 you can't trigger
      this, since the user address space is much smaller than 63 bits, and on
      x86-32 it works in practice, since you would seldom hit the strncpy
      limits anyway.
      
      I had actually tested the corner-cases, I had only tested them on
      x86-64.  Besides, I had only worried about the case of a pointer *close*
      to the end of the address space, rather than really far away from it ;)
      
      This also changes the "we hit the user-specified maximum" to return
      'res', for the trivial reason that gcc seems to generate better code
      that way.  'res' and 'count' are the same in that case, so it really
      doesn't matter which one we return.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      12e993b8
  5. 15 Apr, 2012 12 commits