1. 30 Oct, 2011 7 commits
    • Merge branch 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm · 1bc87b00
      Linus Torvalds authored
      * 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (75 commits)
        KVM: SVM: Keep intercepting task switching with NPT enabled
        KVM: s390: implement sigp external call
        KVM: s390: fix register setting
        KVM: s390: fix return value of kvm_arch_init_vm
        KVM: s390: check cpu_id prior to using it
        KVM: emulate lapic tsc deadline timer for guest
        x86: TSC deadline definitions
        KVM: Fix simultaneous NMIs
        KVM: x86 emulator: convert push %sreg/pop %sreg to direct decode
        KVM: x86 emulator: switch lds/les/lss/lfs/lgs to direct decode
        KVM: x86 emulator: streamline decode of segment registers
        KVM: x86 emulator: simplify OpMem64 decode
        KVM: x86 emulator: switch src decode to decode_operand()
        KVM: x86 emulator: qualify OpReg inhibit_byte_regs hack
        KVM: x86 emulator: switch OpImmUByte decode to decode_imm()
        KVM: x86 emulator: free up some flag bits near src, dst
        KVM: x86 emulator: switch src2 to generic decode_operand()
        KVM: x86 emulator: expand decode flags to 64 bits
        KVM: x86 emulator: split dst decode to a generic decode_operand()
        KVM: x86 emulator: move memop, memopp into emulation context
        ...
      1bc87b00
    • Merge branch 'fbdev-next' of git://github.com/schandinat/linux-2.6 · acff987d
      Linus Torvalds authored
      * 'fbdev-next' of git://github.com/schandinat/linux-2.6: (270 commits)
        video: platinumfb: Add __devexit_p at necessary place
        drivers/video: fsl-diu-fb: merge diu_pool into fsl_diu_data
        drivers/video: fsl-diu-fb: merge diu_hw into fsl_diu_data
        drivers/video: fsl-diu-fb: only DIU modes 0 and 1 are supported
        drivers/video: fsl-diu-fb: remove unused panel operating mode support
        drivers/video: fsl-diu-fb: use an enum for the AOI index
        drivers/video: fsl-diu-fb: add several new video modes
        drivers/video: fsl-diu-fb: remove broken screen blanking support
        drivers/video: fsl-diu-fb: move some definitions out of the header file
        drivers/video: fsl-diu-fb: fix some ioctls
        video: da8xx-fb: Increased resolution configuration of revised LCDC IP
        OMAPDSS: picodlp: add missing #include <linux/module.h>
        fb: fix au1100fb bitrot.
        mx3fb: fix NULL pointer dereference in screen blanking.
        video: irq: Remove IRQF_DISABLED
        smscufx: change edid data to u8 instead of char
        OMAPDSS: DISPC: zorder support for DSS overlays
        OMAPDSS: DISPC: VIDEO3 pipeline support
        OMAPDSS/OMAP_VOUT: Fix incorrect OMAP3-alpha compatibility setting
        video/omap: fix build dependencies
        ...
      
      Fix up conflicts in:
       - drivers/staging/xgifb/XGI_main_26.c
      	Changes to XGIfb_pan_var()
       - drivers/video/omap/{lcd_apollon.c,lcd_ldp.c,lcd_overo.c}
      	Removed (or in the case of apollon.c, merged into the generic
      	DSS panel in drivers/video/omap2/displays/panel-generic-dpi.c)
      acff987d
    • KVM: SVM: Keep intercepting task switching with NPT enabled · f1c1da2b
      Jan Kiszka authored
      AMD processors apparently have a bug in the hardware task switching
      support when NPT is enabled. If the task switch triggers a NPF, we can
      get wrong EXITINTINFO along with that fault. On resume, spurious
      exceptions may then be injected into the guest.
      
      We were able to reproduce this bug when our guest triggered #SS and the
      handler was supposed to run over a separate task with not yet touched
      stack pages.
      
      Work around the issue by continuing to emulate task switches even in
      NPT mode.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      f1c1da2b
    • KVM: s390: implement sigp external call · 7697e71f
      Christian Ehrhardt authored
      Implement sigp external call, which might be required for guests that
      issue an external call instead of an emergency signal for IPI.
      
      This fixes an issue with "KVM: unknown SIGP: 0x02" when booting
      such an SMP guest.
      Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      7697e71f
    • KVM: s390: fix register setting · 7eef87dc
      Carsten Otte authored
      KVM common code does vcpu_load prior to calling our arch ioctls and
      vcpu_put after we're done here. Via the kvm_arch_vcpu_load/put
      callbacks we do load the fpu and access register state into the
      processor, which saves us moving the state on every SIE exit the
      kernel handles. However this breaks register setting from userspace,
      because of the following sequence:
      1a. vcpu load stores userspace register content
      1b. vcpu load loads guest register content
      2.  kvm_arch_vcpu_ioctl_set_fpu/sregs updates saved guest register content
      3a. vcpu put stores the guest registers and overwrites the new content
      3b. vcpu put loads the userspace register set again
      
      This patch loads the new guest register state into the cpu, so that the correct
      (new) set of guest registers will be stored in step 3a.
      Signed-off-by: Carsten Otte <cotte@de.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      7eef87dc
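The load/set/put sequence described in this commit can be modelled in plain C. This is only a sketch under stated assumptions: `struct toy_vcpu`, `toy_hw_regs` and the `toy_*` functions are hypothetical stand-ins, not the s390 KVM code, and the "registers" are just an integer array.

```c
#include <string.h>

#define TOY_NREGS 4

struct toy_vcpu {
    int saved_guest[TOY_NREGS];  /* guest registers as saved in memory */
    int saved_host[TOY_NREGS];   /* userspace registers as saved in memory */
};

/* Stand-in for the physical register file of the CPU. */
int toy_hw_regs[TOY_NREGS];

void toy_vcpu_load(struct toy_vcpu *v)
{
    memcpy(v->saved_host, toy_hw_regs, sizeof(toy_hw_regs));   /* step 1a */
    memcpy(toy_hw_regs, v->saved_guest, sizeof(toy_hw_regs));  /* step 1b */
}

void toy_vcpu_put(struct toy_vcpu *v)
{
    memcpy(v->saved_guest, toy_hw_regs, sizeof(toy_hw_regs));  /* step 3a */
    memcpy(toy_hw_regs, v->saved_host, sizeof(toy_hw_regs));   /* step 3b */
}

/* Step 2 without the fix: only the saved copy is updated, so step 3a
 * later overwrites it with the stale values still in the registers. */
void toy_set_regs_buggy(struct toy_vcpu *v, const int *new_regs)
{
    memcpy(v->saved_guest, new_regs, sizeof(v->saved_guest));
}

/* Step 2 with the fix: the new state is also loaded into the registers,
 * so step 3a stores the new values. */
void toy_set_regs_fixed(struct toy_vcpu *v, const int *new_regs)
{
    memcpy(v->saved_guest, new_regs, sizeof(v->saved_guest));
    memcpy(toy_hw_regs, new_regs, sizeof(toy_hw_regs));
}
```

Running load, the buggy setter, then put leaves the old guest values in `saved_guest`; the same sequence with the fixed setter keeps the new ones.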
    • KVM: s390: fix return value of kvm_arch_init_vm · b290411a
      Carsten Otte authored
      This patch fixes the return value of kvm_arch_init_vm in case a memory
      allocation goes wrong.
      Signed-off-by: Carsten Otte <cotte@de.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      b290411a
    • KVM: s390: check cpu_id prior to using it · 4d47555a
      Carsten Otte authored
      We use the cpu id provided by userspace as array index here. Thus we
      clearly need to check it first. Ooops.
      
      CC: <stable@vger.kernel.org>
      Signed-off-by: Carsten Otte <cotte@de.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
      4d47555a
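The general pattern behind this fix, validating a userspace-supplied index before using it as an array subscript, can be sketched as follows. The names `TOY_MAX_CPUS`, `toy_cpu_table` and `toy_lookup_cpu` are hypothetical and are not taken from the s390 KVM source.

```c
#include <stddef.h>

#define TOY_MAX_CPUS 64   /* hypothetical table size, not the kernel's value */

int toy_cpu_table[TOY_MAX_CPUS];

/* Return a pointer to the slot for cpu_id, or NULL if the id is out of
 * range. The id is unsigned, so one upper-bound check is sufficient. */
int *toy_lookup_cpu(unsigned int cpu_id)
{
    if (cpu_id >= TOY_MAX_CPUS)
        return NULL;
    return &toy_cpu_table[cpu_id];
}
```

Taking the id as an unsigned type avoids a separate check for negative values.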
  2. 29 Oct, 2011 7 commits
  3. 28 Oct, 2011 26 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 · ec7ae517
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (204 commits)
        [SCSI] qla4xxx: export address/port of connection (fix udev disk names)
        [SCSI] ipr: Fix BUG on adapter dump timeout
        [SCSI] megaraid_sas: Fix instance access in megasas_reset_timer
        [SCSI] hpsa: change confusing message to be more clear
        [SCSI] iscsi class: fix vlan configuration
        [SCSI] qla4xxx: fix data alignment and use nl helpers
        [SCSI] iscsi class: fix link local mispelling
        [SCSI] iscsi class: Replace iscsi_get_next_target_id with IDA
        [SCSI] aacraid: use lower snprintf() limit
        [SCSI] lpfc 8.3.27: Change driver version to 8.3.27
        [SCSI] lpfc 8.3.27: T10 additions for SLI4
        [SCSI] lpfc 8.3.27: Fix queue allocation failure recovery
        [SCSI] lpfc 8.3.27: Change algorithm for getting physical port name
        [SCSI] lpfc 8.3.27: Changed worst case mailbox timeout
        [SCSI] lpfc 8.3.27: Miscellanous logic and interface fixes
        [SCSI] megaraid_sas: Changelog and version update
        [SCSI] megaraid_sas: Add driver workaround for PERC5/1068 kdump kernel panic
        [SCSI] megaraid_sas: Add multiple MSI-X vector/multiple reply queue support
        [SCSI] megaraid_sas: Add support for MegaRAID 9360/9380 12GB/s controllers
        [SCSI] megaraid_sas: Clear FUSION_IN_RESET before enabling interrupts
        ...
      ec7ae517
    • Merge branch 'for-linus' of git://ceph.newdream.net/git/ceph-client · 97d2eb13
      Linus Torvalds authored
      * 'for-linus' of git://ceph.newdream.net/git/ceph-client:
        libceph: fix double-free of page vector
        ceph: fix 32-bit ino numbers
        libceph: force resend of osd requests if we skip an osdmap
        ceph: use kernel DNS resolver
        ceph: fix ceph_monc_init memory leak
        ceph: let the set_layout ioctl set single traits
        Revert "ceph: don't truncate dirty pages in invalidate work thread"
        ceph: replace leading spaces with tabs
        libceph: warn on msg allocation failures
        libceph: don't complain on msgpool alloc failures
        libceph: always preallocate mon connection
        libceph: create messenger with client
        ceph: document ioctls
        ceph: implement (optional) max read size
        ceph: rename rsize -> rasize
        ceph: make readpages fully async
      97d2eb13
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 68d99b2c
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (549 commits)
        ALSA: hda - Fix ADC input-amp handling for Cx20549 codec
        ALSA: hda - Keep EAPD turned on for old Conexant chips
        ALSA: hda/realtek - Fix missing volume controls with ALC260
        ASoC: wm8940: Properly set codec->dapm.bias_level
        ALSA: hda - Fix pin-config for ASUS W90V
        ALSA: hda - Fix surround/CLFE headphone and speaker pins order
        ALSA: hda - Fix typo
        ALSA: Update the sound git tree URL
        ALSA: HDA: Add new revision for ALC662
        ASoC: max98095: Convert codec->hw_write to snd_soc_write
        ASoC: keep pointer to resource so it can be freed
        ASoC: sgtl5000: Fix wrong mask in some snd_soc_update_bits calls
        ASoC: wm8996: Fix wrong mask for setting WM8996_AIF_CLOCKING_2
        ASoC: da7210: Add support for line out and DAC
        ASoC: da7210: Add support for DAPM
        ALSA: hda/realtek - Fix DAC assignments of multiple speakers
        ASoC: Use SGTL5000_LINREG_VDDD_MASK instead of hardcoded mask value
        ASoC: Set sgtl5000->ldo in ldo_regulator_register
        ASoC: wm8996: Use SND_SOC_DAPM_AIF_OUT for AIF2 Capture
        ASoC: wm8994: Use SND_SOC_DAPM_AIF_OUT for AIF3 Capture
        ...
      68d99b2c
    • Merge branch 'next-rebase' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci · 0e59e7e7
      Linus Torvalds authored
      * 'next-rebase' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci:
        PCI: Clean-up MPS debug output
        pci: Clamp pcie_set_readrq() when using "performance" settings
        PCI: enable MPS "performance" setting to properly handle bridge MPS
        PCI: Workaround for Intel MPS errata
        PCI: Add support for PASID capability
        PCI: Add implementation for PRI capability
        PCI: Export ATS functions to modules
        PCI: Move ATS implementation into own file
        PCI / PM: Remove unnecessary error variable from acpi_dev_run_wake()
        PCI hotplug: acpiphp: Prevent deadlock on PCI-to-PCI bridge remove
        PCI / PM: Extend PME polling to all PCI devices
        PCI quirk: mmc: Always check for lower base frequency quirk for Ricoh 1180:e823
        PCI: Make pci_setup_bridge() non-static for use by arch code
        x86: constify PCI raw ops structures
        PCI: Add quirk for known incorrect MPSS
        PCI: Add Solarflare vendor ID and SFC4000 device IDs
      0e59e7e7
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc · 46b51ea2
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (83 commits)
        mmc: fix compile error when CONFIG_BLOCK is not enabled
        mmc: core: Cleanup eMMC4.5 conditionals
        mmc: omap_hsmmc: if multiblock reads are broken, disable them
        mmc: core: add workaround for controllers with broken multiblock reads
        mmc: core: Prevent too long response times for suspend
        mmc: recognise SDIO cards with SDIO_CCCR_REV 3.00
        mmc: sd: Handle SD3.0 cards not supporting UHS-I bus speed mode
        mmc: core: support HPI send command
        mmc: core: Add cache control for eMMC4.5 device
        mmc: core: Modify the timeout value for writing power class
        mmc: core: new discard feature support at eMMC v4.5
        mmc: core: mmc sanitize feature support for v4.5
        mmc: dw_mmc: modify DATA register offset
        mmc: sdhci-pci: add flag for devices that can support runtime PM
        mmc: omap_hsmmc: ensure pbias configuration is always done
        mmc: core: Add Power Off Notify Feature eMMC 4.5
        mmc: sdhci-s3c: fix potential NULL dereference
        mmc: replace printk with appropriate display macro
        mmc: core: Add default timeout value for CMD6
        mmc: sdhci-pci: add runtime pm support
        ...
      46b51ea2
    • Merge branch 'devel-stable' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm · 1fdb24e9
      Linus Torvalds authored
      * 'devel-stable' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm: (178 commits)
        ARM: 7139/1: fix compilation with CONFIG_ARM_ATAG_DTB_COMPAT and large TEXT_OFFSET
        ARM: gic, local timers: use the request_percpu_irq() interface
        ARM: gic: consolidate PPI handling
        ARM: switch from NO_MACH_MEMORY_H to NEED_MACH_MEMORY_H
        ARM: mach-s5p64x0: remove mach/memory.h
        ARM: mach-s3c64xx: remove mach/memory.h
        ARM: plat-mxc: remove mach/memory.h
        ARM: mach-prima2: remove mach/memory.h
        ARM: mach-zynq: remove mach/memory.h
        ARM: mach-bcmring: remove mach/memory.h
        ARM: mach-davinci: remove mach/memory.h
        ARM: mach-pxa: remove mach/memory.h
        ARM: mach-ixp4xx: remove mach/memory.h
        ARM: mach-h720x: remove mach/memory.h
        ARM: mach-vt8500: remove mach/memory.h
        ARM: mach-s5pc100: remove mach/memory.h
        ARM: mach-tegra: remove mach/memory.h
        ARM: plat-tcc: remove mach/memory.h
        ARM: mach-mmp: remove mach/memory.h
        ARM: mach-cns3xxx: remove mach/memory.h
        ...
      
      Fix up mostly pretty trivial conflicts in:
       - arch/arm/Kconfig
       - arch/arm/include/asm/localtimer.h
       - arch/arm/kernel/Makefile
       - arch/arm/mach-shmobile/board-ap4evb.c
       - arch/arm/mach-u300/core.c
       - arch/arm/mm/dma-mapping.c
       - arch/arm/mm/proc-v7.S
       - arch/arm/plat-omap/Kconfig
      largely due to some CONFIG option renaming (ie CONFIG_PM_SLEEP ->
      CONFIG_ARM_CPU_SUSPEND for the arm-specific suspend code etc) and
      addition of NEED_MACH_MEMORY_H next to HAVE_IDE.
      1fdb24e9
    • Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue · f362f98e
      Linus Torvalds authored
      * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue: (21 commits)
        leases: fix write-open/read-lease race
        nfs: drop unnecessary locking in llseek
        ext4: replace cut'n'pasted llseek code with generic_file_llseek_size
        vfs: add generic_file_llseek_size
        vfs: do (nearly) lockless generic_file_llseek
        direct-io: merge direct_io_walker into __blockdev_direct_IO
        direct-io: inline the complete submission path
        direct-io: separate map_bh from dio
        direct-io: use a slab cache for struct dio
        direct-io: rearrange fields in dio/dio_submit to avoid holes
        direct-io: fix a wrong comment
        direct-io: separate fields only used in the submission path from struct dio
        vfs: fix spinning prevention in prune_icache_sb
        vfs: add a comment to inode_permission()
        vfs: pass all mask flags check_acl and posix_acl_permission
        vfs: add hex format for MAY_* flag values
        vfs: indicate that the permission functions take all the MAY_* flags
        compat: sync compat_stats with statfs.
        vfs: add "device" tag to /proc/self/mountstats
        cleanup: vfs: small comment fix for block_invalidatepage
        ...
      
      Fix up trivial conflict in fs/gfs2/file.c (llseek changes)
      f362f98e
    • Merge http://sucs.org/~rohan/git/gfs2-3.0-nmw · f793f296
      Linus Torvalds authored
      * http://sucs.org/~rohan/git/gfs2-3.0-nmw: (24 commits)
        GFS2: Move readahead of metadata during deallocation into its own function
        GFS2: Remove two unused variables
        GFS2: Misc fixes
        GFS2: rewrite fallocate code to write blocks directly
        GFS2: speed up delete/unlink performance for large files
        GFS2: Fix off-by-one in gfs2_blk2rgrpd
        GFS2: Clean up ->page_mkwrite
        GFS2: Correctly set goal block after allocation
        GFS2: Fix AIL flush issue during fsync
        GFS2: Use cached rgrp in gfs2_rlist_add()
        GFS2: Call do_strip() directly from recursive_scan()
        GFS2: Remove obsolete assert
        GFS2: Cache the most recently used resource group in the inode
        GFS2: Make resource groups "append only" during life of fs
        GFS2: Use rbtree for resource groups and clean up bitmap buffer ref count scheme
        GFS2: Fix lseek after SEEK_DATA, SEEK_HOLE have been added
        GFS2: Clean up gfs2_create
        GFS2: Use ->dirty_inode()
        GFS2: Fix bug trap and journaled data fsync
        GFS2: Fix inode allocation error path
        ...
      f793f296
    • Merge branch '3.2-without-smb2' of git://git.samba.org/sfrench/cifs-2.6 · dabcbb1b
      Linus Torvalds authored
      * '3.2-without-smb2' of git://git.samba.org/sfrench/cifs-2.6: (52 commits)
        Fix build break when freezer not configured
        Add definition for share encryption
        CIFS: Make cifs_push_locks send as many locks at once as possible
        CIFS: Send as many mandatory unlock ranges at once as possible
        CIFS: Implement caching mechanism for posix brlocks
        CIFS: Implement caching mechanism for mandatory brlocks
        CIFS: Fix DFS handling in cifs_get_file_info
        CIFS: Fix error handling in cifs_readv_complete
        [CIFS] Fixup trivial checkpatch warning
        [CIFS] Show nostrictsync and noperm mount options in /proc/mounts
        cifs, freezer: add wait_event_freezekillable and have cifs use it
        cifs: allow cifs_max_pending to be readable under /sys/module/cifs/parameters
        cifs: tune bdi.ra_pages in accordance with the rsize
        cifs: allow for larger rsize= options and change defaults
        cifs: convert cifs_readpages to use async reads
        cifs: add cifs_async_readv
        cifs: fix protocol definition for READ_RSP
        cifs: add a callback function to receive the rest of the frame
        cifs: break out 3rd receive phase into separate function
        cifs: find mid earlier in receive codepath
        ...
      dabcbb1b
    • Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs · 5619a693
      Linus Torvalds authored
      * 'for-linus' of git://oss.sgi.com/xfs/xfs: (69 commits)
        xfs: add AIL pushing tracepoints
        xfs: put in missed fix for merge problem
        xfs: do not flush data workqueues in xfs_flush_buftarg
        xfs: remove XFS_bflush
        xfs: remove xfs_buf_target_name
        xfs: use xfs_ioerror_alert in xfs_buf_iodone_callbacks
        xfs: clean up xfs_ioerror_alert
        xfs: clean up buffer allocation
        xfs: remove buffers from the delwri list in xfs_buf_stale
        xfs: remove XFS_BUF_STALE and XFS_BUF_SUPER_STALE
        xfs: remove XFS_BUF_SET_VTYPE and XFS_BUF_SET_VTYPE_REF
        xfs: remove XFS_BUF_FINISH_IOWAIT
        xfs: remove xfs_get_buftarg_list
        xfs: fix buffer flushing during unmount
        xfs: optimize fsync on directories
        xfs: reduce the number of log forces from tail pushing
        xfs: Don't allocate new buffers on every call to _xfs_buf_find
        xfs: simplify xfs_trans_ijoin* again
        xfs: unlock the inode before log force in xfs_change_file_space
        xfs: unlock the inode before log force in xfs_fs_nfs_commit_metadata
        ...
      5619a693
    • leases: fix write-open/read-lease race · f3c7691e
      J. Bruce Fields authored
      In setlease, we use i_writecount to decide whether we can give out a
      read lease.
      
      In open, we break leases before incrementing i_writecount.
      
      There is therefore a window between the break lease and the i_writecount
      increment when setlease could add a new read lease.
      
      This would leave us with a simultaneous write open and read lease, which
      shouldn't happen.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      f3c7691e
    • nfs: drop unnecessary locking in llseek · 79835a71
      Andi Kleen authored
      This makes NFS follow the standard generic_file_llseek locking scheme.
      
      Cc: Trond.Myklebust@netapp.com
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      79835a71
    • ext4: replace cut'n'pasted llseek code with generic_file_llseek_size · 4cce0e28
      Andi Kleen authored
      This gives ext4 the benefits of unlocked llseek.
      
      Cc: tytso@mit.edu
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      4cce0e28
    • vfs: add generic_file_llseek_size · 5760495a
      Andi Kleen authored
      Add a generic_file_llseek variant to the VFS that allows passing in
      the maximum file size of the file system, instead of always
      using maxbytes from the superblock.
      
      This can be used to eliminate some cut'n'paste seek code in ext4.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      5760495a
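The idea of a seek helper that takes the maximum size as a parameter, with the default entry point passing a fixed limit, can be illustrated with a small user-space sketch. All names here (`toy_seek_limited`, `toy_seek_default`, `TOY_SB_MAXBYTES`) are hypothetical; this is not the kernel's `generic_file_llseek_size` and it handles only `SEEK_SET` for brevity.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>   /* SEEK_SET */

/* Hypothetical stand-in for the per-superblock maximum file size. */
#define TOY_SB_MAXBYTES ((int64_t)1 << 20)

/* Seek helper with a caller-supplied size limit. Only SEEK_SET is
 * modelled; other whence values are rejected in this sketch. */
int64_t toy_seek_limited(int64_t *pos, int64_t offset, int whence,
                         int64_t maxsize)
{
    if (whence != SEEK_SET)
        return -EINVAL;
    if (offset < 0 || offset > maxsize)
        return -EINVAL;
    *pos = offset;
    return offset;
}

/* Default variant: same logic, with the limit taken from the fixed
 * per-filesystem value. */
int64_t toy_seek_default(int64_t *pos, int64_t offset, int whence)
{
    return toy_seek_limited(pos, offset, whence, TOY_SB_MAXBYTES);
}
```

A filesystem with a smaller per-file limit would call the first function directly instead of duplicating the seek logic.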
    • vfs: do (nearly) lockless generic_file_llseek · ef3d0fd2
      Andi Kleen authored
      The i_mutex lock use of generic_file_llseek hurts.  Independent processes
      accessing the same file synchronize over a single lock, even though
      they have no need for synchronization at all.
      
      Under high utilization this can cause llseek to scale very poorly on larger
      systems.
      
      This patch does some rethinking of the llseek locking model:
      
      First the 64bit f_pos is not necessarily atomic without locks
      on 32bit systems. This can already cause races with read() today.
      This was discussed on linux-kernel in the past and deemed acceptable.
      The patch does not change that.
      
      Let's look at the different seek variants:
      
      SEEK_SET: Doesn't really need any locking.
      If there's a race one writer wins, the other loses.
      
      For 32bit the non atomic update races against read()
      stay the same. Without a lock they can also happen
      against write() now.  The read() race was deemed
      acceptable in past discussions, and I think if it's
      ok for read it's ok for write too.
      
      => Don't need a lock.
      
      SEEK_END: This behaves like SEEK_SET plus it reads
      the maximum size too. Reading the maximum size would have the
      32bit atomic problem. But luckily we already have a way to read
      the maximum size without locking (i_size_read), so we
      can just use that instead.
      
      Without i_mutex there is no synchronization with write() anymore,
      however since the write() update is atomic on 64bit it just behaves
      like another racy SEEK_SET.  On non atomic 32bit it's the same
      as SEEK_SET.
      
      => Don't need a lock, but need to use i_size_read()
      
      SEEK_CUR: This has a read-modify-write race window
      on the same file. One could argue that any application
      doing unsynchronized seeks on the same file is already broken.
      But for the sake of not adding a regression here I'm
      using the file->f_lock to synchronize this. Using this
      lock is much better than the inode mutex because it doesn't
      synchronize between processes.
      
      => So still need a lock, but can use a f_lock.
      
      This patch implements this new scheme in generic_file_llseek.
      I dropped generic_file_llseek_unlocked and changed all callers.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      ef3d0fd2
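The locking scheme described in this commit (no lock for SEEK_SET and SEEK_END, a per-file lock only for the read-modify-write in SEEK_CUR) can be sketched in user space. This is a toy model under stated assumptions: `struct toy_file` and `toy_llseek` are hypothetical, a pthread mutex stands in for `file->f_lock`, a plain field stands in for the value `i_size_read()` would return, and overflow and maximum-size checks are omitted.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

struct toy_file {
    int64_t pos;            /* stands in for file->f_pos */
    int64_t size;           /* stands in for the i_size_read() result */
    pthread_mutex_t lock;   /* stands in for file->f_lock */
};

int64_t toy_llseek(struct toy_file *f, int64_t offset, int whence)
{
    int64_t newpos;

    switch (whence) {
    case SEEK_SET:
        /* No lock: if two seeks race, one simply wins. */
        newpos = offset;
        break;
    case SEEK_END:
        /* No lock: only the size is read, then treated like SEEK_SET. */
        newpos = f->size + offset;
        break;
    case SEEK_CUR:
        /* Read-modify-write on the position, so take the per-file lock. */
        pthread_mutex_lock(&f->lock);
        newpos = f->pos + offset;
        if (newpos >= 0)
            f->pos = newpos;
        pthread_mutex_unlock(&f->lock);
        return newpos < 0 ? -EINVAL : newpos;
    default:
        return -EINVAL;
    }

    if (newpos < 0)
        return -EINVAL;
    f->pos = newpos;
    return newpos;
}
```

Because the lock lives in the per-open-file structure rather than the inode, processes that opened the same file independently never contend on it.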
    • direct-io: merge direct_io_walker into __blockdev_direct_IO · 847cc637
      Andi Kleen authored
      This doesn't change anything for the compiler, but hch thought it would
      make the code clearer.
      
      I moved the reference counting into its own little inline.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      847cc637
    • direct-io: inline the complete submission path · ba253fbf
      Andi Kleen authored
      Add inlines to all the submission path functions. While this increases
      code size it also gives gcc a lot of optimization opportunities
      in this critical hotpath.
      
      In particular -- together with some other changes -- this
      allows gcc to get rid of the unnecessary clearing of
      sdio at the beginning and optimize the messy parameter passing.
      Leaving any function that takes a sdio parameter out of line
      would break this optimization, because it cannot be done if the
      address of the structure is taken.
      
      Note that benefits are only seen with CONFIG_OPTIMIZE_INLINING
      and CONFIG_CC_OPTIMIZE_FOR_SIZE both set to off.
      
      This gives about 2.2% improvement on a large database benchmark
      with a high IOPS rate.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      ba253fbf
    • direct-io: separate map_bh from dio · 18772641
      Andi Kleen authored
      Only a single b_private field in the map_bh buffer head is needed after
      the submission path. Move map_bh separately to avoid storing
      this information in the long term slab.
      
      This avoids the weird 104 byte hole in struct dio_submit, which also needed
      to be cleared with memset early.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      18772641
    • direct-io: use a slab cache for struct dio · 6e8267f5
      Andi Kleen authored
      A direct slab call is slightly faster than kmalloc and can be better cached
      per CPU. It also avoids rounding to the next kmalloc slab.
      
      In addition this enforces cache line alignment for struct dio to avoid
      any false sharing.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      6e8267f5
    • direct-io: rearrange fields in dio/dio_submit to avoid holes · 0dc2bc49
      Andi Kleen authored
      Fix most problems reported by pahole.
      
      There is still a weird 104 byte hole after map_bh. I'm not sure what
      causes this.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      0dc2bc49
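The kind of hole that pahole reports comes from alignment padding between struct members. A minimal sketch, using hypothetical structs that are unrelated to the real `struct dio`, shows how reordering fields can remove padding; exact sizes depend on the platform ABI, so none are stated here.

```c
#include <stddef.h>

/* Small members separated by a larger, more strictly aligned member:
 * the compiler typically pads after each char. */
struct toy_holey {
    char flag;
    long counter;
    char mode;
};

/* Same members, largest first: the two chars share one padded tail. */
struct toy_reordered {
    long counter;
    char flag;
    char mode;
};

/* Bytes of padding between flag and counter in the first layout. */
size_t toy_hole_after_flag(void)
{
    return offsetof(struct toy_holey, counter) - sizeof(char);
}
```

Comparing `sizeof(struct toy_holey)` with `sizeof(struct toy_reordered)` on a given platform shows how much space the reordering recovers.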
    • direct-io: fix a wrong comment · cde1ecb3
      Andi Kleen authored
      There's nothing on the stack, even before my changes.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      cde1ecb3
    • direct-io: separate fields only used in the submission path from struct dio · eb28be2b
      Andi Kleen authored
      This large, but largely mechanical, patch moves all fields in struct dio
      that are only used in the submission path into a separate on stack
      data structure. This has the advantage that the memory is very likely
      cache hot, which is not guaranteed for memory fresh out of kmalloc.
      
      This also gives gcc more optimization potential because it can more easily
      determine that there are no external aliases for these variables.
      
      The sdio initialization is an initialization now instead of memset.
      This allows gcc to break sdio into individual fields and optimize
      away unnecessary zeroing (after all the functions are inlined).
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      eb28be2b
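The difference between zeroing an on-stack structure with memset and using an initializer can be sketched as below. The struct and function names are hypothetical stand-ins for the submission-path state; whether a compiler actually splits the struct into individual fields and drops unneeded zeroing depends on the compiler and its options, which this sketch does not demonstrate.

```c
#include <string.h>

/* Hypothetical stand-in for per-submission state kept on the stack. */
struct toy_submit {
    unsigned long blocks_done;
    unsigned long cur_page;
    int boundary;
};

/* Variant 1: the struct is zeroed through its address with memset. */
unsigned long toy_run_memset(unsigned long nblocks)
{
    struct toy_submit s;
    unsigned long i;

    memset(&s, 0, sizeof(s));
    for (i = 0; i < nblocks; i++)
        s.blocks_done++;
    return s.blocks_done;
}

/* Variant 2: the struct is zeroed by an initializer, so its address is
 * never taken in this function. */
unsigned long toy_run_init(unsigned long nblocks)
{
    struct toy_submit s = { 0 };
    unsigned long i;

    for (i = 0; i < nblocks; i++)
        s.blocks_done++;
    return s.blocks_done;
}
```

Both variants return the same result; the second form simply gives the compiler a struct whose address never escapes.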
    • vfs: fix spinning prevention in prune_icache_sb · 62a3ddef
      Christoph Hellwig authored
      We need to move the inode to the end of the list to actually make the
      spinning prevention explained in the comment above it work.  With a
      plain list_move it will simply stay in place as we're always reclaiming
      from the head of the list.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      62a3ddef
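Why a plain move leaves the entry in place while a move to the tail does not can be seen with a small circular doubly linked list. This is a self-contained sketch that mimics the semantics of the kernel's `list_move` and `list_move_tail`; the `toy_*` functions are hypothetical re-implementations, not the kernel's `list.h`.

```c
struct toy_list { struct toy_list *next, *prev; };

void toy_list_init(struct toy_list *head)
{
    head->next = head;
    head->prev = head;
}

/* Link entry e between two adjacent nodes. */
void toy_list_insert(struct toy_list *e, struct toy_list *prev,
                     struct toy_list *next)
{
    next->prev = e;
    e->next = next;
    e->prev = prev;
    prev->next = e;
}

void toy_list_add_tail(struct toy_list *e, struct toy_list *head)
{
    toy_list_insert(e, head->prev, head);
}

void toy_list_unlink(struct toy_list *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
}

/* Move e to the front of the list: a no-op for the first entry. */
void toy_list_move(struct toy_list *e, struct toy_list *head)
{
    toy_list_unlink(e);
    toy_list_insert(e, head, head->next);
}

/* Move e to the back of the list, so the next scan from the head
 * sees a different entry first. */
void toy_list_move_tail(struct toy_list *e, struct toy_list *head)
{
    toy_list_unlink(e);
    toy_list_insert(e, head->prev, head);
}
```

When reclaim always starts at the head, calling `toy_list_move` on the head entry puts it straight back at the head, whereas `toy_list_move_tail` lets the scan make progress.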
    • Andreas Gruenbacher's avatar
    • Andreas Gruenbacher's avatar
    • vfs: add hex format for MAY_* flag values · 8522ca58
      Aneesh Kumar K.V authored
      We are going to add more flags, and having them in hex format
      makes that simpler.
      Acked-by: J. Bruce Fields <bfields@redhat.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      8522ca58
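Writing bit flags in hex makes it easy to see which bit each one occupies and where the next free bit is. The sketch below uses hypothetical `TOY_MAY_*` names with illustrative values; they are modelled on the style of the kernel's `MAY_*` flags but should not be read as the kernel's definitions.

```c
/* Hypothetical permission flags, one bit each, written in hex. */
#define TOY_MAY_EXEC   0x00000001
#define TOY_MAY_WRITE  0x00000002
#define TOY_MAY_READ   0x00000004
#define TOY_MAY_APPEND 0x00000008
#define TOY_MAY_ACCESS 0x00000010   /* next flag continues the pattern */

/* Return nonzero if every bit requested in mask is present in granted. */
int toy_permitted(unsigned int granted, unsigned int mask)
{
    return (granted & mask) == mask;
}
```

With decimal values (1, 2, 4, 8, 16, ...) the same table becomes harder to scan as more flags are added.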