1. 23 Jan, 2015 4 commits
    • libata: use blk taging · 12cb5ce1
      Shaohua Li authored
      libata uses its own tag management, which duplicates the block layer's
      and is poorly implemented. And if we switch to blk-mq, tagging is
      built in. It's time to switch to generic tagging.
      
      The SAS driver has its own tag management, and it looks like we can't
      directly map the host controller tag to the SATA tag. So I just bypassed
      the SAS case.
      
      I changed the code/variable names for the tag management of libata to
      make it self-contained. Only SAS will use it. Later, if libsas implements
      its own tag management, the tag management code in libata can be deleted
      easily.
      
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • Merge branch 'for-3.20/core' into for-3.20/drivers · a4a1cc16
      Jens Axboe authored
      We need the tagging changes for the libata conversion.
    • blk-mq: add tag allocation policy · 24391c0d
      Shaohua Li authored
      This is the blk-mq part of the tag allocation policy support. The default
      allocation policy isn't changed (though it's not a strict FIFO). The new
      policy is round-robin, for libata. But it's a best-effort implementation:
      if multiple tasks are competing, the tags returned will be mixed (which is
      unavoidable even with !mq, as requests from different tasks can be
      mixed in the queue).
      
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • block: support different tag allocation policy · ee1b6f7a
      Shaohua Li authored
      libata allocates tags using a round-robin policy. The next patch will
      make libata use the generic block tag allocation, so let's add a policy
      to tag allocation.
      
      Currently there are two policies: FIFO (default) and round-robin.
      
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
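
The difference between the two policies can be illustrated with a minimal user-space sketch. This is not the block-layer code: `tag_pool`, `tag_alloc`, `tag_free` and the enum names are hypothetical, and "FIFO" is modelled here simply as "lowest free tag first", which is an assumption made for illustration. The round-robin variant starts its search just after the most recently allocated tag.

```c
#include <stdint.h>

enum tag_policy { TAG_ALLOC_FIFO, TAG_ALLOC_RR };  /* hypothetical names */

struct tag_pool {
	uint32_t busy;          /* bit i set => tag i is in use */
	unsigned depth;         /* number of tags, 1..32 */
	unsigned next;          /* where a round-robin search starts */
	enum tag_policy policy;
};

/* Returns a free tag, or -1 if every tag is busy. */
static int tag_alloc(struct tag_pool *p)
{
	unsigned start = (p->policy == TAG_ALLOC_RR) ? p->next : 0;

	for (unsigned i = 0; i < p->depth; i++) {
		unsigned tag = (start + i) % p->depth;

		if (!(p->busy & (UINT32_C(1) << tag))) {
			p->busy |= UINT32_C(1) << tag;
			/* Only consulted by the round-robin policy. */
			p->next = (tag + 1) % p->depth;
			return (int)tag;
		}
	}
	return -1;
}

static void tag_free(struct tag_pool *p, unsigned tag)
{
	p->busy &= ~(UINT32_C(1) << tag);
}
```

With this model, allocating, freeing and allocating again hands back tag 0 twice under the first policy, while the round-robin policy moves on to tag 1.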
  2. 22 Jan, 2015 2 commits
  3. 21 Jan, 2015 2 commits
    • block: Add discard flag to blkdev_issue_zeroout() function · d93ba7a5
      Martin K. Petersen authored
      blkdev_issue_zeroout() will zero a given block range. This is done by
      way of explicit writing, thus provisioning or allocating the blocks on
      disk.
      
      There are use cases where the desired behavior is to zero the blocks but
      unprovision them if possible. The blocks must deterministically contain
      zeroes when they are subsequently read back.
      
      This patch adds a flag to blkdev_issue_zeroout() that provides this
      variant. If the discard flag is set and a block device guarantees
      discard_zeroes_data we will use REQ_DISCARD to clear the block range. If
      the device does not support discard_zeroes_data or if the discard
      request fails we will fall back to first REQ_WRITE_SAME and then a
      regular REQ_WRITE.
      
      Also update the callers of blkdev_issue_zeroout() to reflect the new flag
      and make sb_issue_zeroout() prefer the discard approach.
      
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
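
The fallback order described in this commit message can be sketched as a small decision function. This is only a model of the described behaviour, not the kernel implementation: `struct zero_dev`, its fields, and `zeroout_method()` are hypothetical names, and the "would this request succeed" booleans stand in for actually issuing the request and checking its result.

```c
#include <stdbool.h>

/* Hypothetical device model; the field names are illustrative only. */
struct zero_dev {
	bool discard_zeroes_data;  /* discarded blocks are guaranteed to read as zeroes */
	bool discard_ok;           /* stand-in: a discard request would succeed */
	bool write_same_ok;        /* stand-in: a WRITE SAME request would succeed */
};

enum zero_method { ZERO_BY_DISCARD, ZERO_BY_WRITE_SAME, ZERO_BY_WRITE };

/* Pick the method that ends up zeroing the range, in the order the
 * commit message gives: discard, then WRITE SAME, then a regular write. */
static enum zero_method zeroout_method(const struct zero_dev *dev, bool discard)
{
	if (discard && dev->discard_zeroes_data && dev->discard_ok)
		return ZERO_BY_DISCARD;
	if (dev->write_same_ok)
		return ZERO_BY_WRITE_SAME;
	return ZERO_BY_WRITE;
}
```

Discard is only attempted when the caller sets the flag and the device guarantees zeroed reads, so a caller that needs the blocks provisioned can pass `false` and keep the explicit-write behaviour.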
    • cfq-iosched: fix incorrect filing of rt async cfqq · c6ce1943
      Jeff Moyer authored
      Hi,
      
      If you can manage to submit an async write as the first async I/O from
      the context of a process with realtime scheduling priority, then a
      cfq_queue is allocated, but filed into the wrong async_cfqq bucket.  It
      ends up in the best effort array, but actually has realtime I/O
      scheduling priority set in cfqq->ioprio.
      
      The reason is that cfq_get_queue assumes the default scheduling class and
      priority when there is no information present (i.e. when the async cfqq
      is created):
      
      static struct cfq_queue *
      cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
      	      struct bio *bio, gfp_t gfp_mask)
      {
      	const int ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio);
      	const int ioprio = IOPRIO_PRIO_DATA(cic->ioprio);
      
      cic->ioprio starts out as 0, which is "invalid".  So, class of 0
      (IOPRIO_CLASS_NONE) is passed to cfq_async_queue_prio like so:
      
      		async_cfqq = cfq_async_queue_prio(cfqd, ioprio_class, ioprio);
      
      static struct cfq_queue **
      cfq_async_queue_prio(struct cfq_data *cfqd, int ioprio_class, int ioprio)
      {
              switch (ioprio_class) {
              case IOPRIO_CLASS_RT:
                      return &cfqd->async_cfqq[0][ioprio];
              case IOPRIO_CLASS_NONE:
                      ioprio = IOPRIO_NORM;
                      /* fall through */
              case IOPRIO_CLASS_BE:
                      return &cfqd->async_cfqq[1][ioprio];
              case IOPRIO_CLASS_IDLE:
                      return &cfqd->async_idle_cfqq;
              default:
                      BUG();
              }
      }
      
      Here, instead of returning a class mapped from the process' scheduling
      priority, we get back the bucket associated with IOPRIO_CLASS_BE.
      
      Now, there is no queue allocated there yet, so we create it:
      
      		cfqq = cfq_find_alloc_queue(cfqd, is_sync, cic, bio, gfp_mask);
      
      That function ends up doing this:
      
      			cfq_init_cfqq(cfqd, cfqq, current->pid, is_sync);
      			cfq_init_prio_data(cfqq, cic);
      
      cfq_init_cfqq marks the priority as having changed.  Then,
      cfq_init_prio_data does this:
      
      	ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio);
      	switch (ioprio_class) {
      	default:
      		printk(KERN_ERR "cfq: bad prio %x\n", ioprio_class);
      	case IOPRIO_CLASS_NONE:
      		/*
      		 * no prio set, inherit CPU scheduling settings
      		 */
      		cfqq->ioprio = task_nice_ioprio(tsk);
      		cfqq->ioprio_class = task_nice_ioclass(tsk);
      		break;
      
      So we basically have two code paths that treat IOPRIO_CLASS_NONE
      differently, which results in an RT async cfqq filed into a best effort
      bucket.
      
      Attached is a patch which fixes the problem.  I'm not sure how to make
      it cleaner.  Suggestions would be welcome.
      
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: stable@kernel.org
      Signed-off-by: Jens Axboe <axboe@fb.com>
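
The mismatch between the two code paths can be reproduced in a few lines. The sketch below is a simplified model, not CFQ code: `bucket_class()`, `queue_class()` and `misfiled()` are hypothetical helpers, and the task's CPU scheduling class is reduced to a single boolean.

```c
#include <stdbool.h>

/* Same ordering as the IOPRIO_CLASS_* values: NONE is 0. */
enum io_class { CLASS_NONE, CLASS_RT, CLASS_BE, CLASS_IDLE };

/* Path 1: bucket selection treats "no class set" as best effort. */
static enum io_class bucket_class(enum io_class requested)
{
	return requested == CLASS_NONE ? CLASS_BE : requested;
}

/* Path 2: queue initialisation inherits from the task's CPU scheduling
 * settings when no I/O class is set. */
static enum io_class queue_class(enum io_class requested, bool task_is_rt)
{
	if (requested != CLASS_NONE)
		return requested;
	return task_is_rt ? CLASS_RT : CLASS_BE;
}

/* True when the queue lands in a bucket of a different class. */
static bool misfiled(enum io_class requested, bool task_is_rt)
{
	return bucket_class(requested) != queue_class(requested, task_is_rt);
}
```

Only the combination "no I/O class set" plus "realtime task" is misfiled in this model, which matches the narrow trigger condition described above.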
  4. 16 Jan, 2015 1 commit
    • null_blk: suppress invalid partition info · 227290b4
      Jens Axboe authored
      null_blk is partitionable, but it doesn't store any of the info. When
      it is loaded, you would normally see:
      
      [1226739.343608]  nullb0: unknown partition table
      [1226739.343746]  nullb1: unknown partition table
      
      which can confuse some people. Add the appropriate gendisk flag
      to suppress this info.
      
      Signed-off-by: Jens Axboe <axboe@fb.com>
  5. 14 Jan, 2015 6 commits
    • blk-mq: fix false negative out-of-tags condition · 0bf36498
      Jens Axboe authored
      The blk-mq tagging tries to maintain some locality between CPUs and
      the tags issued. The tags are split into groups of words, and the
      words may not be fully populated. When searching for a new free tag,
      blk-mq may look at partial words, hence it passes in an offset/size
      to find_next_zero_bit(). However, it does that wrong: the size must
      always be the full length of the number of tags in that word,
      otherwise we'll potentially miss some near the end.
      
      Another issue is when __bt_get() goes from one word set to the next.
      It bumps the index, but not the last_tag associated with the
      previous index. Bump that to be in the range of the new word.
      
      Finally, clean up __bt_get() and __bt_get_word() a bit and get
      rid of the goto in there, and the unnecessary 'wrap' variable.
      
      Signed-off-by: Jens Axboe <axboe@fb.com>
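
Why the size argument matters can be seen with a stand-in for the bit search. The helper below is a simplified, single-word re-implementation written for this illustration, and `next_zero_bit()` is a hypothetical name. It follows the usual convention that `size` is an exclusive upper bound on the bit index (not a count starting at `offset`) and that `size` is returned when no clear bit is found. It illustrates the semantics only; it is not a reconstruction of the exact faulty call.

```c
/* Returns the index of the first clear bit in [offset, size),
 * or size if every bit in that range is set. */
static unsigned next_zero_bit(unsigned long word, unsigned size, unsigned offset)
{
	for (unsigned i = offset; i < size; i++)
		if (!(word & (1UL << i)))
			return i;
	return size;
}
```

Take a word holding 8 tags with tags 0 to 5 busy (`0x3f`) and a search starting at offset 4. Passing the full depth, `next_zero_bit(0x3f, 8, 4)`, finds the free tag 6. Passing a shortened size, `next_zero_bit(0x3f, 4, 4)`, searches an empty range and reports "nothing free" even though tags 6 and 7 are available.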
    • brd: Request from fdisk 4k alignment · c8fa3173
      Boaz Harrosh authored
      Because the direct_access() API returns a PFN, partitions had
      better start on a 4K boundary; otherwise offset zero of a partition
      will not be aligned and blk_direct_access() will fail the call.
      
      By setting blk_queue_physical_block_size(PAGE_SIZE) we can communicate
      this to fdisk and friends.
      
      The call to blk_queue_physical_block_size() is harmless and will
      not affect the Kernel behavior in any way. It is only for
      communication to user-mode.
      
      Before this patch, running fdisk on a default-size brd of 4M offers
      34 as the first sector (BAD); after this patch it offers 40, i.e.
      aligned to 8 sectors. Also, when entering some random partition sizes,
      the next partition-start sector offered is aligned to 8 sectors after
      this patch. (Please note that with fdisk the user can still enter bad
      values; only the offered default values will be correct.)
      
      Note that with bdev-size > 4M, fdisk will in any case try to align on
      a 1M boundary (the first sector above will then be 2048).
      
      CC: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Boaz Harrosh <boaz@plexistor.com>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
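
The sector numbers quoted above can be checked with a little arithmetic. The helper below is a sketch that assumes 512-byte sectors and 4 KiB pages; `sector_is_page_aligned()` and the two constants are names chosen for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE     512u
#define PAGE_SIZE_BYTES 4096u   /* assumption: 4 KiB pages */

/* A partition start is usable for direct access only if its byte
 * offset is a multiple of the page size. */
static bool sector_is_page_aligned(uint64_t sector)
{
	return (sector * SECTOR_SIZE) % PAGE_SIZE_BYTES == 0;
}
```

Sector 34 starts at byte 17408, which is not a multiple of 4096, while sector 40 starts at byte 20480 (5 pages) and sector 2048 at exactly 1 MiB, so both of the latter are aligned.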
    • brd: Fix all partitions BUGs · 937af5ec
      Boaz Harrosh authored
      This patch fixes up brd's partitions scheme, now enjoying all worlds.
      
      The MAIN fix here is that currently, if one fdisks some partitions,
      a BAD bug will make all partitions point to the same start-end sector,
      i.e. 0 - brd_size. An mkfs of any partition would then trash the
      partition table and the other partitions.
      
      Another fix is that "mount -U uuid" will not work if show_part was not
      specified, because of the GENHD_FL_SUPPRESS_PARTITION_INFO flag.
      We now always load without it and remove the show_part parameter.
      
      [We remove Dmitry's new module-param part_show; it is now always
       show]
      
      So NOW the logic goes like this:
      * max_part - Just says how many minors to reserve between ramX
        devices. In any case, there can be as many partitions as requested.
        If the minors between devices run out, dynamic 259-major ids will
        be allocated on the fly.
        The default is now max_part=1, which means all partition devt(s)
        will be from the dynamic (259) major range.
        (If persistent partition minors are needed, use max_part=X.)
        For example, with /dev/sdX max_part is hard-coded to 16.
      
      * Creation of new devices on the fly still/always work:
        mknod /path/devnod b 1 X
        fdisk -l /path/devnod
        Will create a new device if [X / max_part] was not already
        created before. (Just as before)
      
        partitions on the dynamically created device will work as well
        Same logic applies with minors as with the pre-created ones.
      
      TODO: dynamic growth of device size, so each device can have its
            own size.
      
      CC: Dmitry Monakhov <dmonakhov@openvz.org>
      Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Boaz Harrosh <boaz@plexistor.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
    • Jens Axboe's avatar
      d4119ee0
    • block: Change direct_access calling convention · dd22f551
      Matthew Wilcox authored
      In order to support accesses to larger chunks of memory, pass in a
      'size' parameter (counted in bytes), and return the amount available at
      that address.
      
      Add a new helper function, bdev_direct_access(), to handle common
      functionality including partition handling, checking the length requested
      is positive, checking for the sector being page-aligned, and checking
      the length of the request does not pass the end of the partition.
      
      Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Boaz Harrosh <boaz@plexistor.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
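
The checks listed for the new helper can be sketched in user space as follows. This is an approximation of the described behaviour, not the kernel function: `struct part`, `check_direct_access()`, the chosen error codes and the 4 KiB page size are assumptions made for this example.

```c
#include <errno.h>
#include <stdint.h>

#define SECTOR_SIZE      512
#define SECTORS_PER_PAGE 8     /* assumption: 4 KiB pages */

/* Hypothetical partition description, in 512-byte sectors. */
struct part {
	uint64_t start_sect;   /* first sector of the partition on the device */
	uint64_t nr_sects;     /* partition length */
};

/* Validates a request of `size` bytes at partition-relative `sector`.
 * On success returns 0 and stores the device-relative sector. */
static int check_direct_access(const struct part *p, uint64_t sector,
			       long size, uint64_t *abs_sector)
{
	if (size <= 0)
		return -EINVAL;                 /* length must be positive */

	uint64_t nsect = ((uint64_t)size + SECTOR_SIZE - 1) / SECTOR_SIZE;
	if (sector + nsect > p->nr_sects)
		return -ERANGE;                 /* runs past the partition end */

	uint64_t abs = p->start_sect + sector;  /* partition handling */
	if (abs % SECTORS_PER_PAGE)
		return -EINVAL;                 /* not page-aligned on the device */

	*abs_sector = abs;
	return 0;
}
```

Note that alignment is checked after the partition offset has been added, which is why a partition starting at sector 34 fails even for a request at its own sector 0.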
    • axonram: Fix bug in direct_access · 91117a20
      Matthew Wilcox authored
      The 'pfn' returned by axonram was completely bogus, and has been since
      2008.
      
      Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Jens Axboe <axboe@fb.com>
  6. 02 Jan, 2015 8 commits
  7. 31 Dec, 2014 1 commit
  8. 29 Dec, 2014 1 commit
  9. 28 Dec, 2014 4 commits
  10. 27 Dec, 2014 4 commits
  11. 26 Dec, 2014 5 commits
    • Merge branch 'parisc-3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 58628a78
      Linus Torvalds authored
      Pull parisc build fix from Helge Deller:
       "This unbreaks the kernel compilation on parisc with gcc-4.9"
      
      * 'parisc-3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: fix out-of-register compiler error in ldcw inline assembler function
    • parisc: fix out-of-register compiler error in ldcw inline assembler function · 45db0738
      John David Anglin authored
      The __ldcw macro has a problem when its argument needs to be reloaded from
      memory. The output memory operand and the input register operand both need to
      be reloaded using a register in class R1_REGS when generating 64-bit code.
      This fails because there's only a single register in the class. Instead, use a
      memory clobber. This also makes the __ldcw macro a compiler memory barrier.
      
      Signed-off-by: John David Anglin <dave.anglin@bell.net>
      Cc: <stable@vger.kernel.org>        [3.13+]
      Signed-off-by: Helge Deller <deller@gmx.de>
    • ALSA: hda_intel: apply the Seperate stream_tag for Skylake · d6795827
      Libin Yang authored
      The total number of input and output streams on Skylake exceeds 15,
      which causes some streams not to work because of an overflow of the
      SDxCTL.STRM field when the legacy stream tag allocation method is
      used.
      
      This patch uses the new stream tag allocation method by adding
      the flag AZX_DCAPS_SEPARATE_STREAM_TAG for the Skylake platform.
      
      Signed-off-by: Libin Yang <libin.yang@intel.com>
      Reviewed-by: Vinod Koul <vinod.koul@intel.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
    • ALSA: hda_controller: Separate stream_tag for input and output streams. · 93e3423e
      Rafal Redzimski authored
      Implement separate stream_tag assignment for input and output streams.
      According to the HDA specification, a stream tag must be unique within
      the input streams group; however, an output stream might use a stream
      tag which is already in use by an input stream. This change is necessary
      to support HW which provides a total of more than 15 stream DMA engines.
      With the legacy implementation, this causes an overflow of the
      SDxCTL.STRM field (and the whole SDxCTL register), and as a result the
      Reserved value 0 ends up in the SDxCTL.STRM field, which confuses the
      HDA controller.
      
      Signed-off-by: Rafal Redzimski <rafal.f.redzimski@intel.com>
      Signed-off-by: Jayachandran B <jayachandran.b@intel.com>
      Signed-off-by: Libin Yang <libin.yang@intel.com>
      Reviewed-by: Vinod Koul <vinod.koul@intel.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
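
The overflow can be demonstrated with plain bit arithmetic. The sketch below assumes the stream tag occupies a 4-bit field at bits 23:20 of SDxCTL. The macro and function names are made up for this example, and the two tag-assignment helpers are a simplified model of the "legacy" and "separate" schemes rather than the driver code.

```c
#include <stdint.h>

#define SD_CTL_STRM_SHIFT 20                               /* assumed field position */
#define SD_CTL_STRM_MASK  (UINT32_C(0xf) << SD_CTL_STRM_SHIFT)

/* Value the 4-bit STRM field holds after a tag is programmed into it. */
static unsigned strm_field(unsigned stream_tag)
{
	uint32_t ctl = ((uint32_t)stream_tag << SD_CTL_STRM_SHIFT) & SD_CTL_STRM_MASK;

	return ctl >> SD_CTL_STRM_SHIFT;
}

/* Legacy model: one running tag across all streams, input and output. */
static unsigned legacy_tag(unsigned stream_index)
{
	return stream_index + 1;
}

/* Separate model: tags restart at 1 within each direction. */
static unsigned separate_tag(unsigned index_within_direction)
{
	return index_within_direction + 1;
}
```

In this model the sixteenth stream gets legacy tag 16, which no longer fits in four bits and leaves the reserved value 0 in the field. Numbering each direction separately keeps every tag within 1 to 15 as long as neither direction has more than 15 streams.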
    • Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 08b022a9
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Xmas fixes pull:
      
        core:
            one atomic fix, revert the WARN_ON dumb buffers patch.
      
        agp:
            fixup Dave J.
      
        nouveau:
            fix 3.18 regression for old userspace
      
        tegra fixes:
            vblank and iommu fixes
      
        amdkfd:
            fix bugs shown by testing with userspace, init apertures once
      
        msm:
            hdmi fixes and cleanup
      
        i915:
            misc fixes
      
        There is also a link ordering fix that I've asked to be cc'ed to you,
        putting iommu before gpu, it fixes an issue with amdkfd when things
        are all in the kernel, but I didn't like sending it via my tree
        without discussion.
      
        I'll probably be a bit on/off for a few weeks with pulls now, due to
        holidays and LCA, so don't be surprised if stuff gets a bit backed up,
        and things end up a bit large due to lag"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (28 commits)
        Revert "drm/gem: Warn on illegal use of the dumb buffer interface v2"
        agp: Fix up email address & attributions in AGP MODULE_AUTHOR tags
        nouveau: bring back legacy mmap handler
        drm/msm/hdmi: rework HDMI IRQ handler
        drm/msm/hdmi: enable regulators before clocks to avoid warnings
        drm/msm/mdp5: update irqs on crtc<->encoder link change
        drm/msm: block incoming update on pending updates
        drm/atomic: fix potential null ptr on plane enable
        drm/msm: Deletion of unnecessary checks before the function call "release_firmware"
        drm/msm: Deletion of unnecessary checks before two function calls
        drm/tegra: dc: Select root window for event dispatch
        drm/tegra: gem: Use the proper size for GEM objects
        drm/tegra: gem: Flush buffer objects upon allocation
        drm/tegra: dc: Fix a potential race on page-flip completion
        drm/tegra: dc: Consistently use the same pipe
        drm/irq: Add drm_crtc_vblank_count()
        drm/irq: Add drm_crtc_handle_vblank()
        drm/irq: Add drm_crtc_send_vblank_event()
        drm/i915: Disable PSMI sleep messages on all rings around context switches
        drm/i915: Force the CS stall for invalidate flushes
        ...
  12. 25 Dec, 2014 1 commit
  13. 24 Dec, 2014 1 commit