1. 09 Mar, 2011 16 commits
    • Tejun Heo's avatar
      staging: Convert to bdops->check_events() · cafb0bfc
      Tejun Heo authored
      Convert two staging drivers - blkvsc_drv and cyasblkdev_block - from
      ->media_changed() to ->check_events().  The former always indicated
      media changed while the latter always indicated media not changed.
      Not sure what the drivers are trying to achieve but keep the original
      behavior.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarGreg Kroah-Hartman <gregkh@suse.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      cafb0bfc
    • Tejun Heo's avatar
      pktcdvd: Convert to bdops->check_events() · 3c0d2060
      Tejun Heo authored
      Convert from ->media_changed() to ->check_events().
      
      pktcdvd needs to forward all event related operations to the
      underlying device.  Forward ->check_events() instead of
      ->media_changed() and inherit disk->[async_]events.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Peter Osterlund <petero2@telia.com>
      3c0d2060
    • Tejun Heo's avatar
      umem: Drop dummy ->media_changed() · 6fac80e3
      Tejun Heo authored
      umem doesn't implement media changed detection and there's no need to
      implement dummy callback anymore.  Remove it.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      6fac80e3
    • Tejun Heo's avatar
      s390/tape_block: Convert to bdops->check_events() · ffe80cea
      Tejun Heo authored
      Convert from ->media_changed() to ->check_events().
      
      s390/tape_block buffers media changed state and clears it on
      revalidation.  It will behave correctly with kernel event polling.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      ffe80cea
    • Tejun Heo's avatar
      i2o_block: Convert to bdops->check_events() · f47350fd
      Tejun Heo authored
      Convert from ->media_changed() to ->check_events().
      
      i2o_block buffers media changed state and clears it after reporting.
      It will behave correctly with kernel event polling.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Markus Lidel <Markus.Lidel@shadowconnect.com>
      f47350fd
    • Tejun Heo's avatar
      xsysace: Convert to bdops->check_events() · 3a200911
      Tejun Heo authored
      Convert from ->media_changed() to ->check_events().
      
      xsysace buffers media changed state and clears it on revalidation.  It
      will behave correctly with kernel event polling.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarGrant Likely <grant.likely@secretlab.ca>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      3a200911
    • Tejun Heo's avatar
      ub: Convert to bdops->check_events() · aaa7c015
      Tejun Heo authored
      Convert from ->media_changed() to ->check_events().
      
      ub buffers media changed state and clears it on revalidation.  It will
      behave correctly with kernel event polling.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Pete Zaitcev <zaitcev@redhat.com>
      aaa7c015
    • Tejun Heo's avatar
      swim[3]: Convert to bdops->check_events() · 4bbde777
      Tejun Heo authored
      Convert from ->media_changed() to ->check_events().
      
      Both swim and swim3 buffer media changed state and clear it on
      revalidation.  They will behave correctly with kernel event polling.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Laurent Vivier <laurent@lvivier.info>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      4bbde777
    • Tejun Heo's avatar
      dac960: Convert to bdops->check_events() · 507daea2
      Tejun Heo authored
      Convert from ->media_changed() to ->check_events().
      
      DAC960 media change notification seems to be one way (once set, never
      cleared) and will generate spurious events when polled once the
      condition triggers.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      507daea2
    • Tejun Heo's avatar
      paride: Convert to bdops->check_events() · b1b56b93
      Tejun Heo authored
      Convert paride drivers from ->media_changed() to ->check_events().
      
      pcd and pd buffer and clear events after reporting; however, pf
      unconditionally reports MEDIA_CHANGE and will generate spurious events
      when polled.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Tim Waugh <tim@cyberelk.net>
      b1b56b93
    • Tejun Heo's avatar
      gdrom,viocd: Convert to bdops->check_events() · 1c27030b
      Tejun Heo authored
      Convert gdrom and viocd from ->media_changed() to ->check_events().
      
      It's unclear how the conditions are cleared and it's possible that it
      may generate spurious events when polled.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      1c27030b
    • Tejun Heo's avatar
      floppy,{ami|ata}flop: Convert to bdops->check_events() · 1a8a74f0
      Tejun Heo authored
      Convert the floppy drivers from ->media_changed() to ->check_events().
      Both floppy and ataflop buffer media changed state bit and clear them
      on revalidation and will behave correctly with kernel event polling.
      
      I can't tell how amiflop clears its event and it's possible that it
      may generate spurious events when polled.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      1a8a74f0
    • Tejun Heo's avatar
      ide: Convert to bdops->check_events() · 5b03a1b1
      Tejun Heo authored
      Convert ->media_changed() to the new ->check_events() method.  The
      conversion is mostly mechanical.  The only notable change is that
      cdrom now doesn't generate any event if @slot_nr isn't CDSL_CURRENT.
      It used to return -EINVAL which would be treated as media changed.  As
      media changer isn't supported anyway, this doesn't make any
      difference.
      
      This makes ide emit the standard disk events and allows kernel event
      polling.  Currently, only MEDIA_CHANGE event is implemented.  Adding
      support for EJECT_REQUEST shouldn't be difficult; however, given that
      ide driver is already deprecated, it probably is best to leave it
      alone.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarJens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: linux-ide@vger.kernel.org
      5b03a1b1
    • Tejun Heo's avatar
      block: Don't check events while open is in progress · 69e02c59
      Tejun Heo authored
      Not all block drivers clear events immediately after reporting.  Some
      do so in ->revalidate_disk() or other steps during ->open().  There is
      a slim chance event poll may happen between the clearing event check
      from check_disk_change() and the actual clearing of the events which
      would result in spurious events.
      
      Block event checks while block device open is in progress.  There is
      no need to kick explicit event check afterwards as events are always
      checked during open.
      
      -v2: The original patch could have called disk_unblock_events() with
           an already released or %NULL @disk causing oops.  Fixed by making
           sure references are put after disk_unblock_events() is called.
           It also makes the error path of __blkdev_get() a bit simpler.
           This problem was reported by Jens.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      69e02c59
    • Tejun Heo's avatar
      block: Don't check events on close unless it was blocked · 6936217c
      Tejun Heo authored
      The block event mechanism currently always checks events when the
      device is being closed regardless of the open mode.  The intention was
      to allow detection of EJECT_REQUEST when a device is closed whether
      disk event polling is enabled or not.
      
      This is unnecessary as, for devices of interest, events are checked
      from either userland or kernel and in the former case ->check_events()
      is performed on open of each poll attempt anyway.  Furthermore, this
      unconditional event check on close makes the code susceptible to event
      loop if the block driver doesn't clear reported events correctly - an
      event triggers userland to open and close the device which in turn
      causes another event, rinse and repeat.
      
      Check events on close only if it was blocked by excl write open.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      6936217c
    • Tejun Heo's avatar
      block: Don't implicitly trigger event check on disk_unblock_events() · facc31dd
      Tejun Heo authored
      Currently, disk_unblock_events() implicitly kick event check if the
      block count reaches zero.  This behavior is not described in the
      comment and hinders with future changes.  Make the unblocker
      explicitly check events by calling disk_check_events() as necessary.
      
      This patch doesn't cause any behavior difference.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      facc31dd
  2. 08 Mar, 2011 2 commits
  3. 07 Mar, 2011 6 commits
  4. 04 Mar, 2011 1 commit
    • Tejun Heo's avatar
      Merge branch 'for-linus' of ../linux-2.6-block into block-for-2.6.39/core · e83a46bb
      Tejun Heo authored
      This merge creates two set of conflicts.  One is simple context
      conflicts caused by removal of throtl_scheduled_delayed_work() in
      for-linus and removal of throtl_shutdown_timer_wq() in
      for-2.6.39/core.
      
      The other is caused by commit 255bb490 (block: blk-flush shouldn't
      call directly into q->request_fn() __blk_run_queue()) in for-linus
      crashing with FLUSH reimplementation in for-2.6.39/core.  The conflict
      isn't trivial but the resolution is straight-forward.
      
      * __blk_run_queue() calls in flush_end_io() and flush_data_end_io()
        should be called with @force_kblockd set to %true.
      
      * elv_insert() in blk_kick_flush() should use
        %ELEVATOR_INSERT_REQUEUE.
      
      Both changes are to avoid invoking ->request_fn() directly from
      request completion path and closely match the changes in the commit
      255bb490.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      e83a46bb
  5. 03 Mar, 2011 5 commits
    • Petr Uzel's avatar
      block: kill loop_mutex · fd51469f
      Petr Uzel authored
      Following steps lead to deadlock in kernel:
      
      dd if=/dev/zero of=img bs=512 count=1000
      losetup -f img
      mkfs.ext2 /dev/loop0
      mount -t ext2 -o loop /dev/loop0 mnt
      umount mnt/
      
      Stacktrace:
      [<c102ec04>] irq_exit+0x36/0x59
      [<c101502c>] smp_apic_timer_interrupt+0x6b/0x75
      [<c127f639>] apic_timer_interrupt+0x31/0x38
      [<c101df88>] mutex_spin_on_owner+0x54/0x5b
      [<fe2250e9>] lo_release+0x12/0x67 [loop]
      [<c10c4eae>] __blkdev_put+0x7c/0x10c
      [<c10a4da5>] fput+0xd5/0x1aa
      [<fe2250cf>] loop_clr_fd+0x1a9/0x1b1 [loop]
      [<fe225110>] lo_release+0x39/0x67 [loop]
      [<c10c4eae>] __blkdev_put+0x7c/0x10c
      [<c10a59d9>] deactivate_locked_super+0x17/0x36
      [<c10b6f37>] sys_umount+0x27e/0x2a5
      [<c10b6f69>] sys_oldumount+0xb/0xe
      [<c1002897>] sysenter_do_call+0x12/0x26
      [<ffffffff>] 0xffffffff
      
      Regression since 2a48fc0a, which introduced the private
      loop_mutex as part of the BKL removal process.
      
      As per [1], the mutex can be safely removed.
      
      [1] http://www.gossamer-threads.com/lists/linux/kernel/1341930
      
      Addresses: https://bugzilla.novell.com/show_bug.cgi?id=669394
      Addresses: https://bugzilla.kernel.org/show_bug.cgi?id=29172Signed-off-by: default avatarPetr Uzel <petr.uzel@suse.cz>
      Cc: stable@kernel.org
      Reviewed-by: default avatarNikanth Karthikesan <knikanth@suse.de>
      Acked-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarJens Axboe <jaxboe@fusionio.com>
      fd51469f
    • Tao Ma's avatar
      blktrace: Remove blk_fill_rwbs_rq. · 2d3a8497
      Tao Ma authored
      If we enable trace events to trace block actions, We use
      blk_fill_rwbs_rq to analyze the corresponding actions
      in request's cmd_flags, but we only choose the minor 2 bits
      from it, so most of other flags(e.g, REQ_SYNC) are missing.
      For example, with a sync write we get:
      write_test-2409  [001]   160.013869: block_rq_insert: 3,64 W 0 () 258135 + =
      8 [write_test]
      
      Since now we have integrated the flags of both bio and request,
      it is safe to pass rq->cmd_flags directly to blk_fill_rwbs and
      blk_fill_rwbs_rq isn't needed any more.
      
      With this patch, after a sync write we get:
      write_test-2417  [000]   226.603878: block_rq_insert: 3,64 WS 0 () 258135 +=
       8 [write_test]
      Signed-off-by: default avatarTao Ma <boyu.mt@taobao.com>
      Acked-by: default avatarJeff Moyer <jmoyer@redhat.com>
      Signed-off-by: default avatarJens Axboe <jaxboe@fusionio.com>
      2d3a8497
    • Vivek Goyal's avatar
      block: Move blk_throtl_exit() call to blk_cleanup_queue() · da527770
      Vivek Goyal authored
      Move blk_throtl_exit() in blk_cleanup_queue() as blk_throtl_exit() is
      written in such a way that it needs queue lock. In blk_release_queue()
      there is no gurantee that ->queue_lock is still around.
      
      Initially blk_throtl_exit() was in blk_cleanup_queue() but Ingo reported
      one problem.
      
        https://lkml.org/lkml/2010/10/23/86
      
        And a quick fix moved blk_throtl_exit() to blk_release_queue().
      
              commit 7ad58c02
              Author: Jens Axboe <jaxboe@fusionio.com>
              Date:   Sat Oct 23 20:40:26 2010 +0200
      
              block: fix use-after-free bug in blk throttle code
      
      This patch reverts above change and does not try to shutdown the
      throtl work in blk_sync_queue(). By avoiding call to
      throtl_shutdown_timer_wq() from blk_sync_queue(), we should also avoid
      the problem reported by Ingo.
      
      blk_sync_queue() seems to be used only by md driver and it seems to be
      using it to make sure q->unplug_fn is not called as md registers its
      own unplug functions and it is about to free up the data structures
      used by unplug_fn(). Block throttle does not call back into unplug_fn()
      or into md. So there is no need to cancel blk throttle work.
      
      In fact I think cancelling block throttle work is bad because it might
      happen that some bios are throttled and scheduled to be dispatched later
      with the help of pending work and if work is cancelled, these bios might
      never be dispatched.
      
      Block layer also uses blk_sync_queue() during blk_cleanup_queue() and
      blk_release_queue() time. That should be safe as we are also calling
      blk_throtl_exit() which should make sure all the throttling related
      data structures are cleaned up.
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: default avatarJens Axboe <jaxboe@fusionio.com>
      da527770
    • Vivek Goyal's avatar
      loop: No need to initialize ->queue_lock explicitly before calling blk_cleanup_queue() · cd25f549
      Vivek Goyal authored
      Now we initialize ->queue_lock at queue allocation time so driver does
      not have to worry about initializing it before calling
      blk_cleanup_queue().
      Signed-off-by: default avatarJens Axboe <jaxboe@fusionio.com>
      cd25f549
    • Vivek Goyal's avatar
      block: Initialize ->queue_lock to internal lock at queue allocation time · c94a96ac
      Vivek Goyal authored
      There does not seem to be a clear convention whether q->queue_lock is
      initialized or not when blk_cleanup_queue() is called. In the past it
      was not necessary but now blk_throtl_exit() takes up queue lock by
      default and needs queue lock to be available.
      
      In fact elevator_exit() code also has similar requirement just that it
      is less stringent in the sense that elevator_exit() is called only if
      elevator is initialized.
      
      Two problems have been noticed because of ambiguity about spin lock
      status.
      
            - If a driver calls blk_alloc_queue() and then soon calls
              blk_cleanup_queue() almost immediately, (because some other
      	driver structure allocation failed or some other error happened)
      	then blk_throtl_exit() will run into issues as queue lock is not
      	initialized. Loop driver ran into this issue recently and I
      	noticed error paths in md driver too. Similar error paths should
      	exist in other drivers too.
      
            - If some driver provided external spin lock and zapped the lock
              before blk_cleanup_queue(), then it can lead to issues.
      
      So this patch initializes the default queue lock at queue allocation time.
      
      block throttling code is one of the users of queue lock and it is
      initialized at the queue allocation time, so it makes sense to
      initialize ->queue_lock also to internal lock. A driver can overide that
      lock later. This will take care of the issue where a driver does not have
      to worry about initializing the queue lock to default before calling
      blk_cleanup_queue()
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: default avatarJens Axboe <jaxboe@fusionio.com>
      c94a96ac
  6. 02 Mar, 2011 3 commits
  7. 01 Mar, 2011 7 commits