- 28 Apr, 2009 40 commits
-
-
Jens Axboe authored
We currently don't do merging on discard requests, but we potentially could. If we do, then we need to include discard requests in the IO accounting, or merging would end up decrementing in_flight IO counters for an IO which never incremented them. So enable accounting for discard requests. Problem found by Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Jens Axboe authored
We currently check for file system requests outside of blk_do_io_stat(rq), but we may as well just include it. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Tejun Heo authored
Now that all block request data transfer is done via bio, rq->data isn't used. Kill it. While at it, make the roles of rq->special and buffer clear. [ Impact: drop now unncessary field from struct request ] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Boaz Harrosh <bharrosh@panasas.com>
-
Tejun Heo authored
omap mailbox uses rq->data as the second opaque pointer to carry mbox_msg_t and rq->special message argument which is needed only for tx. Add and use omap_msg_tx_data struct for tx and use rq->special for mbox_msg_t for rx such that only rq->special is used as opaque pointer. [ Impact: cleanup rq->data usage, extra kmalloc in msg_send ] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk>
-
Tejun Heo authored
end_request() has been kept around for backward compatibility; however, it's about time for it to go away. * There aren't too many users left. * Its use of @updtodate is pretty confusing. * In some cases, newer code ends up using mixture of end_request() and [__]blk_end_request[_all](), which is way too confusing. So, add [__]blk_end_request_cur() and replace end_request() with it. Most conversions are straightforward. Noteworthy ones are... * paride/pcd: next_request() updated to take 0/-errno instead of 1/0. * paride/pf: pf_end_request() and next_request() updated to take 0/-errno instead of 1/0. * xd: xd_readwrite() updated to return 0/-errno instead of 1/0. * mtd/mtd_blkdevs: blktrans_discard_request() updated to return 0/-errno instead of 1/0. Unnecessary local variable res initialization removed from mtd_blktrans_thread(). [ Impact: cleanup ] Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Joerg Dorchain <joerg@dorchain.net> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Laurent Vivier <Laurent@lvivier.info> Cc: Tim Waugh <tim@cyberelk.net> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: unsik Kim <donari75@gmail.com>
-
Tejun Heo authored
There are many [__]blk_end_request() call sites which call it with full request length and expect full completion. Many of them ensure that the request actually completes by doing BUG_ON() the return value, which is awkward and error-prone. This patch adds [__]blk_end_request_all() which takes @rq and @error and fully completes the request. BUG_ON() is added to to ensure that this actually happens. Most conversions are simple but there are a few noteworthy ones. * cdrom/viocd: viocd_end_request() replaced with direct calls to __blk_end_request_all(). * s390/block/dasd: dasd_end_request() replaced with direct calls to __blk_end_request_all(). * s390/char/tape_block: tapeblock_end_request() replaced with direct calls to blk_end_request_all(). [ Impact: cleanup ] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Mike Miller <mike.miller@hp.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Alex Dubov <oakad@yahoo.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Tejun Heo authored
rq->start_time was initialized in init_request_from_bio() so special requests didn't have start_time set. This has been okay as start_time has been used only for fs requests; however, there is no indication of this actually is the case or not. Set rq->start_time in blk_rq_init() and guarantee that all initialized rq's have its start_time set. This improves consistency at virtually no cost and future changes will make use of the timestamp for !bio requests. [ Impact: rq->start_time is valid for all requests ] Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Request completion has gone through several changes and became a bit messy over the time. Clean it up. 1. end_that_request_data() is a thin wrapper around end_that_request_data_first() which checks whether bio is NULL before doing anything and handles bidi completion. blk_update_request() is a thin wrapper around end_that_request_data() which clears nr_sectors on the last iteration but doesn't use the bidi completion. Clean it up by moving the initial bio NULL check and nr_sectors clearing on the last iteration into end_that_request_data() and renaming it to blk_update_request(), which makes blk_end_io() the only user of end_that_request_data(). Collapse end_that_request_data() into blk_end_io(). 2. There are four visible completion variants - blk_end_request(), __blk_end_request(), blk_end_bidi_request() and end_request(). blk_end_request() and blk_end_bidi_request() uses blk_end_request() as the backend but __blk_end_request() and end_request() use separate implementation in __blk_end_request() due to different locking rules. blk_end_bidi_request() is identical to blk_end_io(). Collapse blk_end_io() into blk_end_bidi_request(), separate out request update into internal helper blk_update_bidi_request() and add __blk_end_bidi_request(). Redefine [__]blk_end_request() as thin inline wrappers around [__]blk_end_bidi_request(). 3. As the whole request issue/completion usages are about to be modified and audited, it's a good chance to convert completion functions return bool which better indicates the intended meaning of return values. 4. The function name end_that_request_last() is from the days when it was a public interface and slighly confusing. Give it a proper internal name - blk_finish_request(). 5. Add description explaning that blk_end_bidi_request() can be safely used for uni requests as suggested by Boaz Harrosh. The only visible behavior change is from #1. nr_sectors counts are cleared after the final iteration no matter which function is used to complete the request. I couldn't find any place where the code assumes those nr_sectors counters contain the values for the last segment and this change is good as it makes the API much more consistent as the end result is now same whether a request is completed using [__]blk_end_request() alone or in combination with blk_update_request(). API further cleaned up per Christoph's suggestion. [ Impact: cleanup, rq->*nr_sectors always updated after req completion ] Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Boaz Harrosh <bharrosh@panasas.com> Cc: Christoph Hellwig <hch@infradead.org>
-
Tejun Heo authored
With recent IDE updates, blk_end_request_callback() doesn't have any user now. Kill it. [ Impact: removal of unused convoluted interface ] Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Impact: code reorganization elv_next_request() and elv_dequeue_request() are public block layer interface than actual elevator implementation. They mostly deal with how requests interact with block layer and low level drivers at the beginning of rqeuest processing whereas __elv_next_request() is the actual eleveator request fetching interface. Move the two functions to blk-core.c. This prepares for further interface cleanup. Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Reorder request completion functions such that * All request completion functions are located together. * Functions which are used by only one caller is put right above the caller. * end_request() is put after other completion functions but before blk_update_request(). This change is for completion function cleanup which will follow. [ Impact: cleanup, code reorganization ] Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
* In blk_rq_timed_out_timer(), else { if } to else if * In blk_add_timer(), simplify if/else block [ Impact: cleanup ] Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
blk_insert_request() doesn't need to worry about REQ_SOFTBARRIER. Don't set it. Combined with recent ide updates, REQ_SOFTBARRIER is now only used in elevator proper and for discard requests. [ Impact: cleanup ] Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
RQ_NOMERGE_FLAGS already clears defines which REQ flags aren't mergeable. There is no reason to specify it superflously. It only adds to confusion. Don't set REQ_NOMERGE for barriers and requests with specific queueing directive. REQ_NOMERGE is now exclusively used by the merging code. [ Impact: cleanup ] Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
blk_start_queueing() is identical to __blk_run_queue() except that it doesn't check for recursion. None of the current users depends on blk_start_queueing() running request_fn directly. Replace usages of blk_start_queueing() with [__]blk_run_queue() and kill it. [ Impact: removal of mostly duplicate interface function ] Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
__blk_run_queue wraps blk_invoke_request_fn() such that it additionally removes plug and bails out early if the queue is empty. Both extra operations have their own pending mechanisms and don't cause any harm correctness-wise when they are done superflously. The only user of blk_invoke_request_fn() being blk_start_queue(), there isn't much reason to keep both functions around. Merge blk_invoke_request_fn() into __blk_run_queue() and make blk_start_queue() use __blk_run_queue() instead. [ Impact: merge two subtly different internal functions ] Signed-off-by: Tejun Heo <tj@kernel.org>
-
Jeff Moyer authored
Doing a proper block dev ->readpages() speeds up the crazy dump(8) approach of using interleaved process IO. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Bartlomiej Zolnierkiewicz authored
Enable by default support for large devices and files (CONFIG_LBD): - With 1TB disks being a commodity hardware it is quite easy to hit 2TB limitation while building RAIDs etc. and many distros have been using CONFIG_LBD=y by default already (at least Fedora 10 and openSUSE 11.1). - This should also prevent a subtle ext4 filesystem compatibility issue: mke2fs.ext4 defaults to creating filesystems with huge_files feature enabled and such filesystems cannot be later mounted read-write on machines with CONFIG_LBD=n (it should be quite easy to hit this issue when trying to use filesystem created using distro kernel on system running the self-build kernel, think about USB disk enclosures & co.). While at it: - Clarify config option help text w.r.t. mounting ext4 filesystems (they can be mounted with CONFIG_LBD=n but in the read-only mode). Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Tejun Heo authored
Impact: drop unnecessary code Now that everything uses bio and block operations, there is no need to reset request fields manually when retrying a request. Every field is guaranteed to be always valid. Drop unnecessary request field resetting from ide_dma_timeout_retry(). Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Impact: remove code path which is no longer necessary All IDE data transfers now use rq->bio. Simplify ide_map_sg() accordingly. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk>
-
Tejun Heo authored
Impact: remove fields and code paths which are no longer necessary Now that ide-tape uses standard mechanisms to transfer data, special case handling for bh handling can be dropped from ide-atapi. Drop the followings. * pc->cur_pos, b_count, bh and b_data * drive->pc_update_buffers() and pc_io_buffers(). Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Impact: cleanup idetape_chrdev_read/write() functions are unnecessarily complex when everything can be handled in a single loop. Collapse idetape_add_chrdev_read/write_request() into the rw functions and simplify the implementation. Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Impact: cleanup Byte size is what most issue functions deal with, make idetape_queue_rw_tail() and its wrappers take byte size instead of sector counts. idetape_chrdev_read() and write() functions are converted to use tape->buffer_size instead of ctl from tape->cap. This cleans up code a little bit and will ease the next r/w reimplementation. Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Impact: cleanup Read and write init paths are almost identical. Unify them into idetape_init_rw(). Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Impact: kill now unnecessary idetape_bh With everything using standard mechanisms, there is no need for idetape_bh anymore. Kill it and use tape->buf, cur and valid to describe data buffer instead. Changes worth mentioning are... * idetape_queue_rq_tail() now always queue tape->buf and and adjusts buffer state properly before completion. * idetape_pad_zeros() clears the buffer only once. Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Impact: use standard way to transfer data ide-tape uses rq in an interesting way. For r/w requests, rq->special is used to carry a private buffer management structure idetape_bh and rq->nr_sectors and current_nr_sectors are initialized to the number of idetape blocks which isn't necessary 512 bytes. Also, rq->current_nr_sectors is used to report back the residual count in units of idetape blocks. This peculiarity taxes both block layer and ide. ide-atapi has different paths and hooks to accomodate it and what a rq means becomes quite confusing and making changes at the block layer becomes quite difficult and error-prone. This patch makes ide-tape use bio instead. With the previous patch, ide-tape currently is using single contiguos buffer so replacing it isn't difficult. Data buffer is mapped into bio using blk_rq_map_kern() in idetape_queue_rw_tail(). idetape_io_buffers() and idetape_update_buffers() are dropped and pc->bh is set to null to tell ide-atapi to use standard data transfer mechanism and idetape_bh byte counts are updated by the issuer on completion using the residual count. This change also nicely removes the FIXME in ide_pc_intr() where ide-tape rqs need to be completed using ide_rq_bytes() instead of blk_rq_bytes() (although this didn't really matter as the request didn't have bio). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <jens.axboe@oracle.com>
-
Tejun Heo authored
Impact: simpler buffer allocation and handling, kills OOM, fix DMA transfers ide-tape has its own multiple buffer mechanism using struct idetape_bh. It allocates buffer with decreasing order-of-two allocations so that it results in minimum number of segments. However, the implementation is quite complex and works in a way that no other block or ide driver works necessitating a lot of special case handling. The benefit this complex allocation scheme brings is questionable as PIO or DMA the number of segments (16 maximum) doesn't make any noticeable difference and it also doesn't negate the need for multiple order allocation which can fail under memory pressure or high fragmentation although it does lower the highest order necessary by one when the buffer size isn't power of two. As the first step to remove the custom buffer management, this patch makes ide-tape allocate single continous buffer. The maximum order is four. I doubt the change would cause any trouble but if it ever matters, it should be converted to regular sg mechanism like everyone else and even in that case dropping custom buffer handling and moving to standard mechanism first make sense as an intermediate step. This patch makes the first bh to contain the whole buffer and drops multi bh handling code. Following patches will make further changes. This patch has the side effect of killing OOM triggered by allocation path and fixing DMA transfers. Previously, bug in alloc path triggered OOM on command issue and commands were passed to DMA engine without DMA-mapping all the segments. Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Impact: allow residual count implementation in ->pc_callback() rq->data_len has two duties - carrying the number of input bytes on issue and carrying residual count back to the issuer on completion. ide-atapi completion callback ->pc_callback() is the right place to do this but currently ide-atapi depends on rq->data_len carrying the original request size after calling ->pc_callback() to complete the pc request. This patch makes ide_pc_intr(), ide_tape_issue_pc() and ide_floppy_issue_pc() cache length to complete before calling ->pc_callback() so that it can modify rq->data_len as necessary. Note: As using rq->data_len for two purposes can make cases like this incorrect in subtle ways, future changes will introduce separate field for residual count. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <jens.axboe@oracle.com>
-
Tejun Heo authored
Impact: fix infinite retry loop After a command failed, ide-tape and floppy inserts REQUEST_SENSE in front of the failed command and according to the result, sets pc->retries, flags and errors. After REQUEST_SENSE is complete, the failed command is again at the front of the queue and if the verdict was to terminate the request, the issue functions tries to complete it directly by calling drive->pc_callback() and returning ide_stopped. However, drive->pc_callback() doesn't complete a request. It only prepares for completion of the request. As a result, this creates an infinite loop where the failed request is retried perpetually. Fix it by actually ending the request by calling ide_complete_rq(). Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Impact: cleanup rq->data usage ide-pm uses rq->data to carry pointer to struct request_pm_state through request queue and rq->special is used to carray pointer to local struct ide_cmd, which isn't necessary. Use rq->special for request_pm_state instead and use local ide_cmd in ide_start_power_step(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk>
-
Tejun Heo authored
Impact: unify request data buffer handling rq->data is used mostly to pass kernel buffer through request queue without using bio. There are only a couple of places which still do this in kernel and converting to bio isn't difficult. This patch converts ide-cd and atapi to use bio instead of rq->data for request sense and internal pc commands. With previous change to unify sense request handling, this is relatively easily achieved by adding blk_rq_map_kern() during sense_rq prep and PC issue. If blk_rq_map_kern() fails for sense, the error is deferred till sense issue and aborts the failed command which triggered the sense. Note that this is a slim possibility as sense prep is done on each command issue, so for the above condition to actually trigger, all preps since the last sense issue till the issue of the request which would require a sense should fail. * do_request functions might sleep now. This should be okay as ide request_fn - do_ide_request() - is invoked only from make_request and plug work. Make sure this is the case by adding might_sleep() to do_ide_request(). * Functions which access the read sense data before the sense request is complete now should access bio_data(sense_rq->bio) as the sense buffer might have been copied during blk_rq_map_kern(). * ide-tape updated to map sg. * cdrom_do_block_pc() now doesn't have to deal with REQ_TYPE_ATA_PC special case. Simplified. * tp_ops->output/input_data path dropped from ide_pc_intr(). Signed-off-by: Tejun Heo <tj@kernel.org>
-
Borislav Petkov authored
Since we're issuing REQ_TYPE_SENSE now we need to allow those types of rqs in the ->do_request callbacks. As a future improvement, sense_len assignment might be unified across all ATAPI devices. Borislav to check with specs and test. As a result, get rid of ide_queue_pc_head() and drive->request_sense_rq. tj: * Init request sense ide_atapi_pc from sense request. In the longer timer, it would probably better to fold ide_create_request_sense_cmd() into its only current user - ide_floppy_get_format_progress(). * ide_retry_pc() no longer takes @disk. CC: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> CC: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
Borislav Petkov authored
Preallocate a sense request in the ->do_request method and reinitialize it only on demand, in case it's been consumed in the IRQ handler path. The reason for this is that we don't want to be mapping rq to bio in the IRQ path and introduce all kinds of unnecessary hacks to the block layer. tj: * Both user and kernel PC requests expect sense data to be stored in separate storage other than drive->sense_data. Copy sense data to rq->sense on completion if rq->sense is not NULL. This fixes bogus sense data on PC requests. As a result, remove cdrom_queue_request_sense. CC: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> CC: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
Borislav Petkov authored
This is in preparation of removing the queueing of a sense request out of the IRQ handler path. Use struct request_sense as a general sense buffer for all ATAPI devices ide-{floppy,tape,cd}. tj: * blk_get_request(__GFP_WAIT) can't be called from do_request() as it can cause deadlock. Converted to use inline struct request and blk_rq_init(). * Added xfer / cdb len selection depending on device type. * All sense prep logics folded into ide_prep_sense() which never fails. * hwif->rq clearing and sense_rq used handling moved into ide_queue_sense_rq(). * blk_rq_map_kern() conversion is moved to later patch. CC: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> CC: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Impact: rq->buffer usage cleanup ide-cd uses rq->buffer to carry pointer to the original request when issuing REQUEST_SENSE. Use rq->special instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk>
-
Tejun Heo authored
Impact: rq->buffer usage cleanup ide-atapi uses rq->buffer as private opaque value for internal special requests. rq->special isn't used for these cases (the only case where rq->special is used is for ide-tape rw requests). Use rq->special instead. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk>
-
Tejun Heo authored
Impact: rq->buffer usage cleanup ide_raw_taskfile() directly uses rq->buffer to carry pointer to the data buffer. This complicates both block interface and ide backend request handling. Use blk_rq_map_kern() instead and drop special handling for REQ_TYPE_ATA_TASKFILE from ide_map_sg(). Note that REQ_RW setting is moved upwards as blk_rq_map_kern() uses it to initialize bio rw flag. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk>
-
Tejun Heo authored
Impact: remove unnecessary code path Block pc requests always use bio and rq->data is always NULL. No need to worry about !rq->bio cases in idefloppy_block_pc_cmd(). Note that ide-atapi uses ide_pio_bytes() for bio PIO transfer which handle sg fine. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk>
-
Tejun Heo authored
Impact: code simplification ide_cd_request_sense_fixup() clears the tail of the sense buffer if the device didn't completely fill it. This patch makes cdrom_queue_request_sense() clear the sense buffer before issuing the command instead of clearing it afterwards. This simplifies code and eases future changes. Signed-off-by: Tejun Heo <tj@kernel.org>
-
Tejun Heo authored
Impact: removal of unused field No one uses ide_cmd->special anymore. Kill it. Signed-off-by: Tejun Heo <tj@kernel.org>
-