Commit 16d91548 authored by Linus Torvalds's avatar Linus Torvalds

Merge tag 'xfs-5.8-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs updates from Darrick Wong:
 "Most of the changes this cycle are refactoring of existing code in
  preparation for things landing in the future.

  We also fixed various problems and deficiencies in the quota
  implementation, and (I hope) the last of the stale read vectors by
  forcing write allocations to go through the unwritten state until the
  write completes.

  Summary:

   - Various cleanups to remove dead code, unnecessary conditionals,
     asserts, etc.

   - Fix a linker warning caused by xfs stuffing '-g' into CFLAGS
     redundantly.

   - Tighten up our dmesg logging to ensure that everything is prefixed
     with 'XFS' for easier grepping.

   - Kill a bunch of typedefs.

   - Refactor the deferred ops code to reduce indirect function calls.

   - Increase type-safety with the deferred ops code.

   - Make the DAX mount options a tri-state.

   - Fix some error handling problems in the inode flush code and clean
     up other inode flush warts.

   - Refactor log recovery so that each log item recovery functions now
     live with the other log item processing code.

   - Fix some SPDX forms.

   - Fix quota counter corruption if the fs crashes after running
     quotacheck but before any dquots get logged.

   - Don't fail metadata verification on zero-entry attr leaf blocks,
     since they're just part of the disk format now due to a historic
     lack of log atomicity.

   - Don't allow SWAPEXT between files with different [ugp]id when
     quotas are enabled.

   - Refactor inode fork reading and verification to run directly from
     the inode-from-disk function. This means that we now actually
     guarantee that _iget'ted inodes are totally verified and ready to
     go.

   - Move the incore inode fork format and extent counts to the ifork
     structure.

   - Scalability improvements by reducing cacheline pingponging in
     struct xfs_mount.

   - More scalability improvements by removing m_active_trans from the
     hot path.

   - Fix inode counter update sanity checking to run /only/ on debug
     kernels.

   - Fix longstanding inconsistency in what error code we return when a
     program hits project quota limits (ENOSPC).

   - Fix group quota returning the wrong error code when a program hits
     group quota limits.

   - Fix per-type quota limits and grace periods for group and project
     quotas so that they actually work.

   - Allow extension of individual grace periods.

   - Refactor the non-reclaim inode radix tree walking code to remove a
     bunch of stupid little functions and straighten out the
     inconsistent naming schemes.

   - Fix a bug in speculative preallocation where we measured a new
     allocation based on the last extent mapping in the file instead of
     looking farther for the last contiguous space allocation.

   - Force delalloc writes to unwritten extents. This closes a stale
     disk contents exposure vector if the system goes down before the
     write completes.

   - More lockdep whackamole"

* tag 'xfs-5.8-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (129 commits)
  xfs: more lockdep whackamole with kmem_alloc*
  xfs: force writes to delalloc regions to unwritten
  xfs: refactor xfs_iomap_prealloc_size
  xfs: measure all contiguous previous extents for prealloc size
  xfs: don't fail unwritten extent conversion on writeback due to edquot
  xfs: rearrange xfs_inode_walk_ag parameters
  xfs: straighten out all the naming around incore inode tree walks
  xfs: move xfs_inode_ag_iterator to be closer to the perag walking code
  xfs: use bool for done in xfs_inode_ag_walk
  xfs: fix inode ag walk predicate function return values
  xfs: refactor eofb matching into a single helper
  xfs: remove __xfs_icache_free_eofblocks
  xfs: remove flags argument from xfs_inode_ag_walk
  xfs: remove xfs_inode_ag_iterator_flags
  xfs: remove unused xfs_inode_ag_iterator function
  xfs: replace open-coded XFS_ICI_NO_TAG
  xfs: move eofblocks conversion function to xfs_ioctl.c
  xfs: allow individual quota grace period extension
  xfs: per-type quota timers and warn limits
  xfs: switch xfs_get_defquota to take explicit type
  ...
parents d9afbb35 6dcde60e
......@@ -340,11 +340,11 @@ buffer.
The structure of the verifiers and the identifiers checks is very similar to the
buffer code described above. The only difference is where they are called. For
example, inode read verification is done in xfs_iread() when the inode is first
read out of the buffer and the struct xfs_inode is instantiated. The inode is
already extensively verified during writeback in xfs_iflush_int, so the only
addition here is to add the LSN and CRC to the inode as it is copied back into
the buffer.
example, inode read verification is done in xfs_inode_from_disk() when the inode
is first read out of the buffer and the struct xfs_inode is instantiated. The
inode is already extensively verified during writeback in xfs_iflush_int, so the
only addition here is to add the LSN and CRC to the inode as it is copied back
into the buffer.
XXX: inode unlinked list modification doesn't recalculate the inode CRC! None of
the unlinked list modifications check or update CRCs, neither during unlink nor
......
......@@ -7,8 +7,6 @@
ccflags-y += -I $(srctree)/$(src) # needed for trace events
ccflags-y += -I $(srctree)/$(src)/libxfs
ccflags-$(CONFIG_XFS_DEBUG) += -g
obj-$(CONFIG_XFS_FS) += xfs.o
# this one should be compiled first, as the tracing macros can easily blow up
......@@ -101,9 +99,12 @@ xfs-y += xfs_log.o \
xfs_log_cil.o \
xfs_bmap_item.o \
xfs_buf_item.o \
xfs_buf_item_recover.o \
xfs_dquot_item_recover.o \
xfs_extfree_item.o \
xfs_icreate_item.o \
xfs_inode_item.o \
xfs_inode_item_recover.o \
xfs_refcount_item.o \
xfs_rmap_item.o \
xfs_log_recover.o \
......
// SPDX-License-Identifier: GPL-2.0
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2000-2005 Silicon Graphics, Inc.
* All Rights Reserved.
......@@ -19,6 +19,7 @@ typedef unsigned __bitwise xfs_km_flags_t;
#define KM_NOFS ((__force xfs_km_flags_t)0x0004u)
#define KM_MAYFAIL ((__force xfs_km_flags_t)0x0008u)
#define KM_ZERO ((__force xfs_km_flags_t)0x0010u)
#define KM_NOLOCKDEP ((__force xfs_km_flags_t)0x0020u)
/*
* We use a special process flag to avoid recursive callbacks into
......@@ -30,7 +31,7 @@ kmem_flags_convert(xfs_km_flags_t flags)
{
gfp_t lflags;
BUG_ON(flags & ~(KM_NOFS|KM_MAYFAIL|KM_ZERO));
BUG_ON(flags & ~(KM_NOFS | KM_MAYFAIL | KM_ZERO | KM_NOLOCKDEP));
lflags = GFP_KERNEL | __GFP_NOWARN;
if (flags & KM_NOFS)
......@@ -49,6 +50,9 @@ kmem_flags_convert(xfs_km_flags_t flags)
if (flags & KM_ZERO)
lflags |= __GFP_ZERO;
if (flags & KM_NOLOCKDEP)
lflags |= __GFP_NOLOCKDEP;
return lflags;
}
......
// SPDX-License-Identifier: GPL-2.0+
/* SPDX-License-Identifier: GPL-2.0+ */
/*
* Copyright (C) 2016 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <darrick.wong@oracle.com>
......
// SPDX-License-Identifier: GPL-2.0
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2000-2002,2005 Silicon Graphics, Inc.
* All Rights Reserved.
......
// SPDX-License-Identifier: GPL-2.0
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2000,2005 Silicon Graphics, Inc.
* All Rights Reserved.
......
......@@ -61,8 +61,8 @@ xfs_inode_hasattr(
struct xfs_inode *ip)
{
if (!XFS_IFORK_Q(ip) ||
(ip->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS &&
ip->i_d.di_anextents == 0))
(ip->i_afp->if_format == XFS_DINODE_FMT_EXTENTS &&
ip->i_afp->if_nextents == 0))
return 0;
return 1;
}
......@@ -84,7 +84,7 @@ xfs_attr_get_ilocked(
if (!xfs_inode_hasattr(args->dp))
return -ENOATTR;
if (args->dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL)
if (args->dp->i_afp->if_format == XFS_DINODE_FMT_LOCAL)
return xfs_attr_shortform_getvalue(args);
if (xfs_bmap_one_block(args->dp, XFS_ATTR_FORK))
return xfs_attr_leaf_get(args);
......@@ -212,14 +212,14 @@ xfs_attr_set_args(
* If the attribute list is non-existent or a shortform list,
* upgrade it to a single-leaf-block attribute list.
*/
if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL ||
(dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS &&
dp->i_d.di_anextents == 0)) {
if (dp->i_afp->if_format == XFS_DINODE_FMT_LOCAL ||
(dp->i_afp->if_format == XFS_DINODE_FMT_EXTENTS &&
dp->i_afp->if_nextents == 0)) {
/*
* Build initial attribute list (if required).
*/
if (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS)
if (dp->i_afp->if_format == XFS_DINODE_FMT_EXTENTS)
xfs_attr_shortform_create(args);
/*
......@@ -272,7 +272,7 @@ xfs_attr_remove_args(
if (!xfs_inode_hasattr(dp)) {
error = -ENOATTR;
} else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) {
} else if (dp->i_afp->if_format == XFS_DINODE_FMT_LOCAL) {
ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
error = xfs_attr_shortform_remove(args);
} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
......
// SPDX-License-Identifier: GPL-2.0
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2000,2002-2003,2005 Silicon Graphics, Inc.
* All Rights Reserved.
......
......@@ -308,14 +308,6 @@ xfs_attr3_leaf_verify(
if (fa)
return fa;
/*
* In recovery there is a transient state where count == 0 is valid
* because we may have transitioned an empty shortform attr to a leaf
* if the attr didn't fit in shortform.
*/
if (!xfs_log_in_recovery(mp) && ichdr.count == 0)
return __this_address;
/*
* firstused is the block offset of the first name info structure.
* Make sure it doesn't go off the block or crash into the header.
......@@ -331,6 +323,13 @@ xfs_attr3_leaf_verify(
(char *)bp->b_addr + ichdr.firstused)
return __this_address;
/*
* NOTE: This verifier historically failed empty leaf buffers because
* we expect the fork to be in another format. Empty attr fork format
* conversions are possible during xattr set, however, and format
* conversion is not atomic with the xattr set that triggers it. We
* cannot assume leaf blocks are non-empty until that is addressed.
*/
buf_end = (char *)bp->b_addr + mp->m_attr_geo->blksize;
for (i = 0, ent = entries; i < ichdr.count; ent++, i++) {
fa = xfs_attr3_leaf_verify_entry(mp, buf_end, leaf, &ichdr,
......@@ -489,7 +488,7 @@ xfs_attr_copy_value(
}
if (!args->value) {
args->value = kmem_alloc_large(valuelen, 0);
args->value = kmem_alloc_large(valuelen, KM_NOLOCKDEP);
if (!args->value)
return -ENOMEM;
}
......@@ -539,7 +538,7 @@ xfs_attr_shortform_bytesfit(
/* rounded down */
offset = (XFS_LITINO(mp) - bytes) >> 3;
if (dp->i_d.di_format == XFS_DINODE_FMT_DEV) {
if (dp->i_df.if_format == XFS_DINODE_FMT_DEV) {
minforkoff = roundup(sizeof(xfs_dev_t), 8) >> 3;
return (offset >= minforkoff) ? minforkoff : 0;
}
......@@ -567,7 +566,7 @@ xfs_attr_shortform_bytesfit(
dsize = dp->i_df.if_bytes;
switch (dp->i_d.di_format) {
switch (dp->i_df.if_format) {
case XFS_DINODE_FMT_EXTENTS:
/*
* If there is no attr fork and the data fork is extents,
......@@ -636,22 +635,19 @@ xfs_sbversion_add_attr2(xfs_mount_t *mp, xfs_trans_t *tp)
* Create the initial contents of a shortform attribute list.
*/
void
xfs_attr_shortform_create(xfs_da_args_t *args)
xfs_attr_shortform_create(
struct xfs_da_args *args)
{
xfs_attr_sf_hdr_t *hdr;
xfs_inode_t *dp;
struct xfs_ifork *ifp;
struct xfs_inode *dp = args->dp;
struct xfs_ifork *ifp = dp->i_afp;
struct xfs_attr_sf_hdr *hdr;
trace_xfs_attr_sf_create(args);
dp = args->dp;
ASSERT(dp != NULL);
ifp = dp->i_afp;
ASSERT(ifp != NULL);
ASSERT(ifp->if_bytes == 0);
if (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS) {
if (ifp->if_format == XFS_DINODE_FMT_EXTENTS) {
ifp->if_flags &= ~XFS_IFEXTENTS; /* just in case */
dp->i_d.di_aformat = XFS_DINODE_FMT_LOCAL;
ifp->if_format = XFS_DINODE_FMT_LOCAL;
ifp->if_flags |= XFS_IFINLINE;
} else {
ASSERT(ifp->if_flags & XFS_IFINLINE);
......@@ -719,13 +715,12 @@ xfs_attr_fork_remove(
struct xfs_inode *ip,
struct xfs_trans *tp)
{
xfs_idestroy_fork(ip, XFS_ATTR_FORK);
ip->i_d.di_forkoff = 0;
ip->i_d.di_aformat = XFS_DINODE_FMT_EXTENTS;
ASSERT(ip->i_d.di_anextents == 0);
ASSERT(ip->i_afp == NULL);
ASSERT(ip->i_afp->if_nextents == 0);
xfs_idestroy_fork(ip->i_afp);
kmem_cache_free(xfs_ifork_zone, ip->i_afp);
ip->i_afp = NULL;
ip->i_d.di_forkoff = 0;
xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
}
......@@ -775,7 +770,7 @@ xfs_attr_shortform_remove(xfs_da_args_t *args)
totsize -= size;
if (totsize == sizeof(xfs_attr_sf_hdr_t) &&
(mp->m_flags & XFS_MOUNT_ATTR2) &&
(dp->i_d.di_format != XFS_DINODE_FMT_BTREE) &&
(dp->i_df.if_format != XFS_DINODE_FMT_BTREE) &&
!(args->op_flags & XFS_DA_OP_ADDNAME)) {
xfs_attr_fork_remove(dp, args->trans);
} else {
......@@ -785,7 +780,7 @@ xfs_attr_shortform_remove(xfs_da_args_t *args)
ASSERT(totsize > sizeof(xfs_attr_sf_hdr_t) ||
(args->op_flags & XFS_DA_OP_ADDNAME) ||
!(mp->m_flags & XFS_MOUNT_ATTR2) ||
dp->i_d.di_format == XFS_DINODE_FMT_BTREE);
dp->i_df.if_format == XFS_DINODE_FMT_BTREE);
xfs_trans_log_inode(args->trans, dp,
XFS_ILOG_CORE | XFS_ILOG_ADATA);
}
......@@ -962,7 +957,7 @@ xfs_attr_shortform_allfit(
+ be16_to_cpu(name_loc->valuelen);
}
if ((dp->i_mount->m_flags & XFS_MOUNT_ATTR2) &&
(dp->i_d.di_format != XFS_DINODE_FMT_BTREE) &&
(dp->i_df.if_format != XFS_DINODE_FMT_BTREE) &&
(bytes == sizeof(struct xfs_attr_sf_hdr)))
return -1;
return xfs_attr_shortform_bytesfit(dp, bytes);
......@@ -981,7 +976,7 @@ xfs_attr_shortform_verify(
int i;
int64_t size;
ASSERT(ip->i_d.di_aformat == XFS_DINODE_FMT_LOCAL);
ASSERT(ip->i_afp->if_format == XFS_DINODE_FMT_LOCAL);
ifp = XFS_IFORK_PTR(ip, XFS_ATTR_FORK);
sfp = (struct xfs_attr_shortform *)ifp->if_u1.if_data;
size = ifp->if_bytes;
......@@ -1085,7 +1080,7 @@ xfs_attr3_leaf_to_shortform(
if (forkoff == -1) {
ASSERT(dp->i_mount->m_flags & XFS_MOUNT_ATTR2);
ASSERT(dp->i_d.di_format != XFS_DINODE_FMT_BTREE);
ASSERT(dp->i_df.if_format != XFS_DINODE_FMT_BTREE);
xfs_attr_fork_remove(dp, args->trans);
goto out;
}
......
// SPDX-License-Identifier: GPL-2.0
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2000,2002-2003,2005 Silicon Graphics, Inc.
* Copyright (c) 2013 Red Hat, Inc.
......
// SPDX-License-Identifier: GPL-2.0
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2013 Red Hat, Inc.
* All Rights Reserved.
......
// SPDX-License-Identifier: GPL-2.0
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2000,2002,2005 Silicon Graphics, Inc.
* All Rights Reserved.
......
// SPDX-License-Identifier: GPL-2.0
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2000,2002,2005 Silicon Graphics, Inc.
* All Rights Reserved.
......
This diff is collapsed.
// SPDX-License-Identifier: GPL-2.0
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2000-2006 Silicon Graphics, Inc.
* All Rights Reserved.
......
......@@ -636,10 +636,7 @@ xfs_bmbt_change_owner(
ASSERT(tp || buffer_list);
ASSERT(!(tp && buffer_list));
if (whichfork == XFS_DATA_FORK)
ASSERT(ip->i_d.di_format == XFS_DINODE_FMT_BTREE);
else
ASSERT(ip->i_d.di_aformat == XFS_DINODE_FMT_BTREE);
ASSERT(XFS_IFORK_PTR(ip, whichfork)->if_format == XFS_DINODE_FMT_BTREE);
cur = xfs_bmbt_init_cursor(ip->i_mount, tp, ip, whichfork);
if (!cur)
......
// SPDX-License-Identifier: GPL-2.0
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2000,2002-2005 Silicon Graphics, Inc.
* All Rights Reserved.
......
// SPDX-License-Identifier: GPL-2.0
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2000-2001,2005 Silicon Graphics, Inc.
* All Rights Reserved.
......
// SPDX-License-Identifier: GPL-2.0
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2000,2002,2005 Silicon Graphics, Inc.
* Copyright (c) 2013 Red Hat, Inc.
......
// SPDX-License-Identifier: GPL-2.0
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2000-2001,2005 Silicon Graphics, Inc.
* Copyright (c) 2013 Red Hat, Inc.
......
......@@ -178,6 +178,18 @@ static const struct xfs_defer_op_type *defer_op_types[] = {
[XFS_DEFER_OPS_TYPE_AGFL_FREE] = &xfs_agfl_free_defer_type,
};
static void
xfs_defer_create_intent(
struct xfs_trans *tp,
struct xfs_defer_pending *dfp,
bool sort)
{
const struct xfs_defer_op_type *ops = defer_op_types[dfp->dfp_type];
dfp->dfp_intent = ops->create_intent(tp, &dfp->dfp_work,
dfp->dfp_count, sort);
}
/*
* For each pending item in the intake list, log its intent item and the
* associated extents, then add the entire intake list to the end of
......@@ -187,17 +199,11 @@ STATIC void
xfs_defer_create_intents(
struct xfs_trans *tp)
{
struct list_head *li;
struct xfs_defer_pending *dfp;
const struct xfs_defer_op_type *ops;
list_for_each_entry(dfp, &tp->t_dfops, dfp_list) {
ops = defer_op_types[dfp->dfp_type];
dfp->dfp_intent = ops->create_intent(tp, dfp->dfp_count);
trace_xfs_defer_create_intent(tp->t_mountp, dfp);
list_sort(tp->t_mountp, &dfp->dfp_work, ops->diff_items);
list_for_each(li, &dfp->dfp_work)
ops->log_item(tp, dfp->dfp_intent, li);
xfs_defer_create_intent(tp, dfp, true);
}
}
......@@ -234,10 +240,13 @@ xfs_defer_trans_roll(
struct xfs_log_item *lip;
struct xfs_buf *bplist[XFS_DEFER_OPS_NR_BUFS];
struct xfs_inode *iplist[XFS_DEFER_OPS_NR_INODES];
unsigned int ordered = 0; /* bitmap */
int bpcount = 0, ipcount = 0;
int i;
int error;
BUILD_BUG_ON(NBBY * sizeof(ordered) < XFS_DEFER_OPS_NR_BUFS);
list_for_each_entry(lip, &tp->t_items, li_trans) {
switch (lip->li_type) {
case XFS_LI_BUF:
......@@ -248,7 +257,10 @@ xfs_defer_trans_roll(
ASSERT(0);
return -EFSCORRUPTED;
}
xfs_trans_dirty_buf(tp, bli->bli_buf);
if (bli->bli_flags & XFS_BLI_ORDERED)
ordered |= (1U << bpcount);
else
xfs_trans_dirty_buf(tp, bli->bli_buf);
bplist[bpcount++] = bli->bli_buf;
}
break;
......@@ -289,6 +301,8 @@ xfs_defer_trans_roll(
/* Rejoin the buffers and dirty them so the log moves forward. */
for (i = 0; i < bpcount; i++) {
xfs_trans_bjoin(tp, bplist[i]);
if (ordered & (1U << i))
xfs_trans_ordered_buf(tp, bplist[i]);
xfs_trans_bhold(tp, bplist[i]);
}
......@@ -345,6 +359,53 @@ xfs_defer_cancel_list(
}
}
/*
* Log an intent-done item for the first pending intent, and finish the work
* items.
*/
static int
xfs_defer_finish_one(
struct xfs_trans *tp,
struct xfs_defer_pending *dfp)
{
const struct xfs_defer_op_type *ops = defer_op_types[dfp->dfp_type];
struct xfs_btree_cur *state = NULL;
struct list_head *li, *n;
int error;
trace_xfs_defer_pending_finish(tp->t_mountp, dfp);
dfp->dfp_done = ops->create_done(tp, dfp->dfp_intent, dfp->dfp_count);
list_for_each_safe(li, n, &dfp->dfp_work) {
list_del(li);
dfp->dfp_count--;
error = ops->finish_item(tp, dfp->dfp_done, li, &state);
if (error == -EAGAIN) {
/*
* Caller wants a fresh transaction; put the work item
* back on the list and log a new log intent item to
* replace the old one. See "Requesting a Fresh
* Transaction while Finishing Deferred Work" above.
*/
list_add(li, &dfp->dfp_work);
dfp->dfp_count++;
dfp->dfp_done = NULL;
xfs_defer_create_intent(tp, dfp, false);
}
if (error)
goto out;
}
/* Done with the dfp, free it. */
list_del(&dfp->dfp_list);
kmem_free(dfp);
out:
if (ops->finish_cleanup)
ops->finish_cleanup(tp, state, error);
return error;
}
/*
* Finish all the pending work. This involves logging intent items for
* any work items that wandered in since the last transaction roll (if
......@@ -358,11 +419,7 @@ xfs_defer_finish_noroll(
struct xfs_trans **tp)
{
struct xfs_defer_pending *dfp;
struct list_head *li;
struct list_head *n;
void *state;
int error = 0;
const struct xfs_defer_op_type *ops;
LIST_HEAD(dop_pending);
ASSERT((*tp)->t_flags & XFS_TRANS_PERM_LOG_RES);
......@@ -371,87 +428,30 @@ xfs_defer_finish_noroll(
/* Until we run out of pending work to finish... */
while (!list_empty(&dop_pending) || !list_empty(&(*tp)->t_dfops)) {
/* log intents and pull in intake items */
xfs_defer_create_intents(*tp);
list_splice_tail_init(&(*tp)->t_dfops, &dop_pending);
/*
* Roll the transaction.
*/
error = xfs_defer_trans_roll(tp);
if (error)
goto out;
goto out_shutdown;
/* Log an intent-done item for the first pending item. */
dfp = list_first_entry(&dop_pending, struct xfs_defer_pending,
dfp_list);
ops = defer_op_types[dfp->dfp_type];
trace_xfs_defer_pending_finish((*tp)->t_mountp, dfp);
dfp->dfp_done = ops->create_done(*tp, dfp->dfp_intent,
dfp->dfp_count);
/* Finish the work items. */
state = NULL;
list_for_each_safe(li, n, &dfp->dfp_work) {
list_del(li);
dfp->dfp_count--;
error = ops->finish_item(*tp, li, dfp->dfp_done,
&state);
if (error == -EAGAIN) {
/*
* Caller wants a fresh transaction;
* put the work item back on the list
* and jump out.
*/
list_add(li, &dfp->dfp_work);
dfp->dfp_count++;
break;
} else if (error) {
/*
* Clean up after ourselves and jump out.
* xfs_defer_cancel will take care of freeing
* all these lists and stuff.
*/
if (ops->finish_cleanup)
ops->finish_cleanup(*tp, state, error);
goto out;
}
}
if (error == -EAGAIN) {
/*
* Caller wants a fresh transaction, so log a
* new log intent item to replace the old one
* and roll the transaction. See "Requesting
* a Fresh Transaction while Finishing
* Deferred Work" above.
*/
dfp->dfp_intent = ops->create_intent(*tp,
dfp->dfp_count);
dfp->dfp_done = NULL;
list_for_each(li, &dfp->dfp_work)
ops->log_item(*tp, dfp->dfp_intent, li);
} else {
/* Done with the dfp, free it. */
list_del(&dfp->dfp_list);
kmem_free(dfp);
}
if (ops->finish_cleanup)
ops->finish_cleanup(*tp, state, error);
}
out:
if (error) {
xfs_defer_trans_abort(*tp, &dop_pending);
xfs_force_shutdown((*tp)->t_mountp, SHUTDOWN_CORRUPT_INCORE);
trace_xfs_defer_finish_error(*tp, error);
xfs_defer_cancel_list((*tp)->t_mountp, &dop_pending);
xfs_defer_cancel(*tp);
return error;
error = xfs_defer_finish_one(*tp, dfp);
if (error && error != -EAGAIN)
goto out_shutdown;
}
trace_xfs_defer_finish_done(*tp, _RET_IP_);
return 0;
out_shutdown:
xfs_defer_trans_abort(*tp, &dop_pending);
xfs_force_shutdown((*tp)->t_mountp, SHUTDOWN_CORRUPT_INCORE);
trace_xfs_defer_finish_error(*tp, error);
xfs_defer_cancel_list((*tp)->t_mountp, &dop_pending);
xfs_defer_cancel(*tp);
return error;
}
int
......
// SPDX-License-Identifier: GPL-2.0+
/* SPDX-License-Identifier: GPL-2.0+ */
/*
* Copyright (C) 2016 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <darrick.wong@oracle.com>
......@@ -6,6 +6,7 @@
#ifndef __XFS_DEFER_H__
#define __XFS_DEFER_H__
struct xfs_btree_cur;
struct xfs_defer_op_type;
/*
......@@ -28,8 +29,8 @@ enum xfs_defer_ops_type {
struct xfs_defer_pending {
struct list_head dfp_list; /* pending items */
struct list_head dfp_work; /* work items */
void *dfp_intent; /* log intent item */
void *dfp_done; /* log done item */
struct xfs_log_item *dfp_intent; /* log intent item */
struct xfs_log_item *dfp_done; /* log done item */
unsigned int dfp_count; /* # extent items */
enum xfs_defer_ops_type dfp_type;
};
......@@ -43,15 +44,16 @@ void xfs_defer_move(struct xfs_trans *dtp, struct xfs_trans *stp);
/* Description of a deferred type. */
struct xfs_defer_op_type {
void (*abort_intent)(void *);
void *(*create_done)(struct xfs_trans *, void *, unsigned int);
int (*finish_item)(struct xfs_trans *, struct list_head *, void *,
void **);
void (*finish_cleanup)(struct xfs_trans *, void *, int);
void (*cancel_item)(struct list_head *);
int (*diff_items)(void *, struct list_head *, struct list_head *);
void *(*create_intent)(struct xfs_trans *, uint);
void (*log_item)(struct xfs_trans *, void *, struct list_head *);
struct xfs_log_item *(*create_intent)(struct xfs_trans *tp,
struct list_head *items, unsigned int count, bool sort);
void (*abort_intent)(struct xfs_log_item *intent);
struct xfs_log_item *(*create_done)(struct xfs_trans *tp,
struct xfs_log_item *intent, unsigned int count);
int (*finish_item)(struct xfs_trans *tp, struct xfs_log_item *done,
struct list_head *item, struct xfs_btree_cur **state);
void (*finish_cleanup)(struct xfs_trans *tp,
struct xfs_btree_cur *state, int error);
void (*cancel_item)(struct list_head *item);
unsigned int max_items;
};
......
......@@ -278,7 +278,7 @@ xfs_dir_createname(
if (!inum)
args->op_flags |= XFS_DA_OP_JUSTCHECK;
if (dp->i_d.di_format == XFS_DINODE_FMT_LOCAL) {
if (dp->i_df.if_format == XFS_DINODE_FMT_LOCAL) {
rval = xfs_dir2_sf_addname(args);
goto out_free;
}
......@@ -373,7 +373,7 @@ xfs_dir_lookup(
args->op_flags |= XFS_DA_OP_CILOOKUP;
lock_mode = xfs_ilock_data_map_shared(dp);
if (dp->i_d.di_format == XFS_DINODE_FMT_LOCAL) {
if (dp->i_df.if_format == XFS_DINODE_FMT_LOCAL) {
rval = xfs_dir2_sf_lookup(args);
goto out_check_rval;
}
......@@ -443,7 +443,7 @@ xfs_dir_removename(
args->whichfork = XFS_DATA_FORK;
args->trans = tp;
if (dp->i_d.di_format == XFS_DINODE_FMT_LOCAL) {
if (dp->i_df.if_format == XFS_DINODE_FMT_LOCAL) {
rval = xfs_dir2_sf_removename(args);
goto out_free;
}
......@@ -504,7 +504,7 @@ xfs_dir_replace(
args->whichfork = XFS_DATA_FORK;
args->trans = tp;
if (dp->i_d.di_format == XFS_DINODE_FMT_LOCAL) {
if (dp->i_df.if_format == XFS_DINODE_FMT_LOCAL) {
rval = xfs_dir2_sf_replace(args);
goto out_free;
}
......
// SPDX-License-Identifier: GPL-2.0
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2000-2001,2005 Silicon Graphics, Inc.
* All Rights Reserved.
......
......@@ -1104,7 +1104,7 @@ xfs_dir2_sf_to_block(
ASSERT(ifp->if_bytes == dp->i_d.di_size);
ASSERT(ifp->if_u1.if_data != NULL);
ASSERT(dp->i_d.di_size >= xfs_dir2_sf_hdr_size(oldsfp->i8count));
ASSERT(dp->i_d.di_nextents == 0);
ASSERT(dp->i_df.if_nextents == 0);
/*
* Copy the directory into a temporary buffer.
......
// SPDX-License-Identifier: GPL-2.0
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2000-2001,2005 Silicon Graphics, Inc.
* All Rights Reserved.
......
......@@ -343,7 +343,7 @@ xfs_dir2_block_to_sf(
*/
ASSERT(dp->i_df.if_bytes == 0);
xfs_init_local_fork(dp, XFS_DATA_FORK, sfp, size);
dp->i_d.di_format = XFS_DINODE_FMT_LOCAL;
dp->i_df.if_format = XFS_DINODE_FMT_LOCAL;
dp->i_d.di_size = size;
logflags |= XFS_ILOG_DDATA;
......@@ -710,11 +710,11 @@ xfs_dir2_sf_verify(
struct xfs_inode *ip)
{
struct xfs_mount *mp = ip->i_mount;
struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
struct xfs_dir2_sf_hdr *sfp;
struct xfs_dir2_sf_entry *sfep;
struct xfs_dir2_sf_entry *next_sfep;
char *endp;
struct xfs_ifork *ifp;
xfs_ino_t ino;
int i;
int i8count;
......@@ -723,9 +723,8 @@ xfs_dir2_sf_verify(
int error;
uint8_t filetype;
ASSERT(ip->i_d.di_format == XFS_DINODE_FMT_LOCAL);
ASSERT(ifp->if_format == XFS_DINODE_FMT_LOCAL);
ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
sfp = (struct xfs_dir2_sf_hdr *)ifp->if_u1.if_data;
size = ifp->if_bytes;
......@@ -827,9 +826,9 @@ xfs_dir2_sf_create(
* If it's currently a zero-length extent file,
* convert it to local format.
*/
if (dp->i_d.di_format == XFS_DINODE_FMT_EXTENTS) {
if (dp->i_df.if_format == XFS_DINODE_FMT_EXTENTS) {
dp->i_df.if_flags &= ~XFS_IFEXTENTS; /* just in case */
dp->i_d.di_format = XFS_DINODE_FMT_LOCAL;
dp->i_df.if_format = XFS_DINODE_FMT_LOCAL;
xfs_trans_log_inode(args->trans, dp, XFS_ILOG_CORE);
dp->i_df.if_flags |= XFS_IFINLINE;
}
......@@ -1027,7 +1026,7 @@ xfs_dir2_sf_replace_needblock(
int newsize;
struct xfs_dir2_sf_hdr *sfp;
if (dp->i_d.di_format != XFS_DINODE_FMT_LOCAL)
if (dp->i_df.if_format != XFS_DINODE_FMT_LOCAL)
return false;
sfp = (struct xfs_dir2_sf_hdr *)dp->i_df.if_u1.if_data;
......
// SPDX-License-Identifier: GPL-2.0+
/* SPDX-License-Identifier: GPL-2.0+ */
/*
* Copyright (c) 2000-2002,2005 Silicon Graphics, Inc.
* Copyright (C) 2017 Oracle.
......@@ -55,7 +55,8 @@
#define XFS_ERRTAG_FORCE_SCRUB_REPAIR 32
#define XFS_ERRTAG_FORCE_SUMMARY_RECALC 33
#define XFS_ERRTAG_IUNLINK_FALLBACK 34
#define XFS_ERRTAG_MAX 35
#define XFS_ERRTAG_BUF_IOERROR 35
#define XFS_ERRTAG_MAX 36
/*
* Random factors for above tags, 1 means always, 2 means 1/2 time, etc.
......@@ -95,5 +96,6 @@
#define XFS_RANDOM_FORCE_SCRUB_REPAIR 1
#define XFS_RANDOM_FORCE_SUMMARY_RECALC 1
#define XFS_RANDOM_IUNLINK_FALLBACK (XFS_RANDOM_DEFAULT/10)
#define XFS_RANDOM_BUF_IOERROR XFS_RANDOM_DEFAULT
#endif /* __XFS_ERRORTAG_H_ */
// SPDX-License-Identifier: GPL-2.0
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2000-2005 Silicon Graphics, Inc.
* All Rights Reserved.
......@@ -964,13 +964,12 @@ enum xfs_dinode_fmt {
/*
* Inode data & attribute fork sizes, per inode.
*/
#define XFS_DFORK_Q(dip) ((dip)->di_forkoff != 0)
#define XFS_DFORK_BOFF(dip) ((int)((dip)->di_forkoff << 3))
#define XFS_DFORK_DSIZE(dip,mp) \
(XFS_DFORK_Q(dip) ? XFS_DFORK_BOFF(dip) : XFS_LITINO(mp))
((dip)->di_forkoff ? XFS_DFORK_BOFF(dip) : XFS_LITINO(mp))
#define XFS_DFORK_ASIZE(dip,mp) \
(XFS_DFORK_Q(dip) ? XFS_LITINO(mp) - XFS_DFORK_BOFF(dip) : 0)
((dip)->di_forkoff ? XFS_LITINO(mp) - XFS_DFORK_BOFF(dip) : 0)
#define XFS_DFORK_SIZE(dip,mp,w) \
((w) == XFS_DATA_FORK ? \
XFS_DFORK_DSIZE(dip, mp) : \
......@@ -1681,7 +1680,7 @@ struct xfs_acl_entry {
struct xfs_acl {
__be32 acl_cnt;
struct xfs_acl_entry acl_entry[0];
struct xfs_acl_entry acl_entry[];
};
/*
......
// SPDX-License-Identifier: LGPL-2.1
/* SPDX-License-Identifier: LGPL-2.1 */
/*
* Copyright (c) 1995-2005 Silicon Graphics, Inc.
* All Rights Reserved.
......
// SPDX-License-Identifier: GPL-2.0+
/* SPDX-License-Identifier: GPL-2.0+ */
/*
* Copyright (C) 2019 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <darrick.wong@oracle.com>
......
......@@ -161,8 +161,7 @@ xfs_imap_to_bp(
struct xfs_imap *imap,
struct xfs_dinode **dipp,
struct xfs_buf **bpp,
uint buf_flags,
uint iget_flags)
uint buf_flags)
{
struct xfs_buf *bp;
int error;
......@@ -172,12 +171,7 @@ xfs_imap_to_bp(
(int)imap->im_len, buf_flags, &bp,
&xfs_inode_buf_ops);
if (error) {
if (error == -EAGAIN) {
ASSERT(buf_flags & XBF_TRYLOCK);
return error;
}
xfs_warn(mp, "%s: xfs_trans_read_buf() returned error %d.",
__func__, error);
ASSERT(error != -EAGAIN || (buf_flags & XBF_TRYLOCK));
return error;
}
......@@ -186,13 +180,36 @@ xfs_imap_to_bp(
return 0;
}
void
int
xfs_inode_from_disk(
struct xfs_inode *ip,
struct xfs_dinode *from)
{
struct xfs_icdinode *to = &ip->i_d;
struct inode *inode = VFS_I(ip);
int error;
xfs_failaddr_t fa;
ASSERT(ip->i_cowfp == NULL);
ASSERT(ip->i_afp == NULL);
fa = xfs_dinode_verify(ip->i_mount, ip->i_ino, from);
if (fa) {
xfs_inode_verifier_error(ip, -EFSCORRUPTED, "dinode", from,
sizeof(*from), fa);
return -EFSCORRUPTED;
}
/*
* First get the permanent information that is needed to allocate an
* inode. If the inode is unused, mode is zero and we shouldn't mess
* with the unitialized part of it.
*/
to->di_flushiter = be16_to_cpu(from->di_flushiter);
inode->i_generation = be32_to_cpu(from->di_gen);
inode->i_mode = be16_to_cpu(from->di_mode);
if (!inode->i_mode)
return 0;
/*
* Convert v1 inodes immediately to v2 inode format as this is the
......@@ -208,10 +225,8 @@ xfs_inode_from_disk(
be16_to_cpu(from->di_projid_lo);
}
to->di_format = from->di_format;
i_uid_write(inode, be32_to_cpu(from->di_uid));
i_gid_write(inode, be32_to_cpu(from->di_gid));
to->di_flushiter = be16_to_cpu(from->di_flushiter);
/*
* Time is signed, so need to convert to signed 32 bit before
......@@ -225,16 +240,11 @@ xfs_inode_from_disk(
inode->i_mtime.tv_nsec = (int)be32_to_cpu(from->di_mtime.t_nsec);
inode->i_ctime.tv_sec = (int)be32_to_cpu(from->di_ctime.t_sec);
inode->i_ctime.tv_nsec = (int)be32_to_cpu(from->di_ctime.t_nsec);
inode->i_generation = be32_to_cpu(from->di_gen);
inode->i_mode = be16_to_cpu(from->di_mode);
to->di_size = be64_to_cpu(from->di_size);
to->di_nblocks = be64_to_cpu(from->di_nblocks);
to->di_extsize = be32_to_cpu(from->di_extsize);
to->di_nextents = be32_to_cpu(from->di_nextents);
to->di_anextents = be16_to_cpu(from->di_anextents);
to->di_forkoff = from->di_forkoff;
to->di_aformat = from->di_aformat;
to->di_dmevmask = be32_to_cpu(from->di_dmevmask);
to->di_dmstate = be16_to_cpu(from->di_dmstate);
to->di_flags = be16_to_cpu(from->di_flags);
......@@ -247,6 +257,22 @@ xfs_inode_from_disk(
to->di_flags2 = be64_to_cpu(from->di_flags2);
to->di_cowextsize = be32_to_cpu(from->di_cowextsize);
}
error = xfs_iformat_data_fork(ip, from);
if (error)
return error;
if (from->di_forkoff) {
error = xfs_iformat_attr_fork(ip, from);
if (error)
goto out_destroy_data_fork;
}
if (xfs_is_reflink_inode(ip))
xfs_ifork_init_cow(ip);
return 0;
out_destroy_data_fork:
xfs_idestroy_fork(&ip->i_df);
return error;
}
void
......@@ -261,7 +287,7 @@ xfs_inode_to_disk(
to->di_magic = cpu_to_be16(XFS_DINODE_MAGIC);
to->di_onlink = 0;
to->di_format = from->di_format;
to->di_format = xfs_ifork_format(&ip->i_df);
to->di_uid = cpu_to_be32(i_uid_read(inode));
to->di_gid = cpu_to_be32(i_gid_read(inode));
to->di_projid_lo = cpu_to_be16(from->di_projid & 0xffff);
......@@ -281,10 +307,10 @@ xfs_inode_to_disk(
to->di_size = cpu_to_be64(from->di_size);
to->di_nblocks = cpu_to_be64(from->di_nblocks);
to->di_extsize = cpu_to_be32(from->di_extsize);
to->di_nextents = cpu_to_be32(from->di_nextents);
to->di_anextents = cpu_to_be16(from->di_anextents);
to->di_nextents = cpu_to_be32(xfs_ifork_nextents(&ip->i_df));
to->di_anextents = cpu_to_be16(xfs_ifork_nextents(ip->i_afp));
to->di_forkoff = from->di_forkoff;
to->di_aformat = from->di_aformat;
to->di_aformat = xfs_ifork_format(ip->i_afp);
to->di_dmevmask = cpu_to_be32(from->di_dmevmask);
to->di_dmstate = cpu_to_be16(from->di_dmstate);
to->di_flags = cpu_to_be16(from->di_flags);
......@@ -405,7 +431,7 @@ xfs_dinode_verify_forkoff(
struct xfs_dinode *dip,
struct xfs_mount *mp)
{
if (!XFS_DFORK_Q(dip))
if (!dip->di_forkoff)
return NULL;
switch (dip->di_format) {
......@@ -508,7 +534,7 @@ xfs_dinode_verify(
return __this_address;
}
if (XFS_DFORK_Q(dip)) {
if (dip->di_forkoff) {
fa = xfs_dinode_verify_fork(dip, mp, XFS_ATTR_FORK);
if (fa)
return fa;
......@@ -584,122 +610,6 @@ xfs_dinode_calc_crc(
dip->di_crc = xfs_end_cksum(crc);
}
/*
* Read the disk inode attributes into the in-core inode structure.
*
* For version 5 superblocks, if we are initialising a new inode and we are not
* utilising the XFS_MOUNT_IKEEP inode cluster mode, we can simple build the new
* inode core with a random generation number. If we are keeping inodes around,
* we need to read the inode cluster to get the existing generation number off
* disk. Further, if we are using version 4 superblocks (i.e. v1/v2 inode
* format) then log recovery is dependent on the di_flushiter field being
* initialised from the current on-disk value and hence we must also read the
* inode off disk.
*/
int
xfs_iread(
xfs_mount_t *mp,
xfs_trans_t *tp,
xfs_inode_t *ip,
uint iget_flags)
{
xfs_buf_t *bp;
xfs_dinode_t *dip;
xfs_failaddr_t fa;
int error;
/*
* Fill in the location information in the in-core inode.
*/
error = xfs_imap(mp, tp, ip->i_ino, &ip->i_imap, iget_flags);
if (error)
return error;
/* shortcut IO on inode allocation if possible */
if ((iget_flags & XFS_IGET_CREATE) &&
xfs_sb_version_has_v3inode(&mp->m_sb) &&
!(mp->m_flags & XFS_MOUNT_IKEEP)) {
VFS_I(ip)->i_generation = prandom_u32();
return 0;
}
/*
* Get pointers to the on-disk inode and the buffer containing it.
*/
error = xfs_imap_to_bp(mp, tp, &ip->i_imap, &dip, &bp, 0, iget_flags);
if (error)
return error;
/* even unallocated inodes are verified */
fa = xfs_dinode_verify(mp, ip->i_ino, dip);
if (fa) {
xfs_inode_verifier_error(ip, -EFSCORRUPTED, "dinode", dip,
sizeof(*dip), fa);
error = -EFSCORRUPTED;
goto out_brelse;
}
/*
* If the on-disk inode is already linked to a directory
* entry, copy all of the inode into the in-core inode.
* xfs_iformat_fork() handles copying in the inode format
* specific information.
* Otherwise, just get the truly permanent information.
*/
if (dip->di_mode) {
xfs_inode_from_disk(ip, dip);
error = xfs_iformat_fork(ip, dip);
if (error) {
#ifdef DEBUG
xfs_alert(mp, "%s: xfs_iformat() returned error %d",
__func__, error);
#endif /* DEBUG */
goto out_brelse;
}
} else {
/*
* Partial initialisation of the in-core inode. Just the bits
* that xfs_ialloc won't overwrite or relies on being correct.
*/
VFS_I(ip)->i_generation = be32_to_cpu(dip->di_gen);
ip->i_d.di_flushiter = be16_to_cpu(dip->di_flushiter);
/*
* Make sure to pull in the mode here as well in
* case the inode is released without being used.
* This ensures that xfs_inactive() will see that
* the inode is already free and not try to mess
* with the uninitialized part of it.
*/
VFS_I(ip)->i_mode = 0;
}
ip->i_delayed_blks = 0;
/*
* Mark the buffer containing the inode as something to keep
* around for a while. This helps to keep recently accessed
* meta-data in-core longer.
*/
xfs_buf_set_ref(bp, XFS_INO_REF);
/*
* Use xfs_trans_brelse() to release the buffer containing the on-disk
* inode, because it was acquired with xfs_trans_read_buf() in
* xfs_imap_to_bp() above. If tp is NULL, this is just a normal
* brelse(). If we're within a transaction, then xfs_trans_brelse()
* will only release the buffer if it is not dirty within the
* transaction. It will be OK to release the buffer in this case,
* because inodes on disk are never destroyed and we will be locking the
* new in-core inode before putting it in the cache where other
* processes can find it. Thus we don't have to worry about the inode
* being changed just because we released the buffer.
*/
out_brelse:
xfs_trans_brelse(tp, bp);
return error;
}
/*
* Validate di_extsize hint.
*
......
......@@ -16,16 +16,12 @@ struct xfs_dinode;
* format specific structures at the appropriate time.
*/
struct xfs_icdinode {
int8_t di_format; /* format of di_c data */
uint16_t di_flushiter; /* incremented on flush */
uint32_t di_projid; /* owner's project id */
xfs_fsize_t di_size; /* number of bytes in file */
xfs_rfsblock_t di_nblocks; /* # of direct & btree blocks used */
xfs_extlen_t di_extsize; /* basic/minimum extent size for file */
xfs_extnum_t di_nextents; /* number of extents in data fork */
xfs_aextnum_t di_anextents; /* number of extents in attribute fork*/
uint8_t di_forkoff; /* attr fork offs, <<3 for 64b align */
int8_t di_aformat; /* format of attr fork's data */
uint32_t di_dmevmask; /* DMIG event mask */
uint16_t di_dmstate; /* DMIG state info */
uint16_t di_flags; /* random flags, XFS_DIFLAG_... */
......@@ -48,13 +44,11 @@ struct xfs_imap {
int xfs_imap_to_bp(struct xfs_mount *, struct xfs_trans *,
struct xfs_imap *, struct xfs_dinode **,
struct xfs_buf **, uint, uint);
int xfs_iread(struct xfs_mount *, struct xfs_trans *,
struct xfs_inode *, uint);
struct xfs_buf **, uint);
void xfs_dinode_calc_crc(struct xfs_mount *, struct xfs_dinode *);
void xfs_inode_to_disk(struct xfs_inode *ip, struct xfs_dinode *to,
xfs_lsn_t lsn);
void xfs_inode_from_disk(struct xfs_inode *ip, struct xfs_dinode *from);
int xfs_inode_from_disk(struct xfs_inode *ip, struct xfs_dinode *from);
void xfs_log_dinode_to_disk(struct xfs_log_dinode *from,
struct xfs_dinode *to);
......
This diff is collapsed.
......@@ -23,6 +23,8 @@ struct xfs_ifork {
} if_u1;
short if_broot_bytes; /* bytes allocated for root */
unsigned char if_flags; /* per-fork flags */
int8_t if_format; /* format of this fork */
xfs_extnum_t if_nextents; /* # of extents in this fork */
};
/*
......@@ -55,43 +57,36 @@ struct xfs_ifork {
((w) == XFS_ATTR_FORK ? \
XFS_IFORK_ASIZE(ip) : \
0))
#define XFS_IFORK_FORMAT(ip,w) \
((w) == XFS_DATA_FORK ? \
(ip)->i_d.di_format : \
((w) == XFS_ATTR_FORK ? \
(ip)->i_d.di_aformat : \
(ip)->i_cformat))
#define XFS_IFORK_FMT_SET(ip,w,n) \
((w) == XFS_DATA_FORK ? \
((ip)->i_d.di_format = (n)) : \
((w) == XFS_ATTR_FORK ? \
((ip)->i_d.di_aformat = (n)) : \
((ip)->i_cformat = (n))))
#define XFS_IFORK_NEXTENTS(ip,w) \
((w) == XFS_DATA_FORK ? \
(ip)->i_d.di_nextents : \
((w) == XFS_ATTR_FORK ? \
(ip)->i_d.di_anextents : \
(ip)->i_cnextents))
#define XFS_IFORK_NEXT_SET(ip,w,n) \
((w) == XFS_DATA_FORK ? \
((ip)->i_d.di_nextents = (n)) : \
((w) == XFS_ATTR_FORK ? \
((ip)->i_d.di_anextents = (n)) : \
((ip)->i_cnextents = (n))))
#define XFS_IFORK_MAXEXT(ip, w) \
(XFS_IFORK_SIZE(ip, w) / sizeof(xfs_bmbt_rec_t))
#define xfs_ifork_has_extents(ip, w) \
(XFS_IFORK_FORMAT((ip), (w)) == XFS_DINODE_FMT_EXTENTS || \
XFS_IFORK_FORMAT((ip), (w)) == XFS_DINODE_FMT_BTREE)
static inline bool xfs_ifork_has_extents(struct xfs_ifork *ifp)
{
return ifp->if_format == XFS_DINODE_FMT_EXTENTS ||
ifp->if_format == XFS_DINODE_FMT_BTREE;
}
static inline xfs_extnum_t xfs_ifork_nextents(struct xfs_ifork *ifp)
{
if (!ifp)
return 0;
return ifp->if_nextents;
}
static inline int8_t xfs_ifork_format(struct xfs_ifork *ifp)
{
if (!ifp)
return XFS_DINODE_FMT_EXTENTS;
return ifp->if_format;
}
struct xfs_ifork *xfs_iext_state_to_fork(struct xfs_inode *ip, int state);
int xfs_iformat_fork(struct xfs_inode *, struct xfs_dinode *);
int xfs_iformat_data_fork(struct xfs_inode *, struct xfs_dinode *);
int xfs_iformat_attr_fork(struct xfs_inode *, struct xfs_dinode *);
void xfs_iflush_fork(struct xfs_inode *, struct xfs_dinode *,
struct xfs_inode_log_item *, int);
void xfs_idestroy_fork(struct xfs_inode *, int);
void xfs_idestroy_fork(struct xfs_ifork *ifp);
void xfs_idata_realloc(struct xfs_inode *ip, int64_t byte_diff,
int whichfork);
void xfs_iroot_realloc(struct xfs_inode *, int, int);
......@@ -175,18 +170,7 @@ extern struct kmem_zone *xfs_ifork_zone;
extern void xfs_ifork_init_cow(struct xfs_inode *ip);
typedef xfs_failaddr_t (*xfs_ifork_verifier_t)(struct xfs_inode *);
struct xfs_ifork_ops {
xfs_ifork_verifier_t verify_symlink;
xfs_ifork_verifier_t verify_dir;
xfs_ifork_verifier_t verify_attr;
};
extern struct xfs_ifork_ops xfs_default_ifork_ops;
xfs_failaddr_t xfs_ifork_verify_data(struct xfs_inode *ip,
struct xfs_ifork_ops *ops);
xfs_failaddr_t xfs_ifork_verify_attr(struct xfs_inode *ip,
struct xfs_ifork_ops *ops);
int xfs_ifork_verify_local_data(struct xfs_inode *ip);
int xfs_ifork_verify_local_attr(struct xfs_inode *ip);
#endif /* __XFS_INODE_FORK_H__ */
......@@ -6,6 +6,73 @@
#ifndef __XFS_LOG_RECOVER_H__
#define __XFS_LOG_RECOVER_H__
/*
* Each log item type (XFS_LI_*) gets its own xlog_recover_item_ops to
* define how recovery should work for that type of log item.
*/
struct xlog_recover_item;
/* Sorting hat for log items as they're read in. */
enum xlog_recover_reorder {
XLOG_REORDER_BUFFER_LIST,
XLOG_REORDER_ITEM_LIST,
XLOG_REORDER_INODE_BUFFER_LIST,
XLOG_REORDER_CANCEL_LIST,
};
struct xlog_recover_item_ops {
uint16_t item_type; /* XFS_LI_* type code. */
/*
* Help sort recovered log items into the order required to replay them
* correctly. Log item types that always use XLOG_REORDER_ITEM_LIST do
* not have to supply a function here. See the comment preceding
* xlog_recover_reorder_trans for more details about what the return
* values mean.
*/
enum xlog_recover_reorder (*reorder)(struct xlog_recover_item *item);
/* Start readahead for pass2, if provided. */
void (*ra_pass2)(struct xlog *log, struct xlog_recover_item *item);
/* Do whatever work we need to do for pass1, if provided. */
int (*commit_pass1)(struct xlog *log, struct xlog_recover_item *item);
/*
* This function should do whatever work is needed for pass2 of log
* recovery, if provided.
*
* If the recovered item is an intent item, this function should parse
* the recovered item to construct an in-core log intent item and
* insert it into the AIL. The in-core log intent item should have 1
* refcount so that the item is freed either (a) when we commit the
* recovered log item for the intent-done item; (b) replay the work and
* log a new intent-done item; or (c) recovery fails and we have to
* abort.
*
* If the recovered item is an intent-done item, this function should
* parse the recovered item to find the id of the corresponding intent
* log item. Next, it should find the in-core log intent item in the
* AIL and release it.
*/
int (*commit_pass2)(struct xlog *log, struct list_head *buffer_list,
struct xlog_recover_item *item, xfs_lsn_t lsn);
};
extern const struct xlog_recover_item_ops xlog_icreate_item_ops;
extern const struct xlog_recover_item_ops xlog_buf_item_ops;
extern const struct xlog_recover_item_ops xlog_inode_item_ops;
extern const struct xlog_recover_item_ops xlog_dquot_item_ops;
extern const struct xlog_recover_item_ops xlog_quotaoff_item_ops;
extern const struct xlog_recover_item_ops xlog_bui_item_ops;
extern const struct xlog_recover_item_ops xlog_bud_item_ops;
extern const struct xlog_recover_item_ops xlog_efi_item_ops;
extern const struct xlog_recover_item_ops xlog_efd_item_ops;
extern const struct xlog_recover_item_ops xlog_rui_item_ops;
extern const struct xlog_recover_item_ops xlog_rud_item_ops;
extern const struct xlog_recover_item_ops xlog_cui_item_ops;
extern const struct xlog_recover_item_ops xlog_cud_item_ops;
/*
* Macros, structures, prototypes for internal log manager use.
*/
......@@ -22,13 +89,13 @@
/*
* item headers are in ri_buf[0]. Additional buffers follow.
*/
typedef struct xlog_recover_item {
struct xlog_recover_item {
struct list_head ri_list;
int ri_type;
int ri_cnt; /* count of regions found */
int ri_total; /* total regions */
xfs_log_iovec_t *ri_buf; /* ptr to regions buffer */
} xlog_recover_item_t;
struct xfs_log_iovec *ri_buf; /* ptr to regions buffer */
const struct xlog_recover_item_ops *ri_ops;
};
struct xlog_recover {
struct hlist_node r_list;
......@@ -51,4 +118,12 @@ struct xlog_recover {
#define XLOG_RECOVER_PASS1 1
#define XLOG_RECOVER_PASS2 2
void xlog_buf_readahead(struct xlog *log, xfs_daddr_t blkno, uint len,
const struct xfs_buf_ops *ops);
bool xlog_is_buffer_cancelled(struct xlog *log, xfs_daddr_t blkno, uint len);
void xlog_recover_iodone(struct xfs_buf *bp);
void xlog_recover_release_intent(struct xlog *log, unsigned short intent_type,
uint64_t intent_id);
#endif /* __XFS_LOG_RECOVER_H__ */
......@@ -100,7 +100,6 @@ typedef uint16_t xfs_qwarncnt_t;
#define XFS_QMOPT_FORCE_RES 0x0000010 /* ignore quota limits */
#define XFS_QMOPT_SBVERSION 0x0000040 /* change superblock version num */
#define XFS_QMOPT_GQUOTA 0x0002000 /* group dquot requested */
#define XFS_QMOPT_ENOSPC 0x0004000 /* enospc instead of edquot (prj) */
/*
* flags to xfs_trans_mod_dquot to indicate which field needs to be
......
......@@ -66,7 +66,7 @@ xfs_rtbuf_get(
ip = issum ? mp->m_rsumip : mp->m_rbmip;
error = xfs_bmapi_read(ip, block, 1, &map, &nmap, XFS_DATA_FORK);
error = xfs_bmapi_read(ip, block, 1, &map, &nmap, 0);
if (error)
return error;
......
......@@ -243,7 +243,7 @@ xfs_validate_sb_common(
} else if (sbp->sb_qflags & (XFS_PQUOTA_ENFD | XFS_GQUOTA_ENFD |
XFS_PQUOTA_CHKD | XFS_GQUOTA_CHKD)) {
xfs_notice(mp,
"Superblock earlier than Version 5 has XFS_[PQ]UOTA_{ENFD|CHKD} bits.");
"Superblock earlier than Version 5 has XFS_{P|G}QUOTA_{ENFD|CHKD} bits.");
return -EFSCORRUPTED;
}
......
......@@ -204,16 +204,12 @@ xfs_failaddr_t
xfs_symlink_shortform_verify(
struct xfs_inode *ip)
{
char *sfp;
char *endp;
struct xfs_ifork *ifp;
int size;
ASSERT(ip->i_d.di_format == XFS_DINODE_FMT_LOCAL);
ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
sfp = (char *)ifp->if_u1.if_data;
size = ifp->if_bytes;
endp = sfp + size;
struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
char *sfp = (char *)ifp->if_u1.if_data;
int size = ifp->if_bytes;
char *endp = sfp + size;
ASSERT(ifp->if_format == XFS_DINODE_FMT_LOCAL);
/*
* Zero length symlinks should never occur in memory as they are
......
......@@ -27,7 +27,7 @@ xfs_trans_ijoin(
struct xfs_inode *ip,
uint lock_flags)
{
xfs_inode_log_item_t *iip;
struct xfs_inode_log_item *iip;
ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
if (ip->i_itemp == NULL)
......
......@@ -566,8 +566,9 @@ xchk_bmap_check_rmaps(
struct xfs_scrub *sc,
int whichfork)
{
loff_t size;
struct xfs_ifork *ifp = XFS_IFORK_PTR(sc->ip, whichfork);
xfs_agnumber_t agno;
bool zero_size;
int error;
if (!xfs_sb_version_hasrmapbt(&sc->mp->m_sb) ||
......@@ -579,6 +580,8 @@ xchk_bmap_check_rmaps(
if (XFS_IS_REALTIME_INODE(sc->ip) && whichfork == XFS_DATA_FORK)
return 0;
ASSERT(XFS_IFORK_PTR(sc->ip, whichfork) != NULL);
/*
* Only do this for complex maps that are in btree format, or for
* situations where we would seem to have a size but zero extents.
......@@ -586,19 +589,14 @@ xchk_bmap_check_rmaps(
* to flag this bmap as corrupt if there are rmaps that need to be
* reattached.
*/
switch (whichfork) {
case XFS_DATA_FORK:
size = i_size_read(VFS_I(sc->ip));
break;
case XFS_ATTR_FORK:
size = XFS_IFORK_Q(sc->ip);
break;
default:
size = 0;
break;
}
if (XFS_IFORK_FORMAT(sc->ip, whichfork) != XFS_DINODE_FMT_BTREE &&
(size == 0 || XFS_IFORK_NEXTENTS(sc->ip, whichfork) > 0))
if (whichfork == XFS_DATA_FORK)
zero_size = i_size_read(VFS_I(sc->ip)) == 0;
else
zero_size = false;
if (ifp->if_format != XFS_DINODE_FMT_BTREE &&
(zero_size || ifp->if_nextents > 0))
return 0;
for (agno = 0; agno < sc->mp->m_sb.sb_agcount; agno++) {
......@@ -627,12 +625,14 @@ xchk_bmap(
struct xchk_bmap_info info = { NULL };
struct xfs_mount *mp = sc->mp;
struct xfs_inode *ip = sc->ip;
struct xfs_ifork *ifp;
struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, whichfork);
xfs_fileoff_t endoff;
struct xfs_iext_cursor icur;
int error = 0;
ifp = XFS_IFORK_PTR(ip, whichfork);
/* Non-existent forks can be ignored. */
if (!ifp)
goto out;
info.is_rt = whichfork == XFS_DATA_FORK && XFS_IS_REALTIME_INODE(ip);
info.whichfork = whichfork;
......@@ -641,9 +641,6 @@ xchk_bmap(
switch (whichfork) {
case XFS_COW_FORK:
/* Non-existent CoW forks are ignorable. */
if (!ifp)
goto out;
/* No CoW forks on non-reflink inodes/filesystems. */
if (!xfs_is_reflink_inode(ip)) {
xchk_ino_set_corrupt(sc, sc->ip->i_ino);
......@@ -651,8 +648,6 @@ xchk_bmap(
}
break;
case XFS_ATTR_FORK:
if (!ifp)
goto out_check_rmap;
if (!xfs_sb_version_hasattr(&mp->m_sb) &&
!xfs_sb_version_hasattr2(&mp->m_sb))
xchk_ino_set_corrupt(sc, sc->ip->i_ino);
......@@ -663,7 +658,7 @@ xchk_bmap(
}
/* Check the fork values */
switch (XFS_IFORK_FORMAT(ip, whichfork)) {
switch (ifp->if_format) {
case XFS_DINODE_FMT_UUID:
case XFS_DINODE_FMT_DEV:
case XFS_DINODE_FMT_LOCAL:
......@@ -717,7 +712,6 @@ xchk_bmap(
goto out;
}
out_check_rmap:
error = xchk_bmap_check_rmaps(sc, whichfork);
if (!xchk_fblock_xref_process_error(sc, whichfork, 0, &error))
goto out;
......
......@@ -468,7 +468,7 @@ xchk_da_btree(
int error;
/* Skip short format data structures; no btree to scan. */
if (!xfs_ifork_has_extents(sc->ip, whichfork))
if (!xfs_ifork_has_extents(XFS_IFORK_PTR(sc->ip, whichfork)))
return 0;
/* Set up initial da state. */
......
......@@ -635,7 +635,7 @@ xchk_directory_blocks(
{
struct xfs_bmbt_irec got;
struct xfs_da_args args;
struct xfs_ifork *ifp;
struct xfs_ifork *ifp = XFS_IFORK_PTR(sc->ip, XFS_DATA_FORK);
struct xfs_mount *mp = sc->mp;
xfs_fileoff_t leaf_lblk;
xfs_fileoff_t free_lblk;
......@@ -647,11 +647,10 @@ xchk_directory_blocks(
int error;
/* Ignore local format directories. */
if (sc->ip->i_d.di_format != XFS_DINODE_FMT_EXTENTS &&
sc->ip->i_d.di_format != XFS_DINODE_FMT_BTREE)
if (ifp->if_format != XFS_DINODE_FMT_EXTENTS &&
ifp->if_format != XFS_DINODE_FMT_BTREE)
return 0;
ifp = XFS_IFORK_PTR(sc->ip, XFS_DATA_FORK);
lblk = XFS_B_TO_FSB(mp, XFS_DIR2_DATA_OFFSET);
leaf_lblk = XFS_B_TO_FSB(mp, XFS_DIR2_LEAF_OFFSET);
free_lblk = XFS_B_TO_FSB(mp, XFS_DIR2_FREE_OFFSET);
......
......@@ -278,8 +278,7 @@ xchk_iallocbt_check_cluster(
&XFS_RMAP_OINFO_INODES);
/* Grab the inode cluster buffer. */
error = xfs_imap_to_bp(mp, bs->cur->bc_tp, &imap, &dip, &cluster_bp,
0, 0);
error = xfs_imap_to_bp(mp, bs->cur->bc_tp, &imap, &dip, &cluster_bp, 0);
if (!xchk_btree_xref_process_error(bs->sc, bs->cur, 0, &error))
return error;
......
......@@ -90,7 +90,7 @@ xchk_parent_count_parent_dentries(
* if there is one.
*/
lock_mode = xfs_ilock_data_map_shared(parent);
if (parent->i_d.di_nextents > 0)
if (parent->i_df.if_nextents > 0)
error = xfs_dir3_data_readahead(parent, 0, 0);
xfs_iunlock(parent, lock_mode);
if (error)
......
......@@ -382,7 +382,7 @@ xfs_map_blocks(
*/
retry:
xfs_ilock(ip, XFS_ILOCK_SHARED);
ASSERT(ip->i_d.di_format != XFS_DINODE_FMT_BTREE ||
ASSERT(ip->i_df.if_format != XFS_DINODE_FMT_BTREE ||
(ip->i_df.if_flags & XFS_IFEXTENTS));
/*
......
......@@ -367,7 +367,7 @@ xfs_attr_inactive(
* removal below.
*/
if (xfs_inode_hasattr(dp) &&
dp->i_d.di_aformat != XFS_DINODE_FMT_LOCAL) {
dp->i_afp->if_format != XFS_DINODE_FMT_LOCAL) {
error = xfs_attr3_root_inactive(&trans, dp);
if (error)
goto out_cancel;
......@@ -388,8 +388,11 @@ xfs_attr_inactive(
xfs_trans_cancel(trans);
out_destroy_fork:
/* kill the in-core attr fork before we drop the inode lock */
if (dp->i_afp)
xfs_idestroy_fork(dp, XFS_ATTR_FORK);
if (dp->i_afp) {
xfs_idestroy_fork(dp->i_afp);
kmem_cache_free(xfs_ifork_zone, dp->i_afp);
dp->i_afp = NULL;
}
if (lock_mode)
xfs_iunlock(dp, lock_mode);
return error;
......
......@@ -512,9 +512,9 @@ xfs_attr_list_ilocked(
*/
if (!xfs_inode_hasattr(dp))
return 0;
else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL)
if (dp->i_afp->if_format == XFS_DINODE_FMT_LOCAL)
return xfs_attr_shortform_list(context);
else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
return xfs_attr_leaf_list(context);
return xfs_attr_node_list(context);
}
......
This diff is collapsed.
......@@ -32,11 +32,6 @@ struct kmem_zone;
*/
#define XFS_BUI_MAX_FAST_EXTENTS 1
/*
* Define BUI flag bits. Manipulated by set/clear/test_bit operators.
*/
#define XFS_BUI_RECOVERED 1
/*
* This is the "bmap update intent" log item. It is used to log the fact that
* some reverse mappings need to change. It is used in conjunction with the
......@@ -49,7 +44,6 @@ struct xfs_bui_log_item {
struct xfs_log_item bui_item;
atomic_t bui_refcount;
atomic_t bui_next_extent;
unsigned long bui_flags; /* misc flags */
struct xfs_bui_log_format bui_format;
};
......@@ -74,9 +68,4 @@ struct xfs_bud_log_item {
extern struct kmem_zone *xfs_bui_zone;
extern struct kmem_zone *xfs_bud_zone;
struct xfs_bui_log_item *xfs_bui_init(struct xfs_mount *);
void xfs_bui_item_free(struct xfs_bui_log_item *);
void xfs_bui_release(struct xfs_bui_log_item *);
int xfs_bui_recover(struct xfs_trans *parent_tp, struct xfs_bui_log_item *buip);
#endif /* __XFS_BMAP_ITEM_H__ */
......@@ -223,7 +223,7 @@ xfs_bmap_count_blocks(
if (!ifp)
return 0;
switch (XFS_IFORK_FORMAT(ip, whichfork)) {
switch (ifp->if_format) {
case XFS_DINODE_FMT_BTREE:
if (!(ifp->if_flags & XFS_IFEXTENTS)) {
error = xfs_iread_extents(tp, ip, whichfork);
......@@ -449,7 +449,7 @@ xfs_getbmap(
break;
}
switch (XFS_IFORK_FORMAT(ip, whichfork)) {
switch (ifp->if_format) {
case XFS_DINODE_FMT_EXTENTS:
case XFS_DINODE_FMT_BTREE:
break;
......@@ -1210,17 +1210,26 @@ xfs_swap_extents_check_format(
struct xfs_inode *ip, /* target inode */
struct xfs_inode *tip) /* tmp inode */
{
struct xfs_ifork *ifp = &ip->i_df;
struct xfs_ifork *tifp = &tip->i_df;
/* User/group/project quota ids must match if quotas are enforced. */
if (XFS_IS_QUOTA_ON(ip->i_mount) &&
(!uid_eq(VFS_I(ip)->i_uid, VFS_I(tip)->i_uid) ||
!gid_eq(VFS_I(ip)->i_gid, VFS_I(tip)->i_gid) ||
ip->i_d.di_projid != tip->i_d.di_projid))
return -EINVAL;
/* Should never get a local format */
if (ip->i_d.di_format == XFS_DINODE_FMT_LOCAL ||
tip->i_d.di_format == XFS_DINODE_FMT_LOCAL)
if (ifp->if_format == XFS_DINODE_FMT_LOCAL ||
tifp->if_format == XFS_DINODE_FMT_LOCAL)
return -EINVAL;
/*
* if the target inode has less extents that then temporary inode then
* why did userspace call us?
*/
if (ip->i_d.di_nextents < tip->i_d.di_nextents)
if (ifp->if_nextents < tifp->if_nextents)
return -EINVAL;
/*
......@@ -1235,20 +1244,18 @@ xfs_swap_extents_check_format(
* form then we will end up with the target inode in the wrong format
* as we already know there are less extents in the temp inode.
*/
if (ip->i_d.di_format == XFS_DINODE_FMT_EXTENTS &&
tip->i_d.di_format == XFS_DINODE_FMT_BTREE)
if (ifp->if_format == XFS_DINODE_FMT_EXTENTS &&
tifp->if_format == XFS_DINODE_FMT_BTREE)
return -EINVAL;
/* Check temp in extent form to max in target */
if (tip->i_d.di_format == XFS_DINODE_FMT_EXTENTS &&
XFS_IFORK_NEXTENTS(tip, XFS_DATA_FORK) >
XFS_IFORK_MAXEXT(ip, XFS_DATA_FORK))
if (tifp->if_format == XFS_DINODE_FMT_EXTENTS &&
tifp->if_nextents > XFS_IFORK_MAXEXT(ip, XFS_DATA_FORK))
return -EINVAL;
/* Check target in extent form to max in temp */
if (ip->i_d.di_format == XFS_DINODE_FMT_EXTENTS &&
XFS_IFORK_NEXTENTS(ip, XFS_DATA_FORK) >
XFS_IFORK_MAXEXT(tip, XFS_DATA_FORK))
if (ifp->if_format == XFS_DINODE_FMT_EXTENTS &&
ifp->if_nextents > XFS_IFORK_MAXEXT(tip, XFS_DATA_FORK))
return -EINVAL;
/*
......@@ -1260,22 +1267,20 @@ xfs_swap_extents_check_format(
* (a common defrag case) which will occur when the temp inode is in
* extent format...
*/
if (tip->i_d.di_format == XFS_DINODE_FMT_BTREE) {
if (tifp->if_format == XFS_DINODE_FMT_BTREE) {
if (XFS_IFORK_Q(ip) &&
XFS_BMAP_BMDR_SPACE(tip->i_df.if_broot) > XFS_IFORK_BOFF(ip))
XFS_BMAP_BMDR_SPACE(tifp->if_broot) > XFS_IFORK_BOFF(ip))
return -EINVAL;
if (XFS_IFORK_NEXTENTS(tip, XFS_DATA_FORK) <=
XFS_IFORK_MAXEXT(ip, XFS_DATA_FORK))
if (tifp->if_nextents <= XFS_IFORK_MAXEXT(ip, XFS_DATA_FORK))
return -EINVAL;
}
/* Reciprocal target->temp btree format checks */
if (ip->i_d.di_format == XFS_DINODE_FMT_BTREE) {
if (ifp->if_format == XFS_DINODE_FMT_BTREE) {
if (XFS_IFORK_Q(tip) &&
XFS_BMAP_BMDR_SPACE(ip->i_df.if_broot) > XFS_IFORK_BOFF(tip))
return -EINVAL;
if (XFS_IFORK_NEXTENTS(ip, XFS_DATA_FORK) <=
XFS_IFORK_MAXEXT(tip, XFS_DATA_FORK))
if (ifp->if_nextents <= XFS_IFORK_MAXEXT(tip, XFS_DATA_FORK))
return -EINVAL;
}
......@@ -1427,15 +1432,15 @@ xfs_swap_extent_forks(
/*
* Count the number of extended attribute blocks
*/
if ( ((XFS_IFORK_Q(ip) != 0) && (ip->i_d.di_anextents > 0)) &&
(ip->i_d.di_aformat != XFS_DINODE_FMT_LOCAL)) {
if (XFS_IFORK_Q(ip) && ip->i_afp->if_nextents > 0 &&
ip->i_afp->if_format != XFS_DINODE_FMT_LOCAL) {
error = xfs_bmap_count_blocks(tp, ip, XFS_ATTR_FORK, &junk,
&aforkblks);
if (error)
return error;
}
if ( ((XFS_IFORK_Q(tip) != 0) && (tip->i_d.di_anextents > 0)) &&
(tip->i_d.di_aformat != XFS_DINODE_FMT_LOCAL)) {
if (XFS_IFORK_Q(tip) && tip->i_afp->if_nextents > 0 &&
tip->i_afp->if_format != XFS_DINODE_FMT_LOCAL) {
error = xfs_bmap_count_blocks(tp, tip, XFS_ATTR_FORK, &junk,
&taforkblks);
if (error)
......@@ -1450,9 +1455,9 @@ xfs_swap_extent_forks(
* bmbt scan as the last step.
*/
if (xfs_sb_version_has_v3inode(&ip->i_mount->m_sb)) {
if (ip->i_d.di_format == XFS_DINODE_FMT_BTREE)
if (ip->i_df.if_format == XFS_DINODE_FMT_BTREE)
(*target_log_flags) |= XFS_ILOG_DOWNER;
if (tip->i_d.di_format == XFS_DINODE_FMT_BTREE)
if (tip->i_df.if_format == XFS_DINODE_FMT_BTREE)
(*src_log_flags) |= XFS_ILOG_DOWNER;
}
......@@ -1468,9 +1473,6 @@ xfs_swap_extent_forks(
ip->i_d.di_nblocks = tip->i_d.di_nblocks - taforkblks + aforkblks;
tip->i_d.di_nblocks = tmp + taforkblks - aforkblks;
swap(ip->i_d.di_nextents, tip->i_d.di_nextents);
swap(ip->i_d.di_format, tip->i_d.di_format);
/*
* The extents in the source inode could still contain speculative
* preallocation beyond EOF (e.g. the file is open but not modified
......@@ -1484,7 +1486,7 @@ xfs_swap_extent_forks(
tip->i_delayed_blks = ip->i_delayed_blks;
ip->i_delayed_blks = 0;
switch (ip->i_d.di_format) {
switch (ip->i_df.if_format) {
case XFS_DINODE_FMT_EXTENTS:
(*src_log_flags) |= XFS_ILOG_DEXT;
break;
......@@ -1495,7 +1497,7 @@ xfs_swap_extent_forks(
break;
}
switch (tip->i_d.di_format) {
switch (tip->i_df.if_format) {
case XFS_DINODE_FMT_EXTENTS:
(*target_log_flags) |= XFS_ILOG_DEXT;
break;
......@@ -1606,7 +1608,7 @@ xfs_swap_extents(
if (xfs_inode_has_cow_data(tip)) {
error = xfs_reflink_cancel_cow_range(tip, 0, NULLFILEOFF, true);
if (error)
return error;
goto out_unlock;
}
/*
......@@ -1615,9 +1617,9 @@ xfs_swap_extents(
* performed with log redo items!
*/
if (xfs_sb_version_hasrmapbt(&mp->m_sb)) {
int w = XFS_DATA_FORK;
uint32_t ipnext = XFS_IFORK_NEXTENTS(ip, w);
uint32_t tipnext = XFS_IFORK_NEXTENTS(tip, w);
int w = XFS_DATA_FORK;
uint32_t ipnext = ip->i_df.if_nextents;
uint32_t tipnext = tip->i_df.if_nextents;
/*
* Conceptually this shouldn't affect the shape of either bmbt,
......@@ -1717,10 +1719,11 @@ xfs_swap_extents(
/* Swap the cow forks. */
if (xfs_sb_version_hasreflink(&mp->m_sb)) {
ASSERT(ip->i_cformat == XFS_DINODE_FMT_EXTENTS);
ASSERT(tip->i_cformat == XFS_DINODE_FMT_EXTENTS);
ASSERT(!ip->i_cowfp ||
ip->i_cowfp->if_format == XFS_DINODE_FMT_EXTENTS);
ASSERT(!tip->i_cowfp ||
tip->i_cowfp->if_format == XFS_DINODE_FMT_EXTENTS);
swap(ip->i_cnextents, tip->i_cnextents);
swap(ip->i_cowfp, tip->i_cowfp);
if (ip->i_cowfp && ip->i_cowfp->if_bytes)
......
......@@ -1197,8 +1197,10 @@ xfs_buf_ioend(
bp->b_ops->verify_read(bp);
}
if (!bp->b_error)
if (!bp->b_error) {
bp->b_flags &= ~XBF_WRITE_FAIL;
bp->b_flags |= XBF_DONE;
}
if (bp->b_iodone)
(*(bp->b_iodone))(bp);
......@@ -1242,10 +1244,26 @@ xfs_buf_ioerror_alert(
struct xfs_buf *bp,
xfs_failaddr_t func)
{
xfs_alert_ratelimited(bp->b_mount,
"metadata I/O error in \"%pS\" at daddr 0x%llx len %d error %d",
func, (uint64_t)XFS_BUF_ADDR(bp), bp->b_length,
-bp->b_error);
xfs_buf_alert_ratelimited(bp, "XFS: metadata IO error",
"metadata I/O error in \"%pS\" at daddr 0x%llx len %d error %d",
func, (uint64_t)XFS_BUF_ADDR(bp),
bp->b_length, -bp->b_error);
}
/*
* To simulate an I/O failure, the buffer must be locked and held with at least
* three references. The LRU reference is dropped by the stale call. The buf
* item reference is dropped via ioend processing. The third reference is owned
* by the caller and is dropped on I/O completion if the buffer is XBF_ASYNC.
*/
void
xfs_buf_ioend_fail(
struct xfs_buf *bp)
{
bp->b_flags &= ~XBF_DONE;
xfs_buf_stale(bp);
xfs_buf_ioerror(bp, -EIO);
xfs_buf_ioend(bp);
}
int
......@@ -1258,7 +1276,7 @@ xfs_bwrite(
bp->b_flags |= XBF_WRITE;
bp->b_flags &= ~(XBF_ASYNC | XBF_READ | _XBF_DELWRI_Q |
XBF_WRITE_FAIL | XBF_DONE);
XBF_DONE);
error = xfs_buf_submit(bp);
if (error)
......@@ -1272,6 +1290,11 @@ xfs_buf_bio_end_io(
{
struct xfs_buf *bp = (struct xfs_buf *)bio->bi_private;
if (!bio->bi_status &&
(bp->b_flags & XBF_WRITE) && (bp->b_flags & XBF_ASYNC) &&
XFS_TEST_ERROR(false, bp->b_mount, XFS_ERRTAG_BUF_IOERROR))
bio->bi_status = BLK_STS_IOERR;
/*
* don't overwrite existing errors - otherwise we can lose errors on
* buffers that require multiple bios to complete.
......@@ -1480,10 +1503,7 @@ __xfs_buf_submit(
/* on shutdown we stale and complete the buffer immediately */
if (XFS_FORCED_SHUTDOWN(bp->b_mount)) {
xfs_buf_ioerror(bp, -EIO);
bp->b_flags &= ~XBF_DONE;
xfs_buf_stale(bp);
xfs_buf_ioend(bp);
xfs_buf_ioend_fail(bp);
return -EIO;
}
......@@ -1642,7 +1662,8 @@ xfs_wait_buftarg(
struct xfs_buftarg *btp)
{
LIST_HEAD(dispose);
int loop = 0;
int loop = 0;
bool write_fail = false;
/*
* First wait on the buftarg I/O count for all in-flight buffers to be
......@@ -1670,17 +1691,29 @@ xfs_wait_buftarg(
bp = list_first_entry(&dispose, struct xfs_buf, b_lru);
list_del_init(&bp->b_lru);
if (bp->b_flags & XBF_WRITE_FAIL) {
xfs_alert(btp->bt_mount,
write_fail = true;
xfs_buf_alert_ratelimited(bp,
"XFS: Corruption Alert",
"Corruption Alert: Buffer at daddr 0x%llx had permanent write failures!",
(long long)bp->b_bn);
xfs_alert(btp->bt_mount,
"Please run xfs_repair to determine the extent of the problem.");
}
xfs_buf_rele(bp);
}
if (loop++ != 0)
delay(100);
}
/*
* If one or more failed buffers were freed, that means dirty metadata
* was thrown away. This should only ever happen after I/O completion
* handling has elevated I/O error(s) to permanent failures and shuts
* down the fs.
*/
if (write_fail) {
ASSERT(XFS_FORCED_SHUTDOWN(btp->bt_mount));
xfs_alert(btp->bt_mount,
"Please run xfs_repair to determine the extent of the problem.");
}
}
static enum lru_status
......@@ -1813,6 +1846,13 @@ xfs_alloc_buftarg(
btp->bt_bdev = bdev;
btp->bt_daxdev = dax_dev;
/*
* Buffer IO error rate limiting. Limit it to no more than 10 messages
* per 30 seconds so as to not spam logs too much on repeated errors.
*/
ratelimit_state_init(&btp->bt_ioerror_rl, 30 * HZ,
DEFAULT_RATELIMIT_BURST);
if (xfs_setsize_buftarg_early(btp, bdev))
goto error_free;
......@@ -1983,7 +2023,7 @@ xfs_buf_delwri_submit_buffers(
* synchronously. Otherwise, drop the buffer from the delwri
* queue and submit async.
*/
bp->b_flags &= ~(_XBF_DELWRI_Q | XBF_WRITE_FAIL);
bp->b_flags &= ~_XBF_DELWRI_Q;
bp->b_flags |= XBF_WRITE;
if (wait_list) {
bp->b_flags &= ~XBF_ASYNC;
......
......@@ -91,6 +91,7 @@ typedef struct xfs_buftarg {
struct list_lru bt_lru;
struct percpu_counter bt_io_count;
struct ratelimit_state bt_ioerror_rl;
} xfs_buftarg_t;
struct xfs_buf;
......@@ -263,6 +264,7 @@ extern void __xfs_buf_ioerror(struct xfs_buf *bp, int error,
xfs_failaddr_t failaddr);
#define xfs_buf_ioerror(bp, err) __xfs_buf_ioerror((bp), (err), __this_address)
extern void xfs_buf_ioerror_alert(struct xfs_buf *bp, xfs_failaddr_t fa);
void xfs_buf_ioend_fail(struct xfs_buf *);
extern int __xfs_buf_submit(struct xfs_buf *bp, bool);
static inline int xfs_buf_submit(struct xfs_buf *bp)
......
......@@ -410,7 +410,6 @@ xfs_buf_item_unpin(
{
struct xfs_buf_log_item *bip = BUF_ITEM(lip);
xfs_buf_t *bp = bip->bli_buf;
struct xfs_ail *ailp = lip->li_ailp;
int stale = bip->bli_flags & XFS_BLI_STALE;
int freed;
......@@ -452,10 +451,10 @@ xfs_buf_item_unpin(
}
/*
* If we get called here because of an IO error, we may
* or may not have the item on the AIL. xfs_trans_ail_delete()
* will take care of that situation.
* xfs_trans_ail_delete() drops the AIL lock.
* If we get called here because of an IO error, we may or may
* not have the item on the AIL. xfs_trans_ail_delete() will
* take care of that situation. xfs_trans_ail_delete() drops
* the AIL lock.
*/
if (bip->bli_flags & XFS_BLI_STALE_INODE) {
xfs_buf_do_callbacks(bp);
......@@ -463,47 +462,23 @@ xfs_buf_item_unpin(
list_del_init(&bp->b_li_list);
bp->b_iodone = NULL;
} else {
spin_lock(&ailp->ail_lock);
xfs_trans_ail_delete(ailp, lip, SHUTDOWN_LOG_IO_ERROR);
xfs_trans_ail_delete(lip, SHUTDOWN_LOG_IO_ERROR);
xfs_buf_item_relse(bp);
ASSERT(bp->b_log_item == NULL);
}
xfs_buf_relse(bp);
} else if (freed && remove) {
/*
* There are currently two references to the buffer - the active
* LRU reference and the buf log item. What we are about to do
* here - simulate a failed IO completion - requires 3
* references.
*
* The LRU reference is removed by the xfs_buf_stale() call. The
* buf item reference is removed by the xfs_buf_iodone()
* callback that is run by xfs_buf_do_callbacks() during ioend
* processing (via the bp->b_iodone callback), and then finally
* the ioend processing will drop the IO reference if the buffer
* is marked XBF_ASYNC.
*
* Hence we need to take an additional reference here so that IO
* completion processing doesn't free the buffer prematurely.
* The buffer must be locked and held by the caller to simulate
* an async I/O failure.
*/
xfs_buf_lock(bp);
xfs_buf_hold(bp);
bp->b_flags |= XBF_ASYNC;
xfs_buf_ioerror(bp, -EIO);
bp->b_flags &= ~XBF_DONE;
xfs_buf_stale(bp);
xfs_buf_ioend(bp);
xfs_buf_ioend_fail(bp);
}
}
/*
* Buffer IO error rate limiting. Limit it to no more than 10 messages per 30
* seconds so as to not spam logs too much on repeated detection of the same
* buffer being bad..
*/
static DEFINE_RATELIMIT_STATE(xfs_buf_write_fail_rl_state, 30 * HZ, 10);
STATIC uint
xfs_buf_item_push(
struct xfs_log_item *lip,
......@@ -533,11 +508,10 @@ xfs_buf_item_push(
trace_xfs_buf_item_push(bip);
/* has a previous flush failed due to IO errors? */
if ((bp->b_flags & XBF_WRITE_FAIL) &&
___ratelimit(&xfs_buf_write_fail_rl_state, "XFS: Failing async write")) {
xfs_warn(bp->b_mount,
"Failing async write on buffer block 0x%llx. Retrying async write.",
(long long)bp->b_bn);
if (bp->b_flags & XBF_WRITE_FAIL) {
xfs_buf_alert_ratelimited(bp, "XFS: Failing async write",
"Failing async write on buffer block 0x%llx. Retrying async write.",
(long long)bp->b_bn);
}
if (!xfs_buf_delwri_queue(bp, buffer_list))
......@@ -584,7 +558,7 @@ xfs_buf_item_put(
* state.
*/
if (aborted)
xfs_trans_ail_remove(lip, SHUTDOWN_LOG_IO_ERROR);
xfs_trans_ail_delete(lip, 0);
xfs_buf_item_relse(bip->bli_buf);
return true;
}
......@@ -1229,61 +1203,19 @@ xfs_buf_iodone(
struct xfs_buf *bp,
struct xfs_log_item *lip)
{
struct xfs_ail *ailp = lip->li_ailp;
ASSERT(BUF_ITEM(lip)->bli_buf == bp);
xfs_buf_rele(bp);
/*
* If we are forcibly shutting down, this may well be
* off the AIL already. That's because we simulate the
* log-committed callbacks to unpin these buffers. Or we may never
* have put this item on AIL because of the transaction was
* aborted forcibly. xfs_trans_ail_delete() takes care of these.
* If we are forcibly shutting down, this may well be off the AIL
* already. That's because we simulate the log-committed callbacks to
* unpin these buffers. Or we may never have put this item on AIL
* because of the transaction was aborted forcibly.
* xfs_trans_ail_delete() takes care of these.
*
* Either way, AIL is useless if we're forcing a shutdown.
*/
spin_lock(&ailp->ail_lock);
xfs_trans_ail_delete(ailp, lip, SHUTDOWN_CORRUPT_INCORE);
xfs_trans_ail_delete(lip, SHUTDOWN_CORRUPT_INCORE);
xfs_buf_item_free(BUF_ITEM(lip));
}
/*
* Requeue a failed buffer for writeback.
*
* We clear the log item failed state here as well, but we have to be careful
* about reference counts because the only active reference counts on the buffer
* may be the failed log items. Hence if we clear the log item failed state
* before queuing the buffer for IO we can release all active references to
* the buffer and free it, leading to use after free problems in
* xfs_buf_delwri_queue. It makes no difference to the buffer or log items which
* order we process them in - the buffer is locked, and we own the buffer list
* so nothing on them is going to change while we are performing this action.
*
* Hence we can safely queue the buffer for IO before we clear the failed log
* item state, therefore always having an active reference to the buffer and
* avoiding the transient zero-reference state that leads to use-after-free.
*
* Return true if the buffer was added to the buffer list, false if it was
* already on the buffer list.
*/
bool
xfs_buf_resubmit_failed_buffers(
struct xfs_buf *bp,
struct list_head *buffer_list)
{
struct xfs_log_item *lip;
bool ret;
ret = xfs_buf_delwri_queue(bp, buffer_list);
/*
* XFS_LI_FAILED set/clear is protected by ail_lock, caller of this
* function already have it acquired
*/
list_for_each_entry(lip, &bp->b_li_list, li_bio_list)
xfs_clear_li_failed(lip);
return ret;
}
......@@ -59,8 +59,6 @@ void xfs_buf_attach_iodone(struct xfs_buf *,
struct xfs_log_item *);
void xfs_buf_iodone_callbacks(struct xfs_buf *);
void xfs_buf_iodone(struct xfs_buf *, struct xfs_log_item *);
bool xfs_buf_resubmit_failed_buffers(struct xfs_buf *,
struct list_head *);
bool xfs_buf_log_check_iovec(struct xfs_log_iovec *iovec);
extern kmem_zone_t *xfs_buf_item_zone;
......
This diff is collapsed.
......@@ -524,7 +524,7 @@ xfs_readdir(
args.geo = dp->i_mount->m_dir_geo;
args.trans = tp;
if (dp->i_d.di_format == XFS_DINODE_FMT_LOCAL)
if (dp->i_df.if_format == XFS_DINODE_FMT_LOCAL)
rval = xfs_dir2_sf_getdents(&args, ctx);
else if ((rval = xfs_dir2_isblock(&args, &v)))
;
......
......@@ -75,7 +75,7 @@ xfs_qm_adjust_dqlimits(
int prealloc = 0;
ASSERT(d->d_id);
defq = xfs_get_defquota(dq, q);
defq = xfs_get_defquota(q, xfs_dquot_type(dq));
if (defq->bsoftlimit && !d->d_blk_softlimit) {
d->d_blk_softlimit = cpu_to_be64(defq->bsoftlimit);
......@@ -114,9 +114,14 @@ xfs_qm_adjust_dqlimits(
void
xfs_qm_adjust_dqtimers(
struct xfs_mount *mp,
struct xfs_disk_dquot *d)
struct xfs_dquot *dq)
{
struct xfs_quotainfo *qi = mp->m_quotainfo;
struct xfs_disk_dquot *d = &dq->q_core;
struct xfs_def_quota *defq;
ASSERT(d->d_id);
defq = xfs_get_defquota(qi, xfs_dquot_type(dq));
#ifdef DEBUG
if (d->d_blk_hardlimit)
......@@ -138,7 +143,7 @@ xfs_qm_adjust_dqtimers(
(be64_to_cpu(d->d_bcount) >
be64_to_cpu(d->d_blk_hardlimit)))) {
d->d_btimer = cpu_to_be32(ktime_get_real_seconds() +
mp->m_quotainfo->qi_btimelimit);
defq->btimelimit);
} else {
d->d_bwarns = 0;
}
......@@ -161,7 +166,7 @@ xfs_qm_adjust_dqtimers(
(be64_to_cpu(d->d_icount) >
be64_to_cpu(d->d_ino_hardlimit)))) {
d->d_itimer = cpu_to_be32(ktime_get_real_seconds() +
mp->m_quotainfo->qi_itimelimit);
defq->itimelimit);
} else {
d->d_iwarns = 0;
}
......@@ -184,7 +189,7 @@ xfs_qm_adjust_dqtimers(
(be64_to_cpu(d->d_rtbcount) >
be64_to_cpu(d->d_rtb_hardlimit)))) {
d->d_rtbtimer = cpu_to_be32(ktime_get_real_seconds() +
mp->m_quotainfo->qi_rtbtimelimit);
defq->rtbtimelimit);
} else {
d->d_rtbwarns = 0;
}
......@@ -205,16 +210,18 @@ xfs_qm_adjust_dqtimers(
*/
STATIC void
xfs_qm_init_dquot_blk(
xfs_trans_t *tp,
xfs_mount_t *mp,
xfs_dqid_t id,
uint type,
xfs_buf_t *bp)
struct xfs_trans *tp,
struct xfs_mount *mp,
xfs_dqid_t id,
uint type,
struct xfs_buf *bp)
{
struct xfs_quotainfo *q = mp->m_quotainfo;
xfs_dqblk_t *d;
xfs_dqid_t curid;
int i;
struct xfs_dqblk *d;
xfs_dqid_t curid;
unsigned int qflag;
unsigned int blftype;
int i;
ASSERT(tp);
ASSERT(xfs_buf_islocked(bp));
......@@ -238,11 +245,39 @@ xfs_qm_init_dquot_blk(
}
}
xfs_trans_dquot_buf(tp, bp,
(type & XFS_DQ_USER ? XFS_BLF_UDQUOT_BUF :
((type & XFS_DQ_PROJ) ? XFS_BLF_PDQUOT_BUF :
XFS_BLF_GDQUOT_BUF)));
xfs_trans_log_buf(tp, bp, 0, BBTOB(q->qi_dqchunklen) - 1);
if (type & XFS_DQ_USER) {
qflag = XFS_UQUOTA_CHKD;
blftype = XFS_BLF_UDQUOT_BUF;
} else if (type & XFS_DQ_PROJ) {
qflag = XFS_PQUOTA_CHKD;
blftype = XFS_BLF_PDQUOT_BUF;
} else {
qflag = XFS_GQUOTA_CHKD;
blftype = XFS_BLF_GDQUOT_BUF;
}
xfs_trans_dquot_buf(tp, bp, blftype);
/*
* quotacheck uses delayed writes to update all the dquots on disk in an
* efficient manner instead of logging the individual dquot changes as
* they are made. However if we log the buffer allocated here and crash
* after quotacheck while the logged initialisation is still in the
* active region of the log, log recovery can replay the dquot buffer
* initialisation over the top of the checked dquots and corrupt quota
* accounting.
*
* To avoid this problem, quotacheck cannot log the initialised buffer.
* We must still dirty the buffer and write it back before the
* allocation transaction clears the log. Therefore, mark the buffer as
* ordered instead of logging it directly. This is safe for quotacheck
* because it detects and repairs allocated but initialized dquot blocks
* in the quota inodes.
*/
if (!(mp->m_qflags & qflag))
xfs_trans_ordered_buf(tp, bp);
else
xfs_trans_log_buf(tp, bp, 0, BBTOB(q->qi_dqchunklen) - 1);
}
/*
......@@ -1021,6 +1056,7 @@ xfs_qm_dqflush_done(
struct xfs_dq_logitem *qip = (struct xfs_dq_logitem *)lip;
struct xfs_dquot *dqp = qip->qli_dquot;
struct xfs_ail *ailp = lip->li_ailp;
xfs_lsn_t tail_lsn;
/*
* We only want to pull the item from the AIL if its
......@@ -1034,10 +1070,11 @@ xfs_qm_dqflush_done(
((lip->li_lsn == qip->qli_flush_lsn) ||
test_bit(XFS_LI_FAILED, &lip->li_flags))) {
/* xfs_trans_ail_delete() drops the AIL lock. */
spin_lock(&ailp->ail_lock);
if (lip->li_lsn == qip->qli_flush_lsn) {
xfs_trans_ail_delete(ailp, lip, SHUTDOWN_CORRUPT_INCORE);
/* xfs_ail_update_finish() drops the AIL lock */
tail_lsn = xfs_ail_delete_one(ailp, lip);
xfs_ail_update_finish(ailp, tail_lsn);
} else {
/*
* Clear the failed state since we are about to drop the
......@@ -1068,6 +1105,7 @@ xfs_qm_dqflush(
struct xfs_buf **bpp)
{
struct xfs_mount *mp = dqp->q_mount;
struct xfs_log_item *lip = &dqp->q_logitem.qli_item;
struct xfs_buf *bp;
struct xfs_dqblk *dqb;
struct xfs_disk_dquot *ddqp;
......@@ -1083,32 +1121,16 @@ xfs_qm_dqflush(
xfs_qm_dqunpin_wait(dqp);
/*
* This may have been unpinned because the filesystem is shutting
* down forcibly. If that's the case we must not write this dquot
* to disk, because the log record didn't make it to disk.
*
* We also have to remove the log item from the AIL in this case,
* as we wait for an emptry AIL as part of the unmount process.
*/
if (XFS_FORCED_SHUTDOWN(mp)) {
struct xfs_log_item *lip = &dqp->q_logitem.qli_item;
dqp->dq_flags &= ~XFS_DQ_DIRTY;
xfs_trans_ail_remove(lip, SHUTDOWN_CORRUPT_INCORE);
error = -EIO;
goto out_unlock;
}
/*
* Get the buffer containing the on-disk dquot
*/
error = xfs_trans_read_buf(mp, NULL, mp->m_ddev_targp, dqp->q_blkno,
mp->m_quotainfo->qi_dqchunklen, XBF_TRYLOCK,
&bp, &xfs_dquot_buf_ops);
if (error)
if (error == -EAGAIN)
goto out_unlock;
if (error)
goto out_abort;
/*
* Calculate the location of the dquot inside the buffer.
......@@ -1116,17 +1138,15 @@ xfs_qm_dqflush(
dqb = bp->b_addr + dqp->q_bufoffset;
ddqp = &dqb->dd_diskdq;
/*
* A simple sanity check in case we got a corrupted dquot.
*/
fa = xfs_dqblk_verify(mp, dqb, be32_to_cpu(ddqp->d_id), 0);
/* sanity check the in-core structure before we flush */
fa = xfs_dquot_verify(mp, &dqp->q_core, be32_to_cpu(dqp->q_core.d_id),
0);
if (fa) {
xfs_alert(mp, "corrupt dquot ID 0x%x in memory at %pS",
be32_to_cpu(ddqp->d_id), fa);
be32_to_cpu(dqp->q_core.d_id), fa);
xfs_buf_relse(bp);
xfs_dqfunlock(dqp);
xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
return -EFSCORRUPTED;
error = -EFSCORRUPTED;
goto out_abort;
}
/* This is the only portion of data that needs to persist */
......@@ -1175,6 +1195,10 @@ xfs_qm_dqflush(
*bpp = bp;
return 0;
out_abort:
dqp->dq_flags &= ~XFS_DQ_DIRTY;
xfs_trans_ail_delete(lip, 0);
xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
out_unlock:
xfs_dqfunlock(dqp);
return error;
......
......@@ -154,7 +154,7 @@ void xfs_qm_dqdestroy(struct xfs_dquot *dqp);
int xfs_qm_dqflush(struct xfs_dquot *dqp, struct xfs_buf **bpp);
void xfs_qm_dqunpin_wait(struct xfs_dquot *dqp);
void xfs_qm_adjust_dqtimers(struct xfs_mount *mp,
struct xfs_disk_dquot *d);
struct xfs_dquot *d);
void xfs_qm_adjust_dqlimits(struct xfs_mount *mp,
struct xfs_dquot *d);
xfs_dqid_t xfs_qm_id_for_quotatype(struct xfs_inode *ip, uint type);
......
......@@ -145,21 +145,6 @@ xfs_qm_dquot_logitem_push(
if (atomic_read(&dqp->q_pincount) > 0)
return XFS_ITEM_PINNED;
/*
* The buffer containing this item failed to be written back
* previously. Resubmit the buffer for IO
*/
if (test_bit(XFS_LI_FAILED, &lip->li_flags)) {
if (!xfs_buf_trylock(bp))
return XFS_ITEM_LOCKED;
if (!xfs_buf_resubmit_failed_buffers(bp, buffer_list))
rval = XFS_ITEM_FLUSHING;
xfs_buf_unlock(bp);
return rval;
}
if (!xfs_dqlock_nowait(dqp))
return XFS_ITEM_LOCKED;
......@@ -358,7 +343,7 @@ xfs_qm_qoff_logitem_relse(
ASSERT(test_bit(XFS_LI_IN_AIL, &lip->li_flags) ||
test_bit(XFS_LI_ABORTED, &lip->li_flags) ||
XFS_FORCED_SHUTDOWN(lip->li_mountp));
xfs_trans_ail_remove(lip, SHUTDOWN_LOG_IO_ERROR);
xfs_trans_ail_delete(lip, 0);
kmem_free(lip->li_lv_shadow);
kmem_free(qoff);
}
......
This diff is collapsed.
......@@ -53,6 +53,7 @@ static unsigned int xfs_errortag_random_default[] = {
XFS_RANDOM_FORCE_SCRUB_REPAIR,
XFS_RANDOM_FORCE_SUMMARY_RECALC,
XFS_RANDOM_IUNLINK_FALLBACK,
XFS_RANDOM_BUF_IOERROR,
};
struct xfs_errortag_attr {
......@@ -162,6 +163,7 @@ XFS_ERRORTAG_ATTR_RW(buf_lru_ref, XFS_ERRTAG_BUF_LRU_REF);
XFS_ERRORTAG_ATTR_RW(force_repair, XFS_ERRTAG_FORCE_SCRUB_REPAIR);
XFS_ERRORTAG_ATTR_RW(bad_summary, XFS_ERRTAG_FORCE_SUMMARY_RECALC);
XFS_ERRORTAG_ATTR_RW(iunlink_fallback, XFS_ERRTAG_IUNLINK_FALLBACK);
XFS_ERRORTAG_ATTR_RW(buf_ioerror, XFS_ERRTAG_BUF_IOERROR);
static struct attribute *xfs_errortag_attrs[] = {
XFS_ERRORTAG_ATTR_LIST(noerror),
......@@ -199,6 +201,7 @@ static struct attribute *xfs_errortag_attrs[] = {
XFS_ERRORTAG_ATTR_LIST(force_repair),
XFS_ERRORTAG_ATTR_LIST(bad_summary),
XFS_ERRORTAG_ATTR_LIST(iunlink_fallback),
XFS_ERRORTAG_ATTR_LIST(buf_ioerror),
NULL,
};
......
This diff is collapsed.
......@@ -16,11 +16,6 @@ struct kmem_zone;
*/
#define XFS_EFI_MAX_FAST_EXTENTS 16
/*
* Define EFI flag bits. Manipulated by set/clear/test_bit operators.
*/
#define XFS_EFI_RECOVERED 1
/*
* This is the "extent free intention" log item. It is used to log the fact
* that some extents need to be free. It is used in conjunction with the
......@@ -50,25 +45,24 @@ struct kmem_zone;
* of commit failure or log I/O errors. Note that the EFD is not inserted in the
* AIL, so at this point both the EFI and EFD are freed.
*/
typedef struct xfs_efi_log_item {
struct xfs_efi_log_item {
struct xfs_log_item efi_item;
atomic_t efi_refcount;
atomic_t efi_next_extent;
unsigned long efi_flags; /* misc flags */
xfs_efi_log_format_t efi_format;
} xfs_efi_log_item_t;
};
/*
* This is the "extent free done" log item. It is used to log
* the fact that some extents earlier mentioned in an efi item
* have been freed.
*/
typedef struct xfs_efd_log_item {
struct xfs_efd_log_item {
struct xfs_log_item efd_item;
xfs_efi_log_item_t *efd_efip;
struct xfs_efi_log_item *efd_efip;
uint efd_next_extent;
xfs_efd_log_format_t efd_format;
} xfs_efd_log_item_t;
};
/*
* Max number of extents in fast allocation path.
......@@ -78,13 +72,4 @@ typedef struct xfs_efd_log_item {
extern struct kmem_zone *xfs_efi_zone;
extern struct kmem_zone *xfs_efd_zone;
xfs_efi_log_item_t *xfs_efi_init(struct xfs_mount *, uint);
int xfs_efi_copy_format(xfs_log_iovec_t *buf,
xfs_efi_log_format_t *dst_efi_fmt);
void xfs_efi_item_free(xfs_efi_log_item_t *);
void xfs_efi_release(struct xfs_efi_log_item *);
int xfs_efi_recover(struct xfs_mount *mp,
struct xfs_efi_log_item *efip);
#endif /* __XFS_EXTFREE_ITEM_H__ */
......@@ -1102,7 +1102,7 @@ xfs_dir_open(
* certain to have the next operation be a read there.
*/
mode = xfs_ilock_data_map_shared(ip);
if (ip->i_d.di_nextents > 0)
if (ip->i_df.if_nextents > 0)
error = xfs_dir3_data_readahead(ip, 0, 0);
xfs_iunlock(ip, mode);
return error;
......
......@@ -504,10 +504,7 @@ xfs_do_force_shutdown(
} else if (logerror) {
xfs_alert_tag(mp, XFS_PTAG_SHUTDOWN_LOGERROR,
"Log I/O Error Detected. Shutting down filesystem");
} else if (flags & SHUTDOWN_DEVICE_REQ) {
xfs_alert_tag(mp, XFS_PTAG_SHUTDOWN_IOERROR,
"All device paths lost. Shutting down filesystem");
} else if (!(flags & SHUTDOWN_REMOTE_REQ)) {
} else {
xfs_alert_tag(mp, XFS_PTAG_SHUTDOWN_IOERROR,
"I/O Error Detected. Shutting down filesystem");
}
......
This diff is collapsed.
......@@ -24,7 +24,7 @@ struct xfs_eofblocks {
* tags for inode radix tree
*/
#define XFS_ICI_NO_TAG (-1) /* special flag for an untagged lookup
in xfs_inode_ag_iterator */
in xfs_inode_walk */
#define XFS_ICI_RECLAIM_TAG 0 /* inode is to be reclaimed */
#define XFS_ICI_EOFBLOCKS_TAG 1 /* inode has blocks beyond EOF */
#define XFS_ICI_COWBLOCKS_TAG 2 /* inode can have cow blocks to gc */
......@@ -40,7 +40,7 @@ struct xfs_eofblocks {
/*
* flags for AG inode iterator
*/
#define XFS_AGITER_INEW_WAIT 0x1 /* wait on new inodes */
#define XFS_INODE_WALK_INEW_WAIT 0x1 /* wait on new inodes */
int xfs_iget(struct xfs_mount *mp, struct xfs_trans *tp, xfs_ino_t ino,
uint flags, uint lock_flags, xfs_inode_t **ipp);
......@@ -71,50 +71,9 @@ int xfs_inode_free_quota_cowblocks(struct xfs_inode *ip);
void xfs_cowblocks_worker(struct work_struct *);
void xfs_queue_cowblocks(struct xfs_mount *);
int xfs_inode_ag_iterator(struct xfs_mount *mp,
int (*execute)(struct xfs_inode *ip, int flags, void *args),
int flags, void *args);
int xfs_inode_ag_iterator_flags(struct xfs_mount *mp,
int (*execute)(struct xfs_inode *ip, int flags, void *args),
int flags, void *args, int iter_flags);
int xfs_inode_ag_iterator_tag(struct xfs_mount *mp,
int (*execute)(struct xfs_inode *ip, int flags, void *args),
int flags, void *args, int tag);
static inline int
xfs_fs_eofblocks_from_user(
struct xfs_fs_eofblocks *src,
struct xfs_eofblocks *dst)
{
if (src->eof_version != XFS_EOFBLOCKS_VERSION)
return -EINVAL;
if (src->eof_flags & ~XFS_EOF_FLAGS_VALID)
return -EINVAL;
if (memchr_inv(&src->pad32, 0, sizeof(src->pad32)) ||
memchr_inv(src->pad64, 0, sizeof(src->pad64)))
return -EINVAL;
dst->eof_flags = src->eof_flags;
dst->eof_prid = src->eof_prid;
dst->eof_min_file_size = src->eof_min_file_size;
dst->eof_uid = INVALID_UID;
if (src->eof_flags & XFS_EOF_FLAGS_UID) {
dst->eof_uid = make_kuid(current_user_ns(), src->eof_uid);
if (!uid_valid(dst->eof_uid))
return -EINVAL;
}
dst->eof_gid = INVALID_GID;
if (src->eof_flags & XFS_EOF_FLAGS_GID) {
dst->eof_gid = make_kgid(current_user_ns(), src->eof_gid);
if (!gid_valid(dst->eof_gid))
return -EINVAL;
}
return 0;
}
int xfs_inode_walk(struct xfs_mount *mp, int iter_flags,
int (*execute)(struct xfs_inode *ip, void *args),
void *args, int tag);
int xfs_icache_inode_is_allocated(struct xfs_mount *mp, struct xfs_trans *tp,
xfs_ino_t ino, bool *inuse);
......
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment