Commit 56c455b3 authored by Linus Torvalds

Merge tag 'xfs-6.4-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs updates from Dave Chinner:
 "This consists mainly of online scrub functionality and the design
  documentation for the upcoming online repair functionality built on
  top of the scrub code:

   - Detailed design documentation for the upcoming online repair
     feature

   - A major update to online scrub that completes the reverse mapping
     cross-referencing infrastructure, enabling us to fully validate
     allocated metadata against owner records. This is the last piece
     of scrub infrastructure needed before we can start merging online
     repair functionality.

   - Fixes for the ascii-ci hashing issues

   - Deprecation of the ascii-ci functionality

   - On-disk format verification bug fixes

   - Various bug fixes for syzbot and other bug reports"

* tag 'xfs-6.4-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (107 commits)
  xfs: fix livelock in delayed allocation at ENOSPC
  xfs: Extend table marker on deprecated mount options table
  xfs: fix duplicate includes
  xfs: fix BUG_ON in xfs_getbmap()
  xfs: verify buffer contents when we skip log replay
  xfs: _{attr,data}_map_shared should take ILOCK_EXCL until iread_extents is completely done
  xfs: remove WARN when dquot cache insertion fails
  xfs: don't consider future format versions valid
  xfs: deprecate the ascii-ci feature
  xfs: test the ascii case-insensitive hash
  xfs: stabilize the dirent name transformation function used for ascii-ci dir hash computation
  xfs: cross-reference rmap records with refcount btrees
  xfs: cross-reference rmap records with inode btrees
  xfs: cross-reference rmap records with free space btrees
  xfs: cross-reference rmap records with ag btrees
  xfs: introduce bitmap type for AG blocks
  xfs: convert xbitmap to interval tree
  xfs: drop the _safe behavior from the xbitmap foreach macro
  xfs: don't load local xattr values during scrub
  xfs: remove the for_each_xbitmap_ helpers
  ...
parents bedf1495 9419092f
......@@ -236,13 +236,14 @@ the dates listed above.
Deprecated Mount Options
========================
=========================== ================
============================ ================
Name Removal Schedule
=========================== ================
============================ ================
Mounting with V4 filesystem September 2030
Mounting ascii-ci filesystem September 2030
ikeep/noikeep September 2025
attr2/noattr2 September 2025
=========================== ================
============================ ================
Removed Mount Options
......
......@@ -123,4 +123,5 @@ Documentation for filesystem implementations.
vfat
xfs-delayed-logging-design
xfs-self-describing-metadata
xfs-online-fsck-design
zonefs
.. SPDX-License-Identifier: GPL-2.0
.. _xfs_self_describing_metadata:
============================
XFS Self Describing Metadata
......
......@@ -47,6 +47,33 @@ config XFS_SUPPORT_V4
To continue supporting the old V4 format (crc=0), say Y.
To close off an attack surface, say N.
config XFS_SUPPORT_ASCII_CI
bool "Support deprecated case-insensitive ascii (ascii-ci=1) format"
depends on XFS_FS
default y
help
The ASCII case insensitivity filesystem feature only works correctly
on systems that have been coerced into using ISO 8859-1, and it does
not work on extended attributes. The kernel has no visibility into
the locale settings in userspace, so it corrupts UTF-8 names.
Enabling this feature makes XFS vulnerable to mixed case sensitivity
attacks. Because of this, the feature is deprecated. All users
should upgrade by backing up their files, reformatting, and restoring
from the backup.
Administrators and users can detect such a filesystem by running
xfs_info against a filesystem mountpoint and checking for a string
beginning with "ascii-ci=". If the string "ascii-ci=1" is found, the
filesystem is a case-insensitive filesystem. If no such string is
found, please upgrade xfsprogs to the latest version and try again.
This option will become default N in September 2025. Support for the
feature will be removed entirely in September 2030. Distributors
can say N here to withdraw support earlier.
To continue supporting case-insensitivity (ascii-ci=1), say Y.
To close off an attack surface, say N.
config XFS_QUOTA
bool "XFS Quota support"
depends on XFS_FS
......@@ -93,10 +120,15 @@ config XFS_RT
If unsure, say N.
config XFS_DRAIN_INTENTS
bool
select JUMP_LABEL if HAVE_ARCH_JUMP_LABEL
config XFS_ONLINE_SCRUB
bool "XFS online metadata check support"
default n
depends on XFS_FS
select XFS_DRAIN_INTENTS
help
If you say Y here you will be able to check metadata on a
mounted XFS filesystem. This feature is intended to reduce
......
......@@ -136,6 +136,8 @@ ifeq ($(CONFIG_MEMORY_FAILURE),y)
xfs-$(CONFIG_FS_DAX) += xfs_notify_failure.o
endif
xfs-$(CONFIG_XFS_DRAIN_INTENTS) += xfs_drain.o
# online scrub/repair
ifeq ($(CONFIG_XFS_ONLINE_SCRUB),y)
......@@ -146,6 +148,7 @@ xfs-y += $(addprefix scrub/, \
agheader.o \
alloc.o \
attr.o \
bitmap.o \
bmap.o \
btree.o \
common.o \
......@@ -156,6 +159,7 @@ xfs-y += $(addprefix scrub/, \
ialloc.o \
inode.o \
parent.o \
readdir.o \
refcount.o \
rmap.o \
scrub.o \
......@@ -169,7 +173,6 @@ xfs-$(CONFIG_XFS_QUOTA) += scrub/quota.o
ifeq ($(CONFIG_XFS_ONLINE_REPAIR),y)
xfs-y += $(addprefix scrub/, \
agheader_repair.o \
bitmap.o \
repair.o \
)
endif
......
......@@ -81,6 +81,19 @@ xfs_perag_get_tag(
return pag;
}
/* Get a passive reference to the given perag. */
struct xfs_perag *
xfs_perag_hold(
struct xfs_perag *pag)
{
ASSERT(atomic_read(&pag->pag_ref) > 0 ||
atomic_read(&pag->pag_active_ref) > 0);
trace_xfs_perag_hold(pag, _RET_IP_);
atomic_inc(&pag->pag_ref);
return pag;
}
void
xfs_perag_put(
struct xfs_perag *pag)
......@@ -247,6 +260,7 @@ xfs_free_perag(
spin_unlock(&mp->m_perag_lock);
ASSERT(pag);
XFS_IS_CORRUPT(pag->pag_mount, atomic_read(&pag->pag_ref) != 0);
xfs_defer_drain_free(&pag->pag_intents_drain);
cancel_delayed_work_sync(&pag->pag_blockgc_work);
xfs_buf_hash_destroy(pag);
......@@ -372,6 +386,7 @@ xfs_initialize_perag(
spin_lock_init(&pag->pag_state_lock);
INIT_DELAYED_WORK(&pag->pag_blockgc_work, xfs_blockgc_worker);
INIT_RADIX_TREE(&pag->pag_ici_root, GFP_ATOMIC);
xfs_defer_drain_init(&pag->pag_intents_drain);
init_waitqueue_head(&pag->pagb_wait);
init_waitqueue_head(&pag->pag_active_wq);
pag->pagb_count = 0;
......@@ -408,6 +423,7 @@ xfs_initialize_perag(
return 0;
out_remove_pag:
xfs_defer_drain_free(&pag->pag_intents_drain);
radix_tree_delete(&mp->m_perag_tree, index);
out_free_pag:
kmem_free(pag);
......@@ -418,6 +434,7 @@ xfs_initialize_perag(
if (!pag)
break;
xfs_buf_hash_destroy(pag);
xfs_defer_drain_free(&pag->pag_intents_drain);
kmem_free(pag);
}
return error;
......@@ -1043,10 +1060,8 @@ xfs_ag_extend_space(
if (error)
return error;
error = xfs_free_extent(tp, XFS_AGB_TO_FSB(pag->pag_mount, pag->pag_agno,
be32_to_cpu(agf->agf_length) - len),
len, &XFS_RMAP_OINFO_SKIP_UPDATE,
XFS_AG_RESV_NONE);
error = xfs_free_extent(tp, pag, be32_to_cpu(agf->agf_length) - len,
len, &XFS_RMAP_OINFO_SKIP_UPDATE, XFS_AG_RESV_NONE);
if (error)
return error;
......
......@@ -101,6 +101,14 @@ struct xfs_perag {
/* background prealloc block trimming */
struct delayed_work pag_blockgc_work;
/*
* We use xfs_drain to track the number of deferred log intent items
* that have been queued (but not yet processed) so that waiters (e.g.
* scrub) will not lock resources when other threads are in the middle
* of processing a chain of intent items only to find momentary
* inconsistencies.
*/
struct xfs_defer_drain pag_intents_drain;
#endif /* __KERNEL__ */
};
......@@ -134,6 +142,7 @@ void xfs_free_perag(struct xfs_mount *mp);
struct xfs_perag *xfs_perag_get(struct xfs_mount *mp, xfs_agnumber_t agno);
struct xfs_perag *xfs_perag_get_tag(struct xfs_mount *mp, xfs_agnumber_t agno,
unsigned int tag);
struct xfs_perag *xfs_perag_hold(struct xfs_perag *pag);
void xfs_perag_put(struct xfs_perag *pag);
/* Active AG references */
......
......@@ -233,6 +233,52 @@ xfs_alloc_update(
return xfs_btree_update(cur, &rec);
}
/* Convert the ondisk btree record to its incore representation. */
void
xfs_alloc_btrec_to_irec(
const union xfs_btree_rec *rec,
struct xfs_alloc_rec_incore *irec)
{
irec->ar_startblock = be32_to_cpu(rec->alloc.ar_startblock);
irec->ar_blockcount = be32_to_cpu(rec->alloc.ar_blockcount);
}
/* Simple checks for free space records. */
xfs_failaddr_t
xfs_alloc_check_irec(
struct xfs_btree_cur *cur,
const struct xfs_alloc_rec_incore *irec)
{
struct xfs_perag *pag = cur->bc_ag.pag;
if (irec->ar_blockcount == 0)
return __this_address;
/* check for valid extent range, including overflow */
if (!xfs_verify_agbext(pag, irec->ar_startblock, irec->ar_blockcount))
return __this_address;
return NULL;
}
static inline int
xfs_alloc_complain_bad_rec(
struct xfs_btree_cur *cur,
xfs_failaddr_t fa,
const struct xfs_alloc_rec_incore *irec)
{
struct xfs_mount *mp = cur->bc_mp;
xfs_warn(mp,
"%s Freespace BTree record corruption in AG %d detected at %pS!",
cur->bc_btnum == XFS_BTNUM_BNO ? "Block" : "Size",
cur->bc_ag.pag->pag_agno, fa);
xfs_warn(mp,
"start block 0x%x block count 0x%x", irec->ar_startblock,
irec->ar_blockcount);
return -EFSCORRUPTED;
}
/*
* Get the data from the pointed-to record.
*/
......@@ -243,35 +289,23 @@ xfs_alloc_get_rec(
xfs_extlen_t *len, /* output: length of extent */
int *stat) /* output: success/failure */
{
struct xfs_mount *mp = cur->bc_mp;
struct xfs_perag *pag = cur->bc_ag.pag;
struct xfs_alloc_rec_incore irec;
union xfs_btree_rec *rec;
xfs_failaddr_t fa;
int error;
error = xfs_btree_get_rec(cur, &rec, stat);
if (error || !(*stat))
return error;
*bno = be32_to_cpu(rec->alloc.ar_startblock);
*len = be32_to_cpu(rec->alloc.ar_blockcount);
if (*len == 0)
goto out_bad_rec;
/* check for valid extent range, including overflow */
if (!xfs_verify_agbext(pag, *bno, *len))
goto out_bad_rec;
xfs_alloc_btrec_to_irec(rec, &irec);
fa = xfs_alloc_check_irec(cur, &irec);
if (fa)
return xfs_alloc_complain_bad_rec(cur, fa, &irec);
*bno = irec.ar_startblock;
*len = irec.ar_blockcount;
return 0;
out_bad_rec:
xfs_warn(mp,
"%s Freespace BTree record corruption in AG %d detected!",
cur->bc_btnum == XFS_BTNUM_BNO ? "Block" : "Size",
pag->pag_agno);
xfs_warn(mp,
"start block 0x%x block count 0x%x", *bno, *len);
return -EFSCORRUPTED;
}
/*
......@@ -2405,6 +2439,7 @@ xfs_defer_agfl_block(
trace_xfs_agfl_free_defer(mp, agno, 0, agbno, 1);
xfs_extent_free_get_group(mp, xefi);
xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_AGFL_FREE, &xefi->xefi_list);
}
......@@ -2421,8 +2456,8 @@ __xfs_free_extent_later(
bool skip_discard)
{
struct xfs_extent_free_item *xefi;
#ifdef DEBUG
struct xfs_mount *mp = tp->t_mountp;
#ifdef DEBUG
xfs_agnumber_t agno;
xfs_agblock_t agbno;
......@@ -2456,9 +2491,11 @@ __xfs_free_extent_later(
} else {
xefi->xefi_owner = XFS_RMAP_OWN_NULL;
}
trace_xfs_bmap_free_defer(tp->t_mountp,
trace_xfs_bmap_free_defer(mp,
XFS_FSB_TO_AGNO(tp->t_mountp, bno), 0,
XFS_FSB_TO_AGBNO(tp->t_mountp, bno), len);
xfs_extent_free_get_group(mp, xefi);
xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_FREE, &xefi->xefi_list);
}
......@@ -3596,7 +3633,8 @@ xfs_free_extent_fix_freelist(
int
__xfs_free_extent(
struct xfs_trans *tp,
xfs_fsblock_t bno,
struct xfs_perag *pag,
xfs_agblock_t agbno,
xfs_extlen_t len,
const struct xfs_owner_info *oinfo,
enum xfs_ag_resv_type type,
......@@ -3604,12 +3642,9 @@ __xfs_free_extent(
{
struct xfs_mount *mp = tp->t_mountp;
struct xfs_buf *agbp;
xfs_agnumber_t agno = XFS_FSB_TO_AGNO(mp, bno);
xfs_agblock_t agbno = XFS_FSB_TO_AGBNO(mp, bno);
struct xfs_agf *agf;
int error;
unsigned int busy_flags = 0;
struct xfs_perag *pag;
ASSERT(len != 0);
ASSERT(type != XFS_AG_RESV_AGFL);
......@@ -3618,10 +3653,9 @@ __xfs_free_extent(
XFS_ERRTAG_FREE_EXTENT))
return -EIO;
pag = xfs_perag_get(mp, agno);
error = xfs_free_extent_fix_freelist(tp, pag, &agbp);
if (error)
goto err;
return error;
agf = agbp->b_addr;
if (XFS_IS_CORRUPT(mp, agbno >= mp->m_sb.sb_agblocks)) {
......@@ -3635,20 +3669,18 @@ __xfs_free_extent(
goto err_release;
}
error = xfs_free_ag_extent(tp, agbp, agno, agbno, len, oinfo, type);
error = xfs_free_ag_extent(tp, agbp, pag->pag_agno, agbno, len, oinfo,
type);
if (error)
goto err_release;
if (skip_discard)
busy_flags |= XFS_EXTENT_BUSY_SKIP_DISCARD;
xfs_extent_busy_insert(tp, pag, agbno, len, busy_flags);
xfs_perag_put(pag);
return 0;
err_release:
xfs_trans_brelse(tp, agbp);
err:
xfs_perag_put(pag);
return error;
}
......@@ -3666,9 +3698,13 @@ xfs_alloc_query_range_helper(
{
struct xfs_alloc_query_range_info *query = priv;
struct xfs_alloc_rec_incore irec;
xfs_failaddr_t fa;
xfs_alloc_btrec_to_irec(rec, &irec);
fa = xfs_alloc_check_irec(cur, &irec);
if (fa)
return xfs_alloc_complain_bad_rec(cur, fa, &irec);
irec.ar_startblock = be32_to_cpu(rec->alloc.ar_startblock);
irec.ar_blockcount = be32_to_cpu(rec->alloc.ar_blockcount);
return query->fn(cur, &irec, query->priv);
}
......@@ -3709,13 +3745,16 @@ xfs_alloc_query_all(
return xfs_btree_query_all(cur, xfs_alloc_query_range_helper, &query);
}
/* Is there a record covering a given extent? */
/*
* Scan part of the keyspace of the free space and tell us if the area has no
* records, is fully mapped by records, or is partially filled.
*/
int
xfs_alloc_has_record(
xfs_alloc_has_records(
struct xfs_btree_cur *cur,
xfs_agblock_t bno,
xfs_extlen_t len,
bool *exists)
enum xbtree_recpacking *outcome)
{
union xfs_btree_irec low;
union xfs_btree_irec high;
......@@ -3725,7 +3764,7 @@ xfs_alloc_has_record(
memset(&high, 0xFF, sizeof(high));
high.a.ar_startblock = bno + len - 1;
return xfs_btree_has_record(cur, &low, &high, exists);
return xfs_btree_has_records(cur, &low, &high, NULL, outcome);
}
/*
......
......@@ -141,7 +141,8 @@ int xfs_alloc_vextent_first_ag(struct xfs_alloc_arg *args,
int /* error */
__xfs_free_extent(
struct xfs_trans *tp, /* transaction pointer */
xfs_fsblock_t bno, /* starting block number of extent */
struct xfs_perag *pag,
xfs_agblock_t agbno,
xfs_extlen_t len, /* length of extent */
const struct xfs_owner_info *oinfo, /* extent owner */
enum xfs_ag_resv_type type, /* block reservation type */
......@@ -150,12 +151,13 @@ __xfs_free_extent(
static inline int
xfs_free_extent(
struct xfs_trans *tp,
xfs_fsblock_t bno,
struct xfs_perag *pag,
xfs_agblock_t agbno,
xfs_extlen_t len,
const struct xfs_owner_info *oinfo,
enum xfs_ag_resv_type type)
{
return __xfs_free_extent(tp, bno, len, oinfo, type, false);
return __xfs_free_extent(tp, pag, agbno, len, oinfo, type, false);
}
int /* error */
......@@ -179,6 +181,12 @@ xfs_alloc_get_rec(
xfs_extlen_t *len, /* output: length of extent */
int *stat); /* output: success/failure */
union xfs_btree_rec;
void xfs_alloc_btrec_to_irec(const union xfs_btree_rec *rec,
struct xfs_alloc_rec_incore *irec);
xfs_failaddr_t xfs_alloc_check_irec(struct xfs_btree_cur *cur,
const struct xfs_alloc_rec_incore *irec);
int xfs_read_agf(struct xfs_perag *pag, struct xfs_trans *tp, int flags,
struct xfs_buf **agfbpp);
int xfs_alloc_read_agf(struct xfs_perag *pag, struct xfs_trans *tp, int flags,
......@@ -205,8 +213,8 @@ int xfs_alloc_query_range(struct xfs_btree_cur *cur,
int xfs_alloc_query_all(struct xfs_btree_cur *cur, xfs_alloc_query_range_fn fn,
void *priv);
int xfs_alloc_has_record(struct xfs_btree_cur *cur, xfs_agblock_t bno,
xfs_extlen_t len, bool *exist);
int xfs_alloc_has_records(struct xfs_btree_cur *cur, xfs_agblock_t bno,
xfs_extlen_t len, enum xbtree_recpacking *outcome);
typedef int (*xfs_agfl_walk_fn)(struct xfs_mount *mp, xfs_agblock_t bno,
void *priv);
......@@ -235,9 +243,13 @@ struct xfs_extent_free_item {
uint64_t xefi_owner;
xfs_fsblock_t xefi_startblock;/* starting fs block number */
xfs_extlen_t xefi_blockcount;/* number of blocks in extent */
struct xfs_perag *xefi_pag;
unsigned int xefi_flags;
};
void xfs_extent_free_get_group(struct xfs_mount *mp,
struct xfs_extent_free_item *xefi);
#define XFS_EFI_SKIP_DISCARD (1U << 0) /* don't issue discard */
#define XFS_EFI_ATTR_FORK (1U << 1) /* freeing attr fork block */
#define XFS_EFI_BMBT_BLOCK (1U << 2) /* freeing bmap btree block */
......
......@@ -260,8 +260,11 @@ STATIC int64_t
xfs_bnobt_diff_two_keys(
struct xfs_btree_cur *cur,
const union xfs_btree_key *k1,
const union xfs_btree_key *k2)
const union xfs_btree_key *k2,
const union xfs_btree_key *mask)
{
ASSERT(!mask || mask->alloc.ar_startblock);
return (int64_t)be32_to_cpu(k1->alloc.ar_startblock) -
be32_to_cpu(k2->alloc.ar_startblock);
}
......@@ -270,10 +273,14 @@ STATIC int64_t
xfs_cntbt_diff_two_keys(
struct xfs_btree_cur *cur,
const union xfs_btree_key *k1,
const union xfs_btree_key *k2)
const union xfs_btree_key *k2,
const union xfs_btree_key *mask)
{
int64_t diff;
ASSERT(!mask || (mask->alloc.ar_blockcount &&
mask->alloc.ar_startblock));
diff = be32_to_cpu(k1->alloc.ar_blockcount) -
be32_to_cpu(k2->alloc.ar_blockcount);
if (diff)
......@@ -423,6 +430,19 @@ xfs_cntbt_recs_inorder(
be32_to_cpu(r2->alloc.ar_startblock));
}
STATIC enum xbtree_key_contig
xfs_allocbt_keys_contiguous(
struct xfs_btree_cur *cur,
const union xfs_btree_key *key1,
const union xfs_btree_key *key2,
const union xfs_btree_key *mask)
{
ASSERT(!mask || mask->alloc.ar_startblock);
return xbtree_key_contig(be32_to_cpu(key1->alloc.ar_startblock),
be32_to_cpu(key2->alloc.ar_startblock));
}
static const struct xfs_btree_ops xfs_bnobt_ops = {
.rec_len = sizeof(xfs_alloc_rec_t),
.key_len = sizeof(xfs_alloc_key_t),
......@@ -443,6 +463,7 @@ static const struct xfs_btree_ops xfs_bnobt_ops = {
.diff_two_keys = xfs_bnobt_diff_two_keys,
.keys_inorder = xfs_bnobt_keys_inorder,
.recs_inorder = xfs_bnobt_recs_inorder,
.keys_contiguous = xfs_allocbt_keys_contiguous,
};
static const struct xfs_btree_ops xfs_cntbt_ops = {
......@@ -465,6 +486,7 @@ static const struct xfs_btree_ops xfs_cntbt_ops = {
.diff_two_keys = xfs_cntbt_diff_two_keys,
.keys_inorder = xfs_cntbt_keys_inorder,
.recs_inorder = xfs_cntbt_recs_inorder,
.keys_contiguous = NULL, /* not needed right now */
};
/* Allocate most of a new allocation btree cursor. */
......@@ -492,9 +514,7 @@ xfs_allocbt_init_common(
cur->bc_statoff = XFS_STATS_CALC_INDEX(xs_abtb_2);
}
/* take a reference for the cursor */
atomic_inc(&pag->pag_ref);
cur->bc_ag.pag = pag;
cur->bc_ag.pag = xfs_perag_hold(pag);
if (xfs_has_crc(mp))
cur->bc_flags |= XFS_BTREE_CRC_BLOCKS;
......
......@@ -1083,6 +1083,34 @@ struct xfs_iread_state {
xfs_extnum_t loaded;
};
int
xfs_bmap_complain_bad_rec(
struct xfs_inode *ip,
int whichfork,
xfs_failaddr_t fa,
const struct xfs_bmbt_irec *irec)
{
struct xfs_mount *mp = ip->i_mount;
const char *forkname;
switch (whichfork) {
case XFS_DATA_FORK: forkname = "data"; break;
case XFS_ATTR_FORK: forkname = "attr"; break;
case XFS_COW_FORK: forkname = "CoW"; break;
default: forkname = "???"; break;
}
xfs_warn(mp,
"Bmap BTree record corruption in inode 0x%llx %s fork detected at %pS!",
ip->i_ino, forkname, fa);
xfs_warn(mp,
"Offset 0x%llx, start block 0x%llx, block count 0x%llx state 0x%x",
irec->br_startoff, irec->br_startblock, irec->br_blockcount,
irec->br_state);
return -EFSCORRUPTED;
}
/* Stuff every bmbt record from this block into the incore extent map. */
static int
xfs_iread_bmbt_block(
......@@ -1125,7 +1153,8 @@ xfs_iread_bmbt_block(
xfs_inode_verifier_error(ip, -EFSCORRUPTED,
"xfs_iread_extents(2)", frp,
sizeof(*frp), fa);
return -EFSCORRUPTED;
return xfs_bmap_complain_bad_rec(ip, whichfork, fa,
&new);
}
xfs_iext_insert(ip, &ir->icur, &new,
xfs_bmap_fork_to_state(whichfork));
......@@ -1171,6 +1200,12 @@ xfs_iread_extents(
goto out;
}
ASSERT(ir.loaded == xfs_iext_count(ifp));
/*
* Use release semantics so that we can use acquire semantics in
* xfs_need_iread_extents and be guaranteed to see a valid mapping tree
* after that load.
*/
smp_store_release(&ifp->if_needextents, 0);
return 0;
out:
xfs_iext_destroy(ifp);
......@@ -3505,7 +3540,6 @@ xfs_bmap_btalloc_at_eof(
* original non-aligned state so the caller can proceed on allocation
* failure as if this function was never called.
*/
args->fsbno = ap->blkno;
args->alignment = 1;
return 0;
}
......@@ -6075,6 +6109,7 @@ __xfs_bmap_add(
bi->bi_whichfork = whichfork;
bi->bi_bmap = *bmap;
xfs_bmap_update_get_group(tp->t_mountp, bi);
xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_BMAP, &bi->bi_list);
return 0;
}
......
......@@ -145,7 +145,7 @@ static inline int xfs_bmapi_whichfork(uint32_t bmapi_flags)
{ BMAP_COWFORK, "COW" }
/* Return true if the extent is an allocated extent, written or not. */
static inline bool xfs_bmap_is_real_extent(struct xfs_bmbt_irec *irec)
static inline bool xfs_bmap_is_real_extent(const struct xfs_bmbt_irec *irec)
{
return irec->br_startblock != HOLESTARTBLOCK &&
irec->br_startblock != DELAYSTARTBLOCK &&
......@@ -238,9 +238,13 @@ struct xfs_bmap_intent {
enum xfs_bmap_intent_type bi_type;
int bi_whichfork;
struct xfs_inode *bi_owner;
struct xfs_perag *bi_pag;
struct xfs_bmbt_irec bi_bmap;
};
void xfs_bmap_update_get_group(struct xfs_mount *mp,
struct xfs_bmap_intent *bi);
int xfs_bmap_finish_one(struct xfs_trans *tp, struct xfs_bmap_intent *bi);
void xfs_bmap_map_extent(struct xfs_trans *tp, struct xfs_inode *ip,
struct xfs_bmbt_irec *imap);
......@@ -261,6 +265,8 @@ static inline uint32_t xfs_bmap_fork_to_state(int whichfork)
xfs_failaddr_t xfs_bmap_validate_extent(struct xfs_inode *ip, int whichfork,
struct xfs_bmbt_irec *irec);
int xfs_bmap_complain_bad_rec(struct xfs_inode *ip, int whichfork,
xfs_failaddr_t fa, const struct xfs_bmbt_irec *irec);
int xfs_bmapi_remap(struct xfs_trans *tp, struct xfs_inode *ip,
xfs_fileoff_t bno, xfs_filblks_t len, xfs_fsblock_t startblock,
......
......@@ -382,11 +382,14 @@ STATIC int64_t
xfs_bmbt_diff_two_keys(
struct xfs_btree_cur *cur,
const union xfs_btree_key *k1,
const union xfs_btree_key *k2)
const union xfs_btree_key *k2,
const union xfs_btree_key *mask)
{
uint64_t a = be64_to_cpu(k1->bmbt.br_startoff);
uint64_t b = be64_to_cpu(k2->bmbt.br_startoff);
ASSERT(!mask || mask->bmbt.br_startoff);
/*
* Note: This routine previously casted a and b to int64 and subtracted
* them to generate a result. This lead to problems if b was the
......@@ -500,6 +503,19 @@ xfs_bmbt_recs_inorder(
xfs_bmbt_disk_get_startoff(&r2->bmbt);
}
STATIC enum xbtree_key_contig
xfs_bmbt_keys_contiguous(
struct xfs_btree_cur *cur,
const union xfs_btree_key *key1,
const union xfs_btree_key *key2,
const union xfs_btree_key *mask)
{
ASSERT(!mask || mask->bmbt.br_startoff);
return xbtree_key_contig(be64_to_cpu(key1->bmbt.br_startoff),
be64_to_cpu(key2->bmbt.br_startoff));
}
static const struct xfs_btree_ops xfs_bmbt_ops = {
.rec_len = sizeof(xfs_bmbt_rec_t),
.key_len = sizeof(xfs_bmbt_key_t),
......@@ -520,6 +536,7 @@ static const struct xfs_btree_ops xfs_bmbt_ops = {
.buf_ops = &xfs_bmbt_buf_ops,
.keys_inorder = xfs_bmbt_keys_inorder,
.recs_inorder = xfs_bmbt_recs_inorder,
.keys_contiguous = xfs_bmbt_keys_contiguous,
};
/*
......
......@@ -90,6 +90,27 @@ uint32_t xfs_btree_magic(int crc, xfs_btnum_t btnum);
#define XFS_BTREE_STATS_ADD(cur, stat, val) \
XFS_STATS_ADD_OFF((cur)->bc_mp, (cur)->bc_statoff + __XBTS_ ## stat, val)
enum xbtree_key_contig {
XBTREE_KEY_GAP = 0,
XBTREE_KEY_CONTIGUOUS,
XBTREE_KEY_OVERLAP,
};
/*
* Decide if these two numeric btree key fields are contiguous, overlapping,
* or if there's a gap between them. @x should be the field from the high
* key and @y should be the field from the low key.
*/
static inline enum xbtree_key_contig xbtree_key_contig(uint64_t x, uint64_t y)
{
x++;
if (x < y)
return XBTREE_KEY_GAP;
if (x == y)
return XBTREE_KEY_CONTIGUOUS;
return XBTREE_KEY_OVERLAP;
}
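/*
 * Editorial illustration (not part of this patch): for two records
 * covering agbno 10-14 and agbno 15-20, the high key of the first
 * record is 14 and the low key of the second is 15, so
 * xbtree_key_contig(14, 15) increments 14 to 15 and returns
 * XBTREE_KEY_CONTIGUOUS.  A second record starting at 17 would give
 * XBTREE_KEY_GAP (15 < 17), and one starting at 12 would give
 * XBTREE_KEY_OVERLAP (15 > 12).
 */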
struct xfs_btree_ops {
/* size of the key and record structures */
size_t key_len;
......@@ -140,11 +161,14 @@ struct xfs_btree_ops {
/*
* Difference between key2 and key1 -- positive if key1 > key2,
* negative if key1 < key2, and zero if equal.
* negative if key1 < key2, and zero if equal. If the @mask parameter
* is non NULL, each key field to be used in the comparison must
* contain a nonzero value.
*/
int64_t (*diff_two_keys)(struct xfs_btree_cur *cur,
const union xfs_btree_key *key1,
const union xfs_btree_key *key2);
const union xfs_btree_key *key2,
const union xfs_btree_key *mask);
const struct xfs_buf_ops *buf_ops;
......@@ -157,6 +181,22 @@ struct xfs_btree_ops {
int (*recs_inorder)(struct xfs_btree_cur *cur,
const union xfs_btree_rec *r1,
const union xfs_btree_rec *r2);
/*
* Are these two btree keys immediately adjacent?
*
* Given two btree keys @key1 and @key2, decide if it is impossible for
* there to be a third btree key K satisfying the relationship
* @key1 < K < @key2. To determine if two btree records are
* immediately adjacent, @key1 should be the high key of the first
* record and @key2 should be the low key of the second record.
* If the @mask parameter is non NULL, each key field to be used in the
* comparison must contain a nonzero value.
*/
enum xbtree_key_contig (*keys_contiguous)(struct xfs_btree_cur *cur,
const union xfs_btree_key *key1,
const union xfs_btree_key *key2,
const union xfs_btree_key *mask);
};
/*
......@@ -540,12 +580,105 @@ void xfs_btree_get_keys(struct xfs_btree_cur *cur,
struct xfs_btree_block *block, union xfs_btree_key *key);
union xfs_btree_key *xfs_btree_high_key_from_key(struct xfs_btree_cur *cur,
union xfs_btree_key *key);
int xfs_btree_has_record(struct xfs_btree_cur *cur,
typedef bool (*xfs_btree_key_gap_fn)(struct xfs_btree_cur *cur,
const union xfs_btree_key *key1,
const union xfs_btree_key *key2);
int xfs_btree_has_records(struct xfs_btree_cur *cur,
const union xfs_btree_irec *low,
const union xfs_btree_irec *high, bool *exists);
const union xfs_btree_irec *high,
const union xfs_btree_key *mask,
enum xbtree_recpacking *outcome);
bool xfs_btree_has_more_records(struct xfs_btree_cur *cur);
struct xfs_ifork *xfs_btree_ifork_ptr(struct xfs_btree_cur *cur);
/* Key comparison helpers */
static inline bool
xfs_btree_keycmp_lt(
struct xfs_btree_cur *cur,
const union xfs_btree_key *key1,
const union xfs_btree_key *key2)
{
return cur->bc_ops->diff_two_keys(cur, key1, key2, NULL) < 0;
}
static inline bool
xfs_btree_keycmp_gt(
struct xfs_btree_cur *cur,
const union xfs_btree_key *key1,
const union xfs_btree_key *key2)
{
return cur->bc_ops->diff_two_keys(cur, key1, key2, NULL) > 0;
}
static inline bool
xfs_btree_keycmp_eq(
struct xfs_btree_cur *cur,
const union xfs_btree_key *key1,
const union xfs_btree_key *key2)
{
return cur->bc_ops->diff_two_keys(cur, key1, key2, NULL) == 0;
}
static inline bool
xfs_btree_keycmp_le(
struct xfs_btree_cur *cur,
const union xfs_btree_key *key1,
const union xfs_btree_key *key2)
{
return !xfs_btree_keycmp_gt(cur, key1, key2);
}
static inline bool
xfs_btree_keycmp_ge(
struct xfs_btree_cur *cur,
const union xfs_btree_key *key1,
const union xfs_btree_key *key2)
{
return !xfs_btree_keycmp_lt(cur, key1, key2);
}
static inline bool
xfs_btree_keycmp_ne(
struct xfs_btree_cur *cur,
const union xfs_btree_key *key1,
const union xfs_btree_key *key2)
{
return !xfs_btree_keycmp_eq(cur, key1, key2);
}
/* Masked key comparison helpers */
static inline bool
xfs_btree_masked_keycmp_lt(
struct xfs_btree_cur *cur,
const union xfs_btree_key *key1,
const union xfs_btree_key *key2,
const union xfs_btree_key *mask)
{
return cur->bc_ops->diff_two_keys(cur, key1, key2, mask) < 0;
}
static inline bool
xfs_btree_masked_keycmp_gt(
struct xfs_btree_cur *cur,
const union xfs_btree_key *key1,
const union xfs_btree_key *key2,
const union xfs_btree_key *mask)
{
return cur->bc_ops->diff_two_keys(cur, key1, key2, mask) > 0;
}
static inline bool
xfs_btree_masked_keycmp_ge(
struct xfs_btree_cur *cur,
const union xfs_btree_key *key1,
const union xfs_btree_key *key2,
const union xfs_btree_key *mask)
{
return !xfs_btree_masked_keycmp_lt(cur, key1, key2, mask);
}
/* Does this cursor point to the last block in the given level? */
static inline bool
xfs_btree_islastblock(
......
......@@ -397,6 +397,7 @@ xfs_defer_cancel_list(
list_for_each_safe(pwi, n, &dfp->dfp_work) {
list_del(pwi);
dfp->dfp_count--;
trace_xfs_defer_cancel_item(mp, dfp, pwi);
ops->cancel_item(pwi);
}
ASSERT(dfp->dfp_count == 0);
......@@ -476,6 +477,7 @@ xfs_defer_finish_one(
list_for_each_safe(li, n, &dfp->dfp_work) {
list_del(li);
dfp->dfp_count--;
trace_xfs_defer_finish_item(tp->t_mountp, dfp, li);
error = ops->finish_item(tp, dfp->dfp_done, li, &state);
if (error == -EAGAIN) {
int ret;
......@@ -623,7 +625,7 @@ xfs_defer_add(
struct list_head *li)
{
struct xfs_defer_pending *dfp = NULL;
const struct xfs_defer_op_type *ops;
const struct xfs_defer_op_type *ops = defer_op_types[type];
ASSERT(tp->t_flags & XFS_TRANS_PERM_LOG_RES);
BUILD_BUG_ON(ARRAY_SIZE(defer_op_types) != XFS_DEFER_OPS_TYPE_MAX);
......@@ -636,7 +638,6 @@ xfs_defer_add(
if (!list_empty(&tp->t_dfops)) {
dfp = list_last_entry(&tp->t_dfops,
struct xfs_defer_pending, dfp_list);
ops = defer_op_types[dfp->dfp_type];
if (dfp->dfp_type != type ||
(ops->max_items && dfp->dfp_count >= ops->max_items))
dfp = NULL;
......@@ -653,6 +654,7 @@ xfs_defer_add(
}
list_add_tail(li, &dfp->dfp_work);
trace_xfs_defer_add_item(tp->t_mountp, dfp, li);
dfp->dfp_count++;
}
......
......@@ -64,7 +64,7 @@ xfs_ascii_ci_hashname(
int i;
for (i = 0, hash = 0; i < name->len; i++)
hash = tolower(name->name[i]) ^ rol32(hash, 7);
hash = xfs_ascii_ci_xfrm(name->name[i]) ^ rol32(hash, 7);
return hash;
}
......@@ -85,7 +85,8 @@ xfs_ascii_ci_compname(
for (i = 0; i < len; i++) {
if (args->name[i] == name[i])
continue;
if (tolower(args->name[i]) != tolower(name[i]))
if (xfs_ascii_ci_xfrm(args->name[i]) !=
xfs_ascii_ci_xfrm(name[i]))
return XFS_CMP_DIFFERENT;
result = XFS_CMP_CASE;
}
......
......@@ -248,4 +248,35 @@ unsigned int xfs_dir3_data_end_offset(struct xfs_da_geometry *geo,
struct xfs_dir2_data_hdr *hdr);
bool xfs_dir2_namecheck(const void *name, size_t length);
/*
* The "ascii-ci" feature was created to speed up case-insensitive lookups for
* a Samba product. Because of the inherent problems with CI and UTF-8
* encoding, etc, it was decided that Samba would be configured to export
* latin1/iso 8859-1 encodings as that covered >90% of the target markets for
* the product. Hence the "ascii-ci" casefolding code could be encoded into
* the XFS directory operations and remove all the overhead of casefolding from
* Samba.
*
* To provide consistent hashing behavior between the userspace and kernel,
* these functions prepare names for hashing by transforming specific bytes
* to other bytes. Robustness with other encodings is not guaranteed.
*/
static inline bool xfs_ascii_ci_need_xfrm(unsigned char c)
{
if (c >= 0x41 && c <= 0x5a) /* A-Z */
return true;
if (c >= 0xc0 && c <= 0xd6) /* latin A-O with accents */
return true;
if (c >= 0xd8 && c <= 0xde) /* latin O-Y with accents */
return true;
return false;
}
static inline unsigned char xfs_ascii_ci_xfrm(unsigned char c)
{
if (xfs_ascii_ci_need_xfrm(c))
c -= 'A' - 'a';
return c;
}
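/*
 * Editorial illustration (not part of this patch): the transform maps
 * each byte in the matching uppercase ranges to its lowercase form by
 * adding 0x20 ('a' - 'A'), so xfs_ascii_ci_xfrm('A') returns 'a'
 * (0x41 -> 0x61) and xfs_ascii_ci_xfrm(0xc0) returns 0xe0 (Latin-1
 * A-grave -> a-grave), while bytes outside those ranges -- for example
 * 0xd7, the multiplication sign -- pass through unchanged.
 */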
#endif /* __XFS_DIR2_H__ */
......@@ -95,33 +95,25 @@ xfs_inobt_btrec_to_irec(
irec->ir_free = be64_to_cpu(rec->inobt.ir_free);
}
/*
* Get the data from the pointed-to record.
*/
int
xfs_inobt_get_rec(
/* Simple checks for inode records. */
xfs_failaddr_t
xfs_inobt_check_irec(
struct xfs_btree_cur *cur,
struct xfs_inobt_rec_incore *irec,
int *stat)
const struct xfs_inobt_rec_incore *irec)
{
struct xfs_mount *mp = cur->bc_mp;
union xfs_btree_rec *rec;
int error;
uint64_t realfree;
error = xfs_btree_get_rec(cur, &rec, stat);
if (error || *stat == 0)
return error;
xfs_inobt_btrec_to_irec(mp, rec, irec);
/* Record has to be properly aligned within the AG. */
if (!xfs_verify_agino(cur->bc_ag.pag, irec->ir_startino))
goto out_bad_rec;
return __this_address;
if (!xfs_verify_agino(cur->bc_ag.pag,
irec->ir_startino + XFS_INODES_PER_CHUNK - 1))
return __this_address;
if (irec->ir_count < XFS_INODES_PER_HOLEMASK_BIT ||
irec->ir_count > XFS_INODES_PER_CHUNK)
goto out_bad_rec;
return __this_address;
if (irec->ir_freecount > XFS_INODES_PER_CHUNK)
goto out_bad_rec;
return __this_address;
/* if there are no holes, return the first available offset */
if (!xfs_inobt_issparse(irec->ir_holemask))
......@@ -129,15 +121,23 @@ xfs_inobt_get_rec(
else
realfree = irec->ir_free & xfs_inobt_irec_to_allocmask(irec);
if (hweight64(realfree) != irec->ir_freecount)
goto out_bad_rec;
return __this_address;
return 0;
return NULL;
}
static inline int
xfs_inobt_complain_bad_rec(
struct xfs_btree_cur *cur,
xfs_failaddr_t fa,
const struct xfs_inobt_rec_incore *irec)
{
struct xfs_mount *mp = cur->bc_mp;
out_bad_rec:
xfs_warn(mp,
"%s Inode BTree record corruption in AG %d detected!",
"%s Inode BTree record corruption in AG %d detected at %pS!",
cur->bc_btnum == XFS_BTNUM_INO ? "Used" : "Free",
cur->bc_ag.pag->pag_agno);
cur->bc_ag.pag->pag_agno, fa);
xfs_warn(mp,
"start inode 0x%x, count 0x%x, free 0x%x freemask 0x%llx, holemask 0x%x",
irec->ir_startino, irec->ir_count, irec->ir_freecount,
......@@ -145,6 +145,32 @@ xfs_inobt_get_rec(
return -EFSCORRUPTED;
}
/*
* Get the data from the pointed-to record.
*/
int
xfs_inobt_get_rec(
struct xfs_btree_cur *cur,
struct xfs_inobt_rec_incore *irec,
int *stat)
{
struct xfs_mount *mp = cur->bc_mp;
union xfs_btree_rec *rec;
xfs_failaddr_t fa;
int error;
error = xfs_btree_get_rec(cur, &rec, stat);
if (error || *stat == 0)
return error;
xfs_inobt_btrec_to_irec(mp, rec, irec);
fa = xfs_inobt_check_irec(cur, irec);
if (fa)
return xfs_inobt_complain_bad_rec(cur, fa, irec);
return 0;
}
/*
* Insert a single inobt record. Cursor must already point to desired location.
*/
......@@ -1952,8 +1978,6 @@ xfs_difree_inobt(
*/
if (!xfs_has_ikeep(mp) && rec.ir_free == XFS_INOBT_ALL_FREE &&
mp->m_sb.sb_inopblock <= XFS_INODES_PER_CHUNK) {
struct xfs_perag *pag = agbp->b_pag;
xic->deleted = true;
xic->first_ino = XFS_AGINO_TO_INO(mp, pag->pag_agno,
rec.ir_startino);
......@@ -2617,44 +2641,50 @@ xfs_ialloc_read_agi(
return 0;
}
/* Is there an inode record covering a given range of inode numbers? */
int
xfs_ialloc_has_inode_record(
/* How many inodes are backed by inode clusters ondisk? */
STATIC int
xfs_ialloc_count_ondisk(
struct xfs_btree_cur *cur,
xfs_agino_t low,
xfs_agino_t high,
bool *exists)
unsigned int *allocated)
{
struct xfs_inobt_rec_incore irec;
xfs_agino_t agino;
uint16_t holemask;
unsigned int ret = 0;
int has_record;
int i;
int error;
*exists = false;
error = xfs_inobt_lookup(cur, low, XFS_LOOKUP_LE, &has_record);
while (error == 0 && has_record) {
if (error)
return error;
while (has_record) {
unsigned int i, hole_idx;
error = xfs_inobt_get_rec(cur, &irec, &has_record);
if (error || irec.ir_startino > high)
if (error)
return error;
if (irec.ir_startino > high)
break;
agino = irec.ir_startino;
holemask = irec.ir_holemask;
for (i = 0; i < XFS_INOBT_HOLEMASK_BITS; holemask >>= 1,
i++, agino += XFS_INODES_PER_HOLEMASK_BIT) {
if (holemask & 1)
for (i = 0; i < XFS_INODES_PER_CHUNK; i++) {
if (irec.ir_startino + i < low)
continue;
if (agino + XFS_INODES_PER_HOLEMASK_BIT > low &&
agino <= high) {
*exists = true;
return 0;
}
if (irec.ir_startino + i > high)
break;
hole_idx = i / XFS_INODES_PER_HOLEMASK_BIT;
if (!(irec.ir_holemask & (1U << hole_idx)))
ret++;
}
error = xfs_btree_increment(cur, 0, &has_record);
}
if (error)
return error;
}
*allocated = ret;
return 0;
}
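/*
 * Editorial illustration (not part of this patch): with 64 inodes per
 * chunk and a 16-bit holemask, each holemask bit covers
 * XFS_INODES_PER_HOLEMASK_BIT (4) inodes.  A record with
 * ir_holemask = 0x0001 therefore describes a hole over
 * ir_startino .. ir_startino + 3; those indices map to hole_idx 0 and
 * are skipped, while the other 60 inodes in the chunk are counted as
 * backed by ondisk clusters.
 */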
/* Is there an inode record covering a given extent? */
......@@ -2663,15 +2693,27 @@ xfs_ialloc_has_inodes_at_extent(
struct xfs_btree_cur *cur,
xfs_agblock_t bno,
xfs_extlen_t len,
bool *exists)
enum xbtree_recpacking *outcome)
{
xfs_agino_t low;
xfs_agino_t high;
xfs_agino_t agino;
xfs_agino_t last_agino;
unsigned int allocated;
int error;
low = XFS_AGB_TO_AGINO(cur->bc_mp, bno);
high = XFS_AGB_TO_AGINO(cur->bc_mp, bno + len) - 1;
agino = XFS_AGB_TO_AGINO(cur->bc_mp, bno);
last_agino = XFS_AGB_TO_AGINO(cur->bc_mp, bno + len) - 1;
error = xfs_ialloc_count_ondisk(cur, agino, last_agino, &allocated);
if (error)
return error;
return xfs_ialloc_has_inode_record(cur, low, high, exists);
if (allocated == 0)
*outcome = XBTREE_RECPACKING_EMPTY;
else if (allocated == last_agino - agino + 1)
*outcome = XBTREE_RECPACKING_FULL;
else
*outcome = XBTREE_RECPACKING_SPARSE;
return 0;
}
struct xfs_ialloc_count_inodes {
......@@ -2688,8 +2730,13 @@ xfs_ialloc_count_inodes_rec(
{
struct xfs_inobt_rec_incore irec;
struct xfs_ialloc_count_inodes *ci = priv;
xfs_failaddr_t fa;
xfs_inobt_btrec_to_irec(cur->bc_mp, rec, &irec);
fa = xfs_inobt_check_irec(cur, &irec);
if (fa)
return xfs_inobt_complain_bad_rec(cur, fa, &irec);
ci->count += irec.ir_count;
ci->freecount += irec.ir_freecount;
......
......@@ -93,10 +93,11 @@ union xfs_btree_rec;
void xfs_inobt_btrec_to_irec(struct xfs_mount *mp,
const union xfs_btree_rec *rec,
struct xfs_inobt_rec_incore *irec);
xfs_failaddr_t xfs_inobt_check_irec(struct xfs_btree_cur *cur,
const struct xfs_inobt_rec_incore *irec);
int xfs_ialloc_has_inodes_at_extent(struct xfs_btree_cur *cur,
xfs_agblock_t bno, xfs_extlen_t len, bool *exists);
int xfs_ialloc_has_inode_record(struct xfs_btree_cur *cur, xfs_agino_t low,
xfs_agino_t high, bool *exists);
xfs_agblock_t bno, xfs_extlen_t len,
enum xbtree_recpacking *outcome);
int xfs_ialloc_count_inodes(struct xfs_btree_cur *cur, xfs_agino_t *count,
xfs_agino_t *freecount);
int xfs_inobt_insert_rec(struct xfs_btree_cur *cur, uint16_t holemask,
......
......@@ -156,9 +156,12 @@ __xfs_inobt_free_block(
struct xfs_buf *bp,
enum xfs_ag_resv_type resv)
{
xfs_fsblock_t fsbno;
xfs_inobt_mod_blockcount(cur, -1);
return xfs_free_extent(cur->bc_tp,
XFS_DADDR_TO_FSB(cur->bc_mp, xfs_buf_daddr(bp)), 1,
fsbno = XFS_DADDR_TO_FSB(cur->bc_mp, xfs_buf_daddr(bp));
return xfs_free_extent(cur->bc_tp, cur->bc_ag.pag,
XFS_FSB_TO_AGBNO(cur->bc_mp, fsbno), 1,
&XFS_RMAP_OINFO_INOBT, resv);
}
......@@ -266,8 +269,11 @@ STATIC int64_t
xfs_inobt_diff_two_keys(
struct xfs_btree_cur *cur,
const union xfs_btree_key *k1,
const union xfs_btree_key *k2)
const union xfs_btree_key *k2,
const union xfs_btree_key *mask)
{
ASSERT(!mask || mask->inobt.ir_startino);
return (int64_t)be32_to_cpu(k1->inobt.ir_startino) -
be32_to_cpu(k2->inobt.ir_startino);
}
......@@ -380,6 +386,19 @@ xfs_inobt_recs_inorder(
be32_to_cpu(r2->inobt.ir_startino);
}
STATIC enum xbtree_key_contig
xfs_inobt_keys_contiguous(
struct xfs_btree_cur *cur,
const union xfs_btree_key *key1,
const union xfs_btree_key *key2,
const union xfs_btree_key *mask)
{
ASSERT(!mask || mask->inobt.ir_startino);
return xbtree_key_contig(be32_to_cpu(key1->inobt.ir_startino),
be32_to_cpu(key2->inobt.ir_startino));
}
static const struct xfs_btree_ops xfs_inobt_ops = {
.rec_len = sizeof(xfs_inobt_rec_t),
.key_len = sizeof(xfs_inobt_key_t),
......@@ -399,6 +418,7 @@ static const struct xfs_btree_ops xfs_inobt_ops = {
.diff_two_keys = xfs_inobt_diff_two_keys,
.keys_inorder = xfs_inobt_keys_inorder,
.recs_inorder = xfs_inobt_recs_inorder,
.keys_contiguous = xfs_inobt_keys_contiguous,
};
static const struct xfs_btree_ops xfs_finobt_ops = {
......@@ -420,6 +440,7 @@ static const struct xfs_btree_ops xfs_finobt_ops = {
.diff_two_keys = xfs_inobt_diff_two_keys,
.keys_inorder = xfs_inobt_keys_inorder,
.recs_inorder = xfs_inobt_recs_inorder,
.keys_contiguous = xfs_inobt_keys_contiguous,
};
/*
......@@ -447,9 +468,7 @@ xfs_inobt_init_common(
if (xfs_has_crc(mp))
cur->bc_flags |= XFS_BTREE_CRC_BLOCKS;
/* take a reference for the cursor */
atomic_inc(&pag->pag_ref);
cur->bc_ag.pag = pag;
cur->bc_ag.pag = xfs_perag_hold(pag);
return cur;
}
......@@ -607,7 +626,7 @@ xfs_iallocbt_maxlevels_ondisk(void)
*/
uint64_t
xfs_inobt_irec_to_allocmask(
struct xfs_inobt_rec_incore *rec)
const struct xfs_inobt_rec_incore *rec)
{
uint64_t bitmap = 0;
uint64_t inodespbit;
......
......@@ -53,7 +53,7 @@ struct xfs_btree_cur *xfs_inobt_stage_cursor(struct xfs_perag *pag,
extern int xfs_inobt_maxrecs(struct xfs_mount *, int, int);
/* ir_holemask to inode allocation bitmap conversion */
uint64_t xfs_inobt_irec_to_allocmask(struct xfs_inobt_rec_incore *);
uint64_t xfs_inobt_irec_to_allocmask(const struct xfs_inobt_rec_incore *irec);
#if defined(DEBUG) || defined(XFS_WARN)
int xfs_inobt_rec_check_count(struct xfs_mount *,
......
......@@ -140,7 +140,8 @@ xfs_iformat_extents(
xfs_inode_verifier_error(ip, -EFSCORRUPTED,
"xfs_iformat_extents(2)",
dp, sizeof(*dp), fa);
return -EFSCORRUPTED;
return xfs_bmap_complain_bad_rec(ip, whichfork,
fa, &new);
}
xfs_iext_insert(ip, &icur, &new, state);
......@@ -226,10 +227,15 @@ xfs_iformat_data_fork(
/*
* Initialize the extent count early, as the per-format routines may
* depend on it.
* depend on it. Use release semantics to set needextents /after/ we
* set the format. This ensures that we can use acquire semantics on
* needextents in xfs_need_iread_extents() and be guaranteed to see a
* valid format value after that load.
*/
ip->i_df.if_format = dip->di_format;
ip->i_df.if_nextents = xfs_dfork_data_extents(dip);
smp_store_release(&ip->i_df.if_needextents,
ip->i_df.if_format == XFS_DINODE_FMT_BTREE ? 1 : 0);
switch (inode->i_mode & S_IFMT) {
case S_IFIFO:
......@@ -282,8 +288,17 @@ xfs_ifork_init_attr(
enum xfs_dinode_fmt format,
xfs_extnum_t nextents)
{
/*
* Initialize the extent count early, as the per-format routines may
* depend on it. Use release semantics to set needextents /after/ we
* set the format. This ensures that we can use acquire semantics on
* needextents in xfs_need_iread_extents() and be guaranteed to see a
* valid format value after that load.
*/
ip->i_af.if_format = format;
ip->i_af.if_nextents = nextents;
smp_store_release(&ip->i_af.if_needextents,
ip->i_af.if_format == XFS_DINODE_FMT_BTREE ? 1 : 0);
}
void
......
......@@ -24,6 +24,7 @@ struct xfs_ifork {
xfs_extnum_t if_nextents; /* # of extents in this fork */
short if_broot_bytes; /* bytes allocated for root */
int8_t if_format; /* format of this fork */
uint8_t if_needextents; /* extents have not been read */
};
/*
......@@ -260,9 +261,10 @@ int xfs_iext_count_upgrade(struct xfs_trans *tp, struct xfs_inode *ip,
uint nr_to_add);
/* returns true if the fork has extents but they are not read in yet. */
static inline bool xfs_need_iread_extents(struct xfs_ifork *ifp)
static inline bool xfs_need_iread_extents(const struct xfs_ifork *ifp)
{
return ifp->if_format == XFS_DINODE_FMT_BTREE && ifp->if_height == 0;
/* see xfs_iformat_{data,attr}_fork() for needextents semantics */
return smp_load_acquire(&ifp->if_needextents) != 0;
}
#endif /* __XFS_INODE_FORK_H__ */
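The if_needextents changes above pair a store-release in the fork
initialization paths with the load-acquire in xfs_need_iread_extents(), so a
reader that observes the flag also observes a valid fork format. Below is a
minimal sketch of that publish/consume pattern, assuming the kernel's
smp_store_release()/smp_load_acquire() helpers and a hypothetical structure;
it is editorial illustration only, not part of this commit:

struct publish_example {
	int8_t	format;		/* payload, written before publication */
	uint8_t	needextents;	/* publication flag */
};

static void example_set_format(struct publish_example *p, int8_t format)
{
	p->format = format;
	/* release: the format store above is visible before the flag */
	smp_store_release(&p->needextents,
			format == XFS_DINODE_FMT_BTREE ? 1 : 0);
}

static bool example_need_iread_extents(struct publish_example *p)
{
	/* acquire: pairs with the store-release above */
	return smp_load_acquire(&p->needextents) != 0;
}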
......@@ -120,51 +120,73 @@ xfs_refcount_btrec_to_irec(
irec->rc_refcount = be32_to_cpu(rec->refc.rc_refcount);
}
/*
* Get the data from the pointed-to record.
*/
int
xfs_refcount_get_rec(
/* Simple checks for refcount records. */
xfs_failaddr_t
xfs_refcount_check_irec(
struct xfs_btree_cur *cur,
struct xfs_refcount_irec *irec,
int *stat)
const struct xfs_refcount_irec *irec)
{
struct xfs_mount *mp = cur->bc_mp;
struct xfs_perag *pag = cur->bc_ag.pag;
union xfs_btree_rec *rec;
int error;
error = xfs_btree_get_rec(cur, &rec, stat);
if (error || !*stat)
return error;
xfs_refcount_btrec_to_irec(rec, irec);
if (irec->rc_blockcount == 0 || irec->rc_blockcount > MAXREFCEXTLEN)
goto out_bad_rec;
return __this_address;
if (!xfs_refcount_check_domain(irec))
goto out_bad_rec;
return __this_address;
/* check for valid extent range, including overflow */
if (!xfs_verify_agbext(pag, irec->rc_startblock, irec->rc_blockcount))
goto out_bad_rec;
return __this_address;
if (irec->rc_refcount == 0 || irec->rc_refcount > MAXREFCOUNT)
goto out_bad_rec;
return __this_address;
trace_xfs_refcount_get(cur->bc_mp, pag->pag_agno, irec);
return 0;
return NULL;
}
static inline int
xfs_refcount_complain_bad_rec(
struct xfs_btree_cur *cur,
xfs_failaddr_t fa,
const struct xfs_refcount_irec *irec)
{
struct xfs_mount *mp = cur->bc_mp;
out_bad_rec:
xfs_warn(mp,
"Refcount BTree record corruption in AG %d detected!",
pag->pag_agno);
"Refcount BTree record corruption in AG %d detected at %pS!",
cur->bc_ag.pag->pag_agno, fa);
xfs_warn(mp,
"Start block 0x%x, block count 0x%x, references 0x%x",
irec->rc_startblock, irec->rc_blockcount, irec->rc_refcount);
return -EFSCORRUPTED;
}
/*
* Get the data from the pointed-to record.
*/
int
xfs_refcount_get_rec(
struct xfs_btree_cur *cur,
struct xfs_refcount_irec *irec,
int *stat)
{
union xfs_btree_rec *rec;
xfs_failaddr_t fa;
int error;
error = xfs_btree_get_rec(cur, &rec, stat);
if (error || !*stat)
return error;
xfs_refcount_btrec_to_irec(rec, irec);
fa = xfs_refcount_check_irec(cur, irec);
if (fa)
return xfs_refcount_complain_bad_rec(cur, fa, irec);
trace_xfs_refcount_get(cur->bc_mp, cur->bc_ag.pag->pag_agno, irec);
return 0;
}
/*
* Update the record referred to by cur to the value given
* by [bno, len, refcount].
......@@ -1332,26 +1354,22 @@ xfs_refcount_finish_one(
xfs_agblock_t bno;
unsigned long nr_ops = 0;
int shape_changes = 0;
struct xfs_perag *pag;
pag = xfs_perag_get(mp, XFS_FSB_TO_AGNO(mp, ri->ri_startblock));
bno = XFS_FSB_TO_AGBNO(mp, ri->ri_startblock);
trace_xfs_refcount_deferred(mp, XFS_FSB_TO_AGNO(mp, ri->ri_startblock),
ri->ri_type, XFS_FSB_TO_AGBNO(mp, ri->ri_startblock),
ri->ri_blockcount);
if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_REFCOUNT_FINISH_ONE)) {
error = -EIO;
goto out_drop;
}
if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_REFCOUNT_FINISH_ONE))
return -EIO;
/*
* If we haven't gotten a cursor or the cursor AG doesn't match
* the startblock, get one now.
*/
rcur = *pcur;
if (rcur != NULL && rcur->bc_ag.pag != pag) {
if (rcur != NULL && rcur->bc_ag.pag != ri->ri_pag) {
nr_ops = rcur->bc_ag.refc.nr_ops;
shape_changes = rcur->bc_ag.refc.shape_changes;
xfs_refcount_finish_one_cleanup(tp, rcur, 0);
......@@ -1359,12 +1377,12 @@ xfs_refcount_finish_one(
*pcur = NULL;
}
if (rcur == NULL) {
error = xfs_alloc_read_agf(pag, tp, XFS_ALLOC_FLAG_FREEING,
&agbp);
error = xfs_alloc_read_agf(ri->ri_pag, tp,
XFS_ALLOC_FLAG_FREEING, &agbp);
if (error)
goto out_drop;
return error;
rcur = xfs_refcountbt_init_cursor(mp, tp, agbp, pag);
rcur = xfs_refcountbt_init_cursor(mp, tp, agbp, ri->ri_pag);
rcur->bc_ag.refc.nr_ops = nr_ops;
rcur->bc_ag.refc.shape_changes = shape_changes;
}
......@@ -1375,7 +1393,7 @@ xfs_refcount_finish_one(
error = xfs_refcount_adjust(rcur, &bno, &ri->ri_blockcount,
XFS_REFCOUNT_ADJUST_INCREASE);
if (error)
goto out_drop;
return error;
if (ri->ri_blockcount > 0)
error = xfs_refcount_continue_op(rcur, ri, bno);
break;
......@@ -1383,31 +1401,29 @@ xfs_refcount_finish_one(
error = xfs_refcount_adjust(rcur, &bno, &ri->ri_blockcount,
XFS_REFCOUNT_ADJUST_DECREASE);
if (error)
goto out_drop;
return error;
if (ri->ri_blockcount > 0)
error = xfs_refcount_continue_op(rcur, ri, bno);
break;
case XFS_REFCOUNT_ALLOC_COW:
error = __xfs_refcount_cow_alloc(rcur, bno, ri->ri_blockcount);
if (error)
goto out_drop;
return error;
ri->ri_blockcount = 0;
break;
case XFS_REFCOUNT_FREE_COW:
error = __xfs_refcount_cow_free(rcur, bno, ri->ri_blockcount);
if (error)
goto out_drop;
return error;
ri->ri_blockcount = 0;
break;
default:
ASSERT(0);
error = -EFSCORRUPTED;
return -EFSCORRUPTED;
}
if (!error && ri->ri_blockcount > 0)
trace_xfs_refcount_finish_one_leftover(mp, pag->pag_agno,
trace_xfs_refcount_finish_one_leftover(mp, ri->ri_pag->pag_agno,
ri->ri_type, bno, ri->ri_blockcount);
out_drop:
xfs_perag_put(pag);
return error;
}
......@@ -1435,6 +1451,7 @@ __xfs_refcount_add(
ri->ri_startblock = startblock;
ri->ri_blockcount = blockcount;
xfs_refcount_update_get_group(tp->t_mountp, ri);
xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_REFCOUNT, &ri->ri_list);
}
......@@ -1876,7 +1893,8 @@ xfs_refcount_recover_extent(
INIT_LIST_HEAD(&rr->rr_list);
xfs_refcount_btrec_to_irec(rec, &rr->rr_rrec);
if (XFS_IS_CORRUPT(cur->bc_mp,
if (xfs_refcount_check_irec(cur, &rr->rr_rrec) != NULL ||
XFS_IS_CORRUPT(cur->bc_mp,
rr->rr_rrec.rc_domain != XFS_REFC_DOMAIN_COW)) {
kfree(rr);
return -EFSCORRUPTED;
......@@ -1980,14 +1998,17 @@ xfs_refcount_recover_cow_leftovers(
return error;
}
/* Is there a record covering a given extent? */
/*
* Scan part of the keyspace of the refcount records and tell us if the area
* has no records, is fully mapped by records, or is partially filled.
*/
int
xfs_refcount_has_record(
xfs_refcount_has_records(
struct xfs_btree_cur *cur,
enum xfs_refc_domain domain,
xfs_agblock_t bno,
xfs_extlen_t len,
bool *exists)
enum xbtree_recpacking *outcome)
{
union xfs_btree_irec low;
union xfs_btree_irec high;
......@@ -1998,7 +2019,7 @@ xfs_refcount_has_record(
high.rc.rc_startblock = bno + len - 1;
low.rc.rc_domain = high.rc.rc_domain = domain;
return xfs_btree_has_record(cur, &low, &high, exists);
return xfs_btree_has_records(cur, &low, &high, NULL, outcome);
}
int __init
......
......@@ -50,6 +50,7 @@ enum xfs_refcount_intent_type {
struct xfs_refcount_intent {
struct list_head ri_list;
struct xfs_perag *ri_pag;
enum xfs_refcount_intent_type ri_type;
xfs_extlen_t ri_blockcount;
xfs_fsblock_t ri_startblock;
......@@ -67,6 +68,9 @@ xfs_refcount_check_domain(
return true;
}
void xfs_refcount_update_get_group(struct xfs_mount *mp,
struct xfs_refcount_intent *ri);
void xfs_refcount_increase_extent(struct xfs_trans *tp,
struct xfs_bmbt_irec *irec);
void xfs_refcount_decrease_extent(struct xfs_trans *tp,
......@@ -107,12 +111,14 @@ extern int xfs_refcount_recover_cow_leftovers(struct xfs_mount *mp,
*/
#define XFS_REFCOUNT_ITEM_OVERHEAD 32
extern int xfs_refcount_has_record(struct xfs_btree_cur *cur,
extern int xfs_refcount_has_records(struct xfs_btree_cur *cur,
enum xfs_refc_domain domain, xfs_agblock_t bno,
xfs_extlen_t len, bool *exists);
xfs_extlen_t len, enum xbtree_recpacking *outcome);
union xfs_btree_rec;
extern void xfs_refcount_btrec_to_irec(const union xfs_btree_rec *rec,
struct xfs_refcount_irec *irec);
xfs_failaddr_t xfs_refcount_check_irec(struct xfs_btree_cur *cur,
const struct xfs_refcount_irec *irec);
extern int xfs_refcount_insert(struct xfs_btree_cur *cur,
struct xfs_refcount_irec *irec, int *stat);
......
......@@ -112,8 +112,9 @@ xfs_refcountbt_free_block(
XFS_FSB_TO_AGBNO(cur->bc_mp, fsbno), 1);
be32_add_cpu(&agf->agf_refcount_blocks, -1);
xfs_alloc_log_agf(cur->bc_tp, agbp, XFS_AGF_REFCOUNT_BLOCKS);
error = xfs_free_extent(cur->bc_tp, fsbno, 1, &XFS_RMAP_OINFO_REFC,
XFS_AG_RESV_METADATA);
error = xfs_free_extent(cur->bc_tp, cur->bc_ag.pag,
XFS_FSB_TO_AGBNO(cur->bc_mp, fsbno), 1,
&XFS_RMAP_OINFO_REFC, XFS_AG_RESV_METADATA);
if (error)
return error;
......@@ -201,8 +202,11 @@ STATIC int64_t
xfs_refcountbt_diff_two_keys(
struct xfs_btree_cur *cur,
const union xfs_btree_key *k1,
const union xfs_btree_key *k2)
const union xfs_btree_key *k2,
const union xfs_btree_key *mask)
{
ASSERT(!mask || mask->refc.rc_startblock);
return (int64_t)be32_to_cpu(k1->refc.rc_startblock) -
be32_to_cpu(k2->refc.rc_startblock);
}
......@@ -299,6 +303,19 @@ xfs_refcountbt_recs_inorder(
be32_to_cpu(r2->refc.rc_startblock);
}
STATIC enum xbtree_key_contig
xfs_refcountbt_keys_contiguous(
struct xfs_btree_cur *cur,
const union xfs_btree_key *key1,
const union xfs_btree_key *key2,
const union xfs_btree_key *mask)
{
ASSERT(!mask || mask->refc.rc_startblock);
return xbtree_key_contig(be32_to_cpu(key1->refc.rc_startblock),
be32_to_cpu(key2->refc.rc_startblock));
}
static const struct xfs_btree_ops xfs_refcountbt_ops = {
.rec_len = sizeof(struct xfs_refcount_rec),
.key_len = sizeof(struct xfs_refcount_key),
......@@ -318,6 +335,7 @@ static const struct xfs_btree_ops xfs_refcountbt_ops = {
.diff_two_keys = xfs_refcountbt_diff_two_keys,
.keys_inorder = xfs_refcountbt_keys_inorder,
.recs_inorder = xfs_refcountbt_recs_inorder,
.keys_contiguous = xfs_refcountbt_keys_contiguous,
};
/*
......@@ -339,10 +357,7 @@ xfs_refcountbt_init_common(
cur->bc_flags |= XFS_BTREE_CRC_BLOCKS;
/* take a reference for the cursor */
atomic_inc(&pag->pag_ref);
cur->bc_ag.pag = pag;
cur->bc_ag.pag = xfs_perag_hold(pag);
cur->bc_ag.refc.nr_ops = 0;
cur->bc_ag.refc.shape_changes = 0;
cur->bc_ops = &xfs_refcountbt_ops;
......
......@@ -62,13 +62,14 @@ xfs_rmap_irec_offset_pack(
return x;
}
static inline int
static inline xfs_failaddr_t
xfs_rmap_irec_offset_unpack(
__u64 offset,
struct xfs_rmap_irec *irec)
{
if (offset & ~(XFS_RMAP_OFF_MASK | XFS_RMAP_OFF_FLAGS))
return -EFSCORRUPTED;
return __this_address;
irec->rm_offset = XFS_RMAP_OFF(offset);
irec->rm_flags = 0;
if (offset & XFS_RMAP_OFF_ATTR_FORK)
......@@ -77,7 +78,7 @@ xfs_rmap_irec_offset_unpack(
irec->rm_flags |= XFS_RMAP_BMBT_BLOCK;
if (offset & XFS_RMAP_OFF_UNWRITTEN)
irec->rm_flags |= XFS_RMAP_UNWRITTEN;
return 0;
return NULL;
}
static inline void
......@@ -162,8 +163,12 @@ struct xfs_rmap_intent {
int ri_whichfork;
uint64_t ri_owner;
struct xfs_bmbt_irec ri_bmap;
struct xfs_perag *ri_pag;
};
void xfs_rmap_update_get_group(struct xfs_mount *mp,
struct xfs_rmap_intent *ri);
/* functions for updating the rmapbt based on bmbt map/unmap operations */
void xfs_rmap_map_extent(struct xfs_trans *tp, struct xfs_inode *ip,
int whichfork, struct xfs_bmbt_irec *imap);
......@@ -188,16 +193,31 @@ int xfs_rmap_lookup_le_range(struct xfs_btree_cur *cur, xfs_agblock_t bno,
int xfs_rmap_compare(const struct xfs_rmap_irec *a,
const struct xfs_rmap_irec *b);
union xfs_btree_rec;
int xfs_rmap_btrec_to_irec(const union xfs_btree_rec *rec,
xfs_failaddr_t xfs_rmap_btrec_to_irec(const union xfs_btree_rec *rec,
struct xfs_rmap_irec *irec);
int xfs_rmap_has_record(struct xfs_btree_cur *cur, xfs_agblock_t bno,
xfs_extlen_t len, bool *exists);
int xfs_rmap_record_exists(struct xfs_btree_cur *cur, xfs_agblock_t bno,
xfs_failaddr_t xfs_rmap_check_irec(struct xfs_btree_cur *cur,
const struct xfs_rmap_irec *irec);
int xfs_rmap_has_records(struct xfs_btree_cur *cur, xfs_agblock_t bno,
xfs_extlen_t len, enum xbtree_recpacking *outcome);
struct xfs_rmap_matches {
/* Number of owner matches. */
unsigned long long matches;
/* Number of non-owner matches. */
unsigned long long non_owner_matches;
/* Number of non-owner matches that conflict with the owner matches. */
unsigned long long bad_non_owner_matches;
};
int xfs_rmap_count_owners(struct xfs_btree_cur *cur, xfs_agblock_t bno,
xfs_extlen_t len, const struct xfs_owner_info *oinfo,
bool *has_rmap);
struct xfs_rmap_matches *rmatch);
int xfs_rmap_has_other_keys(struct xfs_btree_cur *cur, xfs_agblock_t bno,
xfs_extlen_t len, const struct xfs_owner_info *oinfo,
bool *has_rmap);
bool *has_other);
int xfs_rmap_map_raw(struct xfs_btree_cur *cur, struct xfs_rmap_irec *rmap);
extern const struct xfs_owner_info XFS_RMAP_OINFO_SKIP_UPDATE;
......
......@@ -156,6 +156,16 @@ xfs_rmapbt_get_maxrecs(
return cur->bc_mp->m_rmap_mxr[level != 0];
}
/*
* Convert the ondisk record's offset field into the ondisk key's offset field.
* Fork and bmbt are significant parts of the rmap record key, but written
* status is merely a record attribute.
*/
static inline __be64 ondisk_rec_offset_to_key(const union xfs_btree_rec *rec)
{
return rec->rmap.rm_offset & ~cpu_to_be64(XFS_RMAP_OFF_UNWRITTEN);
}
STATIC void
xfs_rmapbt_init_key_from_rec(
union xfs_btree_key *key,
......@@ -163,7 +173,7 @@ xfs_rmapbt_init_key_from_rec(
{
key->rmap.rm_startblock = rec->rmap.rm_startblock;
key->rmap.rm_owner = rec->rmap.rm_owner;
key->rmap.rm_offset = rec->rmap.rm_offset;
key->rmap.rm_offset = ondisk_rec_offset_to_key(rec);
}
/*
......@@ -186,7 +196,7 @@ xfs_rmapbt_init_high_key_from_rec(
key->rmap.rm_startblock = rec->rmap.rm_startblock;
be32_add_cpu(&key->rmap.rm_startblock, adj);
key->rmap.rm_owner = rec->rmap.rm_owner;
key->rmap.rm_offset = rec->rmap.rm_offset;
key->rmap.rm_offset = ondisk_rec_offset_to_key(rec);
if (XFS_RMAP_NON_INODE_OWNER(be64_to_cpu(rec->rmap.rm_owner)) ||
XFS_RMAP_IS_BMBT_BLOCK(be64_to_cpu(rec->rmap.rm_offset)))
return;
......@@ -219,6 +229,16 @@ xfs_rmapbt_init_ptr_from_cur(
ptr->s = agf->agf_roots[cur->bc_btnum];
}
/*
* Mask the appropriate parts of the ondisk key field for a key comparison.
* Fork and bmbt are significant parts of the rmap record key, but written
* status is merely a record attribute.
*/
static inline uint64_t offset_keymask(uint64_t offset)
{
return offset & ~XFS_RMAP_OFF_UNWRITTEN;
}
STATIC int64_t
xfs_rmapbt_key_diff(
struct xfs_btree_cur *cur,
......@@ -240,8 +260,8 @@ xfs_rmapbt_key_diff(
else if (y > x)
return -1;
x = XFS_RMAP_OFF(be64_to_cpu(kp->rm_offset));
y = rec->rm_offset;
x = offset_keymask(be64_to_cpu(kp->rm_offset));
y = offset_keymask(xfs_rmap_irec_offset_pack(rec));
if (x > y)
return 1;
else if (y > x)
......@@ -253,31 +273,43 @@ STATIC int64_t
xfs_rmapbt_diff_two_keys(
struct xfs_btree_cur *cur,
const union xfs_btree_key *k1,
const union xfs_btree_key *k2)
const union xfs_btree_key *k2,
const union xfs_btree_key *mask)
{
const struct xfs_rmap_key *kp1 = &k1->rmap;
const struct xfs_rmap_key *kp2 = &k2->rmap;
int64_t d;
__u64 x, y;
/* Doesn't make sense to mask off the physical space part */
ASSERT(!mask || mask->rmap.rm_startblock);
d = (int64_t)be32_to_cpu(kp1->rm_startblock) -
be32_to_cpu(kp2->rm_startblock);
if (d)
return d;
if (!mask || mask->rmap.rm_owner) {
x = be64_to_cpu(kp1->rm_owner);
y = be64_to_cpu(kp2->rm_owner);
if (x > y)
return 1;
else if (y > x)
return -1;
}
x = XFS_RMAP_OFF(be64_to_cpu(kp1->rm_offset));
y = XFS_RMAP_OFF(be64_to_cpu(kp2->rm_offset));
if (!mask || mask->rmap.rm_offset) {
/* Doesn't make sense to allow offset but not owner */
ASSERT(!mask || mask->rmap.rm_owner);
x = offset_keymask(be64_to_cpu(kp1->rm_offset));
y = offset_keymask(be64_to_cpu(kp2->rm_offset));
if (x > y)
return 1;
else if (y > x)
return -1;
}
return 0;
}
......@@ -387,8 +419,8 @@ xfs_rmapbt_keys_inorder(
return 1;
else if (a > b)
return 0;
a = XFS_RMAP_OFF(be64_to_cpu(k1->rmap.rm_offset));
b = XFS_RMAP_OFF(be64_to_cpu(k2->rmap.rm_offset));
a = offset_keymask(be64_to_cpu(k1->rmap.rm_offset));
b = offset_keymask(be64_to_cpu(k2->rmap.rm_offset));
if (a <= b)
return 1;
return 0;
......@@ -417,13 +449,33 @@ xfs_rmapbt_recs_inorder(
return 1;
else if (a > b)
return 0;
a = XFS_RMAP_OFF(be64_to_cpu(r1->rmap.rm_offset));
b = XFS_RMAP_OFF(be64_to_cpu(r2->rmap.rm_offset));
a = offset_keymask(be64_to_cpu(r1->rmap.rm_offset));
b = offset_keymask(be64_to_cpu(r2->rmap.rm_offset));
if (a <= b)
return 1;
return 0;
}
STATIC enum xbtree_key_contig
xfs_rmapbt_keys_contiguous(
struct xfs_btree_cur *cur,
const union xfs_btree_key *key1,
const union xfs_btree_key *key2,
const union xfs_btree_key *mask)
{
ASSERT(!mask || mask->rmap.rm_startblock);
/*
* We only support checking contiguity of the physical space component.
* If any callers ever need more specificity than that, they'll have to
* implement it here.
*/
ASSERT(!mask || (!mask->rmap.rm_owner && !mask->rmap.rm_offset));
return xbtree_key_contig(be32_to_cpu(key1->rmap.rm_startblock),
be32_to_cpu(key2->rmap.rm_startblock));
}
static const struct xfs_btree_ops xfs_rmapbt_ops = {
.rec_len = sizeof(struct xfs_rmap_rec),
.key_len = 2 * sizeof(struct xfs_rmap_key),
......@@ -443,6 +495,7 @@ static const struct xfs_btree_ops xfs_rmapbt_ops = {
.diff_two_keys = xfs_rmapbt_diff_two_keys,
.keys_inorder = xfs_rmapbt_keys_inorder,
.recs_inorder = xfs_rmapbt_recs_inorder,
.keys_contiguous = xfs_rmapbt_keys_contiguous,
};
static struct xfs_btree_cur *
......@@ -460,10 +513,7 @@ xfs_rmapbt_init_common(
cur->bc_statoff = XFS_STATS_CALC_INDEX(xs_rmap_2);
cur->bc_ops = &xfs_rmapbt_ops;
/* take a reference for the cursor */
atomic_inc(&pag->pag_ref);
cur->bc_ag.pag = pag;
cur->bc_ag.pag = xfs_perag_hold(pag);
return cur;
}
......
......@@ -72,7 +72,8 @@ xfs_sb_validate_v5_features(
}
/*
* We support all XFS versions newer than a v4 superblock with V2 directories.
We currently support XFS v5 formats with known features and v4 superblocks with
* at least V2 directories.
*/
bool
xfs_sb_good_version(
......@@ -86,16 +87,16 @@ xfs_sb_good_version(
if (xfs_sb_is_v5(sbp))
return xfs_sb_validate_v5_features(sbp);
/* versions prior to v4 are not supported */
if (XFS_SB_VERSION_NUM(sbp) != XFS_SB_VERSION_4)
return false;
/* We must not have any unknown v4 feature bits set */
if ((sbp->sb_versionnum & ~XFS_SB_VERSION_OKBITS) ||
((sbp->sb_versionnum & XFS_SB_VERSION_MOREBITSBIT) &&
(sbp->sb_features2 & ~XFS_SB_VERSION2_OKBITS)))
return false;
/* versions prior to v4 are not supported */
if (XFS_SB_VERSION_NUM(sbp) < XFS_SB_VERSION_4)
return false;
/* V4 filesystems need v2 directories and unwritten extents */
if (!(sbp->sb_versionnum & XFS_SB_VERSION_DIRV2BIT))
return false;
......
......@@ -204,6 +204,18 @@ enum xfs_ag_resv_type {
XFS_AG_RESV_RMAPBT,
};
/* Results of scanning a btree keyspace to check occupancy. */
enum xbtree_recpacking {
/* None of the keyspace maps to records. */
XBTREE_RECPACKING_EMPTY = 0,
/* Some, but not all, of the keyspace maps to records. */
XBTREE_RECPACKING_SPARSE,
/* The entire keyspace maps to records. */
XBTREE_RECPACKING_FULL,
};
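For illustration, a sketch (an assumption, not code from this merge) of how a cross-reference check might act on the three xbtree_recpacking outcomes returned by the new keyspace scanners such as xfs_rmap_has_records():

/*
 * Hypothetical helper: a range expected to have no reverse mappings at all
 * is flagged if the keyspace scan finds it sparsely or fully covered.
 */
static void
xchk_demo_xref_no_rmaps(
	struct xfs_scrub	*sc,
	struct xfs_btree_cur	*cur,
	xfs_agblock_t		agbno,
	xfs_extlen_t		len)
{
	enum xbtree_recpacking	outcome;
	int			error;

	error = xfs_rmap_has_records(cur, agbno, len, &outcome);
	if (error)
		return;
	if (outcome != XBTREE_RECPACKING_EMPTY)
		xchk_btree_xref_set_corrupt(sc, cur, 0);
}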
/*
* Type verifier functions
*/
......
// SPDX-License-Identifier: GPL-2.0+
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Copyright (C) 2017 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <darrick.wong@oracle.com>
* Copyright (C) 2017-2023 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <djwong@kernel.org>
*/
#include "xfs.h"
#include "xfs_fs.h"
......@@ -18,6 +18,15 @@
#include "scrub/scrub.h"
#include "scrub/common.h"
int
xchk_setup_agheader(
struct xfs_scrub *sc)
{
if (xchk_need_intent_drain(sc))
xchk_fsgates_enable(sc, XCHK_FSGATES_DRAIN);
return xchk_setup_fs(sc);
}
/* Superblock */
/* Cross-reference with the other btrees. */
......@@ -42,8 +51,9 @@ xchk_superblock_xref(
xchk_xref_is_used_space(sc, agbno, 1);
xchk_xref_is_not_inode_chunk(sc, agbno, 1);
xchk_xref_is_owned_by(sc, agbno, 1, &XFS_RMAP_OINFO_FS);
xchk_xref_is_only_owned_by(sc, agbno, 1, &XFS_RMAP_OINFO_FS);
xchk_xref_is_not_shared(sc, agbno, 1);
xchk_xref_is_not_cow_staging(sc, agbno, 1);
/* scrub teardown will take care of sc->sa for us */
}
......@@ -505,9 +515,10 @@ xchk_agf_xref(
xchk_agf_xref_freeblks(sc);
xchk_agf_xref_cntbt(sc);
xchk_xref_is_not_inode_chunk(sc, agbno, 1);
xchk_xref_is_owned_by(sc, agbno, 1, &XFS_RMAP_OINFO_FS);
xchk_xref_is_only_owned_by(sc, agbno, 1, &XFS_RMAP_OINFO_FS);
xchk_agf_xref_btreeblks(sc);
xchk_xref_is_not_shared(sc, agbno, 1);
xchk_xref_is_not_cow_staging(sc, agbno, 1);
xchk_agf_xref_refcblks(sc);
/* scrub teardown will take care of sc->sa for us */
......@@ -633,8 +644,9 @@ xchk_agfl_block_xref(
xchk_xref_is_used_space(sc, agbno, 1);
xchk_xref_is_not_inode_chunk(sc, agbno, 1);
xchk_xref_is_owned_by(sc, agbno, 1, &XFS_RMAP_OINFO_AG);
xchk_xref_is_only_owned_by(sc, agbno, 1, &XFS_RMAP_OINFO_AG);
xchk_xref_is_not_shared(sc, agbno, 1);
xchk_xref_is_not_cow_staging(sc, agbno, 1);
}
/* Scrub an AGFL block. */
......@@ -689,8 +701,9 @@ xchk_agfl_xref(
xchk_xref_is_used_space(sc, agbno, 1);
xchk_xref_is_not_inode_chunk(sc, agbno, 1);
xchk_xref_is_owned_by(sc, agbno, 1, &XFS_RMAP_OINFO_FS);
xchk_xref_is_only_owned_by(sc, agbno, 1, &XFS_RMAP_OINFO_FS);
xchk_xref_is_not_shared(sc, agbno, 1);
xchk_xref_is_not_cow_staging(sc, agbno, 1);
/*
* Scrub teardown will take care of sc->sa for us. Leave sc->sa
......@@ -844,8 +857,9 @@ xchk_agi_xref(
xchk_xref_is_used_space(sc, agbno, 1);
xchk_xref_is_not_inode_chunk(sc, agbno, 1);
xchk_agi_xref_icounts(sc);
xchk_xref_is_owned_by(sc, agbno, 1, &XFS_RMAP_OINFO_FS);
xchk_xref_is_only_owned_by(sc, agbno, 1, &XFS_RMAP_OINFO_FS);
xchk_xref_is_not_shared(sc, agbno, 1);
xchk_xref_is_not_cow_staging(sc, agbno, 1);
xchk_agi_xref_fiblocks(sc);
/* scrub teardown will take care of sc->sa for us */
......
// SPDX-License-Identifier: GPL-2.0+
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Copyright (C) 2018 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <darrick.wong@oracle.com>
* Copyright (C) 2018-2023 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <djwong@kernel.org>
*/
#include "xfs.h"
#include "xfs_fs.h"
......@@ -487,10 +487,11 @@ xrep_agfl_walk_rmap(
/* Strike out the blocks that are cross-linked according to the rmapbt. */
STATIC int
xrep_agfl_check_extent(
struct xrep_agfl *ra,
uint64_t start,
uint64_t len)
uint64_t len,
void *priv)
{
struct xrep_agfl *ra = priv;
xfs_agblock_t agbno = XFS_FSB_TO_AGBNO(ra->sc->mp, start);
xfs_agblock_t last_agbno = agbno + len - 1;
int error;
......@@ -538,7 +539,6 @@ xrep_agfl_collect_blocks(
struct xrep_agfl ra;
struct xfs_mount *mp = sc->mp;
struct xfs_btree_cur *cur;
struct xbitmap_range *br, *n;
int error;
ra.sc = sc;
......@@ -579,11 +579,7 @@ xrep_agfl_collect_blocks(
/* Strike out the blocks that are cross-linked. */
ra.rmap_cur = xfs_rmapbt_init_cursor(mp, sc->tp, agf_bp, sc->sa.pag);
for_each_xbitmap_extent(br, n, agfl_extents) {
error = xrep_agfl_check_extent(&ra, br->start, br->len);
if (error)
break;
}
error = xbitmap_walk(agfl_extents, xrep_agfl_check_extent, &ra);
xfs_btree_del_cursor(ra.rmap_cur, error);
if (error)
goto out_bmp;
......@@ -629,21 +625,58 @@ xrep_agfl_update_agf(
XFS_AGF_FLFIRST | XFS_AGF_FLLAST | XFS_AGF_FLCOUNT);
}
struct xrep_agfl_fill {
struct xbitmap used_extents;
struct xfs_scrub *sc;
__be32 *agfl_bno;
xfs_agblock_t flcount;
unsigned int fl_off;
};
/* Fill the AGFL with whatever blocks are in this extent. */
static int
xrep_agfl_fill(
uint64_t start,
uint64_t len,
void *priv)
{
struct xrep_agfl_fill *af = priv;
struct xfs_scrub *sc = af->sc;
xfs_fsblock_t fsbno = start;
int error;
while (fsbno < start + len && af->fl_off < af->flcount)
af->agfl_bno[af->fl_off++] =
cpu_to_be32(XFS_FSB_TO_AGBNO(sc->mp, fsbno++));
trace_xrep_agfl_insert(sc->mp, sc->sa.pag->pag_agno,
XFS_FSB_TO_AGBNO(sc->mp, start), len);
error = xbitmap_set(&af->used_extents, start, fsbno - 1);
if (error)
return error;
if (af->fl_off == af->flcount)
return -ECANCELED;
return 0;
}
/* Write out a totally new AGFL. */
STATIC void
STATIC int
xrep_agfl_init_header(
struct xfs_scrub *sc,
struct xfs_buf *agfl_bp,
struct xbitmap *agfl_extents,
xfs_agblock_t flcount)
{
struct xrep_agfl_fill af = {
.sc = sc,
.flcount = flcount,
};
struct xfs_mount *mp = sc->mp;
__be32 *agfl_bno;
struct xbitmap_range *br;
struct xbitmap_range *n;
struct xfs_agfl *agfl;
xfs_agblock_t agbno;
unsigned int fl_off;
int error;
ASSERT(flcount <= xfs_agfl_size(mp));
......@@ -662,36 +695,18 @@ xrep_agfl_init_header(
* blocks than fit in the AGFL, they will be freed in a subsequent
* step.
*/
fl_off = 0;
agfl_bno = xfs_buf_to_agfl_bno(agfl_bp);
for_each_xbitmap_extent(br, n, agfl_extents) {
agbno = XFS_FSB_TO_AGBNO(mp, br->start);
trace_xrep_agfl_insert(mp, sc->sa.pag->pag_agno, agbno,
br->len);
while (br->len > 0 && fl_off < flcount) {
agfl_bno[fl_off] = cpu_to_be32(agbno);
fl_off++;
agbno++;
/*
* We've now used br->start by putting it in the AGFL,
* so bump br so that we don't reap the block later.
*/
br->start++;
br->len--;
}
if (br->len)
break;
list_del(&br->list);
kfree(br);
}
xbitmap_init(&af.used_extents);
af.agfl_bno = xfs_buf_to_agfl_bno(agfl_bp);
xbitmap_walk(agfl_extents, xrep_agfl_fill, &af);
error = xbitmap_disunion(agfl_extents, &af.used_extents);
if (error)
return error;
/* Write new AGFL to disk. */
xfs_trans_buf_set_type(sc->tp, agfl_bp, XFS_BLFT_AGFL_BUF);
xfs_trans_log_buf(sc->tp, agfl_bp, 0, BBTOB(agfl_bp->b_length) - 1);
xbitmap_destroy(&af.used_extents);
return 0;
}
/* Repair the AGFL. */
......@@ -744,7 +759,9 @@ xrep_agfl(
* buffers until we know that part works.
*/
xrep_agfl_update_agf(sc, agf_bp, flcount);
xrep_agfl_init_header(sc, agfl_bp, &agfl_extents, flcount);
error = xrep_agfl_init_header(sc, agfl_bp, &agfl_extents, flcount);
if (error)
goto err;
/*
* Ok, the AGFL should be ready to go now. Roll the transaction to
......
// SPDX-License-Identifier: GPL-2.0+
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Copyright (C) 2017 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <darrick.wong@oracle.com>
* Copyright (C) 2017-2023 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <djwong@kernel.org>
*/
#include "xfs.h"
#include "xfs_fs.h"
......@@ -24,10 +24,19 @@ int
xchk_setup_ag_allocbt(
struct xfs_scrub *sc)
{
if (xchk_need_intent_drain(sc))
xchk_fsgates_enable(sc, XCHK_FSGATES_DRAIN);
return xchk_setup_ag_btree(sc, false);
}
/* Free space btree scrubber. */
struct xchk_alloc {
/* Previous free space extent. */
struct xfs_alloc_rec_incore prev;
};
/*
* Ensure there's a corresponding cntbt/bnobt record matching this
* bnobt/cntbt record, respectively.
......@@ -75,9 +84,11 @@ xchk_allocbt_xref_other(
STATIC void
xchk_allocbt_xref(
struct xfs_scrub *sc,
xfs_agblock_t agbno,
xfs_extlen_t len)
const struct xfs_alloc_rec_incore *irec)
{
xfs_agblock_t agbno = irec->ar_startblock;
xfs_extlen_t len = irec->ar_blockcount;
if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
return;
......@@ -85,6 +96,25 @@ xchk_allocbt_xref(
xchk_xref_is_not_inode_chunk(sc, agbno, len);
xchk_xref_has_no_owner(sc, agbno, len);
xchk_xref_is_not_shared(sc, agbno, len);
xchk_xref_is_not_cow_staging(sc, agbno, len);
}
/* Flag failures for records that could be merged. */
STATIC void
xchk_allocbt_mergeable(
struct xchk_btree *bs,
struct xchk_alloc *ca,
const struct xfs_alloc_rec_incore *irec)
{
if (bs->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
return;
if (ca->prev.ar_blockcount > 0 &&
ca->prev.ar_startblock + ca->prev.ar_blockcount == irec->ar_startblock &&
ca->prev.ar_blockcount + irec->ar_blockcount < (uint32_t)~0U)
xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
memcpy(&ca->prev, irec, sizeof(*irec));
}
/* Scrub a bnobt/cntbt record. */
......@@ -93,17 +123,17 @@ xchk_allocbt_rec(
struct xchk_btree *bs,
const union xfs_btree_rec *rec)
{
struct xfs_perag *pag = bs->cur->bc_ag.pag;
xfs_agblock_t bno;
xfs_extlen_t len;
bno = be32_to_cpu(rec->alloc.ar_startblock);
len = be32_to_cpu(rec->alloc.ar_blockcount);
struct xfs_alloc_rec_incore irec;
struct xchk_alloc *ca = bs->private;
if (!xfs_verify_agbext(pag, bno, len))
xfs_alloc_btrec_to_irec(rec, &irec);
if (xfs_alloc_check_irec(bs->cur, &irec) != NULL) {
xchk_btree_set_corrupt(bs->sc, bs->cur, 0);
return 0;
}
xchk_allocbt_xref(bs->sc, bno, len);
xchk_allocbt_mergeable(bs, ca, &irec);
xchk_allocbt_xref(bs->sc, &irec);
return 0;
}
......@@ -114,10 +144,11 @@ xchk_allocbt(
struct xfs_scrub *sc,
xfs_btnum_t which)
{
struct xchk_alloc ca = { };
struct xfs_btree_cur *cur;
cur = which == XFS_BTNUM_BNO ? sc->sa.bno_cur : sc->sa.cnt_cur;
return xchk_btree(sc, cur, xchk_allocbt_rec, &XFS_RMAP_OINFO_AG, NULL);
return xchk_btree(sc, cur, xchk_allocbt_rec, &XFS_RMAP_OINFO_AG, &ca);
}
int
......@@ -141,15 +172,15 @@ xchk_xref_is_used_space(
xfs_agblock_t agbno,
xfs_extlen_t len)
{
bool is_freesp;
enum xbtree_recpacking outcome;
int error;
if (!sc->sa.bno_cur || xchk_skip_xref(sc->sm))
return;
error = xfs_alloc_has_record(sc->sa.bno_cur, agbno, len, &is_freesp);
error = xfs_alloc_has_records(sc->sa.bno_cur, agbno, len, &outcome);
if (!xchk_should_check_xref(sc, &error, &sc->sa.bno_cur))
return;
if (is_freesp)
if (outcome != XBTREE_RECPACKING_EMPTY)
xchk_btree_xref_set_corrupt(sc, sc->sa.bno_cur, 0);
}
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* Copyright (C) 2019 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <darrick.wong@oracle.com>
* Copyright (C) 2019-2023 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <djwong@kernel.org>
*/
#ifndef __XFS_SCRUB_ATTR_H__
#define __XFS_SCRUB_ATTR_H__
......@@ -10,59 +10,15 @@
* Temporary storage for online scrub and repair of extended attributes.
*/
struct xchk_xattr_buf {
/* Size of @buf, in bytes. */
size_t sz;
/* Bitmap of used space in xattr leaf blocks and shortform forks. */
unsigned long *usedmap;
/*
* Memory buffer -- either used for extracting attr values while
* walking the attributes; or for computing attr block bitmaps when
* checking the attribute tree.
*
* Each bitmap contains enough bits to track every byte in an attr
* block (rounded up to the size of an unsigned long). The attr block
* used space bitmap starts at the beginning of the buffer; the free
* space bitmap follows immediately after; and we have a third buffer
* for storing intermediate bitmap results.
*/
uint8_t buf[];
};
/* A place to store attribute values. */
static inline uint8_t *
xchk_xattr_valuebuf(
struct xfs_scrub *sc)
{
struct xchk_xattr_buf *ab = sc->buf;
return ab->buf;
}
/* Bitmap of free space in xattr leaf blocks. */
unsigned long *freemap;
/* A bitmap of space usage computed by walking an attr leaf block. */
static inline unsigned long *
xchk_xattr_usedmap(
struct xfs_scrub *sc)
{
struct xchk_xattr_buf *ab = sc->buf;
return (unsigned long *)ab->buf;
}
/* A bitmap of free space computed by walking attr leaf block free info. */
static inline unsigned long *
xchk_xattr_freemap(
struct xfs_scrub *sc)
{
return xchk_xattr_usedmap(sc) +
BITS_TO_LONGS(sc->mp->m_attr_geo->blksize);
}
/* A bitmap used to hold temporary results. */
static inline unsigned long *
xchk_xattr_dstmap(
struct xfs_scrub *sc)
{
return xchk_xattr_freemap(sc) +
BITS_TO_LONGS(sc->mp->m_attr_geo->blksize);
}
/* Memory buffer used to extract xattr values. */
void *value;
size_t value_sz;
};
#endif /* __XFS_SCRUB_ATTR_H__ */
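As an aside, a sketch (an assumption, not part of this change) of how the reworked usedmap and freemap bitmaps could be cross-checked with the generic kernel bitmap API, one bit per byte of the attr leaf block:

#include <linux/bitmap.h>

/*
 * Hypothetical check: the per-byte used-space and free-space maps of an attr
 * leaf block must never overlap; any intersection means the block's free map
 * disagrees with its entries.
 */
static bool
xchk_demo_xattr_maps_conflict(
	struct xfs_scrub	*sc)
{
	struct xchk_xattr_buf	*ab = sc->buf;
	unsigned int		nbits = sc->mp->m_attr_geo->blksize;

	return bitmap_intersects(ab->usedmap, ab->freemap, nbits);
}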
// SPDX-License-Identifier: GPL-2.0+
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Copyright (C) 2018 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <darrick.wong@oracle.com>
* Copyright (C) 2018-2023 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <djwong@kernel.org>
*/
#ifndef __XFS_SCRUB_BITMAP_H__
#define __XFS_SCRUB_BITMAP_H__
struct xbitmap_range {
struct list_head list;
uint64_t start;
uint64_t len;
};
struct xbitmap {
struct list_head list;
struct rb_root_cached xb_root;
};
void xbitmap_init(struct xbitmap *bitmap);
void xbitmap_destroy(struct xbitmap *bitmap);
#define for_each_xbitmap_extent(bex, n, bitmap) \
list_for_each_entry_safe((bex), (n), &(bitmap)->list, list)
#define for_each_xbitmap_block(b, bex, n, bitmap) \
list_for_each_entry_safe((bex), (n), &(bitmap)->list, list) \
for ((b) = (bex)->start; (b) < (bex)->start + (bex)->len; (b)++)
int xbitmap_clear(struct xbitmap *bitmap, uint64_t start, uint64_t len);
int xbitmap_set(struct xbitmap *bitmap, uint64_t start, uint64_t len);
int xbitmap_disunion(struct xbitmap *bitmap, struct xbitmap *sub);
int xbitmap_set_btcur_path(struct xbitmap *bitmap,
......@@ -34,4 +22,93 @@ int xbitmap_set_btblocks(struct xbitmap *bitmap,
struct xfs_btree_cur *cur);
uint64_t xbitmap_hweight(struct xbitmap *bitmap);
/*
* Return codes for the bitmap iterator functions are 0 to continue iterating,
* and non-zero to stop iterating. Any non-zero value will be passed up to the
* iteration caller. The special value -ECANCELED can be used to stop
* iteration, because neither bitmap iterator ever generates that error code on
* its own. Callers must not modify the bitmap while walking it.
*/
typedef int (*xbitmap_walk_fn)(uint64_t start, uint64_t len, void *priv);
int xbitmap_walk(struct xbitmap *bitmap, xbitmap_walk_fn fn,
void *priv);
typedef int (*xbitmap_walk_bits_fn)(uint64_t bit, void *priv);
int xbitmap_walk_bits(struct xbitmap *bitmap, xbitmap_walk_bits_fn fn,
void *priv);
bool xbitmap_empty(struct xbitmap *bitmap);
bool xbitmap_test(struct xbitmap *bitmap, uint64_t start, uint64_t *len);
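Putting the return-code convention above to use, a short sketch (the callback and its private struct are hypothetical) of an xbitmap_walk_fn that stops early with -ECANCELED:

/* Hypothetical example: find the first set extent of at least @min_len. */
struct xdemo_first_fit {
	uint64_t	min_len;
	uint64_t	found_start;
	bool		found;
};

static int
xdemo_first_fit_fn(
	uint64_t	start,
	uint64_t	len,
	void		*priv)
{
	struct xdemo_first_fit	*ff = priv;

	if (len < ff->min_len)
		return 0;
	ff->found_start = start;
	ff->found = true;
	return -ECANCELED;	/* stop walking; caller filters this code out */
}

A caller would run error = xbitmap_walk(bitmap, xdemo_first_fit_fn, &ff); and clear error when it equals -ECANCELED, since the iterator itself never returns that value.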
/* Bitmaps, but type-checked for xfs_agblock_t */
struct xagb_bitmap {
struct xbitmap agbitmap;
};
static inline void xagb_bitmap_init(struct xagb_bitmap *bitmap)
{
xbitmap_init(&bitmap->agbitmap);
}
static inline void xagb_bitmap_destroy(struct xagb_bitmap *bitmap)
{
xbitmap_destroy(&bitmap->agbitmap);
}
static inline int xagb_bitmap_clear(struct xagb_bitmap *bitmap,
xfs_agblock_t start, xfs_extlen_t len)
{
return xbitmap_clear(&bitmap->agbitmap, start, len);
}
static inline int xagb_bitmap_set(struct xagb_bitmap *bitmap,
xfs_agblock_t start, xfs_extlen_t len)
{
return xbitmap_set(&bitmap->agbitmap, start, len);
}
static inline bool
xagb_bitmap_test(
struct xagb_bitmap *bitmap,
xfs_agblock_t start,
xfs_extlen_t *len)
{
uint64_t biglen = *len;
bool ret;
ret = xbitmap_test(&bitmap->agbitmap, start, &biglen);
if (start + biglen >= UINT_MAX) {
ASSERT(0);
biglen = UINT_MAX - start;
}
*len = biglen;
return ret;
}
static inline int xagb_bitmap_disunion(struct xagb_bitmap *bitmap,
struct xagb_bitmap *sub)
{
return xbitmap_disunion(&bitmap->agbitmap, &sub->agbitmap);
}
static inline uint32_t xagb_bitmap_hweight(struct xagb_bitmap *bitmap)
{
return xbitmap_hweight(&bitmap->agbitmap);
}
static inline bool xagb_bitmap_empty(struct xagb_bitmap *bitmap)
{
return xbitmap_empty(&bitmap->agbitmap);
}
static inline int xagb_bitmap_walk(struct xagb_bitmap *bitmap,
xbitmap_walk_fn fn, void *priv)
{
return xbitmap_walk(&bitmap->agbitmap, fn, priv);
}
int xagb_bitmap_set_btblocks(struct xagb_bitmap *bitmap,
struct xfs_btree_cur *cur);
#endif /* __XFS_SCRUB_BITMAP_H__ */
// SPDX-License-Identifier: GPL-2.0+
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Copyright (C) 2017 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <darrick.wong@oracle.com>
* Copyright (C) 2017-2023 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <djwong@kernel.org>
*/
#ifndef __XFS_SCRUB_BTREE_H__
#define __XFS_SCRUB_BTREE_H__
......@@ -19,6 +19,8 @@ bool xchk_btree_xref_process_error(struct xfs_scrub *sc,
/* Check for btree corruption. */
void xchk_btree_set_corrupt(struct xfs_scrub *sc,
struct xfs_btree_cur *cur, int level);
void xchk_btree_set_preen(struct xfs_scrub *sc, struct xfs_btree_cur *cur,
int level);
/* Check for btree xref discrepancies. */
void xchk_btree_xref_set_corrupt(struct xfs_scrub *sc,
......@@ -29,6 +31,11 @@ typedef int (*xchk_btree_rec_fn)(
struct xchk_btree *bs,
const union xfs_btree_rec *rec);
struct xchk_btree_key {
union xfs_btree_key key;
bool valid;
};
struct xchk_btree {
/* caller-provided scrub state */
struct xfs_scrub *sc;
......@@ -38,11 +45,12 @@ struct xchk_btree {
void *private;
/* internal scrub state */
bool lastrec_valid;
union xfs_btree_rec lastrec;
struct list_head to_check;
/* this element must come last! */
union xfs_btree_key lastkey[];
struct xchk_btree_key lastkey[];
};
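Since lastkey[] is now a flexible array of per-level xchk_btree_key entries, the scrub state presumably has to be sized by tree height at allocation time; a sketch (an assumption, the allocation helper shown is hypothetical) using struct_size():

/*
 * Hypothetical allocation: reserve one lastkey slot per level of the btree
 * being scrubbed.  struct_size() does the flexible-array sizing with
 * overflow checking.
 */
static struct xchk_btree *
xchk_demo_btree_alloc(
	struct xfs_scrub	*sc,
	struct xfs_btree_cur	*cur)
{
	struct xchk_btree	*bs;

	bs = kzalloc(struct_size(bs, lastkey, cur->bc_nlevels),
			XCHK_GFP_FLAGS);
	if (!bs)
		return NULL;

	bs->sc = sc;
	bs->cur = cur;
	INIT_LIST_HEAD(&bs->to_check);
	return bs;
}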
/*
......
// SPDX-License-Identifier: GPL-2.0+
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Copyright (C) 2017 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <darrick.wong@oracle.com>
* Copyright (C) 2017-2023 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <djwong@kernel.org>
*/
#ifndef __XFS_SCRUB_COMMON_H__
#define __XFS_SCRUB_COMMON_H__
......@@ -32,6 +32,8 @@ xchk_should_terminate(
}
int xchk_trans_alloc(struct xfs_scrub *sc, uint resblks);
void xchk_trans_cancel(struct xfs_scrub *sc);
bool xchk_process_error(struct xfs_scrub *sc, xfs_agnumber_t agno,
xfs_agblock_t bno, int *error);
bool xchk_fblock_process_error(struct xfs_scrub *sc, int whichfork,
......@@ -72,6 +74,7 @@ bool xchk_should_check_xref(struct xfs_scrub *sc, int *error,
struct xfs_btree_cur **curpp);
/* Setup functions */
int xchk_setup_agheader(struct xfs_scrub *sc);
int xchk_setup_fs(struct xfs_scrub *sc);
int xchk_setup_ag_allocbt(struct xfs_scrub *sc);
int xchk_setup_ag_iallocbt(struct xfs_scrub *sc);
......@@ -132,10 +135,16 @@ int xchk_count_rmap_ownedby_ag(struct xfs_scrub *sc, struct xfs_btree_cur *cur,
const struct xfs_owner_info *oinfo, xfs_filblks_t *blocks);
int xchk_setup_ag_btree(struct xfs_scrub *sc, bool force_log);
int xchk_get_inode(struct xfs_scrub *sc);
int xchk_iget_for_scrubbing(struct xfs_scrub *sc);
int xchk_setup_inode_contents(struct xfs_scrub *sc, unsigned int resblks);
void xchk_buffer_recheck(struct xfs_scrub *sc, struct xfs_buf *bp);
int xchk_iget(struct xfs_scrub *sc, xfs_ino_t inum, struct xfs_inode **ipp);
int xchk_iget_agi(struct xfs_scrub *sc, xfs_ino_t inum,
struct xfs_buf **agi_bpp, struct xfs_inode **ipp);
void xchk_irele(struct xfs_scrub *sc, struct xfs_inode *ip);
int xchk_install_handle_inode(struct xfs_scrub *sc, struct xfs_inode *ip);
/*
* Don't bother cross-referencing if we already found corruption or cross
* referencing discrepancies.
......@@ -147,8 +156,21 @@ static inline bool xchk_skip_xref(struct xfs_scrub_metadata *sm)
}
int xchk_metadata_inode_forks(struct xfs_scrub *sc);
int xchk_ilock_inverted(struct xfs_inode *ip, uint lock_mode);
void xchk_stop_reaping(struct xfs_scrub *sc);
void xchk_start_reaping(struct xfs_scrub *sc);
/*
* Setting up a hook to wait for intents to drain is costly -- we have to take
* the CPU hotplug lock and force an i-cache flush on all CPUs once to set it
* up, and again to tear it down. These costs add up quickly, so we only want
* to enable the drain waiter if the drain actually detected a conflict with
* running intent chains.
*/
static inline bool xchk_need_intent_drain(struct xfs_scrub *sc)
{
return sc->flags & XCHK_NEED_DRAIN;
}
void xchk_fsgates_enable(struct xfs_scrub *sc, unsigned int scrub_fshooks);
#endif /* __XFS_SCRUB_COMMON_H__ */
// SPDX-License-Identifier: GPL-2.0+
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Copyright (C) 2017 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <darrick.wong@oracle.com>
* Copyright (C) 2017-2023 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <djwong@kernel.org>
*/
#include "xfs.h"
#include "xfs_fs.h"
......@@ -39,6 +39,7 @@ xchk_da_process_error(
switch (*error) {
case -EDEADLOCK:
case -ECHRNG:
/* Used to restart an op with deadlock avoidance. */
trace_xchk_deadlock_retry(sc->ip, sc->sm, *error);
break;
......
// SPDX-License-Identifier: GPL-2.0+
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* Copyright (C) 2017 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <darrick.wong@oracle.com>
* Copyright (C) 2017-2023 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <djwong@kernel.org>
*/
#ifndef __XFS_SCRUB_DABTREE_H__
#define __XFS_SCRUB_DABTREE_H__
......
// SPDX-License-Identifier: GPL-2.0+
/*
* Copyright (C) 2019 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <darrick.wong@oracle.com>
* Copyright (C) 2019-2023 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <djwong@kernel.org>
*/
#include "xfs.h"
#include "xfs_fs.h"
......@@ -130,6 +130,13 @@ xchk_setup_fscounters(
struct xchk_fscounters *fsc;
int error;
/*
* If the AGF doesn't track btreeblks, we have to lock the AGF to count
* btree block usage by walking the actual btrees.
*/
if (!xfs_has_lazysbcount(sc->mp))
xchk_fsgates_enable(sc, XCHK_FSGATES_DRAIN);
sc->buf = kzalloc(sizeof(struct xchk_fscounters), XCHK_GFP_FLAGS);
if (!sc->buf)
return -ENOMEM;
......
......@@ -798,7 +798,6 @@ xfs_qm_dqget_cache_insert(
error = radix_tree_insert(tree, id, dqp);
if (unlikely(error)) {
/* Duplicate found! Caller must try again. */
WARN_ON(error != -EEXIST);
mutex_unlock(&qi->qi_tree_lock);
trace_xfs_dqget_dup(dqp);
return error;
......