Commit 18ccb223 authored by Mauro Carvalho Chehab, committed by Jonathan Corbet

docs: filesystems: convert orangefs.txt to ReST

- Add a SPDX header;
- Adjust document and section titles;
- Some whitespace fixes and new line breaks;
- Mark literal blocks as such;
- Add it to filesystems/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/6f438eeff5b029d229197a602bd9b74004fe9b63.1581955849.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
parent 7cbb468f
...@@ -79,6 +79,7 @@ Documentation for filesystem implementations. ...@@ -79,6 +79,7 @@ Documentation for filesystem implementations.
ocfs2 ocfs2
ocfs2-online-filecheck ocfs2-online-filecheck
omfs omfs
orangefs
overlayfs overlayfs
virtiofs virtiofs
vfat vfat
.. SPDX-License-Identifier: GPL-2.0
========
ORANGEFS ORANGEFS
======== ========
...@@ -21,25 +24,25 @@ Orangefs features include: ...@@ -21,25 +24,25 @@ Orangefs features include:
* Stateless * Stateless
MAILING LIST ARCHIVES Mailing List Archives
===================== =====================
http://lists.orangefs.org/pipermail/devel_lists.orangefs.org/ http://lists.orangefs.org/pipermail/devel_lists.orangefs.org/
MAILING LIST SUBMISSIONS Mailing List Submissions
======================== ========================
devel@lists.orangefs.org devel@lists.orangefs.org
DOCUMENTATION Documentation
============= =============
http://www.orangefs.org/documentation/ http://www.orangefs.org/documentation/
USERSPACE FILESYSTEM SOURCE Userspace Filesystem Source
=========================== ===========================
http://www.orangefs.org/download http://www.orangefs.org/download
...@@ -48,16 +51,16 @@ Orangefs versions prior to 2.9.3 would not be compatible with the ...@@ -48,16 +51,16 @@ Orangefs versions prior to 2.9.3 would not be compatible with the
upstream version of the kernel client. upstream version of the kernel client.
RUNNING ORANGEFS ON A SINGLE SERVER Running ORANGEFS On a Single Server
=================================== ===================================
OrangeFS is usually run in large installations with multiple servers and OrangeFS is usually run in large installations with multiple servers and
clients, but a complete filesystem can be run on a single machine for clients, but a complete filesystem can be run on a single machine for
development and testing. development and testing.
On Fedora, install orangefs and orangefs-server. On Fedora, install orangefs and orangefs-server::
dnf -y install orangefs orangefs-server dnf -y install orangefs orangefs-server
There is an example server configuration file in There is an example server configuration file in
/etc/orangefs/orangefs.conf. Change localhost to your hostname if /etc/orangefs/orangefs.conf. Change localhost to your hostname if
...@@ -70,29 +73,29 @@ single line. Uncomment it and change the hostname if necessary. This ...@@ -70,29 +73,29 @@ single line. Uncomment it and change the hostname if necessary. This
controls clients which use libpvfs2. This does not control the controls clients which use libpvfs2. This does not control the
pvfs2-client-core. pvfs2-client-core.
Create the filesystem. Create the filesystem::
pvfs2-server -f /etc/orangefs/orangefs.conf pvfs2-server -f /etc/orangefs/orangefs.conf
Start the server. Start the server::
systemctl start orangefs-server systemctl start orangefs-server
Test the server. Test the server::
pvfs2-ping -m /pvfsmnt pvfs2-ping -m /pvfsmnt
Start the client. The module must be compiled in or loaded before this Start the client. The module must be compiled in or loaded before this
point. point::
systemctl start orangefs-client systemctl start orangefs-client
Mount the filesystem. Mount the filesystem::
mount -t pvfs2 tcp://localhost:3334/orangefs /pvfsmnt mount -t pvfs2 tcp://localhost:3334/orangefs /pvfsmnt
BUILDING ORANGEFS ON A SINGLE SERVER Building ORANGEFS on a Single Server
==================================== ====================================
Where OrangeFS cannot be installed from distribution packages, it may be Where OrangeFS cannot be installed from distribution packages, it may be
...@@ -102,49 +105,51 @@ You can omit --prefix if you don't care that things are sprinkled around ...@@ -102,49 +105,51 @@ You can omit --prefix if you don't care that things are sprinkled around
in /usr/local. As of version 2.9.6, OrangeFS uses Berkeley DB by in /usr/local. As of version 2.9.6, OrangeFS uses Berkeley DB by
default, we will probably be changing the default to LMDB soon. default, we will probably be changing the default to LMDB soon.
./configure --prefix=/opt/ofs --with-db-backend=lmdb ::
make ./configure --prefix=/opt/ofs --with-db-backend=lmdb
make install make
Create an orangefs config file. make install
/opt/ofs/bin/pvfs2-genconfig /etc/pvfs2.conf Create an orangefs config file::
Create an /etc/pvfs2tab file. /opt/ofs/bin/pvfs2-genconfig /etc/pvfs2.conf
echo tcp://localhost:3334/orangefs /pvfsmnt pvfs2 defaults,noauto 0 0 > \ Create an /etc/pvfs2tab file::
/etc/pvfs2tab
Create the mount point you specified in the tab file if needed. echo tcp://localhost:3334/orangefs /pvfsmnt pvfs2 defaults,noauto 0 0 > \
/etc/pvfs2tab
mkdir /pvfsmnt Create the mount point you specified in the tab file if needed::
Bootstrap the server. mkdir /pvfsmnt
/opt/ofs/sbin/pvfs2-server -f /etc/pvfs2.conf Bootstrap the server::
Start the server. /opt/ofs/sbin/pvfs2-server -f /etc/pvfs2.conf
/opt/ofs/sbin/pvfs2-server /etc/pvfs2.conf Start the server::
/opt/ofs/sbin/pvfs2-server /etc/pvfs2.conf
Now the server should be running. Pvfs2-ls is a simple Now the server should be running. Pvfs2-ls is a simple
test to verify that the server is running. test to verify that the server is running::
/opt/ofs/bin/pvfs2-ls /pvfsmnt /opt/ofs/bin/pvfs2-ls /pvfsmnt
If stuff seems to be working, load the kernel module and If stuff seems to be working, load the kernel module and
turn on the client core. turn on the client core::
/opt/ofs/sbin/pvfs2-client -p /opt/ofs/sbin/pvfs2-client-core /opt/ofs/sbin/pvfs2-client -p /opt/ofs/sbin/pvfs2-client-core
Mount your filesystem. Mount your filesystem::
mount -t pvfs2 tcp://localhost:3334/orangefs /pvfsmnt mount -t pvfs2 tcp://localhost:3334/orangefs /pvfsmnt
RUNNING XFSTESTS Running xfstests
================ ================
It is useful to use a scratch filesystem with xfstests. This can be It is useful to use a scratch filesystem with xfstests. This can be
...@@ -159,21 +164,23 @@ Then there are two FileSystem sections: orangefs and scratch. ...@@ -159,21 +164,23 @@ Then there are two FileSystem sections: orangefs and scratch.
This change should be made before creating the filesystem. This change should be made before creating the filesystem.
pvfs2-server -f /etc/orangefs/orangefs.conf ::
pvfs2-server -f /etc/orangefs/orangefs.conf
To run xfstests, create /etc/xfsqa.config. To run xfstests, create /etc/xfsqa.config::
TEST_DIR=/orangefs TEST_DIR=/orangefs
TEST_DEV=tcp://localhost:3334/orangefs TEST_DEV=tcp://localhost:3334/orangefs
SCRATCH_MNT=/scratch SCRATCH_MNT=/scratch
SCRATCH_DEV=tcp://localhost:3334/scratch SCRATCH_DEV=tcp://localhost:3334/scratch
Then xfstests can be run Then xfstests can be run::
./check -pvfs2 ./check -pvfs2
OPTIONS Options
======= =======
The following mount options are accepted: The following mount options are accepted:
...@@ -193,32 +200,32 @@ The following mount options are accepted: ...@@ -193,32 +200,32 @@ The following mount options are accepted:
Distributed locking is being worked on for the future. Distributed locking is being worked on for the future.
DEBUGGING Debugging
========= =========
If you want the debug (GOSSIP) statements in a particular If you want the debug (GOSSIP) statements in a particular
source file (inode.c for example) go to syslog: source file (inode.c for example) go to syslog::
echo inode > /sys/kernel/debug/orangefs/kernel-debug echo inode > /sys/kernel/debug/orangefs/kernel-debug
No debugging (the default): No debugging (the default)::
echo none > /sys/kernel/debug/orangefs/kernel-debug echo none > /sys/kernel/debug/orangefs/kernel-debug
Debugging from several source files: Debugging from several source files::
echo inode,dir > /sys/kernel/debug/orangefs/kernel-debug echo inode,dir > /sys/kernel/debug/orangefs/kernel-debug
All debugging: All debugging::
echo all > /sys/kernel/debug/orangefs/kernel-debug echo all > /sys/kernel/debug/orangefs/kernel-debug
Get a list of all debugging keywords: Get a list of all debugging keywords::
cat /sys/kernel/debug/orangefs/debug-help cat /sys/kernel/debug/orangefs/debug-help
PROTOCOL BETWEEN KERNEL MODULE AND USERSPACE Protocol between Kernel Module and Userspace
============================================ ============================================
Orangefs is a user space filesystem and an associated kernel module. Orangefs is a user space filesystem and an associated kernel module.
...@@ -234,7 +241,8 @@ The kernel module implements a pseudo device that userspace ...@@ -234,7 +241,8 @@ The kernel module implements a pseudo device that userspace
can read from and write to. Userspace can also manipulate the can read from and write to. Userspace can also manipulate the
kernel module through the pseudo device with ioctl. kernel module through the pseudo device with ioctl.
THE BUFMAP: The Bufmap
----------
At startup userspace allocates two page-size-aligned (posix_memalign) At startup userspace allocates two page-size-aligned (posix_memalign)
mlocked memory buffers, one is used for IO and one is used for readdir mlocked memory buffers, one is used for IO and one is used for readdir
...@@ -250,7 +258,8 @@ copied from user space to kernel space with copy_from_user and is used ...@@ -250,7 +258,8 @@ copied from user space to kernel space with copy_from_user and is used
to initialize the kernel module's "bufmap" (struct orangefs_bufmap), which to initialize the kernel module's "bufmap" (struct orangefs_bufmap), which
then contains: then contains:
* refcnt - a reference counter * refcnt
- a reference counter
* desc_size - PVFS2_BUFMAP_DEFAULT_DESC_SIZE (4194304) - the IO buffer's * desc_size - PVFS2_BUFMAP_DEFAULT_DESC_SIZE (4194304) - the IO buffer's
partition size, which represents the filesystem's block size and partition size, which represents the filesystem's block size and
is used for s_blocksize in super blocks. is used for s_blocksize in super blocks.
...@@ -259,17 +268,19 @@ then contains: ...@@ -259,17 +268,19 @@ then contains:
* desc_shift - log2(desc_size), used for s_blocksize_bits in super blocks. * desc_shift - log2(desc_size), used for s_blocksize_bits in super blocks.
* total_size - the total size of the IO buffer. * total_size - the total size of the IO buffer.
* page_count - the number of 4096 byte pages in the IO buffer. * page_count - the number of 4096 byte pages in the IO buffer.
* page_array - a pointer to page_count * (sizeof(struct page*)) bytes * page_array - a pointer to ``page_count * (sizeof(struct page*))`` bytes
of kcalloced memory. This memory is used as an array of pointers of kcalloced memory. This memory is used as an array of pointers
to each of the pages in the IO buffer through a call to get_user_pages. to each of the pages in the IO buffer through a call to get_user_pages.
* desc_array - a pointer to desc_count * (sizeof(struct orangefs_bufmap_desc)) * desc_array - a pointer to ``desc_count * (sizeof(struct orangefs_bufmap_desc))``
bytes of kcalloced memory. This memory is further initialized: bytes of kcalloced memory. This memory is further initialized:
user_desc is the kernel's copy of the IO buffer's ORANGEFS_dev_map_desc user_desc is the kernel's copy of the IO buffer's ORANGEFS_dev_map_desc
structure. user_desc->ptr points to the IO buffer. structure. user_desc->ptr points to the IO buffer.
pages_per_desc = bufmap->desc_size / PAGE_SIZE ::
offset = 0
pages_per_desc = bufmap->desc_size / PAGE_SIZE
offset = 0
bufmap->desc_array[0].page_array = &bufmap->page_array[offset] bufmap->desc_array[0].page_array = &bufmap->page_array[offset]
bufmap->desc_array[0].array_count = pages_per_desc = 1024 bufmap->desc_array[0].array_count = pages_per_desc = 1024
...@@ -293,7 +304,8 @@ then contains: ...@@ -293,7 +304,8 @@ then contains:
* readdir_index_lock - a spinlock to protect readdir_index_array during * readdir_index_lock - a spinlock to protect readdir_index_array during
update. update.
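The bufmap arithmetic described above can be checked with a small sketch. It is illustrative only and is not the kernel module's code: the helper names and the `PAGE_SIZE_BYTES` macro are hypothetical, and the assumption that each descriptor's pages sit contiguously in `page_array` is inferred from the `offset` variable shown in the document.

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted in the document: 4 MiB descriptors and 4096-byte pages. */
#define DESC_SIZE        4194304u
#define PAGE_SIZE_BYTES  4096u    /* hypothetical name; the kernel uses PAGE_SIZE */

/* log2 of a power of two: the value the text says is kept in desc_shift. */
static unsigned int log2_pow2(uint32_t v)
{
    unsigned int shift = 0;

    while (v > 1) {
        v >>= 1;
        shift++;
    }
    return shift;
}

/* Number of pages covered by one descriptor (desc_size / PAGE_SIZE). */
static uint32_t pages_per_desc(uint32_t desc_size, uint32_t page_size)
{
    return desc_size / page_size;
}

/*
 * Index into page_array where descriptor i's pages would begin,
 * ASSUMING descriptors are laid out back to back.
 */
static uint32_t desc_page_offset(uint32_t i, uint32_t per_desc)
{
    return i * per_desc;
}
```

With these values, `log2_pow2(DESC_SIZE)` gives 22 and `pages_per_desc(DESC_SIZE, PAGE_SIZE_BYTES)` gives 1024, matching the `array_count` shown for `desc_array[0]`.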
OPERATIONS: Operations
----------
The kernel module builds an "op" (struct orangefs_kernel_op_s) when it The kernel module builds an "op" (struct orangefs_kernel_op_s) when it
needs to communicate with userspace. Part of the op contains the "upcall" needs to communicate with userspace. Part of the op contains the "upcall"
...@@ -308,13 +320,19 @@ in flight at any given time. ...@@ -308,13 +320,19 @@ in flight at any given time.
Ops are stateful: Ops are stateful:
* unknown - op was just initialized * unknown
* waiting - op is on request_list (upward bound) - op was just initialized
* inprogr - op is in progress (waiting for downcall) * waiting
* serviced - op has matching downcall; ok - op is on request_list (upward bound)
* purged - op has to start a timer since client-core * inprogr
- op is in progress (waiting for downcall)
* serviced
- op has matching downcall; ok
* purged
- op has to start a timer since client-core
exited uncleanly before servicing op exited uncleanly before servicing op
* given up - submitter has given up waiting for it * given up
- submitter has given up waiting for it
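As a reading aid, the op states listed above could be modelled as a plain enumeration. This is a sketch only: the identifiers `OP_UNKNOWN` and friends are hypothetical, and the real kernel source may name and encode the states differently.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names for the six op states listed in the document. */
enum op_state {
    OP_UNKNOWN,   /* op was just initialized */
    OP_WAITING,   /* op is on request_list (upward bound) */
    OP_INPROGR,   /* op is in progress (waiting for downcall) */
    OP_SERVICED,  /* op has matching downcall; ok */
    OP_PURGED,    /* client-core exited uncleanly before servicing op */
    OP_GIVEN_UP   /* submitter has given up waiting for it */
};

/* Map a state to the label used in the documentation. */
static const char *op_state_name(enum op_state s)
{
    switch (s) {
    case OP_UNKNOWN:  return "unknown";
    case OP_WAITING:  return "waiting";
    case OP_INPROGR:  return "inprogr";
    case OP_SERVICED: return "serviced";
    case OP_PURGED:   return "purged";
    case OP_GIVEN_UP: return "given up";
    }
    return "invalid";
}
```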
When some arbitrary userspace program needs to perform a When some arbitrary userspace program needs to perform a
filesystem operation on Orangefs (readdir, I/O, create, whatever) filesystem operation on Orangefs (readdir, I/O, create, whatever)
...@@ -389,10 +407,15 @@ union of structs, each of which is associated with a particular ...@@ -389,10 +407,15 @@ union of structs, each of which is associated with a particular
response type. response type.
The several members outside of the union are: The several members outside of the union are:
- int32_t type - type of operation.
- int32_t status - return code for the operation. ``int32_t type``
- int64_t trailer_size - 0 unless readdir operation. - type of operation.
- char *trailer_buf - initialized to NULL, used during readdir operations. ``int32_t status``
- return code for the operation.
``int64_t trailer_size``
- 0 unless readdir operation.
``char *trailer_buf``
- initialized to NULL, used during readdir operations.
The appropriate member inside the union is filled out for any The appropriate member inside the union is filled out for any
particular response. particular response.
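The members outside the union can be pictured with the following sketch. The struct name, the member order, the placeholder union and both helper functions are assumptions made for illustration; only the four member names and types come from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for the downcall structure; not the real definition. */
struct downcall_sketch {
    int32_t type;          /* type of operation */
    int32_t status;        /* return code for the operation */
    int64_t trailer_size;  /* 0 unless readdir operation */
    char *trailer_buf;     /* NULL unless readdir operation */
    union {
        char placeholder[64];  /* stands in for the per-response structs */
    } resp;
};

/* Initialize the common members the way the text describes the defaults. */
static void downcall_init(struct downcall_sketch *d, int32_t type)
{
    d->type = type;
    d->status = 0;
    d->trailer_size = 0;
    d->trailer_buf = NULL;
}

/* A response carries a trailer only when both trailer fields are set. */
static int downcall_has_trailer(const struct downcall_sketch *d)
{
    return d->trailer_size > 0 && d->trailer_buf != NULL;
}
```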
...@@ -449,18 +472,20 @@ Userspace uses writev() on /dev/pvfs2-req to pass responses to the requests ...@@ -449,18 +472,20 @@ Userspace uses writev() on /dev/pvfs2-req to pass responses to the requests
made by the kernel side. made by the kernel side.
A buffer_list containing: A buffer_list containing:
- a pointer to the prepared response to the request from the - a pointer to the prepared response to the request from the
kernel (struct pvfs2_downcall_t). kernel (struct pvfs2_downcall_t).
- and also, in the case of a readdir request, a pointer to a - and also, in the case of a readdir request, a pointer to a
buffer containing descriptors for the objects in the target buffer containing descriptors for the objects in the target
directory. directory.
... is sent to the function (PINT_dev_write_list) which performs ... is sent to the function (PINT_dev_write_list) which performs
the writev. the writev.
PINT_dev_write_list has a local iovec array: struct iovec io_array[10]; PINT_dev_write_list has a local iovec array: struct iovec io_array[10];
The first four elements of io_array are initialized like this for all The first four elements of io_array are initialized like this for all
responses: responses::
io_array[0].iov_base = address of local variable "proto_ver" (int32_t) io_array[0].iov_base = address of local variable "proto_ver" (int32_t)
io_array[0].iov_len = sizeof(int32_t) io_array[0].iov_len = sizeof(int32_t)
...@@ -475,7 +500,7 @@ responses: ...@@ -475,7 +500,7 @@ responses:
of global variable vfs_request (vfs_request_t) of global variable vfs_request (vfs_request_t)
io_array[3].iov_len = sizeof(pvfs2_downcall_t) io_array[3].iov_len = sizeof(pvfs2_downcall_t)
Readdir responses initialize the fifth element io_array like this: Readdir responses initialize the fifth element io_array like this::
io_array[4].iov_base = contents of member trailer_buf (char *) io_array[4].iov_base = contents of member trailer_buf (char *)
from out_downcall member of global variable from out_downcall member of global variable
...@@ -517,13 +542,13 @@ from a dentry is cheap, obtaining it from userspace is relatively expensive, ...@@ -517,13 +542,13 @@ from a dentry is cheap, obtaining it from userspace is relatively expensive,
hence the motivation to use the dentry when possible. hence the motivation to use the dentry when possible.
The timeout values d_time and getattr_time are jiffy based, and the The timeout values d_time and getattr_time are jiffy based, and the
code is designed to avoid the jiffy-wrap problem: code is designed to avoid the jiffy-wrap problem::
"In general, if the clock may have wrapped around more than once, there "In general, if the clock may have wrapped around more than once, there
is no way to tell how much time has elapsed. However, if the times t1 is no way to tell how much time has elapsed. However, if the times t1
and t2 are known to be fairly close, we can reliably compute the and t2 are known to be fairly close, we can reliably compute the
difference in a way that takes into account the possibility that the difference in a way that takes into account the possibility that the
clock may have wrapped between times." clock may have wrapped between times."
from course notes by instructor Andy Wang from course notes by instructor Andy Wang
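The wrap-safe comparison the quote describes can be sketched in a few lines of C. This is an illustration using a 32-bit counter, not the Orangefs code; the helper names are hypothetical. In the kernel itself, the `time_after()` and `time_before()` macros from `<linux/jiffies.h>` serve this purpose. The sketch relies on the conversion of an out-of-range unsigned value to a signed type behaving as two's complement, which holds on the compilers the kernel supports.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Nonzero if tick a is later than tick b, even if the counter wrapped
 * between them, provided the two are less than half the range apart.
 */
static int tick_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Elapsed ticks from earlier to later; unsigned subtraction absorbs one wrap. */
static uint32_t ticks_elapsed(uint32_t earlier, uint32_t later)
{
    return later - earlier;
}
```

For example, with `earlier = UINT32_MAX - 5` and `later = 5`, the counter has wrapped once, yet `ticks_elapsed` still reports 11 and `tick_after(later, earlier)` is true.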