nexedi / neoppod
Commit 19476823, authored Mar 16, 2017 by Kirill Smelkov
X Notes on I/O
parent
98780ea4
Showing 1 changed file with 92 additions and 0 deletions
t/neo/storage/fs1/notes_io.txt
Notes on Input/Output
---------------------
Several options are available here:
pread
~~~~~
The kernel handles both disk I/O and caching (in pagecache).
For hot cache case:
Cost = C(pread(n)) = α + β⋅n
α - syscall cost
β - cost to copy 1 byte (both src and dst are in cache)
α is quite big ≈ (200 - 300) ns
α/β ≈ 2-3.5 · 10^4
see details here: https://github.com/golang/go/issues/19563
thus:
the cost to pread 1 page (n ≈ 4·10^3) is ~ (1.1 - 1.2) · α
the cost to copy 1 page is ~ (0.1 - 0.2) · α
if there are many small reads and a syscall is made for each read, it works
slowly because α is big.
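The arithmetic above can be sketched in Go as follows. This is only an illustration of the cost model; the function names are made up for the example, and the figures plugged in below (α = 250 ns, α/β = 2.5·10^4) are picked from the middle of the ranges quoted in these notes, not measured.

```go
package main

// preadCost returns the modelled hot-cache cost of one pread of n bytes,
// C(pread(n)) = α + β·n, in nanoseconds.
// alphaNs is the per-syscall cost, betaNs the per-byte copy cost.
func preadCost(alphaNs, betaNs float64, n int) float64 {
	return alphaNs + betaNs*float64(n)
}

// chunkedCost returns the modelled cost of reading total bytes when a
// separate pread is issued for every chunk bytes: α is paid once per
// syscall, β once per byte.
func chunkedCost(alphaNs, betaNs float64, total, chunk int) float64 {
	calls := (total + chunk - 1) / chunk // number of syscalls, rounded up
	return float64(calls)*alphaNs + betaNs*float64(total)
}
```

With α = 250 ns and β = 0.01 ns/byte, preadCost(250, 0.01, 4096) is about 291 ns (≈ 1.16·α), while chunkedCost(250, 0.01, 4096, 64), i.e. the same page read in 64-byte pieces, is about 16 µs: almost all of it is α.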
pread + user-buffer
~~~~~~~~~~~~~~~~~~~
It is possible to mitigate high α by buffering data from bigger reads in
user-space and serving smaller client reads by copying from that buffer.
Performance depends on the buffer hit/miss ratio and will be evaluated for a
simple 1-page buffer.
mmap
~~~~
The kernel handles both disk I/O and caching (in pagecache).
Cost ~ α (XXX recheck) is spent on first-time access.
Future accesses to the page, given it is still in page-cache, do not incur α cost.
However, I/O errors are reported as SIGBUS on memory access. Thus if a pointer
to mmapped memory is returned for a read request, clients could get I/O errors
as exceptions potentially everywhere.
To get & check I/O errors on the actual read request, the read service will thus
need to access and copy data from mmapped memory to another buffer, incurring
β⋅n cost in the hot-cache case.
Not doing the copy can lead to a situation where data was first read/checked ok
by the read service, then evicted from page-cache by the kernel, then accessed
by the client, which causes real disk I/O; if this I/O fails -> client gets SIGBUS.
Another potential disadvantage: if a memory access causes disk I/O, the whole
OS thread is blocked, not only the goroutine which issued the access.
Note: madvise should be used to guide kernel cache read-ahead/backwards, or to
tell the kernel where we are planning to access data next. madvise is a syscall,
so this can add α back.
...
Direct I/O
~~~~~~~~~~
Kernel handles disk I/O directly to user-space memory.
The kernel does not handle caching.
Cache must be implemented in user-space.
pros:
- kernel is accessed only when there is real need for disk IO.
- memory can be managed completely by "us" in userspace.
- what to cache and preload can be more integrated with client workload.
- various copy disciplines for reads are possible,
  including providing a pointer to in-cache data to clients (though this
  requires implementing ref-counting and such)
cons:
- harder to implement
- Linus dislikes Direct I/O very much
- probably more kernel bugs as this is kind of more exotic area