[t:4372] couple more updates to brt comment

git-svn-id: file:///svn/toku/tokudb@39466 c7de825b-a66e-492c-adef-691d508d4ae1

[t:4372] couple more updates to brt comment
git-svn-id: file:///svn/toku/tokudb@39466 c7de825b-a66e-492c-adef-691d508d4ae1
75e3b574 · John Esmet · Yoni Fogel · 89142362 · 75e3b574
Commit 75e3b574 authored Feb 02, 2012 by John Esmet Committed by Yoni Fogel Apr 17, 2013
Hide whitespace changes
Inline Side-by-side

Showing with 21 additions and 35 deletions

newbrt/brt.c newbrt/brt.c +21 -35

No files found.
--- a/newbrt/brt.c
+++ b/newbrt/brt.c
@@ -10,13 +10,28 @@ Managing the tree shape:  How insertion, deletion, and querying work
 When we insert a message into the BRT, here's what happens.

 to insert a message at the root
+
    - find the root node
    - capture the next msn of the root node and assign it to the message
    - split the root if it needs to be split
    - insert the message into the root buffer
    - if the root is too full, then flush_some_child() of the root on a flusher thread

+flusher functions use an advice struct with provides some functions to
+call that tell it what to do based on the context of the flush. see brt-flusher.h
+
+to flush some child, given a parent and some advice
+    - pick the child using advice->pick_child()
+    - remove that childs buffer from the parent
+    - flush the buffer to the child
+    - if the child has stable reactivity and 
+      advice->should_recursively_flush() is true, then
+      flush_some_child() of the child
+    - otherwise split the child if it needs to be split
+    - otherwise maybe merge the child if it needs to be merged
+
 flusher threads:
+
    flusher threads are created on demand as the result of internal nodes
    becoming gorged by insertions. this allows flushing to be done somewhere
    other than the client thread. these work items are enqueued onto
@@ -31,7 +46,7 @@ cleaner threads:
    the cleaner thread need not actually do a flush when awoken, so only
    nodes that have sufficient cache pressure are flushed.

-checkpoingting:
+checkpointing:

    the checkpoint thread wakes up every minute to checkpoint dirty nodes
    to disk. at the time of this writing, nodes during checkpoint are
@@ -41,39 +56,6 @@ checkpoingting:
    many nodes and preventing other threads from traversing down the tree,
    for a query or otherwise.

-Flusher functions use an advice struct with provides some functions to
-call that tell it what to do based on the context of the flush. see brt-flusher.h
-
-to flush some child, given a parent and some advice
-    - pick the child using advice->pick_child()
-    - remove that childs buffer from the parent
-    - flush the buffer to the child
-    - if the child has stable reactivity and 
-      advice->should_recursively_flush() is true, then
-      flush_some_child() of the child
-    - otherwise split the child if it needs to be split
-    - otherwise maybe merge the child if it needs to be merged
-
-background flattener
-   It's state is a height and a key and a child number
-   Repeat:
-      sleep (say 1s)
-      grab the ydb lock
-      descend the tree to find the height and key
-      while the node is not empty:
-	 bring the child into memory (possibly causing a TRY_AGAIN)
-	 move all messages from the node into the child
-	 if the child needs to be split or merged then split or merge the child
-	 set the state to operate on the next relevant node in the depth-first order
-	   That is: if there are more children, increment the child number, and return.
-		    if there are no more children, then return with an error code that says "next".  At the first point at the descent is not to the ultimate
-			     child, then set the state to visit that node and that child.  
-		    if we get back up to the root then the state goes to "root" and "child 0" so the whole background flattener can run again the next BRT.   
-		      Probably only open BRTs get flattened.
-		      It may be important for the flattener not to run if there've been no message insertions since the last time it ran.
-      The background flattener should also garbage collect MVCC versions.  The flattener should remember the MVCC versions it has encountered
-	so that if any of those are no longer live, it can run again.
-
 To shrink a file: Let X be the size of the reachable data.  
    We define an acceptable bloat constant of C.  For example we set C=2 if we are willing to allow the file to be as much as 2X in size.
    The goal is to find the smallest amount of stuff we can move to get the file down to size CX.
@@ -117,7 +99,11 @@ lookups:
    algorithm on insertions.
    - when a node is brought into memory, we apply ancestor messages above it.
    - for point queries, we do not read the entire node into memory. instead,
-      only the required basement node is read
+      we only read in the required basement node
+    - for range queries, cursors may return cursor continue in their callback
+      to take a shortcut path to the next row in the basement node.
+    - for range queries, cursors that prelock a range benefit from 
+      internal prefetching of nodes within that range.

 */