Commit ef02fc98 authored by monty@mashka.mysql.fi

Merge bk-internal.mysql.com:/home/bk/mysql-4.0

into mashka.mysql.fi:/home/my/mysql-4.0
parents da3e2d29 10d46843
......@@ -43,18 +43,18 @@ END-INFO-DIR-ENTRY
@page
@end titlepage
@node Top, caching, (dir), (dir)
@node Top, coding guidelines, (dir), (dir)
@ifinfo
This is a manual about @strong{MySQL} internals.
@end ifinfo
@menu
* coding guidelines:: Coding Guidelines
* caching:: How MySQL Handles Caching
* join_buffer_size::
* join_buffer_size::
* flush tables:: How MySQL Handles @code{FLUSH TABLES}
* filesort:: How MySQL Does Sorting (@code{filesort})
* coding guidelines:: Coding Guidelines
* Algorithms::
* mysys functions:: Functions In The @code{mysys} Library
* DBUG:: DBUG Tags To Use
* protocol:: MySQL Client/Server Protocol
......@@ -67,7 +67,167 @@ This is a manual about @strong{MySQL} internals.
@end menu
@node caching, join_buffer_size, Top, Top
@node coding guidelines, caching, Top, Top
@chapter Coding Guidelines
@itemize @bullet
@item
We use @uref{http://www.bitkeeper.com/, BitKeeper} for source management.
@item
You should use the @strong{MySQL} 4.0 source for all development.
@item
If you have any questions about the @strong{MySQL} source, you can post these
to @email{dev-public@@mysql.com} and we will answer them. Please
remember not to use this internal email list in public!
@item
Try to write code as a set of black boxes that can be reused, or at
least give it a clean, easy-to-change interface.
@item
Reuse code; there are already a lot of algorithms in MySQL for list handling,
queues, dynamic and hashed arrays, sorting, etc. that can be reused.
@item
Use the @code{my_*} functions like @code{my_read()}/@code{my_write()}/
@code{my_malloc()} that you can find in the @code{mysys} library instead
of the direct system calls; this will make your code easier to debug and
more portable.
@item
Try to always write optimized code, so that you don't have to
go back and rewrite it a couple of months later. It's better to
spend 3 times as much time designing and writing an optimal function than
having to do it all over again later on.
@item
Avoid CPU wasteful code, even where it does not matter, so that
you will not develop sloppy coding habits.
@item
If you can write it in fewer lines, do it (as long as the code will not
be slower or much harder to read).
@item
Don't use two commands on the same line.
@item
Do not check the same pointer for @code{NULL} more than once.
@item
Use long function and variable names in English. This makes your code
easier to read.
@item
Use @code{my_var} as opposed to @code{myVar} or @code{MyVar} (@samp{_}
rather than dancing SHIFT to separate words in identifiers).
@item
Think assembly - make it easier for the compiler to optimize your code.
@item
Comment your code when you do something that someone else may think
is not ``trivial''.
@item
Use @code{libstring} functions (in the @file{strings} directory)
instead of standard @code{libc} string functions whenever possible.
@item
Avoid using @code{malloc()} (it's REAL slow); for memory allocations
that only need to live for the lifetime of one thread, one should use
@code{sql_alloc()} instead.
@item
Before making big design decisions, please first post a summary of
what you want to do, why you want to do it, and how you plan to do
it. This way we can easily provide you with feedback and also
easily discuss it thoroughly if some other developer thinks there is a better
way to do the same thing!
@item
Class names start with a capital letter.
@item
Structure types are @code{typedef}'ed to an all-caps identifier.
@item
Any @code{#define}'s are in all-caps.
@item
Matching @samp{@{} are in the same column.
@item
Put the @samp{@{} after a @code{switch} on the same line, as this gives
better overall indentation for the switch statement:
@example
switch (arg) @{
@end example
@item
In all other cases, @samp{@{} and @samp{@}} should be on their own line, except
if there is nothing inside @samp{@{} and @samp{@}}.
@item
Have a space after @code{if}
@item
Put a space after @samp{,} for function arguments
@item
Functions return @samp{0} on success, and non-zero on error, so you can do:
@example
if (a() || b() || c()) @{ error("something went wrong"); @}
@end example
@item
Using @code{goto} is okay if not abused.
@item
Avoid default variable initializations; use @code{LINT_INIT()} if the
compiler complains after making sure that there is really no way
the variable can be used uninitialized.
@item
Do not instantiate a class if you do not have to.
@item
Use pointers rather than array indexing when operating on strings.
@end itemize
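Several of the rules above can be seen together in a short sketch. All names
here (@code{NAME_INFO}, @code{MAX_NAME_LENGTH}, @code{init_name_info()} and
so on) are hypothetical and made up for illustration; they are not taken from
the MySQL source. Plain @code{assert()} is used only to keep the example
self-contained.

```cpp
#include <cassert>
#include <cstddef>

#define MAX_NAME_LENGTH 64              /* #define's are in all-caps */

/* Structure types are typedef'ed to an all-caps identifier */
typedef struct st_name_info
{
  const char *name;
  size_t length;
} NAME_INFO;

/*
  Count the upper-case ASCII letters in a string.
  Walks the string with a pointer instead of indexing an array.
*/
size_t count_upper_case_letters(const char *str)
{
  size_t upper_case_count = 0;
  assert(str != NULL);
  for (; *str; str++)
  {
    if (*str >= 'A' && *str <= 'Z')
      upper_case_count++;
  }
  return upper_case_count;
}

/*
  Fill in name_info from a string.
  Returns 0 on success and non-zero on error.
*/
int init_name_info(NAME_INFO *name_info, const char *name)
{
  const char *end;
  if (!name_info || !name)
    return 1;
  for (end = name; *end; end++) {}
  if ((size_t) (end - name) > MAX_NAME_LENGTH)
    return 1;
  name_info->name = name;
  name_info->length = (size_t) (end - name);
  return 0;
}

/* Map an error code to a message; note the brace placement for switch */
const char *error_code_to_text(int error_code)
{
  switch (error_code) {
  case 0:
    return "ok";
  case 1:
    return "invalid argument";
  default:
    return "unknown error";
  }
}
```

Because @code{init_name_info()} returns 0 on success, several calls can be
chained with @samp{||} in a single @code{if}, as in the error-handling rule
above.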
Suggested mode in emacs:
@example
(load "cc-mode")
(setq c-mode-common-hook '(lambda ()
(turn-on-font-lock)
(setq comment-column 48)))
(setq c-style-alist
(cons
'("MY"
(c-basic-offset . 2)
(c-comment-only-line-offset . 0)
(c-offsets-alist . ((statement-block-intro . +)
(knr-argdecl-intro . 0)
(substatement-open . 0)
(label . -)
(statement-cont . +)
(arglist-intro . c-lineup-arglist-intro-after-paren)
(arglist-close . c-lineup-arglist)
))
)
c-style-alist))
(c-set-style "MY")
(setq c-default-style "MY")
@end example
@node caching, join_buffer_size, coding guidelines, Top
@chapter How MySQL Handles Caching
@strong{MySQL} has the following caches:
......@@ -181,7 +341,7 @@ same algorithm described above to handle it. (In other words, we store
the same row combination several times into different buffers)
@end itemize
@node flush tables, filesort, join_buffer_size, Top
@node flush tables, Algorithms, join_buffer_size, Top
@chapter How MySQL Handles @code{FLUSH TABLES}
@itemize @bullet
......@@ -226,8 +386,19 @@ After this it will give other threads a chance to open the same tables.
@end itemize
@node filesort, coding guidelines, flush tables, Top
@chapter How MySQL Does Sorting (@code{filesort})
@node Algorithms, mysys functions, flush tables, Top
@chapter Different algorithms used in MySQL
MySQL uses a lot of different algorithms. This chapter tries to describe
some of these:
@menu
* filesort::
* bulk-insert::
@end menu
@node filesort, bulk-insert, Algorithms, Algorithms
@section How MySQL Does Sorting (@code{filesort})
@itemize @bullet
......@@ -266,169 +437,20 @@ and then we read the rows in the sorted order into a row buffer
@end itemize
@node bulk-insert, , filesort, Algorithms
@section Bulk insert
The logic behind the bulk insert optimisation is simple.
Instead of writing each key value to the b-tree (that is, to the keycache,
but the bulk insert code doesn't know about the keycache), keys are stored
in a balanced binary (red-black) tree in memory. When this tree reaches its
memory limit, it writes all keys to disk (to the keycache, that is). But
as the key stream coming from the binary tree is already sorted, inserting
goes much faster: all the necessary pages are already in cache, disk
access is minimized, etc.
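The idea can be sketched as follows. This is only an illustration under
stated assumptions, not the actual bulk insert code: @code{Index_writer} and
@code{Bulk_insert_buffer} are hypothetical names, the memory accounting
simply sums key lengths, and @code{std::multiset} stands in for the
red-black tree (it is commonly implemented as one, but the C++ standard
does not require that).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

/*
  Hypothetical stand-in for the on-disk index (the b-tree / keycache).
  It only records the order in which keys arrive.
*/
struct Index_writer
{
  std::vector<std::string> written_keys;
  void write_key(const std::string &key)
  {
    written_keys.push_back(key);
  }
};

/*
  Buffers keys in a sorted in-memory tree and flushes them in key order
  once the (hypothetical) memory limit is reached.
*/
class Bulk_insert_buffer
{
public:
  Bulk_insert_buffer(Index_writer *writer_arg, size_t memory_limit_arg)
    : writer(writer_arg), memory_limit(memory_limit_arg), memory_used(0)
  {
    assert(writer_arg != NULL);
  }

  /* Add a key to the tree; flush when the limit is reached */
  void insert_key(const std::string &key)
  {
    buffered_keys.insert(key);
    memory_used += key.size();
    if (memory_used >= memory_limit)
      flush();
  }

  /* Write all buffered keys in sorted order and empty the buffer */
  void flush()
  {
    std::multiset<std::string>::const_iterator key;
    for (key = buffered_keys.begin(); key != buffered_keys.end(); ++key)
      writer->write_key(*key);
    buffered_keys.clear();
    memory_used = 0;
  }

private:
  Index_writer *writer;
  size_t memory_limit;
  size_t memory_used;
  std::multiset<std::string> buffered_keys;
};
```

Keys inserted in arbitrary order reach @code{Index_writer} sorted within
each flushed batch, which is what lets the real index see neighbouring keys
one after another.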
@node mysys functions, DBUG, coding guidelines, Top
@node mysys functions, DBUG, Algorithms, Top
@chapter Functions In The @code{mysys} Library
Functions in @code{mysys}: (For flags see @file{my_sys.h})
......@@ -624,6 +646,16 @@ Print query.
* fieldtype codes::
* protocol functions::
* protocol version 2::
* 4.1 protocol changes::
* 4.1 field packet::
* 4.1 field desc::
* 4.1 ok packet::
* 4.1 end packet::
* 4.1 error packet::
* 4.1 prep init::
* 4.1 long data::
* 4.1 execute::
* 4.1 binary result::
@end menu
@node raw packet without compression, raw packet with compression, protocol, protocol
......@@ -690,7 +722,7 @@ is the header of the packet.
@end menu
@node ok packet, error packet, basic packets, basic packets, basic packets
@node ok packet, error packet, basic packets, basic packets
@subsection OK Packet
For details, see @file{sql/net_pkg.cc::send_ok()}.
......@@ -720,7 +752,7 @@ For details, see @file{sql/net_pkg.cc::send_ok()}.
@end table
@node error packet, , ok packet, basic packets, basic packets
@node error packet, , ok packet, basic packets
@subsection Error Packet
@example
......@@ -835,7 +867,7 @@ For details, see @file{sql/net_pkg.cc::send_ok()}.
n data
@end example
@node fieldtype codes, protocol functions, communication
@node fieldtype codes, protocol functions, communication, protocol
@section Fieldtype Codes
@example
......@@ -859,7 +891,7 @@ Time 03 08 00 00 |01 0B |03 00 00 00
Date 03 0A 00 00 |01 0A |03 00 00 00
@end example
@node protocol functions, protocol version 2, fieldtype codes
@node protocol functions, protocol version 2, fieldtype codes, protocol
@section Functions used to implement the protocol
@c This should be merged with the above one and changed to texi format
......@@ -971,7 +1003,7 @@ client. If this is equal to the new message the client sends to the
server then the password is accepted.
@end example
@node protocol version 2, 4.1 protocol changes, protocol functions
@node protocol version 2, 4.1 protocol changes, protocol functions, protocol
@section Another description of the protocol
@c This should be merged with the above one and changed to texi format.
......@@ -1664,7 +1696,7 @@ fe 00 . .
@c @node 4.1 protocol,,,
@c @chapter MySQL 4.1 protocol
@node 4.1 protocol changes, 4.1 field packet, protocol version 2
@node 4.1 protocol changes, 4.1 field packet, protocol version 2, protocol
@section Changes to 4.0 protocol in 4.1
All basic packet handling is identical to 4.0. When communication
......@@ -1699,7 +1731,7 @@ results will be sent as binary (low-byte-first).
@end itemize
@node 4.1 field packet, 4.1 field desc, 4.1 protocol changes
@node 4.1 field packet, 4.1 field desc, 4.1 protocol changes, protocol
@section 4.1 field description packet
The field description packet is sent as a response to a query that
......@@ -1719,7 +1751,7 @@ uses this to send the number of rows in the table)
This packet is always followed by a field description set.
@xref{4.1 field desc}.
@node 4.1 field desc, 4.1 ok packet, 4.1 field packet
@node 4.1 field desc, 4.1 ok packet, 4.1 field packet, protocol
@section 4.1 field description result set
The field description result set contains the meta info for a result set.
......@@ -1737,7 +1769,7 @@ The field description result set contains the meta info for a result set.
@end multitable
@node 4.1 ok packet, 4.1 end packet, 4.1 field desc
@node 4.1 ok packet, 4.1 end packet, 4.1 field desc, protocol
@section 4.1 ok packet
The ok packet is the first that is sent as a response for a query
......@@ -1763,7 +1795,7 @@ The message is optional. For example for multi line INSERT it
contains a string for how many rows were inserted / deleted.
@node 4.1 end packet, 4.1 error packet, 4.1 ok packet
@node 4.1 end packet, 4.1 error packet, 4.1 ok packet, protocol
@section 4.1 end packet
The end packet is sent as the last packet for
......@@ -1792,7 +1824,7 @@ by checking the packet length < 9 bytes (in which case it's an end
packet).
@node 4.1 error packet, 4.1 prep init, 4.1 end packet
@node 4.1 error packet, 4.1 prep init, 4.1 end packet, protocol
@section 4.1 error packet.
The error packet is sent when something goes wrong.
......@@ -1809,7 +1841,7 @@ The client/server protocol is designed in such a way that a packet
can only start with 255 if it's an error packet.
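The two checks described here (a first byte of 255 for an error packet, and
a packet length below 9 bytes for an end packet) could be combined roughly
as in the sketch below. @code{Packet_kind} and @code{classify_packet()} are
hypothetical names, and the end-packet marker value 254 is an assumption of
this sketch, since the text shown here does not state it.

```cpp
#include <cassert>
#include <cstddef>

enum Packet_kind
{
  PACKET_ERROR,
  PACKET_END,
  PACKET_OTHER
};

/*
  Classify a packet body (the bytes after the packet header).
  The marker value 254 for end packets is an assumption of this sketch.
*/
Packet_kind classify_packet(const unsigned char *packet, size_t packet_length)
{
  assert(packet != NULL || packet_length == 0);
  if (!packet_length)
    return PACKET_OTHER;
  if (*packet == 255)                           /* only error packets */
    return PACKET_ERROR;
  if (*packet == 254 && packet_length < 9)      /* short: end packet */
    return PACKET_END;
  return PACKET_OTHER;
}
```

A longer packet that happens to start with 254 is treated as ordinary data,
which is the point of the length check.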
@node 4.1 prep init, 4.1 long data, 4.1 error packet
@node 4.1 prep init, 4.1 long data, 4.1 error packet, protocol
@section 4.1 prepared statement init packet
This is the return packet when one sends a query with the COM_PREPARE
......@@ -1843,7 +1875,7 @@ prepared statement will contain a result set. In this case the packet
is followed by a field description result set. @xref{4.1 field desc}.
@node 4.1 long data, 4.1 execute, 4.1 prep init
@node 4.1 long data, 4.1 execute, 4.1 prep init, protocol
@section 4.1 long data handling
This is used by mysql_send_long_data() to set any parameter to a string
......@@ -1870,7 +1902,7 @@ The server will NOT send an @code{ok} or @code{error} packet in
response to this. If there are any errors (like a too big string), one
will get the error when calling execute.
@node 4.1 execute, 4.1 binary result, 4.1 long data
@node 4.1 execute, 4.1 binary result, 4.1 long data, protocol
@section 4.1 execute
On execute we send all parameters to the server in a COM_EXECUTE
......@@ -1908,7 +1940,7 @@ The parameters are stored the following ways:
The result for this will be either an ok packet or a binary result
set.
@node 4.1 binary result, , 4.1 execute
@node 4.1 binary result, , 4.1 execute, protocol
@section 4.1 binary result set
A binary result is sent the following way.
......@@ -2384,7 +2416,7 @@ work for different record formats are: /myisam/mi_statrec.c,
/myisam/mi_dynrec.c, and /myisam/mi_packrec.c.
@*
@node InnoDB Record Structure,InnoDB Page Structure,MyISAM Record Structure,Top
@node InnoDB Record Structure, InnoDB Page Structure, MyISAM Record Structure, Top
@chapter InnoDB Record Structure
This page contains:
......@@ -2690,7 +2722,7 @@ shorter because the NULLs take no space.
The most relevant InnoDB source-code files are rem0rec.c, rem0rec.ic,
and rem0rec.h in the rem ("Record Manager") directory.
@node InnoDB Page Structure,Files in MySQL Sources,InnoDB Record Structure,Top
@node InnoDB Page Structure, Files in MySQL Sources, InnoDB Record Structure, Top
@chapter InnoDB Page Structure
InnoDB stores all records inside a fixed-size unit which is commonly called a
......@@ -3121,7 +3153,7 @@ header.
The most relevant InnoDB source-code files are page0page.c,
page0page.ic, and page0page.h in the \page directory.
@node Files in MySQL Sources,Files in InnoDB Sources,InnoDB Page Structure,Top
@node Files in MySQL Sources, Files in InnoDB Sources, InnoDB Page Structure, Top
@chapter Annotated List Of Files in the MySQL Source Code Distribution
This is a description of the files that you get when you download the
......@@ -4942,7 +4974,7 @@ The MySQL program that uses zlib is \mysys\my_compress.c. The use is
for packet compression. The client sends messages to the server which
are compressed by zlib. See also: \sql\net_serv.cc.
@node Files in InnoDB Sources,,Files in MySQL Sources,Top
@node Files in InnoDB Sources, , Files in MySQL Sources, Top
@chapter Annotated List Of Files in the InnoDB Source Code Distribution
ERRATUM BY HEIKKI TUURI (START)
......@@ -605,6 +605,9 @@ Package=<5>
Package=<4>
{{{
Begin Project Dependency
Project_Dep_Name strings
End Project Dependency
}}}
###############################################################################