nexedi / MariaDB
Commit c4cc59d2, authored Mar 20, 2002 by paul@teton.kitebird.com
Merge paul@work.mysql.com:/home/bk/mysql-4.0
into teton.kitebird.com:/home/paul/mysql-4.0
Parents: 1fde1da1, 7a10816b
Showing 1 changed file with 112 additions and 96 deletions
Docs/manual.texi (+112 -96)
...
@@ -33990,8 +33990,8 @@ DELETE FROM t1,t2 USING t1,t2,t3 WHERE t1.id=t2.id AND t2.id=t3.id
...
@@ -33990,8 +33990,8 @@ DELETE FROM t1,t2 USING t1,t2,t3 WHERE t1.id=t2.id AND t2.id=t3.id
In the above case we delete matching rows just from tables @code{t1} and
In the above case we delete matching rows just from tables @code{t1} and
@code{t2}.
@code{t2}.
@code{ORDER BY} and using multiple tables in the @code{DELETE} is supported
@code{ORDER BY} and using multiple tables in the @code{DELETE} statement
in MySQL 4.0.
is supported in MySQL 4.0.
If an @code{ORDER BY} clause is used, the rows will be deleted in that order.
If an @code{ORDER BY} clause is used, the rows will be deleted in that order.
This is really only useful in conjunction with @code{LIMIT}. For example:
This is really only useful in conjunction with @code{LIMIT}. For example:
...
@@ -35947,16 +35947,17 @@ You can set the default isolation level for @code{mysqld} with
...
@@ -35947,16 +35947,17 @@ You can set the default isolation level for @code{mysqld} with
@cindex full-text search
@cindex full-text search
@cindex FULLTEXT
@cindex FULLTEXT
Since Version 3.23.23, MySQL has support for full-text indexing
As of Version 3.23.23, MySQL has support for full-text indexing
and searching. Full-text indexes in MySQL are an index of type
and searching. Full-text indexes in MySQL are an index of type
@code{FULLTEXT}. @code{FULLTEXT} indexes can be created from @code{VARCHAR}
@code{FULLTEXT}. @code{FULLTEXT} indexes can be created from @code{VARCHAR}
and @code{TEXT} columns at @code{CREATE TABLE} time or added later with
and @code{TEXT} columns at @code{CREATE TABLE} time or added later with
@code{ALTER TABLE} or @code{CREATE INDEX}. For large datasets, adding
@code{ALTER TABLE} or @code{CREATE INDEX}. For large datasets, it will be
@code{FULLTEXT} index with @code{ALTER TABLE} (or @code{CREATE INDEX})
much faster to load your data into a table that has no @code{FULLTEXT}
would be much faster than inserting rows into the empty table that has
index, then create the index with @code{ALTER TABLE} (or @code{CREATE
a @code{FULLTEXT} index.
INDEX}). Loading data into a table that already has a @code{FULLTEXT}
index will be slower.
Full-text search is performed with the @code{MATCH} function.
Full-text searching is performed with the @code{MATCH()} function.
@example
@example
mysql> CREATE TABLE articles (
mysql> CREATE TABLE articles (
...
@@ -35988,24 +35989,35 @@ mysql> SELECT * FROM articles
...
@@ -35988,24 +35989,35 @@ mysql> SELECT * FROM articles
2 rows in set (0.00 sec)
2 rows in set (0.00 sec)
@end example
@end example
The function @code{MATCH} matches a natural language (or boolean,
The @code{MATCH()} function performs a natural language search for a string
see below) query in case-insensitive fashion @code{AGAINST}
against a text collection (a set of of one or more columns included in
a text collection (which is simply the set of columns covered by a
a @code{FULLTEXT} index). The search string is given as the argument to
@code{FULLTEXT} index). For every row in a table it returns relevance -
@code{AGAINST()}. The search is performed in case-insensitive fashion.
a similarity measure between the text in that row (in the columns that are
For every row in the table, @code{MATCH()} returns a relevance value,
part of the collection) and the query. When it is used in a @code{WHERE}
that is, a similarity measure between the search string and the text in
clause (see example above) the rows returned are automatically sorted with
that row in the columns named in the @code{MATCH()} list.
relevance decreasing. Relevance is a non-negative floating-point number.
Zero relevance means no similarity. Relevance is computed based on the
number of words in the row, the number of unique words in that row, the
total number of words in the collection, and the number of documents (rows)
that contain a particular word.
The above is a basic example of using @code{MATCH} function. Rows are
When @code{MATCH()} is used in a @code{WHERE} clause (see example above)
returned with relevance decreasing.
the rows returned are automatically sorted with highest relevance first.
Relevance values are non-negative floating-point numbers. Zero relevance
means no similarity. Relevance is computed based on the number of words
in the row, the number of unique words in that row, the total number of
words in the collection, and the number of documents (rows) that contain
a particular word.
It is also possible to perform a boolean mode search. This is explained
later in the section.
The preceding example is a basic illustration showing how to use the
@code{MATCH()} function. Rows are returned in order of decreasing
relevance.
The next example shows how to retrieve the relevance values explicitly.
As neither @code{WHERE} nor @code{ORDER BY} clauses are present, returned
rows are not ordered.
@example
@example
mysql> SELECT id,MATCH title,body AGAINST ('Tutorial') FROM articles;
mysql> SELECT id,MATCH (title,body) AGAINST ('Tutorial') FROM articles;
+----+-----------------------------------------+
+----+-----------------------------------------+
| id | MATCH (title,body) AGAINST ('Tutorial') |
| id | MATCH (title,body) AGAINST ('Tutorial') |
+----+-----------------------------------------+
+----+-----------------------------------------+
...
@@ -36019,12 +36031,16 @@ mysql> SELECT id,MATCH title,body AGAINST ('Tutorial') FROM articles;
...
@@ -36019,12 +36031,16 @@ mysql> SELECT id,MATCH title,body AGAINST ('Tutorial') FROM articles;
6 rows in set (0.00 sec)
6 rows in set (0.00 sec)
@end example
@end example
This example shows how to retrieve the relevances. As neither @code{WHERE}
The following example is more complex. The query returns the relevance
nor @code{ORDER BY} clauses are present, returned rows are not ordered.
and still sorts the rows in order of decreasing relevance. To achieve
this result, you should specify @code{MATCH()} twice. This will cause no
additional overhead, because the MySQL optimiser will notice that the
two @code{MATCH()} calls are identical and invoke the full-text search
code only once.
@example
@example
mysql> SELECT id, body, MATCH title,body AGAINST (
mysql> SELECT id, body, MATCH (title,body) AGAINST
    -> 'Security implications of running MySQL as root') AS score
    -> ('Security implications of running MySQL as root') AS score
-> FROM articles WHERE MATCH (title,body) AGAINST
-> FROM articles WHERE MATCH (title,body) AGAINST
-> ('Security implications of running MySQL as root');
-> ('Security implications of running MySQL as root');
+----+-------------------------------------+-----------------+
+----+-------------------------------------+-----------------+
...
@@ -36036,18 +36052,12 @@ mysql> SELECT id, body, MATCH title,body AGAINST (
...
@@ -36036,18 +36052,12 @@ mysql> SELECT id, body, MATCH title,body AGAINST (
2 rows in set (0.00 sec)
2 rows in set (0.00 sec)
@end example
@end example
This is more complex example - the query returns the relevance and still
MySQL uses a very simple parser to split text into words. A ``word''
sorts the rows with relevance decreasing. To achieve it one should specify
is any sequence of characters consisting of letters, numbers, @samp{'},
@code{MATCH} twice. Note, that this will cause no additional overhead, as
and @samp{_}. Any ``word'' that is present in the stopword list or is just
MySQL optimiser will notice that these two @code{MATCH} calls are
too short (3 characters or less) is ignored.
identical and will call full-text search code only once.
MySQL uses a very simple parser to split text into words. A
Every correct word in the collection and in the query is weighted
``word'' is any sequence of letters, numbers, @samp{'}, and @samp{_}. Any
``word'' that is present in the stopword list or just too short (3
characters or less) is ignored.
Every correct word in the collection and in the query is weighted,
according to its significance in the query or collection. This way, a
according to its significance in the query or collection. This way, a
word that is present in many documents will have lower weight (and may
word that is present in many documents will have lower weight (and may
even have a zero weight), because it has lower semantic value in this
even have a zero weight), because it has lower semantic value in this
...
@@ -36057,28 +36067,28 @@ relevance of the row.
...
@@ -36057,28 +36067,28 @@ relevance of the row.
Such a technique works best with large collections (in fact, it was
Such a technique works best with large collections (in fact, it was
carefully tuned this way). For very small tables, word distribution
carefully tuned this way). For very small tables, word distribution
does not reflect adequately their semantical value, and this model
does not reflect adequately their semantic value, and this model
may sometimes produce bisarre results.
may sometimes produce bizarre results.
@example
@example
mysql> SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('MySQL');
mysql> SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('MySQL');
Empty set (0.00 sec)
Empty set (0.00 sec)
@end example
@end example
Search for the word @code{MySQL} produces no results in the above example.
The search for the word @code{MySQL} produces no results in the above
Word @code{MySQL} is present in more than half of rows, and as such, is
example, because that word is present in more than half of rows. As such,
effectively treated as a stopword (that is, with semantical value zero).
it is effectively treated as a stopword (that is, a word with zero semantic
It is, really, the desired behavior - a natural language query should not
value). This is the most desirable behavior -- a natural language query
return every second row in 1GB table.
should not return every second row from a 1GB table.
A word that matches half of rows in a table is less likely to locate relevant
A word that matches half of rows in a table is less likely to locate relevant
documents. In fact, it will most likely find plenty of irrelevant documents.
documents. In fact, it will most likely find plenty of irrelevant documents.
We all know this happens far too often when we are trying to find something on
We all know this happens far too often when we are trying to find something on
the Internet with a search engine. It is with this reasoning that such rows
the Internet with a search engine. It is with this reasoning that such rows
have been assigned a low semantical value in @strong{this particular dataset}.
have been assigned a low semantic value in @strong{this particular dataset}.
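The word-splitting and 50% threshold rules described in this hunk can be illustrated with a small model. The following is only a sketch in Python, not MySQL's implementation: the stopword list, the ASCII-only word pattern, and the names `tokenize` and `natural_language_match` are assumptions made for illustration, and no relevance weighting is computed.

```python
import re

# Hypothetical stopword list; the real one is defined in myisam/ft_static.c.
STOPWORDS = {"the", "and", "with", "from", "this", "that"}
MIN_WORD_LEN = 4  # words of 3 characters or fewer are ignored


def tokenize(text):
    """Split text into lowercase words made of letters, digits, ' and _."""
    words = re.findall(r"[A-Za-z0-9'_]+", text.lower())
    return [w for w in words if len(w) >= MIN_WORD_LEN and w not in STOPWORDS]


def natural_language_match(rows, query):
    """Return the indexes of rows that share a usable word with the query."""
    row_words = [set(tokenize(r)) for r in rows]
    usable = []
    for word in set(tokenize(query)):
        hits = sum(1 for ws in row_words if word in ws)
        # A word present in more than half of the rows is treated as a stopword.
        if hits * 2 <= len(rows):
            usable.append(word)
    return [i for i, ws in enumerate(row_words) if any(w in ws for w in usable)]


rows = [
    "MySQL Tutorial for beginners",
    "How to use MySQL efficiently",
    "Optimising MySQL queries",
    "Database comparison notes",
]
print(natural_language_match(rows, "MySQL"))     # []
print(natural_language_match(rows, "tutorial"))  # [0]
```

In this toy dataset `MySQL` appears in three of the four rows, so the search for it returns nothing, which mirrors the `Empty set` result shown in the manual's example.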
Since version 4.0.1 MySQL can also perform boolean fulltext searches using
As of Version 4.0.1, MySQL can also perform boolean full-text searches using
@code{IN BOOLEAN MODE} modifier.
the @code{IN BOOLEAN MODE} modifier.
@example
@example
mysql> SELECT * FROM articles WHERE MATCH (title,body)
mysql> SELECT * FROM articles WHERE MATCH (title,body)
...
@@ -36095,38 +36105,44 @@ mysql> SELECT * FROM articles WHERE MATCH (title,body)
...
@@ -36095,38 +36105,44 @@ mysql> SELECT * FROM articles WHERE MATCH (title,body)
@end example
@end example
This query retrieved all the rows that contain the word @code{MySQL}
This query retrieved all the rows that contain the word @code{MySQL}
(note: 50% threshold is gone), but does @strong{not} contain the word
(note: the 50% threshold is not used), but that do @strong{not} contain
@code{YourSQL}. Note, that it does not auto-magically sort rows in
the word @code{YourSQL}. Note that a boolean mode search does not
decreasing relevance order (the last row has the highest relevance,
auto-magically sort rows in order of decreasing relevance. You can
as it contains @code{MySQL} twice). Boolean fulltext search can also
see this from result of the preceding query, where the row with the
work even without @code{FULLTEXT} index, but it would be @strong{slow}.
highest relevance (the one that contains @code{MySQL} twice) is listed
last, not first. A boolean full-text search can also work even without
a @code{FULLTEXT} index, although it would be @strong{slow}.
Boolean fulltext search supports the following operators:
The boolean full-text search capability supports the following operators:
@table @code
@table @code
@item +
@item +
A plus sign prepended to a word indicates that this word @strong{must be}
A leading plus sign indicates that this word @strong{must be}
present in every row returned.
present in every row returned.
@item -
@item -
A minus sign prepended to a word indicates that this word @strong{must not}
A leading minus sign indicates that this word @strong{must not be}
be present in the rows returned.
present in any row returned.
@item
@item
By default - without plus or minus - the word is optional, but the rows that
By default (when neither plus nor minus is specified) the word is optional,
contain it will be rated higher. This mimicks the behaviour of
but the rows that contain it will be rated higher. This mimicks the
@code{MATCH ... AGAINST()} without @code{IN BOOLEAN MODE} modifier.
behaviour of @code{MATCH() ... AGAINST()} without the @code{IN BOOLEAN
MODE} modifier.
@item < >
@item < >
These two operators are used to increase and decrease word's contribution
These two operators are used to change a word's contribution to the
to the relevance value, assigned to a row. See an example below.
relevance value that is assigned to a row. The @code{<} operator
decreases the contribution and the @code{>} operator increases it.
See the example below.
@item ( )
@item ( )
Parentheses are used - as usual - to group words into subexpressions.
Parentheses are used to group words into subexpressions.
@item ~
@item ~
This is negation operator. It makes word's contribution to the row
A leading tilde acts as a negation operator, causing the word's
relevance negative. It's useful for marking noise words. A row that has
contribution to the row relevance to be negative. It's useful for marking
such a word will be rated lower than others, but will not be excluded
noise words. A row that contains such a word will be rated lower than
altogether, as with @code{-} operator.
others, but will not be excluded altogether, as it would be with the
@code{-} operator.
@item *
@item *
This is truncation operator. Unlike others it should be @strong{appended}
An asterisk is the truncation operator. Unlike the other operators, it
to the word, not prepended.
should be @strong{appended} to the word, not prepended.
@end table
@end table
And here are some examples:
And here are some examples:
...
@@ -36148,25 +36164,25 @@ order), but rank ``apple pie'' higher than ``apple strudel''.
...
@@ -36148,25 +36164,25 @@ order), but rank ``apple pie'' higher than ``apple strudel''.
@end table
@end table
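The operator semantics described in the table above can be modelled in a few lines. This is a toy sketch in Python, not MySQL's parser: it covers only `+`, `-`, unprefixed words, and a trailing `*`, splits rows on whitespace, and the function name `boolean_match` is hypothetical. The rule that a query without required words must match at least one optional word is an assumption.

```python
def boolean_match(rows, query):
    """Toy model of +word, -word, plain word and trailing * (prefix match)."""
    required, excluded, optional = [], [], []
    for term in query.split():
        if term.startswith("+"):
            required.append(term[1:].lower())
        elif term.startswith("-"):
            excluded.append(term[1:].lower())
        else:
            optional.append(term.lower())

    def contains(words, term):
        # A trailing asterisk turns the term into a prefix match.
        if term.endswith("*"):
            return any(w.startswith(term[:-1]) for w in words)
        return term in words

    results = []
    for index, row in enumerate(rows):
        words = row.lower().split()
        if any(contains(words, t) for t in excluded):
            continue
        if not all(contains(words, t) for t in required):
            continue
        # Optional words do not filter rows; they only raise the score.
        score = sum(1 for t in optional if contains(words, t))
        # Assumption: with no required words, one optional word must match.
        if not required and score == 0:
            continue
        results.append((index, score))
    return results


rows = ["MySQL versus YourSQL", "MySQL tutorial", "Optimising MySQL", "apple pie"]
print(boolean_match(rows, "+MySQL -YourSQL"))  # [(1, 0), (2, 0)]
print(boolean_match(rows, "+MySQL tutorial"))  # [(0, 0), (1, 1), (2, 0)]
print(boolean_match(rows, "appl*"))            # [(3, 1)]
```

The results come back in row order rather than sorted by score, in line with the manual's note that a boolean mode search does not sort rows by relevance.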
@menu
@menu
* Fulltext Restrictions:: Fulltext Restrictions
* Fulltext Restrictions:: Full-text Restrictions
* Fulltext Fine-tuning:: Fine-tuning MySQL Full-text Search
* Fulltext Fine-tuning:: Fine-tuning MySQL Full-text Search
* Fulltext TODO:: Full-text Search TODO
* Fulltext TODO:: Full-text Search TODO
@end menu
@end menu
@node Fulltext Restrictions, Fulltext Fine-tuning, Fulltext Search, Fulltext Search
@node Fulltext Restrictions, Fulltext Fine-tuning, Fulltext Search, Fulltext Search
@subsection Fulltext Restrictions
@subsection Full-text Restrictions
@itemize @bullet
@itemize @bullet
@item
@item
All parameters to the @code{MATCH} function must be columns from the
All parameters to the @code{MATCH()} function must be columns from the
same table that is part of the same fulltext index, unless this
same table that is part of the same @code{FULLTEXT} index, unless the
@code{MATCH} is @code{IN BOOLEAN MODE}.
@code{MATCH()} is @code{IN BOOLEAN MODE}.
@item
@item
Column list between @code{MATCH} and @code{AGAINST} must match exactly
The @code{MATCH()} column list must exactly match the column list in some
a column list in the @code{FULLTEXT} index definition, unless this
@code{FULLTEXT} index definition for the table, unless this @code{MATCH()}
@code{MATCH} is @code{IN BOOLEAN MODE}.
is @code{IN BOOLEAN MODE}.
@item
@item
The argument to @code{AGAINST} must be a constant string.
The argument to @code{AGAINST()} must be a constant string.
@end itemize
@end itemize
...
@@ -36176,7 +36192,7 @@ The argument to @code{AGAINST} must be a constant string.
...
@@ -36176,7 +36192,7 @@ The argument to @code{AGAINST} must be a constant string.
Unfortunately, full-text search has few user-tunable parameters yet,
Unfortunately, full-text search has few user-tunable parameters yet,
although adding some is very high on the TODO. If you have a
although adding some is very high on the TODO. If you have a
MySQL source distribution (@pxref{Installing source}), you can
MySQL source distribution (@pxref{Installing source}), you can
more control on the full-text search behavior.
exert more control over full-text searching behavior.
Note that full-text search was carefully tuned for the best searching
Note that full-text search was carefully tuned for the best searching
effectiveness. Modifying the default behavior will, in most cases,
effectiveness. Modifying the default behavior will, in most cases,
...
@@ -36186,37 +36202,37 @@ unless you know what you are doing!
...
@@ -36186,37 +36202,37 @@ unless you know what you are doing!
@itemize @bullet
@itemize @bullet
@item
@item
Minimal length of word to be indexed is defined by MySQL
The minimum length of words to be indexed is defined by the MySQL
variable @code{ft_min_word_length}. @xref{SHOW VARIABLES}.
variable @code{ft_min_word_length}. @xref{SHOW VARIABLES}.
Change it to the value you prefer, and rebuild
Change it to the value you prefer, and rebuild
your @code{FULLTEXT} indexes.
your @code{FULLTEXT} indexes.
@item
@item
The stopword list is defined in @file{myisam/ft_static.c}
The stopword list is defined in @file{myisam/ft_static.c}
Modify it to your taste, recompile MySQL and rebuild
Modify it to your taste, recompile MySQL, and rebuild
your @code{FULLTEXT} indexes.
your @code{FULLTEXT} indexes.
@item
@item
The 50% threshold is caused by the particular weighting scheme chosen. To
The 50% threshold is determined by the particular weighting scheme chosen.
disable it, change the following line in @file{myisam/ftdefs.h}:
To disable it, change the following line in @file{myisam/ftdefs.h}:
@example
@example
#define GWS_IN_USE GWS_PROB
#define GWS_IN_USE GWS_PROB
@end example
@end example
to
To:
@example
@example
#define GWS_IN_USE GWS_FREQ
#define GWS_IN_USE GWS_FREQ
@end example
@end example
and recompile MySQL.
Then recompile MySQL.
There is no need to rebuild the indexes in this case.
There is no need to rebuild the indexes in this case.
@strong{Note:} by doing this you @strong{severely} decrease MySQL ability
@strong{Note:} by doing this you @strong{severely} decrease MySQL's ability
to provide adequate relevance values by @code{MATCH} function.
to provide adequate relevance values for the @code{MATCH()} function.
It means, that if you really need to search for such a common words,
If you really need to search for such common words, it would be better to
then you should rather search @code{IN BOOLEAN MODE}, which does not
search using @code{IN BOOLEAN MODE} instead, which does not observe the 50%
has 50% threshold.
threshold.
@item
@item
Sometimes search engine maintaner would like to change operators used
Sometimes the search engine maintainer would like to change the operators used
for boolean fulltext search. They are defined by a
for boolean fulltext searches. These are defined by the
@code{ft_boolean_syntax} variable. @xref{SHOW VARIABLES}.
@code{ft_boolean_syntax} variable. @xref{SHOW VARIABLES}.
Still, this variable is read-only, its value is set in
Still, this variable is read-only, its value is set in
@file{myisam/ft_static.c}.
@file{myisam/ft_static.c}.
...
@@ -36237,7 +36253,7 @@ the user wants to treat as words, examples are "C++", "AS/400", "TCP/IP", etc.
...
@@ -36237,7 +36253,7 @@ the user wants to treat as words, examples are "C++", "AS/400", "TCP/IP", etc.
@item Support for multi-byte charsets.
@item Support for multi-byte charsets.
@item Make stopword list to depend of the language of the data.
@item Make stopword list to depend of the language of the data.
@item Stemming (dependent of the language of the data, of course).
@item Stemming (dependent of the language of the data, of course).
@item Generic user-supplyable UDF (?) preparser.
@item Generic user-suppliable UDF (?) preparser.
@item Make the model more flexible (by adding some adjustable
@item Make the model more flexible (by adding some adjustable
parameters to @code{FULLTEXT} in @code{CREATE/ALTER TABLE}).
parameters to @code{FULLTEXT} in @code{CREATE/ALTER TABLE}).
@end itemize
@end itemize
...
@@ -49697,7 +49713,7 @@ Fixed bug with @code{LOCK TABLE} and BDB tables.
...
@@ -49697,7 +49713,7 @@ Fixed bug with @code{LOCK TABLE} and BDB tables.
@itemize @bullet
@itemize @bullet
@item
@item
Fixed a bug when using @code{MATCH} in @code{HAVING} clause.
Fixed a bug when using @code{MATCH()} in @code{HAVING} clause.
@item
@item
Fixed a bug when using @code{HEAP} tables with @code{LIKE}.
Fixed a bug when using @code{HEAP} tables with @code{LIKE}.
@item
@item
...
@@ -50266,7 +50282,7 @@ that caused @code{mysql_install_db} to core dump on some Linux machines.
...
@@ -50266,7 +50282,7 @@ that caused @code{mysql_install_db} to core dump on some Linux machines.
@item
@item
Changed @code{mi_create()} to use less stack space.
Changed @code{mi_create()} to use less stack space.
@item
@item
Fixed bug with optimiser trying to over-optimise @code{MATCH} when used
Fixed bug with optimiser trying to over-optimise @code{MATCH()} when used
with @code{UNIQUE} key.
with @code{UNIQUE} key.
@item
@item
Changed @code{crash-me} and the MySQL benchmarks to also work
Changed @code{crash-me} and the MySQL benchmarks to also work
...
@@ -50722,7 +50738,7 @@ More variables in @code{SHOW SLAVE STATUS} and @code{SHOW MASTER STATUS}.
...
@@ -50722,7 +50738,7 @@ More variables in @code{SHOW SLAVE STATUS} and @code{SHOW MASTER STATUS}.
@item
@item
@code{SLAVE STOP} now will not return until the slave thread actually exits.
@code{SLAVE STOP} now will not return until the slave thread actually exits.
@item
@item
Full text search via the @code{MATCH} function and @code{FULLTEXT} index type
Full text search via the @code{MATCH()} function and @code{FULLTEXT} index type
(for MyISAM files). This makes @code{FULLTEXT} a reserved word.
(for MyISAM files). This makes @code{FULLTEXT} a reserved word.
@end itemize
@end itemize