C++ CSS HTML Java JavaScript MySQL Oracle PERL PHP SQL Unix VBScript XHTML XML Сети
Optimising Database Structure (MySQL 4.0)
 
Optimising Database Structure
=============================

Design Choices
--------------


systems.

 one
column. Because this degenerates so quickly when more than one column
is accessed, we believe that this model is not good for general purpose
databases.

 layout is that
it, in many cases, depending on how well the index is cached, saves a
disk read.  The bad things with this layout are:

   * Table scanning is much slower because you have to read through the
     indexes to get at the data.

   * You can't use only the index table to retrieve data for a query.

   * You lose a lot of space, as you must duplicate indexes from the
     nodes (as you can't store the row in the nodes).

   * Deletes will degenerate the table over time (as indexes in nodes
     are usually not updated on delete).

   * It's harder to cache only the index data.

Get Your Data as Small as Possible
----------------------------------

 normally less
main memory will be used. Indexing also takes less resources if done on
smaller columns.

MySQL supports a lot of different table types and row formats.
Choosing the right table format may give you a big performance gain.
*Note Table types::.

You can get better performance on a table and minimise storage space
using the techniques listed here:

   * Use the most efficient (smallest) types possible. MySQL has many
     specialised types that save disk space and memory.

   * Use the smaller integer types if possible to get smaller tables.
     For example, `MEDIUMINT' is often better than `INT'.

    it. Just
     avoid having it on all columns by default.

   * If you don't have any variable-length columns (`VARCHAR', `TEXT',
     or `BLOB' columns), a fixed-size record format is used. This is
     faster but unfortunately may waste some space.  *Note `MyISAM'
     table formats: MyISAM table formats.

   * The primary index of a table should be as short as possible. This
     makes identification of one row easy and efficient.

   * For each table, you have to decide which storage/index method to
     use. *Note Table types::.

   * Only create the indexes that you really need. Indexes are good for
     retrieval but bad when you need to store things fast. If you mostly
     access a table by searching on a combination of columns, make an
     index on them. The first index part should be the most used
     column. If you are *always* using many columns, you should use the
     column with more duplicates first to get better compression of the
     index.

   * If it's very likely that a column has a unique prefix on the first
     number of characters, it's better to only index this prefix. MySQL
     supports an index on a part of a character column. Shorter indexes
     are faster not only because they take less disk space but also
     because they will give you more hits in the index cache and thus
     fewer disk seeks. *Note Server parameters::.

   * In some circumstances it can be beneficial to split into two a
     table that is scanned very often. This is especially true if it is
     a dynamic format table and it is possible to use a smaller static
     format table that can be used to find the relevant rows when
     scanning the table.

How MySQL Uses Indexes
----------------------

 The bigger
the table, the more this costs. If the table has an index for the
columns in question, MySQL can quickly get a position to seek to in the
middle of the datafile without having to look at all the data. If a
table has 1000 rows, this is at least 100 times faster than reading
sequentially. Note that if you need to access almost all 1000 rows it
is faster to read sequentially because we then avoid disk seeks.

All MySQL indexes (`PRIMARY', `UNIQUE', and `INDEX') are stored in
B-trees. Strings are automatically prefix- and end-space compressed.
*Note `CREATE INDEX': CREATE INDEX.

Indexes are used to:
   * Quickly find the rows that match a `WHERE' clause.

   * Retrieve rows from other tables when performing joins.

   * Find the `MAX()' or `MIN()' value for a specific indexed column.
     This is optimised by a preprocessor that checks if you are using
     `WHERE' key_part_# = constant on all key parts < N.  In this case
     MySQL will do a single key lookup and replace the `MIN()'
     expression with a constant.  If all expressions are replaced with
     constants, the query will return at once:

          SELECT MIN(key_part2),MAX(key_part2) FROM table_name where key_part1=10

   * Sort or group a table if the sorting or grouping is done on a
     leftmost prefix of a usable key (for example, `ORDER BY
     key_part_1,key_part_2 '). The key is read in reverse order if all
     key parts are followed by `DESC'.  *Note ORDER BY optimisation::.

   * In some cases a query can be optimised to retrieve values without
     consulting the datafile. If all used columns for some table are
     numeric and form a leftmost prefix for some key, the values may be
     retrieved from the index tree for greater speed:

          SELECT key_part3 FROM table_name WHERE key_part1=1


Suppose you issue the following `SELECT' statement:

     mysql> SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;

If a multiple-column index exists on `col1' and `col2', the appropriate
rows can be fetched directly. If separate single-column indexes exist
on `col1' and `col2', the optimiser tries to find the most restrictive
index by deciding which index will find fewer rows and using that index
to fetch the rows.

If the table has a multiple-column index, any leftmost prefix of the
index can be used by the optimiser to find rows. For example, if you
have a three-column index on `(col1,col2,col3)', you have indexed
search capabilities on `(col1)', `(col1,col2)', and `(col1,col2,col3)'.

MySQL can't use a partial index if the columns don't form a leftmost
prefix of the index.  Suppose you have the `SELECT' statements shown
here:

     mysql> SELECT * FROM tbl_name WHERE col1=val1;
     mysql> SELECT * FROM tbl_name WHERE col2=val2;
     mysql> SELECT * FROM tbl_name WHERE col2=val2 AND col3=val3;

If an index exists on `(col1,col2,col3)', only the first of the
preceding queries uses the index. The second and third queries do
involve indexed columns, but `(col2)' and `(col2,col3)' are not
leftmost prefixes of `(col1,col2,col3)'.

MySQL also uses indexes for `LIKE' comparisons if the argument to
`LIKE' is a constant string that doesn't start with a wildcard
character.  For example, the following `SELECT' statements use indexes:

     mysql> SELECT * FROM tbl_name WHERE key_col LIKE "Patrick%";
     mysql> SELECT * FROM tbl_name WHERE key_col LIKE "Pat%_ck%";

In the first statement, only rows with `"Patrick" <= key_col <
"Patricl"' are considered.  In the second statement, only rows with
`"Pat" <= key_col < "Pau"' are considered.

The following `SELECT' statements will not use indexes:
     mysql> SELECT * FROM tbl_name WHERE key_col LIKE "%Patrick%";
     mysql> SELECT * FROM tbl_name WHERE key_col LIKE other_col;

In the first statement, the `LIKE' value begins with a wildcard
character.  In the second statement, the `LIKE' value is not a constant.

MySQL 4.0 does another optimisation on `LIKE'.  If you use `... LIKE
"%string%"' and `string' is longer than 3 characters, MySQL will use
the `Turbo Boyer-Moore' algorithm to initialise the pattern for the
string and then use this pattern to perform the search quicker.

Searching using `column_name IS NULL' will use indexes if column_name
is an index.

MySQL normally uses the index that finds the least number of rows. An
index is used for columns that you compare with the following operators:
`=', `>', `>=', `<', `<=', `BETWEEN', and a `LIKE' with a non-wildcard
prefix like `'something%''.



The following `WHERE' clauses use indexes:
     ... WHERE index_part1=1 AND index_part2=2 AND other_column=3
     ... WHERE index=1 OR A=10 AND index=2      /* index = 1 OR index = 2 */
     ... WHERE index_part1='hello' AND index_part_3=5
               /* optimised like "index_part1='hello'" */
     ... WHERE index1=1 and index2=2 or index1=3 and index3=3;
               /* Can use index on index1 but not on index2 or index 3 */

These `WHERE' clauses do *NOT* use indexes:
     ... WHERE index_part2=1 AND index_part3=2  /* index_part_1 is not used */
     ... WHERE index=1 OR A=10                  /* Index is not used in
                                                             both AND parts */
     ... WHERE index_part1=1 OR index_part2=10  /* No index spans all rows  */

Note that in some cases MySQL will not use an index, even if one would
be available.  Some of the cases where this happens are:

   * If the use of the index would require MySQL to access more than
     30% of the rows in the table.  (In this case a table scan is
     probably much faster, as this will require us to do much fewer
     seeks.)  Note that if such a query uses `LIMIT' to only retrieve
     part of the rows, MySQL will use an index anyway, as it can much
     more quickly find the few rows to return in the result.

Column Indexes
--------------

All MySQL column types can be indexed.  Use of indexes on the relevant
columns is the best way to improve the performance of `SELECT'
operations.

The maximum number of keys and the maximum index length is defined per
storage engine. *Note Table types::. You can with all storage engines
have at least 16 keys and a total index length of at least 256 bytes.

For `CHAR' and `VARCHAR' columns, you can index a prefix of a column.
This is much faster and requires less disk space than indexing the
whole column.  The syntax to use in the `CREATE TABLE' statement to
index a column prefix looks like this:

     KEY index_name (col_name(length))

The example here creates an index for the first 10 characters of the
`name' column:

     mysql> CREATE TABLE test (
         ->        name CHAR(200) NOT NULL,
         ->        KEY index_name (name(10)));

For `BLOB' and `TEXT' columns, you must index a prefix of the column.
You cannot index the entire column.

In MySQL Version 3.23.23 or later, you can also create special
`FULLTEXT' indexes. They are used for full-text search. Only the
`MyISAM' table type supports `FULLTEXT' indexes. They can be created
only from `CHAR', `VARCHAR', and `TEXT' columns.  Indexing always
happens over the entire column and partial indexing is not supported.
See *Note Fulltext Search:: for details.

Multiple-Column Indexes
-----------------------

MySQL can create indexes on multiple columns.  An index may consist of
up to 15 columns. (On `CHAR' and `VARCHAR' columns you can also use a
prefix of the column as a part of an index.)

A multiple-column index can be considered a sorted array containing
values that are created by concatenating the values of the indexed
columns.

 other columns.

Suppose a table is created using the following specification:

     mysql> CREATE TABLE test (
         ->       id INT NOT NULL,
         ->       last_name CHAR(30) NOT NULL,
         ->       first_name CHAR(30) NOT NULL,
         ->       PRIMARY KEY (id),
         ->       INDEX name (last_name,first_name));

Then the index `name' is an index over `last_name' and `first_name'.
The index will be used for queries that specify values in a known range
for `last_name', or for both `last_name' and `first_name'.  Therefore,
the `name' index will be used in the following queries:

     mysql> SELECT * FROM test WHERE last_name="Widenius";
     
     mysql> SELECT * FROM test WHERE last_name="Widenius"
         ->                    AND first_name="Michael";
     
     mysql> SELECT * FROM test WHERE last_name="Widenius"
         ->                    AND (first_name="Michael" OR first_name="Monty");
     
     mysql> SELECT * FROM test WHERE last_name="Widenius"
         ->                    AND first_name >="M" AND first_name < "N";

However, the `name' index will NOT be used in the following queries:

     mysql> SELECT * FROM test WHERE first_name="Michael";
     
     mysql> SELECT * FROM test WHERE last_name="Widenius"
         ->                    OR first_name="Michael";

For more information on the manner in which MySQL uses indexes to
improve query performance, see *Note MySQL indexes: MySQL indexes.

Why So Many Open tables?
------------------------

When you run `mysqladmin status', you'll see something like this:

     Uptime: 426 Running threads: 1 Questions: 11082 Reloads: 1 Open tables: 12

This can be somewhat perplexing if you only have 6 tables.

MySQL is multi-threaded, so it may have many queries on the same table
simultaneously. To minimise the problem with two threads having
different states on the same file, the table is opened independently by
each concurrent thread. This takes some memory but will normaly increase
performance.  With `ISAM' and `MyISAM' tables this also requires one
extra file descriptor for the datafile. With these table types the index
file descriptor is shared between all threads.

You can read more about this topic in the next section. *Note Table
cache::.

How MySQL Opens and Closes Tables
---------------------------------

 your
operating system on the per-process number of open file descriptors.
However, you can increase the limit on many systems.  Consult your OS
documentation to find out how to do this, because the method for
changing the limit varies widely from system to system.

`table_cache' is related to `max_connections'.  For example, for 200
concurrent running connections, you should have a table cache of at
least `200 * n', where `n' is the maximum number of tables in a join.
You also need to reserve some extra file descriptors for temporary
tables and files.

Make sure that your operating system can handle the number of open file
descriptors implied by the `table_cache' setting.  If `table_cache' is
set too high, MySQL may run out of file descriptors and refuse
connections, fail to perform queries, and be very unreliable.  You also
have to take into account that the `MyISAM' storage engine needs two
file descriptors for each unique open table. You can in increase the
number of file descriptors available for MySQL with the
`--open-files-limit=#' startup option. *Note Not enough file handles::.

The cache of open tables will be kept at a level of `table_cache'
entries. The default value is 64; this can be changed with the `-O
table_cache=#' option to `mysqld').  Note that MySQL may temporarily
open even more tables to be able to execute queries.

A not used table is closed and removed from the table cache under the
following circumstances:

   * When the cache is full and a thread tries to open a table that is
     not in the cache.

   * When the cache contains more than `table_cache' entries and a
     thread is no longer using a table.

   * When someone executes `mysqladmin refresh' or `mysqladmin
     flush-tables'.

   * When someone executes a `FLUSH TABLES' statement.

When the table cache fills up, the server uses the following procedure
to locate a cache entry to use:

   * Tables that are not currently in use are released, in
     least-recently-used order.

   * If the cache is full and no tables can be released, but a new
     table needs to be opened, the cache is temporarily extended as
     necessary.

   * If the cache is in a temporarily extended state and a table goes
     from in-use to not-in-use state, the table is closed and released
     from the cache.

  The
first open of any table takes two file descriptors; each additional use
of the table takes only one file descriptor.  The extra descriptor for
the first open is used for the index file; this descriptor is shared
among all threads.

If you are opening a table with the `HANDLER table_name OPEN'
statement, a dedicated table object is allocated for the thread.  This
table object is not shared by other threads an will not be closed until
the thread calls `HANDLER table_name CLOSE' or the thread dies.  *Note
`HANDLER': HANDLER.  When this happens, the table is put back in the
table cache (if it isn't full).

You can check if your table cache is too small by checking the `mysqld'
variable `Opened_tables'.  If this is quite big, even if you haven't
done a lot of `FLUSH TABLES', you should increase your table cache.
*Note `Opened_tables': SHOW STATUS.

Drawbacks to Creating Large Numbers of Tables in the Same Database
------------------------------------------------------------------


is full, because for every table that has to be opened, another must be
closed. You can reduce this overhead by making the table cache larger.

[Назад] [Содержание] [Вперед]

Главная