C++ CSS HTML Java JavaScript MySQL Oracle PERL PHP SQL Unix VBScript XHTML XML Сети
`InnoDB' Tables (MySQL 4.0)
 
`InnoDB' Tables
===============

InnoDB Tables Overview
----------------------


consistent non-locking read in `SELECT's. These features increase
multiuser concurrency and performance. There is no need for lock
escalation in `InnoDB', because row level locks in InnoDB fit in very
small space.  `InnoDB' is the first storage manager in MySQL to support
`FOREIGN KEY' constraints.

`InnoDB' has been designed for maximum performance when processing
large data volumes. Its CPU efficiency is probably not matched by any
other disk-based relational database engine.


and another site handles an average load of 800 inserts/updates per
second in `InnoDB'.

 which
may consist of several files (or raw disk partitions).  This is
different from, for example, `MyISAM' tables where each table is stored
as a separate file.  `InnoDB' tables can be of any size even on
operating systems where file-size is limited to 2 GB.

You can find the latest information about `InnoDB' at
`http://www.innodb.com/'. The most up-to-date version of the `InnoDB'
manual is always placed there.

`InnoDB' is published under the same GNU `GPL' License Version 2 (of
June 1991) as MySQL. If you distribute MySQL/InnoDB, and your
application does not satisfy the restrictions of the GPL license, you
have to buy a commercial *MySQL Pro* license from
`https://order.mysql.com/?sub=pg&pg_no=1'.

InnoDB in MySQL Version 3.23
----------------------------

From MySQL version 4.0, `InnoDB' is enabled by default.  The following
information only applies to the 3.23 series.

InnoDB tables are included in the MySQL source distribution starting
from 3.23.34a and are activated in the MySQL -Max binary of the 3.23
series.  For Windows the -Max binaries are contained in the standard
distribution.

If you have downloaded a binary version of MySQL that includes support
for InnoDB, simply follow the instructions of the MySQL manual for
installing a binary version of MySQL. If you already have MySQL-3.23
installed, then the simplest way to install MySQL -Max is to replace
the server executable `mysqld' with the corresponding executable in the
-Max distribution.  MySQL and MySQL -Max differ only in the server
executable.  *Note Installing binary::.  *Note `mysqld-max': mysqld-max.

To compile MySQL with InnoDB support, download MySQL-3.23.34a or newer
version from `http://www.mysql.com/' and configure MySQL with the
`--with-innodb' option. See the MySQL manual about installing a MySQL
source distribution.  *Note Installing source::.

     cd /path/to/source/of/mysql-3.23.37
     ./configure --with-innodb

To use InnoDB tables in MySQL-Max-3.23 you *must* specify configuration
parameters in the `[mysqld]' section of the configuration file
`my.cnf', or on Windows optionally in `my.ini'.

 these files
to the `datadir' of MySQL.  If you specify `innodb_data_home_dir' as an
empty string, then you can give absolute paths to your datafiles in
`innodb_data_file_path'.

The minimal way to modify it is to add to the `[mysqld]' section the
line

     innodb_data_file_path=ibdata:30M

but to get good performance it is best that you specify options as
recommended. *Note InnoDB start::.

InnoDB Startup Options
----------------------

To enable `InnoDB' tables in MySQL version 3.23, see *Note InnoDB in
MySQL 3.23::.

In MySQL-4.0 you are not required to do anything specific to enable
`InnoDB' tables.

The default behaviour in MySQL-4.0 and MySQL-4.1 is to create an
auto-extending 10 MB file `ibdata1' in the `datadir' of MySQL and two 5
MB `ib_logfile's to the `datadir'.  (In MySQL-4.0.0 and 4.0.1 the
datafile is 64 MB and not auto-extending.)

*Note*: To get good performance you *should* explicitly set the InnoDB
parameters listed in the following examples.

If you don't want to use `InnoDB' tables, you can add the `skip-innodb'
option to your MySQL option file.

Starting from versions 3.23.50 and 4.0.2, `InnoDB' allows the last
datafile on the `innodb_data_file_path' line to be specified as
*auto-extending*. The syntax for `innodb_data_file_path' is then the
following:
     pathtodatafile:sizespecification;pathtodatafile:sizespecification;...
     ...  ;pathtodatafile:sizespecification[:autoextend[:max:sizespecification]]
If you specify the last datafile with the `autoextend' option, `InnoDB'
will extend the last datafile if it runs out of free space in the
tablespace. The increment is 8 MB at a time. An example:
     innodb_data_home_dir =
     innodb_data_file_path = /ibdata/ibdata1:100M:autoextend
instructs `InnoDB' to create just a single datafile whose initial size
is 100 MB and which is extended in 8 MB blocks when space runs out.  If
the disk becomes full you may want to add another datafile to another
disk, for example. Then you have to look at the size of `ibdata1',
round the size downward to the closest multiple of 1024 * 1024 bytes (=
1 MB), and specify the rounded size of `ibdata1' explicitly in
`innodb_data_file_path'.  After that you can add another datafile:
     innodb_data_home_dir =
     innodb_data_file_path = /ibdata/ibdata1:988M;/disk2/ibdata2:50M:autoextend
Be cautious on filesystems where the maximum file-size is 2 GB.  InnoDB
is not aware of the OS maximum file-size. On those filesystems you
might want to specify the max size for the datafile:
     innodb_data_home_dir =
     innodb_data_file_path = /ibdata/ibdata1:100M:autoextend:max:2000M

*A simple `my.cnf' example.* Suppose you have a computer with 128 MB
RAM and one hard disk. Below is an example of possible configuration
parameters in `my.cnf' or `my.ini' for InnoDB. We assume you are running
MySQL-Max-3.23.50 or later, or MySQL-4.0.2 or later.  This example
suits most users, both on Unix and Windows, who do not want to
distribute InnoDB datafiles and log files on several disks. This
creates an auto-extending datafile `ibdata1' and two InnoDB log files
`ib_logfile0' and `ib_logfile1' to the `datadir' of MySQL (typically
`/mysql/data').  Also the small archived InnoDB log file
`ib_arch_log_0000000000' ends up in the `datadir'.

      and indexes.
     #                                  Make sure you have enough
     #                                  free disk space.
     innodb_data_file_path = ibdata1:10M:autoextend
     #                                  Set buffer pool size to
     #                                  50 - 80 % of your computer's
     #                                  memory
     set-variable = innodb_buffer_pool_size=70M
     set-variable = innodb_additional_mem_pool_size=10M
     #                                  Set the log file size to about
     #                                  25 % of the buffer pool size
     set-variable = innodb_log_file_size=20M
     set-variable = innodb_log_buffer_size=8M
     #                                  Set ..flush_log_at_trx_commit
     #                                  to 0 if you can afford losing
     #                                  some last transactions
     innodb_flush_log_at_trx_commit=1

*Check that the MySQL server has the rights to create files* in
`datadir'.

Note that datafiles must be < 2 GB in some file systems!  The combined
size of the log files must be < 4 GB. The combined size of datafiles
must be >= 10 MB.

When you for the first time create an InnoDB database, it is best that
you start the MySQL server from the command prompt.  Then InnoDB will
print the information about the database creation to the screen, and
you see what is happening.  See below next section what the printout
should look like.  For example, in Windows you can start
`mysqld-max.exe' with:

     your-path-to-mysqld\mysqld-max --console

*Where to put `my.cnf' or `my.ini' in Windows?* The rules for Windows
are the following:
   * Only one of `my.cnf' or `my.ini' should be created.

   * The `my.cnf' file should be placed in the root directory of the
     drive `C:'.

   * The `my.ini' file should be placed in the WINDIR directory, e.g,
     `C:\WINDOWS' or `C:\WINNT'. You can use the `SET' command of
     MS-DOS to print the value of WINDIR.

   * If your PC uses a boot loader where the `C:' drive is not the boot
     drive, then your only option is to use the `my.ini' file.

*Where to specify options in Unix?* On Unix `mysqld' reads options from
the following files, if they exist, in the following order:
   * `/etc/my.cnf'  Global options.

   * `COMPILATION_DATADIR/my.cnf'  Server-specific options.

   * `defaults-extra-file'  The file specified with
     `--defaults-extra-file=...'.

    a binary installation or `/usr/local/var'
for a source installation).

If you are not sure from where `mysqld' reads its `my.cnf' or `my.ini',
you can give the path as the first command-line option to the server:
`mysqld --defaults-file=your_path_to_my_cnf'.

 needed. If the keyword `innodb_data_home_dir' is not
mentioned in `my.cnf' at all, the default for it is the 'dot' directory
`./' which means the `datadir' of MySQL.

*An advanced `my.cnf' example.* Suppose you have a Linux computer with
2 GB RAM and three 60 GB hard disks (at directory paths `/', `/dr2' and
`/dr3'). Below is an example of possible configuration parameters in
`my.cnf' for InnoDB.

*Note that InnoDB does not create directories: you have to create them
yourself.* Use the Unix or MS-DOS `mkdir' command to create the data
and log group home directories.

                      hold your data and indexes
     innodb_data_file_path = /ibdata/ibdata1:2000M;/dr2/ibdata/ibdata2:2000M:autoextend
     #                                  Set buffer pool size to
     #                                  50 - 80 % of your computer's
     #                                  memory, but make sure on Linux
     #                                  x86 total memory usage is
     #                                  < 2 GB
     set-variable = innodb_buffer_pool_size=1G
     set-variable = innodb_additional_mem_pool_size=20M
     innodb_log_group_home_dir = /dr3/iblogs
     #                                  .._log_arch_dir must be the same
     #                                  as .._log_group_home_dir
     innodb_log_arch_dir = /dr3/iblogs
     set-variable = innodb_log_files_in_group=3
     #                                  Set the log file size to about
     #                                  15 % of the buffer pool size
     set-variable = innodb_log_file_size=150M
     set-variable = innodb_log_buffer_size=8M
     #                                  Set ..flush_log_at_trx_commit to
     #                                  0 if you can afford losing
     #                                  some last transactions
     innodb_flush_log_at_trx_commit=1
     set-variable = innodb_lock_wait_timeout=50
     #innodb_flush_method=fdatasync
     #set-variable = innodb_thread_concurrency=5

Note that we have placed the two datafiles on different disks.  InnoDB
will fill the tablespace formed by the datafiles from bottom up. In
some cases it will improve the performance of the database if all data
is not placed on the same physical disk. Putting log files on a
different disk from data is very often beneficial for performance.  You
can also use *raw disk partitions* (raw devices) as datafiles. In some
Unixes they speed up I/O. See the manual section on InnoDB file space
management about how to specify them in `my.cnf'.

*Warning*: on Linux x86 you must be careful you *do not set memory usage
too high*. glibc will allow the process heap to grow over thread stacks,
which will crash your server. It is a risk if the value of
     innodb_buffer_pool_size + key_buffer +
     max_connections * (sort_buffer + read_buffer_size) + max_connections * 2 MB
is close to 2 GB or exceeds 2 GB. Each thread will use a stack (often 2
MB, but in MySQL AB binaries only 256 KB) and in the worst case also
`sort_buffer + read_buffer_size' additional memory.

 = sort_buffer=1M
     #                                  Set key_buffer to 5 - 50%
     #                                  of your RAM depending on how
     #                                  much you use MyISAM tables, but
     #                                  keep key_buffer + InnoDB
     #                                  buffer pool size < 80% of
     #                                  your RAM
     set-variable = key_buffer=...


The meanings of the configuration parameters are the following:

*Option*                  *Description*
`innodb_data_home_dir'     The common part of the directory path for all
                          InnoDB datafiles.  If you do not mentioned
                          this option in `my.cnf' the default is the
                          `datadir' of MySQL.  You can specify this also
                          as an empty string, in which case you can use
                          absolute file paths in `innodb_data_file_path'.
`innodb_data_file_path'    Paths to individual datafiles and their
                          sizes. The full directory path to each
                          datafile is acquired by concatenating
                          innodb_data_home_dir to the paths specified
                          here. The file sizes are specified in
                          megabytes, hence the 'M' after the size
                          specification above.  InnoDB also understands
                          the abbreviation 'G', 1 G meaning 1024 MB.
                          Starting from 3.23.44 you can set the
                          file-size bigger than 4 GB on those operating
                          systems which support big files.  On some
                          operating systems files must be < 2 GB.  If
                          you do not specify `innodb_data_file_path',
                          the default behaviour starting from 4.0 is to
                          create a 10 MB auto-extending datafile
                          `ibdata1'.  The sum of the sizes of the files
                          must be at least 10 MB.
`innodb_mirrored_log_groups' Number of identical copies of log groups we
                          keep for the database. Currently this should
                          be set to 1.
`innodb_log_group_home_dir' Directory path to InnoDB log files.  If you
                          do not mentioned this option in `my.cnf' the
                          default is the `datadir' of MySQL.
`innodb_log_files_in_group' Number of log files in the log group. InnoDB
                          writes to the files in a circular fashion.
                          Value 2 is recommended here. The default is 2.
`innodb_log_file_size'     Size of each log file in a log group in
                          megabytes. Sensible values range from 1M to
                          1/nth of the size of the buffer pool specified
                          below, where n is the number of log files in
                          the group. The bigger the value, the less
                          checkpoint flush activity is needed in the
                          buffer pool, saving disk I/O. But bigger log
                          files also mean that recovery will be slower
                          in case of a crash. The combined size of log
                          files must be < 4 GB on 32-bit computers.  The
                          default is 5M.
`innodb_log_buffer_size'   The size of the buffer which InnoDB uses to
                          write log to the log files on disk.  Sensible
                          values range from 1M to 8M.  A big log buffer
                          allows large transactions to run without a need
                          to write the log to disk until the transaction
                          commit. Thus, if you have big transactions,
                          making the log buffer big will save disk I/O.
`innodb_flush_log_at_trx_commit' Normally you set this to 1, meaning that at a
                          transaction commit the log is flushed to disk,
                          and the modifications made by the transaction
                          become permanent, and survive a database
                          crash. If you are willing to compromise this
                          safety, and you are running small
                          transactions, you may set this to 0 or 2 to
                          reduce disk I/O to the logs. Value 0 means
                          that the log is only written to the log file
                          and the log file flushed to disk approximately
                          once per second. Value 2 means the log is
                          written to the log file at each commit, but
                          the log file is only flushed to disk
                          approximately once per second. The default
                          value is 1 starting from MySQL-4.0.13,
                          previously it was 0.
`innodb_log_arch_dir'      The directory where fully written log files
                          would be archived if we used log archiving.
                          The value of this parameter should currently
                          be set the same as `innodb_log_group_home_dir'.
`innodb_log_archive'       This value should currently be set to 0.  As
                          recovery from a backup is done by MySQL using
                          its own log files, there is currently no need
                          to archive InnoDB log files.
`innodb_buffer_pool_size'  The size of the memory buffer InnoDB uses to
                          cache data and indexes of its tables.  The
                          bigger you set this the less disk I/O is
                          needed to access data in tables. On a
                          dedicated database server you may set this
                          parameter up to 80% of the machine physical
                          memory size. Do not set it too large, though,
                          because competition of the physical memory may
                          cause paging in the operating system.
`innodb_buffer_pool_awe_mem_mb' Size of the buffer pool in MB, if it is
                          placed in the AWE memory of 32-bit Windows.
                          Available starting from 4.1.0 and only
                          relevant in 32-bit Windows. If your 32-bit
                          Windows operating system supports > 4 GB
                          memory, so-called Address Windowing
                          Extensions, you can allocate the InnoDB buffer
                          pool into the AWE physical memory using this
                          parameter. The maximum possible value for this
                          is 64000.  If this parameter is specified,
                          then innodb_buffer_pool_size is the window in
                          the 32-bit address space of mysqld where
                          InnoDB maps that AWE memory. A good value for
                          innodb_buffer_pool_size is then 500M.
`innodb_additional_mem_pool_size' Size of a memory pool InnoDB uses to store
                          data dictionary information and other internal
                          data structures. A sensible value for this
                          might be 2M, but the more tables you have in
                          your application the more you will need to
                          allocate here. If InnoDB runs out of memory in
                          this pool, it will start to allocate memory
                          from the operating system, and write warning
                          messages to the MySQL error log.
`innodb_file_io_threads'   Number of file I/O threads in InnoDB.
                          Normally, this should be 4, but on Windows
                          disk I/O may benefit from a larger number.
`innodb_lock_wait_timeout' Timeout in seconds an InnoDB transaction may
                          wait for a lock before being rolled back.
                          InnoDB automatically detects transaction
                          deadlocks in its own lock table and rolls back
                          the transaction. If you use the `LOCK TABLES'
                          command, or other transaction-safe storage
                          engines than InnoDB in the same transaction,
                          then a deadlock may arise which InnoDB cannot
                          notice. In cases like this the timeout is
                          useful to resolve the situation.
`innodb_flush_method'      (Available from 3.23.40 up.)  The default
                          value for this is `fdatasync'.  Another option
                          is `O_DSYNC'.
`innodb_force_recovery'    Warning: this option should only be defined
                          in an emergency situation when you want to
                          dump your tables from a corrupt database!
                          Possible values are 1 - 6. See below at
                          section 'Forcing recovery' about the meanings
                          of the values.  As a safety measure InnoDB
                          prevents a user from modifying data when this
                          option is > 0.  This option is available
                          starting from version 3.23.44.

Creating InnoDB Tablespace
--------------------------

Suppose you have installed MySQL and have edited `my.cnf' so that it
contains the necessary InnoDB configuration parameters.  Before
starting MySQL you should check that the directories you have specified
for InnoDB datafiles and log files exist and that you have access
rights to those directories. InnoDB cannot create directories, only
files. Check also you have enough disk space for the data and log files.

When you now start MySQL, InnoDB will start creating your datafiles and
log files. InnoDB will print something like the following:

     ~/mysqlm/sql > mysqld
     InnoDB: The first specified datafile /home/heikki/data/ibdata1
     did not exist:
     InnoDB: a new database to be created!
     InnoDB: Setting file /home/heikki/data/ibdata1 size to 134217728
     InnoDB: Database physically writes the file full: wait...
     InnoDB: datafile /home/heikki/data/ibdata2 did not exist:
     new to be created
     InnoDB: Setting file /home/heikki/data/ibdata2 size to 262144000
     InnoDB: Database physically writes the file full: wait...
     InnoDB: Log file /home/heikki/data/logs/ib_logfile0 did not exist:
     new to be created
     InnoDB: Setting log file /home/heikki/data/logs/ib_logfile0 size to 5242880
     InnoDB: Log file /home/heikki/data/logs/ib_logfile1 did not exist:
     new to be created
     InnoDB: Setting log file /home/heikki/data/logs/ib_logfile1 size to 5242880
     InnoDB: Log file /home/heikki/data/logs/ib_logfile2 did not exist:
     new to be created
     InnoDB: Setting log file /home/heikki/data/logs/ib_logfile2 size to 5242880
     InnoDB: Started
     mysqld: ready for connections

A new InnoDB database has now been created. You can connect to the MySQL
server with the usual MySQL client programs like `mysql'.  When you
shut down the MySQL server with `mysqladmin shutdown', InnoDB output
will be like the following:

     010321 18:33:34  mysqld: Normal shutdown
     010321 18:33:34  mysqld: Shutdown Complete
     InnoDB: Starting shutdown...
     InnoDB: Shutdown completed

You can now look at the datafiles and logs directories and you will see
the files created. The log directory will also contain a small file
named `ib_arch_log_0000000000'. That file resulted from the database
creation, after which InnoDB switched off log archiving.  When MySQL is
again started, the output will be like the following:

     ~/mysqlm/sql > mysqld
     InnoDB: Started
     mysqld: ready for connections

If Something Goes Wrong in Database Creation
............................................

If InnoDB prints an operating system error in a file operation, usually
the problem is one of the following:

   * You did not create InnoDB data or log directories.

   * `mysqld' does not have the rights to create files in those
     directories.

   * `mysqld' does not read the right `my.cnf' or `my.ini' file, and
     consequently does not see the options you specified.

   * The disk is full or a disk quota is exceeded.

   * You have created a subdirectory whose name is equal to a datafile
     you specified.

   * There is a syntax error in `innodb_data_home_dir' or
     `innodb_data_file_path'.

 did
create some InnoDB tables, delete also the corresponding `.frm' files
for these tables from the MySQL database directories. Then you can try
the InnoDB database creation again.

Creating InnoDB Tables
----------------------

Suppose you have started the MySQL client with the command `mysql test'.
To create a table in the InnoDB format you must specify `TYPE = InnoDB'
in the table creation SQL command:

     CREATE TABLE CUSTOMER (A INT, B CHAR (20), INDEX (A)) TYPE = InnoDB;

 the
MySQL database directory `test'.  Internally, InnoDB will add to its
own data dictionary an entry for table `'test/CUSTOMER''. Thus you can
create a table of the same name `CUSTOMER' in another database of
MySQL, and the table names will not collide inside InnoDB.


tablespace appears in the table comment section in the output of
`SHOW'. An example:

     SHOW TABLE STATUS FROM test LIKE 'CUSTOMER'

Note that the statistics `SHOW' gives about InnoDB tables are only
approximate: they are used in SQL optimisation. Table and index
reserved sizes in bytes are accurate, though.

Converting MyISAM Tables to InnoDB
..................................

InnoDB does not have a special optimisation for separate index creation.
Therefore it does not pay to export and import the table and create
indexes afterwards.  The fastest way to alter a table to InnoDB is to
do the inserts directly to an InnoDB table, that is, use `ALTER TABLE
... TYPE=INNODB', or create an empty InnoDB table with identical
definitions and insert the rows with `INSERT INTO ... SELECT * FROM
...'.

To get better control over the insertion process, it may be good to
insert big tables in pieces:

     INSERT INTO newtable SELECT * FROM oldtable
        WHERE yourkey > something AND yourkey <= somethingelse;

After all data has been inserted you can rename the tables.

 the log
buffer large.


In inserts InnoDB uses the insert buffer to merge secondary index
records to indexes in batches. That saves a lot of disk I/O. In
rollback no such mechanism is used, and the rollback can take 30 times
longer than the insertion.

In the case of a runaway rollback, if you do not have valuable data in
your database, it is better that you kill the database process and
delete all InnoDB datafiles and log files and all InnoDB table `.frm'
files, and start your job again, rather than wait for millions of disk
I/Os to complete.

`FOREIGN KEY' Constraints
.........................


The syntax of a foreign key constraint definition in InnoDB:
     [CONSTRAINT symbol] FOREIGN KEY (index_col_name, ...)
                       REFERENCES table_name (index_col_name, ...)
                       [ON DELETE {CASCADE | SET NULL | NO ACTION
                                   | RESTRICT}]
                       [ON UPDATE {CASCADE | SET NULL | NO ACTION
                                   | RESTRICT}]
Both tables have to be InnoDB type, *in the table there must be an
INDEX where the foreign key columns are listed as the FIRST columns in
the same order,* and *in the referenced table there must be an INDEX
where the referenced columns are listed as the FIRST columns in the
same order.* InnoDB does not auto-create indexes on foreign keys or
referenced keys: you have to create them explicitly. The indexes are
needed for foreign key checks to be fast and not require a table scan.

Corresponding columns in the foreign key and the referenced key must
have similar internal datatypes inside InnoDB so that they can be
compared without a type conversion.  The *size and the signedness of
integer types has to be the same*.  The length of string types need not
be the same.  If you specify a `SET NULL' action, make sure you have
*not declared the columns in the child table* `NOT NULL'.

If MySQL gives the error number 1005 from a `CREATE TABLE' statement,
and the error message string refers to errno 150, then the table
creation failed because a foreign key constraint was not correctly
formed.  Similarly, if an `ALTER TABLE' fails and it refers to errno
150, that means a foreign key definition would be incorrectly formed
for the altered table. Starting from version 4.0.13, you can use `SHOW
INNODB STATUS' to look at a detailed explanation of the latest InnoDB
foreign key error in the server.

Starting from version 3.23.50, InnoDB does not check foreign key
constraints on those foreign key or referenced key values which contain
a NULL column.

*A deviation from SQL standards:* if in the parent table there are
several rows which have the same referenced key value, then InnoDB acts
in foreign key checks like the other parent rows with the same key
value would not exist. For example, if you have defined a `RESTRICT'
type constraint, and there is a child row with several parent rows,
InnoDB does not allow the deletion of any of those parent rows.


from 4.0.8. If `ON DELETE CASCADE' is specified, and a row in the
parent table is deleted, then InnoDB automatically deletes also all
those rows in the child table whose foreign key values are equal to the
referenced key value in the parent row. If `ON DELETE SET NULL' is
specified, the child rows are automatically updated so that the columns
in the foreign key are set to the SQL `NULL' value.

*A deviation from SQL standards:* if `ON UPDATE CASCADE' or `ON UPDATE
SET NULL' recurses to update the SAME TABLE it has already updated
during the cascade, it acts like `RESTRICT'. This is to prevent
infinite loops resulting from cascaded updates. A self-referential `ON
DELETE SET NULL', on the other hand, works starting from 4.0.13.  A
self-referential `ON DELETE CASCADE' has always worked.

An example:
     CREATE TABLE parent(id INT NOT NULL, PRIMARY KEY (id)) TYPE=INNODB;
     CREATE TABLE child(id INT, parent_id INT, INDEX par_ind (parent_id),
                  FOREIGN KEY (parent_id) REFERENCES parent(id)
                  ON DELETE SET NULL
     ) TYPE=INNODB;

A complex example:
     CREATE TABLE product (category INT NOT NULL, id INT NOT NULL,
                           price DECIMAL,
                           PRIMARY KEY(category, id)) TYPE=INNODB;
     CREATE TABLE customer (id INT NOT NULL,
                           PRIMARY KEY (id)) TYPE=INNODB;
     CREATE TABLE product_order (no INT NOT NULL AUTO_INCREMENT,
                           product_category INT NOT NULL,
                           product_id INT NOT NULL,
                           customer_id INT NOT NULL,
                           PRIMARY KEY(no),
                           INDEX (product_category, product_id),
                           FOREIGN KEY (product_category, product_id)
                             REFERENCES product(category, id)
                             ON UPDATE CASCADE ON DELETE RESTRICT,
                           INDEX (customer_id),
                           FOREIGN KEY (customer_id)
                             REFERENCES customer(id)) TYPE=INNODB;

Starting from version 3.23.50, InnoDB allows you to add a new foreign
key constraint to a table through
     ALTER TABLE yourtablename
     ADD [CONSTRAINT symbol] FOREIGN KEY (...) REFERENCES anothertablename(...)
     [on_delete_and_on_update_actions]
*Remember to create the required indexes first*, though.

Starting from version 4.0.13, InnoDB supports
     ALTER TABLE yourtablename DROP FOREIGN KEY internally_generated_foreign_key_id
You have to use `SHOW CREATE TABLE' to determine the internally
generated foreign key ID when you want to drop a foreign key.

In InnoDB versions < 3.23.50 `ALTER TABLE' or `CREATE INDEX' should not
be used in connection with tables which have foreign key constraints or
which are referenced in foreign key constraints: Any `ALTER TABLE'
removes all foreign key constraints defined for the table. You should
not use `ALTER TABLE' to the referenced table either, but use `DROP
TABLE' and `CREATE TABLE' to modify the schema. When MySQL does an
`ALTER TABLE' it may internally use `RENAME TABLE', and that will
confuse the foreign key costraints which refer to the table.  A `CREATE
INDEX' statement is in MySQL processed as an `ALTER TABLE', and these
restrictions apply also to it.

When doing foreign key checks, InnoDB sets shared row level locks on
child or parent records it has to look at.  InnoDB checks foreign key
constraints immediately: the check is not deferred to transaction
commit.

If you want to ignore foreign key constraints during, for example for a
`LOAD DATA' operation, you can do `SET FOREIGN_KEY_CHECKS=0'.

InnoDB allows you to drop any table even though that would break the
foreign key constraints which reference the table. When you drop a
table the constraints which were defined in its create statement are
also dropped.

If you re-create a table which was dropped, it has to have a definition
which conforms to the foreign key constraints referencing it. It must
have the right column names and types, and it must have indexes on the
referenced keys, as stated above.  If these are not satisfied, MySQL
returns error number 1005 and refers to errno 150 in the error message
string.

Starting from version 3.23.50 InnoDB returns the foreign key
definitions of a table when you call
     SHOW CREATE TABLE yourtablename
Then also `mysqldump' produces correct definitions of tables to the
dump file, and does not forget about the foreign keys.


Adding and Removing InnoDB Data and Log Files
---------------------------------------------

From version 3.23.50 and 4.0.2 you can specify the last InnoDB datafile
to `autoextend'. Alternatively, you can increase to your tablespace by
specifying an additional datafile. To do this you have to shut down the
MySQL server, edit the `my.cnf' file adding a new datafile to
`innodb_data_file_path', and then start the MySQL server again.

Currently you cannot remove a datafile from InnoDB. To decrease the
size of your database you have to use `mysqldump' to dump all your
tables, create a new database, and import your tables to the new
database.

If you want to change the number or the size of your InnoDB log files,
you have to shut down MySQL and make sure that it shuts down without
errors.  Then copy the old log files into a safe place just in case
something went wrong in the shutdown and you will need them to recover
the database. Delete then the old log files from the log file directory,
edit `my.cnf', and start MySQL again. InnoDB will tell you at the
startup that it is creating new log files.

Backing up and Recovering an InnoDB Database
--------------------------------------------

The key to safe database management is taking regular backups.

 or disturb
your normal database processing. InnoDB Hot Backup is a non-free
additional tool which is not included in the standard MySQL
distribution. See the InnoDB Hot Backup homepage
`http://www.innodb.com/hotbackup.html' for detailed information and
screenshots.

If you are able to shut down your MySQL server, then to take a 'binary'
backup of your database you have to do the following:

   * Shut down your MySQL database and make sure it shuts down without
     errors.

   * Copy all your datafiles into a safe place.

   * Copy all your InnoDB log files to a safe place.

   * Copy your `my.cnf' configuration file(s) to a safe place.

   * Copy all the `.frm' files for your InnoDB tables into a safe place.

 it.
Dumped tables are stored into text files which are human-readable and
much simpler than database binary files. Seeing table corruption from
dumped files is easier, and since their format is simpler, the chance
for serious data corruption in them is smaller.


dumps. Then you can take the binary backup, and you will then have a
consistent snapshot of your database in two formats.

To be able to recover your InnoDB database to the present from the
binary backup described above, you have to run your MySQL database with
the general logging and log archiving of MySQL switched on. Here by the
general logging we mean the logging mechanism of the MySQL server which
is independent of InnoDB logs.

To recover from a crash of your MySQL server process, the only thing
you have to do is to restart it. InnoDB will automatically check the
logs and perform a roll-forward of the database to the present.  InnoDB
will automatically roll back uncommitted transactions which were
present at the time of the crash. During recovery, InnoDB will print
out something like the following:

     ~/mysqlm/sql >
     InnoDB: Doing recovery: scanned up to log sequence number 0 13739520
     InnoDB: Doing recovery: scanned up to log sequence number 0 13805056
     InnoDB: Doing recovery: scanned up to log sequence number 0 13870592
     InnoDB: Doing recovery: scanned up to log sequence number 0 13936128
     ...
     InnoDB: Doing recovery: scanned up to log sequence number 0 20555264
     InnoDB: Doing recovery: scanned up to log sequence number 0 20620800
     InnoDB: Doing recovery: scanned up to log sequence number 0 20664692
     InnoDB: 1 uncommitted transaction(s) which must be rolled back
     InnoDB: Starting rollback of uncommitted transactions
     InnoDB: Rolling back trx no 16745
     InnoDB: Rolling back of trx no 16745 completed
     InnoDB: Rollback of uncommitted transactions completed
     InnoDB: Starting an apply batch of log records to the database...
     InnoDB: Apply batch completed
     InnoDB: Started
     mysqld: ready for connections

If your database gets corrupted or your disk fails, you have to do the
recovery from a backup. In the case of corruption, you should first
find a backup which is not corrupted. From a backup do the recovery
from the general log files of MySQL according to instructions in the
MySQL manual.

Forcing recovery
................

If there is database page corruption, you may want to dump your tables
from the database with `SELECT INTO OUTFILE', and usually most of the
data is intact and correct. But the corruption may cause `SELECT * FROM
table', or InnoDB background operations to crash or assert, or even the
InnoDB roll-forward recovery to crash. Starting from the InnoDB version
3.23.44, there is a `my.cnf' option with which you can force InnoDB to
start up, and you can also prevent background operations from running,
so that you will be able to dump your tables. For example, you can set
     set-variable = innodb_force_recovery = 4
in `my.cnf'.

The alternatives for `innodb_force_recovery' are listed below.  The
database must not otherwise be used with these options!  As a safety
measure InnoDB prevents a user from doing `INSERT', `UPDATE', or
`DELETE' when this option is > 0.

Starting from version 3.23.53 and 4.0.4, you are allowed to `DROP' or
`CREATE' a table even if forced recovery is used. If you know that a
certain table is causing a crash in rollback, you can drop it.  You can
use this also to stop a runaway rollback caused by a failing mass
import or `ALTER TABLE'.  You can kill the mysqld process and use the
`my.cnf' option `innodb_force_recovery=3' to bring your database up
without the rollback.  Then `DROP' the table which is causing the
runaway rollback.

A bigger number below means that all precautions of lower numbers are
included. If you are able to dump your tables with an option at most 4,
then you are relatively safe that only some data on corrupt individual
pages is lost. Option 6 is more dramatic, because database pages are
left in an obsolete state, which in turn may introduce more corruption
into B-trees and other database structures.

   * 1 (SRV_FORCE_IGNORE_CORRUPT) let the server run even if it detects
     a                                 corrupt page; try to make
                             `SELECT * FROM table'
                jump over corrupt index records and pages,
                        which helps in dumping tables;

   * 2 (SRV_FORCE_NO_BACKGROUND) prevent the main thread from
                          running: if a crash would occur
                       in purge, this prevents it;

   * 3 (SRV_FORCE_NO_TRX_UNDO) do not run transaction rollbacks
                            after recovery;

    not
     calculate table statistics;

   * 5 (SRV_FORCE_NO_UNDO_LOG_SCAN) do not look at undo logs when
                              starting the database: InnoDB will
                              treat even incomplete transactions
                              as committed;

   * 6 (SRV_FORCE_NO_LOG_REDO)   do not do the log roll-forward
                            in connection with recovery.

Checkpoints
...........

InnoDB implements a checkpoint mechanism called a fuzzy checkpoint.
InnoDB will flush modified database pages from the buffer pool in small
batches, there is no need to flush the buffer pool in one single batch,
which would in practice stop processing of user SQL statements for a
while.

In crash recovery InnoDB looks for a checkpoint label written to the
log files. It knows that all modifications to the database before the
label are already present on the disk image of the database.  Then
InnoDB scans the log files forward from the place of the checkpoint
applying the logged modifications to the database.

InnoDB writes to the log files in a circular fashion.  All committed
modifications which make the database pages in the buffer pool
different from the images on disk must be available in the log files in
case InnoDB has to do a recovery. This means that when InnoDB starts to
reuse a log file in the circular fashion, it has to make sure that the
database page images on disk already contain the modifications logged
in the log file InnoDB is going to reuse. In other words, InnoDB has to
make a checkpoint and often this involves flushing of modified database
pages to disk.

 in big
log files is that crash recovery can last longer because there will be
more log to apply to the database.

Moving an InnoDB Database to Another Machine
--------------------------------------------

On Windows InnoDB stores the database names and table names internally
always in lower case. To move databases in a binary format from Unix to
Windows or from Windows to Unix you should have all table and database
names in lower case. A convenient way to accomplish this is to add on
Unix the line
     set-variable=lower_case_table_names=1
to the `[mysqld]' section of your `my.cnf' before you start creating
your tables. On Windows the setting 1 is the default.

InnoDB data and log files are binary-compatible on all platforms if the
floating-point number format on the machines is the same.  You can move
an InnoDB database simply by copying all the relevant files, which we
already listed in the previous section on backing up a database. If the
floating-point formats on the machines are different but you have not
used `FLOAT' or `DOUBLE' datatypes in your tables then the procedure is
the same: just copy the relevant files. If the formats are different
and your tables contain floating-point data, you have to use `mysqldump'
and `mysqlimport' to move those tables.

A performance tip is to switch off auto-commit mode when you import
data into your database, assuming your tablespace has enough space for
the big rollback segment the big import transaction will generate.  Do
the commit only after importing a whole table or a segment of a table.

InnoDB Transaction Model and Locking
------------------------------------

In the `InnoDB' transaction model the goal has been to combine the best
properties of a multi-versioning database to traditional two-phase
locking.  InnoDB does locking on row level and runs queries by default
as non-locking consistent reads, in the style of Oracle.  The lock
table in `InnoDB' is stored so space-efficiently that lock escalation
is not needed: typically several users are allowed to lock every row in
the database, or any random subset of the rows, without `InnoDB'
running out of memory.

In `InnoDB' all user activity happens inside transactions. If the
autocommit mode is used in MySQL, then each SQL statement forms a
single transaction.  MySQL always starts a new connection with the
autocommit mode switched on.

If the autocommit mode is switched off with `SET AUTOCOMMIT = 0', then
we can think that a user always has a transaction open. If he issues
the SQL `COMMIT' or `ROLLBACK' statement, it ends the current
transaction, and a new one starts. Both statements will release all
`InnoDB' locks that were set during the current transaction. A `COMMIT'
means that the changes made in the current transaction are made
permanent and become visible to other users. A `ROLLBACK' statement, on
the other hand, cancels all modifications made by the current
transaction.

If the connection has `AUTOCOMMIT = 1', then the user can still perform
a multi-statement transaction by starting it with `START TRANSACTION' or
`BEGIN' and ending it with `COMMIT' or `ROLLBACK'.

InnoDB and `SET ... TRANSACTION ISOLATION LEVEL ...'
....................................................

In terms of the SQL-92 transaction isolation levels, the `InnoDB'
default is `REPEATABLE READ'.  Starting from version 4.0.5, `InnoDB'
offers all 4 different transaction isolation levels described by the
SQL-92 standard.  You can set the default isolation level for all
connections in the `[mysqld]' section of `my.cnf':

     transaction-isolation = {READ-UNCOMMITTED | READ-COMMITTED
                              | REPEATABLE-READ | SERIALIZABLE}

A user can change the isolation level of a single session or all new
incoming connections with the `SET TRANSACTION' statement. Its syntax
is as follows:

     SET [SESSION | GLOBAL] TRANSACTION ISOLATION LEVEL
                            {READ UNCOMMITTED | READ COMMITTED
                             | REPEATABLE READ | SERIALIZABLE}

Note that there are no hyphens in level names in the SQL syntax.

The default behaviour is to set the isolation level for the next (not
started) transaction.  If you use the `GLOBAL' keyword, the statement
sets the default transaction level globally for all new connections
created from that point on (but not existing connections).  You need
the `SUPER' privilege to do this.  Using the `SESSION' keyword sets the
default transaction level for all future transactions performed on the
current connection.  Any client is free to change the session isolation
level (even in the middle of a transaction), or the isolation level for
the next transaction.  In versions earlier than 3.23.50, `SET
TRANSACTION' had no effect on `InnoDB' tables. In versions < 4.0.5 only
`REPEATABLE READ' and `SERIALIZABLE' were available.

You can query the global and session transaction isolation levels with:
     SELECT @@global.tx_isolation;
     SELECT @@tx_isolation;

In row level locking `InnoDB' uses so-called next-key locking.  That
means that besides index records, `InnoDB' can also lock the "gap"
before an index record to block insertions by other users immediately
before the index record. A next-key lock means a lock which locks an
index record and the gap before it.  A gap lock means a lock which only
locks a gap before some index record.

A detailed description of each isolation level in `InnoDB':

   * `READ UNCOMMITTED' This is also called "dirty read": non-locking
     `SELECT's are performed so that we do not look at a possible
     earlier version of a record; thus they are not 'consistent' reads
     under this isolation level; otherwise this level works like `READ
     COMMITTED'.

   * `READ COMMITTED' Somewhat Oracle-like isolation level.  All
     `SELECT ... FOR UPDATE' and `SELECT ... LOCK IN SHARE MODE'
     statements only lock the index records, *not* the gaps before
     them, and thus allow free inserting of new records next to locked
     records.  `UPDATE' and `DELETE' which use a unique index with a
     unique search condition, only lock the index record found, not the
     gap before it.  But still in range type `UPDATE' and `DELETE',
     `InnoDB' must set next-key or gap locks and block insertions by
     other users to the gaps covered by the range. This is necessary
     since "phantom rows" have to be blocked for MySQL replication and
     recovery to work.  *Consistent reads* behave as in Oracle: each
     consistent read, even within the same transaction, sets and reads
     its own fresh snapshot.

   * `REPEATABLE READ' This is the default isolation level of `InnoDB'.
     `SELECT ... FOR UPDATE', `SELECT ... LOCK IN SHARE MODE',
     `UPDATE', and `DELETE' which use a unique index with a unique
     search condition, only lock the index record found, not the gap
     before it.  Otherwise these operations employ next-key locking,
     locking the index range scanned with next-key or gap locks, and
     block new insertions by other users.  In *consistent reads* there
     is an important difference from the previous isolation level: in
     this level all consistent reads within the same transaction read
     the same snapshot established by the first read. This convention
     means that if you issue several plain `SELECT's within the same
     transaction, these `SELECT's are consistent also with respect to
     each other.

   * `SERIALIZABLE' This level is like the previous one, but all plain
     `SELECT's are implicitly converted to `SELECT ... LOCK IN SHARE
     MODE'.

Consistent Non-Locking Read
...........................


committed before that point of time, and no changes made by later or
uncommitted transactions. The exception to this rule is that the query
will see the changes made by the transaction itself which issues the
query.

If you are running with the default `REPEATABLE READ' isolation level,
then all consistent reads within the same transaction read the snapshot
established by the first such read in that transaction. You can get a
fresher snapshot for your queries by committing the current transaction
and after that issuing new queries.

 accesses, and
therefore other users are free to modify those tables at the same time
a consistent read is being performed on the table.

Locking Reads `SELECT ... FOR UPDATE' and `SELECT ... LOCK IN SHARE MODE'
.........................................................................

A consistent read is not convenient in some circumstances.  Suppose you
want to add a new row into your table `CHILD', and make sure that the
child already has a parent in table `PARENT'.

Suppose you use a consistent read to read the table `PARENT' and indeed
see the parent of the child in the table. Can you now safely add the
child row to table `CHILD'? No, because it may happen that meanwhile
some other user has deleted the parent row from the table `PARENT', and
you are not aware of that.

The solution is to perform the `SELECT' in a locking mode, `LOCK IN
SHARE MODE'.

     SELECT * FROM PARENT WHERE NAME = 'Jones' LOCK IN SHARE MODE;

Performing a read in share mode means that we read the latest available
data, and set a shared mode lock on the rows we read.  If the latest
data belongs to a yet uncommitted transaction of another user, we will
wait until that transaction commits.  A shared mode lock prevents
others from updating or deleting the row we have read. After we see
that the above query returns the parent `'Jones'', we can safely add
his child to table `CHILD', and commit our transaction.  This example
shows how to implement referential integrity in your application code.

Let us look at another example: we have an integer counter field in a
table `CHILD_CODES' which we use to assign a unique identifier to each
child we add to table `CHILD'.  Obviously, using a consistent read or a
shared mode read to read the present value of the counter is not a good
idea, since then two users of the database may see the same value for
the counter, and we will get a duplicate key error when we add the two
children with the same identifier to the table.

In this case there are two good ways to implement the reading and
incrementing of the counter: (1) update the counter first by
incrementing it by 1 and only after that read it, or (2) read the
counter first with a lock mode `FOR UPDATE', and increment after that:

     SELECT COUNTER_FIELD FROM CHILD_CODES FOR UPDATE;
     UPDATE CHILD_CODES SET COUNTER_FIELD = COUNTER_FIELD + 1;

A `SELECT ... FOR UPDATE' will read the latest available data setting
exclusive locks on each row it reads.  Thus it sets the same locks a
searched SQL `UPDATE' would set on the rows.

Next-key Locking: Avoiding the Phantom Problem
..............................................

In row level locking InnoDB uses an algorithm called next-key locking.
InnoDB does the row level locking so that when it searches or scans an
index of a table, it sets shared or exclusive locks on the index
records it encounters. Thus the row level locks are more precisely
called index record locks.

The locks InnoDB sets on index records also affect the 'gap' before
that index record. If a user has a shared or exclusive lock on record R
in an index, then another user cannot insert a new index record
immediately before R in the index order.  This locking of gaps is done
to prevent the so-called phantom problem. Suppose I want to read and
lock all children with identifier bigger than 100 from table `CHILD',
and update some field in the selected rows.

     SELECT * FROM CHILD WHERE ID > 100 FOR UPDATE;

Suppose there is an index on table `CHILD' on column `ID'. Our query
will scan that index starting from the first record where `ID' is
bigger than 100.  Now, if the locks set on the index records would not
lock out inserts made in the gaps, a new child might meanwhile be
inserted to the table. If now I in my transaction execute

     SELECT * FROM CHILD WHERE ID > 100 FOR UPDATE;


during the transaction. If we regard a set of rows as a data item, then
the new 'phantom' child would break this isolation principle.

When InnoDB scans an index it can also lock the gap after the last
record in the index. Just that happens in the previous example: the
locks set by InnoDB will prevent any insert to the table where `ID'
would be bigger than 100.

You can use next-key locking to implement a uniqueness check in your
application: if you read your data in share mode and do not see a
duplicate for a row you are going to insert, then you can safely insert
your row and know that the next-key lock set on the successor of your
row during the read will prevent anyone meanwhile inserting a duplicate
for your row. Thus the next-key locking allows you to 'lock' the
non-existence of something in your table.

Locks Set by Different SQL Statements in `InnoDB'
.................................................

   * `SELECT ... FROM ...': this is a consistent read, reading a
     snapshot of the database and setting no locks.

   * `SELECT ... FROM ... LOCK IN SHARE MODE': sets shared next-key
     locks on all index records the read encounters.

   * `SELECT ... FROM ... FOR UPDATE': sets exclusive next-key locks on
     all index records the read encounters.

   
     inserted row. If a duplicate key error occurs, sets a shared lock
     on the duplicate index record.

   * `INSERT INTO T SELECT ... FROM S WHERE ...' sets an exclusive
     (non-next-key) lock on each row inserted into `T'. Does the search
     on `S' as a consistent read, but sets shared next-key locks on `S'
     if the MySQL logging is on. `InnoDB' has to set locks in the
     latter case because in roll-forward recovery from a backup every
     SQL statement has to be executed in exactly the same way as it was
     done originally.

   * `CREATE TABLE ... SELECT ...' performs the `SELECT' as a
     consistent read or with shared locks, like in the previous item.

   * `REPLACE' is done like an insert if there is no collision on a
     unique key. Otherwise, an exclusive next-key lock is placed on the
     row which has to be updated.

   * `UPDATE ... SET ... WHERE ...': sets an exclusive next-key lock on
     every record the search encounters.

   * `DELETE FROM ... WHERE ...': sets an exclusive next-key lock on
     every record the search encounters.

   * If a `FOREIGN KEY' constraint is defined on a table, any insert,
     update, or delete which requires checking of the constraint
     condition sets shared record level locks on the records it looks
     at to check the constraint. Also in the case where the constraint
     fails, `InnoDB' sets these locks.

   * `LOCK TABLES ... ': sets table locks. In the implementation the
     MySQL layer of code sets these locks. The automatic deadlock
     detection of `InnoDB' cannot detect deadlocks where such table
     locks are involved: see the following section.  Also, since MySQL
     does know about row level locks, it is possible that you get a
     table lock on a table where another user currently has row level
     locks. But that does not put transaction integrity into danger.
     *Note InnoDB restrictions::.

Deadlock Detection and Rollback
...............................

`InnoDB' automatically detects a deadlock of transactions and rolls
back a transaction or transactions to prevent the deadlock. Starting
from version 4.0.5, `InnoDB' will try to pick small transactions to roll
back. The size of a transaction is determined by the number of rows it
has inserted, updated, or deleted. Previous to 4.0.5, `InnoDB' always
rolled back the transaction whose lock request was the last one to
build a deadlock, that is, a cycle in the waits-for graph of
transactions.

`InnoDB' cannot detect deadlocks where a lock set by a MySQL `LOCK
TABLES' statement is involved, or if a lock set in another storage
engine than `InnoDB' is involved. You have to resolve these situations
using `innodb_lock_wait_timeout' set in `my.cnf'.

 locks set
by the SQL statement may be preserved. This is because `InnoDB' stores
row locks in a format where it cannot afterwards know which was set by
which SQL statement.

An Example of How the Consistent Read Works in `InnoDB'
.......................................................

Suppose you are running on the default `REPEATABLE READ' isolation
level.  When you issue a consistent read, that is, an ordinary `SELECT'
statement, `InnoDB' will give your transaction a timepoint according to
which your query sees the database. Thus, if transaction B deletes a
row and commits after your timepoint was assigned, then you will not
see the row deleted. Similarly with inserts and updates.

You can advance your timepoint by committing your transaction and then
doing another `SELECT'.

This is called multi-versioned concurrency control.

                       User A                 User B
     
                   SET AUTOCOMMIT=0;      SET AUTOCOMMIT=0;
     time
     |             SELECT * FROM t;
     |             empty set
     |                                    INSERT INTO t VALUES (1, 2);
     |
     v             SELECT * FROM t;
                   empty set
                                          COMMIT;
     
                   SELECT * FROM t;
                   empty set;
     
                   COMMIT;
     
                   SELECT * FROM t;
                   ---------------------
                   |    1    |    2    |
                   ---------------------

Thus user A sees the row inserted by B only when B has committed the
insert, and A has committed his own transaction so that the timepoint
is advanced past the commit of B.

If you want to see the "freshest" state of the database, you should use
a locking read:

     SELECT * FROM t LOCK IN SHARE MODE;

How to cope with deadlocks?
...........................

Deadlocks are a classic problem in transactional databases, but they
are not dangerous, unless they are so frequent that you cannot run
certain transactions at all. Normally you have to write your
applications so that they are always prepared to re-issue a transaction
if it gets rolled back because of a deadlock.

`InnoDB' uses automatic row level locking. You can get deadlocks even
in the case of transactions which just insert or delete a single row.
That is because these operations are not really 'atomic': they
automatically set locks on the (possibly several) index records of the
row inserted/deleted.

You can cope with deadlocks and reduce the number of them with the
following tricks:

   * Use `SHOW INNODB STATUS' in MySQL versions >= 3.23.52 and >= 4.0.3
     to determine the cause of the latest deadlock. That can help you
     to tune your application to avoid deadlocks.

   * Always be prepared to re-issue a transaction if it fails in a
     deadlock. Deadlocks are not dangerous. Just try again.

   * Commit your transactions often. Small transactions are less prone
     to collide.

   * If you are using locking reads `SELECT ... FOR UPDATE' or `...
     LOCK IN SHARE MODE', try using a lower isolation level `READ
     COMMITTED'.

   * Access your tables and rows in a fixed order. Then transactions
     will form nice queues, and do not deadlock.

    indexes
     for your queries.

   * Use less locking: if you can afford a `SELECT' to return data from
     an old snapshot, do not add the clause `FOR UPDATE' or `LOCK IN
     SHARE MODE' to it. Using `READ COMMITTED' isolation level is good
     here, because each consistent read within the same transaction
     reads from its own fresh snapshot.

   * If nothing helps, serialize your transactions with table level
     locks: `LOCK TABLES t1 WRITE, t2 READ, ... ; [do something with
     tables t1 and t2 here]; UNLOCK TABLES'.  Table level locks make
     you transactions to queue nicely, and deadlocks are avoided. Note
     that `LOCK TABLES' implicitly starts a transaction, just like the
     command `BEGIN', and `UNLOCK TABLES' implicitly ends the
     transaction in a `COMMIT'.

   * Another solution to serialize transactions is to create an
     auxiliary 'semaphore' table where there is just a single row. Each
     transaction updates that row before accessing other tables. In
     that way all transactions happen in a serial fashion. Note that
     then also the `InnoDB' instant deadlock detection algorithm works,
     because the serializing lock is a row level lock.  In MySQL table
     level locks we have to resort to the timeout method to resolve a
     deadlock.

Performance Tuning Tips
-----------------------

  1. If the Unix `top' or the Windows `Task Manager' shows that the CPU
     usage percentage with your workload is less than 70%, your
     workload is probably disk-bound. Maybe you are making too many
     transaction commits, or the buffer pool is too small.  Making the
     buffer pool bigger can help, but do not set it bigger than 80% of
     physical memory.

  2. Wrap several modifications into one transaction. `InnoDB' must
     flush the log to disk at each transaction commit, if that
     transaction made modifications to the database. Since the rotation
     speed of a disk is typically at most 167 revolutions/second, that
     constrains the number of commits to the same 167/second if the
     disk does not fool the operating system.

  3. If you can afford the loss of some latest committed transactions,
     you can set the `my.cnf' parameter `innodb_flush_log_at_trx_commit'
     to 0. `InnoDB' tries to flush the log once per second anyway,
     though the flush is not guaranteed.

  4. Make your log files big, even as big as the buffer pool. When
     `InnoDB' has written the log files full, it has to write the
     modified contents of the buffer pool to disk in a checkpoint.
     Small log files will cause many unnecessary disk writes. The
     drawback in big log files is that recovery time will be longer.

  5. Also the log buffer should be quite big, say 8 MB.

  6. (Relevant from 3.23.39 up.)  In some versions of Linux and Unix,
     flushing files to disk with the Unix `fdatasync' and other similar
     methods is surprisingly slow.  The default method `InnoDB' uses is
     the `fdatasync' function.  If you are not satisfied with the
     database write performance, you may try setting
     `innodb_flush_method' in `my.cnf' to `O_DSYNC', though `O_DSYNC'
     seems to be slower on most systems.

  7. In importing data to `InnoDB', make sure that MySQL does not have
     `autocommit=1' on. Then every insert requires a log flush to disk.
     Put before your plain SQL import file line

          SET AUTOCOMMIT=0;

     and after it

          COMMIT;

     If you use the `mysqldump' option `--opt', you will get dump files
     which are fast to import also to an `InnoDB' table, even without
     wrapping them to the above `SET AUTOCOMMIT=0; ... COMMIT;'
     wrappers.

  8. Beware of big rollbacks of mass inserts: `InnoDB' uses the insert
     buffer to save disk I/O in inserts, but in a corresponding
     rollback no such mechanism is used. A disk-bound rollback can take
     30 times the time of the corresponding insert. Killing the
     database process will not help because the rollback will start
     again at the database startup. The only way to get rid of a
     runaway rollback is to increase the buffer pool so that the
     rollback becomes CPU-bound and runs fast, or delete the whole
     `InnoDB' database.

  9. Beware also of other big disk-bound operations.  Use `DROP TABLE'
     or `TRUNCATE' (from MySQL-4.0 up) to empty a table, not `DELETE
     FROM yourtable'.

 10. Use the multi-line `INSERT' to reduce communication overhead
     between the client and the server if you need to insert many rows:

          INSERT INTO yourtable VALUES (1, 2), (5, 5);

     This tip is of course valid for inserts into any table type, not
     just `InnoDB'.


`SHOW INNODB STATUS' and the `InnoDB' Monitors
..............................................

 INNODB
STATUS' to fetch the output of the standard InnoDB Monitor to the SQL
client.  The data is useful in performance tuning. If you are using the
`mysql' interactive SQL client, the output is more readable if you
replace the usual semicolon at the SQL command end by `\G':
     SHOW INNODB STATUS\G


print data about once every 15 seconds. If you run `mysqld' as a daemon
then this output is usually directed to the `.err' log in the MySQL
`datadir'.  This data is useful in performance tuning.  On Windows you
must start `mysqld-max' from an MS-DOS prompt with the `--console'
option if you want to direct the output to the MS-DOS prompt window.

There is a separate `innodb_lock_monitor' which prints the same
information as `innodb_monitor' plus information on locks set by each
transaction.

The printed information includes data on:
   * lock waits of a transactions,

   * semaphore waits of threads,

   * pending file I/O requests,

   * buffer pool statistics, and

   * purge and insert buffer merge activity of the main thread of
     `InnoDB'.

You can start InnoDB Monitor through the following SQL command:

     CREATE TABLE innodb_monitor(a INT) type = innodb;

and stop it by

     DROP TABLE innodb_monitor;

The `CREATE TABLE' syntax is just a way to pass a command to the
`InnoDB' engine through the MySQL SQL parser: the created table is not
relevant at all for InnoDB Monitor. If you shut down the database when
the monitor is running, and you want to start the monitor again, you
have to drop the table before you can issue a new `CREATE TABLE' to
start the monitor.  This syntax may change in a future release.

A sample output of the InnoDB Monitor:

     ================================
     010809 18:45:06 INNODB MONITOR OUTPUT
     ================================
     --------------------------
     LOCKS HELD BY TRANSACTIONS
     --------------------------
     LOCK INFO:
     Number of locks in the record hash table 1294
     LOCKS FOR TRANSACTION ID 0 579342744
     TABLE LOCK table test/mytable trx id 0 582333343 lock_mode IX
     
     RECORD LOCKS space id 0 page no 12758 n bits 104 table test/mytable index
     PRIMARY trx id 0 582333343 lock_mode X
     Record lock, heap no 2 PHYSICAL RECORD: n_fields 74; 1-byte offs FALSE;
     info bits 0
      0: len 4; hex 0001a801; asc ;; 1: len 6; hex 000022b5b39f; asc ";;
      2: len 7; hex 000002001e03ec; asc ;; 3: len 4; hex 00000001;
     ...
     -----------------------------------------------
     CURRENT SEMAPHORES RESERVED AND SEMAPHORE WAITS
     -----------------------------------------------
     SYNC INFO:
     Sorry, cannot give mutex list info in non-debug version!
     Sorry, cannot give rw-lock list info in non-debug version!
     -----------------------------------------------------
     SYNC ARRAY INFO: reservation count 6041054, signal count 2913432
     4a239430 waited for by thread 49627477 op. S-LOCK file NOT KNOWN line 0
     Mut ex 0 sp 5530989 r 62038708 sys 2155035;
     rws 0 8257574 8025336; rwx 0 1121090 1848344
     -----------------------------------------------------
     CURRENT PENDING FILE I/O'S
     --------------------------
     Pending normal aio reads:
     Reserved slot, messages 40157658 4a4a40b8
     Reserved slot, messages 40157658 4a477e28
     ...
     Reserved slot, messages 40157658 4a4424a8
     Reserved slot, messages 40157658 4a39ea38
     Total of 36 reserved aio slots
     Pending aio writes:
     Total of 0 reserved aio slots
     Pending insert buffer aio reads:
     Total of 0 reserved aio slots
     Pending log writes or reads:
     Reserved slot, messages 40158c98 40157f98
     Total of 1 reserved aio slots
     Pending synchronous reads or writes:
     Total of 0 reserved aio slots
     -----------
     BUFFER POOL
     -----------
     LRU list length 8034
     Free list length 0
     Flush list length 999
     Buffer pool size in pages 8192
     Pending reads 39
     Pending writes: LRU 0, flush list 0, single page 0
     Pages read 31383918, created 51310, written 2985115
     ----------------------------
     END OF INNODB MONITOR OUTPUT
     ============================
     010809 18:45:22 InnoDB starts purge
     010809 18:45:22 InnoDB purged 0 pages

Some notes on the output:

   * If the section LOCKS HELD BY TRANSACTIONS reports lock waits, then
     your application may have lock contention. The output can also
     help to trace reasons for transaction deadlocks.

   * Section SYNC INFO will report reserved semaphores if you compile
     InnoDB with `UNIV_SYNC_DEBUG' defined in `univ.i'.

   * Section SYNC ARRAY INFO reports threads waiting for a semaphore
     and statistics on how many times threads have needed a spin or a
     wait on a mutex or a rw-lock semaphore. A big number of threads
     waiting for semaphores may be a result of disk I/O, or contention
     problems inside InnoDB. Contention can be due to heavy parallelism
     of queries, or problems in operating system thread scheduling.

   * Section CURRENT PENDING FILE I/O'S lists pending file I/O
     requests. A large number of these indicates that the workload is
     disk I/O-bound.

   * Section BUFFER POOL gives you statistics on pages read and
     written. You can calculate from these numbers how many datafile
     I/Os your queries are currently doing.

Implementation of Multi-versioning
----------------------------------

Since InnoDB is a multi-versioned database, it must keep information of
old versions of rows in the tablespace. This information is stored in a
data structure we call a rollback segment after an analogous data
structure in Oracle.

InnoDB internally adds two fields to each row stored in the database.
A 6-byte field tells the transaction identifier for the last
transaction which inserted or updated the row. Also a deletion is
internally treated as an update where a special bit in the row is set
to mark it as deleted. Each row also contains a 7-byte field called the
roll pointer. The roll pointer points to an undo log record written to
the rollback segment. If the row was updated, then the undo log record
contains the information necessary to rebuild the content of the row
before it was updated.

InnoDB uses the information in the rollback segment to perform the undo
operations needed in a transaction rollback. It also uses the
information to build earlier versions of a row for a consistent read.

Undo logs in the rollback segment are divided into insert and update
undo logs. Insert undo logs are only needed in transaction rollback and
can be discarded as soon as the transaction commits. Update undo logs
are used also in consistent reads, and they can be discarded only after
there is no transaction present for which InnoDB has assigned a
snapshot that in a consistent read could need the information in the
update undo log to build an earlier version of a database row.

You must remember to commit your transactions regularly, also those
transactions which only issue consistent reads.  Otherwise InnoDB
cannot discard data from the update undo logs, and the rollback segment
may grow too big, filling up your tablespace.

The physical size of an undo log record in the rollback segment is
typically smaller than the corresponding inserted or updated row. You
can use this information to calculate the space need for your rollback
segment.

In our multi-versioning scheme a row is not physically removed from the
database immediately when you delete it with an SQL statement.  Only
when InnoDB can discard the update undo log record written for the
deletion, it can also physically remove the corresponding row and its
index records from the database. This removal operation is called a
purge, and it is quite fast, usually taking the same order of time as
the SQL statement which did the deletion.

Table and Index Structures
--------------------------

 When
MySQL drops a table or a database, it has to delete both a `.frm' file
or files, and the corresponding entries inside the InnoDB data
dictionary.  This is the reason why you cannot move InnoDB tables
between databases simply by moving the `.frm' files, and why `DROP
DATABASE' did not work for InnoDB type tables in MySQL versions <=
3.23.43.

Every InnoDB table has a special index called the clustered index where
the data of the rows is stored. If you define a `PRIMARY KEY' on your
table, then the index of the primary key will be the clustered index.

 ID is a
6-byte field which monotonically increases as new rows are inserted.
Thus the rows ordered by the row ID will be physically in the insertion
order.

Accessing a row through the clustered index is fast, because the row
data will be on the same page where the index search leads us. In many
databases the data is traditionally stored on a different page from the
index record. If a table is large, the clustered index architecture
often saves a disk I/O when compared to the traditional solution.

The records in non-clustered indexes (we also call them secondary
indexes), in InnoDB contain the primary key value for the row. InnoDB
uses this primary key value to search for the row from the clustered
index. Note that if the primary key is long, the secondary indexes will
use more space.

Physical Structure of an Index
..............................

All indexes in InnoDB are B-trees where the index records are stored in
the leaf pages of the tree. The default size of an index page is 16 KB.
When new records are inserted, InnoDB tries to leave 1 / 16 of the page
free for future insertions and updates of the index records.

 - 15/16
full. If the fillfactor of an index page drops below 1/2, InnoDB will
try to contract the index tree to free the page.

Insert Buffering
................

It is a common situation in a database application that the primary key
is a unique identifier and new rows are inserted in the ascending order
of the primary key. Thus the insertions to the clustered index do not
require random reads from a disk.

On the other hand, secondary indexes are usually non-unique and
insertions happen in a relatively random order into secondary indexes.
This would cause a lot of random disk I/Os without a special mechanism
used in InnoDB.

If an index record should be inserted to a non-unique secondary index,
InnoDB checks if the secondary index page is already in the buffer
pool. If that is the case, InnoDB will do the insertion directly to the
index page. But, if the index page is not found from the buffer pool,
InnoDB inserts the record to a special insert buffer structure.  The
insert buffer is kept so small that it entirely fits in the buffer
pool, and insertions can be made to it very fast.

The insert buffer is periodically merged to the secondary index trees
in the database. Often we can merge several insertions on the same page
in of the index tree, and hence save disk I/Os.  It has been measured
that the insert buffer can speed up insertions to a table up to 15
times.

Adaptive Hash Indexes
.....................


defined for a table, and if InnoDB notices that queries could benefit
from building of a hash index, such an index is automatically built.

But note that the hash index is always built based on an existing
B-tree index on the table. InnoDB can build a hash index on a prefix of
any length of the key defined for the B-tree, depending on what search
pattern InnoDB observes on the B-tree index.  A hash index can be
partial: it is not required that the whole B-tree index is cached in
the buffer pool. InnoDB will build hash indexes on demand to those
pages of the index which are often accessed.

In a sense, through the adaptive hash index mechanism InnoDB adapts
itself to ample main memory, coming closer to the architecture of main
memory databases.

Physical Record Structure
.........................

   * Each index record in InnoDB contains a header of 6 bytes. The
     header is used to link consecutive records together, and also in
     the row level locking.

   * Records in the clustered index contain fields for all user-defined
     columns. In addition, there is a 6-byte field for the transaction
     id and a 7-byte field for the roll pointer.

   * If the user has not defined a primary key for a table, then each
     clustered index record contains also a 6-byte row ID field.

   * Each secondary index record contains also all the fields defined
     for the clustered index key.

   * A record contains also a pointer to each field of the record.  If
     the total length of the fields in a record is < 128 bytes, then
     the pointer is 1 byte, else 2 bytes.

How an `AUTO_INCREMENT' Column Works in InnoDB
..............................................

 executes
`SELECT MAX(auto-inc-column) FROM T', and assigns that value incremented
by one to the column and the auto-increment counter of the table.  We
say that the auto-increment counter for table `T' has been initialised.

InnoDB follows the same procedure in initialising the auto-increment
counter for a freshly created table.

Note that if the user specifies in an insert the value 0 to the
auto-increment column, then InnoDB treats the row like the value would
not have been specified.

 is set to the
specified column value. If the user does not explicitly specify a
value, then InnoDB increments the counter by one and assigns its new
value to the column.

The auto-increment mechanism, when assigning values from the counter,
bypasses locking and transaction handling. Therefore you may also get
gaps in the number sequence if you roll back transactions which have
got numbers from the counter.

The behaviour of auto-increment is not defined if a user gives a
negative value to the column or if the value becomes bigger than the
maximum integer that can be stored in the specified integer type.

File Space Management and Disk I/O
----------------------------------

Disk I/O
........

In disk I/O InnoDB uses asynchronous I/O. On Windows NT it uses the
native asynchronous I/O provided by the operating system.  On Unix,
InnoDB uses simulated asynchronous I/O built into InnoDB: InnoDB
creates a number of I/O threads to take care of I/O operations, such as
read-ahead. In a future version we will add support for simulated aio
on Windows NT and native aio on those versions of Unix which have one.

On Windows NT InnoDB uses non-buffered I/O. That means that the disk
pages InnoDB reads or writes are not buffered in the operating system
file cache. This saves some memory bandwidth.

Starting from 3.23.41 InnoDB uses a novel file flush technique called
doublewrite.  It adds safety to crash recovery after an operating
system crash or a power outage, and improves performance on most Unix
flavors by reducing the need for fsync operations.

Doublewrite means that InnoDB before writing pages to a datafile first
writes them to a contiguous tablespace area called the doublewrite
buffer. Only after the write and the flush to the doublewrite buffer
has completed, InnoDB writes the pages to their proper positions in the
datafile. If the operating system crashes in the middle of a page
write, InnoDB will in recovery find a good copy of the page from the
doublewrite buffer.

Starting from 3.23.41 you can also use a raw disk partition as a
datafile, though this has not been tested yet. When you create a new
datafile you have to put the keyword `newraw' immediately after the
datafile size in `innodb_data_file_path'. The partition must be at
least as large as the size that you specify. Note that 1M in InnoDB is
1024 x 1024 bytes, while in disk specifications 1 MB usually means 1000
000 bytes.

     innodb_data_file_path=/dev/hdd1:5Gnewraw;/dev/hdd2:2Gnewraw

When you start the database again you *must* change the keyword to
`raw'. Otherwise, InnoDB will write over your partition!

     innodb_data_file_path=/dev/hdd1:5Graw;/dev/hdd2:2Graw

By using a raw disk you can on some versions of Unix perform unbuffered
I/O.

When you use raw disk partitions, make sure they have permissions that
allow read and write access to the account used for running the MySQL
server.

  Then
InnoDB will post in advance a batch of reads of database pages to the
I/O system. In random read-ahead InnoDB notices that some area in a
tablespace seems to be in the process of being fully read into the
buffer pool. Then InnoDB posts the remaining reads to the I/O system.

File Space Management
.....................

 in the
tablespace your tables will be allocated. However, in a newly created
tablespace, InnoDB will allocate space starting from the low end.

The tablespace consists of database pages whose default size is 16 KB.
The pages are grouped into extents of 64 consecutive pages. The 'files'
inside a tablespace are called segments in InnoDB. The name of the
rollback segment is somewhat misleading because it actually contains
many segments in the tablespace.

For each index in InnoDB we allocate two segments: one is for non-leaf
nodes of the B-tree, the other is for the leaf nodes. The idea here is
to achieve better sequentiality for the leaf nodes, which contain the
data.

When a segment grows inside the tablespace, InnoDB allocates the first
32 pages to it individually. After that InnoDB starts to allocate whole
extents to the segment.  InnoDB can add to a large segment up to 4
extents at a time to ensure good sequentiality of data.

Some pages in the tablespace contain bitmaps of other pages, and
therefore a few extents in an InnoDB tablespace cannot be allocated to
segments as a whole, but only as individual pages.

When you issue a query `SHOW TABLE STATUS FROM ... LIKE ...' to ask for
available free space in the tablespace, InnoDB will report the extents
which are definitely free in the tablespace.  InnoDB always reserves
some extents for clean-up and other internal purposes; these reserved
extents are not included in the free space.

When you delete data from a table, InnoDB will contract the
corresponding B-tree indexes. It depends on the pattern of deletes if
that frees individual pages or extents to the tablespace, so that the
freed space is available for other users. Dropping a table or deleting
all rows from it is guaranteed to release the space to other users, but
remember that deleted rows can be physically removed only in a purge
operation after they are no longer needed in transaction rollback or
consistent read.

Defragmenting a Table
.....................

If there are random insertions or deletions in the indexes of a table,
the indexes may become fragmented. By fragmentation we mean that the
physical ordering of the index pages on the disk is not close to the
alphabetical ordering of the records on the pages, or that there are
many unused pages in the 64-page blocks which were allocated to the
index.

It can speed up index scans if you periodically use `mysqldump' to dump
the table to a text file, drop the table, and reload it from the dump.
Another way to do the defragmenting is to perform a 'null' alter table
operation `ALTER TABLE tablename TYPE=InnoDB'.  That makes `MySQL' to
rebuild the table.

If the insertions to an index are always ascending and records are
deleted only from the end, then the file space management algorithm of
InnoDB guarantees that fragmentation in the index will not occur.

Error Handling
--------------

 rolls
back only part of the statement, or the whole transaction.  The
following list specifies the error handling of InnoDB.

   * If you run out of file space in the tablespace, you will get the
     MySQL `'Table is full'' error and InnoDB rolls back the SQL
     statement.

   * A transaction deadlock or a timeout in a lock wait make InnoDB to
     roll back the whole transaction.

   * A duplicate key error only rolls back the insert of that
     particular row, even in a statement like `INSERT INTO ... SELECT
     ...'.  This will probably change so that the SQL statement will be
     rolled back if you have not specified the `IGNORE' option in your
     statement.

   * A 'row too long' error rolls back the SQL statement.

   * Other errors are mostly detected by the MySQL layer of code, and
     they roll back the corresponding SQL statement.

Restrictions on InnoDB Tables
-----------------------------

   * InnoDB tables do not support fulltext indexes.

   
     have all table and database names in lower case.

    backup
     or re-generate them with the `mysql_install_db' script.

   

   * If you try to create a unique index on a prefix of a column you
     will get an error:

          CREATE TABLE T (A CHAR(20), B INT, UNIQUE (A(5))) TYPE = InnoDB;

     If you create a non-unique index on a prefix of a column, InnoDB
     will create an index over the whole column.

   * `INSERT DELAYED' is not supported for InnoDB tables.

   * The MySQL `LOCK TABLES' operation does not know of InnoDB row
     level locks set in already completed SQL statements: this means
     that you can get a table lock on a table even if there still exist
     transactions of other users which have row level locks on the same
     table. Thus your operations on the table may have to wait if they
     collide with these locks of other users. Also a deadlock is
     possible. However, this does not endanger transaction integrity,
     because the row level locks set by InnoDB will always take care of
     the integrity.  Also, a table lock prevents other transactions
     from acquiring more row level locks (in a conflicting lock mode)
     on the table.

   * A table cannot contain more than 1000 columns.

   * `DELETE FROM TABLE' does not regenerate the table but instead
     deletes all rows, one by one, which is not that fast. In future
     versions of MySQL you can use `TRUNCATE' which is fast.

   * The default database page size in InnoDB is 16 KB. By recompiling
     the code one can set it from 8 KB to 64 KB.  The maximun row
     length is slightly less than half of a database page in versions
     <= 3.23.40 of InnoDB. Starting from source release 3.23.41 BLOB and
     TEXT columns are allowed to be < 4 GB, the total row length must
     also be < 4 GB. InnoDB does not store fields whose size is <= 128
     bytes on separate pages. After InnoDB has modified the row by
     storing long fields on separate pages, the remaining length of the
     row must be less than half a database page. The maximun key length
     is 7000 bytes.

   * On some operating systems datafiles must be < 2 GB. The combined
     size of log files must be < 4 GB.

   * The maximum tablespace size is 4 billion database pages. This is
     also the maximum size for a table. The minimum tablespace size is
     10 MB.

   * When you restart the MySQL server, InnoDB may reuse an old value
     for an `AUTO_INCREMENT' column.

    and
     delete that dummy row.

InnoDB Change History
---------------------

MySQL/InnoDB-4.0.14, July 22, 2003
..................................

   bullet InnoDB now supports the `SAVEPOINT' and `ROLLBACK TO
     SAVEPOINT' SQL statements. See
     http://www.innodb.com/ibman.html#Savepoints for the syntax.

   bullet You can now create column prefix keys like in `CREATE TABLE t
     (a BLOB, INDEX (a(10)))'.

   bullet You can also use `O_DIRECT' as the `innodb_flush_method' on
     the latest versions of Linux and FreeBSD. Beware of possible bugs
     in those operating systems, though.

   bullet Fixed the checksum calculation of data pages. Previously most
     OS file system corruption went unnoticed. Note that if you
     downgrade from version >= 4.0.14 to an earlier version < 4.0.14
     then in the first startup(s) InnoDB will print warnings:
          InnoDB: Warning: an inconsistent page in the doublewrite buffer
          InnoDB: space id 2552202359 page number 8245, 127'th page in dblwr buf.
     but that is not dangerous and can be ignored.

    the
     workload is a mixture of reads and writes.

   bullet The buffer pool checkpoint flush algorithm now tries to flush
     also close neighbors of the page at the end of the flush list.
     This can speed up database shutdown, and can also speed up disk
     writes if InnoDB log files are very small compared to the buffer
     pool size.

    the info.

   bullet Fixed a bug: `SET FOREIGN_KEY_CHECKS=0' was not replicated
     properly in the MySQL replication. The fix will not be backported
     to 3.23.

   bullet Fixed a bug: the parameter `innodb_max_dirty_pages_pct'
     forgot to take into account the free pages in the buffer pool.
     This could lead to excessive flushing even though there were lots
     of free pages in the buffer pool. Workaround: `SET GLOBAL
     innodb_max_dirty_pages_pct = 100'.

   bullet Fixed a bug: if there were big index scans then a file read
     request could starve and InnoDB could assert because of a very
     long semaphore wait.

   bullet Fixed a bug: if `AUTOCOMMIT=1' then inside `LOCK TABLES' MySQL
     failed to do the commit after an updating SQL statement if
     binlogging was not on, and for `SELECT's did not commit regardless
     of binlogging state.

        page_dir_find_slot().

    the column
     length wrong in the child table and corrupt the child table.
     Because of MySQL's 'silent column specification changes' a
     fixed-length `CHAR' column can change internally to a `VARCHAR'
     and cause this error.

   bullet Fixed a bug: if a non-`latin1' character set was used and if
     in a `FOREIGN KEY' the parent column was of a different internal
     storage length than the child column, then all inserts to the
     child table would fail in a foreign key error.

   bullet Fixed a bug: InnoDB could complain that it cannot find the
     clustered index record, or in rare cases return an extraneous row
     if a rollback, purge, and a `SELECT' coincided.

   bullet Fixed a possible hang over the btr0sea.c latch if `SELECT'
     was used inside `LOCK TABLES'.

   bullet Fixed a bug: contrary to what the release note of 4.0.13
     said, the group commit still did not work if the MySQL binlogging
     was on.

   bullet Fixed a bug: os_event_wait() did not work properly in Unix,
     which might have caused starvation in various log operations.

   
     statement as it should, and also wrote the failed statement to the
     binlog, reporting there a non-zero error_code.

    cause
     an assertion.

MySQL/InnoDB-3.23.57, June 20, 2003
...................................

   bullet Changed the default value of `innodb_flush_log_at_trx_commit'
     from 0 to 1. If you have not specified it explicitly in your
     `my.cnf', and your application runs much slower with this new
     release, it is because the value 1 causes a log flush to disk at
     each transaction commit.

   bullet Fixed a bug: InnoDB forgot to call pthread_mutex_destroy()
     when a table was dropped. That could cause memory leakage on
     FreeBSD and other non-Linux Unixes.

   bullet Fixed a bug: MySQL could erroneously return 'Empty set' if
     InnoDB estimated an index range size to 0 records though the range
     was not empty; MySQL also failed to do the next-key locking in the
     case of an empty index range.

   bullet Fixed a bug: `GROUP BY' and `DISTINCT' could treat NULL values
     inequal.

MySQL/InnoDB-4.0.13, May 20, 2003
.................................

   bullet `InnoDB' now supports `ALTER TABLE DROP FOREIGN KEY'. You
     have to use `SHOW CREATE TABLE' to find the internally generated
     foreign key ID when you want to drop a foreign key.

   bullet `SHOW INNODB STATUS' now prints detailed information of the
     latest detected `FOREIGN KEY' and `UNIQUE KEY' errors. If you do
     not understand why `InnoDB' gives the error 150 from a `CREATE
     TABLE', you can use this statement to study the reason.

   bullet `ANALYZE TABLE' now works also for `InnoDB' type tables. It
     makes 10 random dives to each of the index trees and updates index
     cardinality estimates accordingly. Note that since it is only an
     estimate, repeated runs of `ANALYZE TABLE' may produce different
     numbers. MySQL uses index cardinality estimates only in join
     optimisation. If some join is not optimised in the right way, you
     may try using `ANALYZE TABLE'.

   bullet `InnoDB' group commit capability now works also when MySQL
     binlogging is switched on. There have to be > 2 client threads for
     the group commit to become active.

   bullet Changed the default value of `innodb_flush_log_at_trx_commit'
     from 0 to 1. If you have not specified it explicitly in your
     `my.cnf', and your application runs much slower with this new
     release, it is because the value 1 causes a log flush to disk at
     each transaction commit.

   bullet Added a new global settable MySQL system variable
     `innodb_max_dirty_pages_pct'. It is an integer in the range 0 -
     100.  The default is 90. The main thread in `InnoDB' tries to
     flush pages from the buffer pool so that at most this many
     percents are not yet flushed at any time.

   bullet If `innodb_force_recovery=6', do not let `InnoDB' do repair
     of corrupt pages based on the doublewrite buffer.

   bullet `InnoDB' start-up now happens faster because it does not set
     the memory in the buffer pool to zero.

   bullet Fixed a bug: The `InnoDB' parser for `FOREIGN KEY'
     definitions was confused by the keywords 'foreign key' inside
     MySQL comments.

   
     in function `dict_load_table()'.

   bullet Fixed a bug: `GROUP BY' and `DISTINCT' could treat `NULL'
     values as not equal.  MySQL also failed to do the next-key locking
     in the case of an empty index range.

   bullet Fixed a bug: Do not commit the current transaction when a
     MyISAM table is updated; this also makes `CREATE TABLE' not to
     commit an `InnoDB' transaction, even when binlogging is enabled.

   bullet Fixed a bug: We did not allow `ON DELETE SET NULL' to modify
     the same table where the delete was made; we can allow it because
     that cannot produce infinite loops in cascaded operations.

   bullet Fixed a bug: Allow `HANDLER PREV' and `NEXT' also after
     positioning the cursor with a unique search on the primary key.

   bullet Fixed a bug: If `MIN()' or `MAX()' resulted in a deadlock or
     a lock wait timeout, MySQL did not return an error, but returned
     `NULL' as the function value.

   bullet Fixed a bug: `InnoDB' forgot to call
     `pthread_mutex_destroy()' when a table was dropped. That could
     cause memory leakage on FreeBSD and other non-Linux Unix systems.

MySQL/InnoDB-4.1.0, April 3, 2003
.................................

   * `InnoDB' now supports up to 64 GB of buffer pool memory in a
     Windows 32-bit Intel computer. This is possible because `InnoDB'
     can use the AWE extension of Windows to address memory over the 4
     GB limit of a 32-bit process. A new startup variable
     `innodb_buffer_pool_awe_mem_mb' enables AWE and sets the size of
     the buffer pool in megabytes.

   * Reduced the size of buffer headers and the lock table. `InnoDB'
     uses 2 % less memory.

MySQL/InnoDB-3.23.56, March 17, 2003
....................................

   * Fixed a major bug in InnoDB query optimisation: queries of type
     SELECT ... WHERE indexcolumn < x and SELECT ... WHERE indexcolumn
     > x could cause a table scan even if the selectivity would have
     been very good.

   * Fixed a potential bug if MySQL calls store_lock with TL_IGNORE in
     the middle of a query.

MySQL/InnoDB-4.0.12, March 18, 2003
...................................

   * In crash recovery InnoDB now prints the progress in percents of a
     transaction rollback.

   * Fixed a bug/feature: if your application program used
     mysql_use_result(), and used >

    the
     case of an empty index range.

MySQL/InnoDB-4.0.11, February 25, 2003
......................................

   * Fixed a bug introduced in 4.0.10: SELECT ... FROM ... ORDER BY ...
     DESC could hang in an infinite loop.

   * An outstanding bug: SET FOREIGN_KEY_CHECKS=0 is not replicated
     properly in the MySQL replication.

MySQL/InnoDB-4.0.10, February 4, 2003
.....................................

   * In INSERT INTO t1 SELECT ... FROM t2 WHERE ... MySQL previously
     set a table level read lock on t2. This lock is now removed.

   * Increased SHOW INNODB STATUS max printed length to 200 KB.

   * Fixed a major bug in InnoDB query optimisation: queries of type
     SELECT ... WHERE indexcolumn < x and SELECT ... WHERE indexcolumn
     > x could cause a table scan even if the selectivity would have
     been very good.

   

   * Fixed a bug: using InnoDB HANDLER commands on a fresh handle
     crashed mysqld in ha_innobase::change_active_index().

   * Fixed a bug: if MySQL estimated a query in the middle of a SELECT
     statement, InnoDB could hang on the adaptive hash index latch in
     btr0sea.c.

   * Fixed a bug: InnoDB could report table corruption and assert in
     page_dir_find_owner_slot() if an adaptive hash index search
     coincided with purge or an insert.

    write 100 times at 1 second intervals.

   * Fixed a bug: REPLACE INTO t1 SELECT ... did not work if t1 has an
     auto-inc column.

   * An outstanding bug: SET FOREIGN_KEY_CHECKS=0 is not replicated
     properly in the MySQL replication.

MySQL/InnoDB-3.23.55, January 24, 2003
......................................

   * In INSERT INTO t1 SELECT ... FROM t2 WHERE ... MySQL previously set
     a table level read lock on t2. This lock is now removed.

   * Fixed a bug: if the combined size of InnoDB log files was >= 2 GB
     in a 32-bit computer, InnoDB would write log in a wrong position.
     That could make crash recovery and InnoDB Hot Backup to fail in
     log scan.

   * Fixed a bug: index cursor restoration could theoretically fail.

   * Fixed a bug: an assertion in btr0sea.c, in function
     btr_search_info_update_slow could theoretically fail in a race of
     3 threads.

   

   * Fixed a bug: if MySQL estimated a query in the middle of a SELECT
     statement, InnoDB could hang on the adaptive hash index latch in
     btr0sea.c.

   * Fixed a bug: InnoDB could report table corruption and assert in
     page_dir_find_owner_slot() if an adaptive hash index search
     coincided with purge or an insert.

    write 100 times at 1 second intervals.

   * An outstanding bug: SET FOREIGN_KEY_CHECKS=0 is not replicated
     properly in the MySQL replication. The fix will appear in 4.0.11
     and will probably not be backported to 3.23.

   * Fixed bug in InnoDB page0cur.c in function
     page_cur_search_with_match which caused InnoDB to remain on the
     same page forever.  This bug is evident only in tables with more
     than one page.

MySQL/InnoDB-4.0.9, January 14, 2003
....................................

   * Removed the warning message: 'InnoDB: Out of memory in additional
     memory pool.'

   * Fixed a bug: if the combined size of InnoDB log files was >= 2 GB
     in a 32-bit computer, InnoDB would write log in a wrong position.
     That could make crash recovery and InnoDB Hot Backup to fail.

   * Fixed a bug: index cursor restoration could theoretically fail.

MySQL/InnoDB-4.0.8, January 7, 2003
...................................

   * InnoDB now supports also FOREIGN KEY (...) REFERENCES ...(...)
     [ON UPDATE CASCADE | ON UPDATE SET NULL | ON UPDATE RESTRICT | ON
     UPDATE NO ACTION].

   * Tables and indexes now reserve 4 % less space in the tablespace.
     Also existing tables reserve less space. By upgrading to 4.0.8 you
     will see more free space in "InnoDB free" in SHOW TABLE STATUS.

   * Fixed bugs: updating the PRIMARY KEY of a row would generate a
     foreign key error on all FOREIGN KEYs which referenced secondary
     keys of the row to be updated. Also, if a referencing FOREIGN KEY
     constraint only referenced the first columns in an index, and
     there were more columns in that index, updating the additional
     columns generated a foreign key error.

   * Fixed a bug: if an index contains some column twice, and that
     column is updated, the table will become corrupt. From now on
     InnoDB prevents creation of such indexes.

   * Fixed a bug: removed superfluous error 149 and 150 printouts from
     the .err log when a locking SELECT caused a deadlock or a lock
     wait timeout.

   * Fixed a bug: an assertion in btr0sea.c, in function
     btr_search_info_update_slow could theoretically fail in a race of
     3 threads.

   * Fixed a bug: one could not switch a session transaction isolation
     level back to REPEATABLE READ after setting it to something else.

MySQL/InnoDB-4.0.7, December 26, 2002
.....................................

   * InnoDB in 4.0.7 is essentially the same as in 4.0.6.

MySQL/InnoDB-4.0.6, December 19, 2002
.....................................

   * Since innodb_log_arch_dir has no relevance under MySQL, there is
     no need to specify it any more in my.cnf.

   * LOAD DATA INFILE in AUTOCOMMIT=1 mode no longer does implicit
     commits for each 1 MB of written binlog.

   * Fixed a bug introduced in 4.0.4: LOCK TABLES ... READ LOCAL should
     not set row locks on the rows read. This caused deadlocks and lock
     wait timeouts in mysqldump.

   * Fixed two bugs introduced in 4.0.4: in AUTO_INCREMENT, REPLACE
     could cause the counter to be left 1 too low.  A deadlock or a
     lock wait timeout could cause the same problem.

   * Fixed a bug: TRUNCATE on a TEMPORARY table crashed InnoDB.

   * Fixed a bug introduced in 4.0.5: if binlogging was not switched on,
     INSERT INTO ... SELECT ... or CREATE TABLE ... SELECT ... could
     cause InnoDB to hang on a semaphore created in btr0sea.c, line 128.
     Workaround: switch binlogging on.

   * Fixed a bug: in replication issuing SLAVE STOP in the middle of a
     multi-statement transaction could cause that SLAVE START would
     only perform a part of the transaction. A similar error could occur
     if the slave crashed and was restarted.

MySQL/InnoDB-3.23.54, December 12, 2002
.......................................

   * Fixed a bug: the InnoDB range estimator greatly exaggerated the
     size of a short index range if the paths to the endpoints of the
     range in the index tree happened to branch already in the root.
     This could cause unnecessary table scans in SQL queries.

   * Fixed a bug: ORDER BY could fail if you had not created a primary
     key to a table, but had defined several indexes of which at least
     one was a UNIQUE index with all its columns declared as NOT NULL.

   * Fixed a bug: a lock wait timeout in connection with ON DELETE
     CASCADE could cause corruption in indexes.

   * Fixed a bug: if a SELECT was done with a unique key from a primary
     index, and the search matched to a delete-marked record, InnoDB
     could erroneously return the NEXT record.

   * Fixed a bug introduced in 3.23.53: LOCK TABLES ... READ LOCAL
     should not set row locks on the rows read. This caused deadlocks
     and lock wait timeouts in mysqldump.

   * Fixed a bug: if an index contains some column twice, and that
     column is updated, the table will become corrupt. From now on
     InnoDB prevents creation of such indexes.

MySQL/InnoDB-4.0.5, November 18, 2002
.....................................

       MySQL easier.

   * Deadlock resolution is now selective: we try to pick as victims
     transactions with less modified or inserted rows.

   * FOREIGN KEY definitions are now aware of the
     lower_case_table_names setting in my.cnf.

   * SHOW CREATE TABLE does not output the database name to a FOREIGN
     KEY definition if the referred table is in the same database as
     the table.

   * InnoDB does a consistency check to most index pages before writing
     them to a datafile.

   * If you set innodb_force_recovery > 0, InnoDB tries to jump over
     corrupt index records and pages when doing SELECT * FROM table.
     This helps in dumping.

   * InnoDB now again uses asynchronous unbuffered I/O in Windows 2000
     and XP; only unbuffered simulated async I/O in NT, 95/98/ME.

   * Fixed a bug: the InnoDB range estimator greatly exaggerated the
     size of a short index range if the paths to the endpoints of the
     range in the index tree happened to branch already in the root.
     This could cause unnecessary table scans in SQL queries.  The fix
     will also be backported to 3.23.54.

   * Fixed a bug present in 3.23.52, 4.0.3, 4.0.4: InnoDB startup could
     take very long or even crash on some Windows 95/98/ME computers.

   * Fixed a bug: the AUTO-INC lock was held to the end of the
     transaction if it was granted after a lock wait. This could cause
     unnecessary deadlocks.

   * Fixed a bug: if SHOW INNODB STATUS, innodb_monitor, or
     innodb_lock_monitor had to print several hundred transactions in
     one report, and the output became truncated, InnoDB would hang,
     printing to the error log many waits for a mutex created at
     srv0srv.c, line 1621.

   * Fixed a bug: SHOW INNODB STATUS on Unix always reported average
     file read size as 0 bytes.

   * Fixed a potential bug in 4.0.4: InnoDB now does ORDER BY ... DESC
     like MyISAM.

   * Fixed a bug: DROP TABLE could cause crash or a hang if there was a
     rollback concurrently running on the table. The fix will only be
     backported to 3.23 if this appears a real problem for users.

   * Fixed a bug: ORDER BY could fail if you had not created a primary
     key to a table, but had defined several indexes of which at least
     one was a UNIQUE index with all its columns declared as NOT NULL.

   * Fixed a bug: a lock wait timeout in connection with ON DELETE
     CASCADE could cause corruption in indexes.

   * Fixed a bug: if a SELECT was done with a unique key from a primary
     index, and the search matched to a delete-marked record, InnoDB
     could return the NEXT record.

        These will be fixed in 4.0.6.

MySQL/InnoDB-3.23.53, October 9, 2002
.....................................

   * We again use unbuffered disk I/O to datafiles in Windows.  Windows
     XP and Windows 2000 read performance seems to be very poor with
     normal I/O.

   * Tuned range estimator so that index range scans are preferred over
     full index scans.

   * Allow dropping and creating a table even if innodb_force_recovery
     is set. One can use this to drop a table which would cause a crash
     in rollback or purge, or if a failed table import causes a runaway
     rollback in recovery.

   * Fixed a bug present in 3.23.52, 4.0.3, 4.0.4: InnoDB startup could
     take very long or even crash on some Windows 95/98/ME computers.

   * Fixed a bug: fast shutdown (which is the default) sometimes was
     slowed down by purge and insert buffer merge.

   * Fixed a bug: doing a big SELECT from a table where no rows were
     visible in a consistent read could cause a very long (> 600
     seconds) semaphore wait in btr0cur.c line 310.

   * Fixed a bug: the AUTO-INC lock was held to the end of the
     transaction if it was granted after a lock wait.  This could cause
     unnecessary deadlocks.

   * Fixed a bug: if you created a temporary table inside LOCK TABLES,
     and used that temporary table, that caused an assertion failure in
     ha_innobase.cc.

   * Fixed a bug: if SHOW INNODB STATUS, innodb_monitor, or
     innodb_lock_monitor had to print several hundred transactions in
     one report, and the output became truncated, InnoDB would hang,
     printing to the error log many waits for a mutex created at
     srv0srv.c, line 1621.

   * Fixed a bug: SHOW INNODB STATUS on Unix always reported average
     file read size as 0 bytes.

MySQL/InnoDB-4.0.4, October 2, 2002
...................................

   * We again use unbuffered disk I/O in Windows. Windows XP and
     Windows 2000 read performance seems to be very poor with normal
     I/O.

   * Increased the max key length of InnoDB tables from 500 to 1024
     bytes.

   * Increased the table comment field in SHOW TABLE STATUS so that up
     to 16000 characters of foreign key definitions can be printed
     there.

   * The auto-increment counter is no longer incremented if an insert
     of a row immediately fails in an error.

    a runaway
     rollback in recovery.

   * Fixed a bug: Using ORDER BY primarykey DESC in 4.0.3 causes an
     assertion failure in btr0pcur.c, line 203.

   * Fixed a bug: fast shutdown (which is the default) sometimes was
     slowed down by purge and insert buffer merge.

   * Fixed a bug: doing a big SELECT from a table where no rows were
     visible in a consistent read could cause a very long (> 600
     seconds) semaphore wait in btr0cur.c line 310.

   * Fixed a bug: if the MySQL query cache was used, it did not get
     invalidated by a modification done by ON DELETE CASCADE or ...SET
     NULL.

   * Fixed a bug: if you created a temporary table inside LOCK TABLES,
     and used that temporary table, that caused an assertion failure in
     ha_innodb.cc.

   * Fixed a bug: if you set innodb_flush_log_at_trx_commit to 1, SHOW
     VARIABLES would show its value as 16 million.

MySQL/InnoDB-4.0.3, August 28, 2002
...................................

   * Removed unnecessary deadlocks when inserts have to wait for a
     locking read, update, or delete to release its next-key lock.

   
     indexes of tables. In some cases `HANDLER' can be used as a
     substitute of server-side cursors.

   * Fixed a bug in 4.0.2: even a simple insert could crash the AIX
     version.

   * Fixed a bug: if you used in a table name characters whose code is
     > 127, in DROP TABLE InnoDB could assert on line 155 of pars0sym.c.

   * Compilation from source now provides a working version both on
     HP-UX-11 and HP-UX-10.20. The source of 4.0.2 worked only on 11,
     and the source of 3.23.52 only on 10.20.

   * Fixed a bug: if compiled on 64-bit Solaris, InnoDB produced a bus
     error at startup.

MySQL/InnoDB-3.23.52, August 16, 2002
.....................................

   * The feature set of 3.23 will be frozen from this version on. New
     features will go the 4.0 branch, and only bug fixes will be made
     to the 3.23 branch.

   * Many CPU-bound join queries now run faster. On Windows also many
     other CPU-bound queries run faster.

   * InnoDB made the SQL query optimiser to avoid too much index-only
     range scans and choose full table scans instead. This is now fixed.

    slaves.

   * A replication slave now prints in crash recovery the last master
     binlog position it was able to recover to.

    and the setting 2 also has the
     nice feature that in a crash where the operating system does not
     crash, no committed transaction is lost.  If the operating system
     crashes or there is a power outage, then the setting 2 is no safer
     than the setting 0.

   * Added checksum fields to log blocks.

   * SET FOREIGN_KEY_CHECKS=0 helps in importing tables in an arbitrary
     order which does not respect the foreign key rules.

    no
     UNIQUE constraint violations.

   * SHOW TABLE STATUS now lists also possible ON DELETE CASCADE or ON
     DELETE SET NULL in the comment field of the table.

   * When CHECK TABLE is run on any InnoDB type table, it now checks
     also the adaptive hash index for all tables.

   
     error 'Cannot delete parent row'.

   * Improved the auto-increment algorithm: now the first insert or
     SHOW TABLE STATUS initialises the auto-increment counter for the
     table.  This removes almost all surprising deadlocks caused by
     SHOW TABLE STATUS.

   * Aligned some buffers used in reading and writing to datafiles.
     This allows using unbuffered raw devices as datafiles in Linux.

   * Fixed a bug: If you updated the primary key of a table so that
     only the case of characters changed, that could cause assertion
     failures, mostly in page0page.ic line 515.

   * Fixed a bug: If you delete or update a row referenced in a foreign
     key constraint and the foreign key check has to wait for a lock,
     then the check may report an erroneous result. This affects also
     the ON DELETE... operation.

   * Fixed a bug: A deadlock or a lock wait timeout error in InnoDB
     causes InnoDB to roll back the whole transaction, but MySQL could
     still write the earlier SQL statements to the binlog, even though
     InnoDB rolled them back. This could, for example, cause replicated
     databases to get out-of-sync.

   * Fixed a bug: If the database happened to crash in the middle of a
     commit, then the recovery might leak tablespace pages.

   * Fixed a bug: If you specified a non-latin1 character set in my.cnf,
     then, in contrary to what is stated in the manual, in a foreign key
     constraint a string type column had to have the same length
     specification in the referencing table and the referenced table.

   * Fixed a bug: DROP TABLE or DROP DATABASE could fail if there
     simultaneously was a CREATE TABLE running.

   * Fixed a bug: If you configured the buffer pool bigger than 2 GB in
     a 32-bit computer, InnoDB would assert in buf0buf.ic line 214.

   * Fixed a bug: on 64-bit computers updating rows which contained the
     SQL NULL in some column could cause the undo log and the ordinary
     log to become corrupt.

   * Fixed a bug: innodb_log_monitor caused a hang if it suppressed
     lock prints for a page.

   * Fixed a bug: in the HP-UX-10.20 version mutexes would leak and
     cause race conditions and crashes in any part of InnoDB code.

   * Fixed a bug: if compiled on 64-bit Solaris, InnoDB produced a bus
     error at startup.

MySQL/InnoDB-4.0.2, July 10, 2002
.................................

   * InnoDB is essentially the same as InnoDB-3.23.51.

   * If no innodb_data_file_path is specified, InnoDB at the database
     creation now creates a 10 MB auto-extending datafile ibdata1 to
     the datadir of MySQL. In 4.0.1 the file was 64 MB and not
     auto-extending.

MySQL/InnoDB-3.23.51, June 12, 2002
...................................

   * Fixed a bug: a join could result in a seg fault in copying of a
     BLOB or TEXT column if some of the BLOB or TEXT columns in the
     table contained SQL NULL values.

   * Fixed a bug: if you added self-referential foreign key constraints
     with ON DELETE CASCADE to tables and a row deletion caused InnoDB
     to attempt the deletion of the same row twice because of a
     cascading delete, then you got an assertion failure.

   * Fixed a bug: if you use MySQL 'user level locks' and close a
     connection, then InnoDB may assert in ha_innobase.cc, line 302.

MySQL/InnoDB-3.23.50, April 23, 2002
....................................

   * InnoDB now supports an auto-extending last datafile. You do not
     need to preallocate the whole datafile at the database startup.

   * Made several changes to facilitate the use of the InnoDB Hot Backup
     tool. It is a separate non-free tool you can use to take online
     backups of your database without shutting down the server or
     setting any locks.

   * If you want to run the InnoDB Hot Backup tool on an auto-extending
     datafile you have to upgrade it to version ibbackup-0.35.

   * The log scan phase in crash recovery will now run much faster.

   * Starting from this server version, the hot backup tool truncates
     unused ends in the backup InnoDB datafiles.

   * To allow the hot backup tool to work, on Windows we no longer use
     unbuffered I/O or native async I/O; instead we use the same
     simulated async I/O as on Unix.

   * You can now define the ON DELETE CASCADE or ON DELETE SET NULL
     clause on foreign keys.

   * FOREIGN KEY constraints now survive ALTER TABLE and CREATE INDEX.

   * SHOW CREATE TABLE now lists also foreign key constraints. Also
     mysqldump no longer forgets about foreign keys in table
     definitions.

   * You can now add a new foreign key constraint with ALTER TABLE ...
     ADD CONSTRAINT FOREIGN KEY (...) REFERENCES ... (...).

   * FOREIGN KEY definitions now allow backquotes around table and
     column names.

    to
     all consistent reads. If a transaction is defined to have any other
     isolation level, then InnoDB obeys its default locking strategy
     which is REPEATABLE READ.

   * SHOW TABLE STATUS no longer sets an x-lock at the end of an
     auto-increment index if the auto-increment counter has already
     been initialised. This removes in almost all cases the surprising
     deadlocks caused by SHOW TABLE STATUS.

   * Fixed a bug: in a CREATE TABLE statement the string 'foreign'
     followed by a non-space character confused the FOREIGN KEY parser
     and caused table creation to fail with errno 150.

MySQL/InnoDB-3.23.49, February 17, 2002
.......................................

   * Fixed a bug: if you called DROP DATABASE for a database on which
     there simultaneously were running queries, the MySQL server could
     crash or hang. Crashes fixed, but a full fix has to wait some
     changes in the MySQL layer of code.

   * Fixed a bug: on Windows one had to put the database name in lower
     case for DROP DATABASE to work. Fixed in 3.23.49: case no longer
     matters on Windows. On Unix the database name remains
     case-sensitive.

   * Fixed a bug: if one defined a non-latin1 character set as the
     default character set, then definition of foreign key constraints
     could fail in an assertion failure in dict0crea.c, reporting an
     internal error 17.

MySQL/InnoDB-3.23.48, February 9, 2002
......................................

   * Tuned the SQL optimiser to favor more often index searches over
     table scans.

   * Fixed a performance problem when several large SELECT queries are
     run concurrently on a multiprocessor Linux computer.  Large
     CPU-bound SELECT queries will now also generally run faster on all
     platforms.

   * If MySQL binlogging is used, InnoDB now prints after crash
     recovery the latest MySQL binlog file name and the position in
     that file (= byte offset) InnoDB was able to recover to. This is
     useful, for example, when resynchronizing a master and a slave
     database in replication.

   * Added better error messages to help in installation problems.

   * One can now recover also MySQL temporary tables which have become
     orphaned inside the InnoDB tablespace.

   * InnoDB now prevents a FOREIGN KEY declaration where the signedness
     is not the same in the referencing and referenced integer columns.

    TABLE.

   * Fixed a bug: if on Unix you did an ALTER TABLE to an InnoDB table
     and simultaneously did queries to it, mysqld could crash with an
     assertion failure in row0row.c, line 474.

   * Fixed a bug: if inserts to several tables containing an auto-inc
     column were wrapped inside one LOCK TABLES, InnoDB asserted in
     lock0lock.c.

      switched on.

   * Fixed a bug: if you ran mysqld-max-nt as a service on Windows
     NT/2000, the service shutdown did not always wait long enough for
     the InnoDB shutdown to finish.

MySQL/InnoDB-3.23.47, December 28, 2001
.......................................

   * Recovery happens now faster, especially in a lightly loaded system,
     because background checkpointing has been made more frequent.

   * InnoDB allows now several similar key values in a UNIQUE secondary
     index if those values contain SQL NULLs. Thus the convention is
     now the same as in MyISAM tables.

   * InnoDB gives a better row count estimate for a table which
     contains BLOBs.

   * In a FOREIGN KEY constraint InnoDB is now case-insensitive to
     column names, and in Windows also to table names.

   
     changes do not hinder FOREIGN KEY declaration any more.

   * Recovery has been made more resilient to corruption of log files.

   * Unnecessary statistics calculation has been removed from queries
     which generate a temporary table. Some ORDER BY and DISTINCT
     queries will now run much faster.

   * MySQL now knows that the table scan of an InnoDB table is done
     through the primary key. This will save a sort in some ORDER BY
     queries.

   * The maximum key length of InnoDB tables is again restricted to 500
     bytes.  The MySQL interpreter is not able to handle longer keys.

   * The default value of innodb_lock_wait_timeout was changed from
     infinite to 50 seconds, the default value of
     innodb_file_io_threads from 9 to 4.

MySQL/InnoDB-4.0.1, December 23, 2001
.....................................

   * InnoDB is the same as in 3.23.47.

   * In 4.0.0 the MySQL interpreter did not know the syntax LOCK IN
     SHARE MODE. This has been fixed.

   * In 4.0.0 multi-table delete did not work for transactional tables.
     This has been fixed.

MySQL/InnoDB-3.23.46, November 30, 2001
.......................................

   * This is the same as 3.23.45.

MySQL/InnoDB-3.23.45, November 23, 2001
.......................................

   * This is a bugfix release.

   * In versions 3.23.42-.44 when creating a table on Windows you have
     to use lower case letters in the database name to be able to
     access the table. Fixed in 3.23.45.

   * InnoDB now flushes stdout and stderr every 10 seconds: if these
     are redirected to files, the file contents can be better viewed
     with an editor.

   * Fixed an assertion failure in .44, in trx0trx.c, line 178 when you
     drop a table which has the .frm file but does not exist inside
     InnoDB.

    table
     imports or alterations.

   * Fixed a bug in recovery: InnoDB could go into an infinite loop
     constantly printing a warning message that it cannot find free
     blocks from the buffer pool.

   * Fixed a bug: when you created a temporary table of the InnoDB
     type, and then used ALTER TABLE to it, the MySQL server could
     crash.

   * Prevented creation of MySQL system tables 'mysql.user',
     'mysql.host', or 'mysql.db', in the InnoDB type.

   * Fixed a bug which can cause an assertion failure in 3.23.44 in
     srv0srv.c, line 1728.

MySQL/InnoDB-3.23.44, November 2, 2001
......................................

   * You can define foreign key constraints on InnoDB tables. An
     example: FOREIGN KEY (col1) REFERENCES table2(col2).

   * You can create > 4 GB datafiles in those file systems that allow
     it.

   * Improved InnoDB monitors, including a new innodb_table_monitor
     which allows you to print the contents of the InnoDB internal data
     dictionary.

   * DROP DATABASE will now work also for InnoDB tables.

   * Accent characters in the default character set latin1 will be
     ordered according to the MySQL ordering.
NOTE: if you are using latin1 and have inserted characters whose code is > 127 to an indexed CHAR column, you should run CHECK TABLE on your table when you upgrade to 3.23.43, and drop and reimport the table if CHECK TABLE reports an error! * InnoDB will calculate better table cardinality estimates. * Change in deadlock resolution: in .43 a deadlock rolls back only the SQL statement, in .44 it will roll back the whole transaction. * A new my.cnf parameter innodb_thread_concurrency helps in performance tuning in high concurrency environments. * A new my.cnf option innodb_force_recovery will help you in dumping tables from a corrupted database. * A new my.cnf option innodb_fast_shutdown will speed up shutdown. Normally InnoDB does a full purge and an insert buffer merge at shutdown. * Raised maximum key length to 7000 bytes from a previous limit of 500 bytes. * Fixed a bug in replication of auto-inc columns with multiline inserts. * Fixed a bug when the case of letters changes in an update of an indexed secondary column. * Fixed a hang when there are > 24 datafiles. * Fixed a crash when MAX(col) is selected from an empty table, and col is a not the first column in a multi-column index. * Fixed a bug in purge which could cause crashes. MySQL/InnoDB-3.23.43, October 4, 2001 ..................................... * This is essentially the same as InnoDB-3.23.42. MySQL/InnoDB-3.23.42, September 9, 2001 ....................................... * Fixed a bug which corrupted the table if the primary key of a > 8000-byte row was updated. * There are now 3 types of InnoDB Monitors: innodb_monitor, innodb_lock_monitor, and innodb_tablespace_monitor. innodb_monitor now prints also buffer pool hit rate and the total number of rows inserted, updated, deleted, read. * Fixed a bug in RENAME TABLE. * Fixed a bug in replication with an auto-increment column. MySQL/InnoDB-3.23.41, August 13, 2001 ..................................... * Support for < 4 GB rows. The previous limit was 8000 bytes. * Use the doublewrite file flush method. * Raw disk partitions supported as datafiles. * InnoDB Monitor. * Several hang bugs fixed and an `ORDER BY' bug ('Sort aborted') fixed. MySQL/InnoDB-3.23.40, July 16, 2001 ................................... * Only a few rare bugs fixed. MySQL/InnoDB-3.23.39, June 13, 2001 ................................... * `CHECK TABLE' now works for `InnoDB' tables. * A new `my.cnf' parameter `innodb_unix_file_flush_method' introduced. It can be used to tune disk write performance. * An auto-increment column now gets new values past the transaction mechanism. This saves CPU time and eliminates transaction deadlocks in new value assignment. * Several bug fixes, most notably the rollback bug in 3.23.38. MySQL/InnoDB-3.23.38, May 12, 2001 .................................. * The new syntax `SELECT ... LOCK IN SHARE MODE' is introduced. * `InnoDB' now calls `fsync()' after every disk write and calculates a checksum for every database page it writes or reads, which will reveal disk defects. * Several bug fixes. `InnoDB' Contact Information ---------------------------- Contact information of Innobase Oy, producer of the `InnoDB' engine. Web site: `http://www.innodb.com/'. E-mail:

[Назад] [Содержание] [Вперед]

Главная