Wednesday, June 21, 2006

Investigating reasons why slaves get behind master

Since MySQL uses asynchronous replication, slaves can get behind the master from time to time. Among the reasons a slave can get behind includes heavy query load (SELECTs) on slave servers. From time to time I come across a slave server that will start lagging behind during the peak hours causing the Seconds_Behind_Master value to go higher and higher. As the load settles, the slave eventually catches up. I have seen Seconds_Behind_Master value in excess of 19000. With a value so high it is clear that the slave is falling considerably behind in reading the relay log.

As Mike pointed out in a comment to my earlier post "seconds behind master only tells you the difference between when a statement was put into the relay log on the slave and when it was executed, it doesn't account for the time it takes for the statement to get from the master to the slave."

On a slave server, MySQL compares the TIMESTAMP value taken from the slave and subtracts the TIMESTAMP value present in the binary log and then adds the difference between the UNIX_TIMESTAMP value taken from the master and the slave. (See Seconds Behind Master weirdness if you are getting a Seconds_Behind_Master value of 18446744073709551615).

If we want to decrease the value of the Seconds_Behind_Master, we need to find the queries that take extraordinarily long to execute on both the master and the slave. To do this we can use the slow query log (also see slow query log manual page for the latest version).

We can enable the logging of slow queries by starting the mysqld with the

--log-slow-queries

option. If we want we can also specify the filename where the slow queries log should be kept.

--log-slow-queries=/path/to/slow_queries_log

.

By default, MySQL will create the slow query log as hostname-slow.log in the data directory.

Using another option long_query_time, we can specify the amount of time that is considered acceptable. For instance, if we specify

long_query_time = 5

in the my.cnf file then all queries that take longer than 5 seconds will be considered slow queries and therefore logged.

I believe, the default value for long_query_time is 10 seconds.

Once you have the slow queries log, you can read it with mysqldumpslow.

By investigating and optimizing the "slow queries" we can help our slaves not get too behind the master.

More to come. Comments, questions and suggestions are welcome.

5 comments:

Anonymous said...

Frank, as well as slow queries, the slow query log will also log queries that do not make use of an index.

Cool blog btw!

Imran Chaudhry,
www.atomdatabase.com

Anonymous said...

To enable logging of queries not using indexes under MySQL 3.23.x or 4.0.x use the 'log-long-format' option at startup.

For MySQL 4.1 and greater the option is changed to 'log-queries-not-using-indexes'

Matt Montgomery

Frankly Speaking! said...

Imran-

Thanks. Glad you like the blog.

Matt-

Thanks for the tip.

Frank

roobaron said...

Thanks for the tip. RSS subscribing to your blog.

Andrew said...

This statement seems to be under dispute:

"As Mike pointed out in a comment to my earlier post "seconds behind master only tells you the difference between when a statement was put into the relay log on the slave and when it was executed, it doesn't account for the time it takes for the statement to get from the master to the slave.""

This contradicts Guilhem Bichot's description in http://bugs.mysql.com/bug.php?id=2826 :

Every statement executed on the master is stored on the master's binary log together
with the timestamp when it was executed.
* So when the slave SQL thread reads a statement from the binary log, it executes it and
then it does:
seconds_behind_master= SELECT UNIX_TIMESTAMP() on slave - timestamp of the master's binary log event - c