Monday, July 31, 2006

Relay Log File Size and Limit

For any DBA, broken replication represents a dreaded moment. Replication can break for many factors including when the master server issues a 'Duplicate entry' error. At that point you can either skip the slave counter (recommended) or ignore/skip the errors (slave-skip-errors=1062) if you are sure that ignoring such errors won't cause a nasty effect somewhere.

Depending on when you troubleshoot the situation, when replication resumes, slaves may fall significantly behind and the relay log files may start consuming huge amount of disk space. If you have ample free disk space then that is not an issue however, if you are tight on disk space then the relay-log-space-limit option can really come handy.

The relay-log-space-limit option limits the disk space used by the relay log files. You can specify the value in bytes, megabytes (10000M) or gigabytes (10G) in either my.cnf file or as a start up option.

To specify it in the my.cnf file, backup your current my.cnf file (always recommended), stop slave, stop the MySQL server and place the following option:

# relay log restrictions

Then save and quit the file and start MySQL. Unless you configured differently, MySQL will automatically start the slave thread.

Now assuming the relay log files were using more than 15G space, issuing a


would show the following message:

Slave_IO_State: Waiting for the slave SQL thread to free enough relay log space

What that message indicates is that MySQL is waiting for the existing relay logs to be read so some space can be freed before new logs will be downloaded to the slave.

If you do not want to restrict the disk space for relay log files then specify 0 as the value for relay-log-space-limit.

For more information on the states of slave I/O threads, see the MySQL manual.


Sheeri said...

Of course, the right thing to do is monitor your replication setup automatically. It isn't too hard to write a script to do that, andd there are already scripts written for the major monitoring systems. After all, even if you have a large hard drive, and set the log size to 0, if you don't notice it, you can still lose data if you fill up the disk.

Peter Zaitsev said...


Using sql_skip_slave_counter alone is
road leading to slave inconsistency.

Before using this variable I would see which statement caused the error and so how serve is it - so you might need to run it manually with some adjustement, modify some data in tables to make it succeed or skip it.

Frankly Speaking! said...

Very true Sheeri, setting the log size to 0 even on systems with "sufficient" disk space is a recipe for later disaster.