Account quota enforcement on mysql file-level halts entire mysql server

Today we had the whole mysql server halting and corrupted tables on a shared server due to the quota enforcement of Virtualmin on a single virtual server which had a buggy script filling a cache folder with files.

Virtualmin indicated in main page that mysql server was down.

The error we had when trying accessing mysql when it was halted was:

/usr/bin/mysqladmin: connect to server at 'localhost' failed
error: 'Too many connections'

Even trying to stop and restart didn't work:

/etc/init.d/mysql stop
* Stopping MySQL database server mysqld
   ...fail!
root@mu22:~# /etc/init.d/mysql start
* Starting MySQL database server mysqld
   ...done.
/usr/bin/mysqladmin: connect to server at 'localhost' failed
error: 'Too many connections'

I finally found the cause when trying to restart the mysql server and repairing a damaged table that i had to abort after waiting for quite some time:

mysql> REPAIR TABLE `jos_aqsgmeta_address`;
Query aborted by Ctrl+C
+----------------------------------------------------------------+--------+----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table                                                          | Op     | Msg_type | Msg_text                                                                                                                                                          |
+----------------------------------------------------------------+--------+----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ccc.jos_aqsgmeta_address | repair | error    | 122 when writing to datafile                                                                                                                                      |
| ccc.jos_aqsgmeta_address | repair | Error    | Disk is full writing './ccc/jos_aqsgmeta_address.TMD' (Errcode: 122). Waiting for someone to free space... Retry in 60 secs |
| ccc.jos_aqsgmeta_address | repair | Error    | Error writing file './cccc/jos_aqsgmeta_address.TMD' (Errcode: 122)                                                          |
| ccc.jos_aqsgmeta_address | repair | status   | Operation failed                                                                                                                                              

While the quota got respected, the way that it was respected crashed all websites using mysql databases.

The quota enforcement on mysql should imho not be done at the mysql files level, as obviously that crashes the entire server, but maybe at database access level ?

Status: 
Active

Comments

Wow, I've never seen that before ... it feels like a MySQL bug to me if a single failed file write can cause corruption of the entire DB.

You can turn this off in Virtualmin at System Settings -> Server Templates -> Default Settings -> MySQL Databsae -> Set group ownership of MySQL database files?. Once this is changed to "No", you can chgrp all the database files under /var/lib/mysql to the "mysql" user.

The down-side is that this will allow users to exceed their disk quotas by creating huge databases. Virtualmin can be setup to send you warning emails if this happens, but it won't be able to enforce the limits.

Yes, indeed it was a pretty bad surprise.

My guess is that Mysql server didn't really crash, but had all its available maximum mysql connections locked by that single site which had exceeded its limit, by queries "waiting 60 seconds" in the hope to have more disk space for their files.

My guess is that the tables not properly closed and the ones that were belonging to that site needing repair probably happened when i rebooted that server to be able to access mysql.

Think this issue needs more thoughts. I'm reluctant to change the ownership of the database files, and not sure if mysql itself implements some quotas too ?

Joe's picture
Submitted by Joe on Sat, 12/19/2009 - 16:35 Pro Licensee

I'm thinking the reboot is what did the damage, and not quotas. MySQL shouldn't behave badly in a disk full situation (which is what a quota limit looks like to the application), it should merely return an error to the application.

As far as I know, MySQL doesn't have quotas .. and even if it did, they wouldn't integrate with the Unix file-level quotas Virtualmin sets up.

Right, probably the reboot didn't / couldn't shutdown properly mysql server and had to force stop. But as I couldn't stop mysql, nor access it to see what was wrong, my only solution was to reboot.

The real issue remains: a quota blocks requests for n times 60 seconds, eating up the maximum number of mysql connections available, and blocking other sites from reaching mysql.

Even separate quotas are better than mysql connections getting eaten up with 1 site.

A few other idea: - The Quotas accordion pane on main menu should fold open when a virtual server is reaching or almost reaching the assigned quota. That would help understanding why the mysql server is flagged as not running (while actually it was, just not enough connections): - maybe when you display the icon not running, you could also displaying the error received could help too, as when there is an issue first reflex is to go to virtualmin main panel... ;)

While i have your attention, there is something that seems being an apache version update security issue that i made private report and seems to got forgotten since over a month here: (did you see it ? , should i make it public?): https://www.virtualmin.com/node/12467

Many thanks for your quick replies and help making the systems running smooth. You guys and your software both rock :)

That's a good idea to make the quota panel open .. I will add that.

As for the underlying issue, there isn't really much Virtualmin can do here .. it really looks like a MySQL bug. Although increasing the number of FDs might help.

I will ask Joe to take a look at that Apache package bug.

Just wanted to add that i've seen this exact problem on Centos 5.4 and ended up disabling quotas altogether

Here is the relevant section from the MySQL docs - http://dev.mysql.com/doc/refman/5.0/en/full-disk.html Unusually, there seems to be no way of altering this behaviour through a my.conf ini setting, for example, to return an error to the client and drop the blocking thread.

With this in mind, it may be a good idea to make the default setting for "Set group ownership of MySQL database files" NO, or place a warning about this setting on the Check Config page if quota enforcing is set to strict.

Thanks Chris for this reply.

Set group ownership of MySQL database files" NO: But that means that there is no user quota at all on databases, and theoretically a user could fill database and block all other users, in addition of database being not in his total quotas ?

I agree it's less likely and better than the effects we had.

Removing user quotas althogether can be dangerous if a user fills-up the wholde disk.

In all cases, the warning about this setting on the Check Config page if quota enforcing is set to strict is a good idea for webmin :)

Yeah, it's a trade-off .. on one hand you run the risk of users exceeding their quotas with DB tables, but on the other you could hit this bug. It seems that this bug is quite rare though, and hopefully should be fixed soon by MySQL.

In my situation disabling quotas is acceptable because i don't resell hosting on this platform, i only host host sites that i build, so i'm in control of the disk usage.

@Jamie - I do not believe that MySQL see this as a bug, its simply the way it works. If the disk is full the thread will wait for space to be available to complete, block all other threads waiting for access to that table and eventually consuming all available threads bringing the mysql server to its knees.

Is it not possible to "manually" calculate usage for a servers db's and add this to the usage reported by the quota?

for example:

bash> du -b /var/lib/mysql/some_db /var/lib/mysql/other_db | awk '{sum = sum + $1} END {print sum}'

I feel it is a MySQL bug - ideally in this case, it should just fail the SQL statement that went over the disk quota, rather than hanging forever. Or better still, this should be configurable.

Virtualmin does support reporting on DB disk usage if quotas aren't enforced - this is shown on the Edit Virtual Server and System Information pages. The problem is that without enforcement, users could consume the whole disk with excessive DB usage.

Jamie, I'm in complete agreement with you that this does look like a bug, and the default behaviour should ideally be configurable to make the offending thread throw an error, either immediately or after a given period of time or number of retries.

If i get time i will post a bug report on mysql and see what they say.

In the meantime given the choice between any db hitting its group quota and hanging the server, or a site filling the entire disk, i'd say that the former is likely to happen more often, so i'll be sticking with db quotas disabled.

I'm also going to evaluate this project as a potential stop gap measure - http://lrem.net/software/mysql-quota-daemon.xhtml

Good point and idea, actually, it's mainly the warning signals (emails, and warning on webmin main page) of exceeded database quota to site owner (and to server admin) that's important. Likelyhood of filling up the whole disk being way less.

Maybe such a periodic function (e.g. just listing database files and getting their size independantly of owner) could be included into virtualmin's periodic usage stats, avoiding installing third-party packages ?

It would allow for a more gracefull quota-enforcement avoiding customer's site going down without warning.

Virtualmin's regular quota notification emails already take DB sizes into account, if they are not included in regular disk quotas..

CentOS Linux 5.5 MySQL version 5.0.77

I just got caught by this. With more than 140 mysql sites all coming to a halt because of one over the limit account, this is an unfortunate situation.

I need to make sure I watch the Dashboard Quotas more vigilantly. How do I make the Dashboard Quotas be open by default instead of New Virtualmin Features?

Brian

If this is hurting you, I would recommend turning off quota enforcement for MySQL DBs, as explained somewhere above..