Incorrect restore of only database + correct setting in MySQL settings generated by Virtualmin Pro ?

Hi Jamie,

1) Restoring only the "Content of MySQL" for a site (new tgz, single-site-per-file format) doesn't do the right thing. It only does:

Restoring allowed MySQL hosts ..

instead of:

Deleting old MySQL databases ..
.. done

Restoring allowed MySQL hosts ..
.. done

Re-loading MySQL database xxxx ..
Creating MySQL database xxxx ..
.. done
.. done

2) I was hitting the issue with "ERROR 2006 (HY000) at line 3633: MySQL server has gone away" when restoring a database with some tables having millions of rows and weighing gigabytes:

I'm wondering whether mysqldump's max_allowed_packet shouldn't be smaller than mysqld's max_allowed_packet?

[mysqld]
max_allowed_packet = 16M

[mysqldump]
quick
quote-names
max_allowed_packet      = 16M

Increasing mysqld's max_allowed_packet a bit finally allowed me to restore that already existing backup.
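
To make the comparison above concrete, here is a minimal Python sketch that reads the max_allowed_packet value from each section of a my.cnf-style text and converts it to bytes. The function names parse_size and packet_limits are made up for this illustration, and the parser assumes a simple file without !include or !includedir directives. Whether equal or differing limits actually explain the ERROR 2006 seen here is not established in this thread.

```python
import configparser

UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a MySQL-style size such as '16M' to a number of bytes."""
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def packet_limits(cnf_text):
    """Return {section: max_allowed_packet in bytes} for sections that set it."""
    # allow_no_value accepts bare flags such as 'quick';
    # strict=False tolerates repeated sections or keys.
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string(cnf_text)
    limits = {}
    for section in parser.sections():
        value = parser.get(section, "max_allowed_packet", fallback=None)
        if value is not None:
            limits[section] = parse_size(value)
    return limits


# The two sections quoted in this report.
SAMPLE = """\
[mysqld]
max_allowed_packet = 16M

[mysqldump]
quick
quote-names
max_allowed_packet      = 16M
"""

if __name__ == "__main__":
    for section, size in packet_limits(SAMPLE).items():
        print(f"[{section}] max_allowed_packet = {size} bytes")
```

For the settings quoted above, both sections resolve to the same 16M limit, which is the situation the question is about.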

I also lost the window with the restore output, and unfortunately restores do not get logged in the Backup Log. But I found them in the Webmin actions log :-)

More bug reports to come once migration is completed...

Status: 
Active

Comments

Could you post the full restore output from when the databases didn't get included? I'd like to see if there was some error message earlier or later in the process ..

Nota bene: When restoring all features, the database is correctly deleted first. Here it is not, and thus it is not recreated.

I've seen no errors before or after, otherwise would have pasted them here. :-)

Here are the Webmin page contents from the activity log (I love that activity log feature):

Starting restore of 1 domains from local file /root/backupFINAL-2012-12-17/example.com.tar.gz ..

Extracting backup archive file ..
.. done

Restoring backup for virtual server joomlapolis.com ..
Restoring allowed MySQL hosts ..
.. done
Enabling PHP modules for restored scripts ..
.. no PHP modules needed to be installed
.. restore complete.

While at it, here is the trace from the same activity log entry, which may help you (I replaced the domain and user with "example", and the IP address at the end with 1.1.1.1):

 Files changed and commands run
 Executed command
setquota -u exampleuser 0 0 0 0 \/
 Executed command
setquota -g exampleuser 0 0 0 0 \/
 Executed SQL statement in database mysql
delete from user where user = 'exampleuser'
 Executed SQL statement in database mysql
delete from db where user = 'exampleuser'
 Executed SQL statement in database mysql
insert into user (host, user, password) values ('localhost', 'exampleuser', password('XXXXXXXXX'))
 Executed SQL statement in database mysql
delete from db where host = 'localhost' and db = 'example\_databasename' and user = 'exampleuser'
 Executed SQL statement in database mysql
insert into db (host, db, user, Select_priv, Insert_priv, Update_priv, Delete_priv, Create_priv, Drop_priv, Grant_priv, References_priv, Index_priv, Alter_priv, Create_tmp_table_priv, Lock_tables_priv, Create_view_priv, Show_view_priv, Create_routine_priv, Alter_routine_priv, Execute_priv, Event_priv, Trigger_priv) values ('localhost', 'joomlapo\_production', 'exampleuser', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y')
 Executed SQL statement in database mysql
flush privileges
 Executed SQL statement in database mysql
flush privileges
 Executed command
setquota -u exampleuser 104857600 104857600 0 0 \/
 Executed command
setquota -g exampleuser 0 0 0 0 \/
 Executed command
setquota -u exampleuser 104857600 104857600 0 0 \/
 Executed command
setquota -g exampleuser 104857600 104857600 0 0 \/
 Changed file /etc/webmin/virtual-server/domains/121588260116075
3c3
< file=/etc/webmin/virtual-server/domains/
---
> file=/etc/webmin/virtual-server/domains/121588260116075
31c31
< lastsave=1355746097
---
> lastsave=1355753256
 Executed command
setquota -u exampleuser 104857600 104857600 0 0 \/
 Executed command
setquota -g exampleuser 104857600 104857600 0 0 \/
 Changed file /etc/webmin/virtual-server/domains/121588260116075
25d24
< wasmissing=1
131d129
< old_dns_ip=
152d149
< old_ip=1.1.1.1

Hope that helps. :-)

It almost looks like the backup file is missing any MySQL databases..

Could you run tar tzf /root/backupFINAL-2012-12-17/example.com.tar.gz | grep mysql and let me know what it outputs?

Doing... t is not x :-D

Here is the output of the command:

./mysqltuner.pl
./.backup/example.com_mysql
./.backup/example.com_mysql_example_databasename.gz
./public_html/includes/database.mysql5.php
./public_html/includes/database.mysqli.php
./public_html/mambots/content/geshi/geshi/mysql.php
./public_html/pbugs/adodb/datadict/datadict-mysql.inc.php
./public_html/pbugs/adodb/drivers/adodb-mysql.inc.php
./public_html/pbugs/adodb/drivers/adodb-mysqli.inc.php
./public_html/pbugs/adodb/drivers/adodb-mysqlt.inc.php
./public_html/pbugs/adodb/drivers/adodb-pdo_mysql.inc.php
./public_html/pbugs/adodb/perf/perf-mysql.inc.php
./public_html/pforum/Themes/babylon/images/h_powered-mysql.gif
./public_html/pforum/Themes/babylon/images/powered-mysql.gif
./public_html/pforum/Themes/classic/images/mysql.gif
./public_html/pforum/Themes/default/images/h_powered-mysql.gif
./public_html/pforum/Themes/default/images/powered-mysql.gif

(mysqltuner.pl was a file in the main folder of that server, not accessible from the web).
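
The grep above also matches many unrelated files whose names merely contain "mysql". As a narrower check, the following Python sketch lists only the entries directly under the .backup directory whose names contain "_mysql". The function name mysql_dump_members is hypothetical, and the assumption that database dumps live under .backup with that naming comes only from the listing above, not from Virtualmin documentation.

```python
import tarfile


def mysql_dump_members(archive_path, backup_dir=".backup"):
    """List archive members directly under backup_dir whose name contains '_mysql'."""
    found = []
    # "r:*" lets tarfile detect the compression (gzip for .tar.gz).
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar:
            parts = member.name.split("/")
            # Drop a leading "." so "./.backup/x" and ".backup/x" are treated alike.
            if parts and parts[0] == ".":
                parts = parts[1:]
            if len(parts) == 2 and parts[0] == backup_dir and "_mysql" in parts[1]:
                found.append(member.name)
    return found
```

Note that this has to read through the whole archive, so on a multi-gigabyte backup it takes about as long as the tar tzf command.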

That looks fine to me - the database example_databasename should have been restored.

I assume that on the restore form, you selected only the "Contents of server's MySQL databases" feature?

Would it be possible to get a copy of this backup file, so I can do a test restore and see if the problem can be replicated?

You probably don't want to have 9.7 Gigabytes in your inbox.... ;-)

I thought it was an easy-to-reproduce bug, but it seems that it works on a correctly restored site... So:

I need to add one more thing that might help you find the issue: Maybe "1)" is related to "2)": In the sense that following the bug "2)", I tried restoring only database as "1)".

But "2)" didn't complete, as I found out after posting this bug: it stopped at the database restoration step with the error "ERROR 2006 (HY000) at line 3633: MySQL server has gone away", and didn't run the steps after that.
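
One way to judge whether a packet limit could be involved is to measure how long the longest line of the dump is. The Python sketch below does that for a gzip-compressed dump. The function name longest_statement_line is made up, and the sketch assumes the dump is a gzip-compressed text file (as the .gz member name in the archive suggests) with one statement per line. That the longest line is what triggers the error is an unverified assumption.

```python
import gzip


def longest_statement_line(dump_path):
    """Return (line_number, length_in_bytes) of the longest line in a gzipped SQL dump."""
    longest = (0, 0)
    # Read as bytes so the length is the size on the wire, not a character count.
    with gzip.open(dump_path, "rb") as dump:
        for number, line in enumerate(dump, start=1):
            if len(line) > longest[1]:
                longest = (number, len(line))
    return longest
```

If the reported length is close to or above the server's max_allowed_packet, raising that limit is a plausible thing to try, as was done here.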

Here is the exact full restore output (from the folder containing all new-version backup files). This full restore ran, and failed, BEFORE the database-content-only restore attempt shown above. (Note that ns3 was not migrated yet and was giving errors, which could safely be ignored in all the other restores that worked.)

Also note that after that max_allowed_packet failure, the restores of all subsequent domains failed to communicate with the slave DNS servers too. Interestingly, after fixing the MySQL setting and deleting example.com, example.net and the following virtual servers, restoring them again worked fine.

Re-creating virtual server example.com ..
Creating administration group example_user ..
.. done
Creating administration user example_user ..
.. done
Creating aliases for administration user ..
.. done
Adding administration user to groups ..
.. done
Creating home directory ..
.. done
Creating mailbox for administration user ..
.. done
Adding new DNS zone ..
.. done
Adding slave zone on ns2.example.com ns1.example.com ns3.example.com ..
.. some slave servers failed :
ns3.example.com : This zone already exists
Adding to email domains list ..
.. done
Adding default mail aliases ..
.. done
Adding DKIM records to DNS domain example.com ..
.. added successfully
Adding new virtual website ..
.. done
Adding webserver user www-data to server's group ..
.. done
Performing other Apache configuration ..
.. done
Setting up scheduled Webalizer reporting ..
.. done
Adding new SSL virtual website ..
.. done
Setting up log file rotation ..
.. done
Creating MySQL login ..
.. done
Creating MySQL database example_databasename ..
.. done
Setting up spam filtering ..
.. done
Setting up virus filtering ..
.. done
Creating status monitor for website ..
.. done
Creating status monitor for SSL website ..
.. done
Creating status monitor for SSL certifcate ..
.. done
Setting up AWstats reporting ..
.. done
Setting up password protection for AWstats ..
.. done
Re-starting DNS server ..
.. done
Re-starting slave DNS servers ..
.. some slave servers failed
ns3.example.com : BIND does not appear to be running on the slave server ns3.example.com
Applying web server configuration ..
.. not running!
Saving server details ..
.. done

Restoring backup for virtual server example.com ..
Restoring virtual server password, quota and other details ..
.. done
Updating administration password and quotas ..
.. done
Restoring Cron jobs ..
.. done
Extracting TAR file of home directory ..
.. done
Setting ownership of home directory ..
.. done
Re-creating records in DNS domain ..
.. done
Restoring Apache virtual host configuration ..
.. done
Checking restored PHP execution mode ..
.. mode Apache mod_php OK for this system
Restoring Webalizer configuration files and Cron job ..
.. done
Restoring SSL Apache virtual host configuration and certificate ..
.. done
Restoring Logrotate configuration ..
.. done
Restoring allowed MySQL hosts ..
.. done
Re-loading MySQL database example_databasename ..
.. load failed! ERROR 2006 (HY000) at line 3633: MySQL server has gone away


Re-creating virtual server example.net ..
Creating administration group netexample ..
.. done
Creating administration user netexample ..
.. done
Creating aliases for administration user ..
.. done
Adding administration user to groups ..
.. done
Creating home directory ..
.. done
Creating mailbox for administration user ..
.. done
Adding new DNS zone ..
.. done
Adding slave zone on ns2.example.com ns1.example.com ns3.example.com ..
.. some slave servers failed :
ns2.example.com : Error reading response length from fastrpc.cgi :
ns1.example.com : Error reading response length from fastrpc.cgi :
ns3.example.com : Error reading response length from fastrpc.cgi :
Re-starting DNS server ..
.. done
Re-starting slave DNS servers ..
.. some slave servers failed
ns2.example.com : Error reading response length from fastrpc.cgi :
ns1.example.com : Error reading response length from fastrpc.cgi :
ns3.example.com : Error reading response length from fastrpc.cgi :
Applying web server configuration ..
.. not running!
Saving server details ..
.. done

Restoring backup for virtual server example.net ..

Restoring virtual server password, quota and other details ..
.. done
Updating administration password and quotas ..
.. done
Restoring Cron jobs ..
.. done
Extracting TAR file of home directory ..
.. done
Setting ownership of home directory ..
.. done
Re-creating records in DNS domain ..
.. done
Re-creating mail and FTP users ..
.. done
Re-creating mail aliases ..
.. done
Restoring mail and FTP user Cron jobs ..
.. done

Re-creating virtual server NEXT.......

All the following ones showed the same "Error reading response length from fastrpc.cgi :" errors on all 3 NS servers, but after I deleted those virtual servers (one by one...) and redid the restore, it was OK. Strange.

I assume that on the restore form, you selected only the "Contents of server's MySQL databases" feature?

Yes, only that checkbox.

I think the "mysql server has gone away" error may have contributed to this ... did MySQL actually crash on your system?

In my tests, I was unable to re-produce this with a simple restore.

I'm not 100% sure whether mysqld crashed or the mysql client just returned that error, as I was away during that very long restore, and no site was active during it.

Is there a log somewhere that shows whether Virtualmin's monitor caught mysqld being down and restarted it automatically?

Maybe related, but I doubt it. A separate and actually very annoying new bug "3)":

I also noticed that over the last few days, in the automatic backups on the old server (and it also happened on my first attempt at a manual backup), the example.com.tar.gz file was only 20 bytes instead of 9.7 gigabytes!

I'm attaching such a file (added .txt at end to upload)

The worst part of that other bug is that the email read as if the whole backup had succeeded: no error, no notice, nothing, exactly as if everything had gone well... except that the backup was empty, and the restore choked on the invalid tar.gz file. Fortunately, this was a server migration and not a crash recovery! :-)

For example (note the total size and backup times):

Backup is complete. Final size was 6.11 GB. Total backup time was 48 minutes, 12 seconds.

Sent by Virtualmin at: https://example.com:10000

Running pre-backup command ..
.. done

......

Creating backup for virtual server example.com ..
   Copying virtual server configuration ..
   .. done

   Backing up Cron jobs ..
   .. none defined.

   Copying records in DNS domain ..
   .. done

   Saving mail aliases ..
   .. done

   Saving mail and FTP users ..
   .. done

   Backing up mail and FTP user Cron jobs ..
   .. done

   Copying Apache virtual host configuration ..
   .. done

   Copying Webalizer configuration files ..
   .. done

   Copying SSL Apache virtual host configuration and certificate ..
   .. done

   Copying Logrotate configuration ..
   .. done

   Dumping MySQL database joomlapo_production ..
   .. done

   Copying Procmail and SpamAssassin configuration files ..
   .. done

   Backing up AWstats configuration file ..
   .. done

   Creating TAR file of home directory ..
   .. done

   Uploading archive to SSH server backupserver.example.com ..
   .. done

.. completed in 11 minutes, 10 seconds

Creating backup for virtual server nextserver.exemple.com ..

....

20 servers backed up successfully, 0 had errors.
9 Virtualmin configuration settings backed up successfully.
Deleting backups from /mnt/data/backups/example-%Y-%m-%d on SSH server backups.example.com older than 21 days ..
   Deleting file /mnt/data/backups/example-2012-11-23 via SSH, which is 22 days old ..
   .. deleted 4 kB.

.. deleted 1 old backups

Running post-backup command ..
.. done

Compare this with a (really) successful backup:

Backup is complete. Final size was 15.19 GB. Total backup time was 1 hours, 52:33 minutes.

Sent by Virtualmin at: https://example.com:10000

Running pre-backup command ..
.. done

Creating backup for virtual server joomlapolis.com ..
   Copying virtual server configuration ..
   .. done

   Backing up Cron jobs ..
   .. none defined.

   Copying records in DNS domain ..
   .. done

   Saving mail aliases ..
   .. done

   Saving mail and FTP users ..
   .. done

   Backing up mail and FTP user Cron jobs ..
   .. done

   Copying Apache virtual host configuration ..
   .. done

   Copying Webalizer configuration files ..
   .. done

   Copying SSL Apache virtual host configuration and certificate ..
   .. done

   Copying Logrotate configuration ..
   .. done

   Dumping MySQL database joomlapo_production ..
   .. done

   Copying Procmail and SpamAssassin configuration files ..
   .. done

   Backing up AWstats configuration file ..
   .. done

   Creating TAR file of home directory ..
   .. done

   Uploading archive to SSH server li98.cbpolis.com ..
   .. done

.. completed in 1 hours, 17:39 minutes

Creating backup for virtual server ......

......

20 servers backed up successfully, 0 had errors.
9 Virtualmin configuration settings backed up successfully.
Deleting backups from /mnt/data/backups/example-%Y-%m-%d on SSH server backups.example.com older than 21 days ..
   Deleting file /mnt/data/backups/example-2012-11-25 via SSH, which is 21 days old ..
   .. deleted 4 kB.

.. deleted 1 old backups

Running post-backup command ..
.. done

(Note that it says ".. deleted 4 kB.", which is the metadata size of the folder and not the size of its content.)

Unfortunately, automated backups don't seem to be logged in the activity log with the same precision as e.g. manual restores, so I didn't find any deeper trace.

But this is a sign for us that we will have to put a second, different and independent backup method in place, as we can't rely on this one alone.
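
Since the backup email gave no hint that the archive was empty, an independent check of each archive after the backup may help catch this. Below is a minimal Python sketch of such a check: it rejects files below a size threshold and files from which no tar member can be read. The name archive_looks_sane and the 512-byte default threshold are arbitrary choices for illustration. It only reads the first member, so it does not prove the archive is complete.

```python
import os
import tarfile
import zlib


def archive_looks_sane(path, min_bytes=512):
    """Return True if path has at least min_bytes and one readable tar member."""
    if os.path.getsize(path) < min_bytes:
        return False
    try:
        # "r:*" lets tarfile detect the compression (gzip for .tar.gz).
        with tarfile.open(path, "r:*") as tar:
            return tar.next() is not None
    except (tarfile.TarError, OSError, EOFError, zlib.error):
        # Unreadable, truncated or empty archives all count as failures.
        return False
```

A check like this could run from the post-backup command or on the backup server itself, and send its own alert when it returns False.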

That's really odd - it looks like you got only an empty (but valid) tar file!

Was anything else happening on the system when this failure happened, like perhaps low disk space or another backup to the same destination?

I also emailed you the 3 files (.dom and .info) if that can help.

No, we monitor disk space closely using Zabbix, with alarm emails firing within seconds when used space goes above 80%, and that didn't happen on either system.

There are also backups over SSH to the backup server from other systems, but we try not to run them at the same time. The backed-up server is a dedicated server, so no other site owner can run backups at the same time. There is only one daily scheduled backup. And it's the backup of the biggest virtual server (with big database tables and lots of files) that failed.

Is this problem still happening on your system, or can you easily re-produce it? That would make it much easier to debug ..

I will certainly keep an eye on backup sizes from now on and write here when I have something.

In the meantime, two other backup restoration bugs:

4) When restoring a domain that was sharing an SSL key with another domain on the same IP, the SSL key is not backed up with that domain, which makes it impossible to restore that domain separately on a different server with the SSL feature: it gives an error that SSL can't be restored, because /home/otherdomain/ssl.key doesn't exist.

5) But when I then try to enable the SSL feature in "edit domain", it is STILL impossible to activate SSL. This is the result of the action (btw, weirdly, it tries to change the IP address, although it stays the same, as it's the only domain on that server and uses the shared address):

Changing IP address of virtual website ..
.. done
Creating SSL certificate and private key ..
.. SSL website failed! : Failed to open /home/otherdomain/ssl.cert.webmintmp.11802 : No such file or directory at /usr/share/webmin/web-lib-funcs.pl line 1361, line 1.

Saving server details ..
.. done

Applying web server configuration ..
.. done

So I created the folder and copied over the ssl certificates. Then the feature got added ok.

So next bug:

6) But then when I go to edit the SSL certs, it says:

This virtual server shares its SSL certificate with , so it cannot be edited on this page. Use its Manage SSL Certificate page to change SSL settings.

Clicking on the link to edit doesn't go anywhere, since the corresponding virtual server doesn't exist:

This section shows the details of the SSL certificate currently being used by this virtual server.
Current SSL certificate details
SSL certificate file (EMPTY)
SSL private key file (EMPTY)
Certificate type Self-signed
Download certificate PEM format | PKCS12 format
Download private key PEM format | PKCS12 format

So yet another bug:

7) Even more weird, DISABLING the SSL feature still doesn't get rid of the pointer to the other domain, because re-enabling it doesn't re-create a clean, blank self-signed certificate.

OK, two more bugs when trying to separate websites onto different servers, one at backup time and one at restore time:

8) (backup):

+

9) (restore):

When I deselect backing up the users' email/FTP accounts, and deselect restoring them, the huge folders homes/usernames and homes/usernames/Maildir (we use IMAP, so they are huge) still get backed up AND still get restored. Even if they were backed up, they should NOT be restored when that feature is unselected at restore time.

For me, the main bug here is still "3)", and as I said I will keep an eye on it. Btw, if the destination folder is full, SSH sees that the transfer failed, and the backup is correctly marked and emailed as failed. So it's certainly not that.

For this SSL issue, did you restore all the domains that shared an SSL cert at the same time?

No, only 1 of them, which I wanted to put on a separate server (and not the one that Virtualmin decided was the "holder" of the SSL certificate, which was probably the first one that had SSL added on the old servers).

Ok, the kind of splitting of the SSL cert that would be needed to handle that case hasn't been implemented yet, sorry ..

Ok, no worries, I can live with bug "4)". It would be nice if you had a workaround for bug "7)", which prevents getting out of the bug "4)" situation. Any hint on how to make Virtualmin forget about the SSL cert that lived in the other, now non-existent domain when disabling and then re-enabling the SSL feature?

I'm still keeping an eye on the most serious of these bugs, "3)" (a 20-byte empty tar file instead of 10 gigs). I have seen backups on other old servers do that too from time to time. I'm keeping an eye on new backup sizes.

How do I get rid of error number 6) above, from my comment #14 (https://www.virtualmin.com/comment/590756#comment-590756)?

StartSSL has been distrusted, so I'm moving to Let's Encrypt, but I can't because of that error, and I can't find where to unlink manually. Disabling SSL for the domain and then re-enabling it doesn't unlink it from the non-existent domain.

Solving that issue is becoming urgent here.

Never mind, found it: in the file /etc/webmin/virtual-server/domains/XXXXXXXXXX, removing the line ssl_same="YYYYYYYY" fixed it for me.

Glad you were able to figure it out, thanks for letting us know how you fixed it!

You are welcome.

Btw, I also had to change the SSL private, public and cert file paths in that server file XXXXXXXXXXX. But that was obvious once the certs couldn't be written.
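
For anyone repeating the manual fix described above, here is a minimal Python sketch of the "remove the ssl_same line" step. It assumes the domain file is plain key=value text, as the diff excerpts earlier in this thread suggest. The function names drop_key and drop_key_in_file are invented for illustration and are not part of Virtualmin. The sketch keeps a .bak copy and does not touch the certificate and key paths, which had to be changed separately.

```python
import shutil


def drop_key(config_text, key="ssl_same"):
    """Return config_text without any 'key=value' line for the given key."""
    kept = []
    for line in config_text.splitlines(keepends=True):
        name, sep, _value = line.partition("=")
        # Drop only real assignments to exactly this key.
        if sep and name.strip() == key:
            continue
        kept.append(line)
    return "".join(kept)


def drop_key_in_file(path, key="ssl_same"):
    """Remove the key from the file at path, keeping a copy as path + '.bak'."""
    shutil.copy2(path, path + ".bak")
    with open(path, encoding="utf-8") as fh:
        original = fh.read()
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(drop_key(original, key))
```

Editing these files by hand is a workaround only; whether Virtualmin needs anything else refreshed afterwards is not covered in this thread.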

Note: The bugs reported in here are all still open.

Ok, first off what I'll look into is a way to break the SSL cert linkage with another domain manually.

Thanks, no rush, the problem is solved on my end. Now all my certs use Let's Encrypt. It took a full day to replace around 25 certs on around 12 servers because of all these linkages and a few surprises.

I found the following way to break the link manually on a server where the domains were linked properly: replace the wildcard cert, to which all domains were smart-linked, with a self-signed one generated in Virtualmin.

For the server to which I had migrated domains without the linked-to domain (splitting sites), I had to edit the Virtualmin config files.

Oh, so you replaced a wildcard cert with a regular one, and the linkages to domains that used to be valid weren't broken?

The linkages disappeared and the linked sites automatically got a self-signed certificate. That is according to the Virtualmin SSL page; I didn't check whether it was real, as I then just requested, through Virtualmin, a separate Let's Encrypt cert for each of those servers (for subdomains of the main domain that had the wildcard cert), with auto-renewal configured every 2 months in Virtualmin. Luckily I just stayed under Let's Encrypt's limit of 20 subdomains per week!

Btw, I have probably hit another bug in Virtualmin's automatic configuration of Postfix IP addresses and ports. When I deleted a server and added it back as a sub-server, it mixed up the IP addresses for Postfix, and as a result a server with a dedicated IP address didn't receive mail anymore (the secondary server got it and forwarded it once the issue was fixed). But that could have been due to an old config that wasn't "right", as Postfix was listening for SMTP on all (*) IP addresses and working fine. The new auto-config in Virtualmin replaced the * with the main server IP address, but didn't add the address of an untouched domain with a private IP address, which wasn't listed before because it wasn't needed. It's not clear enough to open a bug ticket for.

FYI, the next release of Virtualmin will include a button to break the SSL cert link to another domain.

Nice! Thanks Jamie! You rock as usual :-) Happy New Year!

(keeping open because of the other bugs)