Bad database import causing MySQL to crash [#59509]

Submitted by sgrayban on Mon, 11/19/2018 - 14:47

I attempted to move a domain from one server to my client's new server. The database didn't import correctly, and now MySQL is crashing.

MySQL 5.5.62-0+deb8u1 (Debian)

This is just one of the many errors causing the crash, and I can't figure out how to fix it.

181119 15:17:42  InnoDB: Error: table 'i474470_wp2/wp_users'
InnoDB: in InnoDB data dictionary has tablespace id 215,
InnoDB: but tablespace with that id or name does not exist. Have
InnoDB: you deleted or moved .ibd files?
InnoDB: This may also be a table created with CREATE TEMPORARY TABLE
InnoDB: whose .ibd and .frm files MySQL automatically removed, but the
InnoDB: table still exists in the InnoDB internal data dictionary.
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.

181119 15:21:19  InnoDB: error: space object of table 'i474470_wp2/wp_aiowps_login_activity',
InnoDB: space id 182 did not exist in memory. Retrying an open.
181119 15:21:19  InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
181119 15:21:19  InnoDB: Error: trying to open a table, but could not
InnoDB: open the tablespace file './i474470_wp2/wp_aiowps_login_activity.ibd'!
InnoDB: Have you moved InnoDB .ibd files around without using the
InnoDB: commands DISCARD TABLESPACE and IMPORT TABLESPACE?
InnoDB: It is also possible that this is a temporary table #sql...,
InnoDB: and MySQL removed the .ibd file for this.
InnoDB: Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting-datadict.html
InnoDB: for how to resolve the issue.
181119 15:21:19  InnoDB: cannot calculate statistics for table i474470_wp2/wp_aiowps_login_activity
InnoDB: because the .ibd file is missing.  For help, please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting.html
181119 15:21:19 [ERROR] MySQL is trying to open a table handle but the .ibd file for
table i474470_wp2/wp_aiowps_login_activity does not exist.
Have you deleted the .ibd file from the database directory under
the MySQL datadir, or have you used DISCARD TABLESPACE?
See http://dev.mysql.com/doc/refman/5.5/en/innodb-troubleshooting.html
how you can resolve the problem.
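
A minimal sketch for pulling the affected table names out of an error log like the one above, so it is clear which tables InnoDB believes have lost their tablespace. The two regular expressions are assumptions based only on the message shapes quoted in this report; other MySQL versions may word these errors differently. The function name is hypothetical.

```python
import re

# Patterns derived from the two message shapes quoted in this report.
# ASSUMPTION: other MySQL versions may phrase these errors differently.
MISSING_IBD = re.compile(r"open the tablespace file '\./([^/']+)/([^']+)\.ibd'")
MISSING_SPACE = re.compile(r"InnoDB: Error: table '([^/']+)/([^']+)'")


def tables_with_missing_tablespace(log_text):
    """Return sorted, unique (database, table) pairs named in the errors."""
    found = set()
    for pattern in (MISSING_IBD, MISSING_SPACE):
        for database, table in pattern.findall(log_text):
            found.add((database, table))
    return sorted(found)


# Two lines copied from the log above, used as sample input.
sample = (
    "181119 15:17:42  InnoDB: Error: table 'i474470_wp2/wp_users'\n"
    "InnoDB: open the tablespace file './i474470_wp2/wp_aiowps_login_activity.ibd'!\n"
)
for database, table in tables_with_missing_tablespace(sample):
    print(f"{database}.{table}")
```

To run it against a real log, read the file contents into a string and pass that to the function instead of `sample`.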

I have the database running with innodb_force_recovery = 4.

But this isn't going to work for long. Please HELP !!

Status:

Closed (cannot reproduce)

Comments

Submitted by JamieCameron on Mon, 11/19/2018 - 16:47 Comment #1

Wow, that's an unusual one! A Virtualmin database import just runs SQL statements to re-create tables from the original system, which I wouldn't expect to be able to crash MySQL!

Was the import from the same MySQL version?

Submitted by sgrayban on Mon, 11/19/2018 - 16:52 Comment #2

No. But I moved other sites from the same server with the older MySQL and had no problems.

This move crippled MySQL.

Submitted by andreychek on Mon, 11/19/2018 - 17:23 Comment #3

Is it possible that some of the database files were copied manually? While we have never seen an issue like the above after a typical backup/restore, we have seen issues like that when database files were manually copied from one system to another.

Joe's thought was to try deleting the domain in question, and see if MySQL starts normally at that point.

If it does, re-attempt the restore, and note the restore output.

Does MySQL work okay now, or do you see those same issues?

If you see the same issues, what is the output you received during the restore?

Submitted by sgrayban on Mon, 11/19/2018 - 18:19 Comment #4

I did delete the domain, but it errored out on the database with the same errors. Now MySQL is running in safe mode, so no writes to any database.

I even tried deleting the database and it won't let me. This is beyond my knowledge of MySQL and I have no idea how to fix it.

Submitted by sgrayban on Mon, 11/19/2018 - 18:21 Comment #5

I enabled ssh for support to look at.

Submitted by sgrayban on Mon, 11/19/2018 - 18:50 Comment #6

I placed a dump from the old server in /var/lib/mysql

file is i474470_wp2.dump

Submitted by andreychek on Mon, 11/19/2018 - 18:50 Comment #7

I've asked Joe for his thoughts on that problem.

In the meantime, would it be possible for us to access that backup to see if we can reproduce that problem on our test systems?

Submitted by sgrayban on Mon, 11/19/2018 - 18:52 Comment #8

Do you want a whole backup of the domain or just the DB?

I did place a DB dump in /var/lib/mysql; the file is i474470_wp2.dump

Submitted by sgrayban on Mon, 11/19/2018 - 18:57 Comment #9

The old MySQL version is:

mysql Ver 14.14 Distrib 5.5.54, for debian-linux-gnu (x86_64) using readline 6.2

Submitted by andreychek on Mon, 11/19/2018 - 19:00 Comment #10

We'd like whatever was used immediately prior to seeing this database issue.

If this initially happened during a Virtualmin restore, we'd definitely want that exact Virtualmin backup archive that was used.

Submitted by sgrayban on Mon, 11/19/2018 - 19:02 Comment #11

OK, I am making a backup of the domain.

I initially moved the domain via Cloudmin, server-to-server, not manually.

Submitted by sgrayban on Mon, 11/19/2018 - 19:04 Comment #12

OK, how do you want the backup? It's 2 GB in size.

Submitted by andreychek on Mon, 11/19/2018 - 19:07 Comment #13

Is it possible to put it into a directory on the server that's experiencing the problem? We can download it from there.

Also, to help us in reviewing the logs, when did you first attempt the migration/restore?

Submitted by sgrayban on Mon, 11/19/2018 - 19:09 Comment #14

It's in /root/domain-backup

In the meantime I need to get MySQL working. It's in read-only mode and I have unhappy clients right now.

Submitted by andreychek on Mon, 11/19/2018 - 19:23 Comment #15

Thanks!

To help us in reviewing the logs, when did you first attempt the migration/restore? That is, when did this issue all begin?

We unfortunately don't have the staffing to be able to work on an emergency/on-call basis, so I don't have an ETA for when someone will be able to review that. That's also assuming it's fixable at all.

If you are experiencing a critical issue and you need to get the system online ASAP, you may need to look into restoring the system from a previous backup. For example, if this is a VPS, if you happen to have a VPS snapshot available, that might be something to consider.

Submitted by sgrayban on Mon, 11/19/2018 - 19:53 Comment #16

Sunday 2-3 pm.

So I have a broken system because an import failed? Wow.

No, it's a dedicated server. No backups.

Submitted by Joe on Tue, 11/20/2018 - 19:28 Pro Licensee Comment #17

I've never seen this error before, so I'm flying pretty blindly here.

But, according to the document linked in the error you pasted above, this can happen if MySQL crashes or is otherwise interrupted during a migration (maybe OOM killer, or maybe disk space, maybe something else). And, the link also provides steps for correcting it, but I wouldn't feel comfortable attempting them or suggesting you attempt them without the system having good backups first.

I don't think this is really related to Virtualmin or Cloudmin (as there's nothing they do that should be able to make MySQL crash), I think it just happened during a move because a move does a lot of work in a short time. So, any disk/memory/kernel/MySQL issues or bugs would be most likely to be triggered at that time. (Of course, if there is a repeatable issue with moving a database, we'll fix it, but I think we would have heard about it before now, since this part of the code hasn't changed much in years, and this is a relatively old and well-supported version of MySQL.)

Submitted by sgrayban on Tue, 11/20/2018 - 19:38 Comment #18

Do you know of a way to force the database to drop even with bad tables?

Submitted by Joe on Tue, 11/20/2018 - 23:40 Pro Licensee Comment #19

The link in your paste above says this:

"To work around this problem, start the mysql client with the --skip-auto-rehash option and try DROP TABLE again. (With name completion on, mysql tries to construct a list of table names, which fails when a problem such as just described exists.) "

So, I think they're saying that (starting with --skip-auto-rehash) should allow you to drop tables that exhibit this problem. So, maybe that will allow you to get things into a usable state.
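
A minimal sketch of what that suggestion could look like in practice, assuming the standard `mysql` command-line client is on the PATH. It only builds and prints the command so it can be reviewed first; the helper name is hypothetical, and the database and table names are the ones from the error log earlier in this thread. Credentials options are left out on purpose.

```python
import shlex


def drop_table_command(database, tables):
    """Build a mysql client argv that drops the given tables.

    --skip-auto-rehash is the option quoted from the MySQL manual above.
    Nothing is executed here; the caller decides whether to run it.
    """
    # Backtick-quote each identifier, doubling any embedded backticks.
    quoted = ", ".join("`" + t.replace("`", "``") + "`" for t in tables)
    sql = f"DROP TABLE {quoted};"
    return ["mysql", "--skip-auto-rehash", "-e", sql, database]


cmd = drop_table_command("i474470_wp2", ["wp_users", "wp_aiowps_login_activity"])
print(shlex.join(cmd))  # print only; review before running anything
```

Whether the DROP succeeds while the server is in forced recovery mode depends on the recovery level, so treat this as a starting point rather than a guaranteed fix.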

Submitted by sgrayban on Wed, 11/21/2018 - 00:48 Comment #20

Tried that... I can't even dump the good databases while it's in recovery.

I'm so fucked.

Submitted by sgrayban on Wed, 11/21/2018 - 00:50 Comment #21

Do you know anyone who's a pro at MySQL?

In my 30 years of maintaining Linux servers I have never had a SQL server get screwed up like this.

Submitted by sgrayban on Wed, 11/21/2018 - 01:38 Comment #22

OK, I was able to remove the bad database, but MySQL still won't start.

Submitted by Jfro on Wed, 11/21/2018 - 02:38 Comment #23

I don't know if this makes sense, but since you say customers are waiting:

Would it be an idea to set up an extra new server for those customers and try to get them back online with the databases and data you have, before time runs out and they leave? (Even a clean, empty database can sometimes make sense, since they can carry on with their work. Yes, you would have more work afterwards to merge the data.) It depends on how important things are in each case.

Also, even if MySQL is running and their databases are working, do you know for sure that they are not corrupted somehow and don't contain unreliable data?

Submitted by sgrayban on Wed, 11/21/2018 - 02:46 Comment #24

Good news: I was able to make backups of all the other databases that are OK.

Not sure what I should do next. Delete MySQL and start over? Is that possible? I wonder if I can dump the users and passwords that are stored in MySQL.

Submitted by Joe on Wed, 11/21/2018 - 04:11 Pro Licensee Comment #25

So, you weren't able to remove the problem tables, or not able to drop the problem database after removing those tables?

What specific errors are you getting when trying to remove the problem tables and/or drop the database?

Submitted by sgrayban on Wed, 11/21/2018 - 04:17 Comment #26

I got the DB to drop, but MySQL still errors out. It's so cryptic it's not funny. I have googled everything I can think of, but no go.

When stopping MySQL, all I get is this line: "InnoDB: Waiting for 1 active transactions to finish". It seems to be in a loop and I can't figure out why. I think I've run out of ways to fix this correctly.

181121  2:33:12 [Note] Plugin 'FEDERATED' is disabled.
181121  2:33:12 InnoDB: The InnoDB memory heap is disabled
181121  2:33:12 InnoDB: Mutexes and rw_locks use GCC atomic builtins
181121  2:33:12 InnoDB: Compressed tables use zlib 1.2.8
181121  2:33:12 InnoDB: Using Linux native AIO
181121  2:33:12 InnoDB: Initializing buffer pool, size = 128.0M
181121  2:33:13 InnoDB: Completed initialization of buffer pool
181121  2:33:13 InnoDB: highest supported file format is Barracuda.
InnoDB: Log scan progressed past the checkpoint lsn 903884773
181121  2:33:13  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 903964929
InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 1 row operations to undo
InnoDB: Trx id counter is 1A8A500
181121  2:33:13  InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 
InnoDB: Apply batch completed
InnoDB: Starting in background the rollback of uncommitted transactions
181121  2:33:13  InnoDB: Rolling back trx with id 1993F72, 1 rows to undo
181121  2:33:13  InnoDB: Waiting for the background threads to start
181121  2:33:13  InnoDB: Assertion failure in thread 140519250327296 in file fut0lst.ic line 83
InnoDB: Failing assertion: addr.page == FIL_NULL || addr.boffset >= FIL_PAGE_DATA
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
07:33:13 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

key_buffer_size=838860800
read_buffer_size=104857600
max_used_connections=0
max_threads=151
thread_count=0
connection_count=0
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 16592589 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x33)[0x559ee1db60f3]
/usr/sbin/mysqld(handle_fatal_signal+0x3e4)[0x559ee1ca1b14]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7fcd7c8bf890]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7fcd7b28d067]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7fcd7b28e448]
/usr/sbin/mysqld(+0x613e4d)[0x559ee1ee0e4d]
/usr/sbin/mysqld(+0x596591)[0x559ee1e63591]
/usr/sbin/mysqld(+0x5abd6a)[0x559ee1e78d6a]
/usr/sbin/mysqld(+0x5a3cd4)[0x559ee1e70cd4]
/usr/sbin/mysqld(+0x5a4f80)[0x559ee1e71f80]
/usr/sbin/mysqld(+0x59f85f)[0x559ee1e6c85f]
/usr/sbin/mysqld(+0x65844c)[0x559ee1f2544c]
/usr/sbin/mysqld(+0x658bb6)[0x559ee1f25bb6]
/usr/sbin/mysqld(+0x59de4a)[0x559ee1e6ae4a]
/usr/sbin/mysqld(+0x59e523)[0x559ee1e6b523]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8064)[0x7fcd7c8b8064]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fcd7b34062d]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
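
As a side note on the memory estimate printed in the crash report above, here is a small worked calculation of the formula as the log states it. The key_buffer_size, read_buffer_size and max_threads values are copied from the log; sort_buffer_size is not shown there, so the 2 MiB used below is purely an assumption. The point is only that a 100 MiB per-thread read buffer multiplied by 151 threads lands in the same range as the roughly 16 GB figure the log reports, which may be relevant to the out-of-memory possibility raised earlier in the thread.

```python
def worst_case_memory_bytes(key_buffer_size, read_buffer_size,
                            sort_buffer_size, max_threads):
    """Evaluate the formula exactly as printed in the crash report."""
    return key_buffer_size + (read_buffer_size + sort_buffer_size) * max_threads


total = worst_case_memory_bytes(
    key_buffer_size=838860800,          # from the log (800 MiB)
    read_buffer_size=104857600,         # from the log (100 MiB per thread)
    sort_buffer_size=2 * 1024 * 1024,   # ASSUMPTION: not shown in the log
    max_threads=151,                    # from the log
)
print(f"{total // 1024} K, about {total / 1024**3:.1f} GiB")
```

The result will not match the log's number to the last digit, since the server's own estimate may include terms this simple formula leaves out.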

Submitted by Joe on Wed, 11/21/2018 - 04:17 Pro Licensee Comment #27

And, is it possible there's a hardware (i.e. disk) problem? This is pretty surprising behavior...MySQL is pretty reliable, I've never seen it in an unrecoverable state, at least not in the past decade or so.

Submitted by sgrayban on Wed, 11/21/2018 - 04:22 Comment #28

It doesn't seem to be able to fix that one transaction:

181121  5:19:32  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 903964929
InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 1 row operations to undo

Do you think I should just remove MySQL and reinstall it?

Submitted by andreychek on Wed, 11/21/2018 - 10:44 Comment #29

Joe and/or Jamie may have some additional thoughts, but just to chime in while the west coast is sleeping --

It may be fixable, it may not be. However, if it were my server, I wasn't having any luck resolving the database issue, and I wanted to get it up and running ASAP, I would seriously consider doing this:

(this all assumes I have a fairly recent Virtualmin backup containing the databases)

1. I'd start by making a new backup of everything possible
2. Remove the mysql-server package
3. Ensure that /var/lib/mysql is cleared out
4. Reinstall the mysql-server package
5. Restore Virtualmin backups for all my domains, but just the database feature

There's likely no reason to restore the full domains, and restoring just the database feature for all your domains should be all you'd need to get MySQL back up and running to a known good state.
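
The steps above can be sketched as a dry-run plan, assuming a Debian system with apt-get. This only prints the commands; it does not run them. Several details are assumptions rather than anything stated in this thread: the helper name and output file name are hypothetical, moving the data directory aside (instead of deleting it) is a cautious substitute for "cleared out", and the `virtualmin restore-domain` flags shown are from memory and should be checked against `virtualmin help restore-domain` before use.

```python
import shlex


def rebuild_plan(backup_dir, datadir="/var/lib/mysql"):
    """Return the reinstall steps as a list of argv lists, in order."""
    return [
        # 1. Fresh dump of whatever is still readable (may fail on corrupt tables).
        ["mysqldump", "--all-databases",
         f"--result-file={backup_dir}/all-databases.sql"],
        # 2. Remove the server package, as described in the steps above.
        ["apt-get", "remove", "mysql-server"],
        # 3. ASSUMPTION: move the old data directory aside rather than delete it.
        ["mv", datadir, datadir + ".broken"],
        # 4. Reinstall, which creates a fresh data directory.
        ["apt-get", "install", "mysql-server"],
        # 5. ASSUMPTION: flag names to be verified against the Virtualmin CLI help.
        ["virtualmin", "restore-domain", "--source", backup_dir,
         "--all-domains", "--feature", "mysql"],
    ]


for step in rebuild_plan("/root/domain-backup"):
    print(shlex.join(step))  # dry run: print each command for review
```

Printing the plan instead of executing it is deliberate, since steps 2 and 3 are destructive and should only be run once the backups in step 1 have been checked.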

Joe and/or Jamie may have some further thoughts for you on how to resolve the database corruption you're having. But as it seems like time is an issue here, if you have a way to revert to a known good state, maybe that would be a way to get things up and running again.

We'll hear from the other guys soon, though, if that's not the route you want to go :-)

Submitted by sgrayban on Wed, 11/21/2018 - 15:23 Comment #30

OK, I have the backups. Crossing fingers, removing MySQL and reinstalling it. Just for good measure I did make a backup of mysql.sql, just in case.

Submitted by sgrayban on Wed, 11/21/2018 - 16:16 Comment #31

It's back and all is well !! whew...........

Submitted by Joe on Wed, 11/21/2018 - 16:26 Pro Licensee Comment #32

Just rolling back that transaction would probably sort it out. It's possible there are other issues hiding behind that one (i.e., it's stopping as soon as it sees that problem and doesn't show others, but I think it usually complains about everything it sees wrong when starting up).

Submitted by Joe on Wed, 11/21/2018 - 16:30 Pro Licensee Comment #33

Glad you got it sorted. So, when you try the migration again, watch for errors, and let us know if something goes wrong. Make sure you've got all the latest updates and such first, though, and maybe do a basic disk check (something like badblocks--just a read only test is probably sufficient if this is the only weird problem you've seen on this hardware). Since this is a somewhat strange issue, and a crashing MySQL is possibly a sign of something wrong, you'll wanna be on the look out for what caused that crash (a regular old import from a dumped database obviously shouldn't crash MySQL!).

You might try it from the command line to ensure you see all the errors.

Submitted by sgrayban on Wed, 11/21/2018 - 16:32 Comment #34

I had to completely delete MySQL and /var/lib/mysql.

I restored just the databases and everything is back in working condition.

Submitted by sgrayban on Wed, 11/21/2018 - 16:35 Comment #35

I'm not going to risk another try at moving that domain again. Cost was too high.

And in 30 years of working on Linux servers I have never seen a complex failure like this. I hope I never see one again either.

Submitted by Joe on Sun, 11/25/2018 - 17:18 Pro Licensee Comment #36

OK, glad you got it sorted out. I would still recommend doing some testing on that server. I strongly suspect it wasn't specific to that one database; more likely it is a systemic problem that was just triggered by the large amount of activity. It could also be a MySQL or kernel bug, and making sure you're up to date will hopefully help prevent it from happening again.