Webmin and Virtualmin as well as all virtual servers fail after operating system update

Did an update today which required a reboot, and after the reboot things have gone a bit pear-shaped. I can no longer get Webmin or Virtualmin to load in a browser, nor can I get any websites on the server to load. I can't SSH in externally either. The only way I can log in to the server is through a terminal on the Hyper-V host machine. Every reboot I have done shows Hyper-V services failing (Starting Hyper-V File Copy Protocol Daemon, Starting Hyper-V VSS Protocol Daemon, Starting Hyper-V KVP Protocol Daemon). I have rebooted a few times since with little effect. Not sure where to go from here, so any help is appreciated.

Status: 
Active

Comments

Howdy -- if you run "/sbin/ifconfig", does that show your correct server IP address? You'd want to make sure the IP address hasn't changed.

Also, what is the output of this command:

ps auxw

That will show what processes are running. If you aren't able to access Apache or SSH, seeing if they're running is a good step.

Hi, thanks for the prompt reply. eth0 and eth1 both point to the correct addresses: one local, the other internet-facing. ps auxw shows a lot of output that I can't copy, as it's running in a terminal on a Windows Hyper-V server (I've tried copying, but with no success).

Well, the thing to do would be to review that process list, and see if you spot the Apache and SSH processes.

Actually, a quick way to do that would be with these two commands:

ps auxw | grep apache2
ps auxw | grep ssh
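
One wrinkle with piping ps through grep is that the grep process itself shows up in the results. Here is a minimal sketch of a helper that skips that line. The name proc_listed is made up for illustration (it is not part of Webmin or Virtualmin), and it only inspects whatever ps output is piped into it:

```shell
# proc_listed NAME
# Reads `ps auxw` output on stdin and succeeds if some line mentions NAME,
# ignoring any line that belongs to a grep command.
# (Hypothetical helper, for illustration only.)
proc_listed() {
  awk -v name="$1" '!/grep/ && index($0, name) { found = 1 } END { exit !found }'
}

# Usage on the server:
#   ps auxw | proc_listed apache2 && echo "apache2 is running"
#   ps auxw | proc_listed sshd    && echo "sshd is running"
```

If neither message prints, the service isn't running and that would be the first thing to fix.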

Also, can you ping the outside world from the terminal?

For example, can you ping this Google DNS server:

ping -c 1 8.8.8.8

ps auxw | grep apache2 returns about 13 lines, most of which start with www-data and end with /usr/sbin/apache2 -k start. The other command returns 2 lines, one starting with root and the other with my username. The root line ends with /usr/sbin/sshd -D, and the one with my username ends with grep --color=auto ssh.

Ping returns 100% packet loss.

I can add that ping was successful to a LAN machine. Seems the problem is eth0.
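
To narrow down where the packets stop, the same ping test can be repeated against a few targets in order: a LAN machine, the gateway on eth0, then an outside address. A rough sketch follows. check_host and the PING_CMD variable are names made up here, and the -W 2 timeout assumes the Linux iputils version of ping:

```shell
# Command used to probe a host. Override PING_CMD to try the function
# without touching the network. (Hypothetical variable name.)
PING_CMD="${PING_CMD:-ping -c 1 -W 2}"

# check_host HOST: sends one probe and prints whether HOST answered.
check_host() {
  if $PING_CMD "$1" >/dev/null 2>&1; then
    echo "$1 reachable"
  else
    echo "$1 unreachable"
  fi
}

# Usage, working outwards from the server:
#   check_host <a LAN machine>
#   check_host <the gateway on eth0>
#   check_host 8.8.8.8
```

The first target that comes back unreachable points at the hop where things break.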

Ah, it sounds like your server is having a networking problem then.

First off, is your Internet online, and can other machines on the network access the Internet (especially the host system)?

If so, you may need to review the network settings on your server, such as the gateway, to verify that it's all set up as expected.

The fact that I can RDP into the host machine indicates there is no network issue in the datacentre. Also, we have a Windows VM that handles email, and that's running fine. The issue lies with the updates I installed today and the reboot afterwards, but I don't know what it might be. I tried turning off the firewall, but it is already disabled. Any other thoughts?

I'd be surprised if it were related to the updates, though it could be related to the reboot.

Some sort of network setting may have been changed at some point in the config files, and rebooting may have activated that.

My suggestion is to review the network settings being used by your server, since it doesn't appear to be able to get outside the LAN. It could be that there is a missing or incorrect gateway.

I checked /etc/network/interfaces and all appears as it should. Am going to try to restore a backup made the day before yesterday. However, I'm not overly confident in that, and any support I can get here in Oz won't be available for another 8 hours. I've shut the VM down for the moment while the restore is in progress, as resources may dwindle to nothing if both are running alongside the Windows VM.

Actually my nerve failed on doing the restore. Will have to wait till I can call the help line. Thanks anyway.

I'd be curious what the output of these commands is:

iptables -L -n
route -n
/sbin/ifconfig

I suspect you don't need to restore, and that may not fix the issue anyhow. It may just be a configuration option that needs to be tweaked.

Hi, the first command returns: Error: could not insert 'ip_tables': Operation not permitted. iptables v1.4.21: can't initialize iptables table 'filter': Table does not exist (do you need to insmod?) Perhaps iptables or your kernel needs to be upgraded.

Second command:

Destination     Gateway         Genmask         Flags Metric Ref Use Iface
0.0.0.0         111.223.233.17  0.0.0.0         UG    0      0   0   eth0
111.223.233.16  0.0.0.0         255.255.255.240 U     0      0   0   eth0
192.168.3.0     0.0.0.0         255.255.255.0   U     0      0   0   eth1
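
For anyone who can't easily copy text out of the console, the one value that matters most in that routing table is the default gateway. A small sketch that pulls it out of route -n output; default_gateway is a made-up name, and it assumes the usual net-tools column order (Destination, Gateway, Genmask, Flags, ...):

```shell
# default_gateway: reads `route -n` output on stdin and prints the gateway
# of the first default route (destination 0.0.0.0 with the G flag set).
# (Hypothetical helper, for illustration only.)
default_gateway() {
  awk '!seen && $1 == "0.0.0.0" && $4 ~ /G/ { print $2; seen = 1 }'
}

# Usage:
#   route -n | default_gateway
```

If nothing is printed, there is no default route at all, which would explain being able to reach the LAN but not the outside world.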

Third command: eth0 Link encap:Ethernet HWaddr (MAC address), followed by the address of the server and a lot of other stuff that's too much to type :)

It doesn't sound like a firewall is the issue, as it doesn't look like the various firewall modules are loaded.

It really looks like you're experiencing a networking issue of some kind.

However, I'm not familiar with the network setup there, so that's unfortunately not something I would know how to fix.

My suggestion would be to triple-check the network settings for eth0.

It looks like the gateway is set up as 111.223.233.17, with 111.223.233.16 as the network address and a netmask of 255.255.255.240.
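
One quick sanity check on those numbers is whether the gateway actually falls inside the eth0 network. The network address is just the IP ANDed with the netmask, octet by octet. A minimal sketch of that arithmetic; network_of is a made-up helper name, and it assumes plain dotted-decimal input with no leading zeros:

```shell
# network_of IP NETMASK: prints the network address, i.e. IP AND NETMASK
# computed one octet at a time.
# (Hypothetical helper, for illustration only.)
network_of() {
  local ip="$1" mask="$2" old_ifs="$IFS"
  IFS=.
  set -- $ip $mask   # split both addresses on dots into eight octets
  IFS="$old_ifs"
  echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

# Usage:
#   network_of 111.223.233.17 255.255.255.240
```

For the gateway above this prints 111.223.233.16 (17 AND 240 is 16), which matches the network in the routing table, so the gateway and netmask are at least consistent with each other.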

Also, as a troubleshooting step, you may want to consider bringing eth1 offline temporarily (assuming that's the LAN IP), just to rule out some sort of unusual issue there.

Thanks for that. Managed to SSH in via PuTTY on the host machine using the LAN IP address of the server. Can now copy and paste to and from the Virtualmin machine. Have decided that it's time to restore, as tech support shouldn't be too far away (it's just gone 8:30am).

Having said that, I've shut down the original machine, and if what you think is true, the restored machine should exhibit the same issues. If it does, I can shut the restored machine down and boot up the old one. At least that's the theory I'm currently working on. I'm currently at 72% of the restore process.

Ok then you are right, the restored machine is doing the same thing. So am shutting it down and rebooting the original.

To take eth1 down is that just "sudo ifconfig eth1 down"?

Shutting eth1 down restored internet connection momentarily. So I'm not sure where to go with this now.

When you say it restored it momentarily -- does that mean it worked briefly, but is no longer working?

Correct. For about 3 minutes.

I could ping 8.8.8.8 and connect to a website. However when I went to connect via webmin it fell over again.

It's tough to say what you're seeing there...

Do you have other working servers on that same network?

If so, what is the output of these two commands on them:

/sbin/ifconfig
route -n

Also, on the server that's not working properly, what is the output of this command:

dmesg | tail -50

While my initial thought was that you may be seeing a networking issue, the above command may offer some additional insight into other possibilities, such as whether it's hardware related.
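
If the dmesg output is long, it can help to filter it down to network-related lines first. A rough sketch, assuming the interfaces are named ethN and that the VM uses the Hyper-V synthetic network driver (hv_netvsc). The function name net_messages is made up for illustration:

```shell
# net_messages: reads kernel log text on stdin and keeps only lines that
# mention an ethN interface, the hv_netvsc driver, or a link up/down event.
# (Hypothetical helper, for illustration only.)
net_messages() {
  grep -iE 'eth[0-9]+|hv_netvsc|link (is )?(up|down)'
}

# Usage:
#   dmesg | net_messages | tail -50
```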

All good now. The restore from the day before fixed the problem. The engineer's thought was that when I did the update yesterday, it also suggested running apt-get autoremove. I did this, but it may have removed more than was necessary. The restored machine is running sweetly again, but with the loss of a day's data (no biggie really).
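
For next time, apt-get has a simulate flag (-s) that shows what autoremove would do without changing anything. A small sketch that lists just the package names it would remove. would_remove is a made-up helper name, and it relies on the simulated output marking removals with lines that start with "Remv":

```shell
# would_remove: reads the output of `apt-get -s autoremove` on stdin and
# prints the name of each package that would be removed.
# (Hypothetical helper, for illustration only.)
would_remove() {
  awk '$1 == "Remv" { print $2 }'
}

# Usage:
#   apt-get -s autoremove | would_remove
```

Anything in that list that looks like a kernel, network, or Hyper-V related package is worth a second look before running the real command.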

Thank you for all your help in trying to troubleshoot this.

Great, I'm glad that's working now!

Hi,

Actually, just noticed something: after the restore I can no longer access AWstats from Virtualmin. Got to admit, though, that I hadn't checked stats from Virtualmin for a bit, so am not sure whether it was broken before the update disaster last Friday.

In Virtualmin I can go to the domain > Logs and Reports > AWstats Report. It processes something, but nothing shows up in the right-hand pane as it used to. AWstats still works directly through domain.com/awstats/awstats.pl, so the customer can still see the stats, which suggests AWstats itself is still operating correctly. It's just that I can't see them without checking each individual domain. Is there something I can do to check what may be the cause, please?

However I don't know if this is related to the restore or something else. There are 18 package updates waiting to be installed but will do that tomorrow after doing a backup first (just in case).

Thanks for the help.

Just did the package update again and it killed the server. Restored again, and we are now going to build a new server (CentOS) and migrate the sites over to it. No idea what is going on, but the thought is that these package updates are in some way conflicting with Hyper-V. Also, can we install Virtualmin on CentOS 7?

That's very odd! I'm not sure what might be causing the issue you're seeing... though if you wanted to troubleshoot it, you could try updating one package at a time, and testing after each one.
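
As a rough sketch of the one-at-a-time approach: the simulated upgrade output lists each pending package on a line starting with "Inst", and that can be turned into one command per package. The upgrade_commands name is made up for illustration, and the commands are only printed, not run:

```shell
# upgrade_commands: reads `apt-get -s upgrade` output on stdin and prints one
# `apt-get install --only-upgrade` command per pending package.
# (Hypothetical helper, for illustration only.)
upgrade_commands() {
  awk '$1 == "Inst" { print "apt-get install --only-upgrade " $2 }'
}

# Usage:
#   apt-get -s upgrade | upgrade_commands
```

You could then run the printed commands one by one, checking that the network is still up (for example with a ping to 8.8.8.8) after each, and ideally with a fresh backup in hand first.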

We haven't heard other reports of problems with Hyper-V.

Virtualmin is supported on CentOS 7 though, so you're welcome to perform an install onto that.

Yeah not sure what the issue is but the updates were all related to BIND when things fell over. I had installed a couple of extra things on the server that I won't install this time, to leave it as vanilla as possible. The bigger brains than I (engineers that is) prefer CentOS as they understand it and use it all the time. We started this journey back in August with Ubuntu so I think it best to now bow to their wishes and let them install that. It's a good suggestion though to do one update at a time which I might do after we have everyone bedded down in CentOS.

I tried doing the updates one at a time and everything fell over on the header updates (linux-generic-lts-utopic, linux-headers-generic-lts-utopic and linux-image-generic-lts-utopic). So we restored again to keep things running. That's given us time to install CentOS.

CentOS now is up and running and have migrated all sites over to it. When I run "Check Configuration" I get "An error was found in the ProFTPd configuration template : Unix group nogroup in Group directive does not exist. This must be fixed by editing the Default Settings on the Server Templates page." I have compared both servers, Ubuntu doesn't have this issue but CentOS does. Any idea what may cause this and is there an easy fix? And should I have this in a different support ticket?
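
For comparing the two servers, one thing that can be checked directly is whether a group with that name exists on each. A minimal sketch that works on /etc/group-style text; group_listed is a made-up helper name. Note that Red Hat family systems typically name the unprivileged group nobody rather than nogroup, though that is an assumption about this particular install:

```shell
# group_listed NAME: reads /etc/group-style text on stdin and succeeds if a
# group called NAME is present.
# (Hypothetical helper, for illustration only.)
group_listed() {
  awk -F: -v g="$1" '$1 == g { found = 1 } END { exit !found }'
}

# Usage:
#   getent group | group_listed nogroup || echo "no group named nogroup"
#   getent group | group_listed nobody  && echo "group nobody exists"
```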

That is odd about the Ubuntu update! I'm not sure why that one package would cause that issue, though knowing that, we can keep an eye out for that in the future. Thanks for letting us know!

Regarding the CentOS / ProFTPd issue -- yeah, if you could open a new request for that, that would be fantastic.

In that new request, include the output of this command:

rpm -qa | grep proftp

Thanks!