Updating IPs taking forever!

Hello, I am trying to change the IP of my server, and it is a nightmare. I don't know if it will get done by tomorrow, and what am I supposed to explain to people... The change has already propagated from the registrar; a few domains have the new IP, others the old one, and I am thinking that if I try to reverse this it will, of course, take even MORE time. For each domain I get some errors:

Changing IP address for bla.ro ..

Changing IP address of virtual website ..

.. done

.. Webalizer reporting failed! : Failed to lock file /etc/webmin/webalizer/home_bla_logs_access_log.log after 5 minutes. Last error was : at /usr/libexec/webmin/web-lib-funcs.pl line 1427.

Changing IP address of SSL virtual website ..
.. done

Updating Webmin user ..
.. done

Saving server details ..
.. done

or

.. BIND DNS domain failed! : Failed to lock file /var/named/bla.ro.hosts after 5 minutes. Last error was : at /usr/libexec/webmin/web-lib-funcs.pl line 1427.

So that script really hates line 1427; but a 5-minute timeout? What's wrong with 3 seconds :D ?

Status: 
Active

Comments

It sounds like some .lock files may be holding things up.

That indicates that either another Webmin process is currently doing something with those files -- or previously, while working with them, a Webmin process got interrupted.

You may want to check and see if there are still .lock files present in those directories.

What is the output of these two commands:

ls -la /var/named/
ls -la /etc/webmin/webalizer/

If there is a .lock file there, you'd want to look inside it to get the PID of the process that's holding it open. If that PID no longer exists, it can be safely removed.
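As a rough sketch of that check (not Webmin's own tooling), the function below reads the PID from a lock file and tests whether that process still exists. It assumes the PID is the first word on the first line of the .lock file; the function name lock_status is made up for this example. Run it as root, since kill -0 also fails for processes owned by another user.

```shell
# Report whether the process recorded in a lock file is still running.
# Assumption: the first word of the first line is the owning PID.
# Prints one of: "missing", "unreadable", "held <pid>", "stale <pid>".
lock_status() {
    local lock_file=$1 pid=""
    if [ ! -e "$lock_file" ]; then
        echo "missing"
        return 0
    fi
    # read fails on a file without a trailing newline but still sets pid
    read -r pid _ < "$lock_file" || true
    case $pid in
        ''|*[!0-9]*) echo "unreadable"; return 0 ;;
    esac
    # kill -0 sends no signal; it only tests whether the PID can be signalled
    if kill -0 "$pid" 2>/dev/null; then
        echo "held $pid"
    else
        echo "stale $pid"
    fi
}
```

For example, lock_status /var/named/bla.ro.hosts.lock would print "stale <pid>" if the process that created the lock is gone, in which case the file should be safe to remove.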

Submitted by fakemoth on Fri, 05/20/2016 - 14:54

Thanks, but I didn't think someone would answer me that fast :) so I decided to reboot the machine. After that it worked fine; I just hope it didn't trash anything else...

Submitted by fakemoth on Fri, 05/20/2016 - 15:22

Something did happen, just received these two emails from the system:

Error: Failed to lock file /etc/webmin/status/oldstatus after 5 minutes. Last error was :
Error
-----
Failed to lock file /etc/webmin/status/oldstatus after 5 minutes. Last error was :
-----

How worried should I be?

You may want to try what we were suggesting above, to see if there is a stale lock file in that directory, and if so, to remove that lock file.

It's odd that you're seeing so many of those, but there may have been some left around from a previous issue that arose.

Submitted by fakemoth on Fri, 05/20/2016 - 15:43

No .lock files anywhere at those locations; that was the first problem "solved" by rebooting. Now, after the restart, I get these emails every 5 minutes saying that the monitoring script cannot run... or something:

Error: Failed to lock file /etc/webmin/status/oldstatus after 5 minutes. Last error was :
Error
-----
Failed to lock file /etc/webmin/status/oldstatus after 5 minutes. Last error was :
-----

BTW, there are some files there:

ls /etc/webmin/status/
config  fails  fails.lock  history  monitor.pl  oldstatus  oldstatus.lock  services

Should I delete those two, fails.lock and oldstatus.lock? Though... it sounds to me as if it cannot lock them?

Yeah, those are the .lock files you'd need to review. If it's saying it can't lock "/etc/webmin/status/oldstatus", that means there is a .lock file in "/etc/webmin/status/" that is causing that problem.

Submitted by fakemoth on Fri, 05/20/2016 - 15:58

I do have a number and a process:

[root@ns1 status]# ps -a 15679
  PID TTY      STAT   TIME COMMAND
15679 ?        Ss     0:01 /usr/bin/perl /usr/libexec/webmin/status/monitor.pl
17096 pts/1    R+     0:00 ps -a 15679

Does this process exist or not? 'Cause it doesn't seem so:

[root@ns1 status]# ps aux | grep monitor
root     15679  0.2  0.1 253076 55096 ?        Ss   23:45   0:01 /usr/bin/perl /usr/libexec/webmin/status/monitor.pl
root     17904  0.1  0.0 142708 14692 ?        Ss   23:55   0:00 /usr/bin/perl /usr/libexec/webmin/status/monitor.pl
root     18151  0.0  0.0 103308   876 pts/1    S+   23:56   0:00 grep monitor

That actually looks normal.

Do you know what I think I might do? I might turn down how often the status monitoring runs. It's possible that for some reason, it's taking a long time to complete, and thus causing an issue.

To do that, you can go into System Settings -> Virtualmin Config -> Status Collection, and there, try setting "Interval between status collection job runs" to, say, "30 Minutes" instead of the default 5 minutes.

After changing that, you may need to kill any monitor.pl processes that are running, and remove the .lock file(s) from the /etc/webmin/status/ directory (for example oldstatus.lock) if there are any.
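If it helps, here is a small sketch of that cleanup step once the monitor.pl processes have been stopped. It is only an illustration under two assumptions: lock files end in ".lock", and the first word inside each one is the PID of its owner. The function name sweep_stale_locks and the --remove switch are invented for this example. By default it only reports; it should be run as root so the PID check is reliable.

```shell
# List lock files in a directory whose recorded PID no longer exists.
# Dry run by default; pass --remove as the second argument to delete them.
sweep_stale_locks() {
    local dir=$1 mode=${2:-} lock pid
    for lock in "$dir"/*.lock; do
        [ -f "$lock" ] || continue    # the glob matched nothing
        pid=""
        read -r pid _ < "$lock" || true
        case $pid in
            ''|*[!0-9]*) echo "skip $lock (no PID inside)"; continue ;;
        esac
        if kill -0 "$pid" 2>/dev/null; then
            # the owner is alive, so leave this lock alone
            echo "held $lock (PID $pid is running)"
        elif [ "$mode" = "--remove" ]; then
            rm -f -- "$lock" && echo "removed $lock (PID $pid is gone)"
        else
            echo "stale $lock (PID $pid is gone)"
        fi
    done
}
```

Something like sweep_stale_locks /etc/webmin/status would show what is there first; adding --remove afterwards deletes only the locks whose owner has exited.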

Submitted by fakemoth on Fri, 05/20/2016 - 16:06

I removed oldstatus.lock. No email for now, so the cron job is doing its job. I verified that exactly 5 minutes later Webmin recreated the file with a new PID inside, now 18967. So it seems to be working. Thank you; if there is anything else I will post here.

Bye!