Apache restarting on a Sunday morning

24 posts / 0 new
Last post
#1 Mon, 02/08/2010 - 08:41
pixel_paul
pixel_paul's picture

Apache restarting on a Sunday morning

Got a strange problem here that has been happening for the past few weeks on a Sunday morning at 0400 - but not every Sunday.

What happens is that Apache shuts down and then cannot restart.

This is the message in /var/log/messages:

Feb  7 04:02:15 DS-R1062667-001 syslogd 1.4.1: restart.
Feb  7 04:02:59 DS-R1062667-001 logrotate: ALERT exited abnormally with [1]

And then httpd error log:

[Sun Feb 07 04:02:15 2010] [notice] Digest: generating secret for digest authentication ...
[Sun Feb 07 04:02:15 2010] [notice] Digest: done
PHP Warning:  Module 'imap' already loaded in Unknown on line 0
[Sun Feb 07 04:02:15 2010] [notice] mod_python: Creating 4 session mutexes based on 256 max processes and 0 max threads.
[Sun Feb 07 04:02:16 2010] [notice] Apache/2.2.3 (CentOS) configured -- resuming normal operations
[Sun Feb 07 04:02:16 2010] [notice] Graceful restart requested, doing restart
[Sun Feb 07 04:02:16 2010] [error] (9)Bad file descriptor: apr_socket_accept: (client socket)
[Sun Feb 07 04:02:17 2010] [notice] Digest: generating secret for digest authentication ...
[Sun Feb 07 04:02:17 2010] [notice] Digest: done

repeated about 15 times, then this:

Sun Feb 07 04:02:58 2010] [emerg] (28)No space left on device: mod_fcgid: Can't create global mutex
[Sun Feb 07 04:02:59 2010] [crit] (28)No space left on device: mod_rewrite: could not create rewrite_log_lock
Configuration Failed
[Sun Feb 07 04:02:59 2010] [crit] (28)No space left on device: mod_rewrite: could not create rewrite_log_lock
Configuration Failed
[Sun Feb 07 04:02:59 2010] [crit] (28)No space left on device: mod_rewrite: could not create rewrite_log_lock
Configuration Failed
[Sun Feb 07 11:07:14 2010] [crit] (28)No space left on device: mod_rewrite: could not create rewrite_log_lock
Configuration Failed
[Sun Feb 07 11:08:23 2010] [crit] (28)No space left on device: mod_rewrite: could not create rewrite_log_lock
Configuration Failed
[Sun Feb 07 11:17:52 2010] [crit] (28)No space left on device: mod_rewrite: could not create rewrite_log_lock
Configuration Failed

This has me a bit lost as to where to look for source of problem, so any pointers would be much appreciated.

Thanks,

Paul

Mon, 02/08/2010 - 08:49
andreychek

Hi Paul,

It looks like Apache is generating space related errors.

Are any of your drives low on space?

You can see available space by typing this:

df -h

Since Apache's restart on Sunday is generally around the same time as the logrotation, it may be that whatever partition /var is on is having the space problems.

-Eric

Mon, 02/08/2010 - 09:14 (Reply to #2)
pixel_paul
pixel_paul's picture

Hi Eric,

[root@]# df -h Filesystem Size Used Avail Use% Mounted on /dev/mapper/VolGroup00-LogVol00 69G 13G 52G 20% / /dev/sda3 99M 36M 58M 39% /boot tmpfs 506M 0 506M 0% /dev/shm

So there is definitely space left! I've just noticed an email from Cron Daemon:

Stopping httpd: [  OK  ]
Starting httpd: (98)Address already in use: make_sock: could not bind to address
[::]:80
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:80
no listening sockets available, shutting down
Unable to open logs
[FAILED]
error: error running postrotate script for /home/*****1/logs/access_log /home/******1/logs/error_log
Stopping httpd: [  OK  ]
Starting httpd: [  OK  ]
Stopping httpd: [  OK  ]
Starting httpd: [  OK  ]
Stopping httpd: [  OK  ]
Starting httpd: [  OK  ]
Stopping httpd: [  OK  ]
Starting httpd: [  OK  ]
Stopping httpd: [  OK  ]
Starting httpd: (98)Address already in use: make_sock: could not bind to address
[::]:80
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:80
no listening sockets available, shutting down
Unable to open logs
[FAILED]
error: error running postrotate script for /home/******2/domains/*****.com/logs/access_log
/home/*******2/domains/******.com/logs/error_log
Stopping httpd: [  OK  ]
Starting httpd: [  OK  ]
Stopping httpd: [  OK  ]
Starting httpd: [  OK  ]
Stopping httpd: [  OK  ]
Starting httpd: [  OK  ]
Stopping httpd: [FAILED]
Starting httpd: [FAILED]
error: error running postrotate script for /home/*****/logs/access_log /home/*****/logs/error_log
Stopping httpd: [FAILED]
Starting httpd: [FAILED]
error: error running postrotate script for /home/****/logs/access_log /home/****/logs/error_log
Stopping httpd: [FAILED]
Starting httpd: [FAILED]
error: error running postrotate script for /home/*****/logs/access_log /home/****/logs/error_log

so there is something definitely fishy going on here.....

Thanks,

Paul

Mon, 02/08/2010 - 09:57
ronald
ronald's picture

seems one or more of the users have run out of space and the system cant write to the log files.

Mon, 02/08/2010 - 10:02 (Reply to #4)
pixel_paul
pixel_paul's picture

I don't have any quotas enabled for any users, so not sure how they can run out of space?

Mon, 02/08/2010 - 10:13
ronald
ronald's picture

looking gain I see the error is actually: Unable to open logs
and the logs are in the users space.

did the user delete those logs? I had this happening once and that crashed my server too.

Mon, 02/08/2010 - 10:16
andreychek

Howdy,

In addition to what Ronald said -- I'm curious what output you get from this command:

repquota -v /

Mon, 02/08/2010 - 10:38
pixel_paul
pixel_paul's picture

Ronald: The user has no access to the server - and as far as I know I haven't deleted them.

Eric:

[root@]# /usr/sbin/repquota -v /
repquota: Mountpoint (or device) / not found.
repquota: Not all specified mountpoints are using quota.

Thanks,

Paul

Mon, 02/08/2010 - 10:49
ronald
ronald's picture

thinking outloud, did you try a killall on the httpd?

Mon, 02/08/2010 - 10:54
pixel_paul
pixel_paul's picture

I actually restarted the machine in order to get this working again. Otherwise I hadn't touched the server other than creating a couple of virtual servers on Friday

Mon, 02/08/2010 - 11:25
ronald
ronald's picture

can you see if these log files for those users are linked to /var/log/virtualmin/*
as I remember that this has changed for new users but not existing users.
per haps it is relevant

Mon, 02/15/2010 - 09:15
ken.wiesner

It probably wasn't a drive space issue. It was probably due to semaphore array issue. This actually happened to my server on EC2 this morning. I found a blog that said to run:

[root@hosting01:/var/log/httpd] ipcs -s | grep apache
0x00000000 82575367   apache    600        1        
0x00000000 50528264   apache    600        1        
0x00000000 53248009   apache    600        1        
0x00000000 59244554   apache    600        1
...truncated...

I then ran the following to clear it out:

[root@hosting01:/var/log/httpd] ipcs -s | grep apache | awk '{print "ipcrm sem " $2}' | sh
resource(s) deleted
resource(s) deleted
resource(s) deleted
...truncated...

I then issued the restart for apache and all was well.

Sun, 02/21/2010 - 04:53
cruiskeen

I'm up at the moment nursing my server for this very reason. This is because logrotate is furiously trying to stop and restart httpd during the logrotate, which happens weekly at this time. I note that by default the logrotate set up here does a restart of apache, and when this happens over and over again in quick succession sometimes apache doesn't finish dying before the next restart, and then the semaphores pile up until you run out of semaphore space.

A couple things I'm trying here include having apache just do a graceful instead of a restart, and increasing the timeout in the httpd start script where it waits for the children to die. Also possibly having it clear the semaphores before starting apache.

Mon, 03/08/2010 - 06:33
pixel_paul
pixel_paul's picture

thanks everyone for ideas as to cause - finally had a chance to look at this a bit closer and it seems to be related to the semaphore issue.

cruiskeen: did you find that the changes you made re: graceful/restart or the timeout increase has made a difference?

Mon, 03/08/2010 - 09:23
cruiskeen

Hard to say since it only happened every once in a while in the first place. But it hasn't happened since I made the changes, for whatever that's worth.

Mon, 03/08/2010 - 10:19
pixel_paul
pixel_paul's picture

I'll give it a go and see what I find.

Thanks for the heads up on this.

Cheers,

Paul

Sun, 07/18/2010 - 16:46
uwe@comproso.com

Hi,

I am having the same issue, but only on 2 of my 5 virtualmin servers (1 is gpl, the others are customer or my servers commercial license). These two (commercial) die randomly always Sunday morning around 4:00 am, logrotate time. I have all updates in place on all of the servers, only difference to a stock virtualmin is php 5.11 (bleeding edge repo). I came across this thread by seeking a solution as it is just unhandy to get up at 4:00 in the morning to check if the servers are running :-)
Is there a fix available by now?

Uwe

Tue, 07/27/2010 - 10:50
pixel_paul
pixel_paul's picture

I managed to stop this from occuring by changing all post-rotation commands to

/etc/rc.d/init.d/httpd graceful

Since then I havent encountered this issue.

Tue, 07/27/2010 - 18:04
uwe@comproso.com

Hi picel_paul,

thanks for your reply. I instantly went and checked the settings and what do you know, there is different types of entries automatically generated :

/usr/sbin/apachectl graceful or restart
/etc/rc.d/init.d/httpd restart

The first once where mainly set to "graceful" the second ones all to restart. I changed all restart to graceful and hope for the best. On my virtualmin GPL server they were all set to graceful, seems to be the reason why that one did not fail.

Question now is, why different command lines? The servers created lately where all in the second form. Is there a difference inbetween these command lines? Which one should I use? And where to change/force it for new servers? I checked the template, but there are no settings for the post-rotation command.

Thanks

Uwe

Wed, 07/28/2010 - 03:26
pixel_paul
pixel_paul's picture

Hi Uwe,

All I know is that the hard restart with /etc/rc.d/init.d/httpd restart was causing the problem, and changing to graceful solved the problem.

When I create a new virtual server the logrotate post command is /etc/rc.d/init.d/httpd graceful - if you're getting it set to /etc/rc.d/init.d/httpd restart then this may need to be escalated to someone who knows a bit more....

Cheers,

Paul

Wed, 07/28/2010 - 09:06
uwe@comproso.com

Hi,

thanks a lot. As I only have max 10 servers on one virtualmin license its easy to keep track and check after a new server creation and correct if needed. Don't want to bother them with this tiny issue, they have more to do.

Uwe

PS: for me thats solved, unless the server still crashes, but that would be a new thread

Fri, 07/30/2010 - 17:02
cruiskeen

You should be aware that doing a graceful restart may take a long time to actually close the log files --- read man apachectl for details.

Mon, 09/13/2010 - 16:44
sergeich

I ran into the same Sunday Crashing situation. It all boils down to a few dozen restart requests in logrotate.conf that quickly follow one another:

'postrotate /etc/rc.d/init.d/httpd restart'

Once the amount of our virtual sites grew over ~20 we started experiencing this lovely phenomenon from time to time.

Frankly at this point I prefer the approach of processing log rotation and restarting Apache in order to avoid any crashes. Any downside to this approach? (info missing in the log files, etc?)

Fri, 10/15/2010 - 03:51
allenicon
I just manged to change all post-rotation commands to "/etc/rc.d/init.d/httpd graceful", later on i didn't get through this problem.... Rosie Allen