System Slowdown Catastrophic!

#1 Tue, 03/17/2015 - 13:08
Shinzan

Ubuntu 14.04 LTS, 16 GB RAM, Dell PowerEdge, dual Xeon 1.9 GHz (8 cores total)

About once a week I experience a horrendous slowdown on websites.

Mail seems to run fine. I've installed a ton of diagnostic tools to see the problem, but I can't fix it. htop shows all cores pegged, and websites take around 10 seconds to first byte or error out with a 500 altogether on complex .php sites.

The box has 17 sites.

Attached are some htop results.

Any thoughts or direction on how I can fix this? I'd like to put another important site on here, but I'm scared to because of these slowdowns.

Tue, 03/17/2015 - 13:26
andreychek

Howdy,

Your htop output there does show a load of 10.00, which is a bit on the high end. It's tough to determine the culprit from that though... next time that occurs, could you run the following commands, and share the output they produce:

ps auxwf
mailq | tail -1
netstat -ntu | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -nr | head -10
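Since the slowdown only shows up every week or two, one option is to let cron capture that output automatically instead of waiting to catch it by hand. The following is only a minimal sketch under some assumptions: the function names, the load threshold, and the /var/tmp/slowdown-snapshots directory are made up for illustration, and it assumes a Linux box where /proc/loadavg is available.

```shell
# Sketch only: names, threshold and paths below are hypothetical.

# Succeeds (exit 0) when the 1-minute load average is above the threshold.
# The optional second argument lets you point at a different file than
# /proc/loadavg, which is handy for trying it out.
load_exceeds() {
    local threshold="$1"
    local loadavg_file="${2:-/proc/loadavg}"
    awk -v t="$threshold" '{ exit !(($1 + 0) > (t + 0)) }' "$loadavg_file"
}

# Writes the output of the commands above into a timestamped file.
capture_diagnostics() {
    local outdir="${1:-/var/tmp/slowdown-snapshots}"   # hypothetical location
    local stamp
    stamp="$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$outdir" || return 1
    {
        echo "== uptime =="
        uptime
        echo "== ps auxwf =="
        ps auxwf
        echo "== mailq | tail -1 =="
        mailq | tail -1
        echo "== top remote addresses =="
        netstat -ntu | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -nr | head -10
    } > "$outdir/snapshot-$stamp.txt" 2>&1
}
```

Saved to a script with one extra line at the bottom, `load_exceeds 8 && capture_diagnostics`, this could be run from cron every minute or so. The threshold of 8 is just a guess for an 8-core box; pick whatever "pegged" means on yours.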

Also, just to reduce the load on your server -- you may want to go into System Settings -> Virtualmin Config -> Status Collection, and there, set "Interval between status collection job runs" to something a bit higher than the default... perhaps 60 minutes would be a good place to start.

-Eric

Tue, 03/17/2015 - 13:59
Welshman

Who is your server with? I have 16 GB servers with up to a hundred sites and they don't break a sweat.

The server needs tweaking, I bet, or you're on a lousy network.

Chaos Reigns Within, Reflect, Repent and Reboot, Order Shall Return.

Tue, 03/17/2015 - 20:36
Shinzan
Tue, 03/17/2015 - 23:02
andreychek

Yeah I don't see anything too unusual in that process list there... whenever you ran that, were you experiencing the problem you're describing?

If it comes up again, you could always try disabling lfd, or any other process, just to make sure that isn't related.

-Eric

Wed, 03/18/2015 - 12:05 (Reply to #5)
Shinzan

I was having the problem then. I enabled greylisting and within a few minutes I was back to normal. It may have been a coincidence; I'll know in a week, because it generally happens every 5-10 days.

David

Tue, 04/07/2015 - 16:56
Shinzan

This is bad. I am not going to be able to use this system in production if I can't solve this problem. Again today the problem is a catastrophic slowdown. Any thoughts?

This is 2 weeks after the last slowdown. The small daily spikes are the midnight backups. The 2 HUGE spikes are... well, I don't know. The problem comes and goes.

It seems there could be a correlation between mailq and my problem.

Tue, 04/07/2015 - 16:58 (Reply to #7)
Shinzan

2nd view of CPU

Tue, 04/07/2015 - 18:21
andreychek

Howdy,

You mentioned a correlation between this CPU issue and mailq... whenever this occurs, how many messages are showing up in your email queue?
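If it's hard to catch the queue size while the box is struggling, one way to answer that after the fact is to log the load and the queue size side by side. This is a rough sketch, not an official tool: the function names and the /var/tmp/load-vs-mailq.log path are hypothetical, and the parser assumes Postfix-style summary lines from `mailq` ("-- 35 Kbytes in 12 Requests." or "Mail queue is empty"). Other mail servers print something different.

```shell
# Sketch only: assumes Postfix-style mailq summary lines on stdin.

# Prints the number of queued messages; fails (exit 1) if the line
# does not look like a summary line we recognise.
parse_queue_count() {
    awk '
        /Mail queue is empty/             { print 0;       found = 1; exit }
        /^-- .* in [0-9]+ Requests?\.?$/  { print $(NF-1); found = 1; exit }
        END { if (!found) exit 1 }
    '
}

# Appends "date time load1 queued" to a log file (hypothetical path).
log_sample() {
    local logfile="${1:-/var/tmp/load-vs-mailq.log}"
    local load1 queued
    load1="$(cut -d' ' -f1 /proc/loadavg)"
    queued="$(mailq | tail -1 | parse_queue_count)" || queued="unknown"
    printf '%s %s %s\n' "$(date '+%F %T')" "$load1" "$queued" >> "$logfile"
}
```

Calling `log_sample` from cron every few minutes would leave a plain-text history to compare against the CPU graphs the next time a spike hits.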

-Eric

Tue, 04/14/2015 - 21:54 (Reply to #9)
Shinzan

Well, it seemed like there was a correlation, but only 35 messages were in the queue. I deleted the whole queue and the server was still slammed. I rebooted and it ironed itself out, but it was so bad that the stats on my server didn't even record for several hours; I have a big white gap in my stats for the period when it was at its worst. It seems like this just happens every 2 weeks or so, which is very strange. I'm trying to figure this out before I put two of my major clients' websites on this box. Any help is appreciated.

David

Mon, 04/20/2015 - 20:38
andreychek

I'm having trouble remembering everything we've tried, but it might be worth checking Email Messages -> Spam and Virus Scanning. There, we'd recommend setting "SpamAssassin client program" to "spamc (Client for SpamAssassin filter server spamd)", and "Virus scanning program" to "Server scanner (clamdscan)".

Those settings can each make a pretty big difference, if they aren't already set that way.
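A quick way to get a feel for whether the standalone scanners are still being launched per message is to count them in the process list during a busy period. The helper below is a small sketch with a hypothetical name, and it assumes the standalone programs show up in `ps` under the command names `spamassassin` and `clamscan`, which may differ on your system.

```shell
# Sketch only: counts how many lines on stdin have the given command
# name as their first field. Feed it the output of `ps -eo comm=`.
count_command() {
    local name="$1"
    awk -v n="$name" '$1 == n { c++ } END { print c + 0 }'
}
```

For example, `ps -eo comm= | count_command clamscan` and `ps -eo comm= | count_command spamassassin` should both stay at or near 0 once mail is going through the daemons instead.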

-Eric

Wed, 04/22/2015 - 08:57 (Reply to #11)
Shinzan

Thanks for the reply. I did change SpamAssassin over to spamc and clamscan to the server scanner. So far I'm

System uptime: 13 days, 19 hours, 19 minutes
Running processes: 230
CPU load averages: 0.06 (1 min), 0.15 (5 mins), 0.13 (15 mins)

With no issues, then out of the blue it will get hammered for a day. When it happens, what type of log files do you think I should be looking in? The server has 16 GB RAM and 6 cores on dual processors, so it should eat 17 websites and email for breakfast.

David

Tue, 07/19/2016 - 09:37
Shinzan

So after I started a support ticket, the folks at Virtualmin really helped me figure it out.

It's CSF/LFD. The process monitoring was firing off so fast that it basically amounted to a denial of service against myself!

Long story short: disable LFD, delete the thousands of emails it's dumping to root, and modify the CSF config files, either raising the process limits so the monitoring thresholds aren't too low or changing the reporting interval. Once the chain starts, it just gets exponentially worse and drags down the whole server!
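For anyone landing here later, a first step before editing anything is to see what the process monitoring limits are currently set to. The helpers below are only a sketch: the function names are made up, and they assume the usual csf.conf layout of `NAME = "value"` lines in /etc/csf/csf.conf, with the process tracking options sharing a `PT_` prefix. Check your own file, since option names and defaults vary between CSF versions.

```shell
# Sketch only: read-only helpers, they never modify csf.conf.

# Prints the value of a single setting, e.g. csf_setting PT_LIMIT
csf_setting() {
    local name="$1"
    local conf="${2:-/etc/csf/csf.conf}"
    sed -n "s/^${name}[[:space:]]*=[[:space:]]*\"\(.*\)\"[[:space:]]*$/\1/p" "$conf"
}

# Lists every process tracking line (assumed to start with PT_).
list_process_tracking() {
    local conf="${1:-/etc/csf/csf.conf}"
    grep -E '^PT_[A-Z_]+[[:space:]]*=' "$conf"
}
```

Running `list_process_tracking` shows the thresholds and intervals in one place, so you can decide which ones to raise. After changing csf.conf, CSF and LFD need to be restarted for the new values to take effect.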
