collectinfo.pl running rampant

#1 Wed, 12/12/2007 - 08:28
Jafo

OK, about every 5 minutes collectinfo.pl runs and basically slams the CPU and the load. Load will be at about 0.45 and jump to 20.00+ when collectinfo.pl starts running. It seems to lock everything up until it is finished.

Looking through the code, the only guess I can come up with is that it is locking one or more files somewhere that httpd needs, such as a log or conf file.

Never had much of a problem with it before. Last night I upgraded Webmin, then Virtualmin, to the latest versions in hopes this would stop.

Is there some sort of log file it is choking on? Anything to look for?

It is getting to the point where I might just disable the job because it is bringing all the sites down.

Wed, 12/12/2007 - 16:33
Jafo

Anyone?

Thu, 12/13/2007 - 04:42 (Reply to #2)
PlayGod

collectinfo.pl has failed a few times lately on my box. I will make a note of the error email report next time it happens.

Thu, 12/13/2007 - 07:27 (Reply to #3)
Jafo

<div class='quote'>Any chance memory is a problem? If Apache gets pushed out to swap, it could make service quite slow.</div>

Thanks, I really appreciate your reply!

It is possible; however, this would be a new thing, as it has run fine for about 6 months now. I really have not added anything new to the server, and traffic is within normal parameters.

I disabled the cron job last night and my load stats are back to normal.

What adverse effects will this have on Virtualmin if it is not running?

Thu, 12/20/2007 - 14:21 (Reply to #4)
Joe

Not really any adverse effects, except you won't have useful data on the system information page. collectinfo.pl is what gathers that data on a regular basis. This includes the historic graphs.

--

Check out the forum guidelines!

Thu, 12/20/2007 - 15:11 (Reply to #5)
PlayGod

Here are a couple of recently botched cron jobs.

<div class='quote'>From root@--.net (Cron Daemon)
To root@--.net
Date Wed, 19 Dec 2007 07:15:13 -0500 (EST)
Subject Cron &lt;root@--&gt; /etc/webmin/virtual-server/collectinfo.pl

Message text

Error: Failed to query Postfix config command to get the current value of parameter
process_id_directory: &lt;tt&gt;&lt;/tt&gt;
Error
-----
Failed to query Postfix config command to get the current value of parameter process_id_directory:
&lt;tt&gt;&lt;/tt&gt;
-----</div>

<div class='quote'>From root@--.net (Cron Daemon)
To root@--.net
Date Tue, 18 Dec 2007 00:10:03 -0500 (EST)
Subject Cron &lt;root@--&gt; /etc/webmin/status/monitor.pl

Message text

postfix::is_postfix_running failed : Failed to query Postfix config command to get
the current value of parameter process_id_directory: &lt;tt&gt;&lt;/tt&gt; at ../web-lib-funcs.pl
line 980.
</div>

Thu, 12/20/2007 - 15:37 (Reply to #6)
Joe

Any chance this is a virtuozzo virtual server?

Sun, 06/07/2009 - 07:17 (Reply to #7)
PlayGod

My server is not. It's a fairly unburdened dual Xeon with 4 gigs RAM.

I'm not experiencing the same collectinfo.pl freakout as the guy above, but I am getting daily failures of Cron jobs. The 4:00am ones I would normally write off because that's when my backups are scheduled -- but the others are happening when there's really no burden... though sometimes during a hammering or brute force.

<div class='quote'>Date Mon, 24 Dec 2007 09:45:17 -0500 (EST)
Subject Cron &lt;root@asdf&gt; /etc/webmin/virtual-server/collectinfo.pl

Message text

sh: line 1: 13706 Segmentation fault &quot;/usr/sbin/httpd&quot; -l 2&gt;/dev/null</div>
<div class='quote'>Date Mon, 24 Dec 2007 14:35:02 -0500 (EST)
Subject Cron &lt;root@asdf&gt; /etc/webmin/status/monitor.pl

Message text

postfix::is_postfix_running failed : Failed to query Postfix config command to get
the current value of parameter queue_directory: &lt;tt&gt;&lt;/tt&gt; at ../web-lib-funcs.pl
line 980.
</div>

<div class='quote'>Date Sun, 23 Dec 2007 08:10:03 -0500 (EST)
Subject Cron &lt;root@asdf&gt; /etc/webmin/status/monitor.pl

Message text

sh: line 1: 31307 Segmentation fault su postgres -c \\\/usr\\\/bin\\\/psql\
\-U\ postgres\ \-c\ \'\'\ \ template1 2&gt;&amp;1
</div>

<div class='quote'>Date Sun, 23 Dec 2007 04:02:13 -0500 (EST)
Subject Cron &lt;root@asdf&gt; run-parts /etc/cron.daily

Message text

/etc/cron.daily/logrotate:

[Sun Dec 23 04:02:10 2007] [warn] VirtualHost 66.129.2.84:80 overlaps with VirtualHost
66.129.2.84:80, the first has precedence, perhaps you need a NameVirtualHost directive
[Sun Dec 23 04:02:10 2007] [warn] NameVirtualHost 66.129.2.83:80 has no VirtualHosts
[Sun Dec 23 04:02:10 2007] [warn] VirtualHost 66.129.2.84:80 overlaps with VirtualHost
66.129.2.84:80, the first has precedence, perhaps you need a NameVirtualHost directive
[Sun Dec 23 04:02:10 2007] [warn] NameVirtualHost 66.129.2.83:80 has no VirtualHosts
[Sun Dec 23 04:02:10 2007] [warn] VirtualHost 66.129.2.84:80 overlaps with VirtualHost
66.129.2.84:80, the first has precedence, perhaps you need a NameVirtualHost directive
[Sun Dec 23 04:02:10 2007] [warn] NameVirtualHost 66.129.2.83:80 has no VirtualHosts
[Sun Dec 23 04:02:10 2007] [warn] VirtualHost 66.129.2.84:80 overlaps with VirtualHost
66.129.2.84:80, the first has precedence, perhaps you need a NameVirtualHost directive
[Sun Dec 23 04:02:10 2007] [warn] NameVirtualHost 66.129.2.83:80 has no VirtualHosts
[Sun Dec 23 04:02:11 2007] [warn] VirtualHost 66.129.2.84:80 overlaps with VirtualHost
66.129.2.84:80, the first has precedence, perhaps you need a NameVirtualHost directive
[Sun Dec 23 04:02:11 2007] [warn] NameVirtualHost 66.129.2.83:80 has no VirtualHosts
[Sun Dec 23 04:02:11 2007] [warn] VirtualHost 66.129.2.84:80 overlaps with VirtualHost
66.129.2.84:80, the first has precedence, perhaps you need a NameVirtualHost directive
[Sun Dec 23 04:02:11 2007] [warn] NameVirtualHost 66.129.2.83:80 has no VirtualHosts
[Sun Dec 23 04:02:11 2007] [warn] VirtualHost 66.129.2.84:80 overlaps with VirtualHost
66.129.2.84:80, the first has precedence, perhaps you need a NameVirtualHost directive
[Sun Dec 23 04:02:11 2007] [warn] NameVirtualHost 66.129.2.83:80 has no VirtualHosts
[Sun Dec 23 04:02:11 2007] [warn] VirtualHost 66.129.2.84:80 overlaps with VirtualHost
66.129.2.84:80, the first has precedence, perhaps you need a NameVirtualHost directive
[Sun Dec 23 04:02:11 2007] [warn] NameVirtualHost 66.129.2.83:80 has no VirtualHosts
[Sun Dec 23 04:02:12 2007] [warn] VirtualHost 66.129.2.84:80 overlaps with VirtualHost
66.129.2.84:80, the first has precedence, perhaps you need a NameVirtualHost directive
[Sun Dec 23 04:02:12 2007] [warn] NameVirtualHost 66.129.2.83:80 has no VirtualHosts
[Sun Dec 23 04:02:12 2007] [warn] VirtualHost 66.129.2.84:80 overlaps with VirtualHost
66.129.2.84:80, the first has precedence, perhaps you need a NameVirtualHost directive
[Sun Dec 23 04:02:12 2007] [warn] NameVirtualHost 66.129.2.83:80 has no VirtualHosts
[Sun Dec 23 04:02:12 2007] [warn] VirtualHost 66.129.2.84:80 overlaps with VirtualHost
66.129.2.84:80, the first has precedence, perhaps you need a NameVirtualHost directive
[Sun Dec 23 04:02:12 2007] [warn] NameVirtualHost 66.129.2.83:80 has no VirtualHosts
[Sun Dec 23 04:02:13 2007] [warn] VirtualHost 66.129.2.84:80 overlaps with VirtualHost
66.129.2.84:80, the first has precedence, perhaps you need a NameVirtualHost directive
[Sun Dec 23 04:02:13 2007] [warn] NameVirtualHost 66.129.2.83:80 has no VirtualHosts
</div>

Mon, 12/24/2007 - 18:27 (Reply to #8)
PlayGod

edit

Sun, 04/19/2009 - 16:52 (Reply to #9)
mrwilder

Wait - I should say that to be fair, I don't KNOW that this is the thing causing the problem.

When I run "top", I see "gdm-simple-gree" is consuming 58% of my RAM...

Sun, 04/19/2009 - 17:15 (Reply to #10)
andreychek

Well, you'll want to disable any service you don't require for the server. It sounds like you're running X/GNOME on there, which would use up a fair amount of resources.

That said, collectinfo.pl can also use a fair amount of resources. It's what populates all the usage statistics and such within Virtualmin. If you don't need those stats updated every 5 minutes, you can certainly reduce the number of times it runs, or even have it run outside of business hours.
-Eric

Sun, 04/19/2009 - 18:36 (Reply to #11)
mrwilder

I disabled the collectinfo.pl cron job - I'm not sure how or if I should disable gdm... don't I need that to run VNC server, for example?

Sun, 04/19/2009 - 19:50 (Reply to #12)
andreychek

If your goal is to run VNC, then that would indeed require X.

While I don't know everything you're doing on your server there, it's frequently considered a bad idea to run X on a server -- it uses up a lot of resources, and there are added security issues to deal with.

Administrative tasks can typically be handled by logging in using SSH (and perhaps PuTTY), and via Virtualmin within a web browser. But again, I don't know what you're using it for; perhaps your server requires X/VNC for some reason :-)
-Eric

Mon, 04/20/2009 - 19:09 (Reply to #13)
mrwilder

VNC is HANDY.

I have no problem with the command line, I'm old... but, uh, hey, nothing beats drag and drop!

STILL.... over a GIG of RAM to run the interface? That's absurd. And there has got to be a memory leak somewhere. When I first boot the machine, it runs great. It gets slower and slower and slower as time goes by. I would definitely be willing to turn it off and just use the command line assuming that this is really the cause and not a symptom of some stupid thing I've done or whatnot.

The only obvious system messages are the collectinfo.pl ones (now disabled) and top showing 58% of RAM attributed to gdm-simple-gree.

I hazily understand that gdm is the Gnome display manager. I assume by your comments that I cannot leave only some crucial part of it running for VNC so I instead need to change some init file to prevent Gnome itself from launching at all at boot time and disable VNC.

I realize that this is not the main purview on this site, but, uh, any idea how to disable/completely remove (if necessary) Gnome?

Thanks again,
Tony

Mon, 04/20/2009 - 21:15 (Reply to #14)
Joe

So, you know that we don't actually really know anything about gdm, right? ;-)

We're server guys. Anything in X and the Gnome desktop is not within our particular domain of knowledge (we all run Linux on our desktops at home and work and such, but we're just as baffled as you are when things act funny with desktop software).

I'm saying this not as a suggestion that we don't want to help...just that you'll find we are not really competent to help. There are some Gnome related mailing lists, and some OS related lists for whatever OS you're running, that might be able to help. I'm not sure who the best option is...but I'm sure we're not it. ;-)

Tue, 04/21/2009 - 03:50 (Reply to #15)
sgrayban

A gig of RAM (as in 1 GB) is not enough these days to run a production server. You need at least 2 gigs, if not more, depending on what is being used. MySQL can be demanding if many sites use it. Of course, if you've got loads of websites, that will drain memory as well.

A good server with 4+ gigs of RAM will typically serve 200-300 websites; any more than that and you need to get another server.

Tue, 04/21/2009 - 06:39 (Reply to #16)
andreychek

Joe is right, though one option I might offer is that if you're interested in running X/VNC, you could always configure your server to boot up in a run level that is text-only (run level 3 in CentOS/RHEL, or 2 in Debian/Ubuntu).

And then, launch X only when you need it using the "startx" command (though you may need to pass in some parameters so it doesn't try to display on your home computer).

That would spare you a lot of resources, and you could still get in to use the GUI if need be later on.

There may be a way to fix the resource problem you're seeing with GDM, but I don't know what it is -- however, launching X only when needed would get around that issue :-)
-Eric

Wed, 12/12/2007 - 22:43
Joe

Howdy Jafo,

It's not locking any files that would prevent web service. I'm guessing it's just overloading your box. I believe 3.49 should have made collectinfo.pl less demanding rather than more...nice is used to run it at the least demanding priority.
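
For anyone curious what "run it under nice" amounts to, here is a minimal Python sketch of launching a command at the lowest scheduling priority. It is only an illustration, not how Virtualmin itself starts collectinfo.pl; the function name `run_at_lowest_priority` is made up for this example, and `preexec_fn` works on Unix only.

```python
import os
import subprocess
import sys


def run_at_lowest_priority(argv):
    """Run argv as a child process with its niceness raised as far as allowed.

    Hypothetical helper for illustration. os.nice(19) adds 19 to the
    child's niceness; the kernel clamps the result to its maximum.
    """
    return subprocess.run(
        argv,
        preexec_fn=lambda: os.nice(19),  # executed in the child just before exec
        capture_output=True,
        text=True,
        check=False,
    )


def child_niceness():
    """Start a tiny Python child under the helper and report its niceness."""
    result = run_at_lowest_priority(
        [sys.executable, "-c", "import os; print(os.nice(0))"]
    )
    return int(result.stdout.strip())
```

A niced process still uses every idle CPU cycle it can get; it only yields when something with a better priority wants the CPU, so it does nothing about memory or disk pressure.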

Any chance memory is a problem? If Apache gets pushed out to swap, it could make service quite slow.
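
One rough way to check the swap question on Linux is to read /proc/meminfo while the job is running. The sketch below assumes the usual "Name: value kB" layout of that file; the function names are invented for this example.

```python
def parse_meminfo(text):
    """Turn /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only well-formed "Name: <number> ..." lines.
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def swap_used_kb(info):
    """Swap in use, in kB, computed as SwapTotal minus SwapFree."""
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


def read_swap_used_kb(path="/proc/meminfo"):
    """Read the live figure from the given file (Linux only)."""
    with open(path) as handle:
        return swap_used_kb(parse_meminfo(handle.read()))
```

Calling `read_swap_used_kb()` once before the cron job fires and once while it is running would show whether swap use climbs during the run.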

Thu, 12/20/2007 - 02:42
Jafo

Any adverse effects?

Mon, 12/24/2007 - 18:29
PlayGod

Oops, I guess I'm in for a hammering now! Admin, please delete the post with all the IPs... :(

Mon, 02/25/2008 - 06:40 (Reply to #20)
fuzzie

My collectinfo.pl was running crazy too.
I upgraded from 3.52 GPL to the Pro, but that didn't fix it.
I disabled it for now....if you figure it out, let me know.
The body of the email would repeat this about 20 times:
quota: Quota file not found or has wrong format.

I disabled quotas, and that still came every 5 minutes.

Wed, 08/13/2008 - 09:43 (Reply to #21)
websmurf

I'm having the same issue as Fuzzie.

mail with this:
<div class='quote'>
quota: Quota file not found or has wrong format.
quota: Quota file not found or has wrong format.
quota: Quota file not found or has wrong format.
quota: Quota file not found or has wrong format.
</div>
every 5 minutes even though quotas are disabled.

Any idea?

sys info:
Operating system Debian Linux 4.0
Webmin version 1.420
Virtualmin version 3.60.gpl (GPL)
Kernel and CPU Linux 2.6.18-5-amd64 on x86_64

Wed, 08/13/2008 - 17:20 (Reply to #22)
andreychek

Hrm, yeah, that message appears to be generated by calling the quota tools.

So just to verify a few things:

* Log into Virtualmin, then click Webmin -> System -> Disk Quotas. Are quotas listed as disabled there?

* Click Virtualmin -> Limits and Validation -> Disk Quota Monitoring. Is all the quota monitoring disabled?

Perhaps one of those options is causing what you're seeing.
-Eric

Thu, 08/14/2008 - 00:09 (Reply to #23)
websmurf

Hi Eric,

Yeah, it's off in the first option (all are listed with an Enable quota link behind them).

Can't find the second option. It's present on my Virtualmin pro install, but not on the box in question (which is Virtualmin GPL).

Adam

Mon, 03/16/2009 - 15:27
eboughey

I'm having the same problem with collectinfo.pl taking my cpu to 100% every few minutes. It's just recently started happening and I can't figure out why. I don't really see any answers here though.

I've had the system up and running for over a year without problem.

Mon, 03/16/2009 - 17:55 (Reply to #25)
Joe

<div class='quote'>I'm having the same problem with collectinfo.pl taking my cpu to 100% every few minutes.</div>

But, is it causing an actual problem? (A process taking all available CPU for a short time isn't actually a "problem". It's just doing work. If it is interfering with other, more important tasks...then that would be a problem.)

Tue, 03/17/2009 - 09:17 (Reply to #26)
eboughey

No, no problems with interference. It's causing my computer to get very loud every few minutes, though, which is driving us crazy.

Tue, 03/17/2009 - 10:43 (Reply to #27)
Joe

Heheheh...Your first mistake was letting a server live anywhere near you. Ours are 1500 miles away. ;-)

You can turn off collectinfo.pl, or make it run less often. It is configured in the "Status collection" page in Module Configuration.

Its purpose is to gather information about the system...stuff like memory usage, CPU usage, quotas, number of virtual servers, etc. If you can live with out of date information on the System Information page, or nothing changes very often, then you could set it to run once per day, or similar.

Wed, 03/18/2009 - 07:39 (Reply to #28)
eboughey

Perfect. Thank you! That makes a world of difference to our sensitive ears.

This is what I get for having a home based server.

Sun, 04/19/2009 - 16:34
mrwilder

Our server started doing the same thing last Sunday. The machine has a gig of ram, and is using a gig of swap space just to support collectinfo.pl.

Apr 19 17:19:02 ns2 CROND[15086]: (root) CMD (/etc/webmin/virtual-server/collectinfo.pl)
Apr 19 17:24:02 ns2 CROND[15265]: (root) CMD (/etc/webmin/virtual-server/collectinfo.pl)
Apr 19 17:29:02 ns2 CROND[15425]: (root) CMD (/etc/webmin/virtual-server/collectinfo.pl)
Apr 19 17:34:03 ns2 CROND[15562]: (root) CMD (/etc/webmin/virtual-server/collectinfo.pl)
Apr 19 17:39:02 ns2 CROND[15712]: (root) CMD (/etc/webmin/virtual-server/collectinfo.pl)
Apr 19 17:44:03 ns2 CROND[15868]: (root) CMD (/etc/webmin/virtual-server/collectinfo.pl)
Apr 19 17:49:02 ns2 CROND[16101]: (root) CMD (/etc/webmin/virtual-server/collectinfo.pl)
Apr 19 17:51:01 ns2 CROND[16189]: (root) CMD (/etc/webmin/virtual-server/spamconfig.pl)
Apr 19 17:54:03 ns2 CROND[16247]: (root) CMD (/etc/webmin/virtual-server/collectinfo.pl)
Apr 19 17:59:02 ns2 CROND[16397]: (root) CMD (/etc/webmin/virtual-server/collectinfo.pl)
Apr 19 18:00:01 ns2 CROND[16467]: (root) CMD (/etc/webmin/virtual-server/bw.pl)
Apr 19 18:01:01 ns2 CROND[16614]: (root) CMD (run-parts /etc/cron.hourly)
Apr 19 18:01:02 ns2 run-parts(/etc/cron.hourly)[16616]: starting inn-cron-nntpsend
Apr 19 18:01:03 ns2 run-parts(/etc/cron.hourly)[16623]: finished inn-cron-nntpsend
Apr 19 18:01:03 ns2 run-parts(/etc/cron.hourly)[16625]: starting inn-cron-rnews
Apr 19 18:01:05 ns2 run-parts(/etc/cron.hourly)[16631]: finished inn-cron-rnews
Apr 19 18:04:02 ns2 CROND[16792]: (root) CMD (/etc/webmin/virtual-server/collectinfo.pl)
Apr 19 18:09:02 ns2 CROND[17058]: (root) CMD (/etc/webmin/virtual-server/collectinfo.pl)
Apr 19 18:14:02 ns2 CROND[17198]: (root) CMD (/etc/webmin/virtual-server/collectinfo.pl)
Apr 19 18:19:03 ns2 CROND[17377]: (root) CMD (/etc/webmin/virtual-server/collectinfo.pl)

CPU and RAM usage at 100%.

Otherwise the server is doing just about nada.
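
To confirm from a log like the one above how often the job really fires, a short Python sketch can pull out the collectinfo.pl entries and compute the gaps between them. It assumes the CROND line format shown in this post; the year is not present in syslog timestamps, so it is passed in as a parameter, and the function names are made up for the example.

```python
import re
from datetime import datetime

# Matches lines such as:
# Apr 19 17:19:02 ns2 CROND[15086]: (root) CMD (/etc/webmin/virtual-server/collectinfo.pl)
CRON_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ CROND\[\d+\]: "
    r"\(\w+\) CMD \((?P<cmd>.*)\)$"
)


def run_times(lines, script="collectinfo.pl", year=2009):
    """Return the start times of every cron run whose command ends with script."""
    times = []
    for line in lines:
        match = CRON_LINE.match(line.strip())
        if match and match.group("cmd").endswith(script):
            stamp = f"{year} {match.group('stamp')}"
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return times


def gaps_in_seconds(times):
    """Seconds elapsed between each consecutive pair of runs."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]
```

Gaps of roughly 300 seconds match the every-5-minutes schedule; the log does not say how long each run takes, so that has to come from top or similar.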

Thu, 12/10/2009 - 10:34 (Reply to #30)
merlynx

I used to use GDM and/or KDE for similar reasons (the drag and drop can't be beat). When I was not using it I would run at "init 3" or just straight-up stop the Xorg and KDM/GDM service(s). Most distros have a "startx" or an "X" command; to bring the desktop back I'd do "init 5" or use one of these. You can do all that from the shell. That way, you're only running the desktop when you "need" it.
HTH

I've always had this issue with collectinfo.pl. It's parsing logs, so if you have large logs it'll choke the server. I've found that usually there are a couple of "high traffic" sites that have huge log files (usually sites that someone is brute-forcing) and collectinfo.pl sucks up over 50% CPU.
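
If you want to see which log files are the big ones, a small Python sketch like this will list the largest files under a directory. The directory to scan is an assumption on your side (wherever your sites' logs live), and `largest_files` is just a name picked for this example.

```python
import os


def largest_files(root, limit=5):
    """Return up to `limit` (size_in_bytes, path) pairs under root, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # File vanished or is unreadable (e.g. rotated mid-scan); skip it.
                continue
    sizes.sort(reverse=True)
    return sizes[:limit]
```

Something like `largest_files("/var/log", limit=10)` would be a starting point, with the path swapped for your own log location. Rotating or trimming whatever shows up at the top should shorten the run, if log size really is the cause.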

I like the suggestion of running it at "non-business hours..."