Does collectinfo.pl really need around 100 megabytes to execute?

On a secondary mail server with 256 MB RAM (which is enough for all purposes), we regularly see heavy swapping and load spikes due to collectinfo.pl taking around 100 megabytes to execute:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27846 root 20 0 192m 98m 988 D 0.7 38.5 0:12.68 collectinfo.pl

There is no website and no Apache running, so I'm stumped as to why collectinfo.pl needs so much RAM (even with Apache it would be a lot).

How can we lower the RAM usage of this script (present in both Virtualmin Pro and GPL) on webservers?

What does it do?

Does it even need to run every few minutes?

I flagged this as a support request, but eating 100 megs might justify changing it to "bug"?!

Status: Closed (fixed)

Comments

That is rather excessive, since on our test systems it doesn't get above 60m.

Is this a 32-bit or 64-bit system?

64-bit. But that's still not a reason; compare that with other processes:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2369 root 20 0 59668 928 396 S 0.0 0.4 0:28.68 miniserv.pl
2758 root 20 0 97828 1064 440 S 0.0 0.4 0:29.92 miniserv.pl
3493 syslog 20 0 12296 300 200 S 0.0 0.1 4:45.54 syslogd
3568 root 20 0 50916 292 196 S 0.0 0.1 0:00.02 sshd
6548 root 20 0 3944 28 24 S 0.0 0.0 0:00.00 sh
14589 bind 20 0 100m 12m 1004 S 0.0 5.0 23:11.42 named

Even named doesn't use more than 12 megs of real memory...
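To compare the RES column across these listings, the mixed notation has to be normalized first. The following is a minimal Python sketch, assuming the usual procps convention that plain numbers in top are KiB and an "m" or "g" suffix means MiB or GiB. The helper name res_to_kib is made up for this illustration, and the sample values are copied from the listings above.

```python
def res_to_kib(field: str) -> int:
    """Convert a top memory field such as '928', '98m' or '1.2g' to KiB.

    Assumption: unsuffixed values are KiB, 'm' is MiB and 'g' is GiB.
    """
    units = {"k": 1, "m": 1024, "g": 1024 * 1024}
    suffix = field[-1].lower()
    if suffix in units:
        return round(float(field[:-1]) * units[suffix])
    return int(field)


# RES values copied from the top listings in this thread
samples = {
    "collectinfo.pl": "98m",
    "named": "12m",
    "miniserv.pl": "1064",
    "syslogd": "300",
}

if __name__ == "__main__":
    # Print processes from largest to smallest resident size
    for name, res in sorted(samples.items(), key=lambda kv: -res_to_kib(kv[1])):
        print(f"{name:15s} {res_to_kib(res):>8d} KiB")
```

With every value in the same unit, the gap between collectinfo.pl and the long-running daemons is easier to see at a glance.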

By the way, is collectinfo.pl of any use if there are no virtual servers and Apache isn't running?

This is on another (management) server, also without virtual servers:

24544 root 20 0 87824 84m 1656 R 99.0 8.3 0:08.35 collectinfo.pl

(There I saw the memory spike to 117m real)...

I've seen Perl apps use up to 2x the RAM on 64-bit systems, due to the increased pointer size. That said, 100m is still excessive.

One thing you may want to try is going to System Settings -> Module Config -> Status collection, and changing "Collect all available package updates?" to "No".

I disabled the package updates collection and it is running really fast now (less than 5 seconds):

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
18420 root 20 0 36012 16m 1708 R 6.0 6.6 0:00.18 collectinfo.pl

Here is a snapshot at the end of execution of collectinfo.pl on a shared webserver (64-bit): 178 megs.

Before removing the package updates check:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31652 root 20 0 258m 178m 3692 S 97 5.3 0:12.68 collectinfo.pl

There too, after changing that setting, it no longer shows up for more than 10 seconds, and this was the maximum memory footprint seen in top:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
23403 root 20 0 165m 84m 3692 S 68 2.5 0:02.26 collectinfo.pl

So it seems that the package updates check (run every 5 minutes ???!) is what takes the CPU and time at least, and probably the memory too.

But I still don't get why it takes so much memory:

Here is an apt-get update run on the first system:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
18713 root 20 0 16992 2128 1584 S 0.7 0.8 0:00.02 apt-get

Even aptitude, with all of its functions and its interface, uses less than collectinfo.pl:

18806 root 20 0 108m 41m 16m S 0.0 16.3 0:01.08 aptitude

So: 1) the memory and CPU use to collect package updates is way too high; 2) running that every 5 minutes seems like overkill, once a day should be enough.
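The "once a day" idea amounts to caching an expensive result with a time-to-live. Here is a minimal Python sketch of that pattern. It is not how collectinfo.pl works: the function names, the JSON cache file and the stand-in update query are all hypothetical, chosen only to illustrate the suggestion.

```python
import json
import os
import tempfile
import time

CACHE_TTL_SECONDS = 24 * 60 * 60  # refresh at most once a day


def cached(cache_path, ttl, compute, now=None):
    """Return compute()'s result, reusing a JSON cache file younger than ttl."""
    now = time.time() if now is None else now
    try:
        if now - os.path.getmtime(cache_path) < ttl:
            with open(cache_path) as fh:
                return json.load(fh)
    except (OSError, ValueError):
        pass  # missing or corrupt cache: fall through and recompute
    result = compute()
    # Write to a temporary file and rename it, so readers never see a partial file
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "w") as fh:
        json.dump(result, fh)
    os.replace(tmp_path, cache_path)
    return result


def list_available_updates():
    """Hypothetical stand-in for the expensive package-update query."""
    return ["example-package"]


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "updates.json")
    # The first call runs the expensive query; the second is served from the cache
    print(cached(path, CACHE_TTL_SECONDS, list_available_updates))
    print(cached(path, CACHE_TTL_SECONDS, list_available_updates))
```

A job that still runs every 5 minutes would then pay the full cost of the update check only on the first run after the cache expires.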

While we are at memory optimizations (and thus minimizing the risk of the server grinding down from swapping): we're running PHP in FastCGI mode, and many dormant PHP processes stay around for each hosted site, squatting on most of the memory. I haven't yet found a setting to tune that down so that only a maximum total number of such processes stays alive. Tuning the maximum number of FastCGI processes per site isn't a solution, as it's not ideal for bursts. It would be good to be able to set the idle lifetime to e.g. 30 seconds, as well as to cap the total number of FastCGI processes, like we can do for Apache processes, so that by design there is never any swapping during normal operations. Any hints for tuning, or parameters we missed in Webmin/Virtualmin Pro?

There is some inefficiency in the way collectinfo.pl fetches available package updates in the current Virtualmin release, which makes memory use excessive.

The good news is that I've just rewritten this code, so with the next Webmin and Virtualmin releases it will use much less RAM, even when full package collection is enabled.

That's very good news indeed.

Did you make the package updates collection execute only once a day instead of on every run (every 5 minutes)?

No, I just made it use a more efficient method to generate the list of updates.