Perl Segmentation fault for server and user manipulation commands

After upgrading to Webmin (1.791) I get these messages for various commands: list-users, changing quota, etc.

I am on an OVH VPS.

These are the shell commands I used to reproduce the problem:

root@webserver:~# aptitude reinstall webmin
The following packages will be REINSTALLED:
  webmin
0 packages upgraded, 0 newly installed, 1 reinstalled, 0 to remove and 0 not upgraded.
Need to get 0 B/28.2 MB of archives. After unpacking 0 B will be used.
(Reading database ... 95067 files and directories currently installed.)
Preparing to unpack .../archives/webmin_1.791_all.deb ...
Unpacking webmin (1.791) over (1.791) ...
Processing triggers for systemd (215-17+deb8u4) ...
Setting up webmin (1.791) ...
*** Error in `/usr/bin/perl': double free or corruption (!prev): 0x000000000516a3b0 ***
Webmin install complete. You can now login to ............
as root with your root password, or as any user who can use sudo
to run commands as root.
                                        
root@webserver:~# dmesg | tail -30
[120841.077862] monitor.pl[30771]: segfault at 9292130 ip 00007207102c75df sp 0000764c257ff7d0 error 6 in libc-2.19.so[72071024d000+1a2000]
[120841.077893] grsec: Segmentation fault occurred at 0000000009292130 in /usr/share/webmin/status/monitor.pl[monitor.pl:30771] uid/euid:0/0 gid/egid:0/0, parent /usr/sbin/cron[cron:30769] uid/euid:0/0 gid/egid:0/0
[121141.210150] monitor.pl[31715]: segfault at 8292350 ip 00006c09e2d395df sp 00007534d9cf5d80 error 6 in libc-2.19.so[6c09e2cbf000+1a2000]
[121141.210178] grsec: Segmentation fault occurred at 0000000008292350 in /usr/share/webmin/status/monitor.pl[monitor.pl:31715] uid/euid:0/0 gid/egid:0/0, parent /usr/sbin/cron[cron:31712] uid/euid:0/0 gid/egid:0/0
[121441.006295] monitor.pl[552]: segfault at 4e8c460 ip 0000758ff80d55df sp 00007df3479415b0 error 6 in libc-2.19.so[758ff805b000+1a2000]
[121441.006326] grsec: Segmentation fault occurred at 0000000004e8c460 in /usr/share/webmin/status/monitor.pl[monitor.pl:552] uid/euid:0/0 gid/egid:0/0, parent /usr/sbin/cron[cron:548] uid/euid:0/0 gid/egid:0/0
[121807.811416] /usr/share/webm[2295]: segfault at 14875bd0 ip 00007a9e57e7e5df sp 00007e4f2aa009d0 error 6 in libc-2.19.so[7a9e57e04000+1a2000]
[121807.811447] grsec: Segmentation fault occurred at 0000000014875bd0 in /usr/share/webmin/miniserv.pl[/usr/share/webm:2295] uid/euid:0/0 gid/egid:0/0, parent /usr/share/webmin/miniserv.pl[/usr/share/webm:2294] uid/euid:0/0 gid/egid:0/0
[123242.873079] monitor.pl[14000]: segfault at 536cbb0 ip 000075a5399d55df sp 00007b77a2774f90 error 6 in libc-2.19.so[75a53995b000+1a2000]
[123242.873108] grsec: Segmentation fault occurred at 000000000536cbb0 in /usr/share/webmin/status/monitor.pl[monitor.pl:14000] uid/euid:0/0 gid/egid:0/0, parent /usr/sbin/cron[cron:13998] uid/euid:0/0 gid/egid:0/0
[125044.500560] monitor.pl[20502]: segfault at a622cc0 ip 00006e4a377875df sp 00007bd0ea600e20 error 6 in libc-2.19.so[6e4a3770d000+1a2000]
[125044.500590] grsec: Segmentation fault occurred at 000000000a622cc0 in /usr/share/webmin/status/monitor.pl[monitor.pl:20502] uid/euid:0/0 gid/egid:0/0, parent /usr/sbin/cron[cron:20501] uid/euid:0/0 gid/egid:0/0
[125345.225486] monitor.pl[21448]: segfault at 50c49d0 ip 000070756f86c5df sp 0000759c7c09aa80 error 6 in libc-2.19.so[70756f7f2000+1a2000]
[125345.225580] grsec: Segmentation fault occurred at 00000000050c49d0 in /usr/share/webmin/status/monitor.pl[monitor.pl:21448] uid/euid:0/0 gid/egid:0/0, parent /usr/sbin/cron[cron:21446] uid/euid:0/0 gid/egid:0/0
[128347.215732] monitor.pl[32699]: segfault at 5242160 ip 00006607575335df sp 0000740c024066b0 error 6 in libc-2.19.so[6607574b9000+1a2000]
[128347.215782] grsec: Segmentation fault occurred at 0000000005242160 in /usr/share/webmin/status/monitor.pl[monitor.pl:32699] uid/euid:0/0 gid/egid:0/0, parent /usr/sbin/cron[cron:32697] uid/euid:0/0 gid/egid:0/0
[128647.077752] monitor.pl[1576]: segfault at 7eda5b0 ip 000071f711d955df sp 0000786cd39eb430 error 6 in libc-2.19.so[71f711d1b000+1a2000]
[128647.077809] grsec: Segmentation fault occurred at 0000000007eda5b0 in /usr/share/webmin/status/monitor.pl[monitor.pl:1576] uid/euid:0/0 gid/egid:0/0, parent /usr/sbin/cron[cron:1574] uid/euid:0/0 gid/egid:0/0
[129036.001083] /usr/share/webm[3534]: segfault at e687b58 ip 000073072123e67d sp 00007d15cf923d18 error 6 in libc-2.19.so[7307211b9000+1a2000]
[129036.001131] grsec: From 31.30.46.17: Segmentation fault occurred at 000000000e687b58 in /usr/share/webmin/virtual-server/list-users.pl[/usr/share/webm:3534] uid/euid:0/0 gid/egid:0/0, parent /bin/bash[bash:2863] uid/euid:0/0 gid/egid:0/0
[129239.580247] /usr/share/webm[3879]: segfault at e351b60 ip 0000702031f405df sp 0000728154ec4390 error 6 in libc-2.19.so[702031ec6000+1a2000]
[129239.580294] grsec: From 31.30.46.17: Segmentation fault occurred at 000000000e351b60 in /usr/share/webmin/virtual-server/list-users.pl[/usr/share/webm:3879] uid/euid:0/0 gid/egid:0/0, parent /bin/bash[bash:2863] uid/euid:0/0 gid/egid:0/0
[129314.996477] /usr/share/webm[4662]: segfault at 1450be30 ip 00007a9e57e7e5df sp 00007e4f2a9f7550 error 6 in libc-2.19.so[7a9e57e04000+1a2000]
[129314.996519] grsec: Segmentation fault occurred at 000000001450be30 in /usr/share/webmin/miniserv.pl[/usr/share/webm:4662] uid/euid:0/0 gid/egid:0/0, parent /usr/share/webmin/miniserv.pl[/usr/share/webm:4661] uid/euid:0/0 gid/egid:0/0
[129847.888731] monitor.pl[7339]: segfault at 793f1f0 ip 00006d1d36d4e5df sp 000070e6f6f38840 error 6 in libc-2.19.so[6d1d36cd4000+1a2000]
[129847.888766] grsec: Segmentation fault occurred at 000000000793f1f0 in /usr/share/webmin/status/monitor.pl[monitor.pl:7339] uid/euid:0/0 gid/egid:0/0, parent /usr/sbin/cron[cron:7337] uid/euid:0/0 gid/egid:0/0
[130210.995148] /usr/share/webm[9106]: segfault at 148c38b8 ip 00007a9e57e8967d sp 00007e4f2aa00e68 error 6 in libc-2.19.so[7a9e57e04000+1a2000]
[130210.995183] grsec: Segmentation fault occurred at 00000000148c38b8 in /usr/share/webmin/miniserv.pl[/usr/share/webm:9106] uid/euid:0/0 gid/egid:0/0, parent /usr/share/webmin/miniserv.pl[/usr/share/webm:9105] uid/euid:0/0 gid/egid:0/0
[130826.500934] systemd-sysv-generator[17485]: Overwriting existing symlink /run/systemd/generator.late/milter-greylist.service with real service
[130841.389904] systemd-sysv-generator[17790]: Overwriting existing symlink /run/systemd/generator.late/milter-greylist.service with real service
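
To see at a glance which processes are crashing, the segfault lines in a log like the one above can be tallied per process name. This is only a sketch: count_segfaults is a made-up helper name, and it assumes the kernel log lines keep the "name[pid]: segfault at" shape shown here. The grsec lines are skipped because they describe the same events.

```shell
# Tally kernel "segfault at" lines per process name.
# Reads kernel log text on stdin, prints "COUNT NAME" lines, highest count first.
# Assumption: lines look like "... name[pid]: segfault at ADDR ip ...".
count_segfaults() {
    awk '
        match($0, /[^ []+\[[0-9]+\]: segfault at /) {
            name = substr($0, RSTART, RLENGTH)
            # Strip the "[pid]: segfault at " suffix, leaving only the name.
            sub(/\[[0-9]+\]: segfault at $/, "", name)
            count[name]++
        }
        END { for (n in count) print count[n], n }
    ' | sort -k1,1nr -k2
}
```

Usage would be something like "dmesg | count_segfaults". Note that the kernel truncates process names, so miniserv.pl and list-users.pl both show up under the shortened /usr/share/webm name.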

List domain users:

root@webserver:~# virtualmin list-users --domain zlatefinance.cz
e/webmin/virtual-server/list-users.pl: malloc.c:3700: _int_malloc: Assertion `victim->fd_nextsize->bk_nextsize == victim' failed.
Aborted
root@webserver:~#

Thanks for any relevant info or tips on how to solve this issue.

Tomas

Status: 
Active

Comments

Howdy -- it sounds like your Linux system is restricting what the processes running on it are allowed to do.

It looks like grsecurity may be responsible for at least some of that.

You may want to try disabling grsecurity, or configure it to ignore the Webmin processes.

Hello, thanks for the reply. I accidentally replied in the wrong place (https://www.virtualmin.com/comment/755602#comment-755602), but I have also tested several kernels from OVH, marked -std- and -grsec- (with the distro kernel I was unable to use the Ethernet driver).

The GRSEC kernels have the security features enabled; the STD kernels do not. I am currently running this one:

root@webserver:~# uname -r
3.14.32-xxxx-std-ipv6-64-vps

I also tried changing various boot settings, without any luck.

The output remains the same:

root@webserver:~# virtualmin list-users --domain XXXX.cz
Segmentation fault

All the problems look related to users: I am unable to change a quota, add a user, or create a new virtual server :/

Any tips on how to get back to normal?

I suppose I am not the only one running Webmin and Usermin on an OVH VPS.

I am willing to reinstall the whole system on another VPS and migrate all domains (150+) to it, but without working user lists and related functions I am stuck :/

I have no problem switching to the paid version of Virtualmin or buying a support package to solve this problem.

There likely isn't a simple way to get things working with grsecurity itself. So coming up with a way to disable grsecurity is the direction we'd suggest.

If you're open to using paid support, that does open up some options!

There's the option of switching to a Virtualmin Pro license, or purchasing a support incident. Or, if things are dire, you could actually hire us to migrate things to a new server for you.

We can definitely help out in that case; I'm not sure how yet, but we can figure something out.

I'm not sure which of those payment options to choose at the moment though, as this isn't an issue we've run into very often. Maybe you could hold off on buying one of those options until we have a better idea as to the problem?

I'd like to think that there is a way to switch kernels and have that work.

Can you describe what happens when switching to a standard kernel? You said you couldn't use the eth driver -- what was happening there exactly?

Hi, if my understanding is correct, OVH has some kind of prebuilt kernel for their VMware platform that contains the Ethernet drivers.

My first experience with the Debian kernel (installed through apt-get) was that everything started properly (postgres, nginx, etc.), but I was unable to make any connection in or out.

I will try some more kernels at the end of the week (outside peak hours) and let you know.

Just to verify, as I'm not fully certain I understand -- are the kernels you're trying now provided by OVH? Or are they Debian kernels, provided by the distro?

Hi,

I am sorry for the delayed response; I had to concentrate on delivering another project.

To answer your question:

I have tested the following kernels from OVH (std, without security, and grs, with security) and also the standard distro kernel.

-rw-r--r--  1 root root  7708624 May  4 21:11 bzImage-3.10.23-xxxx-std-ipv6-64-vps
-rw-r--r--  1 root root  8436624 May  3 22:18 bzImage-3.14.32-xxxx-std-ipv6-64-vps
-rw-r--r--  1 root root  6999936 May 17  2013 bzImage-3.8.13-xxxx-grs-ipv6-64-vps
-rw-r--r--  1 root root    85345 May  4 21:11 config-3.10.23-xxxx-std-ipv6-64-vps
-rw-r--r--  1 root root    92957 May  3 22:18 config-3.14.32-xxxx-std-ipv6-64-vps
-rw-r--r--  1 root root   157726 Mar  9 05:42 config-3.16.0-4-amd64
-rw-r--r--  1 root root    78405 May  4 21:20 config-3.8.13-xxxx-grs-ipv6-64-vps
drwxr-xr-x  5 root root    12288 May  4 21:24 grub
-rw-r--r--  1 root root 17312513 Apr 29 00:11 initrd.img-3.16.0-4-amd64
-rw-r--r--  1 root root  3036762 Apr 28 23:27 System.map-3.14.63-xxxx-std-ipv6-64-vps
-rw-r--r--  1 root root  2676777 Mar  9 05:42 System.map-3.16.0-4-amd64
-rw-r--r--  1 root root  2650829 May 17  2013 System.map-3.8.13-xxxx-grs-ipv6-64-vps
-rw-r--r--  1 root root  3120288 Mar  9 05:38 vmlinuz-3.16.0-4-amd64

Virtualmin was installed with the OVH kernel and worked properly for over 2 years. Now (I do not know why) it does not work properly:

May 23 07:16:02 webserver kernel: [1591723.776893] /usr/share/webm[27219]: segfault at d82aac0 ip 00007f05a4ac15df sp 00007fff38b372e0 error 6 in libc-2.19.so[7f05a4a47000+1a2000]
May 23 07:35:02 webserver kernel: [1592864.434065] monitor.pl[30548]: segfault at 67a1cc0 ip 00007fc92f87f5df sp 00007ffd7ba51160 error 6 in libc-2.19.so[7fc92f805000+1a2000]
May 23 07:45:02 webserver kernel: [1593465.420328] monitor.pl[32430]: segfault at 542a890 ip 00007f143e7b15df sp 00007ffc72d80dc0 error 6 in libc-2.19.so[7f143e737000+1a2000]
May 23 08:05:02 webserver kernel: [1594665.957944] monitor.pl[4169]: segfault at 61d8700 ip 00007f44d98d25df sp 00007ffcdb39c700 error 6 in libc-2.19.so[7f44d9858000+1a2000]

It looks like most of the errors are related to users. I am unable to list the users of a server where quota is enabled, and I am not able to change them when the quota is reached. I am also unable to create new servers (in certain configurations) or add users.

As I mentioned, I have some spare maintenance budget that I would like to spend on a professional Virtualmin license or, in this case, on support, because this error is beyond my skills. Can you give me an estimate of the time and cost for help from your side?

No problem on the delay!

If we thought this was some sort of "normal" configuration issue, or a coding bug, we'd be more than happy to work with you on that, and to look deeper into that problem.

The kinds of problems you're seeing are really unusual though, and appear to be specific to the environment at your provider there. While there are folks using OVH who aren't experiencing that particular issue, I am wondering if the easiest solution isn't to just change providers (which I think you were mentioning as an option above, assuming you were able to run the appropriate commands to do so).

You mentioned that list-users isn't working. However, is the backup function working? Or does that error out too?

It really looks like there is some kind of security or resource restriction that is preventing things from working properly.

All that said, I do have a few questions for you just to rule out a handful of simpler things --

What kind of VPS do you have? For example, Xen, KVM, OpenVZ?

And what is the output of these two commands:

free -m
dmesg | tail -30

Oh, and this one might generate quite a bit of output, but I'm curious what it shows:

strace virtualmin list-users --domain XXXX.cz

Thanks. The last command reveals the most important part:

recvfrom(4, "7\0\0\1\377\36\4#42S22Unknown column 'use"..., 16384, 0, NULL, NULL) = 59

So I suppose it is related to my MySQL version, 5.7, and its changes to the user table; notably, the password field is missing. I tried a hot-fix: adding a password column to the user table and copying the passwords from authentication_string into it. But it looks like nothing changed.
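
For reference, MySQL 5.7 did drop the Password column from the mysql.user table in favor of authentication_string, so it is easy to check which layout a given server has. A minimal sketch, assuming the column names are fed in one per line; password_column is a hypothetical helper name, not part of MySQL or Webmin.

```shell
# Report which password column(s) a mysql.user column list contains.
# Reads column names on stdin (one per line) and prints one of:
#   Password, authentication_string, both, none
password_column() {
    awk '
        $1 == "Password"              { has_old = 1 }
        $1 == "authentication_string" { has_new = 1 }
        END {
            if (has_old && has_new) print "both"
            else if (has_old)       print "Password"
            else if (has_new)       print "authentication_string"
            else                    print "none"
        }
    '
}
```

One way to feed it, assuming the mysql client can log in as an administrative user: mysql -N -B -e 'SHOW COLUMNS FROM mysql.user' | cut -f1 | password_column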

To your questions: VPS type: if I understand the console output during a restart correctly, it is a VMware solution. I would like to migrate within OVH to OpenStack (where attaching additional disks is allowed).

root@webserver:~# free -m
             total       used       free     shared    buffers     cached
Mem:          8005       6985       1019        991        201       1989
-/+ buffers/cache:       4794       3210
Swap:         2049       2049          0
root@webserver:~#
root@webserver:~# dmesg | tail -30
[1603374.889557] monitor.pl[3186]: segfault at 70f2d50 ip 00007f0c6ac855df sp 00007fff352f4fb0 error 6 in libc-2.19.so[7f0c6ac0b000+1a2000]
[1605773.943632] monitor.pl[11851]: segfault at 484c790 ip 00007f1410ce65df sp 00007fff61152c40 error 6 in libc-2.19.so[7f1410c6c000+1a2000]
[1606974.702490] monitor.pl[16304]: segfault at 5d8c6a0 ip 00007f64ca1c85df sp 00007fffad843390 error 6 in libc-2.19.so[7f64ca14e000+1a2000]
[1607342.574984] /usr/share/webm[17625]: segfault at d7c38b0 ip 00007f05a4ac15df sp 00007fff38b40760 error 6 in libc-2.19.so[7f05a4a47000+1a2000]
[1607575.942411] monitor.pl[18522]: segfault at 5929210 ip 00007f25b5d3b5df sp 00007ffcc28aa440 error 6 in libc-2.19.so[7f25b5cc1000+1a2000]
[1607876.026723] monitor.pl[19625]: segfault at 4e8edf0 ip 00007f94819435df sp 00007ffff63d9560 error 6 in libc-2.19.so[7f94818c9000+1a2000]
[1608477.105817] monitor.pl[21800]: segfault at 5e1cee0 ip 00007f9650e1a5df sp 00007fff43b0a880 error 6 in libc-2.19.so[7f9650da0000+1a2000]
[1608776.798656] monitor.pl[22779]: segfault at 401d260 ip 00007f35fee6a5df sp 00007ffea04e10f0 error 6 in libc-2.19.so[7f35fedf0000+1a2000]
[1609145.500284] /usr/share/webm[23986]: segfault at d896550 ip 00007f05a4ac15df sp 00007fff38b40760 error 6 in libc-2.19.so[7f05a4a47000+1a2000]
[1609978.026078] monitor.pl[27014]: segfault at 6a55030 ip 00007f022d07f5df sp 00007ffd8910b090 error 6 in libc-2.19.so[7f022d005000+1a2000]
[1610278.273013] monitor.pl[27985]: segfault at 5870ab0 ip 00007fab8e1835df sp 00007ffc9c187fc0 error 6 in libc-2.19.so[7fab8e109000+1a2000]
[1610938.689946] /usr/share/webm[29968]: segfault at b12d2b0 ip 00007f05a4ac15df sp 00007fff38b408c0 error 6 in libc-2.19.so[7f05a4a47000+1a2000]
[1612682.049293] monitor.pl[4606]: segfault at 6302d60 ip 00007fbc58fc05df sp 00007fff2d7abfc0 error 6 in libc-2.19.so[7fbc58f46000+1a2000]
[1612980.118420] monitor.pl[5750]: segfault at 4c43140 ip 00007fb23065e5df sp 00007ffea8a0aa60 error 6 in libc-2.19.so[7fb2305e4000+1a2000]
[1615682.021277] monitor.pl[14993]: segfault at 78e64a0 ip 00007f23d15695df sp 00007fffe4d5ef80 error 6 in libc-2.19.so[7f23d14ef000+1a2000]
[1616883.010271] monitor.pl[19661]: segfault at 4ab7230 ip 00007f95e1e2d5df sp 00007ffdbf5e1810 error 6 in libc-2.19.so[7f95e1db3000+1a2000]
[1617483.408489] monitor.pl[21781]: segfault at 6413ac0 ip 00007f79490785df sp 00007ffcdeb0a7c0 error 6 in libc-2.19.so[7f7948ffe000+1a2000]
[1618083.406026] monitor.pl[23760]: segfault at 5e9db10 ip 00007fe4447955df sp 00007fffecc99bb0 error 6 in libc-2.19.so[7fe44471b000+1a2000]
[1619284.414632] monitor.pl[28353]: segfault at 3f4f090 ip 00007f19a9f3d5df sp 00007ffd2338e3f0 error 6 in libc-2.19.so[7f19a9ec3000+1a2000]
[1619885.730190] monitor.pl[30546]: segfault at 7d05030 ip 00007fb7bd84e5df sp 00007fff0ed186e0 error 6 in libc-2.19.so[7fb7bd7d4000+1a2000]
[1620244.491066] /usr/share/webm[31769]: segfault at d86fea0 ip 00007f05a4ac15df sp 00007fff38b40760 error 6 in libc-2.19.so[7f05a4a47000+1a2000]
[1621086.571205] monitor.pl[3184]: segfault at 4879f40 ip 00007f825f5c95df sp 00007ffd5bfb39d0 error 6 in libc-2.19.so[7f825f54f000+1a2000]
[1621987.585130] monitor.pl[6206]: segfault at 583cf90 ip 00007f62e4be45df sp 00007ffd9bd50830 error 6 in libc-2.19.so[7f62e4b6a000+1a2000]
[1622587.727728] monitor.pl[8086]: segfault at 808bbc0 ip 00007f5b9b5765df sp 00007ffea963f5f0 error 6 in libc-2.19.so[7f5b9b4fc000+1a2000]
[1623788.458545] monitor.pl[12045]: segfault at 42796f0 ip 00007faf22bdd5df sp 00007fff514d5500 error 6 in libc-2.19.so[7faf22b63000+1a2000]
[1623854.786168] /usr/share/webm[12160]: segfault at d7d3b20 ip 00007f05a4ac15df sp 00007fff38b40760 error 6 in libc-2.19.so[7f05a4a47000+1a2000]
[1624388.719367] monitor.pl[13896]: segfault at 799a9c0 ip 00007f15d6dc35df sp 00007ffd10491120 error 6 in libc-2.19.so[7f15d6d49000+1a2000]
[1624989.605459] monitor.pl[15816]: segfault at 4cc7e80 ip 00007f5abc1495df sp 00007ffebf2f6cd0 error 6 in libc-2.19.so[7f5abc0cf000+1a2000]
[1625289.757969] monitor.pl[16822]: segfault at 4bcdb10 ip 00007fae4fa4f5df sp 00007ffe16f191c0 error 6 in libc-2.19.so[7fae4f9d5000+1a2000]
[1626491.013545] monitor.pl[20661]: segfault at 7893640 ip 00007f4e16ba55df sp 00007ffcfc05c4c0 error 6 in libc-2.19.so[7f4e16b2b000+1a2000]
root@webserver:~#

and finally:

strace virtualmin list-users --domain XXXX.cz

is stored in an external file (350 kB): https://www.dropbox.com/s/4i61xvwg3wa8tk5/virtualmin-strace.txt?dl=0

Segfaults mean that "something under the hood" is breaking. Something at the kernel level, or maybe a library.

Pure Perl code shouldn't be able to cause a segfault by itself. Instead, it indicates that something is very awry.

It shouldn't happen due to MySQL columns and such -- that would generate friendlier errors, as that's a more normal/expected kind of problem.

That MySQL activity in the strace output is most likely just normal activity that Virtualmin was performing when something else caused a problem.

That gave me an idea though.

What is the output of this command:

cat /etc/security/limits.conf

# /etc/security/limits.conf
#
#Each line describes a limit for a user in the form:
#
#<domain>        <type>  <item>  <value>
#
#Where:
#<domain> can be:
#        - an user name
#        - a group name, with @group syntax
#        - the wildcard *, for default entry
#        - the wildcard %, can be also used with %group syntax,
#                 for maxlogin limit
#        - NOTE: group and wildcard limits are not applied to root.
#          To apply a limit to the root user, <domain> must be
#          the literal username root.
#
#<type> can have the two values:
#        - "soft" for enforcing the soft limits
#        - "hard" for enforcing hard limits
#
#<item> can be one of the following:
#        - core - limits the core file size (KB)
#        - data - max data size (KB)
#        - fsize - maximum filesize (KB)
#        - memlock - max locked-in-memory address space (KB)
#        - nofile - max number of open files
#        - rss - max resident set size (KB)
#        - stack - max stack size (KB)
#        - cpu - max CPU time (MIN)
#        - nproc - max number of processes
#        - as - address space limit (KB)
#        - maxlogins - max number of logins for this user
#        - maxsyslogins - max number of logins on the system
#        - priority - the priority to run user process with
#        - locks - max number of file locks the user can hold
#        - sigpending - max number of pending signals
#        - msgqueue - max memory used by POSIX message queues (bytes)
#        - nice - max nice priority allowed to raise to values: [-20, 19]
#        - rtprio - max realtime priority
#        - chroot - change root to directory (Debian-specific)
#
#<domain>      <type>  <item>         <value>
#

#*               soft    core            0
#root            hard    core            100000
#*               hard    rss             10000
#@student        hard    nproc           20
#@faculty        soft    nproc           20
#@faculty        hard    nproc           50
#ftp             hard    nproc           0
#ftp             -       chroot          /ftp
#@student        -       maxlogins       4

*       hard    nofile  100000
*       soft    nofile  100000
*       hard    nproc   100000
*       soft    nproc   100000
mysql   soft    nofile  100000
mysql   hard    nofile  100000   
root    soft    nproc   unlimited
root    hard    nproc   unlimited
root    soft    nofile  100000
root    hard    nofile  100000

# End of file

Try commenting out everything in your limits.conf file, and see if there is a change after that.

If I comment out the mysql section I will be unable to run MySQL (a known problem), and I can't do that with 100+ domains relying on it (except maybe in the middle of the night). Do you really believe that it might help?

Try increasing the MySQL section perhaps to 200000, and then comment out everything else.

You can always wait until after business hours to test that.
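
That edit can be sketched as a filter, so the result can be reviewed before it replaces anything. relax_limits is a hypothetical helper name; it assumes the four-column layout of the limits.conf shown above, and it rewrites the mysql lines with single spaces between fields.

```shell
# Rewrite limits.conf text from stdin to stdout:
#  - comments and blank lines pass through unchanged
#  - active "mysql" lines get their value raised to 200000
#  - every other active line is commented out
relax_limits() {
    awk '
        /^[ \t]*(#|$)/ { print; next }
        $1 == "mysql"  { print $1, $2, $3, 200000; next }
                       { print "#" $0 }
    '
}
```

For example: relax_limits < /etc/security/limits.conf > /tmp/limits.conf.new, then compare the two files with diff before copying the new one into place. As far as I know, limits.conf changes only apply to sessions started afterwards.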

Okay... I'm unfortunately not quite sure what else to try there.

Have you tried talking to your provider about this issue?

It's possible they've heard about it before and would know what to do.

Other than that, the only other thought I have would be to attempt migrating elsewhere, either to another provider, or to another type of host at this provider.

The normal way to do that is using Virtualmin's backup functionality, though if that's not working properly that would be a problem.

The other way would be to stop all the services, and to rsync the various important files and directories (but not rsync'ing more than necessary, as you wouldn't want to accidentally copy over the cause of the problem). That wouldn't be easy, especially with 150+ sites, but if the backup functionality doesn't work that may be the only other way to get that up and running somewhere else.

Submitted by Joe on Mon, 05/23/2016 - 14:08 Pro Licensee

I have to agree with Eric that there's a kernel issue happening here (it could also be underlying hardware; memory or disk corruption can lead to weird stuff like this that kinda spans a bunch of different services and tasks...though rebooting would likely change the symptoms). Nothing Webmin/Virtualmin does should even be able to cause a segfault (being written in high level languages without directly managing memory, Webmin doesn't have any control over this stuff), much less consistently trigger them for some tasks.

Am I understanding you correctly that you've tried booting into different kernels? Perhaps rolling back to one that worked in the past would be an option? If not, any solution will probably have to involve the OVH folks; kernel level segfaults are out of our reach...we can't fix them.

You could also try reducing memory usage on the system, to see if it leads to different behavior. If OVH are overselling memory, it might manifest this way (I don't know, as I haven't seen this particular behavior before, but some weird errors in the past have been due to oversold memory). So, you could turn off library caching in Webmin, reduce MySQL buffer sizes, get rid of unneeded Apache modules, etc. If the symptoms change or go away, you'd be able to guess that maybe that's the problem. Some VM hosting platforms are kinda notorious for memory problems (OpenVZ and Virtuozzo, in particular, as it is used so heavily by cut-rate hosts who oversell their systems heavily). OVH doesn't have a bad reputation, as far as I know, but it could still be a memory related issue, if they've misconfigured something.
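
One quick check along those lines is how much of the swap is in use. A sketch only: swap_used_percent is a hypothetical helper that reads free -m output on stdin and assumes the "Swap: total used free" line layout seen earlier in this thread.

```shell
# Print the percentage of swap in use (rounded down), given `free -m` output
# on stdin. Prints "n/a" when no swap is configured.
swap_used_percent() {
    awk '$1 == "Swap:" {
        if ($2 > 0) printf "%d\n", $3 * 100 / $2
        else        print "n/a"
    }'
}
```

Run it as: free -m | swap_used_percent. For the free -m output posted above (2049 MB of swap, 2049 MB used) it prints 100, meaning swap is completely exhausted.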

Thanks for the support. I will try to reach OVH (we are under the Czech branch, which redirects requests to France) and we will see. I will keep you informed.

My assumption was correct. I am not a Perl guy, but the syslog output and the malfunctioning MySQL module worried me.

Today I googled some more and found this: https://sourceforge.net/p/webadmin/bugs/4476/

So I set nodbi=1 and everything works as expected.
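
For anyone landing here with the same symptoms, a sketch of making that kind of change from the shell. set_config_key is a hypothetical helper for plain key=value files. The thread does not say which file nodbi lives in; I would expect Webmin's MySQL module config, /etc/webmin/mysql/config on a typical install, but treat that path as an assumption and check your own system first.

```shell
# Set KEY=VALUE in a simple key=value config file, replacing an existing
# line for KEY or appending a new one. A backup is kept as FILE.bak.
# Assumption: KEY and VALUE contain no "|" or regex metacharacters.
set_config_key() {
    file=$1 key=$2 value=$3
    cp -p "$file" "$file.bak" || return 1
    if grep -q "^${key}=" "$file"; then
        # Rewrite from the backup so the original is never edited in place.
        sed "s|^${key}=.*|${key}=${value}|" "$file.bak" > "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
```

Under that path assumption the call would be: set_config_key /etc/webmin/mysql/config nodbi 1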

The question is whether this is really a Webmin bug.

Submitted by Joe on Tue, 05/24/2016 - 15:21 Pro Licensee

Oh, wow! I remember that issue now, but it's been years since anyone ran into it, so I didn't even think of it. I would think that would be a MySQL DBD bug. But, why it's hitting a new system is weird. I thought it was an old bug that had since been fixed. Does your system have a non-standard Perl or libraries from a source other than the OS repos?

Hi. It is now very confusing what caused what. During my journey to solve this problem I tried almost everything: switching kernels, installing and trying to change grsecurity, and of course CPAN manipulation :/

So I think I am unable to answer with confidence :-)