nginx sub-server re-use parent fastcgi [#25437]

Submitted by aitte on Mon, 02/18/2013 - 04:49

Imagine that I create the server "hello.com"

And I create 20 sub-servers:

a.hello.com
b.hello.com
c.hello.com
...

Suddenly, I have 21 php-fcgi processes along with 4 children each.

That's 21 + (21 * 4) = ONE HUNDRED AND FIVE (105) php processes constantly spinning in memory and stealing cpu and ram.

Really, when we create a new sub-server there must, must, must be an option to say "re-use parent fastcgi daemon" and it should be the default.

That option in turn should make it so that we do NOT create a subdomain fastcgi daemon, but instead tell the nginx sub-server to re-use the one from the parent domain. This is safe and should be the default setup for all new sub-servers, because the parent domain and sub-servers run as the same user so there is ZERO benefit to giving each subdomain their own fastcgi daemon in almost 100% of cases, unless the subdomain gets loads of traffic, or is a different site altogether.

When creating an alias server, it already re-uses the parent server's fastcgi process, so it would be trivial to make sub-servers do the same thing.

The only issue with all of this is that you may not want the sub-server to use the same php.ini as the parent server, but that is why you have an option to make the sub-server spawn its own php-fcgi process with its entirely own php.ini. But don't keep it as the default because it is a HUGE waste of resources in 99% of cases.

Status:

Active

Comments

Submitted by JamieCameron on Mon, 02/18/2013 - 23:14 Comment #1

Another user (jasongayson) brought this up recently as well. He also had some very strong opinions about how Nginx support in Virtualmin should be implemented, and they were quite similar to yours. So clearly there is some demand for this change.

However, it would break Virtualmin's current model of having a separate .ini file for each domain, which is relied on by script installers. So I'm very unlikely to make it the default.

Also, the resource overhead of having many php processes may be less than it first appears - all of those processes would share the same executable memory, and unless actually serving queries would be using zero CPU.

Submitted by aitte on Fri, 02/22/2013 - 05:43 Comment #2

Oh, the children of the overall php-fcgi daemon share memory. That's a really huge relief. "top" doesn't show how much is shared memory so I had a major heart attack when I saw how much "memory" was being used and how many processes I had.

Still a big waste to have one php-fcgi per domain, but less of a waste than I thought (1/5th of what I thought, since the 4 children share most of the same memory).

And it makes sense that the script installers would want to modify php.ini in some cases and need to know where the per-domain ini is...

However, it's actually possible to solve that by making the "get_domain_php_ini" function return the php.ini location of the parent/daemon server, which will allow script installers to continue to work great under the reusable php-fcgi system.

In fact, if that function is relied on everywhere, then it'd only be that single point that needed tweaking when migrating to the more efficient "one php-fcgi per user; optionally spawning extra ones on a per-subserver basis if they get a lot of traffic and warrant it, or if they really really want a private php.ini"

Submitted by JamieCameron on Tue, 02/19/2013 - 13:40 Comment #3

Yeah, the memory use shown by top is misleading - typically all instances of a single binary like php-cgi on the system will share the RAM used by the code, which is the majority in this case. I'm not 100% sure if this sharing spans users, but even if it doesn't there wouldn't be much gain from having fewer processes per user.
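One way to check how much memory is really shared, instead of relying on the per-process figure in top, is to add up the proportional set size (Pss) lines from /proc/<pid>/smaps. The following is only a sketch: it assumes a Linux kernel whose smaps output includes Pss lines and permission to read the target processes' smaps files, and the function name pss_total_kb is made up for illustration.

```shell
#!/bin/sh
# Sum the "Pss:" lines (in kB) from one or more smaps files.
# Pss divides each shared page among the processes that map it,
# so adding it up across processes does not double-count shared memory.
pss_total_kb() {
    awk '/^Pss:/ { sum += $2 } END { print sum + 0 }' "$@"
}

# Hypothetical usage, with made-up PIDs of two php-cgi processes:
#   pss_total_kb /proc/1234/smaps /proc/1235/smaps
```

Comparing this total before and after stopping one domain's php-cgi processes would give a similar answer to the "kill them and watch free memory" test, without taking a site down.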

Even if the get_domain_php_ini function returned the parent .ini file, this could cause problems with script installs that have contradictory requirements (like magic quotes on vs off). Currently this can be handled by putting them into separate subdomains.

Submitted by aitte on Fri, 02/22/2013 - 05:43 Comment #4

Hmm, that's true... There could be ini conflicts.

Anyway, I read up on php-cgi since your claim that "all php-cgi instances from a single user share memory" was news to me. But every resource I found described it as follows:

php-cgi (parent: 32mb shared ram for instance)
  child1 (2mb own memory, 32mb shared)
  child2 (2mb own memory, 32mb shared)
  child3 (2mb own memory, 32mb shared)
  child4 (2mb own memory, 32mb shared)
  (means 8mb wasted by the children, 32mb by the parent)
php-cgi (another parent, another 32mb wasted)
  child1 (2mb own memory, 32mb shared)
  child2 (2mb own memory, 32mb shared)
  child3 (2mb own memory, 32mb shared)
  child4 (2mb own memory, 32mb shared)
  (means another 8mb wasted by the children, 32mb by the parent)

So the above example setup with two php-cgi processes and 4 children each would use 2 * 32 + 2 * (4 * 2) = 80mb ram for just two php-cgi processes with 4 kids each.
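The same sum as a small shell calculation, in case anyone wants to plug in their own numbers. The 32 MB and 2 MB figures are the ones quoted above from third-party descriptions, not measurements, and the variable names are only illustrative.

```shell
#!/bin/sh
# Memory model described above: every parent holds its own copy of the
# interpreter, and every child adds a little private memory on top.
parents=2
children_per_parent=4
parent_mb=32
child_mb=2

total_mb=$(( parents * parent_mb + parents * children_per_parent * child_mb ))
echo "${total_mb} MB"
```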

The only sharing that goes on is between each php-cgi parent and its children.

Another type of sharing is if you run the APC opcode caching module, which shares opcodes between all children within a single php-cgi process, but php-cgi itself doesn't share memory with any other php-cgi parents.

The other kind of sharing would be if you use the newer php-fpm which uses a single php process for the entire system, which has one pool of shared memory for the core engine, used by every child process and every user, and is by far the most effective.

Every description out there called php-cgi a giant memory hog and something that one should be very careful about spawning too many separate instances of.

Maybe you had php-cgi confused with php-fpm? I spent 20 minutes reading and searching and couldn't find a single site that said anything other than "beware of spawning extra php-cgis, it's a memory hog."

Submitted by JamieCameron on Wed, 02/20/2013 - 00:14 Comment #5

The memory sharing I am referring to is for the php-cgi executable itself, and its shared libraries. These account for most of the ~32 MB usage displayed by top.

You might want to try out stopping the php-cgi processes for one domain on your system and see how much the free memory changes. On my test system, killing the processes for one domain barely saved any memory. Killing them for all domains saved 32 MB.

This system wasn't under load though, so memory use by PHP apps themselves wasn't significant.

Submitted by aitte on Fri, 02/22/2013 - 05:43 Comment #6

Here are the mem statistics on a 32 bit low-specced test VM (hence the low total RAM used from the start):

used memory:
user1/domains user2/domains user3/domains
1/0 2/0 3/0 = 250388k - 0 users
1/1 2/0 3/0 = 278020k - 1 user
1/2 2/0 3/0 = 286796k
1/3 2/0 3/0 = 292316k
1/4 2/0 3/0 = 299300k
1/5 2/0 3/0 = 303944k
1/5 2/1 3/0 = 309200k - 2 users
1/5 2/2 3/0 = 315192k
1/5 2/3 3/0 = 320976k
1/5 2/3 3/1 = 326276k - 3 users
1/5 2/3 3/2 = 330712k

Hmm, this suggests that php-cgi sets up a shared memory area across all users, where it stores the PHP core binary, and then just does per-process/per-child data/state stores.
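To make the pattern in the table easier to see, here is a small sketch that prints how much used memory grew with each additional domain's php-cgi. The readings are copied from the table above; nothing else is assumed.

```shell
#!/bin/sh
# Used-memory readings (kB) from the table, in measurement order:
# the idle baseline first, then one reading per domain added.
readings="250388 278020 286796 292316 299300 303944 309200 315192 320976 326276 330712"

# Print the increase contributed by each additional domain.
deltas=$(
    prev=""
    for r in $readings; do
        if [ -n "$prev" ]; then
            echo $(( $r - $prev ))
        fi
        prev=$r
    done
)
echo "$deltas"
```

The first domain costs 27632k, while every later one costs between 4436k and 8776k. That includes the first domain of user2 (5256k) and of user3 (5300k), which is what you would expect if the interpreter's code is shared across users.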

So it turns out that the entire internet was wrong. I really did spend 20 minutes reading on php.net and numerous other websites using loads of different search terms all relating to php-cgi and memory and every single site, yes every single site, said that every php-cgi instance uses its own chunk of memory and that you quickly run out of memory on a VPS by launching too many.

Well, the above tests prove all of those people wrong. Odd. This is probably one of those things that used to be true in the past and has become common knowledge, but was later fixed and improved to re-use memory. I've seen that kind of situation many times; it usually doesn't matter to people if an issue has been fixed, the outdated and conventional wisdom will keep spreading like wildfire. Old habits die hard, especially in sysadmins.

There is one important thing not shown in the above statistics though: if I was to enable the APC caching module, I would be in deep trouble, because it caches per-parent. With 1 process, it's fine and gives great performance. 40 parents (imagining 40 domains) each allocating a 32mb cache, not so fine anymore. That'd be 1.3gb of RAM wasted due to the frivolous spawning of parents.
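The APC figure above works out as follows (a sketch; 40 domains and a 32 MB cache per parent are the hypothetical numbers from the paragraph, not measurements):

```shell
#!/bin/sh
# One APC cache is allocated per php-cgi parent, so the total grows
# with the number of parents rather than the number of users.
domains=40
cache_mb=32

per_domain_total_mb=$(( domains * cache_mb ))
echo "${per_domain_total_mb} MB"
```

That is 1280 MB, or 1.25 GB, which is where the rounded "1.3gb" comes from.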

However, I don't use APC, and this result briefly even made me reconsider my practice of always remapping subdomains to use the parent domain's php-cgi and shutting down the subdomain's own process.

Then again, php-cgi spawns children as-needed, so I see no reason to keep a heckload of parent processes active per-site. I am therefore going to continue killing subdomain php-cgi processes and redirecting them to the main domain's, because it's the right thing to do regardless.

Edit:

It just hit me: it may have looked fine in the statistics above, but once this system is put under load, all of those 4 children per site will be allocating extra memory for variables and php state, so their memory usage will quickly balloon up. With 1 php-cgi parent and 4 children, those 4 children are shared among several subdomains of the site and will usually be the only children alive. However, with loads of php-cgi parents, each will have 4 children each with their own private memory areas full of data. This must be what all of those people warning against php-cgi were talking about! It's not so much that php-cgi uses a lot of memory for each extra parent instance; it's that the data in each child in-use causes them to balloon up individually. Makes perfect sense now.

Submitted by JamieCameron on Wed, 02/20/2013 - 21:22 Comment #7

I wonder how large those processes get when under load and hosting a complex and high traffic website?

I guess if both the main domain and sub-servers are all hosting complex webapps the total memory usage could be higher. So in that case, sharing PHP processes would help.

Medium term I do plan to add support for php-fpm, which I assume would be a bigger memory saving.

Submitted by aitte on Fri, 02/22/2013 - 05:41 Comment #8

Yeah php-fpm is the best way to run php, and hopefully becomes available in the new CentOS / RHEL version later this year.

By the way, I just did some thinking, and came up with the best solution yet:

Each user gets a single php-fcgi wrapper; just one per user
After I create any site, I will manually disable the autocreated fcgi script "now and on boot" and delete it from /etc/init.d
I will then edit the newly added site's configuration and point it at the user's single own php-fcgi
The php.ini configuration for the per-user php-fcgi will contain APC; the cache is shared among the parent and all of its children and persists even as children spawn/die. This php bytecode caching leads my user's sites to speed up page generation speed by 3-10x, so that they become snappier and can serve more people without constantly going to disk to re-compile the php source files.
The APC cache size can be tweaked as needed per user, to be able to fit all the compiled PHP code of all of their domains
Currently, Virtualmin sets up php-cgi as a TCP socket on a port; I will change my system to use a unix socket file on disk instead, for even more efficiency.
I will store sessions in /home/[user]/php5/session/, the ini in /home/[user]/php5/php.ini, the log in /home/[user]/php5/php.log, the pid in /home/[user]/php5/php-fcgi.pid, and the socket in /home/[user]/php5/php-fcgi.sock (fastcgi_pass unix:/home/user/php5/php-fcgi.sock)
The only, literally only, downside is that you no longer get one php.ini per-domain, but that's something that can be solved by giving them more than one php-fcgi wrapper if they truly need it
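As a rough illustration of the layout in the list above, this sketch only builds the command line for one per-user php-cgi parent and prints it, without starting anything. The function name, the example username and the default of 4 children are assumptions for illustration. PHP_FCGI_CHILDREN and the php-cgi options -b (bind address or socket path) and -c (php.ini location) are standard, but this is not necessarily how the author's own wrapper does it.

```shell
#!/bin/sh
# Build (but do not run) the command for one per-user php-cgi parent
# listening on a unix socket, using the /home/[user]/php5/ layout above.
# The function name and the default of 4 children are illustrative.
php_fcgi_command() {
    user=$1
    children=${2:-4}
    base="/home/$user/php5"
    echo "PHP_FCGI_CHILDREN=$children php-cgi -c $base/php.ini -b $base/php-fcgi.sock"
}

php_fcgi_command exampleuser
```

The printed command would have to be run as that user for the per-user isolation to mean anything, and each of the user's sites would then point at the same socket with the fastcgi_pass unix:... line shown in the list.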

This is just the best solution from a hosting perspective. It leads to the best utilization of CPU and memory and allows super efficient APC caching to take place without waiting for php-fpm to become widely available. But, it is not the best solution when it comes to Virtualmin, since what I have described above would conflict with most of the hardcoded assumptions in Virtualmin about where things are stored.

So I'll just make this as a hack for myself, patching out the parts of the Virtualmin code that create the per-site fcgi process. Hopefully when php-fpm arrives, this kind of efficiency will be something that Virtualmin users at large get to enjoy.

By the way: as for php-fpm, you will need to open one socket/port per php.ini even then. Maybe it's time to casually begin looking into the Virtualmin script installers, and creating a conflict resolution system that looks at installed scripts for all domains for a user and what php.ini changes they've applied, and if there are no php.ini conflicts it just re-uses the main php-fpm/php-cgi for the user; if there are conflicts, it shows them and asks the user which of the two conflicting values they want to keep (or enter a new custom value that would satisfy both scripts), or if they'd rather create a new php-cgi/php-fpm listener for the new domain as a last resort. For instance, a conflict between the maximum memory might have one site want 32m and another 128m, so you'd choose to keep the latter option during the conflict resolution, without needing to make a new listener.

The thing is, that even with php-fpm, the apc cache is not shared between the various listener sockets owned by the same user, thereby wasting RAM yet again by setting up separate caches per domain. So it may be time to look at always doing things more efficiently at the Virtualmin level, rather than creating php.inis for every domain as a default.

In a more efficient setup, you'd have one php.ini in the user's main home folder, and only ever have per-site ones in the per-domain folders for the rare, rare, rare occasion that the conflict resolution couldn't solve a problem.

This system would need two supporting features: get_domain_php_ini when creating a new domain would return the home-folder path at all times, and a separate cache would have to be maintained containing a list of home-folder php.ini values inserted/edited by various installed scripts; any time you install a script, it looks at this cache to see if that value has already been touched by a previous installer. If not, it's free to instantly re-use the user's main php process. If it finds a conflict, it's up to the user to pick the value to keep, or to enter a custom value (such as a value that merges both options), or as a last resort to create a new php.ini to avoid the conflict entirely. Under php-fpm, the latter, while bad, is still quite efficient and only means that the user loses out on having more efficiently shared APC caching, and wasting extra ram through the additional APC cache that's set up, but other than that big drawback, there are no other serious downsides to spawning per-site php.inis under php-fpm. Still, in the interest of efficiency and doing what's best for the user, server performance, and hardware utilization, it's absolutely important to be very careful about when you decide to create a new php.ini.
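A minimal sketch of the lookup such a cache would need. Everything here is hypothetical: the plain "key=value" cache file, the function name ini_check and the three result strings are invented for illustration and do not correspond to anything in Virtualmin.

```shell
#!/bin/sh
# ini_check CACHE KEY VALUE
# CACHE is a plain "key=value" file recording php.ini settings that
# earlier script installers changed (a made-up format for this sketch).
# Prints "free", "same", or "conflict:<existing value>".
ini_check() {
    existing=$(awk -v k="$2" 'index($0, k "=") == 1 { print substr($0, length(k) + 2); exit }' "$1")
    if [ -z "$existing" ]; then
        echo "free"                  # nobody has touched this setting yet
    elif [ "$existing" = "$3" ]; then
        echo "same"                  # already set to the value we want
    else
        echo "conflict:$existing"    # the user has to pick a value
    fi
}
```

For example, with a cache containing memory_limit=128M, a new installer asking for memory_limit 32M would get conflict:128M, and the user could simply keep the larger value instead of spawning a new listener. A setting recorded with an empty value would be reported as free, which a real implementation would need to handle.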

I have no horse in this race other than advising you on what really is the best solution. Personally I'll migrate all sites to the manually managed system I just described, so that I get to enjoy maximizing hardware utilization and having efficient APC caching today, rather than in the future if ever. ;) If Virtualmin continues being inefficient it really doesn't affect me, since I'll just keep re-applying my patch to take out the per-site fcgi creation.

Submitted by aitte on Fri, 02/22/2013 - 05:40 Comment #9

pastebin.com/9X3iVWxz

Feel free to analyze it, Jamie. It sets up a top-modern php5 fastcgi wrapper with complete APC cache support (a single, shared cache for all connections, ensuring cache utilization) and proper run-state, and throws away all of the old junk.

Whether Virtualmin improves or not doesn't matter to me; I've written that code to migrate users over to the right way of doing things and I have a modern, efficient system as a result, so it doesn't matter at all to me if Virtualmin stays behind. The only reason I shared the code is to help you in case you're open to improving Virtualmin, if so then have a look at how I've done things over there. If anything, it would be a nice thing to do for all the other users, giving them reduced RAM and CPU usage, fully working APC caching, and a cleaner system. This new setup is pretty much as efficient as php-fpm.

The code has been set to "never expires" and anyone is free to read it, but I have to warn you that it assumes that the top-level /home/[user] folder is set up according to this system: virtualmin.com/node/25456

To anyone reading: it's written to be as generic as possible and should work on every Linux distro, but it's only been tested on CentOS/RHEL. I don't provide any support or explanations beyond the fact that the code will break your system if you aren't running a "dummy top-level server" setup as described in the other thread above. Moreover, you must ensure that your websites are in /etc/nginx/sites-available. I didn't release this for regular users to use, but if you really want to switch to it then at least you've now been warned about what you must do.

Submitted by aitte on Fri, 02/22/2013 - 05:11 Comment #10

New, final version which just adds a "--apc-nostat" config option, which is extremely useful for great performance on sites where source code files never change. It gets rid of the stat() calls, allowing APC to serve files immediately from memory rather than having to stat() every file. It's the kind of final-few-percent tuning that you should definitely be doing on a production server, and I was doing it so often that I figured I'd add it as an option to the tool.
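For readers who cannot see the pastebin script, this is roughly the kind of php.ini fragment such an option would produce. It is a guess at the effect, not a copy of the tool's output: apc.enabled, apc.shm_size and apc.stat are real APC settings, but the function name and argument order are invented here, and older APC releases expect apc.shm_size as a bare number of megabytes without the M suffix.

```shell
#!/bin/sh
# Print APC settings for one user's php.ini.
# $1: cache size in MB
# $2: pass "nostat" to stop APC from stat()-ing files on each request
apc_ini() {
    printf 'apc.enabled=1\napc.shm_size=%sM\n' "$1"
    if [ "${2:-}" = "nostat" ]; then
        printf 'apc.stat=0\n'
    fi
}

apc_ini 32 nostat
```

With apc.stat=0, changed source files are not picked up until the cache is cleared or the php-cgi parent is restarted, which is why it only suits sites whose code rarely changes.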

pastebin.com/9X3iVWxz

Speaking of which; in case anyone (jamie) is curious, installing APC on CentOS is done as follows:

    # yum -y install gcc make pcre-devel php-devel php-pear pecl
    # pecl channel-update pecl.php.net
    # pecl install apc
    (accept all defaults during apc build configuration)
    # echo -e "extension=apc.so\napc.enabled=1" > /etc/php.d/apc.ini

This compiles APC and forcibly enables it globally for all sites (because /etc/php.d/ is read after the user's php.ini), thus ensuring that users cannot disable APC and slow down the server.

As you're probably aware, APC lets a server handle 3-10x more page requests per second, by caching the PHP bytecode instead of re-compiling the whole site on every page load, and then only re-compiling if the code has actually changed. There's no sense in letting users defeat the performance benefit by disabling it, hehe.

Oh and APC massively increases the performance of fastcgi, since each child can process requests much faster and can therefore handle a lot more requests per second, meaning that even a low number of children can serve a lot of people.

Well, just thought I'd leave this here as well, for completeness. How this information is put to use (or ignored) isn't my concern, but now it's out there if you ever need to refer to the proper way of doing things.

Submitted by tpnsolutions on Thu, 02/21/2013 - 22:50 Comment #11

Hi,

Thanks for the excellent tip, we've enabled this within our infrastructure and it appears to be working like a charm. Cheers!

-Peter

Submitted by aitte on Fri, 02/22/2013 - 06:09 Comment #12

Hello Peter! That's funny, I never expected anyone else to be using this. I mean, sure, my code is the result of a day of work and is 100% correct, efficient and ready for production use, but I didn't exactly explain how to use it in depth, because it was only meant for myself, and for Jamie on the off-chance that he wanted to make Virtualmin do the right thing.

And yeah, it sure feels great to get to enjoy blazingly fast PHP performance, rather than the bloated mess that Virtualmin sets up.

So that's 2 production systems (yours and mine) where it's now Working Like A Charm.

See, this is why I take the time to share code. You never know when someone else will benefit. ;-)

I've seen you around the forums, helping people for free too, so I know that you share this mindset. Sharing makes the world go 'round, hehe.

PS: You'll want to re-download it from pastebin.com/9X3iVWxz (just uploaded) and re-export all PHP-FastCGI wrappers. I discovered that one of the last-minute additions had been using "$var" instead of "\$var", which didn't actually cause any issues at all (it was in a part of the code which could even have been left out but was included because I'm a perfectionist), but it's still good to export the final versions. Or you could leave it as-is, because again, it was a non-issue to begin with. ;-) However, the new link contains the final, "gold master" version of the code, never needing to be touched again.

Here are the final versions:

/usr/bin/php5fcgi-create:
   pastebin.com/9X3iVWxz
/usr/bin/php5fcgi-delete:
   pastebin.com/yiHXKhJq

Installing them in /usr/bin makes them very easy to use; you'll just run "php5fcgi-create [username]" in almost 100% of cases. In case it's a production site where code doesn't change often (i.e. a webmail system you have set up for your customers), you'd run "php5fcgi-create --apc-nostat [username]" for even more efficient page serving. The other options are useful in some cases as well, such as giving some users larger or smaller APC caches. However, most people will be fine with the 32MB default, since even a massive site framework like Wordpress only takes up 10MB of cache in total. Most scripts/sites take up less than a megabyte, meaning that a user can boost lots of sites with their 32MB of cache space. But the option is there to customize things further on a per-user basis if desired. Just run php5fcgi-create and it'll explain the command line options.

As for php5fcgi-delete, it does the opposite (stops and removes the new, modern wrapper), and is intended to be run just prior to deleting a user from the system. It only takes one argument; the username.

And again, for anyone coming late to this party, beware that these installers require some initial system setup steps and will break your system if you haven't set things up properly beforehand (carefully read post #9 and down in this thread).

I never intended for these to be widely used (basically only shared them as a reference for Jamie), and it's entirely up to the end user to carefully read the two threads related to these issues, to get an understanding of how the system should be configured before this modern PHP-FastCGI system can be used. In short, you need the Nginx webserver, site configurations stored under /etc/nginx/sites-available, and a "dummy domain" as the top-level entry for each user. The two threads explain the latter requirement in detail. Again: Warning, this is for people that want APC and a blazingly fast PHP system and are willing to put in the time to carefully read two long threads. It is not for beginners or the faint of heart, because it goes completely contrary to the inefficient way that Virtualmin wants to set up sites, throwing out the slow Apache and the bloated per-domain PHP engines, replacing it all with Nginx and a powerful per-user (rather than per-domain) PHP-FastCGI + APC bytecode caching system. As an added bonus, when fully implemented it massively cleans out each user's home/domain folders, making for an easier and cleaner hierarchy for end-users.

Submitted by aitte on Wed, 02/27/2013 - 11:18 Comment #13

It has just come to my attention that I was mistaken in thinking that my system was (as I put it) "nearly as good as PHP-FPM." In fact, it turns out that my system is much, much, much better than PHP-FPM.

"Wait, how can this be?!" I hear you ask through the spy microphone I have installed in your book case.

Well, I am glad you ask. Let's step through it, shall we?

Stability:

PHP-FPM: A single process handles all per-user children. When that parent process crashes, you temporarily bring down all websites on the entire system.
My System: Each user runs their own isolated parent process, and if it crashes, only their sites are affected.

Memory:

PHP-FPM: All PHP modules + the core are loaded once, into the main PHP-FPM binary, which then runs children by fork()-ing. Memory usage is very low (about 4MB per user on the system for the interpreter).
My System: A single, shared, system-wide memory area is set up for all PHP modules + the core. The per-user processes run the executables from shared space, and only maintain their own per-process data storage for the internal runtime state of the PHP engine/modules. Memory usage is very low (about 6MB per user on the system for the interpreter).

Bytecode Caching (APC):

PHP-FPM: Because all per-user processes share the same PHP-FPM parent process, they do not get per-user bytecode caches. The entire system gets ONE bytecode cache shared by every user.
My System: Every user gets their own, isolated cache, ensuring that we can give one user a 32 MB cache and another user a 10 MB cache, and yet another user may get no cache at all.

Security (Bytecode Caching):

PHP-FPM: Because the entire system only gets ONE bytecode cache, it means that EVERY user on the system can read each other's APC-cached data. This is a massive security vulnerability, because APC caches all code and all user-data, meaning that the cache contains everything from SQL passwords and admin passwords to other sensitive data.
My System: Every user gets their own, isolated cache, ensuring that they cannot snoop on each other.

Security (Overall):

PHP-FPM: Every user runs as a child of a main PHP-FPM. The main PHP-FPM process must run as root in order to be able to "step down" (setuid) to the different users to run their PHP children with their privileges. When (when, not "if"), a 0-day exploit is discovered in PHP-FPM that allows a child to escalate its privileges to the root privileges of the PHP-FPM process, then your entire machine is rooted and wide open. Preventing this would require extra kernel-layer protection to jail the PHP-FPM process.
My System: Every user has their own, isolated process, ensuring that NO exploits can run with higher privileges than the user that was hacked.

PHP Configuration:

PHP-FPM: A single process handles all per-user children. That process reads from the main /etc/php.ini file. There is no way to do per-user php.ini files. Therefore, users cannot tweak settings to suit their sites.
My System: Each user runs their own isolated parent process, and gets their own per-user php.ini files, which they can tweak whichever way they like.

Summary:

PHP-FPM: Useless, insecure garbage. Users can snoop on and infect each others' APC data and access each other's passwords and SQL passwords since it's all cached in one system-wide memory location with full user-read/write privileges. The entire system can be crashed from a single point of failure when PHP-FPM dies. There is no easy way to configure PHP on a per-user basis. All exploits will allow compromising the security of the entire server machine by attaining root access via PHP-FPM.
My System: The only right, secure way of doing things. Memory usage is extremely low yet each user has their own, securely isolated PHP runtime state. Users have their own, individual per-user APC caches ensuring that data cannot leak or be infected by other users. APC caching is done properly and shared among all PHP worker children on a per-user basis. Each user gets their own per-user PHP configuration (php.ini). Exploits can only execute with the privileges of the affected user (never as root). If their process crashes, only their sites suffer, rather than all sites on the entire server. Plus, my system allows any server to deploy a perfect APC setup today, on any version of PHP, to speed up their websites by 3-10x.

To put it with as few swear words as possible: I will never, ever deploy the PHP-FPM garbage on a server for as long as I live, and neither will any other intelligent admin. ;-) I feel bad for those that will suffer through it not knowing how bad and dangerous it is. I can only hope that RedHat continues to compile their PHP binaries without FPM support, for the users' sakes.

Submitted by tpnsolutions on Fri, 02/22/2013 - 09:29 Comment #14

Hi,

Just got your email this morning, and in fact we're using Apache and have installed APC system wide across 3 of our clusters which seems to be giving things a slight performance increase. I haven't noticed any spiking memory issues yet, but like any solution implemented into our production environment, I monitor the changes for at least 24-48 hours and roll back changes or further adjust them if they prove to be less than useful.

One day, one day... I'll switch to making use of nginx, but for now we're content with good ole Apache.

Cheers!

-Peter

*** I enjoy helping people out, which is why I do so professionally as a career... sharing is caring! ***

Submitted by aitte on Sat, 02/23/2013 - 12:46 Comment #15

Hello Peter!

In that case, Apache needs to be tweaked to spawn PHP via FastCGI, and to only spawn 1 parent PHP process per user (which in turn spawns the individual worker children that handle the requests). That is how you ensure that each user gets a single shared cache, since caches are set up on a per-parent basis.

Otherwise, each user will have multiple PHP parent processes, meaning multiple independent caches which will balloon up until your server implodes, not only wasting gigabytes of RAM but also ensuring that the cache is under-utilized since bits and pieces are cached here and there in separate processes, instead of having it all in one location.

You can read about how to fix this behavior in Apache over at brandonturner.net/blog/2009/07/fastcgi_with_php_opcode_cache/

The reason I suggested Nginx is that it makes this kind of setup a heck of a lot easier, and of course offers that beautifully low memory footprint and ultra high performance that allows our servers to handle 5x more connections per second than Apache ever did, while barely ever breaking above 100MB of RAM and 30% CPU, hehe. (Apache under the same load used to crash after using god knows how much RAM.)

And yes, I have seen you around the forums, and one of the most impressive things I ever saw was when you out of the blue offered to help a user troubleshoot his system via a remote control session, without asking for anything in return. I also like to believe in "paying it forward." I do good things because it's a heck of a lot more fun than being a grumpy little bastard, and I ask for nothing more than that people keep sharing and enjoying the ideas. ;-)

Submitted by tpnsolutions on Fri, 02/22/2013 - 10:24 Comment #16

Hi,

I'll take a peek at the link in a bit, thanks!

The primary reason we haven't switched to nginx is that I've been using Apache for over a decade, and it's what I know best. While I'd agree nginx has set a new standard in the industry for speed and performance, and certainly has not gone unnoticed, until I have enough spare time to really test it out I'll stick with Apache, though I've been itching to move to nginx for quite some time.

*** I appreciate your compliments regarding my activity in the community, and certainly do enjoy helping out whenever I can. Today I offer assistance in the forums, FREE one-on-one training, and in the near future GROUP training... Email me if you want to learn more :-) ***

-Peter

Submitted by aitte on Sat, 02/23/2013 - 13:47 Comment #17

Jamie: I forgot yet another reason to use the setup I proposed earlier.

With the current Virtualmin system, I can log in as user "hacker" and access the files of user "victim".

This is me, as user "hacker" (the abbreviated version):

  $ ls /etc/nginx/sites-available
  -rw-r--r-- victim.com
  $ cat /etc/nginx/sites-available/victim.com
  fastcgi_pass localhost:9001;
  $ telnet localhost 9001
  <?php
     echo file_get_contents('/home/victim/mysecrets.txt');
  ?>
  This is my secret file, only I can read it, I hope...
  (you were wrong; anyone can connect to your PHP listener port and issue PHP commands under your privileges)

Unix sockets housed under the user's home folder solve that. Nobody but root or the user themselves can read the socket file housed in their home folder.
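
As a rough illustration of that idea (not the code from the pastebin), here is a minimal Python sketch of binding a Unix socket so that only its owner can connect. The function name `bind_private_socket` and the temporary directory standing in for a folder inside the user's home are assumptions made for the example.

```python
import os
import socket
import stat
import tempfile


def bind_private_socket(path):
    """Bind a listening Unix socket at `path` with owner-only permissions."""
    # Tighten the umask before bind() so the socket file is never
    # group/world accessible, not even briefly.
    old_umask = os.umask(0o177)
    try:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.bind(path)  # bind() is what creates the socket file
    finally:
        os.umask(old_umask)
    sock.listen(4)
    return sock


if __name__ == "__main__":
    # Hypothetical stand-in for a directory under the user's home folder.
    home_etc = tempfile.mkdtemp()
    sock_path = os.path.join(home_etc, "php-fcgi.sock")
    listener = bind_private_socket(sock_path)
    # Show the permission bits of the socket file that was just created.
    print(oct(stat.S_IMODE(os.stat(sock_path).st_mode)))
    listener.close()
    os.unlink(sock_path)
```

Unlike a TCP port on localhost, which any local account can connect to, the socket file is protected by ordinary file permissions, so the web server user would still need to be granted access explicitly (for example through group ownership).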

I am glad that I'm able to program and can repair and improve Virtualmin to get a fast and secure setup, no matter what happens or fails to happen with the official version. Shame about the rest of the users though, if this isn't fixed...

The proposed code solves security issues, memory/CPU bloat issues, and enables efficient APC caching for a 3-10x site speedup, and it's yours if you want it. Either way I'm running it on all systems now with incredible results. All the information you need is in this thread and in the linked pastebin code.

Submitted by JamieCameron on Sat, 02/23/2013 - 20:09 Comment #18

Wow, that is a massive hole - I didn't realize that the fcgi protocol allowed arbitrary PHP code to be run like that. I will look into switching the Nginx module to using a socket file instead.

Submitted by aitte on Sun, 02/24/2013 - 09:47 Comment #19

Indeed.

Check my pastebin code above for a perfect implementation of socket files.

And as for the rest: Remember to read the sections above on why PHP-FPM won't save you, at all, and should be avoided.

Submitted by JamieCameron on Mon, 02/25/2013 - 22:03 Comment #20

So I did some testing, and with the default Virtualmin Nginx setup I wasn't able to just feed an arbitrary PHP script to the php-cgi process like you did in comment 17. In fact, based on reading the FastCGI protocol at http://www.fastcgi.com/devkit/doc/fcgi-spec.html , I can't see how it even allows arbitrary code to be injected like that.

EDIT : Regardless, you are still correct about the hole. The FastCGI protocol allows the caller to specify an arbitrary script to run with the server's permissions, which could be a script in the attacker's directory.

Submitted by JamieCameron on Mon, 02/25/2013 - 23:38 Comment #21

Further update - the next release of the Nginx plugin will use socket files instead of TCP connections to talk to PHP in new domains.

Submitted by tpnsolutions on Mon, 02/25/2013 - 23:40 Comment #22

Hi,

As you suggested, APC started to make use of a large amount of resources. So for the moment, I've simply disabled the feature while I investigate more options.

-Peter

Submitted by aitte on Tue, 02/26/2013 - 11:27 Comment #23

Jamie: I said that the example was abbreviated (I also didn't want anyone reading this to be able to exploit it just by reading my description). A real attack would first have to write to /tmp/exploit.php (or any other world-writable/readable location which the hacker and victim can both access), then connect to the socket, issuing the FastCGI parameters to make the victim execute that file. Good thing you fixed it. If I was the one writing the fix I'd also do the responsible thing and write a small bootstrap script installed in the next update that converts every existing site over to the socket method and then deletes itself.

By the way, it's cute how you're not even commenting on the APC and PHP-FPM issues (nor my perfect, working solution) - but just know this whatever you do: Never, ever implement PHP-FPM. It's a giant security hole and a waste of time and doesn't even solve the coveted APC issue for you. So you're just going to have to start telling users that you're not going to be implementing APC support at all. ;) Maybe even point daring users to this thread, which has working APC solutions for both Apache and Nginx.

Peter: Yeah, either use Nginx or the Apache tutorial I linked to and all those APC memory issues will go away. The key thing is just to ensure that each user gets 1 and only 1 PHP FastCGI parent-process.

Submitted by JamieCameron on Tue, 02/26/2013 - 12:06 Comment #24

I'm not commenting on the APC or PHP-FPM issues because I haven't yet decided which (if any) of those solutions to implement :-)

I'm still debating whether to update the PHP port for existing domains - even though the old behavior is insecure, doing a mass update runs the risk of breaking something. In the past when we applied a security fix like that automatically, users got annoyed :-(

Submitted by aitte on Wed, 02/27/2013 - 11:11 Comment #25

Apologies; it looked like you were doing the "na na na na can't hear you" thing, hehe.

Feel free to ask me if you have any questions about the method I've outlined in this thread.

About an automatic security update; well, I actually do that thing in my automatic conversion script.

In your case it would just have to loop through every Nginx server{} block, find the fastcgi_pass line, see if it still uses "fastcgi_pass\s+localhost:\d+" format, and if so:

1. Stop the PHP service for that domain.
2. Create the socket-file under $HOME/etc/php5/php-fcgi.sock and set its owner/permissions.
3. Edit /etc/init.d/php-fcgi-domain-com to bind to that socket file instead.
4. Start the PHP service again.
5. Edit the fastcgi_pass line to point to that new socket file.

Then just issue "service nginx reload" to re-read the new configurations.
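
The config-editing part of that loop (detecting the old TCP format and rewriting the fastcgi_pass line) could look roughly like the Python sketch below. It is only an illustration of the steps described here, not the pastebin script; the helper names `socket_path_for` and `convert_fastcgi_pass` are made up for the example, and stopping or starting the PHP service and editing the init script are left out.

```python
import re

# The old TCP form quoted in the post: fastcgi_pass localhost:<port>;
TCP_PASS = re.compile(r"(fastcgi_pass\s+)localhost:\d+(\s*;)")


def socket_path_for(home):
    """Socket location suggested in the post: $HOME/etc/php5/php-fcgi.sock."""
    return home.rstrip("/") + "/etc/php5/php-fcgi.sock"


def convert_fastcgi_pass(config_text, home):
    """Point TCP fastcgi_pass lines at the per-user socket file.

    Returns (new_text, changed). Configs that already use a socket are
    returned untouched with changed=False, so re-running is harmless.
    """
    target = "unix:" + socket_path_for(home)
    new_text, count = TCP_PASS.subn(
        lambda m: m.group(1) + target + m.group(2), config_text
    )
    return new_text, count > 0


if __name__ == "__main__":
    sample = (
        "server {\n"
        "    location ~ \\.php$ {\n"
        "        fastcgi_pass localhost:9001;\n"
        "    }\n"
        "}\n"
    )
    converted, changed = convert_fastcgi_pass(sample, "/home/victim")
    print(changed)
    print(converted)
```

Because already-converted files are left alone, a script built this way can be run over every site repeatedly without touching domains that have been migrated.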

This couldn't possibly break anything. I should know since I do almost the exact same thing over at pastebin.com/9X3iVWxz

Also: If you want to try that program to see how it works, all you'd have to do is install APC as described in the thread, then create a dummy parent-domain as discussed (named after the admin username, with only the "Webmin login" and "MySQL database" features enabled), then create one or more Nginx domains as sub-servers, then run php5fcgi-create [user] for that user, and voila, working APC system along with a massive cleanout in the home-folder structure. The process for APC support in Apache would be very similar but a bit more complex and would have to be adapted using the method linked in the post I gave to Peter.

And trust me, I am well aware that this conflicts with the inefficient way in which Virtualmin used to do things, but sometimes change for the better can't be done any other way. All it would need is a php.ini conflict resolution system that detects whether a script installer's option already exists in php.ini and if so asks the user if they want to keep the old or new value or write their own custom value for the conflicting option. All scripts can be made to share a single php.ini without any drawbacks and it's just a matter of configuring options in a way that accommodates both scripts. Either way it's just a suggestion, because it's running perfectly on my systems no matter what happens. :-) I am no longer afraid of ultra-heavy PHP frameworks consisting of thousands of .php files, such as Expression Engine, because all code is pre-compiled with APC and runs quickly no matter what.

Submitted by aitte on Wed, 02/27/2013 - 11:26 Comment #26

I am going on a two week business trip and thought I'd let you know since I've been keeping you company here for a while now. :-P

There's not much else to do. All of the systems I administer are working perfectly - I've reported every issue I came across, from big to small to major, and we've seen a lot of changes, and I just don't think there's much more for me to help with around here.

So I probably won't be coming back after the trip. The only remaining things are this optional APC implementation (which would greatly benefit Virtualmin users) and the more efficient procmail script (I bumped that issue so that you can see it again), and all of the required information is in both threads. I am already running both on my systems with fantastic results.

Well, that's all. It's been interesting. Take care, man :)

Submitted by eddieb on Tue, 05/21/2013 - 19:37 Comment #27

just putting in my 0 cents so i can subscribe to this post. interesting stuff... assuming it works as described (not to doubt aitte's capacity at all), it would be worth it for me switching to nginx if the whole process becomes built into virtualmin one day.

one thing is for sure: i'm staying away from php-fpm. guess i might have to wait for apache 2.4 to use APC.

Submitted by aitte on Tue, 05/21/2013 - 20:48 Comment #28

Thanks for your cents, even if they were just 0 cents. ;)

You don't have to switch to Nginx. I spoke with Peter Knowles (tpnsolutions) in this thread and showed him how to do it for Apache. See post 15.

At its absolute simplest (but not the best performance), you can install and enable the PHP-APC extension, switch Apache to use FastCGI (via mod_fastcgi, not mod_fcgid), and set it to only spawn 1 FastCGI worker per user. Make sure to then set that FastCGI worker to spawn around 4 children so that it can handle more than 1 request at a time. By doing this, you end up with Apache speaking to a single per-user FastCGI process, which in turn contains the PHP runtime and APC cache, and that process in turn houses the worker children which all share that per-FastCGI cache.

However, if you change nothing other than what I just described, then you only get per-website caches, not per-user caches. Per-website caches are a step up from the default per-worker caches, but still woefully inefficient and bloated.

Here are the various scenarios that are available:

1) Using Virtualmin as-is today and enabling APC gives you per-worker caches, the worst possible kind. Every website on your server spawns new workers in response to load, and may hover around 2-3 FastCGI workers, so a user with 2 websites and ~3 workers per site gets 2x3 = 6 workers, each containing a 32 MB cache, meaning that just those two websites take up 192 MB of RAM at all times. With 20 websites on the machine, you need 1.92 gigabytes of RAM just for the APC caches. Making matters worse, these workers are constantly being killed and re-spawned in response to load, meaning that the APC module is constantly unloaded from memory and all its cached contents wiped. So, enabling APC together with Virtualmin as-is just gives you a completely ineffective cache that almost never contains the cached data you want, and the massive number of per-worker caches ensures that your RAM will quickly fill up and crash your server. (This happened to Peter, just as I had warned him. You cannot use APC with Virtualmin out-of-the-box.)
2) Alternatively, you can keep a kinda-Virtualmin setup, but modify Apache as I just described, to make Apache use a single FastCGI process per-website which won't keep respawning itself and won't keep loading additional instances of the cache. This solution cuts the resource wastefulness down to a guaranteed single-cache-per-website. So a machine with 2 websites requires 64 MB of RAM, and 20 websites requires 640 MB of RAM. This is still very inefficient, as 20 websites can usually easily fit within a single cache (depends on how large the site codebases are, of course). However, at least your server's RAM usage is much more predictable than with the default Virtualmin setup, and you'll actually get some use out of the APC cache.
3) The last and best (from a performance standpoint) option is switching to Nginx and using my complete rewrite of the PHP-FastCGI daemon, my new user directory layout, my new Nginx configs, and everything else I've created, giving you the most efficient setup possible. Each user gets their own single cache, the size of which can be configured per-user. On a system with 20 websites, you just need 32 MB of RAM. On a system with 30 websites, you just need 32 MB of RAM. On a system with 100 websites, you just need 32 MB of RAM (that would be 9.6 gigabytes under Virtualmin's default setup), assuming that all those sites are running as 1 user. Basically, each user gets ONE cache no matter how many sites that user hosts. The server admin can set individual cache sizes, so that one user can get 0 MB (no cache, for instance this could be the default for all users on a cheap plan), another can get 128 MB (if they have many, heavy sites), and so on. This setup ensures that the cache size is tailored for each user, and that the cache will do its job properly and with optimal performance (under my setup, the cache will never be periodically flushed out, as would be the case in scenario 1, nor will it inefficiently fragment the cache among several processes leading to RAM bloat and cache misses and cache de-sync, as would be the case in scenario 2).
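
To make the arithmetic behind the three scenarios easy to re-run with your own numbers, here is a small Python sketch. The function name and parameters are made up for illustration; the defaults (32 MB per cache, about 3 workers per site) are the figures used above, and real worker counts will vary with load.

```python
def apc_cache_ram_mb(sites, users, scenario, cache_mb=32, workers_per_site=3):
    """Estimate the RAM (in MB) reserved for APC caches under each scenario.

    scenario 1: one cache per FastCGI worker (Virtualmin as-is)
    scenario 2: one cache per website (single FastCGI process per site)
    scenario 3: one cache per user (single FastCGI process per user)
    """
    if scenario == 1:
        return sites * workers_per_site * cache_mb
    if scenario == 2:
        return sites * cache_mb
    if scenario == 3:
        return users * cache_mb
    raise ValueError("scenario must be 1, 2 or 3")


if __name__ == "__main__":
    # One user hosting 2, 20 and 100 sites, under each scenario.
    for sites in (2, 20, 100):
        row = [apc_cache_ram_mb(sites, 1, s) for s in (1, 2, 3)]
        print(sites, "sites:", row, "MB")
```

Passing a smaller `cache_mb` shows how much a reduced cache size helps in scenario 2, and raising `users` shows how scenario 3 scales with the number of accounts rather than the number of sites.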

Of course, regardless of whether you use Nginx or Apache, these required tweaks aren't built into Virtualmin, and I can sympathize with not wanting to deviate from what works out-of-the-box.

However, if your system doesn't have many websites, then you may still want to at least do method 2 (Apache + FastCGI), which is the least invasive while still giving you some of the APC benefits. I believe it could be doable with just a few Apache configuration changes, as long as Virtualmin has some built-in way of switching to mod_fastcgi and telling it to only spawn 1 worker.

Either way, I wish you luck with your servers! I wouldn't wait for official changes if I were you, because the proposed changes in this thread are extensive and may possibly never make it into Virtualmin, even though they are much better than the default system. They simply change too much of how Virtualmin used to work, and therefore would have implications for paying customers that expect things to be how they have always been.

I wouldn't expect any changes until the day that the need for ultra-performance becomes greater and Apache has finally died. This is the best method of doing things, and this ticket will always remain here for reference.

Submitted by eddieb on Tue, 05/21/2013 - 23:21 Comment #29

"as long as Virtualmin has some built-in way of switching to mod_fastcgi and telling it to only spawn 1 worker."

that's where I stand. and mod_fastcgi was the first thing that came to mind when I realized I needed APC. then I discovered PHP-FPM and even made a couple posts asking for it. but i am now convinced i want to stay away from it.

making the "switch to mod-fastcgi" a built-in feature of Virtualmin would not affect existing customers and it probably isn't that complicated. I'm gonna hope for that (wink wink jamie cameron)

BTW, if nginx read .htaccess files I would have already switched... (i wouldn't care if I needed to restart nginx just to have it pick up changes made to .htaccess)

thank you for your time, aitte

Submitted by aitte on Wed, 05/22/2013 - 08:33 Comment #30

Making Virtualmin support mod_fastcgi should be very easy, so I'll throw my +1 towards that suggestion too.

If supported, it'd give people a pretty okay APC setup without any tweaking required. To combat the still-present RAM bloat of per-site caches, you could simply make the default cache size 10 MB or so. Even a heavy Wordpress site with loads of modules normally never even reaches 10 MB.

So, mod_fastcgi + tuning the APC cache size downwards from the default 32 MB should give you a good system. Thanks to only having 1 cache per site, you are guaranteed to avoid cache de-sync and cache misses, since it'll all be right there in a single process, just as APC was intended.

As for Nginx and .htaccess, I know your pain. I am lucky to be intricately familiar with Nginx and am very good at porting .htaccess files over to Nginx, most often doing a better job than other people's attempts at porting around the net. But it's definitely a hassle that so many webscripts rely on .htaccess, and it limits the software that I can offer official support for to the ones where I've ported the configurations over.

In the end it came down to speed vs convenience. Nginx is the only webserver that guarantees that it'll never exceed 100 MB of RAM use whether it handles 1000 connections/second or 30000 connections/second, whereas Apache explodes through the roof at about 5% of the way there. Nginx now powers a third of the largest websites on the internet, and even more if you count the sites that use Nginx as a proxy on top of Apache to speed up serving of static files. The lack of .htaccess support is the only major roadblock towards widespread acceptance, and unfortunately those files rely on so many Apache features and modules that they could never be officially supported by Nginx...

Submitted by eddieb on Wed, 05/22/2013 - 09:54 Comment #31

I'm interested in knowing if you have successfully ported WP w/W3TC modified htaccess & Magento standard (also w/ Varnish) htaccess to Nginx. BUT I don't want to start a tangent, so if you have something to say pls get in touch via http://j.mp/email2image.

THX!

Submitted by aitte on Wed, 05/22/2013 - 11:37 Comment #32

I don't run that combo myself, but looking at the config files I can see that it's really basic stuff. I might as well post the reply here, otherwise someone might stumble upon your post and want the same help.

Wordpress itself only needs like a single rewrite rule to convert pretty-URLs (like /my-blog/i-watched-tv-today/) to their internal handler (index.php?q=/my-blog/i-watched-tv-today/). That's done with either a single "-e (file exists?)" check which rewrites all missing files to index.php, or via a try_files directive which first looks for the literal file, and if not found sends the request to index.php. There are a few other small things to do as well, but all of them are mentioned at wiki.nginx.org/WordPress.

When it comes to W3 Total Cache, it adds a few special cached-content URLs that must be handled as well.

Disregarding the above site, you can get an all-in-one WP+W3TC configuration here, containing all you'll ever need to do: rtcamp.com/tutorials/wordpress-nginx-w3-total-cache/

Note that you would normally always want Nginx "location ~ \.php$" blocks to do a "try_files $uri =404;" check to combat a URL forging exploit. But, the Wordpress configuration above doesn't need the "=404" check, as it instead uses "try_files $uri /index.php;" which means that if the file doesn't physically exist, it then gets passed on to index.php, thus killing the exploit possibility. Still, I added a =404 check at the end, to avoid issues in the remote possibility that an attacker manages to delete index.php somehow. Just thought I should clarify that.
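
If the try_files logic is hard to picture, here is a toy Python model of it. It is a deliberate simplification, not how Nginx is implemented: `resolve_php_request` is a made-up name, and a plain set of paths stands in for the files that exist under the site root.

```python
def resolve_php_request(uri, existing_files, fallbacks=("/index.php",)):
    """Toy model of `try_files $uri <fallbacks...> =404;` in a PHP location.

    Returns the script that would be handed to PHP, or 404 if neither the
    requested URI nor any fallback exists as a real file.
    """
    for candidate in (uri, *fallbacks):
        if candidate in existing_files:
            return candidate
    return 404


if __name__ == "__main__":
    files = {"/index.php", "/wp-login.php", "/uploads/avatar.jpg"}

    # A real script is passed through unchanged.
    print(resolve_php_request("/wp-login.php", files))

    # A forged URL ending in .php that is not a real file never reaches PHP
    # as-is; it falls through to index.php instead.
    print(resolve_php_request("/uploads/avatar.jpg/x.php", files))

    # With no fallback (plain `try_files $uri =404;`) the forged URL is a 404.
    print(resolve_php_request("/uploads/avatar.jpg/x.php", files, fallbacks=()))
```

In both variants the forged path is never treated as a script in its own right, which is the point of the check.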

The linked configuration looks pretty good to me apart from the author's regexes, where he's used . (any character) instead of \. (literal period), and is using parenthetical captures "()" when he doesn't actually need capturing, which slows things down a tiny bit. He's also using debug-level error logs which is excessively spammy.
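
Both regex points can be checked quickly in Python, whose `re` syntax behaves the same as Nginx's PCRE for these particular constructs. The patterns below are illustrative and are not copied from the linked configuration.

```python
import re

# Unescaped dot: matches ANY character followed by "php" at the end.
loose = re.compile(r".php$")
# Escaped dot: matches only a literal ".php" extension.
strict = re.compile(r"\.php$")

# Capturing group: the engine records which alternative matched.
capturing = re.compile(r"\.(php|phtml)$")
# Non-capturing group: same matches, nothing recorded.
non_capturing = re.compile(r"\.(?:php|phtml)$")

if __name__ == "__main__":
    # "/notphp" has no extension at all, yet the loose pattern accepts it.
    print(bool(loose.search("/notphp")), bool(strict.search("/notphp")))
    print(bool(loose.search("/index.php")), bool(strict.search("/index.php")))

    # Same match, but only the capturing form stores a group.
    print(capturing.search("/a.phtml").groups())
    print(non_capturing.search("/a.phtml").groups())
```

In an Nginx location regex the same idea applies: write `\.` for a literal period and use `(?:...)` when the matched text is not referenced afterwards.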

So, I've cleaned up the entire configuration and fixed the flaws. Here's a non-expiring paste of a WP+W3TC configuration for you:

pastebin.com/qbKv1ad4

I leave testing it up to you, but everything is semantically correct now and uses best-practices. Refer to the rtcamp blog above for more information.

Enjoy!