Recently I had a 502 Bad Gateway error on nginx caused by a php5-fpm error. It seems I had a surge in web traffic and the PHP service couldn’t recover from it.
So I don’t know which option is better:
either fiddle with pm.max_children and the related config options in /etc/php5/fpm/pool.d/www.conf,
or switch to listening on a port instead of a socket.
I’ve had more luck listening on a port than on a socket when running php5-fpm under high load. Listening on a port can still produce some 502 errors, but it recovers from them. With sockets I’ve found that once it fails, it seldom recovers.
Here are some links to similar discussions on the internet :P:
Can you find the exact error in the logs to see what the actual problem was?
In /var/log/nginx/error.log I get:
2015/04/22 11:12:05 [error] 3486#0: *460674 upstream timed out (110: Connection timed out) while reading response header from upstream, client: xxx.xxx.xxx.xxx, server: xxxxxx.net, request: "GET /mail/ HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "xxxxx.net"
In /var/log/php5-fpm.log I get no errors.
Just for the record, I still have this problem. My solution right now is:
Adding this to the cron:
0 2 * * * service php5-fpm restart
And modifying these variables for php5-fpm:
pm.max_children = 100
pm.start_servers = 20
pm.min_spare_servers = 5
pm.max_spare_servers = 30
pm.max_requests = 500
I have a server with 4 GB of RAM.
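A common rule of thumb for sizing pm.max_children is to divide the memory you can spare for PHP by the average size of one php5-fpm worker. Here is a minimal sketch of that arithmetic. The 1024 MB reserved for the other services and the 40 MB per worker are assumed numbers for illustration only, not measurements from this server, and the function name is made up.

```python
def estimate_max_children(total_ram_mb, reserved_mb, avg_worker_mb):
    """Rule-of-thumb upper bound for pm.max_children.

    total_ram_mb:  physical RAM on the box
    reserved_mb:   RAM left for everything that is not PHP (assumed value)
    avg_worker_mb: average resident size of one php5-fpm worker (assumed value)
    """
    if avg_worker_mb <= 0:
        raise ValueError("avg_worker_mb must be positive")
    available_mb = total_ram_mb - reserved_mb
    # Never suggest fewer than one worker, even if the reservation is too large.
    return max(available_mb // avg_worker_mb, 1)


# 4 GB server from the post; the other two values are assumptions.
estimate = estimate_max_children(total_ram_mb=4096, reserved_mb=1024, avg_worker_mb=40)
print(estimate)
```

With these assumed numbers the estimate is 76, which is below the pm.max_children = 100 shown above. The real answer depends on how big the workers actually are on your machine, so measure that before trusting the result.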
Are you using ownCloud? I had a similar issue, and disabling APC(u) fixed it for me as a workaround until ownCloud 8.0.3.
I don’t think anyone is using ownCloud, but I’ll double-check with my users.
Following up on this one: it seems no one is using ownCloud. My next approach is to enable the php5-fpm slowlog to see if I can get a better log of the failure. Right now the logs I have don’t provide much information.
I also created a pull request to enable it by default on mailinabox, which I guess could help other users:
Well, now that I have the slowlog enabled, I have a better clue as to what is happening. Looking at the logs, I see two scripts that give me timeout errors:
[14-May-2015 12:31:47] [pool www] pid 30250
script_filename = /usr/local/lib/z-push/index.php
[0x00007f8177382318] sleep() /usr/local/lib/z-push/backend/carddav/carddav.php:315
[0x00007f8177382158] ChangesSink() /usr/local/lib/z-push/backend/combined/combined.php:474
[0x00007f8177381ec8] ChangesSink() /usr/local/lib/z-push/lib/core/synccollections.php:504
[0x00007f8177381bf8] CheckForChanges() /usr/local/lib/z-push/lib/request/ping.php:162
[0x00007f8177381a98] Handle() /usr/local/lib/z-push/lib/request/requestprocessor.php:131
[0x00007f8177381920] HandleRequest() /usr/local/lib/z-push/index.php:209
[17-May-2015 10:43:47] [pool www] pid 28530
script_filename = /usr/local/lib/owncloud//remote.php
[0x00007f2d7d480950] apc_store() /usr/local/lib/owncloud/lib/private/memcache/apc.php:21
[0x00007f2d7d480810] set() /usr/local/lib/owncloud/lib/autoloader.php:109
[0x00007ffcbdc2cfa0] load() unknown:0
[0x00007ffcbdc2d2f0] spl_autoload_call() unknown:0
[0x00007f2d7d4805c0] init() /usr/local/lib/owncloud/lib/base.php:522
[0x00007f2d7d4804b8] init() /usr/local/lib/owncloud/lib/base.php:1011
[0x00007f2d7d480378] +++ dump failed
I’ll keep investigating.
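If the slowlog grows, it may help to count which script and which innermost function show up most often. Below is a rough sketch that does this for entries in the format shown above. The function name and the regular expressions are my own, and the sample is a shortened copy of the two traces from this post; treat it as a starting point rather than a complete parser.

```python
import re
from collections import Counter

# Entry header, e.g. "[14-May-2015 12:31:47] [pool www] pid 30250"
HEADER = re.compile(r"^\[\d{2}-\w{3}-\d{4} \d{2}:\d{2}:\d{2}\]\s+\[pool \S+\] pid \d+")
# Stack frame, e.g. "[0x00007f8177382318] sleep() /path/file.php:315"
FRAME = re.compile(r"^\[0x[0-9a-f]+\]\s+(\S+\(\))\s+\S+")
SCRIPT_PREFIX = "script_filename = "


def summarize_slowlog(text):
    """Count (script_filename, innermost function) pairs across slowlog entries."""
    counts = Counter()
    script = None
    need_top_frame = False
    for raw in text.splitlines():
        line = raw.strip()
        if HEADER.match(line):
            # A new entry starts: forget the previous one.
            script, need_top_frame = None, False
        elif line.startswith(SCRIPT_PREFIX):
            script = line[len(SCRIPT_PREFIX):]
            need_top_frame = True
        elif need_top_frame:
            frame = FRAME.match(line)
            if frame:
                # The first frame after script_filename is where the request was stuck.
                counts[(script, frame.group(1))] += 1
                need_top_frame = False
    return counts


# Shortened copy of the two traces above.
SAMPLE = """\
[14-May-2015 12:31:47] [pool www] pid 30250
script_filename = /usr/local/lib/z-push/index.php
[0x00007f8177382318] sleep() /usr/local/lib/z-push/backend/carddav/carddav.php:315
[0x00007f8177382158] ChangesSink() /usr/local/lib/z-push/backend/combined/combined.php:474
[17-May-2015 10:43:47] [pool www] pid 28530
script_filename = /usr/local/lib/owncloud//remote.php
[0x00007f2d7d480950] apc_store() /usr/local/lib/owncloud/lib/private/memcache/apc.php:21
[0x00007f2d7d480810] set() /usr/local/lib/owncloud/lib/autoloader.php:109
"""

for (script, function), count in summarize_slowlog(SAMPLE).most_common():
    print(count, script, function)
```

To use it on a real log, pass in the contents of whatever file your slowlog setting points to.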
Very interesting indeed. I wonder how safe it is to update the PHP version when using mailinabox.
I am testing this because I have stability issues on a production server too. It looks like there is a real problem with APCu below version 4.0.6. I updated to ownCloud 8.0.3 on a mailinabox test machine, and the admin page shows this notice:
APCu below version 4.0.6 is installed, for stability and performance reasons we recommend to update to a newer APCu version.
Can you test a newer version of PHP and see whether it improves things?
There is an issue for this on mail-in-a-box now:
So I guess the next release will fix it. Or maybe just updating ownCloud with apt-get upgrade will do, once that fix is released.
Yes, I found it too. For now I have disabled the apcu.so extension and am waiting for the new mail-in-a-box version.
It’s easy to get the new ownCloud now
https://github.com/mail-in-a-box/mailinabox/commit/13093f1732ac544a2f03b1e761eb72974b08ec26 but I want to see whether PHP on the server fails again with APCu disabled.
After updating to MIAB v0.10 (OC 8.0.3) with APCu enabled, I had one PHP crash yesterday. Disabling APCu again.
This is the nginx error.log:
2015/06/08 09:19:05 [error] 1147#0: *1275 FastCGI sent in stderr: "PHP message: PHP Warning: apc_store(): GC cache entry 'oce46d4995vd/AutoloaderOC_Log_Owncloud' was on gc-list for 3651 seconds in /usr/local/lib/owncloud/lib/private/memcache/apc.php on line 21" while reading response header from upstream, client: 192.168.0.1, server: mx.test.cz, request: "PROPFIND /cloud/remote.php/webdav/ HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "mx.test.cz"
2015/06/08 09:22:17 [error] 1147#0: *1275 FastCGI sent in stderr: "PHP message: PHP Warning: apc_store(): GC cache entry 'oce46d4995vd/AutoloaderOC\Files\Stream\OC' was on gc-list for 3648 seconds in /usr/local/lib/owncloud/lib/private/memcache/apc.php on line 21" while reading response header from upstream, client: 192.168.0.1, server: mx.test.cz, request: "PROPFIND /cloud/remote.php/webdav/ HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "mx.test.cz"
2015/06/08 09:30:26 [error] 1147#0: *1453 upstream timed out (110: Connection timed out) while reading response header from upstream, client: xx.xx.xx.xx, server: mx.test.cz, request: "POST /mail/?_task=mail&_action=refresh HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "mx.test.cz", referrer: "https://mx.test.cz/mail/?_task=mail&_action=show&_uid=111&_mbox=INBOX&_caps=pdf%3D0%2Cflash%3D0%2Ctif%3D0"
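To see whether the timeouts and the APC warnings cluster on particular URLs, the error lines can be grouped by kind and request path. This is only a sketch: the category names and function names are made up, the sample lines are abbreviated versions of the ones above, and the request pattern assumes plain double quotes as nginx writes them.

```python
import re
from collections import Counter

# Matches e.g.: request: "POST /mail/?_task=mail HTTP/1.1"
REQUEST = re.compile(r'request: "(\S+) (\S+) [^"]*"')


def classify(line):
    """Put an nginx error line into a coarse category (names are hypothetical)."""
    if "upstream timed out" in line:
        return "upstream_timeout"
    if "FastCGI sent in stderr" in line:
        return "php_stderr"
    return "other"


def summarize_errors(lines):
    """Count error lines per (category, request path without query string)."""
    counts = Counter()
    for line in lines:
        match = REQUEST.search(line)
        path = match.group(2).split("?", 1)[0] if match else "unknown"
        counts[(classify(line), path)] += 1
    return counts


# Abbreviated versions of the log lines above.
SAMPLE = [
    '2015/06/08 09:19:05 [error] 1147#0: *1275 FastCGI sent in stderr: '
    '"PHP message: PHP Warning: apc_store(): ..." while reading response header '
    'from upstream, request: "PROPFIND /cloud/remote.php/webdav/ HTTP/1.1", '
    'upstream: "fastcgi://unix:/var/run/php5-fpm.sock:"',
    '2015/06/08 09:30:26 [error] 1147#0: *1453 upstream timed out '
    '(110: Connection timed out) while reading response header from upstream, '
    'request: "POST /mail/?_task=mail&_action=refresh HTTP/1.1", '
    'upstream: "fastcgi://unix:/var/run/php5-fpm.sock"',
]

for (kind, path), count in summarize_errors(SAMPLE).most_common():
    print(count, kind, path)
```

On a real server you would feed it the lines of /var/log/nginx/error.log instead of the sample list.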