Our ownCloud has around 600 users. At certain time (noon) everyday, it crashes. I have been analyzing the /server-status information from apache status module and noticed that the request for status.php started to pile up with "W" for Sending Reply. The resources and bandwidth seem fine(htop, iotop). The MySQL database shows a lot of sleeping processes when the status.php requests start to pile up. I have increased MaxRequestWorkers and mysql max_connections many many times and it still keep hitting the max and I started to think that there's bug within the ownCloud application. Here are screenshots of the server: http://imgur.com/a/7KnjU Any guide on how to troubleshoot this issue is greatly appreciated. Please let me know if more information is needed.
Thank you so much!
Steps to reproduce 1. Do not know how to reproduce. It happens at certain time every day.
Expected behaviour ownCloud server is expected to handle all the requests with no problem.
Actual behaviour The GET /status.php requests piles up and crash apache2 (server reached MaxRequestWorkers).
Server configuration Operating system: Ubuntu 14.04 Web server: Apache2 2.4.7 Database: MySQL 5.7.17-0ubuntu0.16.04 (Separate Server) PHP version: 5.5.9-1ubuntu4.20 OpenSSL version: 1.0.1f ownCloud version (see ownCloud admin page): 9.1.3 Updated from an older ownCloud or fresh install: Updated from an older ownCloud. Special configuration (external storage, external authentication, reverse proxy, server-side-encryption): Redis
[mpm_prefork:notice] [pid 21035] AH00169: caught SIGTERM, shutting down
[mpm_prefork:notice] [pid 2275] AH00163: Apache/2.4.7 (Ubuntu) PHP/5.5.9-1ubuntu4.20 OpenSSL/1.0.1f configured -- resuming normal operations
[core:notice] [pid 2275] AH00094: Command line: '/usr/sbin/apache2'
[ssl:error] [pid 3380] [client 64.41.200.104:39208] AH02042: rejecting client initiated renegotiation
[:error] [pid 9882] [client 207.228.78.106:27263] PHP Fatal error: Class 'OCA\\DAV\\Connector\\Sabre\\ExceptionLoggerPlugin' not found in /var/www/html/owncloud/remote.php on line 54
[:error] [pid 9959] [client 198.254.194.84:64341] PHP Fatal error: Class 'OCA\\DAV\\Connector\\Sabre\\ExceptionLoggerPlugin' not found in /var/www/html/owncloud/remote.php on line 54
[:error] [pid 9936] [client 174.90.251.187:53598] PHP Fatal error: Class 'OCA\\DAV\\Connector\\Sabre\\ExceptionLoggerPlugin' not found in /var/www/html/owncloud/remote.php on line 54
[:error] [pid 9628] [client 10.9.3.190:56101] PHP Fatal error: Class 'OCA\\DAV\\Connector\\Sabre\\ExceptionLoggerPlugin' not found in /var/www/html/owncloud/remote.php on line 54
[:error] [pid 9996] [client 198.254.194.84:63397] PHP Fatal error: Class 'OCA\\DAV\\Connector\\Sabre\\ExceptionLoggerPlugin' not found in /var/www/html/owncloud/remote.php on line 54
[:error] [pid 9905] [client 207.228.78.57:6749] PHP Fatal error: Class 'OCA\\DAV\\Connector\\Sabre\\ExceptionLoggerPlugin' not found in /var/www/html/owncloud/remote.php on line 54
[:error] [pid 9427] [client 10.9.3.136:62628] PHP Fatal error: Class 'OCA\\DAV\\Connector\\Sabre\\ExceptionLoggerPlugin' not found in /var/www/html/owncloud/remote.php on line 54
[:error] [pid 9977] [client 199.126.9.18:50411] PHP Fatal error: Class 'OCA\\DAV\\Connector\\Sabre\\ExceptionLoggerPlugin' not found in /var/www/html/owncloud/remote.php on line 54
[:error] [pid 9956] [client 208.98.234.17:45985] PHP Fatal error: Class 'OCA\\DAV\\Connector\\Sabre\\ExceptionLoggerPlugin' not found in /var/www/html/owncloud/remote.php on line 54
[:error] [pid 9775] [client 198.254.194.84:58547] PHP Fatal error: Class 'OCA\\DAV\\Connector\\Sabre\\ExceptionLoggerPlugin' not found in /var/www/html/owncloud/remote.php on line 54
[:error] [pid 10022] [client 184.151.222.118:7151] PHP Fatal error: Class 'OCA\\DAV\\Connector\\Sabre\\ExceptionLoggerPlugin' not found in /var/www/html/owncloud/remote.php on line 54
[ssl:error] [pid 9685] [client 64.41.200.102:58705] AH02042: rejecting client initiated renegotiation
[core:notice] [pid 2275] AH00051: child pid 13976 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[core:notice] [pid 2275] AH00051: child pid 14267 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[mpm_prefork:error] [pid 14386] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
Integrity status for oC9+
Login as admin user into your ownCloud and access
http://example.com/index.php/settings/integrity/failed
No errors have been found.
Yes I can confirm that the file ExceptionLoggerPlugin.php does exist in /var/www/html/owncloud/apps/dav/lib/Connector/Sabre. Do you think this could be the culprit?
The only temporary solution right now is to restart apache2 everyday when it is about to hit the max request.
tflidd, I will try update the ownCloud to 9.1.4 but I highly doubt it would fix the issue.
From the access log, it seems to be doing a lot of server checking: "PROPFIND /remote.php/webdav/ HTTP/1.1" 207 1317 "-" "Mozilla/5.0 (Windows) mirall/2.1.1"
The weird thing is that it only doing this at certain time (noon).
Just to get everyone an update, it was actually caused by our backup job (Veeam + VMware Snapshots). The backup freezes ownCloud VMs for ~8 seconds and caused high spiked on status.php requests during peak hours.
I guess my question now is, is there a way to fix this so that the requests dont get stuck for a long time, perhaps in php or apache2 configurations?
I have been trying to do this but looks like the requests still pilling up. Put ownCloud on maintenance mode does not really do anything.
I'm wondering if there's anything else I could do from the apache/php side. Making changes on the VM/Veeam might be harder considering the infrastructure.
What do you mean with "maintenance mode does not really do anything"? When you put oC in maintenance mode, the clients will pause till you put oC back into working mode. The web frontend will also say "system in maintenance mode" etc. ...
What I meant was, enabling the maintenance mode does not remove the current massive stalled requests. The stalled requests only go away after restarting the apache2 service.
I cannot put the maintenance mode on during the backup because the backup job start time and duration varies. There is no other cronjob in the system other than for ownCloud.
For now, I have created a cronjob script to restart apache2 service a couple of times after the backup jobs finish. This seems to be working good so far. This is a temporary solution until I get the infrastructure side figured out.