Hello Experts,
Server Product : ColdFusion
Version : 10,0,18,296093(PreRelease)
OS : Unix
OS Version : 2.6.32-573.12.1.el6.x86_64
Adobe Driver : 4.1 (Build 0001)
Our CF server crashes every now and then with a 503 error. We have not found any particular pattern as to why this happens.
There are two servers behind a load balancer. We have a health check that requests a .cfm page every 10 seconds. If it receives multiple 503s, users are switched over to the other server.
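To make the failover behaviour concrete, here is a minimal Python sketch of the kind of check described above. It is not our actual load balancer configuration: the threshold of 3, the "consecutive" counting rule, the 5-second probe timeout, and the names HealthTracker and probe are all assumptions for illustration. Only the 10-second interval comes from our setup.

```python
import urllib.error
import urllib.request

CHECK_INTERVAL_SECONDS = 10  # from the description above
FAILURE_THRESHOLD = 3        # assumption: "multiple 503s" is not a precise number


class HealthTracker:
    """Counts consecutive 503 responses for one server (hypothetical helper)."""

    def __init__(self, threshold=FAILURE_THRESHOLD):
        self.threshold = threshold
        self.consecutive_503s = 0

    def record(self, status_code):
        """Record one probe result; return True once traffic should be switched away."""
        if status_code == 503:
            self.consecutive_503s += 1
        else:
            # Any other status resets the streak.
            self.consecutive_503s = 0
        return self.consecutive_503s >= self.threshold


def probe(url, timeout=5):
    """Request the health page and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses arrive as exceptions
    except OSError:
        return 503  # assumption: treat an unreachable or timed-out server like a 503
```

A polling loop would call probe() every CHECK_INTERVAL_SECONDS and pass the result to record(). One thing this makes visible: with a rule like this, a server whose threads are all hung but which still answers slowly may not be switched out until the probe itself times out.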
After a CF restart, things come back up normally. The following is the log file at the crash point for one instance:
Feb 17, 2016 14:17:52 PM Information [ajp-bio-8012-exec-12] - Using Axis 1 for consuming the service xxxxxxxxxxxxxxxxxxxxxxx
Feb 17, 2016 14:17:52 PM Information [ajp-bio-8012-exec-6] - Using Axis 1 for consuming the service xxxxxxxxxxxxxxxxxxxxxxx.
Feb 17, 2016 14:17:53 PM Information [ajp-bio-8012-exec-13] - Starting HTTP request {URL=xxxxxxxxxxxxxxxxxxxxxxx, method='get'}
Feb 17, 2016 14:17:53 PM Information [ajp-bio-8012-exec-13] - HTTP request completed {Status Code=200 ,Time taken=48 ms}
Feb 17, 2016 14:17:55 PM Information [ajp-bio-8012-exec-5] - Starting HTTP request {URL=xxxxxxxxxxxxxxxxxxxxxxx, method='get'}
Feb 17, 2016 14:17:55 PM Information [ajp-bio-8012-exec-5] - HTTP request completed {Status Code=200 ,Time taken=64 ms}
Feb 17, 2016 14:18:10 PM Information [ajp-bio-8012-exec-14] - Using Axis 1 for consuming the service xxxxxxxxxxxxxxxxxxxxxxx
Feb 17, 2016 14:18:11 PM Information [ajp-bio-8012-exec-10] - Using Axis 1 for consuming the service xxxxxxxxxxxxxxxxxxxxxxx.
Feb 17, 2016 14:18:11 PM Information [ajp-bio-8012-exec-12] - Starting HTTP request {URL=xxxxxxxxxxxxxxxxxxxxxxx4903204.json', method='get'}
Feb 17, 2016 14:18:11 PM Information [ajp-bio-8012-exec-12] - HTTP request completed {Status Code=200 ,Time taken=32 ms}
Feb 17, 2016 14:19:14 PM Information [ajp-bio-8012-exec-7] - Using Axis 1 for consuming the service xxxxxxxxxxxxxxxxxxxxxxx
Feb 17, 2016 14:20:03 PM Error [ajp-bio-8012-exec-7] - POST parameters exceeds the maximum limit 100 specified in the server. You can modify the setting in Administrator Server Settings.
Feb 17, 2016 14:20:16 PM Information [ajp-bio-8012-exec-9] - Using Axis 1 for consuming the service xxxxxxxxxxxxxxxxxxxxxxx
Feb 17, 2016 14:20:20 PM Information [ajp-bio-8012-exec-14] - Using Axis 1 for consuming the service xxxxxxxxxxxxxxxxxxxxxxx
Feb 17, 2016 14:20:23 PM Information [ajp-bio-8012-exec-4] - Using Axis 1 for consuming the service xxxxxxxxxxxxxxxxxxxxxxx
Feb 17, 2016 14:20:56 PM Information [ajp-bio-8012-exec-7] - Using Axis 1 for consuming the service xxxxxxxxxxxxxxxxxxxxxxx
Feb 17, 2016 14:20:56 PM Information [ajp-bio-8012-exec-6] - Using Axis 1 for consuming the service xxxxxxxxxxxxxxxxxxxxxxx.
Feb 17, 2016 14:20:57 PM Information [ajp-bio-8012-exec-2] - Starting HTTP request {URL=xxxxxxxxxxxxxxxxxxxxxxx4927102.json', method='get'}
Feb 17, 2016 14:20:57 PM Information [ajp-bio-8012-exec-2] - HTTP request completed {Status Code=200 ,Time taken=38 ms}
Feb 17, 2016 14:21:05 PM Information [ajp-bio-8012-exec-12] - Using Axis 1 for consuming the service xxxxxxxxxxxxxxxxxxxxxxx
Feb 17, 2016 14:21:05 PM Information [ajp-bio-8012-exec-1] - Using Axis 1 for consuming the service xxxxxxxxxxxxxxxxxxxxxxx.
Feb 17, 2016 14:21:05 PM Information [ajp-bio-8012-exec-4] - Starting HTTP request {URL=xxxxxxxxxxxxxxxxxxxxxxx4930797.json', method='get'}
Feb 17, 2016 14:21:06 PM Information [ajp-bio-8012-exec-4] - HTTP request completed {Status Code=200 ,Time taken=35 ms}
Feb 17, 2016 14:22:34 PM Information [ajp-bio-8012-exec-2] - Using Axis 1 for consuming the service xxxxxxxxxxxxxxxxxxxxxxx
Feb 17, 2016 14:23:25 PM Information [ajp-bio-8012-exec-11] - Using Axis 1 for consuming the service xxxxxxxxxxxxxxxxxxxxxxx
Feb 17, 2016 14:27:46 PM Information [scheduler-2] - Alert: Unresponsive server state detected. 10 or more threads have been busy for over 180000 milliseconds
Feb 17, 2016 14:27:46 PM Information [scheduler-2] - Alert: unresponsiveserveralert: Email notification sent.
We see a few HTTP requests succeed, followed by an unresponsive server alert. This didn't make any sense to us whatsoever.
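One way to look at the excerpt more systematically is to measure the silence between consecutive log entries. The sketch below is a rough Python helper, not a ColdFusion tool: the names parse_line and find_gaps and the 60-second default are assumptions, and the regular expression is written only against the line format shown above (24-hour time followed by an AM/PM marker, which is ignored).

```python
import re
from datetime import datetime

# Matches lines such as:
# Feb 17, 2016 14:17:52 PM Information [ajp-bio-8012-exec-12] - Using Axis 1 ...
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \d{1,2}, \d{4} \d{2}:\d{2}:\d{2}) (?:AM|PM) "
    r"(?P<level>\w+) \[(?P<thread>[^\]]+)\] - (?P<msg>.*)$"
)


def parse_line(line):
    """Return (timestamp, level, thread, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    # %b relies on English month abbreviations (the default C locale).
    ts = datetime.strptime(match.group("ts"), "%b %d, %Y %H:%M:%S")
    return ts, match.group("level"), match.group("thread"), match.group("msg")


def find_gaps(lines, min_gap_seconds=60):
    """Return (previous_ts, next_ts, seconds) for each quiet period of at least min_gap_seconds."""
    entries = [entry for entry in map(parse_line, lines) if entry is not None]
    gaps = []
    for previous, current in zip(entries, entries[1:]):
        seconds = (current[0] - previous[0]).total_seconds()
        if seconds >= min_gap_seconds:
            gaps.append((previous[0], current[0], seconds))
    return gaps
```

Applied to the excerpt above, the longest silence is the 261 seconds between the last "Using Axis 1" entry at 14:23:25 and the alert at 14:27:46, so nothing at all was logged in the minutes leading up to the unresponsive state.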
What we have already tried:
- Updated the Tomcat connector settings in server.xml and workers.properties. We have different timeout settings in these two files for our custom environment.
- Disabled global client variables
- Updated connection pool settings
- Modified CFINVOKE calls to use CFHTTP
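Regarding the connector timeouts in the first item: as far as I understand the mod_jk documentation, worker.<name>.connection_pool_timeout in workers.properties is given in seconds and is meant to be kept in step with the connectionTimeout attribute (in milliseconds) on the AJP connector in server.xml. The Python sketch below compares the two. The function names are hypothetical, and the sample values in the usage lines are made up for illustration; they are not our real settings.

```python
import xml.etree.ElementTree as ET


def ajp_connection_timeout_ms(server_xml_text):
    """Return connectionTimeout (ms) of the first AJP connector, or None if absent."""
    root = ET.fromstring(server_xml_text)
    for connector in root.iter("Connector"):
        if "ajp" in connector.get("protocol", "").lower():
            value = connector.get("connectionTimeout")
            return int(value) if value is not None else None
    return None


def pool_timeouts_seconds(workers_properties_text):
    """Return {worker_name: connection_pool_timeout in seconds}."""
    timeouts = {}
    for raw_line in workers_properties_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        parts = key.split(".")
        if len(parts) == 3 and parts[0] == "worker" and parts[2] == "connection_pool_timeout":
            timeouts[parts[1]] = int(value)
    return timeouts


def mismatched_workers(server_xml_text, workers_properties_text):
    """Return workers whose pool timeout (s) does not equal the AJP connectionTimeout (ms)."""
    timeout_ms = ajp_connection_timeout_ms(server_xml_text)
    return {
        name: seconds
        for name, seconds in pool_timeouts_seconds(workers_properties_text).items()
        if timeout_ms is None or seconds * 1000 != timeout_ms
    }


# Made-up sample values; only port 8012 is taken from the log above.
sample_xml = '<Server><Service><Connector port="8012" protocol="AJP/1.3" connectionTimeout="60000"/></Service></Server>'
sample_props = "worker.cfusion.connection_pool_timeout=60\nworker.other.connection_pool_timeout=120"
print(mismatched_workers(sample_xml, sample_props))
```

If a worker shows up in the result, one side may close idle connections while the other still considers them usable, which could be worth ruling out before looking elsewhere.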
We are running out of options and have hit a brick wall with no new ideas. I know we are missing a piece of the puzzle, but I am not sure where it is (Apache settings? Tomcat settings?).
Any help on this issue would be greatly appreciated. Please let me know if you need any additional information.
Thank you in advance,
Sam.