Availability nightmares continue
We’re having a lot of outages on our blogs. Downtown Host tells me that huge numbers of MySQL processes are being spawned. I have trouble understanding why, as WP-SuperCache (Edit: Actually, just WP-Cache) is enabled, robots.txt has a crawl delay, and so on.
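For reference, a robots.txt crawl delay is normally expressed with the non-standard Crawl-delay directive; the 10-second figure below is purely illustrative, not the value actually configured on these blogs:

    User-agent: *
    Crawl-delay: 10

Not every crawler honors it (Googlebot, for one, ignores Crawl-delay), so it only partially limits bot-driven query load.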
As of yesterday, we were getting about 1.5 megabytes per hour of “MySQL server has gone away” errors. After Downtown Host declined to discuss the subject with us, Melissa Bradshaw implemented a workaround, at least for this blog, to change the MySQL wait_timeout setting ourselves. Clever idea, and it seemed to work for half a day, but now the problems have returned.
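For anyone hitting the same thing: “MySQL server has gone away” usually means a connection sat idle longer than the server’s wait_timeout and got dropped. Below is a minimal Python sketch of the two usual workarounds, raising the session timeout and pinging/reconnecting before reuse, using the pymysql library. The host, credentials, database, 600-second value, and sample query are placeholders, not our actual setup (WordPress itself would do the equivalent in its PHP database layer):

    import pymysql

    # Placeholder connection details, not the real configuration for these blogs.
    conn = pymysql.connect(host="localhost", user="blog",
                           password="secret", database="wordpress")

    # Workaround 1: raise the idle timeout for this session so the server
    # doesn't drop the connection between queries. Shared hosts often set
    # wait_timeout far below the 28800-second default.
    with conn.cursor() as cur:
        cur.execute("SET SESSION wait_timeout = 600")

    # Workaround 2: before reusing a possibly idle connection, ping it and
    # transparently reconnect if the server has already closed it.
    conn.ping(reconnect=True)
    with conn.cursor() as cur:
        cur.execute("SELECT COUNT(*) FROM wp_posts")
        print(cur.fetchone())

Neither trick reduces the query volume itself; it just stops dropped idle connections from surfacing as errors.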
Downtown Host isn’t saying much more than “Look at these logs. Your blogs are experiencing a lot of queries and spawning dozens upon dozens of MySQL processes. The main offender is DBMS2.” I don’t know when we’ll get this sorted out. I fly to Europe tomorrow. I have a cough. I’m exhausted. I’m sorry.
4 Responses to “Availability nightmares continue”
[…] leave that for later and just get this posted as a start — assuming, of course, that blog outages […]
If I had to guess with 0 inside knowledge, it’s related to http://www.thetechherald.com/article.php/200937/4392/Worm-attacking-WordPress-is-an-example-of-why-patching-is-important
This availability mess has slowed our upgrade to the latest versions, which, like everybody else, we had been rushing to do. But I’m pretty sure we haven’t been successfully attacked by that worm yet, and of course I want to be notified promptly in the distressing case that I’m wrong about that.
We experienced an issue with public-access websites where the number of database threads (on Windows) would go through the roof and the DB (Oracle 10.2 64-bit) would die. After many sleepless nights and much stress, it turned out to have nothing to do with excess traffic. Rather, increased network latency between the web server network and the database network meant longer waits on queries (i.e., queries took longer to fully return), hence more concurrent load, and bang. There was also a bug in the particular version of Oracle we were running, which meant it couldn’t gracefully handle this scenario.
Thought I’d mention it under the topic of “things external to an app and its traffic that make it seem like there is excess load on your database.”
Good luck.
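For what it’s worth, the arithmetic behind that failure mode is Little’s Law: average concurrency = arrival rate x average time each query spends in the system. So even with traffic held constant, added latency multiplies the number of simultaneously open database threads. A toy sketch in Python with made-up numbers (nothing below is measured from these blogs or from the commenter’s Oracle setup):

    # Little's Law: average concurrency = arrival rate * average time in system.
    # Same traffic level, but extra network latency added to every query.
    queries_per_second = 50         # hypothetical steady traffic
    base_query_seconds = 0.02       # query time with the DB close to the web tier
    added_latency_seconds = 0.18    # hypothetical extra round-trip delay

    before = queries_per_second * base_query_seconds
    after = queries_per_second * (base_query_seconds + added_latency_seconds)

    print(f"concurrent queries before: {before:.0f}")  # about 1
    print(f"concurrent queries after:  {after:.0f}")   # about 10, with identical traffic

That tenfold jump in simultaneously open connections, not any change in traffic, is what exhausts thread or process limits and makes the database look overloaded.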