
Key: MBH-269
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Normal
Assignee: Robert Kaye
Reporter: Robert Kaye
Votes: 0
Watchers: 0
MusicBrainz Hosting

Add nagios monitoring for memcached on wiley

Created: 03/Aug/12 06:04 PM   Updated: 19/Sep/12 09:23 PM   Resolved: 19/Sep/12 09:23 PM
Component/s: Nagios
Affects Version/s: None
Fix Version/s: None

Issue Links:
Relates
 


Description

We had a site outage because memcached on wiley crashed. This caused all of the web front ends to crash with the previously seen error message:

2011-11-28 17:31:50.495995500 [error] Caught exception in engine "Couldn't save expires:c1c611ace1febf8bc148e4a72d67ccbce510332b / 1322508709 in memcached storage"

(This message was taken from MBS-3590.) Nagios did not send any messages about memcached on wiley being down. Can you please verify that we are monitoring memcached and, if not, add monitoring? Also, I see the mediawiki instance of memcached is being managed by daemontools – can we move the second instance of memcached on wiley to daemontools as well? The second instance runs on port 11215 and is started by /etc/init.d



Dave Evans added a comment - 04/Aug/12 10:30 AM

What sort of monitoring would you like? If we make it daemontools-controlled then "is it up" is not a terribly useful thing to monitor, since (unless someone disables the service) it'll always be up.

See email re. move to daemontools.
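
On the question of what to monitor once daemontools keeps the process up: one option, offered only as a sketch, is to alert on recent restarts instead of on "is it up". memcached's text protocol answers `stats` with lines of the form `STAT uptime <seconds>`, so a low uptime suggests the service was restarted a short time ago. The function names, the 127.0.0.1 default host, and the threshold are hypothetical and not taken from the actual wiley setup. Port 11215 comes from the description above. The return codes follow the usual Nagios plugin convention (0 = OK, 1 = WARNING, 2 = CRITICAL).

```shell
#!/bin/sh
# Hypothetical Nagios-style check: flag a memcached that restarted recently.
# Nagios plugin exit codes: 0 = OK, 1 = WARNING, 2 = CRITICAL.

# Reads memcached "stats" output on stdin; $1 is the minimum uptime in seconds.
evaluate_uptime() {
    min_uptime=$1
    # Stats lines look like "STAT uptime 12345\r"; strip the carriage returns.
    uptime=$(tr -d '\r' | awk '$1 == "STAT" && $2 == "uptime" { print $3; exit }')
    case $uptime in
        ''|*[!0-9]*)
            # Nothing usable came back: treat as a failed check.
            echo "CRITICAL: no uptime stat received from memcached"
            return 2
            ;;
    esac
    if [ "$uptime" -lt "$min_uptime" ]; then
        echo "WARNING: memcached restarted ${uptime}s ago"
        return 1
    fi
    echo "OK: memcached up for ${uptime}s"
    return 0
}

# Fetches stats over the text protocol. Host and port defaults are assumptions.
fetch_stats() {
    printf 'stats\r\nquit\r\n' | nc -w 5 "${1:-127.0.0.1}" "${2:-11215}"
}
```

A plugin wrapper could then run `fetch_stats 127.0.0.1 11215 | evaluate_uptime 300` and exit with the resulting status. Keeping the parsing separate from the network call means the threshold logic can be exercised against canned `stats` output.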


Robert Kaye added a comment - 04/Aug/12 06:40 PM

Ah, yes good point. Just daemontools ought to be fine...

And, I could not find an email from you. Please re-send?


Dave Evans added a comment - 18/Aug/12 09:39 AM

As per email of 4th August,

Ready to, whenever someone decides that it's the right time to lose RE
sessions (which I believe is still the effect of restarting memcached).

To make it happen,

get root on wiley

invoke-rc.d memcached stop
cd /etc/service
ln -s .memcached-musicbrainz-server memcached-musicbrainz-server
sysv-rc-conf memcached off
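
For context, the symlink step works because daemontools' svscan picks up new entries in /etc/service and starts supervising them through each directory's `run` script. The actual contents of .memcached-musicbrainz-server are not shown in this ticket, so the following is only a guess at what such a `run` script might look like. The `memcache` user and the 64 MB limit are assumptions. Port 11215 is the one named in the description.

```shell
#!/bin/sh
# Hypothetical /etc/service/memcached-musicbrainz-server/run
# supervise needs the process in the foreground, so -d (daemonize) is not used.
# -u drops privileges to the given user, -p sets the TCP port,
# -m sets the memory limit in megabytes (user and limit are assumptions).
exec memcached -u memcache -p 11215 -m 64
```

Once the symlink is in place, `svstat /etc/service/memcached-musicbrainz-server` should report the service as up along with how long it has been running.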


Robert Kaye added a comment - 18/Aug/12 05:30 PM

Understood, this is still on my plate. Still dealing with losing a dear friend last week – still trying to recover from that...


Robert Kaye added a comment - 19/Sep/12 09:23 PM

Fixed this last week