Key: MBH-256
Type: Task
Status: Open
Priority: Normal
Assignee: Robert Kaye
Reporter: Dave Evans
Votes: 0
Watchers: 0

MusicBrainz Hosting

webalizer is killing wiley

Created: 01/Jul/12 08:59 PM   Updated: 03/Jul/12 08:47 PM
Component/s: None
Affects Version/s: None
Fix Version/s: None


Description

Load average and memory usage have been very high on wiley recently, due to webalizer.

djce@wiley:/usr/local/webstats/zaphod-nginx$ ls -lh tmp/carl/
total 4.1G
-rw-r--r-- 1 root root 4.1G 2012-07-01 07:13 musicbrainz-full-combined.log-20120701
djce@wiley:/usr/local/webstats/zaphod-nginx$

This is a typical daily log size.

Possible approaches:

  • suck it up
  • use something other than webalizer
  • find a way to make webalizer use fewer resources (feed the log in chunks? adjust webalizer.conf?)
  • process less data (only feed in, say, 10% of the log at random)
  • process the data elsewhere (not on wiley)
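
The "process less data" and "feed the log in chunks" ideas above could be sketched roughly as follows. This is only an illustration, not something that exists in the stats repository: the function names `sample_lines` and `chunked` are hypothetical, and it assumes the log is plain text with one request per line.

```python
import random
from itertools import islice
from typing import Iterable, Iterator, List, Optional


def sample_lines(
    lines: Iterable[str], fraction: float, seed: Optional[int] = None
) -> Iterator[str]:
    """Yield each line independently with probability `fraction`.

    The log is streamed line by line, so memory use stays flat no
    matter how large the input is. Original line order is preserved.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)  # a fixed seed makes a run reproducible
    for line in lines:
        if rng.random() < fraction:
            yield line


def chunked(lines: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` lines, in order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(lines)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk
```

For example, `sample_lines(open(path), 0.1)` would keep roughly 10% of the lines, which could be written to a smaller file for webalizer to process. Note that totals reported from a sample would need scaling up by about 10x, and any per-visitor statistics would probably be distorted because requests from one visit are dropped independently.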


Dave Evans added a comment - 01/Jul/12 09:02 PM

Might want to disable webalizer processing of "musicbrainz-full" for now, until we resolve this.

To do so, edit zaphod-nginx-webalizer/wiley/bin/update-all in git.musicbrainz.org:sysadmin/stats (comment out musicbrainz-full); commit, push, pull onto wiley. Be careful when re-enabling though because by default it will try to catch up on all the logs it missed.


Robert Kaye added a comment - 03/Jul/12 08:47 PM

I have a couple of thoughts on this:

1. If we move backups away from wiley, does that buy us some time?
2. If so, then I think we should task plaintext with replicating the portions of webalizer we care about and do it using splunk. In theory, adding more queries and output types should be pretty quick once his project is finished. However, that won't hit the live server until mid-October. Can we buy ourselves enough time?