rosso (FMA Admin)

REGISTERED:07/21/2014
COMMENTS POSTED:51
MIXES CREATED:6
AFFILIATIONS:
Curators: WFMU

Hi! I'm the FMA's current software developer. In addition to programming, I make experimental music (speakerdust.bandcamp.com) and experimental video (http://vimeo.com/100763590).

Featured Mix

FMFF #5 - Oct. 12, 2015
ui - "@" (02:09)
UPLOADED:10/13/2015
TRACKS:19
LISTENS:677
STARRED:0
DOWNLOADS:723

rosso's Blog

rosso on 05/08/2017 at 03:02PM

May 8 Outage Postmortem

Our sincerest apologies for this morning's FMA site outage, which lasted from roughly 7:00 EDT until 14:00 EDT. In the interest of greater transparency about the FMA's operations, I've decided to write this brief entry describing what happened. We have a very small staff, and I wasn't able to begin fixing the outage until about 11:30 EDT.

What happened?

Certain types of requests made to the FMA's servers are logged directly in our database. These logs grew until they filled the hard disks on our database servers to capacity. When that happened, the database servers (a master and several read-only replicas) became completely unresponsive. Since the site relies entirely on our database cluster, no pages could be rendered and no API requests could be completed--end users saw a giant error message!

What was the solution?

As soon as I was able to begin working on the problem, I put the maintenance page up and began downloading a snapshot of the logs that had filled the database servers' hard disks. This took much longer than anticipated. Once I was able to retrieve the data, I truncated the tables in question (truncating a table means deleting all of the data in it--a database table is similar to a spreadsheet). After that, I waited for the read-only replicas of our master database to catch up. It's not enough to restart the site with only the master database running--the site depends on the read-only replicas as well. I waited almost an hour for the read-only replicas to catch up, but they didn't. Due to the nature of our hosting provider, it was faster to delete the read-only replicas and create new ones. That took several more minutes. Once the replicas were rebuilt, I was able to restart our front-end servers and restore the site to normal operation.
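As a rough illustration of the archive-then-truncate step described above, here is a minimal Python sketch. The post doesn't say which database the FMA runs or how the snapshot was taken, so this uses SQLite from the standard library purely as a stand-in; the function name `archive_and_truncate` and the CSV archive format are assumptions made for the example.

```python
import csv
import sqlite3


def archive_and_truncate(conn, table, archive_path):
    """Write every row of `table` to a CSV file, then empty the table.

    Returns the number of rows that were archived.
    """
    # The table name is interpolated into SQL, so accept plain identifiers only.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")

    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [col[0] for col in cursor.description]

    row_count = 0
    with open(archive_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(columns)  # header row
        for row in cursor:
            writer.writerow(row)
            row_count += 1

    # SQLite has no TRUNCATE statement (an assumption-free stand-in here):
    # DELETE with no WHERE clause removes every row from the table.
    conn.execute(f"DELETE FROM {table}")
    conn.commit()
    return row_count
```

Calling `archive_and_truncate(conn, "request_log", "request_log_backup.csv")` on a connection with a (hypothetical) `request_log` table would leave the table empty and the old rows in the CSV file. The archive is written before anything is deleted, so a failure while saving the data leaves the table untouched.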

How will we prevent this from happening again?

Logging directly to a database is definitely bad practice, but it was implemented on FMA many years ago by the original development team. For now I will keep my eyes on database disk usage and will set alerts to let me know I need to do something before the disks fill up again! Longer term, I will move all logging activity to a separate service, for example plain flat log files. Unfortunately, FMA is no stranger to outages, but whenever they happen, we try to restore service as quickly as we can and take steps to prevent similar outages from happening again.
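The two preventive measures mentioned above (a disk-usage alert and flat-file logging) might look something like the following Python sketch. Everything here is an assumption for illustration: the 80% threshold, the function names, the logger name `request_log`, and the rotation sizes are hypothetical, and the post doesn't describe the FMA's actual monitoring or logging setup.

```python
import logging
import shutil
from logging.handlers import RotatingFileHandler


def disk_usage_percent(path):
    """Return how full the disk holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def disk_needs_attention(path, threshold_percent=80.0):
    """True once usage reaches the (hypothetical) alert threshold."""
    return disk_usage_percent(path) >= threshold_percent


def build_request_logger(log_path, max_bytes=10_000_000, backup_count=5):
    """Return a logger that writes to a flat file instead of a database.

    The file is rotated when it reaches `max_bytes`, and only
    `backup_count` old files are kept, so the logs cannot grow without
    bound the way an ever-growing database table can.
    """
    logger = logging.getLogger("request_log")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep these entries out of the root logger
    if not logger.handlers:  # avoid attaching a second handler on repeat calls
        handler = RotatingFileHandler(
            log_path, maxBytes=max_bytes, backupCount=backup_count
        )
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

A scheduled job could call `disk_needs_attention("/")` every few minutes and send a notification when it returns `True`, while the application calls `build_request_logger("requests.log").info(...)` wherever it used to insert a log row.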

Is there anything I can do to help?

Yes! FMA operates with a tiny staff (2 people) and extremely limited resources. The best way to help is to Donate! If you are a developer and have any technical suggestions, please write to me directly at [email protected]. We greatly value input from our users and the community. We're dedicated to making the FMA the biggest and best resource for Creative Commons-licensed and other royalty-free music anywhere on the Internet.

What is this song?

One of my all-time favorite FMA tracks, and an adequate description of how it feels to finally fix a major outage.

» 14 COMMENTS
rosso on 09/24/2016 at 04:20AM

Zoom H4n Pro -- WINNER!

Congrats to our winner, Abraham!

Thanks to everyone who entered and who has donated to our fundraiser so far!  Your support means a lot to us!

Thanks to Z'EV, Schemawound, Lee Rosevere, and Carlos Giffoni/Okkyung Lee for the music used in the video! All music is from the FMA and is linked below.

Special thanks to fellow FMA developer Erik Schoster for mixing the music and doing the drawing!


» 0 COMMENTS

rosso's Wall

Jaan Patterson
on 12/11/2015 at 07:05PM
d/eaRRosso, thank you so much for all your hard work on the FMA backend et al! Very best wishes!!!
rosso
on 09/21/2015 at 10:51PM
Hi @nrclark - I was hoping to record today's show for that purpose, but I ran into some problems with my setup. I'll work out the kinks by next week. Thanks for your interest!
nrclark
on 09/21/2015 at 10:15PM
Hey, just wondering if FMFF is available as a podcast?
rosso
on 01/20/2015 at 01:42AM
If wishes and buts were clusters of nuts, we'd all have a bowl of granola.