All the downtime! Also: MongoDB, R7, etc

No data has been generated for the last week, for a few reasons:

  • All graph data has been moved from MySQL (500+ million rows) to MongoDB (the actual conversion only took a couple of hours, though).
  • At the same time I physically moved to a different machine. This was primarily to get out of OVH’s lacklustre network in Montréal. MCStats is now located in New York.
  • Naturally, Real Life got even busier so what should’ve only taken a day had to be postponed.

Now that the site is back up, what is new?

  • Improved page caching. Most pages (and all API requests) are now cached in Redis. Previously I was using Memcached to cache specific pages, but Memcached appeared to have a mind of its own (or it was the PHP driver), so I took the chance to switch to Redis at the same time. Redis has worked as expected the entire time.
  • Graph generator progress bars have made their way back onto the site.
  • http://api.mcstats.org/api/1.0/ is now more simply http://api.mcstats.org/1.0/.
  • The backend now supports compression+JSON requests. The new R7 reporter will take advantage of this once released.
  • Graphs for Plugin Rank have been added (finally). This will be more interesting once the plugin index is finished but it’s there 🙂
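As a rough illustration of the page caching described in the list above, here is a minimal cache-aside sketch. It is not the site's actual code (which is PHP talking to Redis): `TTLCache` is a hypothetical in-memory stand-in for a Redis-like store with expiring keys, and `cached_page`, the `page:` key prefix, and the 60-second default are assumptions made up for this example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-memory stand-in for a Redis-like cache with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily, on the next read.
            del self._store[key]
            return None
        return value

    def setex(self, key: str, ttl_seconds: float, value: str) -> None:
        # Store the value together with its absolute expiry time.
        self._store[key] = (self._clock() + ttl_seconds, value)


def cached_page(
    cache: TTLCache,
    path: str,
    render: Callable[[str], str],
    ttl_seconds: float = 60.0,
) -> str:
    """Return the cached body for `path`, rendering and caching it on a miss."""
    key = "page:" + path  # assumed key scheme, for illustration only
    hit = cache.get(key)
    if hit is not None:
        return hit
    body = render(path)
    cache.setex(key, ttl_seconds, body)
    return body
```

The expensive `render` call only runs on a miss or after the entry expires, which is what takes load off the database for popular pages.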

MongoDB

Storing the graph data in MySQL was becoming a huge bother, and it was getting slow as I could only do so much to scale it. It took me a few months to finally decide to start testing MongoDB, but I am glad I did.

So far MongoDB has been a lot better to me than MySQL. MongoDB reports that all graph data (as of this post) is 4.5 GB (3.3 GB data + 1.2 GB indexes). This is spread across 11.8 million documents. It would not be fair to compare this to what it was on MySQL (~60 GB / ~530 million rows) because a lot of data was trimmed out in the process of converting to Mongo. At the very least, far less space is wasted on repeated data, because document storage lets the data be stored more logically.

What gets even better is that with compression=gzip (via ZFS) the database on-disk is only 835 MB!

For those curious, previously data was stored in MySQL like so:

Plugin, ColumnID, Epoch,

So each point on a graph had a row in the database (a Graph has many ColumnIDs). This meant each graph had to effectively pull out thousands of rows to generate the data. If it’s all in memory it’s very fast.

With MongoDB it is stored much more logically:

plugin, graph, epoch, { columnid1: , columnid2: , [..] }

This makes much more sense: a document can be large, but all of its data is normally used together. It also keeps the overall index size pretty small, because the indexed fields (plugin, graph, epoch) are no longer repeated so often. For a month of fully generated data, only 1,488 documents have to be pulled out for a graph.
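To make the difference between the two layouts concrete, here is a small sketch that folds row-per-point data into one document per (plugin, graph, epoch). The field names `value` and `data`, the sample plugin and graph names, and the column IDs are all hypothetical. The 30-minute interval is also an inference, not something stated in this post: 1,488 happens to equal 31 × 48, which would match one document every 30 minutes over a 31-day month.

```python
from typing import Dict, Iterable, List, Tuple

INTERVALS_PER_DAY = 48  # assumption: one data point every 30 minutes
DAYS_IN_MONTH = 31
INTERVAL_SECONDS = 1800

# Old layout: one row per point on the graph.
Row = Tuple[str, int, int, int]  # (plugin, column_id, epoch, value)


def make_rows(plugin: str, column_ids: Iterable[int], epochs: Iterable[int]) -> List[Row]:
    """Build placeholder rows (value 0) in the old row-per-point layout."""
    cols = list(column_ids)
    return [(plugin, col, epoch, 0) for epoch in epochs for col in cols]


def rows_to_documents(rows: Iterable[Row], graph: str) -> List[dict]:
    """Fold rows into the new layout: one document per (plugin, graph, epoch)."""
    docs: Dict[Tuple[str, str, int], dict] = {}
    for plugin, column_id, epoch, value in rows:
        doc = docs.setdefault(
            (plugin, graph, epoch),
            {"plugin": plugin, "graph": graph, "epoch": epoch, "data": {}},
        )
        # Every column for this epoch lives inside the same document.
        doc["data"][str(column_id)] = value
    return list(docs.values())


epochs = [i * INTERVAL_SECONDS for i in range(INTERVALS_PER_DAY * DAYS_IN_MONTH)]
rows = make_rows("ExamplePlugin", [1, 2, 3, 4, 5], epochs)
documents = rows_to_documents(rows, "ExampleGraph")
print(len(rows), "rows vs", len(documents), "documents")
```

With five columns the old layout needs five times as many rows as the new one needs documents, and the indexed plugin/graph/epoch values are stored once per document instead of once per point.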

R7: Compression, JSON, maybe other things

The backend for R7 is complete and I am hoping to release the plugin side (already done, just needs fine tuning) in the next few days. R7 adds compression for data being sent to MCStats, and the data is now sent as JSON.

For most plugins the compression will reduce data sent by around 50%. Some plugins that send a lot of data will see even higher ratios.
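For a feel of what compressing a JSON report looks like, here is a minimal sketch using gzip from the Python standard library. It is not the R7 wire format: the function names and the shape of the sample report are hypothetical, and the actual saving depends entirely on how repetitive the payload is.

```python
import gzip
import json


def encode_report(report: dict) -> bytes:
    """Serialize a report to compact JSON and gzip it."""
    raw = json.dumps(report, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw)


def decode_report(body: bytes) -> dict:
    """Reverse of encode_report: gunzip, then parse the JSON."""
    return json.loads(gzip.decompress(body).decode("utf-8"))


def saved_fraction(report: dict) -> float:
    """Fraction of bytes saved by gzip compared to the uncompressed JSON."""
    raw = json.dumps(report, separators=(",", ":")).encode("utf-8")
    return 1.0 - len(gzip.compress(raw)) / len(raw)


# Hypothetical report; every field name here is invented for the example.
sample_report = {
    "plugin": "ExamplePlugin",
    "players_online": 12,
    "graphs": [
        {"name": "graph%d" % i, "columns": {"a": i, "b": i * 2}}
        for i in range(50)
    ],
}

body = encode_report(sample_report)
print("saved %.0f%% of the uncompressed size" % (100 * saved_fraction(sample_report)))
```

Reports with many similar graph entries repeat the same keys over and over, which is the kind of redundancy gzip handles well.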

At the moment MCStats uses 6-7 TB of traffic per month, so I am hoping to reduce this by a couple of TB to allow continued growth without any issues (this realisation came after I moved from unmetered to metered bandwidth… :)).

5 thoughts on “All the downtime! Also: MongoDB, R7, etc”

  1. bertha

    how are we supposed to log into the admin area if we forgot our password? there isn’t an option to reset the password. :/

    1. Hidendra Post author

      If you could email me at [email protected] with your username / attached plugins I can likely reset this for you. At some point I will have the option to attach an email to accounts so that it can at least be reset heh 🙂

  2. KiwiLetsPlay

    There used to be about 166 servers with my plugin; since yesterday there are only 22.
    Do I have to change something with the metrics or … ?

    1. Filbert66

      Haven’t seen any new data since Sep 4. Trying to update because R7 removed CustomGraph but can’t test it. Any news when it will be back collecting data?
