Fishpool


Sunday 26 April 2009

MySQL 2009-2010 roadmap

The development model for MySQL Enterprise took a big step forward with the new community process Karen Padir announced in her Tuesday keynote. This is great for both the open source server and enterprise customers, because the closer the tie between the community and the development path, the better the quality and the faster the progress towards new functionality. I'm not entirely sure everyone at Sun completely understands yet why a working community process is a benefit for the enterprise customer base, but I'm happy steps are being taken in the right direction, and it seems to me that Karen Padir is going to be a good leader for the product.

A big improvement, for sure, and still there's more to improve here. To borrow the words of Baron Schwartz, MySQL currently "has" a community, while it would really be to everyone's benefit if MySQL would instead "be" a community. I would suggest that the goal should be not monthly "community" releases from Sun, but a completely out-in-the-open development process with community members in the driver's seat regarding patch acceptance, quality management and releases, much like the Fedora process works. Sure, there's a role for corporate sponsorship and project management, but it's a distinctly different responsibility. The Drizzle project is another good example of how this can work. An important point to realize here is that there is a difference between the community, an active partner in the process of making the software better, and the unpaid userbase. The latter is an acquisition and conversion vehicle for the former, but they're separate entities.

The announcement of the 5.4 server was at the same time an encouraging and a confusing example of the changes. I would like to be enthusiastic about it, but we've seen MySQL (if not Sun) pre-announce releases before that then failed to appear, and it's a long way to the promised release time. I put two questions to many, many MySQL staff members during the week: why is it that 5.4 was announced now, but is slated to be released GA only in December when it clearly demonstrates massive scalability improvements already, and why is it that the feature list for the final 5.4 release is much longer than what's already completed? I did not get a really coherent answer from anyone. As best I could decipher, there is somewhere a faceless "marketing" which decided that a) there should only be one release announced and b) a demonstrated 40% improvement is not good enough when it's not the only improvement that can be made. I also learned that it's not unlikely that much of the work which has gone into 5.4.0-beta would be backported to the 5.1 branch and released in a 5.1 point release before the actual 5.4 release, because in fact the changes can be considered bugfixes.

I consider myself not entirely inexperienced in the decision processes of release management, and I know intimately the clarity hindsight brings to well-intentioned choices made with the best available information. I know there are many areas to consider, and every decision made is a compromise. I still can't bring myself to completely understand what exactly led to this particular approach. Let's recap:

  • Improvements already made are announced and made available in beta test form, but beta does not contain everything planned for the release
  • Final release is intentionally delayed by 7 months adding significant project risk to it, despite having no previously committed release schedule
  • Former release version is planned to be improved by making significant performance-altering changes in a point release in order to offset the delay
  • Such a release adds risk to maintenance roadmap and steals away upgrade motivation from the upcoming version

How this plan serves either Sun, the community, the free userbase or the enterprise customers is a mystery to me. It would certainly seem far simpler and clearer to take an aggressive quality assurance and release testing position with the intent to push 5.4 out as a rock-solid replacement upgrade to 5.1 as soon as possible, and only then continue with further updates as a 5.5 release. This would definitely be welcomed by everyone but the class of enterprise customers who like to hear about future versions two years in advance - but keep in mind that such conservative enterprises are not MySQL's primary customer base anyway, and if MySQL is to make inroads there, rapidly improving the quality and performance of the product in the meantime would still be a sensible step.

There is the argument that if I want to get those performance features now, I can use Percona/XtraDB or MySQL 5.1 plus the InnoDB Plugin. While technically that route does work, and clearly is worth pursuing as a user, it does have its drawbacks in terms of requiring multiple sources and it's hard to see how it supports MySQL/Sun's commercial interests, the latter surely having been a consideration in the 5.4 release plans.

Thus far in the argument I have ignored one new component - Oracle. That's because to my understanding the process I've discussed did not consider the acquisition, which was unknown to most people before Monday. Clearly this changes a few points. It's not necessarily in the interests of Oracle for MySQL to continue making inroads to enterprise customers, though if someone's going to be cannibalizing Oracle's database sales, it might as well be Oracle. InnoDB Plugin will also be a product from the same company as MySQL Server in the near future - quite likely before the final GA release of MySQL 5.4. What is the role of a delayed 5.4 release in this equation, then?

Recap of MySQL Conference 2009

This was an interesting week for sure. Of course, we all know it started with a bit of shocking news, but that's not nearly the most interesting bit about the conference. I'm posting a series of cleaned-up notes and opinions about what I saw there as I finish them. I will also try to link to further information where I've seen good notes. Please leave more links in the comments if you have any!

Thursday 23 April 2009

Three domains of data

My MySQL Conference presentation on Tuesday discussed my practical findings on how Infobright's technology works in developing a MySQL-based data warehouse. I also touched on the more high-level question of how to select a technology for different kinds of data-related problem areas, and this article expands on that discussion.


Wednesday 22 April 2009

Mining for insight - presentation materials

Completed my MySQL Conference presentation 45 minutes ago. Seemed to go over ok, got some followup questions. Trouble is, I got hit by amazing jetlag half an hour before the session, and almost fell asleep myself during the presentation. Fortunately, survived that anyway, and as far as I could see, was the only one having problems staying awake. Below is an embedded version of the slides, which should also appear on the conference proceedings site later. Now for a beer at the expo. Will blog with more description of the stuff later (update: see this follow-up article).

Read this doc on Scribd: Mining for insight

Tuesday 21 April 2009

Interesting start to MySQL Conf

So, waking up this morning to prepare for the first day of the MySQL Conference, the first news I pick up is that it's now Oracle Conference instead. Been speaking with a few people through the day and confusion as to how this is going to impact the company, development community, users or customers reigns. Personally, I'm more apprehensive than excited about it at this point - but frankly, I've spent more time thinking of how to apply MapReduce to large-scale ETL processes than about the acquisition today.

However, I'll keep digesting this for a while, and hopefully after a couple of days of discussing it with people here I can form a better opinion. I don't know Oracle that well - it never seemed like a very approachable company to me, and in what contact I've had with the technology it has felt a bit baroque and legacy, but I haven't even looked at it in a few years. It is interesting though - Oracle's acquisition of InnoDB a couple of years back was certainly one reason why there are today a number of development projects for other transactional storage engines for MySQL. Sun has been an active corporate sponsor to a number of such projects, and it doesn't seem very likely that Oracle would want to continue that. Dunno.

Wednesday 8 April 2009

Using the Infobright Community Edition for event log storage

Apart from the primary "here's how we ended up using Infobright for data warehousing and how is that working out" topic I'm going to discuss in my MySQL Conf presentation, I'll touch on another application: the use of Infobright's open-source Community Edition server for collection and storage of event logs. This is a system we've implemented in the past couple of months to solve a number of data management issues that were gradually becoming a problem for our infrastructure.

We've traditionally stored structured event log data in databases for ease of management. Since Habbo uses MySQL for most everything else, putting the log tables in the same databases was pretty natural. However, there are significant problems to this approach:

  • MyISAM tables suffer from concurrency issues, and crash recovery is very slow due to table consistency checks
  • InnoDB tables suffer from I/O bottlenecks, and crash recovery is very slow due to rollback segment processing
  • Both scale badly to hundreds of millions of rows (especially if indexing is required), and mixing them is not a recommended practice
  • Storage becomes an issue over time, especially as indexes can easily require many times as much disk as the data, and an event log is going to have a LOT of rows
  • Partitioning has only recently become available, and before that, managing "archive" tables needed manual effort
  • Perhaps worst of all (as it's very hard to measure), if any of this is happening on the primary DB servers, it's competing for buffer pool memory with transactional tables, thus slowing down everything due to cache misses

Over the years, we've tackled these issues in many ways. However, with our initial experience of scaling an Infobright installation for data warehousing needs, a pretty simple solution became apparent, and we rapidly implemented an asynchronous, buffered mechanism to stream data into an ICE database. We're early with this implementation, but it has turned out to be a satisfactory high-performance solution. Even better, it's a very simple thing to implement, even in a clustered service spanning many hosts, as long as log tables don't need to be guaranteed 100% complete or up-to-date to the last second. Here's a description of the simple solution; extending that to the complex solution providing those guarantees is left as an exercise to the reader.

Rather than running single INSERTs to a log table or writing lines to a text file log, each server buffers a small set of events, e.g. those of the past second, in memory. These are then sent over a message bus or lightweight RPC call to a log server, which writes them to a log file that is closed and switched to a new file after every megarow or every few minutes, whichever comes first. A second process running on this log server wakes up periodically and loads each of these files (minus the last one, which is still being written to) into the database with LOAD DATA INFILE.
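The log server side of this can be sketched roughly as below. This is a minimal illustration in Python rather than our actual implementation: the class and function names, the file naming scheme, the tab-separated format and the `event_log` table name are all assumptions made for the example, and the message bus and the database connection are left out entirely.

```python
import os
import time


class RotatingEventLog:
    """Append event batches to a file, switching files by row count or age.

    Hypothetical sketch: names, thresholds and file layout are assumptions.
    """

    def __init__(self, directory, max_rows=1_000_000, max_age_s=300.0):
        self.directory = directory
        self.max_rows = max_rows
        self.max_age_s = max_age_s
        self.seq = 0
        self._open_new()

    def _open_new(self):
        # Start the next numbered file and reset the rotation counters.
        self.seq += 1
        self.path = os.path.join(self.directory, f"events-{self.seq:06d}.log")
        self.fh = open(self.path, "w", encoding="utf-8")
        self.rows = 0
        self.opened = time.monotonic()

    def write_batch(self, events):
        """Write one client batch; each event is a tuple of column values."""
        for event in events:
            self.fh.write("\t".join(str(v) for v in event) + "\n")
            self.rows += 1
        self.fh.flush()
        # Rotation is checked once per batch, so a file can slightly
        # exceed max_rows; that is fine for this purpose.
        age = time.monotonic() - self.opened
        if self.rows >= self.max_rows or age >= self.max_age_s:
            self.rotate()

    def rotate(self):
        self.fh.close()
        self._open_new()


def completed_files(directory, current_path):
    """Every log file except the one still being written to."""
    names = sorted(n for n in os.listdir(directory) if n.endswith(".log"))
    paths = [os.path.join(directory, n) for n in names]
    return [p for p in paths if p != current_path]


def load_statement(path, table="event_log"):
    """SQL the periodic loader would run for one closed file (no escaping)."""
    return (f"LOAD DATA INFILE '{path}' INTO TABLE {table} "
            "FIELDS TERMINATED BY '\\t'")
```

The periodic loader would call `completed_files()`, run `load_statement()` for each path through whatever database client is in use, and then move or delete the loaded files so they are not picked up twice.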

This has multiple general benefits:

  • Buffered messaging requires much less time on the "client" servers compared to managing database connections and executing thousands of small transactions
  • The asynchronous processing ensures database contention cannot produce random delays to the normal service operation
  • Batch loading of text files is implemented by every DB server, so there's little in this implementation that is proprietary or dependent on any particular DB solution

Using the Infobright ICE as the backend database provides a number of additional specific benefits:

  • Excellent data load performance
  • No index management, yet capability to run queries on the data without first extracting it to another DB
  • No degradation of performance as deployment size grows, as would happen even to a MyISAM table should it have any indexes
  • Compressed storage, so less spinning iron required
  • Columnar datapack organization should not require table partitioning even over long periods

This works very well for structured events. For unstructured data, a different solution is required, which I will discuss at some later date.

Update: Mark Callaghan asked in the comments for some quantified details. We have not spent the time to produce repeatable benchmarks, so all I can offer on that front is anecdotal data - it's very conclusive for us, given it's addressing our real concerns, but less so for others. That said, ICE does not support inserts, only batch loads, so the solution had to be engineered to use that, which added some complexity, but brought orders of magnitude more performance. A simple benchmark run showed that the end-to-end performance for this exceeded 100,000 events per second when running all parts of the client-logserver-database chain on a single desktop machine.

Query performance depends on the queries made. Summary data is 2-3 orders of magnitude faster to access, and the bigger the dataset, the bigger the performance benefit - but expecting the same for single-row accesses would disappoint badly. Storage compression varies wildly depending on the data in question - we've seen up to 15:1 compression on some real-world data sets, but others (such as email addresses stored in a varchar column) actually expand in storage. This is why I think of this as a solution for structured, quantified event logs, not for general unstructured log file storage.

Thursday 2 April 2009

Amazon order sizes, ideal behaviour, and proof of market friction

I wrote last fall about the "sweet spot" in pricing and spending patterns for a microtransaction-based service and business model, where I posited that given flexible consumption, revenue could be maximized by ensuring the lowest possible minimum price point; one which is preferably closer to 1 cent than 1 euro. Depending on the goods sold and the amount of logistics overhead, the minimum profitable price may of course be much higher, and depending on the payment mechanisms available, the minimum price below which the consumer's effort overhead exceeds the cost of the good may also be fairly significant. A chocolate bar may be sellable for 40 cents, while few durable goods can achieve a price point lower than a few euros. For virtual goods, the minimum pricing is mostly a question of an efficient mechanism for transferring small amounts of money, because the minimum "size" of the good sold can in theory be reduced ad infinitum, and distribution costs are a non-issue.

Last week there was a Facebook Developer Garage day in San Francisco where a couple of interesting presentations were given. I wasn't there, but browsing through the material I found this slide about the distribution of order sizes among Amazon customers (slide 10 in the deck):

It's interesting to see the similarities between this chart and the behavior in virtual goods. In this data, the observed behavior follows the power law model in an ideal fashion at price points over $25, but the drop-off below that order size is remarkably fast. This is primarily the result of the goods sold and the logistical overheads implicit in them; it just doesn't make much sense for someone to order $5 worth of goods from Amazon given the shipping costs and delays incurred on top of the purchase.

For virtual goods, the drop-off point can be much lower, but a similar drop-off still happens - again because below a certain price point and transactional overhead level, neither the consumer of the good nor its producer sees value in the market. At prices above that, however, the transactional model does exhibit the power law distribution. Again, by reducing the minimum marketable and profitable price point, there is a big potential customer base to be gained at the bottom end of the pricing scale. Most companies leave an amazing amount of revenue on the table by not addressing this issue.
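To put rough numbers on that argument, here is a toy model in Python. It assumes order sizes follow an ideal Pareto (power law) tail all the way down to the minimum price, which real data will not do exactly, and the exponent used below is an arbitrary illustrative value, not something measured from the Amazon chart or from our own data.

```python
def orders_at_or_above(price, alpha, scale=1.0):
    """Relative number of orders of at least `price` under a Pareto tail.

    `alpha` is the tail exponent and `scale` an arbitrary constant; both
    are assumptions of this sketch.
    """
    return scale * price ** (-alpha)


def revenue_from(min_price, alpha, scale=1.0):
    """Relative total revenue from all orders of at least `min_price`."""
    if alpha <= 1:
        raise ValueError("mean order size diverges for alpha <= 1")
    # Mean of a Pareto distribution with minimum min_price and exponent alpha.
    mean_order = alpha * min_price / (alpha - 1)
    return orders_at_or_above(min_price, alpha, scale) * mean_order


ALPHA = 1.5  # hypothetical exponent, for illustration only
more_orders = orders_at_or_above(1.0, ALPHA) / orders_at_or_above(25.0, ALPHA)
more_revenue = revenue_from(1.0, ALPHA) / revenue_from(25.0, ALPHA)
print(f"orders x{more_orders:.0f}, revenue x{more_revenue:.0f}")
```

Under these assumptions, total revenue scales as the minimum price raised to the power (1 - alpha), so any exponent above 1 means that lowering the floor adds revenue, as long as the power law keeps holding at the low end and each small order remains profitable.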

I'm still thinking of OnLive. Why is that?

I pretty much blasted OnLive the other day as something that doesn't hold a candle to the distribution power that is the web. Still, I keep wondering what its draw is. Positioned against the console business, it does have clear benefits, clear enough that Nintendo's Reggie Fils-Aime felt it necessary to try to dismiss it. That is, with the exception that it doesn't run console games, only PC games. Today though I read this post from Keith Boesky (RT @jussil), and sure, looking at it from the perspective of building it for acquisition, yeah, it makes perfect sense. I guess my weakness is I always try to understand things like this as standalone businesses, when they're probably not meant to be that. My bad.

Friday 27 March 2009

Why OnLive will not be the massive tectonic shift so many are currently predicting

Among the things announced this week at GDC were two developments heading in entirely different directions on a particular axis of games technology: first, the OnLive network of thin clients showing network-streamed video games rendered on a server cluster somewhere, and second, the Mozilla/Khronos Group initiative to develop an OpenGL-accelerated, JavaScript-programmed 3D canvas in a web browser. Both have one thing in common: they make it possible to run 3D apps (games) on standard devices without prior installation. How they go about that goal is radically different. One of them will fail.

OnLive is not the first company to attempt this idea. It's a basic extension of the same theme that has been around since at least the inception of the X Window System and Sun NeWS in the early 80s - graphical thin clients showing applications running somewhere in the network. Further, the idea was explored for 3D games in the late 90s by G-Cluster, which apparently is still around in Japan in some form or another. In my opinion, it's a misguided approach. Certainly there's value to server-side processing, even of graphics, but it just makes so much more sense to do the final rendering on the client, even when all of the application logic is remote.

What kind of client? Well, anything that can run a high-performance VM for Java or JavaScript (ie, a modern browser), and has 3D acceleration functionalities built into the graphics pipeline. This includes basically every network-connected device from the cost of $200 upwards: all smart phones, all netbooks, all laptops, all games consoles, and so on. Some of those devices are still intentionally crippled by their manufacturers in terms of operating system support for the required features, and clearly the 3D Canvas development hasn't been finished yet. The hardware capabilities, however, are already deployed to hundreds of millions of consumers.

Ignoring that deployed base and trying to scale a server-side rendering solution to the same figures is just mad. And that's not even considering the framerate and responsiveness constraints that are inescapable simply because of the round-trip network latency of such a system: on a high-bandwidth wired network 10s of milliseconds (not everyone can be situated within a few kilometers of the server cluster), and on radio networks, 100s of ms. Developing high-framerate games under those circumstances is hard enough when you only need to deal with transmitting positional data and adjusting for lag and jitter at both ends - making the games playable when every action made by the player needs to go to the server and back before it shows up is, practically speaking, impossible.
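A quick back-of-the-envelope calculation shows what those latencies mean in frames. The 30 ms and 200 ms round-trip figures and the 60 fps target below are example values picked from within the ranges mentioned above, not measurements, and the sketch ignores video encoding and decoding time, which would only add to the delay.

```python
def frames_of_lag(round_trip_ms, fps=60):
    """Frames displayed while one input travels to the server and back."""
    frame_ms = 1000.0 / fps
    return round_trip_ms / frame_ms


# Hypothetical round-trip times for a wired and a radio network.
for label, rtt_ms in [("wired, 30 ms", 30), ("radio, 200 ms", 200)]:
    print(f"{label}: {frames_of_lag(rtt_ms):.1f} frames behind at 60 fps")
```

With client-side rendering, the local view can respond to input on the next frame and only the shared game state lags; with server-side rendering, everything the player sees is behind by that many frames.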

(Update an hour later) I suppose I should acknowledge that the OnLive approach clearly does have certain benefits: no piracy, little hacking of the typical kinds, little opportunity to cheat, and no need for investing in PunkBuster-type technology in the game clients, since none of that is running locally. However, all that simply will not matter when weighed against the enormous burden of having to run all that rendering at the wrong end of the MMO network, and of ignoring the opportunity to disperse so much of the investment and energy requirements to the gamers.

Saturday 21 March 2009

What would you like to hear from me in MySQL Conf?

I'm going to be talking at MySQL Conf 2009 about our business intelligence and data warehousing solutions for Habbo. Since this blog is syndicated on Planet MySQL, and I presume many of the people going to the conference read it, here's a question: would you like to hear about the why's of our technology selection (e.g., IT management level questions), the techniques we use for analysis (for the BI analyst or startup technologist), or about the nuts and bolts of the database implementation itself (for the DBAs in the crowd)? I'm going to be touching on all three aspects, perhaps more, but can and will focus on one of the areas in more detail.
