Fishpool


Tag - enterprise


Monday 15 September 2008

Infobright BI tools go open source

I've mentioned Infobright before as an interesting way to get more performance out of BI analytics solutions. Today's news is interesting: Sun is investing in the company, and the baseline product is being open sourced. Too busy to write more about it today, but I'm certainly watching this one closely.

Saturday 10 May 2008

Right move, MySQL

Again a week late, but hey, I only need to keep up with this stuff, not comment on it all the time. MySQL changed its mind, and it turns out the core server will continue to be open source, allowing customers to depend on being able to inspect it if required, extend any bit as needed, and most importantly, get the benefits of a large community using and testing all features. Thanks for that. I just hope you're going to be consistent about this, for precisely the reason that as a MySQL Enterprise customer, I don't pay you to deliver bits that haven't received that community testing, but to rapidly fix problems if they exist despite that exposure.

It was interesting to hear Monty Widenius comment about it at this week's Open Tuesday event, and I also got to talk to him about attending a MySQL Users session in Helsinki next time I or someone else (anyone? anyone? Bueller?) manages to organize one. Would be nice to hear about the upcoming storage engines straight from the horse's mouth - Monty's Maria effort has certainly been less covered than the Falcon engine I have also commented on, and I can't claim to know anything about it myself.

Tuesday 22 April 2008

MySQL Users Conference followup and MySQL's business model

Last week saw MySQL User Conference 2008 in Santa Clara, but I was not able to make time for it this year either. However, in the wake of Sun's acquisition of MySQL, it was very interesting to follow what was going on. A few things that caught my attention:

MySQL 5.1 is nearing General Availability and an interesting storage engine plugin ecosystem is starting to emerge. It's this latter, related development that I see as the first real sign of validation for MySQL's long-ago chosen path of pluggable storage systems instead of focused effort on making one good general-use engine.

Oracle/Innobase announced InnoDB Plugin for MySQL 5.1, with much-awaited features that promise a great deal of help with daily management headaches. More than that, InnoDB Plugin's release under GPL lifts quite a lot of the concern I'm sure many users like us have had about the future viability of InnoDB as a MySQL storage engine.

A couple of data warehousing solutions were launched, also based on MySQL 5.1 -- Infobright is one I've already researched somewhat (looks very interesting, as soon as a few current limitations are lifted), Kickfire I know nothing about right now but would love to learn more about.

There's a huge amount of coverage graciously provided by Baron Schwartz that I have yet to fully browse through.

A few remarks by Mårten Mickos regarding MySQL's business model seem to have kicked up a bit of a sandstorm. I don't really understand why; I read these to just verify that the direction MySQL took last year is to continue this year as well. I don't see any major changes here regarding the licensing structure, software availability, or support models. Frankly, it seems like yet another case of Slashdot readers not reading, let alone understanding, what they're protesting against, and press following up on the noise.

I do understand the critique made against MySQL's chosen model, though. In fact, I went on record last September to say that I understand that critique. I still see the same issues here. I believe we represent a fairly common profile of a MySQL Enterprise customer in that what we want from it is not the bleeding-edge functionality but a stable, well-tested product that we can expect to get help for if something does go wrong. We don't see great value in having access to a version of software that isn't generally available to "less advanced" or more adventurous users for free in a community version. In fact, we see it as a negative that such functionality exists, because it hasn't received the community testing, feedback and improvements that make great open source software as good as it is. While new functionality is interesting, and we're trying to spend time getting familiar with new stuff in order to use it in production later, it simply isn't prudent to put business-critical data in a system that hasn't received real-world testing by as large a community as possible (unless you have no other alternative, and then you takes your chances).

Yet it seems to me that this is essentially what Sun/MySQL continue to propose for the Enterprise customers by delivering "value add" functionality in a special version of the server or plugins to it, possibly in a closed-source form that further reduces transparency and introduces risk. Mårten, I'd prefer it to be otherwise. How can I help you change your mind about this?

Sunday 7 October 2007

MySQL and materialized views

I'm working on alternative strategies to make the use and maintenance of a multi-terabyte data warehouse implementation tolerably fast. For example, it's clear that a reporting query on a 275-million row table is not going to be fun by anyone's definition, but that for most purposes, it can be pre-processed to various aggregated tables of significantly smaller sizes.

However, what is not obvious is what would be the best strategy for creating those tables. I'm working with MySQL 5.0 and Business Objects' Data Integrator XI, so I have a couple of options.

I can just CREATE TABLE ... SELECT ... to see how things work out. This approach is simple to try, but essentially unmaintainable; no good.

I can define the process as a BODI data flow. This is good in many respects, as it creates a documented flow of how the aggregates are updated, is fairly easy to hook up to the workflows which pull in new data from source systems, and allows monitoring of the update processes. However, it's also quite work intensive to create all those objects with the "easy" GUIs in comparison to just writing a few simple SQL statements. There are also some SQL constructs that are horribly complicated to express in BODI; in particular, COUNT(DISTINCT ..) is ugly.

Or I could create the whole process with views on the original fact table, with triggered updates of a materialized view table in the database. It would still be fairly nicely documentable, thanks to the straightforward structure of the views, and very maintainable, as the updates would be automatic. A deferred update mechanism, with a trigger keeping track of which part of the materialized view needs an update and a periodic refresh via a stored procedure, would keep things nicely in sync. MySQL 5.0 even has all of the necessary functionality.

Except.. It's only there in theory. The performance of views and triggers is so horrible that any such implementation would totally destroy the usability of the system. MySQL's views only work as statement merge when there is a one-to-one relationship between base table and view rows, or in other words, the view can not contain SUM(), AVG(), COUNT() or any of the other mechanisms which would have been the whole point of the materialized view in question. It will fall back to a temp table implementation in these cases, and creating a GROUP BY temp table over 275 million rows without using the WHERE clause is pure madness.

In addition, defining any triggers, however simple, slows bulk loads to the base tables by an order of magnitude. I could of course still work around triggers by implementing the equivalent logging in each BODI workflow and creating the materialized views and a custom stored proc to update each one, but having a view there in between was the only way to make this approach maintainable. Damn, there goes that strategy.
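For what it's worth, the trigger-free workaround could look roughly like the sketch below: the load step itself logs which slices of the aggregate it touched, once per batch instead of once per row. Again this is an assumption-laden illustration, not the BODI workflow itself. Python's sqlite3 stands in for MySQL, and every name in it (fact_sales, mv_daily_sales, mv_dirty, bulk_load, refresh_dirty) is hypothetical.

```python
import sqlite3

# Hypothetical schema: no trigger on the fact table this time.
SCHEMA = """
CREATE TABLE fact_sales (day TEXT, product TEXT, amount REAL);
CREATE TABLE mv_daily_sales (day TEXT PRIMARY KEY, total REAL, n_rows INTEGER);
CREATE TABLE mv_dirty (day TEXT PRIMARY KEY);
"""


def bulk_load(conn, rows):
    """Insert a batch of (day, product, amount) rows and log the touched days.

    The dirty log gets one row per distinct day in the batch, written in the
    same transaction as the load, so the two cannot drift apart.
    """
    rows = list(rows)
    dirty_days = sorted({day for day, _product, _amount in rows})
    with conn:
        conn.executemany(
            "INSERT INTO fact_sales (day, product, amount) VALUES (?, ?, ?)",
            rows)
        conn.executemany(
            "INSERT OR IGNORE INTO mv_dirty (day) VALUES (?)",
            [(day,) for day in dirty_days])
    return dirty_days


def refresh_dirty(conn):
    """Re-aggregate the logged days into the aggregate table, then clear the log.

    Assumes insert-only loads: a day whose fact rows were all deleted would
    keep its old aggregate row.
    """
    with conn:
        conn.execute("""
            INSERT OR REPLACE INTO mv_daily_sales (day, total, n_rows)
            SELECT day, SUM(amount), COUNT(*)
            FROM fact_sales
            WHERE day IN (SELECT day FROM mv_dirty)
            GROUP BY day
        """)
        conn.execute("DELETE FROM mv_dirty")
```

The catch is the one already noted: every workflow that writes to the fact table has to remember to do this logging, and each aggregate needs its own hand-written refresh, which is exactly the maintenance burden the view was supposed to remove.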

Friday 21 September 2007

MySQL Community vs Enterprise tension

I probably don't spend quite enough time following progress around MySQL considering how critical the product is to us. I'd like to consider it part of the infrastructure in the way I treat Red Hat Enterprise Linux, i.e. something I can trust to make good progress and follow up on a quarterly basis. Naturally we have people who watch both much more closely, but my time simply should be, and pretty much is, spent doing something else.

However, it seems MySQL really demands a bit more attention right now. Today I went and read Jeremy Cole's opinion about MySQL Community (a failure), and I have to say I agree on many of the points. MySQL simply has not yet found a model that works as well as that of Red Hat's Fedora vs Enterprise Linux - that is, really giving the Community edition to the community to direct, and using the Enterprise edition as a platform for enterprises to depend on.

I feel the fundamental problem really is quite simple; as long as MySQL maintains the community edition (both binaries AND the source tree) itself, and doesn't let the community integrate features into it on a timely basis, the model will not function, not even for its paying customers (us included). However, if they reverse this particular point from the current status quo, all of the other benefits are inevitable.

The comparison to Fedora and RHEL is rather obvious, despite the distribution vs single product differences. Fedora is a great community Linux distribution with the latest-and-greatest features integrated into it in a very timely fashion. Not even Ubuntu can really compete with Fedora in terms of features. However, what Fedora gives up to reach this is a certain amount of polish and reliability. I will happily use Fedora as a personal platform, because of the latest features, but I would not pretend to run a stable system on top of it. For that, I'd rather choose something a bit more mature, that has proven itself in the community and received further QA ahead of commercial release. This is RHEL, and this is what MySQL Enterprise should be. A version that, when it's released, I shouldn't have to hesitate to install on a new production server.

Today I also learned about the Dorsal Source MySQL community release. Now this looks like what the MySQL Community release probably should be like. I'll have to give it a test round and see what's up.

Update: Baron Schwartz describes a MySQL Enterprise that I would have far less trouble using than the existing one.