Fishpool

Tag - English

Sunday 25 February 2007

Returned

I came back from a two-week snowboarding trip to Zermatt -- my first to Switzerland, and what an excellent trip that was. Great powder, great freeriding, great weather. Will post some photos once I've sorted through them, but in the meantime, Sanna took some as well and posted them to her moblog.

On another note, I haven't mentioned Jim Starkey's comments on my previous post, but they're good reading for everyone interested in MySQL, with clarifications of some things I misunderstood in the documentation. I'm glad to hear that the "serial writes" don't in fact mean just one thread writing, and that he believes the engine will at a later stage allow multiple tablespaces per logical database.

Thursday 18 January 2007

First thoughts regarding the MySQL Falcon storage engine

One of my DBA colleagues mentioned that MySQL has released the first alpha version of the Falcon storage engine, which is advertised as making the most efficient use of modern hardware to provide a high-performance, scalable replacement for InnoDB, on which MySQL naturally wants to reduce its dependency.

Unfortunately, just based on reading the Falcon documentation, I must conclude that without extensive further development, it won't be usable for very large installations such as the ones we run for Habbo, for a number of reasons. I'm usually much more positive about MySQL, since it is, after all, the technology that has enabled Habbo to grow more than 100% every year I've been working on it, but this is a disappointment.

It supports just one tablespace per database, and each tablespace stores all data in a single file. While the concurrency problems of single-file access can be eliminated with careful application of modern kernel, filesystem and disk subsystem technology, single file databases still suffer from major administration issues.

Since a database can't be extended with additional tablespaces and have its data migrated by the storage engine, you'd better either trust your capacity to indefinitely increase the available storage space under one filesystem, or be sure that downtime will never be a problem for you. Don't even think about deploying Falcon without a high-end NAS device that supports many times your current storage requirements, reliable logical volume management and an extendable file system. A database is also limited by the filesystem's maximum file size, so make sure that won't be a problem either. I wouldn't recommend ext3 for Falcon.

You'll also need to make backups either via SQL dumping the entire database (not really feasible for daily routine) or by backing up a single file, so either your filesystem, LVM system or storage device must support snapshot backups. Scalability may still become an issue, so be sure that the approach you choose doesn't degrade performance as file size grows.

Just one thread writing to disk may at first blush sound like an excellent performance maximisation technique, but it forces you to make a choice between reliability (since it applies to log writes too, transactions are committed to disk in a serialized fashion - no concurrency) and scalability ("commits" to RAM cache and background disk flushes certainly will perform well and scale nicely, but what if there's a power failure?). And this is not even the road to the highest possible performance - the highest-end disk subsystems will become CPU limited if only one thread is able to send I/O requests.
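To put rough numbers on that trade-off, here's a back-of-envelope sketch. It takes the single-writer reading of the documentation at face value and assumes every durable commit costs one synchronous flush of fixed latency. The function name and the 5 ms figure are illustrative assumptions, not measurements of Falcon.

```python
def max_commits_per_second(flush_latency_ms: float, commits_per_flush: int = 1) -> float:
    """Upper bound on durable commits per second for one thread flushing serially.

    flush_latency_ms  -- assumed time for one synchronous log flush to reach disk
    commits_per_flush -- how many transactions share a single flush (1 = no batching)
    """
    if flush_latency_ms <= 0:
        raise ValueError("flush latency must be positive")
    if commits_per_flush < 1:
        raise ValueError("at least one commit per flush is required")
    # A single writer can only complete this many flushes back to back.
    flushes_per_second = 1000.0 / flush_latency_ms
    return flushes_per_second * commits_per_flush


# Hypothetical 5 ms flush: strictly serialized commits versus batches of ten.
serial_rate = max_commits_per_second(5.0)
batched_rate = max_commits_per_second(5.0, commits_per_flush=10)
print(serial_rate, batched_rate)
```

Under these assumptions the only ways past the ceiling are a faster flush or letting several transactions share one, which is exactly where the reliability question comes back in.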

With one tablespace comes one cache/buffer pair, so developers are either forced to split their data model into multiple logical databases or suffer under one unpartitionable system where one bad table scan by one part of the application wipes the buffers from underneath the entire application. A truly modern storage system permits the DBA to assign certain tables or indices to their own caches and buffer spaces and retain a single logical model for software developers. MySQL has never had this ability, and apparently Falcon won't bring it, either.

A more traditional DBA might also cringe at the statement "it is impossible to predict or calculate the disk storage space required for a specific dataset." Many, many complaints could be made about the alpha-release's other restrictions, but I'll give MySQL the chance to keep their promise to address them in forthcoming versions.

I don't really understand which of its features qualify it as technology that utilises modern computers to the best possible effect. Perhaps they're referring to it automatically compressing data on disk? Sure, that may be useful, but it may just as well become a bottleneck when single-row updates require entire pages to be recompressed. That feature alone doesn't impress me. It's not more easily administered, nor does it (on paper at least) address these kinds of performance issues. At best, it's an upgrade to MyISAM, but it shouldn't be mistaken for a solution to high-performance transactional database requirements.

More on it once I've had a chance to do some practical experimentation (might be a while).

Update: It seems Peter Zaitsev has benchmarked Falcon against MySQL's other storage engines, verifying my suspicion that it doesn't scale properly. Do note that neither MyISAM nor InnoDB shows ideal scaling performance either.

Tuesday 2 January 2007

Three predictions for 2007

In the fine tradition of educated guessing, I thought I'd try to preview what's going to happen, or not, in the Internet world over the next 12 months. Here goes...

Continuing the ten-year tradition of "mobile internet overtakes xxx" punditry that fails to materialize, mobile video and social networking services will not become mainstream in 2007. The mobile 1" experience combined with severely limited input mechanisms just won't replace larger-format mediums. Mobile does have a niche in short-format updates and will gain limited popularity as a snapshot pictures upload vehicle.

Social web technology continues to find popularity in corporate environments, with Wikis replacing traditional document management systems in more and more companies. Consumer-oriented social networking hits peak growth amongst fierce competition in a niche-sites vs mass media battle.

TV companies are forced to accept the fact that Internet video services are able to provide a more relevant experience to their viewers, but few are able to actually deliver a satisfying service. YouTube continues to encroach on their field, despite outcries of copyright violations. By the end of the year, DVRs will be marketed by their capability of showing Internet video content on televisions in addition to recording broadcast stuff.

Update: whoa, it seems I was really, really late with my last prediction. Let me qualify that a bit -- I'm not personally going to be satisfied by things like an Apple HTPC able to download stuff from iTunes, Xbox 360 movie downloads from Microsoft, or other such walled-garden solutions. The Sony Bravia thing might be a tiny step closer to what I had in mind, but ultimately I expect someone to provide a box which is not locked to single-vendor content. Something like the Democracy Player on a TV would fit the bill better. If the Venice Project ends up on a non-PC living room device, that'd be It.

Monday 4 December 2006

Smart uses for RSS

I've thought for quite a while that RSS is one of the most important, and most often overlooked, technologies around for building interesting services. Overlooked because it has potential for much more than just the currently-widespread blog-type syndication of (new) content.

I think there's amazing potential in delivering all kinds of information to users via RSS feeds personalized to their individual profile - whether based on recommendations, locale, activities, or, as the new service FeedCycle does, by serializing access to previously published stuff in a sequence that makes most sense to the recipient. Think "my pals say Galactica is really cool, but I missed its first season - so I'd like to watch it two episodes a week to catch up". Or, one of those continuing novella-based stories many magazines publish.
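The "two episodes a week" idea boils down to giving each item in a back catalogue its own release date per subscriber. Here's a minimal sketch of that scheduling step, assuming an ordered list of items and a per-subscriber start date. The names are hypothetical and this is not how FeedCycle itself is implemented.

```python
from datetime import date, timedelta


def drip_schedule(items, start, per_week=2):
    """Pair each back-catalogue item with the date it should show up in the feed.

    items    -- already-published entries, oldest first
    start    -- the day this subscriber begins catching up
    per_week -- how many items to release each week
    """
    if per_week < 1:
        raise ValueError("per_week must be at least 1")
    schedule = []
    for index, item in enumerate(items):
        week, slot = divmod(index, per_week)
        # Spread the week's releases roughly evenly across its seven days.
        offset_days = week * 7 + (slot * 7) // per_week
        schedule.append((start + timedelta(days=offset_days), item))
    return schedule


# Hypothetical subscriber who starts a four-episode catch-up on 4 December 2006.
episodes = ["S1E1", "S1E2", "S1E3", "S1E4"]
schedule = drip_schedule(episodes, date(2006, 12, 4), per_week=2)
for release_day, episode in schedule:
    print(release_day.isoformat(), episode)
```

A personalized feed would then list only the entries whose release date has already passed for that subscriber.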

Tuesday 28 November 2006

Value chain of content business in the Web 2.0

Since I sabotaged this blog a month ago by transferring it to a new platform that totally destroyed all the permalinks and with that my already rather pathetic Google PageRank, I figured this might be a good time to look at changing my editorial policy a bit. Until now, I've sort of avoided talking about work-related matters, which kind of limits the field since practically everything I might want to write about could be construed to be related. So, I figure - why not comment on things? I'm commenting on these things in a more official capacity in more closed circles anyway.

A couple of weeks ago I was trying to introduce a group of IT directors to Web 2.0, which is a fairly thankless task to do in 45 minutes, given that the buzz seems to have no clear boundaries at all, so basically anything whatsoever could be claimed to fall under that moniker. In fact, one definition of Web 2.0 seems to be "whatever the speaker thinks to be cool today". I hope someone in the audience got something out of that.

Anyway - one of the points in that presentation most relevant to me was that the value of content in this world of syndication, aggregation, user-created content and mashups might not be in the actual content itself, but in the way services are able to provide relevant views into the ocean of it based on the reader's context and preferences. It seems that I'm not alone in thinking this - Bear Stearns Media Research has just published an analysis claiming that the value lies precisely in the packaging of content. An illustration perhaps explains:

[Illustration: Sweet spot in the content value chain]

The presentation is a useful read if you're interested in the subject, but repeats quite a few basic ideas for anyone who is already familiar with concepts such as the Long Tail etc. Essentially, the analyst (Spencer Wang) is making the prediction that the popularity of user-generated content (or other content coming from outside the traditional Big Media supplies) is going to increase, making it a much more significant fraction, if not the majority, of total media consumption. This idea should not be too strange to anyone reading blogs in the first place. Furthermore, because this increases the supply sources dramatically, it's going to be increasingly valuable to consumers to find relevant content, which leads to the success of aggregators and packagers. Once again, much along the lines I've been thinking as well, although not necessarily with the particular companies he chooses to name in that illustration above.

Saturday 28 October 2006

Blog moved

I had to make a choice between finding new hosting for the server running the old Fishpool.org, finding an equivalent service somewhere else, or moving to a new type of blog hosting. Since I wasn't all that enthusiastic about continuing to maintain the server anyway, I decided to make use of the blog service of GANDI, my domain registrar.

Unfortunately this switch means the permalinks to the old posts changed. Sorry about that.

Wednesday 30 August 2006

Fedora and Nortel VPN?

I'm trying to make a Nortel VPN and my FC5 laptop talk to each other. NetworkManager + vpnc doesn't seem to have much luck -- I just get a cryptic error message. Novell has another implementation called turnpike which according to their docs might work, but I can't get it to build on Fedora. Nortel's own client, such as it is, doesn't look like it'll ever work - not that I really want to touch a closed-source IPsec implementation anyway. Anyone have better luck?

These notes might be the key to making it work, but it doesn't seem like it's been integrated...

Tuesday 1 June 2004

Fu**ing spammers

In addition to the 100+ spam emails a day, I am now receiving 15+ bounces a day for porn spam which is purportedly coming from my address. I hate these bastards. The good news is that it's a criminal offense that can land someone in jail for three years; the bad news is that I don't have the time or resources to track the assholes down.
