
Tag - measurement


Thursday 18 July 2013

The difference between being demanding and acting like a jerk

Sarah Sharp is a member of a very rare group. She's a Linux kernel hacker. Even among that group, she's unique - not because of her gender (though that probably is distinctive in many of the group's social gatherings), but because she's brave enough to demand civil behavior from the leaders of the community. I applaud her for that.

Now, I have immense respect for Linus Torvalds and his crew. I've been a direct beneficiary of Linux in both professional and personal contexts for nearly two decades. The skills this group demonstrates are possibly matched only by the level of quality they demand from each other. Unfortunately, that bar is often applied only at the technical level, while the tone of discussion, both in person and on the mailing lists, can turn quite hostile at times. That has been documented many times, and I can add no value by rehashing it.

However, I wanted to share some experience from my own career as a developer, as a manager of developers, and as someone who has both been described as demanding and needed to report to others under very demanding circumstances. I've made some of these mistakes myself, and I hope I've learned from them.

Perhaps not so surprisingly, the people in the community who defend hostile behavior also, almost without exception, misunderstand what it means to behave professionally. There's a huge difference between behaving as one does in a workplace where people are getting paid and behaving professionally. The latter is about promoting behaviors that lead to results. If being an asshole were effective, I'd have no problem with it. But it's not.

To consistently deliver results, we need to be very demanding of ourselves and of others. Being an uncompromising bastard with regard to results will always beat accepting inferior ones when we measure technical progress over the long run -- though sometimes experience tells us a compromise truly is "good enough".

However, that should never be confused with being a bastard in general. Much can (and should) be said about how being nice to others makes us all that much happier, but I have something else to offer. Quite simply: people don't like being called idiots or having personal insults hurled at them. They don't respond well to those circumstances. Would you like it yourself? No? Then don't expect anyone else to, either. It's not productive, and it will not ensure better results in the future.

Timely, frequent and demanding feedback is extremely valuable to results, to the development of an organization, and to the personal development of the individuals involved. But there are different types of communication, and not all of it is feedback. Demanding better results isn't feedback; it's setting and communicating objectives. Commenting on people's personalities, appearance, or anything else about them, let alone demanding that they change who they are as a person, is neither feedback nor reasonable. Feedback is about observing behavior and demanding changes in that behavior, because behavior leads to results. Every manager has to do it, whether they're managing a salaried or a volunteer team.

However, calling people names is bullying, under all circumstances. While it can appear to produce results (say, making someone withdraw from an interaction, thus "no longer exhibiting an undesirable behavior"), those results are temporary and come with costs that far outweigh the benefits. It drives away people who could have been valuable contributors. What's not productive isn't professional. Again: I'm not discussing how to be a nicer person, but how to improve results. If hostility helped, I'd advocate it, despite it not being nice.

The same argument applies to hostile language even when it's directed not at people but at results. Some people are more sensitive than others, and if by not offending someone's sensibilities you get better overall results, the change in behavior is worth it. However, unlike "do not insult people", the use of swearwords at something other than people is a cultural issue. Some groups are fine with it, or indeed enjoy the occasional chance to hurl insults at inanimate objects or pieces of code. I'm fine with that. But nobody likes to be called ugly or stupid.

In the context of the Linux kernel, does this matter? After all, the current approach seems to have worked for 20 years, and has produced something the world relies on. Well, I ask you this: is it better to have people selected (or have them select themselves) for Linux development by their technical skills and their capability to work in an organized fashion, or by those things PLUS an incredibly thick skin and the capacity to take insults hurled at them without being intimidated? The team currently developing the system is of the latter kind. Would they be able to produce something even better if that last requirement were dropped? Does hostility help the group become better? I would say hostility IS hurting the group.

Plus, it sets a very visible, very bad precedent for all other open source teams, too. I've seen other projects wither and die because they copied the "hostility is OK, it works for Linux" mentality while losing out on the skills part of the equation. They didn't have to end that way, and it's a loss.

Monday 29 April 2013

Analytics infrastructure of tomorrow

If you happen to be interested in the technologies that enable advanced business analytics, like I am, the last year has been an interesting one. A lot is happening, on all levels of the tech stack from raw infrastructure to cloud platforms and to functional applications.

As Hadoop has really caught on and is now a building block even for conservative corporations, several of its weaknesses are beginning to be tackled. From my point of view, the most severe has been the terrible processing latency of the batch- and filesystem-oriented MapReduce approach, compared to solutions designed around streaming data. That's now being addressed by several projects: Storm provides a framework for dealing with incoming data, Impala makes querying stored data more processing-efficient, and Parquet is coming together to make the storage itself more space- and I/O-efficient. With these in place, Hadoop will move from its original strength in unstructured data processing to a compelling solution for dealing with massive amounts of mostly-structured events.

Those technologies are a bear to integrate and, in their normal mode, require investment in hardware. If you'd prefer a more flexible start to building a solution, Amazon Web Services has introduced a lot of interesting stuff, too. Not only have the prices for compute and storage dropped; they now offer I/O capacities comparable to dedicated, FusionIO-equipped database servers, very cost-efficient long-term raw data storage (Glacier), and a compelling data warehouse/analytics database in the shape of Redshift. The latter is a very interesting addition to Amazon's existing database-as-a-service offerings (SimpleDB, DynamoDB and RDS) and, as far as I can tell, gives it a capability other cloud infrastructure providers are today unable to match -- although Google's BigQuery comes close.

The next piece in the puzzle must be analytical applications delivered as a service. It's clear that the modern analytics pipeline is powered by event data - whether it's web clickstreams (Google Analytics, Omniture, KISSMetrics or otherwise), mobile applications (such as Flurry, MixPanel, Kontagent) or internal business data, it's significantly simpler to produce a stream of user, business and service events from the operational stack than it is to try to retrofit business metrics on top of an operational database. The 90's style OLTP-to-OLAP Extract-Transform-Load approach must die!

However, the services I mentioned above, while excellent in their own niches, cannot produce a 360-degree view across the entire business. If they deliver dashboards, they lack deeper customer insight. Even if they're able to report on customers, they don't integrate with support systems. They leave holes in the offering that businesses have to plug with ad-hoc tools. While that's understandable, since they're built on technologies that force nasty compromises, those holes are still unacceptable for a demanding digital business of today. And as the world turns increasingly digital, what's demanding today is going to be run-of-the-mill tomorrow.

Fortunately, the infrastructure is now available. I'm excited to see the solutions that will arrive to make use of the new capabilities.

Thursday 1 March 2012

Increase engagement with social analytics

Last week I discussed segmentation as a method for identifying and differentiating customers by their specific service needs. Whether used for a young cohort's introductory-period service, special treatment of high-value segments, or to identify a group on a transitional path to high value and help accelerate that process, segmentation is a very versatile tool for business and product optimization. It can be approached with many techniques, and I'll go into more implementation details on those later. But first, an introduction to the next topic after segmentation: social metrics.

While social behavior has historically not been a strong feature of many products, either in the gaming space or in the wider scope of freemium products, your customers and users are people, and thus they will have social interactions with others that you can benefit from. If you can capture any of that activity in your product measurement, it can serve as a very valuable basis for in-depth analytics. Today, I will focus on products and services whose audience can interact with each other - that is, where there is some sort of easily measured, directly connected community.

Any such product will probably have user segments such as:

  • new users who would benefit from seeing good examples of effective use of the product, guidance on the first steps, or some other introduction beyond what the product can do automatically or what your sales or support staff can scale to
  • enthusiasts who would like nothing better than to help the first group
  • direct revenue contributors who either have a lot of disposable income, or otherwise find your service so valuable to them that they'll be happy to buy a lot of premium features or content
  • people who, though they're not top customers themselves, find innovative ways to use premium features for extra value
  • people who are widely appreciated by the community for their contributions, "have good karma"
  • people whose influence within the community is on the whole negative due to disruptive behavior

and many, many others. Two of these groups are easy to identify simply from their own history; I'm sure you'll recognize which two. The other four are determined largely by their interaction with the rest of the community and by other users' reactions to their activities. How do you find them? This is a rapidly evolving field of analytics with a constantly growing pool of theoretical approaches and practical tools, and it can look daunting at first. The good news is that many practical tools already exist, and while theoretical background helps, the first steps aren't too hard to take.

You'll need to develop some simple way to identify interaction. The traditional way to begin is to define a "buddy list" of some sort, similar to the Facebook friends network, Twitter following, or a simple email address book. However, I find that a more "casual" approach of quantifying interactions works better for analytics. Enumerate comments, time in the same "space", exposure to the same content, common play time, or whatever works for your product. At the simplest level, this will be a list of "user A, user B, scale of interaction" stored somewhere in your logs or a metrics database. This is already a very good baseline. With the addition of a time dimension, you'll be able to measure the ebb and flow of social activities, but even that isn't strictly necessary.
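As a sketch of that simplest level, here's how raw interaction events could be rolled up into weighted "user A, user B, scale of interaction" edges. The event kinds, names and weights are purely illustrative assumptions, not from any real product:

```python
from collections import Counter

# Hypothetical raw interaction events: (user_a, user_b, kind).
events = [
    ("alice", "bob", "comment"),
    ("alice", "bob", "comment"),
    ("bob", "carol", "shared_space"),
    ("alice", "carol", "comment"),
]

# Weight each interaction kind; a comment counts more than mere co-presence.
weights = {"comment": 2, "shared_space": 1}

def interaction_edges(events, weights):
    """Aggregate raw events into (user A, user B, scale) edges.

    Pairs are stored in sorted order so (a, b) and (b, a) merge."""
    edges = Counter()
    for a, b, kind in events:
        pair = tuple(sorted((a, b)))
        edges[pair] += weights.get(kind, 1)
    return edges

edges = interaction_edges(events, weights)
print(edges[("alice", "bob")])  # two comments at weight 2 -> 4
```

The same aggregation works whether the source is a comment log, co-presence tracking, or play sessions; only the `weights` mapping changes.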

Up to a data set of about 100k users and half a million connections or so, you'll be able to do a lot of analysis just on your laptop. Grab such a data dump and a tool called Gephi, and you're just minutes away from fun stuff like visualizing whether connections are uniformly distributed or clustered into smaller, relatively separate groups (I bet you'll find the latter - social networks practically always have this "small world" property). This alone, even though it isn't an ongoing, easily comparable metric, will be very informative for your product design and community interaction.
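To make the clustering idea concrete, here's a minimal pure-Python sketch of the local clustering coefficient on a toy graph: two triangles joined by one bridge edge, the kind of structure a tool like Gephi makes visible. The data is synthetic; a real analysis would load your own edge dump:

```python
from itertools import combinations

# Toy graph: two triangles joined by a single bridge edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

def neighbors(edges):
    """Build an adjacency map from an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = adj[node]
    if len(nbrs) < 2:
        return 0.0
    links = sum(1 for x, y in combinations(nbrs, 2) if y in adj[x])
    possible = len(nbrs) * (len(nbrs) - 1) / 2
    return links / possible

adj = neighbors(edges)
avg = sum(local_clustering(adj, n) for n in adj) / len(adj)
print(round(avg, 3))  # 0.778 -- far above what a random graph this sparse gives
```

A high average like this is the numeric signature of the "small world" clustering the visualization shows.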

In terms of metrics and connected actions, here's a high-level overview of some of the more simple-to-implement things:

  • highly connected users are a great seed audience for new features or content, because they can spread messages fast, and giving them early access will make them more engaged. While in theory you'd want to reach people "in between" clusters, the most connected people are an easy, surprisingly well-functioning substitute.
  • those same people with a large number of connections are also critical hubs in the community, and you should protect them well, jumping in fast if they have problems. This is independent of their individual LTV, because they may well be the connection between high-value customers.
  • a high clustering coefficient indicates a robust network, so you should aim to build one and increase that metric. Try introducing less-connected (including new) people to existing clusters, not simply to random other users. A cluster, of course, is a set of people who each have connections to most others in the cluster (i.e., a group with a high local clustering coefficient).
  • Once someone already has a reasonable number of semi-stable relationships (such as, 4-8 people they've interacted with more than once or twice), it's time to start introducing more variance, such as connecting them to someone who's distant in the existing graph. Most of these introductions are unlikely to stick, but the ones that do will improve the entire community a great deal.
  • if you can quantify the importance of the connections, e.g. by measuring the time or the number of interactions, you can identify the top influencers as distinct from the overall most connected people.
  • finally, when you combine these basic social graph metrics with the other user lifetime data I discussed previously, you'll get a whole new view into how to find crucial user segments and predict their future behavior. This merged analysis will give you measurable improvement far faster than burying yourself in advanced theories of social models, so take the low-hanging fruit first.
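A sketch of the first and fifth points above, assuming a weighted edge list; all names and numbers are made up. Plain degree finds the most connected user (a hub worth seeding and protecting), while weighted degree surfaces someone who interacts less widely but far more intensely (a candidate influencer):

```python
from collections import defaultdict

# Hypothetical weighted edge list: (user A, user B, interaction count).
edges = [
    ("hub", "u1", 1), ("hub", "u2", 1), ("hub", "u3", 1), ("hub", "u4", 1),
    ("deep", "u1", 9), ("deep", "u2", 8),
    ("u3", "u4", 1),
]

degree = defaultdict(int)           # number of distinct connections
weighted_degree = defaultdict(int)  # total interaction volume
for a, b, w in edges:
    degree[a] += 1; degree[b] += 1
    weighted_degree[a] += w; weighted_degree[b] += w

# Most connected user: a candidate seed for new features and a hub to protect.
print(max(degree, key=degree.get))                    # hub
# Heaviest interactor: a candidate top influencer, despite fewer contacts.
print(max(weighted_degree, key=weighted_degree.get))  # deep
```

The two rankings deliberately disagree here, which is exactly why both metrics are worth tracking.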

That's it for yet another introductory post. Time for feedback: what other analytics areas would you like to see high-level explanations about, or would you rather see this series dive into the implementation details on some particular area? Do let me know, either via comments here, or by a tweet.

Tuesday 21 February 2012

There's no such thing as an average free-to-player

A quick recap: in part 1 of this series, I outlined a couple of basic ways to define a customer acquisition funnel and explained how it falls short when measuring freemium products, in particular free-to-play. In part 2, I continued with two alternative measurement models for free-to-play and focused on Lifetime Value (LTV) as a key metric. A core feature emerged: the spread of possible LTV across the audience is immense, ranging from 0 to, depending on the product, hundreds or perhaps even thousands of euros.

This finding isn't limited to money spending; it appears across all kinds of behavior, and is well documented for social participation as the 90-9-1 rule. From a measurement point of view, one of its most overlooked consequences is how it destroys averages as a representative tool. At the risk of stating the obvious, an example follows.

When 90% of a freemium product's players choose the free version and 10% choose to pay, the average LTV is obviously just 1/10th of the average spending of the paying customers. However, when there's not just a variety of price points but in fact a scale of spending tied to consumption (or if we're valuing something other than spending, such as time), the top 1% is likely to spend 10x or more what the next 9% do. Simple math will show you that the top 1% is more valuable than the rest of the audience in total, as illustrated here:
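The numbers behind the illustration follow directly from the text: 90% spend nothing, 9% spend one unit each, and the top 1% spend ten units each.

```python
# The spending distribution: (share of audience, spend per user in units).
segments = [
    (0.90, 0),   # free players
    (0.09, 1),   # ordinary payers
    (0.01, 10),  # top spenders: 10x the next tier
]

average_ltv = sum(share * spend for share, spend in segments)
print(round(average_ltv, 2))  # 0.19 -- yet nobody actually spends 0.19

# The top 1% alone out-earn everyone else combined:
top = segments[-1][0] * segments[-1][1]  # 0.10 units per audience member
rest = average_ltv - top                 # 0.09 from the other 99%
print(top > rest)  # True
```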

The average? 0.19. Now, can you identify the group represented by "an average spending of 0.19" in the above example? Of course you can't - there is no such group. Averages work fine when what you're measuring follows some approximation of a normal distribution (e.g., the heights of people), but they break down on other kinds of quantities. Crucially, they break down on behavioral and social metrics. Philip Ball's book Critical Mass covers the history of these measurements at some length, if you're interested.

Instead of measuring an average, you should identify your critical threshold levels. Those might be the actions or value separating the 90% from the 9%, and equally the 9% from the 1%. Alternatively, you might already have a good idea of your per-user costs and how much a customer needs to spend to be profitable. Measure how many of your audience are above that level. Identify and name them, if you can. Certainly try to track them over time so you can address them individually. This goes deeper than simply "managing whales", to use the casino term. Yes, the top 1% are valuable and important to special-case, but it's equally if not more important to determine the right strategies for developing more paying customers from the 90% majority.
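A sketch of threshold-based measurement on synthetic data mirroring the 90-9-1 shape, using a simple nearest-rank percentile; the 0.5 profitability cutoff is a made-up per-user cost:

```python
def percentile(values, pct):
    """Value below which pct% of the sorted data falls (nearest-rank)."""
    ordered = sorted(values)
    k = max(0, min(len(ordered) - 1, round(pct / 100 * len(ordered)) - 1))
    return ordered[k]

# Synthetic per-user lifetime spend: 90 free users, 9 payers, 1 top spender.
spend = [0] * 90 + [1] * 9 + [10]

p90 = percentile(spend, 90)  # the 90th-percentile user still spends 0
p99 = percentile(spend, 99)  # the threshold below the top 1%
profitable = 0.5             # hypothetical per-user cost to cover
above = sum(1 for s in spend if s > profitable)

print(p90, p99, above)  # 0 1 10
```

The thresholds and the count above the profitability line are stable, nameable quantities; the 0.19 average is not.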

This is why it's important to measure everything. If you only measure payments, the 90% majority will be invisible to your metrics, and it's usually very hard to know ahead of time which other measurements matter for identifying the activities that lead to spending. Instrumenting your systems to collect events on all kinds of activities on a per-user basis (rather than just system-level aggregates) enables a data mining approach to the problem. Collect the events, aggregate them across time for each player (computing additional metrics where appropriate), and then identify which pre-purchase activities separate the players who convert to paying from those in the same cohort who do not. There are several strategies for this, from decision trees to Bayesian filtering to all kinds of dimensionality reduction algorithms. The tools are already pretty approachable, even in open source, whether as GUIs like Weka, in R, or with Big Data solutions like Apache Mahout, which works on top of Hadoop.
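As a toy illustration of that separation step, here's a comparison of per-user activity aggregates between converters and non-converters; a decision tree essentially automates this kind of gap-finding across many features at once. The feature names and numbers are invented:

```python
# Synthetic per-user aggregates: (chat_msgs, items_browsed, converted).
users = [
    (12, 3, True), (9, 4, True), (11, 2, True),
    (1, 5, False), (2, 6, False), (0, 4, False),
]

def group_mean(rows, idx, converted):
    """Average of feature `idx` within one conversion group."""
    sel = [r[idx] for r in rows if r[2] == converted]
    return sum(sel) / len(sel)

# A big gap between group averages flags a pre-purchase behavior worth
# a deeper look; a small gap suggests the feature doesn't discriminate.
for name, idx in [("chat_msgs", 0), ("items_browsed", 1)]:
    gap = group_mean(users, idx, True) - group_mean(users, idx, False)
    print(name, round(gap, 2))
# In this made-up data, chat activity separates the groups; browsing doesn't.
```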

Essentially, this approach will surface a customer acquisition funnel akin to what I described earlier, but built from the raw measurement data. It will probably reveal things about your product and its audience you would not have identified otherwise, and allow you to optimize the product for higher conversions. The next step in this direction is to replace the binary "is a customer" criterion above with the measured per-player LTV value. Now, instead of a funnel, you will reveal a number of correlations between types of engagement and purchase behavior, and will be able to further optimize for high LTV. Good results depend on having a rich profile of players across their lifetimes. A daily summary of the various activities, in a wide table with a column for each activity and a row per player per day, is a great source for this analysis.
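Building that wide table is a straightforward pivot from raw events. A minimal sketch over hypothetical (player, day, activity) events:

```python
from collections import defaultdict

# Raw events: (player, day, activity). Names and dates are invented.
events = [
    ("p1", "2012-02-20", "login"), ("p1", "2012-02-20", "chat"),
    ("p1", "2012-02-20", "chat"),  ("p2", "2012-02-20", "login"),
    ("p1", "2012-02-21", "login"),
]

# One column per distinct activity, one row per (player, day).
activities = sorted({a for _, _, a in events})
table = defaultdict(lambda: dict.fromkeys(activities, 0))
for player, day, activity in events:
    table[(player, day)][activity] += 1

for (player, day), counts in sorted(table.items()):
    print(player, day, counts)
```

Each row is then a ready-made feature vector for the correlation and classification analysis described above.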

Friday 10 February 2012

Developing metrics for free-to-play products

In my previous post, I outlined a few ways in which a "sales funnel" KPI model changes between different businesses, and argued that it really doesn't serve a free-to-play business well. Today, I'll summarize a few ways in which a free-to-play model can be measured effectively.

Free-to-play is a games industry term, but the model is more general. In effect, it is one where a free product or service exists not only as a trial step on the way to converting a paying customer, but can serve both the user and the business without a direct customer relationship, for example by increasing the scale of the service and making more content available. From a revenue standpoint, a free-to-play service is structured to sell small add-ons or premium services to users on a repeat basis - in the games space, typically in individual transactions ranging from a few cents to a couple of dollars in value.

As I wrote in the previous article, it's this repeated-small-transaction feature that makes conversion funnels of limited value to free-to-play models. A profitable business depends on building customer value over a longer lifetime (LTV), and thus retention and repeat purchases become the more important attributes and measurements. Here is where things become interesting, and common methodologies diverge between platforms.

Facebook games have standardized on measuring the number and growth of daily active users (DAU), engagement rate (measured as the % of monthly users active on an average day, i.e. DAU/MAU), and average daily revenue per user (ARPDAU). These are good metrics, primarily because they are very simple to define, measure and compare. However, they also have significant weaknesses. DAU/MAU is hard to interpret, as it is pushed up by high retention but down by high growth, yet both are desirable. Digital Chocolate's Trip Hawkins has written numerous posts about this; I recommend reading them. ARPDAU, on the other hand, hides a subtle but crucially important fact about the business: because there is no standard price point, LTV will range from zero to possibly very high values, and an average value will bear no relation to either the median or the mode. This is, of course, the Long-Tail-like Pareto distribution in action. Why does this matter? Because without understanding the impact of the extreme ends of the LTV range on the total, investments will be impossible to target and the implications of changes impossible to predict, as Soren Johnson describes in an anecdote about Battlefield Heroes ("Trust the Community?").
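For concreteness, the standard definitions computed over a toy two-day window; real use would cover a full 30-day month, and all numbers here are invented:

```python
# Daily sets of active user ids, plus daily revenue, for a tiny window.
daily_active = {
    "day1": {"a", "b", "c", "d"},
    "day2": {"a", "b"},
    # ...the remaining days of the month would follow
}
daily_revenue = {"day1": 2.0, "day2": 1.0}

# MAU: unique users across the window; DAU: average daily active count.
mau = set().union(*daily_active.values())
total_dau = sum(len(users) for users in daily_active.values())
avg_dau = total_dau / len(daily_active)

engagement = avg_dau / len(mau)                 # the DAU/MAU ratio
arpdau = sum(daily_revenue.values()) / total_dau  # revenue per daily active

print(round(engagement, 2), round(arpdau, 2))  # 0.75 0.5
```

Note how a burst of new signups would grow `mau` immediately while `avg_dau` lags, dragging DAU/MAU down even though the business is improving: the interpretation problem described above.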

Another way of structuring the metrics is to look at measured cohort lifetimes, sizes and lifetime values. Typically, cohorts are defined by their registration/install/join date. This practice is very instructive and permits in-depth analysis and conclusions about performance improvements: are the people who joined our service or installed our product last week more or less likely to stay active and turn into paying users than those from four weeks ago? Did our changes to the product help? Assuming you trust that later cohorts will behave similarly to earlier ones, you can also use the earlier cohorts' overall long-term performance to predict the future performance of currently new users. The weakness of this model is the rapidly increasing number of metrics, as every performance indicator is repeated for every cohort. Aggregation becomes crucial. Should you aggregate all data older than a few months into one bucket? Does your business exhibit seasonality, so that you should compare this New Year's cohort to last year's rather than to December's? In addition, we have not yet done anything here to address the fallacy of averages.
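A minimal sketch of such a cohort retention comparison, with synthetic join weeks and per-user activity sets:

```python
# Users keyed by join week, each with the set of weeks they were active in.
cohorts = {
    "week1": {"a": {1, 2, 3}, "b": {1, 2}, "c": {1}},
    "week2": {"d": {2, 3}, "e": {2}},
}

def retention(cohort, join_week, offset):
    """Share of a cohort still active `offset` weeks after joining."""
    active = sum(1 for weeks in cohort.values() if join_week + offset in weeks)
    return active / len(cohort)

# Week-1 retention, cohort by cohort: are later joiners sticking better?
print(retention(cohorts["week1"], 1, 1))  # 2 of 3 still active a week on
print(retention(cohorts["week2"], 2, 1))  # 1 of 2 -- did our changes help?
```

Every additional KPI gets the same per-cohort treatment, which is exactly how the metric count balloons.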

The averages problem can be tackled to some degree by further splitting cohorts over some feature other than the join date, such as the source by which users arrived, their geographic location, or some demographic data we may have about them. This will let us understand that French gamers spend more money than those from Mexico, or that Facebook users are less likely to buy business services than LinkedIn users. This information comes at the further cost of ballooning the number of metrics, and will ultimately require automating significant parts of the comparison analysis, sorting data into top-and-bottom percentiles, and highlighting changes in cohort behavior.

Up until now, all the metrics discussed have been simple aggregations of per-user data into predefined breakdown segments. While I've introduced a few metrics that can take some practice to learn to read, the implementations of these measurements are relatively trivial - only the comparison automation and highlight summaries might require non-trivial development work. Engineering investments may already be fairly substantial, depending on traffic numbers and the amount of collected data, but the work is fairly straightforward. In the next installment, I will discuss what happens when we start to break the aggregates down by something other than pre-determined cohort features.

Tuesday 7 February 2012

Metrics, funnels, KPIs - a comparative introduction

I know, I know - startup metrics have been preached about for years by luminaries like Mark Suster, Dave McClure, Eric Ries and many, many others. The field is full of great companies and tools like KISSmetrics, ChartBeat, Google Analytics, to name but three (and do a great disservice to many others). Companies like Facebook and Zynga collect and analyze oodles (that's a technical term) of data on their traffic, customers and products, and have built multi-billion dollar businesses on metrics. Surely everything is done already, and everyone not only knows that metrics matter, but also how to select the right metrics and implement a robust metrics process? There's nothing to see here, move along... or is there?

Metrics depend on your business as much as your business depends on them. No, more, in fact. It is possible (though hard) to build a decent, if not awesome business purely on intuition, but it is not possible to define metrics without understanding the business. Applying the wrong metrics is a disaster waiting to happen. In fact, in some ways this makes building a robust metrics platform more difficult than building the product it's supposed to measure. Metrics can't exist ahead of the product, but are needed from the beginning. Sure, with experience you will learn to pick plausible candidates for KPIs, and may even have tools ready for applying them to new products, but details change, and sometimes, with those details, the quality of the metrics changes dramatically. This is obviously true between industries like retail vs entertainment, but it's also true between companies working in the same industry.

This is a big part of why metrics aren't a solution to a lack of direction. They can be part of a solution, in that well-chosen metrics will make progress or the lack of it obvious, and may even provide clear, actionable targets for developing the business. But someone still needs to have an idea of what to do, and that insight feeds back into all parts: product, operations, measurement. I've never liked the phrase "metrics driven business" for this reason. Metrics don't do any driving. They're the instrumentation that tells you whether you're still on the road and what your speed is. You still have to decide whether that's the right road to be on, and whether you should be moving faster, or perhaps at times slower.

What to do, then? Well, understanding the differences helps. Let's start with a commonly applied metrics model: the sales funnel.

In a business-to-business, face to face sales driven business, a traditional funnel may begin with identifying potential customer opportunity, then measuring the number of contacts, leads, proposals, negotiations, orders, deliveries and invoices. A well managed business will focus on qualified customers and look for repeat transactions, as the cost per opportunity will likely be lower, and the revenue per order may be higher, leading to greater profitability. They will also look at the success rate between the steps of the funnel, trying to improve the probability of developing an opportunity into a first order.

This model is often adapted to retail: advertising, foot traffic, product presentation, purchase decision. For some businesses that's it - others will need to manage the delivery of the product, and may see further opportunities in service, cross-sales, or otherwise. Online retail businesses measure every step in much greater detail, simply because it is easier to do so. Large retail chains emulate that measurement with very sophisticated foot traffic measurement systems. But even in its simplest form, while the shape is similar, the steps of the funnel are very different.

Online businesses have developed a variety of business models, among which two large categories are very common: advertising and freemium.

The advertising-funded two-sided market model comprises two different funnels: a visibility - traffic - engaged traffic - repeat traffic page-view model, and a more traditional sales funnel for the advertisers, though even the latter has, through automation, been turned into something more like an online retail model than what the advertising industry is used to. This model is further enhanced by traffic segmentation and intent analysis, allowing targeted advertising and a real-time direct marketing product whose sales funnel bears even less resemblance to the one I described at the beginning.

Freemium isn't even a single business model: a B2B service with a tiered product offering and a free, time- or feature-limited trial product may ultimately use the traditional sales model, only with the opportunity-to-prospect part fully automated. Often it's automated to the point that a customer never needs to (or perhaps even wants to, given a simple enough product) talk to a sales rep. Still, the basic structure holds: some of the prospective leads turn into customers and carry the business forward. The free service, be it for trial only or for the starter-level segment, is a marketing cost and a lead qualifier, enabling a smaller sales force.

On the other hand, the free-plus-microtransactions model, one we pioneered with Habbo Hotel and one that has since been used to great success by many, including Zynga, can certainly be described as a funnel, but measuring it as one does significant violence to many details. The most important of these is that because individual transactions are typically of very low value, building a profitable business on a model which aims for, and measures, one sale per customer is practically impossible. This class of business doesn't just benefit from repeat customers; it requires them. Hence, a free-plus (or, as it is called in the games industry, free-to-play) business model must replace the "new customer" count and individual transaction values with the measurement of customer lifetime value. Not just measuring it on average, but trying to predict it individually - both to develop 0- or low-value users (oh, how I hate that word) into higher value by giving them better value or experience, and to identify the high-value customers so as to serve and pamper them to the best of the company's ability (within reason and profitability, of course). And once you switch the way you value revenue, you really need to switch the way you measure things pre-revenue.

Funnels change. There are business models where funnels really can't provide the most instructive KPIs, even though they may still be conceptually helpful in describing the business. As this post is getting long, more on the details of free-to-play KPIs in the next episode.

Wednesday 6 July 2011

Zynga's ARPU doubling? Not quite

Apparently the pundits and analysts have today gotten around to reviewing Zynga's ARPU figures from their S-1 filing (Inside Social Games, Eric von Coelin). Something seemed fishy in these calculations, and since I'm home for a day, I had the opportunity to review the filing's figures on a computer rather than just a tablet. Yep, people, you're comparing apples to oranges. Zynga's monetization rate is improving, but it's nowhere near as dramatic as you're making it look. Did you already forget that they defer revenue? You can't compare GAAP deferred revenue to non-deferred DAU/MAU figures! Use the bookings data instead.

This is what the S-1 filing states about the difference:

"Bookings is a non-GAAP financial measure that we define as the total amount of revenue from the sale of virtual goods in our online games and from advertising that would have been recognized in a period if we recognized all revenue immediately at the time of the sale. We record the sale of virtual goods as deferred revenue and then recognize that revenue over the estimated average life of the purchased virtual goods or as the virtual goods are consumed. Advertising revenue consisting of certain branded virtual goods and sponsorships is also deferred and recognized over the estimated average life of the branded virtual good, similar to online game revenue. Bookings is calculated as revenue recognized in a period plus the change in deferred revenue during the period. For additional discussion of the estimated average life of virtual goods, see the section titled “Management’s Discussion and Analysis of Financial Condition and Results of Operations—Revenue Recognition.”

Zynga is of the opinion that bookings more accurately represents their current sales activities, and I fully agree. After all, this is not a subscription business we're talking about! If you're as hard-core a geek about these things as I tend to be, the description of when a booking turns into revenue is discussed on pages 62-63 of the filing:

"Durable virtual goods, such as tractors in FarmVille, represent virtual goods that are accessible to the player over an extended period of time. We recognize revenue from the sale of durable virtual goods ratably over the estimated average playing period of paying players for the applicable game, which represents our best estimate of the average life of our durable virtual goods"

This deferral means that during periods of rapid growth, ARPU appears to decline, while periods of flat or declining traffic would seem to improve ARPU, because earlier deferred revenue is recognized against the current, not the earlier, user base.
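In code, the S-1's definition is just this. The quarter's figures below are made up for illustration, not Zynga's actuals:

```python
def bookings(revenue_recognized, deferred_start, deferred_end):
    # Per the S-1: bookings = revenue recognized in the period
    # plus the change in deferred revenue during the period.
    return revenue_recognized + (deferred_end - deferred_start)

# A hypothetical fast-growth quarter: much of the new sales sit in deferral.
rev_m = 100.0                      # GAAP revenue recognized, $M
defer_start_m, defer_end_m = 200.0, 250.0
bookings_m = bookings(rev_m, defer_start_m, defer_end_m)   # 150.0

avg_dau_m, days = 60.0, 91         # average DAU in millions, days in quarter
arpu_rev = rev_m / avg_dau_m / days          # revenue-based daily ARPU
arpu_book = bookings_m / avg_dau_m / days    # bookings-based daily ARPU
# While the deferral balance is growing, the bookings-based rate
# runs well ahead of the GAAP revenue-based one.
```

Run the two ARPU lines side by side and the "doubling" illusion becomes obvious: the same sales activity yields two very different per-user figures depending on which revenue definition you divide by.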

With these covered, what are the actual sales figures? The average daily Bookings-to-DAU rate is somewhat higher than the Revenue-to-DAU rate: $0.051 (B) in Q1 of this year vs. $0.042 (R). Both seem to have plateaued at that level after growing from $0.030 (B) / $0.017 (R) a year ago. Respectable, but not earth-shattering -- and the growth, while impressive, isn't quite "more than doubled".

Tuesday 11 May 2010

LOGIN presentation on Habbo's Flash transition and player-to-player market

Had my presentation as one of the first sessions of this year's LOGIN conference. Darius Kazemi liveblogged the speech at his blog, and the slides are here. Best viewed together.

Thursday 30 April 2009

The difference between conversion and retention

Picked up a piece of analysis today from my newsfeed regarding Twitter's audience. Nielsen has posted information about Twitter's month-to-month retention (40%) and compared it to Facebook's and MySpace's. Pete Cashmore over at Mashable promptly misread the basic information and came to an entirely wrong conclusion, titling his post "60% quit Twitter in the first month". A simple misunderstanding of basic audience analysis like this is the crucial difference between explosively growing traffic and a failure. That's a fail for you, Pete.

What's wrong? Well, retention is a separate matter from conversion. A 40% conversion from trial registration to continuing active use in the second month would not be a bad conversion rate. It's not stratospherically great, I've seen better, but I wouldn't be terribly unhappy with such a figure. However, Nielsen didn't say anything at all about first-to-second-month conversion. This is what they DID say: "Twitter’s audience retention rate, or the percentage of a given month’s users who come back the following month, is currently about 40 percent."

That's pretty plain English when you take the time to read it. Month to month, regardless of visitor lifetime, not first to second month. On this metric, 40% retention is not good at all, and it will definitely be a limiting factor on Twitter's traffic and audience size over time, just as the Nielsen article points out (and shows the math for). For any given retention rate, there is a maximum audience size beyond which new traffic can no longer make up for the users who leave, since new traffic is not an inexhaustible supply.
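The math is easy to sketch: with monthly retention r and a steady inflow of n new users a month, the audience converges to n / (1 − r). The inflow and retention numbers below are purely illustrative, not Nielsen's data:

```python
def audience_over_time(new_per_month, retention, months):
    """Each month, keep `retention` of last month's audience and add fresh users."""
    audience = 0.0
    for _ in range(months):
        audience = audience * retention + new_per_month
    return audience

# At 40% retention, even 10M brand-new users every month caps the
# audience near 10e6 / (1 - 0.4), i.e. about 16.7M, however long you wait.
cap = audience_over_time(10e6, 0.40, months=120)
```

Raising retention moves that ceiling far more than buying extra traffic does, which is exactly why the distinction between retention and conversion matters.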

And since today is a busy day, that concludes the free startup advice. Take the time to understand the difference between these metrics, you'll thank yourself for it later.

Wednesday 19 November 2008

Looking for an ETL engineer for our BI team

So, I mentioned earlier that I was looking at Infobright's Brighthouse technology as a storage backend for heaps and heaps of traffic and user data from Habbo. It turns out it works fine (now that it's in V3 and supports more of the SQL semantics), and we put it into production. We've been pretty happy with it, and I expect to talk more about the challenge and our solution at the next MySQL Conference in April 2009.

However, our DWH team needs extra help. If you're interested in solving business analytics problems by processing lots of data and the idea of working in a company that leads the virtual worlds industry excites you, let us know by sending us an application. Thanks for reading!
