Friday, April 17, 2009

Twitter's limited API

When I first looked at the Twitter API I was surprised to see how simple it is to use. The next thing I noticed is how little it can do. In particular, as a business intelligence guy I was struck by the lack of sophisticated query methods.

The API does not provide a way to do simple things like get a sorted list of the people you follow, which is probably ok. It's easy enough to do your own sort in your client. But what if you want to sort the people you follow by, say, the number of people THEY follow? To do that, you need to make an API call for each person you follow to find out who they follow, and then do the sort in the client. This is pretty much hopeless for analysis purposes, especially considering the calls-per-hour limits the API has and the fact that there are lots of Tweeters following 20,000+ people.
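To make the cost concrete, here is a rough Python sketch of the client-side approach. The function names, the `fetch_following_count` callable (a stand-in for whatever API call returns one user's count) and the figure of 100 calls per hour are my own assumptions for illustration, not part of the Twitter API.

```python
import math
from typing import Callable, Iterable, List, Tuple


def rank_by_following_count(
    followed: Iterable[str],
    fetch_following_count: Callable[[str], int],
) -> List[Tuple[str, int]]:
    """Sort the people you follow by how many people THEY follow.

    fetch_following_count is a hypothetical stand-in for one API call
    per user; the sort itself has to happen in the client.
    """
    counts = [(user, fetch_following_count(user)) for user in followed]
    # Largest count first.
    return sorted(counts, key=lambda pair: pair[1], reverse=True)


def hours_needed(num_calls: int, calls_per_hour: int) -> int:
    """Whole hours needed to make num_calls under an hourly call limit."""
    return math.ceil(num_calls / calls_per_hour)


if __name__ == "__main__":
    # Fake data in place of real API responses.
    fake_counts = {"alice": 120, "bob": 20000, "carol": 5}
    print(rank_by_following_count(fake_counts, fake_counts.__getitem__))

    # One call per followed user; the limit of 100 is an assumed value.
    print(hours_needed(20000, 100))
```

Under that assumed limit, someone following 20,000 people would need 200 hours of calls before the sort could even start.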

Twitter doesn't deliver some API features because it can't afford to. The "firehose" is the jargon for a real-time feed of all tweets from everyone worldwide. Twitter has been promising this for some time but keeps delaying it. One reason may be that they are afraid there would be too many takers. I suspect the reason Google can offer so much storage to Gmail users is that no one uses it -- like a bank hoping there won't be a run. Using an add-on to store lots of data there is possible, but I think Google frowns on those shenanigans and will even block your account if you upload too fast... So as long as Twitter keeps growing at its current breakneck speed, there are some things the API won't offer because offering them could break their overstrained servers.

Another reason that Twitter might not add some functions to its API is that it wants to sell analyses as a value-added service. Twitter still doesn't have a business model (except "Microsoft or Google") but analytics is an obvious option. In fact I think that providing analytics is the only real prospect that Twitter has.

In particular, Twitter has a mechanism for providing that favorite business intelligence feature -- the real-time alert. For example, sighting a yellow-headed blackbird in Connecticut is unusual, but the news needs to be immediate for an ornithologist to profit from it.

Twitter is also about people, not just about information. It provides relatively detailed information about who knows who. In fact I see this as the key feature of the service, so I am surprised that so little effort is invested in suppressing spam.

But whatever the specific application, the ability to analyze Twitter's database is too valuable to give away, so I suspect Twitter will not do too much to make their API better for analytics in the near future.

Monday, April 13, 2009

Combining SaaS, Open Source and BI

The terms business intelligence (BI), software as a service (SaaS) and open source (sorry, OS means operating system to me, so no acronym) get juxtaposed a lot these days. Have a look at this as an example. The connection between SaaS (which kind of segues to cloud computing) and open source is twofold:
  1. An SaaS provider is a kind of license multiplier, so license fees are critical for him. That makes open source more attractive to him than to most others.
  2. Open source products are usually too technical for business users. But SaaS providers may shield their users from this complexity.
The upshot of all this is that open source may be very interesting to SaaS BI providers, even if it isn't very interesting to end users. There is plenty of chatter about this, but as usual a lot of the messages are aimed at the wrong recipient. It makes no sense to pitch open source SaaS to BI end users, because they don't care. But it makes a lot of sense to pitch open source BI to BI SaaS providers.

This whole thing is typical of the muddle surrounding BI, which is pretty technical but aimed at business users.

Thursday, April 09, 2009

Lyza, Gemini and QlikView

I've seen a recent rash of comparisons between Lyzasoft's Lyza and Microsoft's Gemini project. For example, here, here, here, here and here.

It's an interesting twist to the usual way you market a product. Lyzasoft seems to want to cash in on the buzz Microsoft is creating for its (competing) product. This in turn is connected to the buzz surrounding QlikTech.

Tuesday, April 07, 2009

SAS and business analytics

The discussion of SAS's new marketing campaign that differentiates between "business intelligence" and "business analytics" goes on. SAS has replied to some of the criticisms. To be honest, as I have already said, I have a lot of doubts on the subject.

On the other hand, I honestly don't care very much. At BARC a big part of our mission is helping customers find the right product. To do that, we make a big effort to help companies separate important information from less important information. When we advise customers about which product they should select, we never discuss the vendor's marketing material. We discuss the user's needs and the feature set of products that seem likely to fit those needs.

Nigel Pendse wrote a piece called "What's in a name?" some years ago. It's dated in all its details, but still rings true. It certainly is not a criticism of the products the vendors had on offer, or any recommendation pro or con of any of the products the vendors offer. That would have been a disservice to the vendors and to potential customers.

I don't always agree with the way vendors present their products, but in the end it's their business. And even when I like the way they present their products I don't recommend that anyone base a purchase decision on any marketing statement. The question is how well the product fits the users' needs.

I also wonder if SAS is overreacting. The company can hardly expect analysts not to react to a marketing message announcing the death of business intelligence. It is obviously intended to be provocative. It would be unrealistic to expect all the reactions it provoked to be positive. I don't think the best analysts are necessarily the ones that praise the vendors the most. Like Abraham Lincoln said, knavery and flattery are blood relations.

Update: This post is partly based on something Peter Thomas twittered. I hadn't realized that he also had a new blog entry on the topic when I wrote it.

Saturday, April 04, 2009

Excel as a planning tool

In the course of my consultancy I often come across examples of companies doing their planning in Excel. For small-scale scenarios this is fine, but we sometimes see amazingly big and complex systems built on Excel. I do not consider this to be best practice.

I have twittered about this a time or two and every time I get responses from people asking me what I mean. I think this is one of the great things about social media. As a (self-anointed) expert on BI I tend to think that this is an obvious point. In fact Twitter has reminded me that it is not.

What I am talking about is large sets of Excel spreadsheets, sometimes containing a good deal of complexity, which are sent around a company by email to collect planning data. Such systems often contain tens of thousands of Excel formulas and are sometimes augmented by BASIC code as well. In some cases the results are fed back into some transactional system, but not always.

So here are a few important points on this issue:
  • I am not at all critical of the idea of Excel addins. Addins are third-party products that enhance Excel. In fact, far from being critical of this class of product, I like them quite a bit, and MIS, the company I once worked for and which now belongs to Infor, was one of the many vendors of useful products of this type. Recently all the big BI vendors have piled into this market. An in-depth discussion of this type of software is also the topic of one of the most popular documents at OLAPReport (login required).

  • I also do not criticize companies and people who create this kind of system. In many cases it is the only way they have to deal with the complexity they are presented with in the limited time the planning cycle allows. The ingenuity I have seen put into some of these systems is amazing.

Nevertheless, I think these systems are very problematic, and any company using them should invest time and energy reviewing them and attempting to find a good way to replace them. The reason is that they are expensive to maintain, limited in functionality -- particularly in the area of analysis and simulation -- and inevitably suffer from data quality issues.