Thursday, June 30, 2011

Column oriented databases are not the same as in-memory databases

In recent years, thanks not least to aggressive marketing by QlikTech (or Qlik Technologies, as they are now often called), Tableau and Tibco Spotfire, columnar databases and in-memory databases have become very fashionable. Microsoft's VertiPaq engine, which is behind the PowerPivot product, is a good example of a tool that came in on the wave of this trend.

One result of this trend is that there seems to be some confusion about what the terms "in-memory" and "column oriented" mean, and characteristics of one are often attributed to the other.

Just to be perfectly clear: A columnar database is not necessarily in-memory, and an in-memory database is not necessarily columnar.

In-memory is a somewhat vague term, since, as Nigel Pendse likes to point out, all databases have to hold data in memory to process it -- the CPU cannot directly access the hard drive. However, I would say that unlike some other tools, IBM Cognos TM1 and QlikView are in-memory. These products load everything into memory before they do anything. If there is not enough memory to fit the entire data set, the load fails and that's that. The same applies to SAP HANA. But unlike QlikView and HANA, TM1 is a multi-dimensional database.

The loading behavior of an in-memory database is quite different from that of the MOLAP engine in Analysis Services, which is fundamentally disk-based but has sophisticated paging abilities to keep as much of the data as possible in memory, or of the column-oriented Spotfire, which attempts to load everything but falls back on paging if there is not enough memory.
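
To make the contrast concrete, here is a toy Python sketch of the two loading strategies. It is purely illustrative: the memory budget, the file-based data set and the function names are all invented, and none of this reflects how any of the products mentioned above are actually implemented.

```python
import os

MEMORY_BUDGET = 2 * 1024**3  # an invented limit: pretend we have 2 GB to work with


def load_fully_in_memory(path):
    """All-or-nothing style: load the whole data set up front or fail outright."""
    size = os.path.getsize(path)
    if size > MEMORY_BUDGET:
        raise MemoryError("data set does not fit in memory; load aborted")
    with open(path, "rb") as f:
        return f.read()  # the entire data set now lives in RAM


def read_page(path, offset, length=64 * 1024):
    """Paging style: keep the data on disk and bring in only the page that is needed."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)  # only this page is pulled into memory
```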

Columnar is a much clearer and simpler term. It simply means that the data is stored by column instead of by row. There are a large number of analytic databases with this architecture, such as Exadata, SAND, Greenplum, Aster, or Sybase IQ, just to name a few. Some, like Vertica and VertiPaq, even refer to their columnar architecture in their names. Some columnar databases are in-memory, but many are designed to deal with huge amounts of data, up to the petabyte range, and cannot possibly hold it all in memory.
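
For readers who have never seen the difference spelled out, here is a minimal, made-up Python illustration of row versus column storage. The table and numbers are invented, and real column stores add compression, encoding and indexing on top of this basic idea.

```python
# Three fact rows, invented purely for the example.
rows = [
    {"order_id": 1, "region": "EMEA", "revenue": 120.0},
    {"order_id": 2, "region": "APAC", "revenue": 75.5},
    {"order_id": 3, "region": "EMEA", "revenue": 310.0},
]

# Row-oriented storage: each record is kept together.
row_store = rows

# Column-oriented storage: each column is kept together as its own array.
column_store = {
    "order_id": [r["order_id"] for r in rows],
    "region":   [r["region"] for r in rows],
    "revenue":  [r["revenue"] for r in rows],
}

# Summing one column in the row store means touching every whole record...
total_from_rows = sum(r["revenue"] for r in row_store)

# ...while the column store only scans the single array it needs, which is
# also far easier to compress (the repeating "region" values, for instance).
total_from_columns = sum(column_store["revenue"])

assert total_from_rows == total_from_columns
```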

By the way, what got me off on this rant is this blog about Endeca Latitude 2, which actually equates the two technologies, and a LinkedIn discussion the author started (which is private, so I can't link to it here) with the title "Is Data Modeling Dead?"

The idea that in-memory databases kill data modelling comes from the fact that columnar databases are often used to discover hierarchies, and a whole generation of so-called "agile" in-memory database tools uses this method. But in-memory multi-dimensional databases are still around and still very useful for analyzing data with well-defined structures, such as financial data.

Tuesday, June 21, 2011

The end of Essbase Visual Explorer

I talked to Tableau and Oracle on the same day recently, so I managed to get both sides of this story.

Essbase Visual Explorer is an OEM version of Tableau Desktop. In other words, it is the Tableau product rebranded as an Oracle Hyperion product.

The OEM agreement was originally made between Tableau and Hyperion. It made sense for Hyperion, because they did not want to invest in in-house development on any new front-ends, but they needed something to liven up their Essbase offering. It made sense for Tableau because they were a tiny unknown company at the time, and hooking up with Hyperion, then one of the largest BI companies, was a great way to push sales and raise visibility.

But Tableau has moved on since then. Hyperion’s unwillingness to market Essbase aggressively meant that Tableau could not depend on Hyperion forever, and Tableau now supports a wide variety of data sources. They said to me that they were “sunsetting” the relationship. My impression was that only a small proportion of their customers are Visual Explorer customers and they are ready to move on.

Oracle inherited the relationship from Hyperion, but its strategy has been quite different to (and in my opinion more sensible than) Hyperion’s. Reading between the lines of what Paul Rodwick said to me, my guess is that Oracle thinks that Tableau got more out of the deal than Hyperion did. Be that as it may, as a stand-alone tool Visual Explorer does not fit well into Oracle’s ambitious plans to integrate Essbase with its reporting and analysis suite, OBIEE. Visual Explorer is still on Oracle’s price list but the recently released BI Foundation Suite combining Essbase and OBIEE does not include Visual Explorer.

So Oracle will continue to support Visual Explorer, but both Oracle and Tableau have indicated to us that they have little interest in continuing the relationship, and I do not expect Oracle to continue actively positioning Essbase Visual Explorer in the coming years.

Thursday, June 16, 2011

SAP raises the stakes with EPM 10

SAP is now adding new features to its (relatively) new integrated planning tool, BusinessObjects Planning and Consolidation, often still called BPC. It needs to show customers that it is up to more than just integrating its portfolio, and it also needs to face up to scrappy new competitors like Tagetik and De Facto, planning tools with a very similar architecture.

The presentation I saw yesterday was nominally about EPM, but SAP concentrated on Planning and Consolidation (BPC). It has many new features, including a new workspace and Web-based data input. This is something new, since the original OutlookSoft product only allowed planning in Excel; neither Tagetik nor De Facto offers this. The product also allows reporting on non-BPC BW data and has a slicker-looking Excel interface with some nice new features. Unfortunately, SAP couldn't get the template function working in Excel for the demo.

SAP says it will gradually move the whole thing to HANA but did not provide details. Having the database in memory is the best way to provide the performance planners demand. However, in my opinion HANA's architecture is not very well suited for planning. Column-database architecture is also found in other business intelligence tools such as QlikView and Microsoft PowerPivot, and they are not suited for planning either.

Planning involves adding new data to the database (as opposed to overwriting existing data), and the automatic, data-driven modelling features of this kind of database make it impractical to offer a simple way to add that data. You cannot "discover" new periods, scenarios or products in an ETL process. Multi-dimensional databases, with their predefined dimensions, are better for this kind of feature, because the idea of an "empty cell" for adding new plan data comes so naturally.
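
As a rough illustration of the "empty cell" point, here is a toy Python sketch. The dimensions, members and values are invented, and no real product works exactly like this.

```python
# Multi-dimensional style: the dimensions are declared up front, so every
# combination is an addressable (initially empty) cell, including future
# periods that have no data yet.
periods   = ["2011", "2012", "2013"]       # 2013 exists even before any data does
products  = ["Widgets", "Gadgets"]
scenarios = ["Actual", "Plan"]

cube = {}  # sparse cube: a missing key is simply an empty cell


def write_plan(period, product, scenario, value):
    assert period in periods and product in products and scenario in scenarios
    cube[(period, product, scenario)] = value  # filling an empty cell is trivial


write_plan("2013", "Widgets", "Plan", 5000)    # planning a future period just works

# Data-driven columnar style: the model is whatever the loaded rows contain,
# so "2013" does not exist anywhere until a fact row mentions it.
fact_periods = {"2011", "2012"}                # "discovered" from the loaded data
assert "2013" not in fact_periods              # there is nothing to write into yet
```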

Monday, June 13, 2011

How Information Builders gets around the Flash / HTML5 controversy

As I mentioned in a previous post, the iPad is still a hot topic for business intelligence vendors, and both Oracle and Information Builders have just come out with new iPad support. At its Summit 2011, Information Builders has been demoing the InfoAssist tool with the built-in ability to switch between rendering in Flash and HTML5 on the fly.

Information Builders is not exactly a newcomer to the world of using HTML to render rich content. In fact, the Information Builders Active Reports were originally rendered in HTML 4 and offered an amazingly rich user experience completely offline. But what was amazing back in the day is becoming more commonplace, with HTML5 making it much easier to build rich interactive content. This development shows that users and developers no longer have to take sides in the Flash vs HTML5 argument.

To me, the moral of the story is that Microsoft, Adobe and Apple may well be wasting their time fighting over the rich web development platform. The battle was always pretty artificial. And it isn't just large vendors like Information Builders who have this kind of multi-platform capability any more. HTML5 is no longer just for MP3 bloggers, and as cross-platform development specialists such as Appcelerator and the many Flash-to-HTML5 converters (including Adobe Wallaby) gain traction in the market, it is becoming less and less important to worry about the development tool used to get the content delivered.

Monday, June 06, 2011

The trouble with mobile user interfaces

Mobile user interfaces are a major trend in business intelligence. We read a constant stream of announcements by vendors about their support for mobile BI, even if the actual success stories are less common. I commented on business intelligence and the iPad here.

Timo Elliott made some interesting remarks about how mobile BI could have an effect beyond mobile, and this is my reply:

It should be pretty obvious to most people by this time that mobile user interfaces are going to play a major role in the future of computing. That is because the mobile user experience simply cannot be reproduced on the desktop. The question is how big a role.

But however big the role, this doesn't necessarily mean that graphical user interface design will be completely changed by mobile devices. Mobile interfaces aren't just new; they are also subject to completely different constraints than desktop systems are.

  • Pointing with your finger can be seen as a much more natural user interface than using a mouse, but it is also less accurate.
  • Simple interfaces fit small screens well, but they give users fewer clues as to what they can do next.
And so on. The user interface guidelines that have been developed for desktop systems still make sense, and as a result my guess is that mobile devices will not change desktop interface designs all that much.

Wednesday, June 01, 2011

SQL Crescent: A new step for Microsoft SQL

Crescent is the code name of the new business intelligence user interface Microsoft intends to release with the Denali release of SQL Server. You may have heard about it, because Microsoft has been releasing excited videos about it, featuring huge crowds of developers going wild over the technology. I have not seen much of the technology itself, but what I have seen seems to borrow heavily from the user interfaces of Tableau and the Google Data Explorer.

The technology is closely related to PowerPivot. It is based on a new hybrid semantic layer that either directly accesses a data source or uses PowerPivot's VertiPaq column-oriented storage engine. It is part of Microsoft's not yet fully realized plan to provide seamless conversion of Excel-based data marts to a server environment.
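
Microsoft has not published the design, so purely as a speculative sketch of the "either/or" idea, a hybrid semantic layer might look something like the following in Python. The class, the `run_sql` method and the cached column arrays are all my own invented stand-ins, not Microsoft's API.

```python
class HybridSemanticModel:
    """A hypothetical model that answers queries from an in-memory columnar
    cache when one exists, and otherwise passes the query through to the
    underlying source. Invented for illustration only."""

    def __init__(self, source, cached_columns=None):
        self.source = source                  # stand-in for a relational source
        self.cached_columns = cached_columns  # column arrays held in memory, or None

    def total(self, column):
        if self.cached_columns is not None:
            # Cached mode: scan the in-memory column array directly.
            return sum(self.cached_columns[column])
        # Pass-through mode: let the source do the work.
        return self.source.run_sql(f"SELECT SUM({column}) FROM facts")
```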

To me, the most intriguing thing about SQL Crescent is that Microsoft's SQL development team is dipping its toe into the area of business intelligence front ends. (I don't really include Reporting Services, because it is simply a reporting tool, not a specialized BI tool, and it works better with relational data than with multidimensional data.) Until now, real analysis has been the domain of the Office group. PerformancePoint was a bit of a halfway house between the groups, but in the end it was folded into SharePoint, where the dashboard component still resides. Currently, PowerPivot has only an Excel (or SharePoint Excel Services) user interface. So this seems like a big step.