Thursday, January 04, 2018

Negative energy prices and artificial intelligence

Since renewable energy has started to become popular, an odd problem has appeared in wholesale energy markets: negative prices.

In other words, energy plants sometimes pay their customers to take energy off their hands. Usually older, less flexible plants that can't shut down without incurring costs are affected.
One solution to this problem is batteries. The idea is to store the energy when it is overabundant, and use it later when it is expensive. This is sometimes called "peak shaving".

Batteries are a great idea, but not the only solution. Another is to simply find an application that is energy hungry and can be run intermittently.

One possible application for soaking up excess energy is desalination. For example, a desert region near an ocean could build solar plants to desalinate water during the day only. The question is whether building a desalination plant that only runs 12 hours a day is worth the savings in energy. 
Another way to make use of energy that might go to waste is using it to power computers that perform analytics. The energy demand of data centers is growing quickly.


One source of energy needs is Bitcoin. Bitcoin mining consumes huge amounts of energy, so it is a great example of a use for negative energy prices. In fact there are already a lot of bitcoin miners in Western China, where solar and wind installations have outstripped grid upgrades. In these areas renewable energy is often curtailed because the grid can't keep up. So the energy is basically free to the miners.

Extremely cheap bitcoin mining arguably undermines the whole concept, but here is a more productive idea: Training artificial intelligence. For example, have a look at gcp leela, a clone of Google Deepmind Alphago zero.
The entire source code is free, and it's not a lot of code. But that free code is just the learning model, and it's based on well-known principles. It's probably just as good as Deepmind Alphago Zero when trained, but they figure it would take them 1700 years to train -- unless of course they could harness other resources.



This is partly because they don't have access to Google's specialized TPU hardware. Whatever the reason, training it is going to burn through a lot of energy.
This would be a great application for negatively priced energy. Game playing is more a stunt than a commercial application, but when they are paying you to use the energy, why not? And as time passes, more useful AI apps will need training.
So it gets down to whether the business model of peak shaving with batteries makes more economic sense than banks of custom chips for training neural networks for AI in batches. The advantage of batteries is that you can sell the energy later for more, but it's not terribly efficient, and using it directly is a better idea. Cheap computer hardware and a growing demand for AI may fit this niche very well.
This puts a whole new twist on the idea that big tech companies are investing in renewables. These companies make extensive use of AI, which is trained in batch processes.
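
The comparison between storing cheap energy and using it directly can be put into rough numbers. The following is only a sketch: the function names and every figure in it are made-up placeholders, and it leaves out the cost of the batteries or the chips themselves, which would decide the question in practice.

```python
def battery_margin(buy_price, sell_price, round_trip_efficiency):
    """Profit per MWh bought when the energy is stored and sold later.

    Only the share that survives the charge/discharge cycle can be sold.
    """
    return sell_price * round_trip_efficiency - buy_price


def direct_use_margin(buy_price, value_per_mwh):
    """Profit per MWh bought when the energy is consumed right away."""
    return value_per_mwh - buy_price


# Hypothetical figures in currency units per MWh.
buy = -5.0             # negative price: the plant pays us to take the energy
sell_later = 40.0      # assumed price when the stored energy is sold
efficiency = 0.8       # assumed round-trip efficiency of the battery
training_value = 30.0  # assumed value of the AI training done with one MWh

print("battery:   ", battery_margin(buy, sell_later, efficiency))
print("direct use:", direct_use_margin(buy, training_value))
```

With a negative purchase price both margins improve by the same amount, so the choice comes down to the efficiency loss of the battery versus the value of the computation.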

Wednesday, January 03, 2018

Understanding Artificial Neural Networks

Artificial neural networks are computer programs that learn a subject matter of their own accord. So an artificial neural network is a method of machine learning. Most software is created by programmers painstakingly detailing exactly how the program is expected to behave. But in machine learning systems, the programmers create a learning algorithm and feed it sample data, allowing the software to learn to solve a specific problem by itself.

Artificial neural networks were inspired by animal brains. They are a network of interconnected nodes that represent neurons, and the thinking is spread throughout the network. 

But information doesn't fly around in all directions in the network. Instead it flows in one direction through multiple layers of nodes from an input layer to an output layer. Each layer gets inputs from the previous layer and then sends calculation results to the next layer. In an image classification system, the initial input would be the pixels of the image, and the final output would be the list of classes.



The processing in each layer is simple: Each node gets numbers from multiple nodes in the previous layer, and adds them up. If the sum is big enough, it sends a signal to the nodes in the next layer. Otherwise it does nothing. But there is a trick: The connections between the nodes are weighted. So if node A sends a 1 to nodes B and C, it might arrive at B as 0.5, and at C as 3, depending on the weights in the connections.
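
Here is a minimal sketch of a single node in Python. The function name, the threshold and the weights for the three-input node are made up for illustration; only the 0.5 and 3 come from the example above.

```python
def node_output(inputs, weights, threshold):
    """Add up the weighted inputs and fire (1) if the sum reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


# Node A sends a 1; the weights on the connections decide what arrives.
signal_from_a = 1
weight_a_to_b = 0.5
weight_a_to_c = 3
arrives_at_b = signal_from_a * weight_a_to_b
arrives_at_c = signal_from_a * weight_a_to_c
print(arrives_at_b, arrives_at_c)

# A node with three incoming connections (illustrative values).
fires = node_output(inputs=[1, 0, 1], weights=[0.5, 2.0, 0.7], threshold=1.0)
print(fires)
```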

The system learns by adjusting the weights of the connections between the nodes. To stay with visual classification, it gets a picture and guesses which class it belongs to, for example "cat" or "fire truck". If it guesses wrong, the weights are adjusted. This is repeated until the system can identify pictures.
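
The guess-and-adjust loop can be sketched with the classic perceptron rule on a toy problem. This is a deliberately simplified stand-in: real image classifiers have many layers and are trained with gradient-based methods, and the data set, learning rate and function names here are assumptions chosen to keep the example small.

```python
def predict(weights, bias, inputs):
    """Guess class 1 if the weighted sum is positive, otherwise class 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(samples, epochs=20, learning_rate=0.1):
    """Repeat over the samples and nudge the weights after every wrong guess."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, label in samples:
            error = label - predict(weights, bias, inputs)
            if error != 0:
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
    return weights, bias


# Toy data: two inputs, class 1 if at least one input is on (logical OR).
samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train(samples)
print([predict(weights, bias, inputs) for inputs, _ in samples])
```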

To make all this work, the programmer has to design the network correctly. This is more an art than a science, and in many cases, copying someone else's design and tweaking it is the best bet.

In practice, neural network calculations boil down to lots and lots of matrix math operations as well as the threshold operation the neurons use to decide whether to fire. It's fairly easy to imagine all this as a bunch of interconnected nodes sending each other signals, but fairly painful to implement in code.

The reason it is so hard is that there can be many layers that are hard to tell apart, making it easy to get confused about which is doing what. The programmer also has to keep in mind how to orient the matrices the right way to make the math work, and other technical details. 
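
To give a feel for the matrix view, here is a small sketch in plain Python with no libraries. Each layer is a matrix with one row per node and one column per input, and the shape check shows what happens when a matrix is oriented the wrong way. The layer sizes, weights and function names are invented for the example.

```python
def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    if any(len(row) != len(vector) for row in matrix):
        raise ValueError("matrix columns must match vector length")
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def step(values, threshold=1.0):
    """Threshold operation: a node fires only if its sum is big enough."""
    return [1 if v >= threshold else 0 for v in values]


def forward(layers, inputs):
    """Pass the inputs through each layer in order."""
    activations = inputs
    for weights in layers:
        activations = step(matvec(weights, activations))
    return activations


hidden = [[0.5, 0.5, 0.0],
          [0.0, 1.0, 1.0]]  # 2 nodes, 3 inputs each
output = [[1.0, 1.0]]       # 1 node, 2 inputs

print(forward([hidden, output], [1, 1, 0]))
```

Swapping the two layers, as in forward([output, hidden], [1, 1, 0]), raises the ValueError, which is the kind of mistake that is easy to make once there are many layers.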

It is possible to do all this from scratch in a programming language like Python, and recommended for beginner systems. But fortunately there is a better way to do advanced systems: In recent years a number of libraries such as TensorFlow have become available that greatly simplify the task. These libraries take a bit of fiddling to understand at first, and learning how to deal with them is key to learning how to create neural networks. But they are a huge improvement over hand-coded systems. Not only do they greatly reduce programming effort, they also provide better performance.

Sunday, March 13, 2016

Alphago probably isn't learning from Lee Sedol

There has been quite a bit of discussion about whether Alphago can learn from the games it plays against Lee Sedol. I think not. At least, not directly. 
The heart of the program is the “policy network”, a convolutional neural network (CNN) that was designed for image processing. CNNs return a probability that a given image belongs to each of a predefined set of classifications, like “cat”, “horse”, etc. CNNs work astonishingly well, but have the weakness that they can only be used with a fixed-size image to estimate a fixed set of classifications.
The policy network views go positions as 19×19 images and returns the probability that a human player would make each of the 361 possible moves. These probabilities drive the Monte Carlo tree search for good moves that has been used for some time in go computers. The policy network is trained on 30 million positions (or moves) initially.
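
As a rough illustration of what "one probability per move" means, the sketch below turns 361 raw scores into probabilities with a softmax. The use of softmax is my assumption about the final step, and the random scores are only a stand-in for the output of a real network.

```python
import math
import random

BOARD_SIZE = 19


def softmax(scores):
    """Convert raw scores into positive values that sum to 1."""
    highest = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


random.seed(0)
# Stand-in for the network output: one raw score per board point.
scores = [random.gauss(0, 1) for _ in range(BOARD_SIZE * BOARD_SIZE)]
probabilities = softmax(scores)

# The most probable move, converted back to board coordinates.
best = max(range(len(probabilities)), key=probabilities.__getitem__)
row, col = divmod(best, BOARD_SIZE)
print(len(probabilities), (row, col))
```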
CNN (aka “deep learning”) behavior is pretty well understood. The number of samples required for learning depends on the complexity of the model. A model of this complexity probably requires tens of thousands of example positions before it changes much.
The number of samples required to train any machine learning program depends on the complexity of the strategy, not on the number of possible positions. For example, Gomoku ("five in a row", also called gobang) on a 19×19 board would take many fewer examples to train than go would, even though the number of possible positions is also very large.
Another point: Any machine learning algorithm will eventually hit a training limit, after which it won’t be able to improve itself by more training. After that, a new algorithm based on a new model of game play would be required to improve the play. It is interesting that the Alphago team seems to be actively seeking ideas in this area. Maybe that is because they are starting to hit a limit, but maybe it's just because they are looking into the future.
So Alphago probably can’t improve its play measurably by playing any single player five times, no matter how strong. That would be “overfitting”. The team will be learning from the comments of the pro players and modifying the program to improve it instead.
Interesting tidbit: Alphago said the chance of a human playing move 37 in game 2 was 1 in 10,000. So the policy network doesn’t decide everything.

Alphago is a learning machine more than a go machine

The key part of Alphago is a convolutional neural network. These are usually used for recognizing cat pictures and other visual tasks, and progress in the last five years has been incredible.
Alphago went from the level of a novice pro last October to world champion level for this match. It did so by playing itself over and over again.
Chess programs are well understood because they are programmed by humans. Alphago uses an algorithm to pick a winning move in a given go position. But the heart of the program is a learning program to find that algorithm, not the algorithm itself.
Go programs made steady progress for a decade with improved tree pruning methods, which reduce the total number of positions the program has to evaluate. The cleverest method is Monte Carlo pruning, which simply prunes at random. 

Saturday, April 27, 2013

Actian acquires Paraccel


Paraccel was founded in 2005 by former Netezza executives as a big data alternative to Netezza's DW appliance idea. Paraccel runs on a commodity platform. It got lots of VC money (I've heard estimates as high as $90m) but has comparatively few customers. 

Paraccel apparently hoped to generate a little revenue with hosting deals, which are always low priced. Or maybe they were just in it for the publicity. The technology is used for MicroStrategy Wisdom. I doubt MicroStrategy paid them very much, as is typical for deals like this. Recently they entered a deal with Amazon to license their technology for Redshift, which Amazon is reselling at very low rates. With the takeover, Amazon no longer owns a share of Paraccel.

Paraccel's struggles aren’t very surprising, since so many other vendors in the space except Teradata lost their independence in 2010/2011. Sybase, Netezza, Kickfire, Greenplum, Vertica and Aster Data were all acquired. Also HP killed Neoview in the same time frame. 

Actian is the newish name for Ingres, and controls the open source database by the same name. It also has several other databases including VectorWise. There seems to me to be a good deal of overlap between VectorWise and Paraccel. Actian is not a high profile company but it will be interesting to see what strategy they adopt to squeeze cash out of this highly funded and presumably expensive company.

Wednesday, March 27, 2013

How headlines misrepresent data

Have a look at this TIME article. The headline is "Americans Are Eating Fewer Calories, So Why Are We Still Obese?".

It goes on to cite two studies. One study shows that old people eat less fast food than young adults, and that in general food consumption is declining. This comes from the aging population I guess. The other shows a small decline (about 7%) in the calorie consumption of children.

Neither study showed a decrease in calorie consumption by adult Americans, as far as I can tell. The naive assumption would be that consuming fewer calories would result in less obesity, but no data on the topic is presented. The headline is a complete misdirection.

The moral of the story is you should limit your observations to what the data actually says.

Thursday, March 14, 2013

BI Trends for 2013

Analysts like to provide a list of predictions for the coming year. I've never been a big fan because I think that real tipping points are pretty rare in any industry. People tend to overestimate the short-term change and underestimate long-term change.

Here are some trend ideas that are floating around the internet. There are basically four kinds:

New data sources: This mostly means Big Data, whatever that means. (Personally, I like Amazon's definition, which is "too much data to handle on a single server". Other definitions strive to include appliances like Teradata and HANA.)

New applications: Sentiment analysis, predictive analytics, collaboration

New UIs and functions: Better dashboards and visualizations, more self service and agility, voice interfaces, mobile.

New platforms: This is mostly in memory, cloud and SaaS.

My take on all this is simple:

New data sources: I think big data is still mostly for online and mobile providers. It's true that manufacturers and retailers are trying to figure out how to make better use of the huge amounts of data their business directly and indirectly generates. But this business is still heavily dependent on boutique providers that bring a lot of domain knowledge and deep understanding of statistics into the deal. I do not think it will have much impact on existing BI business. It's something different.

New applications: The same remarks apply to sentiment analysis as to big data. Predictive analytics is a more interesting market, but to find large, non-specialist audiences, vendors need to prove that their "black box" predictions are as reliable as the expertise of business users without explaining the math behind them.

New user interfaces and functions continue to appear, but I believe that as long as most BI companies stick to solving the easy problems, like making software look cool in a demo, and ignore the harder problems, like user-friendly data governance, there will be no big changes here. Mobile has surprised me, but it still hasn't made a big difference in the BI business.

New platforms. It is good to remember that business users don't care what platform is used, and that the most successful projects are controlled by business users. A platform is only good in the sense it delivers things like speed and convenience. It doesn't add any value per se.

Friday, December 30, 2011

Everything is a computer

Christmas shopping this year really drove home to me how completely the electronics industry has made itself obsolete in recent years. The electronics stores are emptying out. All the gadgets that have been so popular in recent decades -- cameras, camcorders, VCRs, tape recorders, CD players, portable music boxes, dictation machines, game consoles, pagers, PCs, notebook computers, TVs, radios, pocket calculators, GPS navigation devices, synthesizers, mixing consoles and of course telephones (mobile and land based) -- have all disappeared into smart phones.

But smart phones aren't really phones at all, they are just palmtop computers that include an interface for cellular networks and a phone app. They are called smart phones for marketing reasons -- because the phone companies use them to lock consumers into overpriced network contracts. If photography were the killer app, they would be called smart cameras, which is just as appropriate. The only thing that seems to be keeping the entire electronics industry from being swallowed up by the black hole of Moore's Law is this kind of marketing wheeze and the (rapidly falling) price of screens.

I'm old enough to remember when Radio Shack was a national treasure. Now it's just a place to buy batteries while being harassed by a hard selling salesman from some telecoms oligopolist best known for hating its customers. And in this age of smart hearing aids, robotic assembly lines and fly-by-wire airplanes, it's not just consumer electronics that is being computerized.

Wednesday, December 14, 2011

Mobile business intelligence expectations

There is a discussion of some remarks I made to Ann All at IT Business Edge here, together with some opinions by Howard Dresner.

The interview reflects my opinion pretty well. I see mobile BI as a way to find new types of customers for BI more than as a way to replace existing installations. Too bad I didn't mention salespeople on the road, I think they are an important potential market as well.

Another point is that I think most people who said they expected mobile BI to be in use within 12 months were being too optimistic. The BI Survey 11 will address this question.

Thursday, November 17, 2011

Chasing new trends

I'm talking to Tibco about Spotfire. It's an interesting product that I've reviewed before. It seems to me that they are moving more and more into operative BI, which fits the Tibco idea of fast data delivery very well. They also seem to be putting more emphasis on ROLAP than they used to.

What is interesting is that they are also presenting a social media tool called Tibbr (wonder how they came up with that name!) and a cloud version. There's nothing wrong with this of course, but it seems to me that it doesn't fit their bus and/or ROLAP approach very well.

Their justification for the investment is that some analyst or other is predicting fast growth in this area. This reminds me of what an important role analysts play in the market. Thanks to the analysts, a lot of BI vendors are jumping on the cloud bandwagon, even though the cloud sales channel is very different from what most BI vendors are accustomed to, and even though moving data off site and then back on site adds complexity.

Thursday, August 18, 2011

Enterprise BI and agility

I'm talking to QlikTech about their new version, coming soon. I'll be publishing the results in the BI Verdict. They are focusing more and more on these bigger accounts. I think this story fits pretty well in my informal series of posts on the subject of agile BI. By coincidence I have already discussed QlikView in a previous post.

QlikTech has what they call a "land and expand" policy, which means getting a single department on the tool and expanding from there. Actually, all BI companies that can deliver departmental solutions have something similar. The reason for this is simple: The cost of sales for selling to a company that is already using the tool is much lower than for completely new customers. In fact, a lot of BI tools spread through companies from department to department this way.

Now QlikTech is concentrating more on enterprise accounts. So it's interesting to see that the company is moving away from the previous claim that the tool is a replacement for a data warehouse. I think that any attempt on their part to compete as an enterprise solution would just distract them from their end users.

A lot of BI companies go through a similar life cycle as they grow. Most start out as ways to create departmental solutions, which tend to be faster, more agile projects. As they get bigger management tends to concentrate on larger accounts, which means making sure they are acceptable to the IT department. But IT is more interested in keeping processes running than in agile development. As a result, the products tend to become more complex and less suitable to agile solutions.

This is a big issue for QlikView right now because they have grown so quickly in recent years. Currently the company seems to be backing away from radical changes in the tool. But it applies to any BI tool that is growing.

Thursday, July 28, 2011

Why short projects are part of agile business intelligence

One of the key ideas in agility is the importance of delivering real, testable results without delay. In fact, the Agile Manifesto recommends delivering working software frequently, from a couple of weeks to a couple of months.

Delivering working software within two months may sound a bit extreme, but there is good evidence that short projects are more successful than long projects. In fact, our research in the BI Survey shows that the application should be rolled out to the users less than six months after the product has been selected. We have found the same result year after year in the ten year history of the Survey. Amazingly, project benefits start to fall as early as a month after the product is selected, and continue to fall thereafter. And of the many project parameters we study, none shows as clear an effect on project success as project length.

These results from the BI Survey provide clear empirical support for the idea of using agile methods in business intelligence projects. The results have also remained consistent since we started the Survey ten years ago, long before the idea of agile development or agile business intelligence became popular.

But why do short projects work so much better? Our research shows that the main problems that arise in longer projects are organizational, not technical. Needs change over the course of time, and end users lose interest in the project. Disagreements over the project goals arise. Lack of interest and disappointed end users are a major issue in business intelligence.

And needs certainly do change quickly in business intelligence. For example, another study we carried out shows that three quarters of European companies modify their planning processes one or more times a year. In an environment like this, a project that takes a year to implement is quite likely to be obsolete before it is finished. Even a six month wait can push potential users to look around for a more agile solution.

The problem this creates is that not all business intelligence projects can be carried out within a few months. This is especially true when major data management issues need to be addressed. The agile solution to this is to find ways of splitting large projects into smaller ones. The usual argument against this approach is that it creates the risk of reducing efficiency in the long term. But the agile methodology is to measure success in terms of working software delivered in the short term, instead of attempting to meet nebulous long-term goals.

Sunday, July 17, 2011

Business Intelligence, the semantic Web and Alltop

My blog is included on the Alltop business intelligence page, and at the time of writing I display their badge on the blog.

According to the site 'The purpose of Alltop is to help you answer the question, “What’s happening?”' Alltop is a 1990's style Web registry maintained by human editors.

But Alltop's business intelligence page has several problems that make it less useful than it could be. The page has fallen victim to a semantic Web style gotcha. Like many other phrases, business intelligence means different things to different people. If you don't disambiguate somehow, a web registry based on a phrase may make no sense.

There are three distinct meanings of the phrase "business intelligence". The first is something about software for analyzing business data -- like my blog. The second is news about businesses, which is interesting but unrelated. These are some of those blogs:

MEED NEWS, EMARKETS.DE, B2B TRADE, DEALBOOK, ARUNDEL BUSINESS NEWS, FOREX TRADING INFO, THE FINANCE KID, ARBITRAGE MAGAZINE, SMALL BUSINESS SUPPORT

The third meaning is based on a completely different meaning of intelligence -- intelligence as in IQ, as opposed to intelligence as in information for analysis. In this sense business intelligence just means being smart about business, which could mean just about anything.

So Alltop's business intelligence page contains sites that are not at all related to the #businessintelligence tag on Twitter. A lot of these seem to be advice about sales and entrepreneurship or general management consulting blogs. A few are just political blogs, or blogs about general market or marketing trends. They're fine in their way, I guess, just misplaced. Here's a list:

STATSPOTTING!, WSJ: THE NUMBERS GUY, MANAGE BY WALKING AROUND, BUILDING BUSINESS VALUE, CORPORATE MANAGEMENT STRATEGIES, KNOWLEDGE WORKS, HUSTLEKNOCKIN', THE THINK HERE BLOG, LEAD VIEWS, THE SOLOPRENEUR LIFE, SMEDIO, INTERCALL BLOG, GLOBAL INSTITUTE FOR INSPIRATION, SMALL BUSINESS SUPPORT, RED WING SOFTWARE BLOG, FRED67 B2B REFERRAL CLUB, RESULTS.COM BUSINESS GROWTH TIPS

I'm not criticizing any of these guys, just saying they seem to be improperly categorized.

Also Alltop is syndicating advertising material thinly disguised as blogs. Of course, maybe they're getting paid, what do I know? If not they should be. The following are BI vendors, which may or may not be a problem:

RED WING SOFTWARE BLOG, LOGIXML, BIME - SAAS BUSINESS INTELLIGENCE (BI), BLOG.JINFONET.COM, MICROSTRATEGY 101, PANORAMA BUSINESS INTELLIGENCE

In addition, there are several aggregators -- Yahoo, and Beyenetwork twice. These guys can be seen as competitors, I guess.

In the end I think that the lack of careful stewardship reduces the usefulness of the site. The problem is that business intelligence is a vague term and needs a semantic Web to be useful. Manual editing in a web registry is a workaround, but it is not being used here to much effect.

Saturday, July 09, 2011

Discovering hierarchies in columnar databases

I recently blogged about columnar and Wayne Eckerson asked me for a clearer explanation of what I mean by columnar databases "discovering hierarchies".

For example consider the approach of two well known products, IBM Cognos TM1, which is multidimensional, and QlikView, which is columnar.

My definition of a data model is a structure that is informed by an administrator, or set down in the master data. To me this is different to a structure derived from analyzing the transactions. In the following simple example, let's say I have two sales teams, one for dental hygiene products and one for soap.

If I were designing a data model in TM1, then I could create a hierarchy, which is a set of parent child relationships between the departments and the products they sell. If the soap people cross-sold some toothpaste, it would have no effect on the hierarchy, because it is predetermined by my idea of how my company is supposed to work.

If I were to import the same data in QlikView I could create a report that showed me the relationship between the sales teams and the products without defining the model. Once the data is imported, QlikView recognizes the relationships automatically.

When the soap guys cross-sell toothpaste, QlikView discovers the new relationship, but the hierarchies stay the same in TM1, because that's how I defined the model. To me this is the key difference. On the one hand the structures are coming directly from the actuals, and on the other hand they reflect my predefined perception (or "model") of what is going on.

So columnar databases typically discover the relationships automatically, and multidimensional databases allow you to define the relationships as you want them. Another way to look at this is that the transactional data drives the master data structure in a columnar database, but those structures are wired into the multidimensional model.
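
The difference can be sketched in a few lines of Python. This says nothing about how TM1 or QlikView work internally; the team names, products and the discover function are invented to mirror the soap and toothpaste example.

```python
from collections import defaultdict

# Predefined hierarchy, as an administrator might set it down in the model.
predefined = {
    "dental team": {"toothpaste", "floss"},
    "soap team": {"soap"},
}

# Transactions as (sales team, product) pairs; the last one is a cross-sale.
transactions = [
    ("dental team", "toothpaste"),
    ("dental team", "floss"),
    ("soap team", "soap"),
    ("soap team", "toothpaste"),
]


def discover(transactions):
    """Derive the team-to-product relationships from the transactions alone."""
    found = defaultdict(set)
    for team, product in transactions:
        found[team].add(product)
    return dict(found)


discovered = discover(transactions)
print(sorted(discovered["soap team"]))  # derived from the actuals
print(sorted(predefined["soap team"]))  # fixed by the model
```

The discovered structure picks up the cross-sale as soon as it appears in the data, while the predefined one only changes when somebody edits the model.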

So which approach is better? It depends on the application.

Thursday, July 07, 2011

Is the cloud really the future of business intelligence?

The idea of hosting software off-premises is not new. In fact, software as a service (SaaS) is just a rewording of an older term referring to companies providing the service, application service provider (ASP). ASPs were a very popular investment target during the dotcom era, but not much ever came of the business model. Like so many Internet-based startup ideas in this era, most of these companies disappeared.

In fact, the only major exception to this rule was Salesforce.com, which used to refer to itself as an ASP, before the SaaS term became popular. And indeed Salesforce.com is often taken as an example in discussions of the future of BI as a service, or cloud BI.

But why was the ASP wave a failure and why should cloud computing be different? Even the often quoted success of Salesforce.com is far removed from the company’s once loudly trumpeted goal of replacing all the ERP features of SAP and PeopleSoft. Furthermore, do the forces that could make SaaS more successful than ASP also apply to the BI market?

The main argument offered by vendors in favor of cloud BI is that it is less expensive than the more typical on-premises software. Avoiding the upfront capital costs of hardware contributes to these savings. The other major factor is that the company is not required to operate separate servers for the BI application.

However, both of these savings result from the investment required to own and operate the hardware for BI software. Similar savings can also be realized with strictly in-house approaches such as server virtualization. Furthermore, the use of Web technology has already eliminated one of the key costs of BI initiatives, which is the cost of rolling out the front-ends to large numbers of users.

Another argument is that mobile devices are driving SaaS BI, but as HTML5 spreads, it is hard to see why mobile devices should require a cloud solution.

We are conducting a survey to find out more about BI in the cloud. If you take part you could win an iPad 2.

I'd also be interested to hear any comments on what I have talked about here.

Tuesday, July 05, 2011

Data modeling and agile BI

One of the advantages that some analytical tools such as QlikView, Spotfire or Tableau claim to offer over the products they call "Traditional BI" is that they can be used without data modeling. According to this claim, data modeling is a major obstacle to agile business intelligence and not needed anyway.

Is it true that data modeling is dead? Has technology found a workaround?
The need for data modeling depends upon the application. Products that promise user friendly analysis without any data modeling are usually intended for a specific type of analysis that does not require any previously specified structure.

A good example of data that does not require modeling is the data that retailers gather about their customers. This data comes in big flat tables with many columns, and the whole point of the analysis is to find unexpected patterns in this unstructured data. In this case adding a model is adding assumptions that may actually hinder the analysis process.

However, some types of analyses only make sense with at least some modeling. Time intelligence is an example of a type of analysis that is supported by a data model. Also analyzing predefined internal structures such as cost accounts or complex sales channels is usually more convenient based on predefined structures. The alternative method of discovering the structures in the raw data may not be possible.

Planning is a common area of agile BI, and planning is rarely possible without predefined structures. It is no coincidence that the tools that promise analysis without data modeling do not offer planning features. Planning requires adding new data to an existing data set. In some cases, this includes adding new master data, for example when new products are being planned. Furthermore, there is often a good deal of custom business logic in a planning application that cannot be defined automatically. Most financial planning processes, and the analysis and simulation that go along with them, cannot be carried out on a simple table.

In my view the new generation columnar databases are a welcome addition to agile BI. But I also think that their marketing is sometimes a little over the top when it comes to dismissing existing BI solutions in this area.

Thursday, June 30, 2011

Column oriented databases are not the same as in-memory databases

In recent years, thanks not least to aggressive marketing by QlikTech (or Qlik Technologies as they are now often called), Tableau and Tibco Spotfire, columnar databases and in-memory databases have become very fashionable. Microsoft's VertiPaq engine, which is behind the PowerPivot product, is a good example of a tool that came in on the wave of this trend.

One result of this is that there seems to be some confusion about what the terms "in-memory" and "column oriented" mean, and attributes of one are often ascribed to the other.

Just to be perfectly clear: A columnar database is not necessarily in-memory, and an in-memory database is not necessarily columnar.

In-memory is a somewhat vague term, since, as Nigel Pendse likes to point out, all databases have to hold data in memory to process it -- the CPU cannot directly access the hard drive. However, I would say that unlike some other tools, IBM Cognos TM1 and QlikView are in-memory. These products load everything into memory before they do anything. If there is not enough memory to fit the entire data set, the load fails and that's that. The same applies to SAP HANA. But unlike QlikView and HANA, TM1 is a multi-dimensional database.

The loading behavior of an in-memory database is quite different from that of the MOLAP engine in Analysis Services, which is fundamentally disk-based but has sophisticated paging abilities to keep as much of the data as possible in memory, or that of the column oriented Spotfire, which attempts to load everything but uses paging if there is not enough memory.

Columnar is a much clearer and simpler term. It simply means that the data is stored by column instead of by row. There are a large number of analytic databases with this architecture, such as Exadata, SAND, Greenplum, Aster, or Sybase IQ, just to name a few. Some, like Vertica and VertiPaq, even refer to their columnar architecture in their names. Some columnar databases are in-memory, but many are designed to deal with huge amounts of data, up to the petabyte range, and cannot possibly hold it all in memory.
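To make the difference between row and column storage concrete, here is a minimal Python sketch. The table, the column names and the `to_columns` helper are all made up for illustration, and no real product stores data this naively. It only shows the same three records laid out both ways.

```python
# Illustrative only: the records and column names are invented.
rows = [
    {"customer": "A", "region": "North", "revenue": 120.0},
    {"customer": "B", "region": "South", "revenue": 80.0},
    {"customer": "C", "region": "North", "revenue": 50.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every record has to be visited to total a single field.
total_from_rows = sum(record["revenue"] for record in rows)

# Column layout: only the "revenue" column is read.
total_from_columns = sum(columns["revenue"])
```

Both totals are the same, but in the column layout the aggregation never touches the customer or region values. Note that nothing in this layout says whether the lists live in memory or on disk, which is the point of the distinction above.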

By the way, what got me off on this rant is this blog about Endeca Latitude 2, which actually equates the two technologies, and a LinkedIn discussion the author started (which is private, so I can't link it here) with the title "Is Data Modeling Dead?"

The idea that in-memory databases kill data modelling comes from the fact that columnar databases are often used to discover hierarchies, and a whole generation of so-called "agile" in-memory database tools use this method. But in-memory multi-dimensional databases are still around and still very useful for analyzing data on well defined structures such as financial data.

Tuesday, June 21, 2011

The end of Essbase Visual Explorer

I talked to Tableau and Oracle on the same day recently so I managed to get both sides of this story.

Essbase Visual Explorer is an OEM version of Tableau Desktop. In other words, it is the Tableau product rebranded as an Oracle Hyperion product.

The OEM agreement was originally made between Tableau and Hyperion. It made sense for Hyperion, because they did not want to invest in in-house development on any new front-ends, but they needed something to liven up their Essbase offering. It made sense for Tableau because they were a tiny unknown company at the time, and hooking up with Hyperion, then one of the largest BI companies, was a great way to push sales and raise visibility.

But Tableau has moved on since then. Hyperion’s unwillingness to market Essbase aggressively meant that Tableau could not depend on Hyperion forever, and Tableau now supports a wide variety of data sources. They said to me that they were “sunsetting” the relationship. My impression was that only a small proportion of their customers are Visual Explorer customers and they are ready to move on.

Oracle inherited the relationship from Hyperion, but its strategy has been quite different to (and in my opinion more sensible than) Hyperion’s. Reading between the lines of what Paul Rodwick said to me, my guess is that Oracle thinks that Tableau got more out of the deal than Hyperion did. Be that as it may, as a stand-alone tool Visual Explorer does not fit well into Oracle’s ambitious plans to integrate Essbase with its reporting and analysis suite, OBIEE. Visual Explorer is still on Oracle’s price list but the recently released BI Foundation Suite combining Essbase and OBIEE does not include Visual Explorer.

So Oracle will continue to support Visual Explorer, but both Oracle and Tableau have indicated to us that they have little interest in continuing the relationship, and I do not expect Oracle to continue actively positioning Essbase Visual Explorer in the coming years.

Thursday, June 16, 2011

SAP raises the stakes with EPM 10

SAP is now adding new features to its (relatively) new integrated planning tool, BusinessObjects Planning and Consolidation, often still called BPC. It needs to show customers that it is up to more than just integrating its portfolio, and it also needs to face up to scrappy new competitors like Tagetik and De Facto, planning tools with a very similar architecture.

The presentation I saw yesterday was nominally about EPM, but SAP concentrated on Planning and Consolidation (BPC). It has many new features including a new workspace and Web-based data input. This is something new since the original OutlookSoft product only allowed planning in Excel. Neither Tagetik nor De Facto offers this. The product also allows reporting on non-BPC BW data and has a slicker-looking Excel interface with some nice new features. Unfortunately, they couldn’t get the template function working in Excel for the demo.

SAP says they will gradually move the whole thing to HANA but did not provide details. Having the database in memory is the best way to provide the performance planners demand. However, in my opinion HANA’s architecture is not very well suited for planning. Column database architecture is available in other business intelligence tools such as QlikView and Microsoft PowerPivot, but they are not suited for planning.

Planning involves adding new data to the database (as opposed to overwriting data), and the automatic data-driven modelling features of this kind of database make it impractical to offer a simple way to add data. You cannot "discover" new periods, scenarios or products in an ETL process. Multi-dimensional databases, with their predefined dimensions, are better for this kind of feature, because the idea of an “empty cell” for adding new plan data comes so naturally.
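The "empty cell" argument can be sketched in a few lines of Python. Everything here is hypothetical: the dimension members, the fact rows and the `enter_plan` helper are invented for illustration, and real planning engines are far more sophisticated. The sketch only contrasts members discovered from loaded data with members defined up front.

```python
from itertools import product

# Hypothetical dimensions, defined up front as in a multi-dimensional model.
dimensions = {
    "scenario": ["Actual", "Plan"],
    "period": ["2011-Q1", "2011-Q2"],
    "product": ["Widget", "Gadget (new)"],
}

# Fact rows as they might arrive from an ETL load: actuals only.
facts = [
    ("Actual", "2011-Q1", "Widget", 100.0),
    ("Actual", "2011-Q2", "Widget", 110.0),
]

# Data-driven approach: dimension members are whatever the rows contain.
discovered = {
    name: sorted({row[i] for row in facts})
    for i, name in enumerate(dimensions)
}

# Model-driven approach: one cell per combination, None meaning "empty".
cube = {cell: None for cell in product(*dimensions.values())}
for scenario, period, item, value in facts:
    cube[(scenario, period, item)] = value


def enter_plan(cube, scenario, period, item, value):
    """Write a plan value into a cell that already exists in the model."""
    cell = (scenario, period, item)
    if cell not in cube:
        raise KeyError(f"{cell} is not part of the model")
    cube[cell] = value


enter_plan(cube, "Plan", "2011-Q2", "Gadget (new)", 40.0)
empty_cells = [cell for cell, value in cube.items() if value is None]
```

The discovered members contain neither the "Plan" scenario nor the new product, because no loaded row mentions them. The predefined cube already has a cell waiting for both.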

Monday, June 13, 2011

How Information Builders gets around the Flash / HTML 5 controversy

As I mentioned in a previous post, the iPad is still a hot topic for business intelligence vendors, and both Oracle and Information Builders have just come out with new iPad support. At its Summit 2011, Information Builders has been demoing the InfoAssist tool with the built-in ability to switch between rendering in Flash and HTML5 on the fly.

Information Builders is not exactly a newcomer to the world of using HTML to render rich content. In fact the Information Builders Active Reports were originally rendered in HTML 4 and offered an amazingly rich user experience completely off line. But what was amazing back in the day is becoming more commonplace, with HTML5 making it much easier to build rich interactive content. This development shows that users and developers no longer have to take sides in the Flash vs HTML 5 argument.

To me, the moral of the story is that Microsoft, Adobe and Apple may well be wasting their time fighting over the rich web development platform. It was always pretty artificial. And it isn't just large vendors like Information Builders who have this kind of multi-platform capability any more. HTML 5 is not just for MP3 bloggers any more, and as cross-platform development specialists such as Appcelerator and the many Flash / HTML 5 converters (including Adobe Wallaby) gain traction in the market, it is becoming less and less important to worry about the development tool used to get the content delivered.

Monday, June 06, 2011

The trouble with mobile user interfaces

Mobile user interfaces are a major trend in business intelligence. We read a constant stream of announcements by vendors about their support for mobile BI, even if the actual success stories are less common. I commented on business intelligence and the iPad here.

Timo Elliott made some interesting remarks about how mobile BI could have an effect beyond mobile, and this is my reply:

It should be pretty obvious to most people by this time that mobile user interfaces are going to play a major role in the future of computing. That is because the mobile user experience can never be reproduced. The question is how big a role.

But however big the role, this doesn't necessarily mean that graphic user interface design will be completely changed by mobile devices. Mobile interfaces aren't just new, they are also subject to completely different constraints than desktop systems are.

  • Pointing with your finger can be seen as a much more natural user interface than using a mouse, but it is also less accurate.
  • Simple interfaces fit small screens well, but they give users fewer clues as to what they can do next.
And so on. The user interface guidelines that have been developed for desktop systems still make sense, and as a result my guess is that mobile devices will not change desktop interface designs all that much.

Wednesday, June 01, 2011

SQL Crescent: A new step for Microsoft SQL

Crescent is the code name of the new business intelligence user interface Microsoft intends to release with the Denali release of SQL Server. You may have heard about it, because Microsoft has been releasing excited videos about it, featuring huge crowds of developers going wild over the technology. I have not seen much of the technology itself, but what I have seen seems to borrow heavily from the user interfaces of Tableau and the Google Data Explorer.

The technology is closely related to PowerPivot. It is based on a new hybrid semantic layer that either directly accesses a data source or uses PowerPivot's VertiPaq column-oriented storage engine. It is part of Microsoft's not yet fully realized plan to provide seamless conversion of Excel-based data marts to a server environment.

To me, the most intriguing thing about SQL Crescent is that Microsoft's SQL development team is dipping its toe into the area of business intelligence front ends. (I don't really include Reporting Services because it is simply a reporting tool, not a specialized BI tool, and it works better with relational data than with multidimensional data.) In the past real analysis has been the domain of the Office group. PerformancePoint was a bit of a halfway house between the groups, but in the end it was folded into SharePoint, where the dashboard component still resides. Currently PowerPivot only has an Excel (or SharePoint Excel Services) user interface. So this seems like a big step.

Sunday, May 22, 2011

The BI Survey has started

The BI Survey 10 has started! The world’s oldest and largest survey of users of business intelligence software is now online. Last year’s survey garnered over three thousand responses and could build on a data set stretching back to 2001 to analyze the trends in the market for business intelligence tools.

The BI Survey has always had the goal of shedding light on the real world issues that matter to users. It avoids the latest hype and buzzwords and focuses on costs, goal achievements and business benefits. It helps potential buyers avoid typical pitfalls of buying and implementing business intelligence software. The Survey also provides vendors with no-nonsense information on what buyers really need, not what they say they want.

This year the Survey sharpens its focus on the project. Business intelligence has changed as it has become more accepted, and more and larger vendors have entered the market. This year we pinpoint new issues:

  • The size and scope of projects, and how these influence business benefits
  • Finding the right product for a given project
  • Who implements the project and how much it costs
  • Business participation in product selection and implementation


The survey compares the leading products on the market across a number of headline criteria such as performance, scalability and vendor support. Also it is based on actual data instead of just educated guesses, which makes it a great resource for any company evaluating business intelligence products.

If you want to take part in the Survey, click this link.

Tuesday, April 12, 2011

In-memory databases and mobile business intelligence

I was looking at some results of a benchmark test MicroStrategy did on its product. The benchmark was about supporting large numbers of users. MicroStrategy is pushing into the mobile business intelligence business and it has identified (correctly in my opinion) user scalability as one of the key challenges in this area.

Anyway, the benchmark tests how MicroStrategy would do with up to 100K users or so. Impressively (if not surprisingly, considering they ran the tests themselves), the product did well. They ran the test on both Linux and Windows.

There were two things I found intriguing about the test: the hardware configuration of the application server (144 GB of RAM), and the fact that they allowed the system four hours to load the data into cache, which is basically an in-memory cube.

In other words the Oracle database that was holding the data probably wasn't working very hard during the actual test. It worked during the four hours of loading, and was probably done for the day. Most of the heavy lifting was going on in MicroStrategy's in-memory database solution.

I think this is what we should expect more and more in the future: large scale database solutions of this type will make use of the falling RAM prices to squeeze better performance out of their solutions. Mobile applications with large numbers of users will need in-memory technology.

Sunday, March 13, 2011

Flash and business intelligence

I have heard quite a bit recently about the demise of Adobe Flash. In fact I recently had a discussion with some programmers who were decidedly of the opinion that the future belongs to HTML5.

Apple does not like Flash, and has managed to keep it off the iPhone and iPad. Steve Jobs also claims that Flash is the most common cause for crashes on Macs.

Microsoft is not as strongly opposed to Flash as Apple is, but Silverlight is certainly intended to compete with it, and Microsoft has also thrown in support for HTML5, as has Google.

But I certainly don't see any move by business intelligence companies to move away from Flash in their reporting tools. In fact IBM Cognos released Cognos 10 fairly recently with more support for Flash than in previous versions. I also notice that Information Builders has actually moved more towards Flash in its Active Reports, away from the original DHTML version. And MicroStrategy doesn't seem likely to move away from its popular Flash dashboards, even if its current interest in mobile business intelligence will mean more support for Apple platforms.

I think people tend to underestimate the role that momentum plays in the IT business. I can understand the arguments that the markets are moving away from Flash, but I wouldn't expect it to disappear overnight.

Sunday, February 27, 2011

Do companies really want transparency and collaboration?

Just finished our annual conference on planning tools. As usual, I talked to a lot of customers and vendors, and one theme came up several times: Some of the great features that vendors have on the roadmaps for their tools and that analysts list high in their discussions of trends for the new year may not be customer requirements at all.

I'm not talking about platform issues like mobile BI. This is about actual features of the tool. In particular, features that shed light on the thought processes that lead to the final planning numbers. Here are some specific remarks:

  • Fine grained comments, for example comments on individual numbers, are rarely used.
  • Detailed collaboration is often better done off line. Is this a tool issue or a cultural issue?
  • Comments from large user groups tend to be too low in quality to be of much use.
  • Middle management sometimes opposes processes that encourage transparency, even if they save time and effort.
  • In some cases middle managers feel that storing data locally until it has been verified is better for them. So (to get back to a platform issue) Web-based data collection is not as popular as you might expect, a trend I have mentioned before.
Of course these are not universal truths. But it is interesting to hear how often they are mentioned.

Monday, February 07, 2011

Business intelligence and the iPad


I have been observing the wave of BI vendors coming out with applications for the iPad with some interest. QlikTech and MicroStrategy, who are always quick to move with the latest market trends, were among the first. IBM Cognos waited for its new release but now has a client as well. And with SAP having just spent billions on Sybase for its mobile technology, it’s not surprising to see SAP has also jumped into the fray with a new iPad client for Crystal Reports and the BusinessObjects Explorer – though not, as far as I know, using Sybase technology.

But the point here is not to list all the iPad BI clients available – I am sure I have missed a few. What interests me more is the question of how these interfaces are designed. The key to making a mobile reporting interface work is to squeeze the interface onto a very small screen. There are lots of tricks for doing this, most of them involving navigation tricks with simple gestures to make traversing the interface feel natural.

But the iPad really does not have that small a screen. In fact, with a 9.7 inch diagonal, it is about the same size as an Asus Eee screen. It does not weigh a lot less either. But despite the Eee’s success, business intelligence companies were not falling all over themselves to create a new interface for the Eee.

Furthermore, in my opinion, the interface that Apple gave the iPad is not quite appropriate. The first time I saw it I just thought, “Oh cool, it’s like the biggest iPhone ever”, and so did many other people, I suspect. But frankly, the iPhone interface does not expand all that well. In short, the iPhone interface is a good interface because it is visually attractive and it handles the severe size constraint it is subject to quite well. To make this possible, the users’ choices are intentionally limited, but users do not mind since the device is so small. But the iPad does not really have this constraint, and so the restrictions are more of an inconvenience. Furthermore, the bigger screen gives the user more ways of interacting with the software, so an interface that looks simple becomes more and more cryptic.

So iPad interfaces tend to discard the typical desktop standards that have been accepted over the decades (for better or worse) and replace them with a concept that is finely tuned for a completely different format. But most users – especially users of business tools as opposed to toys – are more interested in the content of the application than in fancy interaction techniques. Business intelligence vendors may be better off just making sure their Web applications work well on Safari running on the iPad than spending time developing iPad specific applications.

Friday, November 26, 2010

Panopticon

I had another look at Panopticon today. I had seen it before, but they have a new version out. Panopticon is probably most noticeable at first glance because of its spectacular data displays. Their marketing material usually features very busy charts of two types – tree maps (heat maps with a drill down) and what Panopticon calls horizon graphs, which only they offer, as far as I know. Horizon graphs are too busy for most applications and only seem to me to be useful for trying to get an overview of a large quantity of data. Panopticon sees itself as a competitor of Tableau and Spotfire.

Similarly, Panopticon’s data layer is specialized in supporting large quantities of multivariate data, often in real time, or near real time. Typical applications include real time monitoring of telecom networks and financial trades. The product is notable for its ability to access disparate data sources including column-based data sources and enterprise buses. This makes it a bit more like middleware than most BI tools. The company is particularly active in the financial services industry, where about 80% of its customers are found. Panopticon offers semantics in what it calls a Stream Cube, a multidimensional model with data access and caching functions.

Panopticon’s newest release includes an easier way for end users to design dashboards that are embedded in third party systems. This product, which the company refers to as the Rapid Development Kit (RDK), is a reminder that the product is often used in very technical applications, such as an embedded dashboard in a custom solution. The RDK is intended to allow business users to deliver content to such technical environments.

Tuesday, November 09, 2010

Cognos 10

I had a look at Cognos 10 today when IBM’s Andrew Popp and Andrew MacNeill dropped by in Würzburg on a whistle-stop tour of Europe. On the whole I was pretty positively impressed.

Cognos says it is moving back towards the business user. That was the main message I picked up from the meeting. I have been noticing this interesting trend with many vendors. About ten years ago there was a big swing away from the business user, and vendors started focusing more and more on the enterprise. The main reason for doing this was that they were chasing the bigger deals.

But vendors often forgot about the end user in their rush to establish themselves as enterprise standards. In fact, the BI Survey 9 found that choosing BI software for a project because it is the enterprise standard leads to the worst project results. It is refreshing to see vendors making a real effort to address this issue.

Cognos is focused on making the product easier to get started with. They have identified the issue of jumping between multiple studios as the key issue for end users. Removing the “studio hop” is a big part of the new product. Actually this never seemed to me to be such a big deal. However, I have often seen users confused about it in the field – at least during sales presentations. I wonder if it is also an issue for users with a day’s training.

Be that as it may, the new version of Cognos provides a new way of accessing the system. Now the user starts in a Workspace that provides easy self service features and can add to that by clicking a button automatically attached to objects that brings the user to the appropriate authoring tool. Cognos is also moving away from the recursive grid concept that is the basis of Report Studio. This is probably a good idea as well, because you need quite a bit of practice to get a complicated layout right.

Cognos also introduced disconnected analytics with Report Studio’s ability to create prepackaged reporting applications. These single file packets include a slice of data and one or more interactive formatted reports. They are called Active Reports and are a lot like Information Builders’ Active Reports. They have also developed a set of query optimizations called the Dynamic Query Mode.

Another interesting side remark that IBM made was that they were considering using the name Dashboard for the Workspace. In the end they decided against it, and I think it was a good decision, although it is probably easier to get attention using the totally overhyped term dashboard which seems to be used to mean just about anything.

Monday, October 04, 2010

Visual Mining

I had a look at Visual Mining's Performance Dashboards a few days ago. Visual Mining has a product called NetCharts which comes in different editions for different customer groups. The core version is a set of developer tools that allow Java developers to create HTML 5 or Flash charts. NetCharts Server provides additional features for delivering the charts to end users. The NetCharts Performance Dashboards add an additional layer of user interfaces and data access tools to allow users with relatively few technical skills to create their own dashboards.

Strictly speaking dashboards do not include tabular data. Well anyway, that was the original idea, even if most vendors do not stick to it, and most of the presentation I got of the Performance Dashboards actually revolved around the features of the table object that is provided. The tables offer sorting and filtering as well as conditional formatting.

Each table is made up of a single hierarchy and one or more columns, either taken directly from the database or calculated on the fly as variances, including simple time intelligence features such as previous year. The hierarchy has good presentation features - it shows children as indented elements, the parent can be above or below the children, level based formatting is available and so on. Furthermore the end user can add or remove levels and change their order on the fly.
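As a rough idea of what such a table involves, here is a small Python sketch. It is not based on NetCharts code. The `Node` class, the `render` function and the figures are all invented for illustration. It aggregates a two-level hierarchy, calculates a variance against the previous year on the fly, and lets the parent row appear above or below its children.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    current: float = 0.0   # this year's value (used for leaf nodes)
    previous: float = 0.0  # previous year's value (used for leaf nodes)
    children: list = field(default_factory=list)

    def totals(self):
        """Return (current, previous), aggregating children if any."""
        if not self.children:
            return self.current, self.previous
        sums = [child.totals() for child in self.children]
        return sum(c for c, _ in sums), sum(p for _, p in sums)


def render(node, depth=0, parent_first=True):
    """Yield (indented label, current value, variance vs previous year)."""
    current, previous = node.totals()
    own = ("  " * depth + node.name, current, current - previous)
    if parent_first:
        yield own
    for child in node.children:
        yield from render(child, depth + 1, parent_first)
    if not parent_first:
        yield own


europe = Node("Europe", children=[
    Node("Germany", current=120.0, previous=100.0),
    Node("France", current=90.0, previous=100.0),
])

table = list(render(europe))
table_parent_last = list(render(europe, parent_first=False))
```

The variance column is never stored. It is derived from the two value columns each time the table is rendered, which is what makes it cheap to offer alongside the hierarchy.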

The product also has a drill-down feature in the tables and the charts, so users can see the transactions that underlie the aggregate data. Navigating in the finished dashboard is relatively simple, and more emphasis is put on creating the dashboards. The end user can be given the rights to modify the charts freely, even changing chart types.





Wednesday, September 15, 2010

Chris Webb on the BI Survey

Chris Webb made a few comments on the results for SSAS in The BI Survey 9. I think the most interesting point he brings up is the poverty of front-end tool use, particularly third party tools, for a low-priced database with an open interface. You would think third party vendors would make up a larger market share, but that doesn't seem to be the case.

After all one would expect third party vendors to flock to the tool, especially since it is already available at so many sites. (The latter claim is my speculative inference based on the fact that SSAS appears so seldom in competitive evaluations.) Of course there are plenty of third party vendors on SSAS, but they seem to make a relatively small dent in the market. We do see quite a few third party planning tools using Microsoft as a platform though.

I don't think that this is limited to SSAS however. Use of third party front-ends for other multidimensional databases such as TM1 and Essbase also seems relatively rare.