Monday, October 23, 2006

Day 2: The Leading End-to-End Data Warehousing Platform, Part 2.

The first session this morning was about Oracle Data Mining, presented by Oracle's Product Manager for Data Mining Technologies, Bob Haberstroh. What a difference from yesterday! Plenty of interesting chat and fast-paced labs demonstrating all the bells and whistles of the product. The course materials were of high quality and the labs all worked first time. Phew - I was beginning to think that the X Treme Weekend wasn't going to deliver the goods.

I hadn't seen Oracle Data Mining before so I didn't know what to expect or even really what data mining meant.
I learned that it is what you use if you want to make complex predictions based on your data. One of the labs showed how you can use a small sample of known 'high-rolling' customers to discover all the other potential big spenders.

Another demonstration showed how data mining can detect outliers in large datasets, like fraudulent expense claims or unusual credit card spending. Oracle Data Mining manages to pull off these tasks with an absolute minimum of user input, and gets some impressive results (OK - so this was lab conditions, but even so...). These techniques are gaining ground in all kinds of areas, including the life sciences, where they are used to discover trends in hereditary diseases.
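To give a feel for what 'detecting outliers' means, here is a minimal sketch in Python. It is not how Oracle Data Mining does it (I don't know the internals, and the labs hid them); it uses a simple textbook technique, the modified z-score based on the median absolute deviation. The claim amounts, the function name and the 3.5 threshold are all my own assumptions for illustration.

```python
from statistics import median

def find_outliers(amounts, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    Illustrative only: a median-absolute-deviation test, not the
    algorithm Oracle Data Mining uses.
    """
    med = median(amounts)
    # Median absolute deviation: a spread measure that a single huge
    # value cannot distort the way it distorts a standard deviation.
    mad = median(abs(x - med) for x in amounts)
    if mad == 0:
        return []  # At least half the values equal the median; nothing to flag.
    return [x for x in amounts if 0.6745 * abs(x - med) / mad > threshold]

# Hypothetical expense claims, one of which looks suspicious.
claims = [42.0, 38.5, 45.0, 40.0, 39.5, 41.0, 980.0, 43.5]
outliers = find_outliers(claims)
print(outliers)
```

The point is that nobody had to say in advance what a 'normal' claim looks like; the unusual value is found from the data itself, which is the spirit of what the lab demonstrated.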

Data Mining can make use of a feature of Oracle Database called 'nested columns', which appears to be an alternative way of representing one-to-many relationships within a single set of rows without repeating all the parent data. This enables Data Mining to work with what could otherwise be quite unwieldy data sets. The example given was the courses taken by individual students at a university, where there could be thousands of courses to choose from but each student takes only a few.
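The student example can be sketched in a few lines of Python to show why this matters. This is only an analogy for the idea as I understood it, not Oracle's actual storage: each student row carries a small nested collection of the courses taken, instead of one column per course in the catalogue. The course codes, the catalogue size and the helper name are made up.

```python
# Hypothetical catalogue of 2000 courses.
catalog = [f"COURSE{i:04d}" for i in range(2000)]

# One row per student; the 'courses' entry plays the role of the nested
# column and holds only the courses that student actually takes.
students = [
    {"student_id": 1, "courses": {"COURSE0007": 1, "COURSE0150": 1, "COURSE1999": 1}},
    {"student_id": 2, "courses": {"COURSE0007": 1, "COURSE0042": 1}},
]

def wide_row(student):
    """Expand a student into the 'one column per course' layout."""
    taken = student["courses"]
    return [taken.get(code, 0) for code in catalog]

# Compare how many values each layout has to hold.
nested_cells = sum(len(s["courses"]) for s in students)
wide_cells = sum(len(wide_row(s)) for s in students)
print(nested_cells, wide_cells)
```

With two students the nested form holds 5 values against 4000 in the wide form, and almost all of those 4000 are zeros.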

In the afternoon, we had two sessions on Oracle OLAP with Marty Gubar. The client tool being used here was Oracle Analytic Workspace Manager (AWM) with a dash of Oracle BI Discoverer, but he also showed us how to retrieve OLAP data using SQL to get the facts out of 'embedded total views'. Because OLAP uses array-based storage and the data is pre-aggregated, this is some really fast stuff, allowing users to get at a vast range of data in next to no time.
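Here is a rough Python sketch of why pre-aggregation makes queries so quick. It is my own toy model, not Oracle OLAP: a plain dictionary stands in for the array storage, the sales figures are invented, and the 'ALL' marker is just my way of imitating a view in which detail rows and total rows sit side by side.

```python
from collections import defaultdict
from itertools import product

ALL = "ALL"  # Hypothetical marker for the 'total' level of a dimension.

# Invented fact rows: (region, product line, sales amount).
sales = [
    ("Europe", "Widgets", 100),
    ("Europe", "Gadgets", 250),
    ("Asia", "Widgets", 300),
    ("Asia", "Gadgets", 50),
]

def build_cube(facts):
    """Pre-compute the detail cells and every total in one pass."""
    cube = defaultdict(int)
    for region, line, amount in facts:
        # Each fact contributes to its own cell, to both subtotals
        # and to the grand total.
        for r, p in product((region, ALL), (line, ALL)):
            cube[(r, p)] += amount
    return dict(cube)

cube = build_cube(sales)

# Totals are now simple lookups; nothing is summed at query time.
print(cube[("Europe", ALL)])
print(cube[(ALL, "Widgets")])
print(cube[(ALL, ALL)])
```

The work is done once, when the cube is built, so asking for a total costs the same as asking for a single detail cell.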

AWM has got a cool query builder that looks a bit like the Rule Wizard in Outlook. Instead of starting with 'When mail arrives...', you select patterns like 'Start with children of Organization' and then click on hyperlinks and add lines to refine your query. It generates accurate, readable query conditions and looks like it would be very easy to use for non-technical business users.
