Monthly Archives: November 2005

OWB 10gR2 : Real Time Data Warehousing

There’s lots of talk about real time, right time, periodic batch, and message-based processing in Data Warehousing and BI circles these days. I think this is driven by quite a few reasons: the need for fresh data, the need for unified reporting interfaces for users, etc. Mostly, I think it comes down to TCO for IT assets. As the EAI/EII/ETL tools start to converge, along with the increased SOA-ee-ness of databases and middleware products, there is quite a bit of overlap between the different product sets. Managing “one product” that does this data integration, calculation, and movement between systems costs less to maintain than “multiple products.” Truthfully, I see little strategic (i.e., warehouse and marts) data that needs to be computed in real time. Those cases do exist, though, and OWB 10gR2 has some new features for those that do have Real Time DW/BI needs.

There are two major flavors of mappings in support of Real Time Data Warehousing in OWB:

  • PERIODIC BATCH: This is basically a batch process that runs frequently (say every minute or so) and reads data from a QUEUE or STREAM. While the data is pushed into the DW (real time), the system only processes it when run (batch). These are regular mappings that use a Stream or Queue as a source instead of traditional Tables/Views/etc.
  • TRICKLE FEED: This is much closer to what most people think of when we refer to real time data warehouse. Trickle feeds involve processing each individual record as it arrives, instead of waiting for them to collect. These are a special kind of OWB mapping called Real Time Mappings that run continuously and process records as they arrive.
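To make the difference between the two flavors concrete, here is a minimal sketch in plain Python. It is not OWB code and says nothing about what OWB generates: the in-memory `queue.Queue` stands in for a Streams queue, and `load_into_warehouse`, `periodic_batch`, `trickle_feed`, and `STOP` are hypothetical names made up for this illustration.

```python
import queue
import threading


def load_into_warehouse(record, warehouse):
    """Hypothetical stand-in for the real transform-and-load logic."""
    warehouse.append(record)


def periodic_batch(source, warehouse):
    """One batch run: drain whatever has accumulated so far, then stop."""
    processed = 0
    while True:
        try:
            record = source.get_nowait()
        except queue.Empty:
            break  # nothing left; wait for the next scheduled run
        load_into_warehouse(record, warehouse)
        processed += 1
    return processed


STOP = object()  # sentinel used only to shut the demo feed down


def trickle_feed(source, warehouse):
    """Runs continuously: block until a record arrives, then process it."""
    while True:
        record = source.get()  # blocks until a message is available
        if record is STOP:
            break
        load_into_warehouse(record, warehouse)


if __name__ == "__main__":
    q, dw = queue.Queue(), []
    for i in range(3):
        q.put({"id": i})
    print("batch run processed", periodic_batch(q, dw), "records")

    worker = threading.Thread(target=trickle_feed, args=(q, dw))
    worker.start()
    q.put({"id": 3})  # picked up as soon as it arrives
    q.put(STOP)
    worker.join()
    print("warehouse now holds", len(dw), "records")
```

In the batch version the latency is bounded by how often you schedule the run; in the trickle version a consumer is always waiting on the queue, which is roughly the trade-off between the two mapping types.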

Truthfully, I’ve only kicked the tires on both of these types of mappings. I tested some of the features back in OWB Beta2 and built a conceptual mockup of how it would work for a customer of mine. What I’m presenting is a conceptual, partially working mockup built using an early beta release. In other words, do not use it as a reference or consider it a blueprint for how you should proceed. If there is enough interest I might submit an article to OTN on the subject. Anyone like the idea? Better yet, if you’re not one of my customers, please do consider contacting me! I’d love to help build a Real Time DW solution with OWB!

OWB now includes the ability to define, deploy, and set up Streams, Queues, Queue Tables, UserDefinedTypes, and propagations within the GUI. There’s a whole set of screens that you’ll see when the community preview hits the shelves. Unlike regular OWB deployments, there are some additional requirements around streams administration locations, permissions, etc., but they are easily surmountable. Also, if you’re going to be doing real time DW you need to understand a bit about the underlying technology anyhow (not tons, but enough to know why you need to have Archive Logging turned on, etc.).

Refer to the following PDF for greater detail on the concept, but here’s a not-so-good screenshot:

I’ve created a mockup of a BI solution that is fed by a CRM (Customer Data Hub perhaps) and a Subscription Management Application for this example. You can see that conceptually this involves both systems sending messages either from the APPLICATION LEVEL (JMS or some other messaging technology) or the DATABASE LEVEL (with DML Stream Captures running in Oracle). In other words, we have multiple places we can get different pieces of data, and the application doesn’t necessarily have to be “REAL TIME ENABLED” to send real time data. Oracle can do that on its behalf using the Streams technology!

Overall, what this looks like is that we set up the various Streams, Capture Processes (DML), Queue Tables, and Types (based on our source tables) to support our real time system. Note that the screenshot does not include the Streams on the source system or the Capture Process definitions. It only includes the DW-side Streams, Queues, Dimensions, etc.

I’ve built three real time mappings (TRICKLE FEED) which in concert receive messages to add Dimension records (SCD2) and insert new Cube records (transactions). Notice this is a greatly simplified example, entirely ignoring what I consider a best practice of loading into a normalized warehouse and then updating marts based on the warehouse (a la CIF methodology). Also, these all assume no changes (i.e., record corrections), just straight clean data! We should all be so lucky!

One receives updates from the CRM application and performs SCD on the appropriate Dimension objects.

The others receive event messages from a transaction based system and insert records into Cubes.
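For readers who haven’t met SCD2 (Slowly Changing Dimension, Type 2) before, the dimension handling mentioned above boils down to “close out the old version, add a new one.” Here is a minimal sketch of that idea in plain Python. It is not what OWB generates; the `CustomerRow` layout, the `city` attribute, and the `apply_scd2` function are all assumptions invented for this illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CustomerRow:
    surrogate_key: int
    customer_id: str                   # business key from the CRM (assumed)
    city: str                          # the tracked attribute (assumed)
    valid_from: date
    valid_to: Optional[date] = None    # None marks the current version


def apply_scd2(dimension: List[CustomerRow], customer_id: str,
               city: str, effective: date) -> CustomerRow:
    """Apply one CRM change message to the dimension, Type 2 style."""
    current = next(
        (row for row in dimension
         if row.customer_id == customer_id and row.valid_to is None),
        None,
    )
    if current is not None:
        if current.city == city:
            return current             # nothing changed; keep current version
        current.valid_to = effective   # close out the old version
    new_row = CustomerRow(
        surrogate_key=len(dimension) + 1,  # toy key generator for the sketch
        customer_id=customer_id,
        city=city,
        valid_from=effective,
    )
    dimension.append(new_row)
    return new_row


if __name__ == "__main__":
    dim: List[CustomerRow] = []
    apply_scd2(dim, "C1", "Seattle", date(2005, 11, 1))
    apply_scd2(dim, "C1", "Boston", date(2005, 11, 15))
    for row in dim:
        print(row)
```

Fact rows inserted while a version is current would carry that version’s surrogate key, which is how history is preserved when the customer later changes.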

This isn’t quite as much detail as I would have liked to go into, and I’ll quickly repeat my warning… These are just some mockups and conceptual work, so don’t expect them to be accurate come OWB 10gR2 production time! I have some more thoughts on how to use this with Partition Exchange Loading to get a day’s “Cubes” built in real time throughout the day, and then at the end of the day move them over to the full history, but that’s a whole nother article.

This blog is part of the OWB Paris Early Review series which reviews and comments on several new Paris features.

Must have IE to evaluate SQL Server?

Microsoft is spending millions upon millions to launch and promote their new SQL Server 2005 release. I’m guessing they want every developer and nerdy IT type to check it out. They want to get into the VLDB and HA corporate data centers, and claim some of those vi-using, I-can-write-x86-assembly-if-I-want-to, Firefox-using developers and DBAs.

The irony?

10% of the web surfing population won’t be able to evaluate it because the SQL Server 2005 homepage doesn’t load with Firefox.
http://www.microsoft.com/sql/default.mspx

Any other Firefox users able to load the page? Or is this another example of “Drink the MSFT koolaid or be gone with you!”?

OWB 10gR2 : Embedded OMBPlus

Not a long entry, but I did want to post some information on a useful little feature in the Paris release of OWB.

OMBPlus is the Swiss Army Knife for OWB developers. It allows you to do “OWB Stuff” without using the OWB GUI. If you’ve ever had to do many repeatable things in OWB you’ll be thankful for the hours that OMBPlus can save you! More details on OMBPlus can be found on OTN.

Currently you have to fire up OMBPlus separately from the application and run it kind of like a text shell (interactive or fed a script). In the next release of OWB they’ve put this interactive shell environment in as one of the panels in the design center. This is SOOOO very convenient when building OMB scripts.

Make a note, though: some things won’t work in the context of the embedded OMBPlus window. Basically, for things that require connecting to the OWB repository in SINGLE USER MODE (such as creating UDOs, etc.), you’ll still have to fire it up separately.

This blog is part of the OWB Paris Early Review series which reviews and comments on several new Paris features.

Voted NUMBER ONE!

Working daily with people who are trying to measure and understand their world through the use of technology and BI methodologies I often hear lots of “things” that are important to determine.

  • What are this year’s top 5 products and what is their annual sales growth for the last 5 years?
  • Which company division has the most profitable customers, and which division has least profitable customers?
  • At what time of day, in a registered website visitor’s home time zone, are pages viewed on our website, split by category?

In other words, there are some very specific things people want to know and brag about both within the company and externally to investors, analysts, and the media.

This predisposes me to question numbers I hear anywhere. What’s the qualification, what little keyword allows this company to say they are the top in their category? Company XYZ is Number 1 in Sales (in Asia Pacific small to midsized healthcare providers not owned by government and groups exceeding 1 billion market cap for fiscal year 2003). We’ve all seen it…

One of my online music stations had a refreshingly simple claim to fame today that made me laugh out loud:

“Total Country. Rated #1 amongst people who really like us!”

A refreshingly honest figure!

Ingres sails from Computer Associates

I’ve just started playing with Ingres recently (last 12 months). It’s a powerful DBMS that was released under an Open Source license last year. From what I gather about the history of the database it has been kind of a “hot potato” being passed from university to company to company to company, etc.

Feature for feature, Ingres appears to be the most advanced Open Source database available. However, since it has been released under the CATOSL it has not resembled a community driven OSS project. There is still no public access to the source code repository and, as far as I know, there have not been source contributions from anyone outside of CA. The CATOSL is a “funny” OSI approved license that I think also hinders the uptake of Ingres.

However, all that could change, starting today.

A venture capital firm has purchased “Ingres” from CA and launched a company focusing entirely on the Open Source database. This company has an opportunity to capitalize on a starting point most OSS projects could only dream of (starting with a product that is deployed with mission critical applications at more than 5000 customer sites). That’s just where they start though… their future must include turning Ingres into a full scale Open Source project and community. This means public discussion forums, public source code control, welcome third party contributors, peer to peer information sharing, user based support, etc. I think Ingres (company and project) would also be VERY well served to trade the off color CATOSL license for a commercial friendly OSI approved license.

Welcome Ingres, Inc. to the marketplace! It’s an interesting one, with Oracle, Microsoft, and IBM all providing “free” versions of their DB now and passionate communities in the MySQL and PostgreSQL projects.

As a die hard Oracle consultant I need much more information to draw conclusions about Ingres… I’ve been in touch with CA and Ingres, Inc. I hope to provide more information and a more detailed evaluation as time permits. Stay tuned for more!

OWB 10gR2 "End to End" Metadata Management

Here are the slides for the presentation I gave at the UKOUG conference last week. There was some interest in this, and while it was rather late in the day I think that there might have been some light bulbs going off for those attending. There were a couple of questions about using Model Driven Architecture metadata (i.e., UML to generate your application code) to integrate with the warehouse and OWB. Generated application code (Java) and generated ETL code (OWB) are a good marriage, so I think there’s some interest.

Check out the slides to get the gist of the presentation, but basically what we’re talking about here is extending the OWB Metadata Repository (OMB) to include additional items. Pictured above are some example UDOs that I created that would be quite interesting. The idea is to integrate Business Objects, Crystal, Oracle Reports, and Discoverer metadata with OWB to get a true “end to end” picture of the data moving through your enterprise.

Mark Rittman posted a picture of me giving the presentation as well!

Comments, as always, are very welcome!

This blog is part of the OWB Paris Early Review series which reviews and comments on several new Paris features.

UKOUG : Days 1 and 2

I’m quite impressed by the UK user group that puts on quite a large conference in Birmingham, UK (aka Brum). While I believe there are many more regional groups in the US, I don’t believe there are any that are in the same “league” as the UKOUG. There must be several thousand participants and over 250 different sessions to attend… VERY, VERY impressive!

Some presentations of interest were those on the CBO, RFID, materialized view query rewrite, XMLQuery in Oracle, Oracle 10g OLAP and Discoverer, and HTMLDB New Features. The session on XMLPublisher was full by the time I arrived and I’m bummed to have missed it.

One thing I’ve really enjoyed is the chance to meet some people that I’ve only had the chance to chat with, virtually, on email: Peter Scott, Jeff Moss, Jon Mead, and Julian Ford. I am well aware that all these people “know their stuff” because of their various blogs, articles, and emails. Now I know that they are genuinely nice people; it’s just as easy to chat about local beers as it is about the ins and outs of the Oracle database. It was nice to catch up with Mark Rittman as well, who put together a dinner for the bloggers that was well attended (and paid for by UKOUG, thanks!).

I was a bit taken aback by some of the Paris information being put forth at the conference… I’m uncertain if the beta information is not disseminated to Oracle employees properly or if it’s just a desire to keep upbeat about an overdue product. All the same, information announced publicly at Open World was not even covered (Paris released in CY 2006, officially). It’s clear that there is great customer interest in Paris and I think it’s a great leap forward for OWB. Paris is a great product, have no doubt! Just needs to get “finished up” and out the door!

I put on a presentation about new Metadata features of OWB Paris… I think some people found it useful, but it also might have been a bit of a firehose at 5pm at the end of conference day. I had this experience at another User Group before and I hereby resolve to refrain from submitting any more ‘highly focused’ presentations. Conference attendees I think would benefit more from something much more widely applicable… All the same, I’ll post the slides when I return for those that ARE interested and either did or did not attend the conference.

Blew off the social event to have a wonderful meal at the “bank” with my fiance. Excellent!

Oracle is free, btw. As if this hasn’t been covered on enough blogs already, much to the dismay of Mr. Thomas Kyte. 🙂