Monthly Archives: October 2004

Oracle on VMWare Linux, Part 2

I’ve written about Oracle on VMWare before, but thought I would share an alternative perspective with readers. Howard Rogers has published a nice article on how to install VMWare, install White Box Linux, and suggests it works perfectly well. While I wouldn’t now suggest that production environments consider it suitable, I’m glad that someone has had a better experience with this product than I’ve had.
Note: There’s also an interesting discussion about his favorite font, for those looking for a digression.

What's that Java Application doing?

An engineer at one of my customer projects was having some difficulties with a third party program and Oracle. The application was referencing objects that didn’t exist, and throwing ORA- errors. There were difficulties in getting the attention of the Oracle DBA group to help troubleshoot the issue, so he found a clever way to find his information. He plopped in p6spy.

P6Spy is an open source framework to support applications that intercept and optionally modify database statements. The P6Spy distribution includes the following modules:

P6Log. P6Log intercepts and logs the database statements of any application that uses JDBC. This application is particularly useful for developers to monitor the SQL statements produced by EJB servers, enabling the developer to write code that achieves maximum efficiency on the server. P6Spy is designed to be installed in minutes and requires no code changes.
P6Outage. P6Outage detects long-running statements that may be indicative of a database outage proble and will log any statement that surpasses the configurable time boundary during its execution. P6Outage was designed to minimize any logging performance penalty by logging only long running statements.

He indicated it took only a few minutes to plop it in the application, and troubleshoot which object was missing. Helpful little utility if you’re in a bind. Thanks for the tip Russ!

PSOUG : 10/20/2004 : DBMS_RECTIFIER, Instead of Triggers, ORA_HASH

As a recent transplant to Seattle, I’m keen on getting connected with my local community of Linux, Open Source, BI, and Oracle brethren. I took in my first Puget Sound Oracle Users Group meeting last night. The group draws between 10-50 people per week, appears well organized, and based on my observations from my first meeting, provides substantial substance in a peer-to-peer setting.

Daniel Morgan, from the UW extension school, presented three mini-lectures on DBMS_RECTIFIER, Instead of Triggers, and ORA_HASH. I highly recommend you visit these pages on PSOUG, they have examples that demonstrate the following features quite nicely.

DBMS_RECTIFIER is a rather interesting package, especially for someone who spends a great deal of my Oracle time building CDC routines for Data Warehousing. At a basic level, it’s two procedures that provide a quick method to easily identify differences in data tables, and correct those changes. The rectify has an interesting behavior, and it’s not particularly well suited to CDC. However, the differences procedure quickly identifies differences (can be scoped to only a few columns as well) and places the different ROWIDs in a seperate table. In a pure Oracle environment, a BI professional could make use of this since ROWID provides a very quick way to access rows.

Instead of Triggers apparently have been availabe since Oracle 8. Some views are updateable, if they match a certain criteria. However, those that do not match this criteria, there is still a way to allow for updates/deletes/inserts on these views. Oracle allows you to make a special trigger, an Instead Of trigger that only applies to Views. They allow someone to specify exactly the DML to occur, and can include procedure and function calls, and DML on completely different tables. This could be used to :

Audit all inserts/updates through a defined view to a seperate audit table. I know this is old news, but this is a pretty slick way to accomplish this.

Provide an interface where certain DML statements become impossible. One can provide a view to a table that limits the DML so that certain columns are never updated.

Interface with an external system. One could create views in Oracle that provide an interface to a remote system. Data inserted/updated/deleted from these views could send these inserts/updates/delete to an external system via anything you can write in PL/SQL.

ORA_HASH. Quick. Flexible. Hash Function. Useful for tons of things. If you know how to use HASHing, you’ll find this convenient. It too, has been in Oracle for ages. 🙂

OWB Runtime : How to purge audit data easily

The OWB runtime generates runtime audit data for every process, mapping, and file loading that it invokes. If you have intricate flows and load your warehouse/mart several times a day (every 5 minutes perhaps?) you’ll see these pile up quickly. The easiest user method for purging this audit data is through the OWB Runtime Browser that allows you to select (checkbox).

Sometimes you don’t want to manually do this, but rather have a way to do this programatically. Perhaps because you want to keep a clean house (always purge greater than 30 days old). Perhaps you generate a lot of audit data in a development environment that’s a hassle to delete from OWB RT Browser. Perhaps you rarely visit the OWB RT and don’t want to fuss with keeping it up.

Whatever your reason, please feel free to use the following little script to systematically purge old audit data. It must be run as your OWB Runtime Owner. I’ve not run this by Oracle, or seen them document this API so use it at your own risk. It works, though.

begin
for purge_ids in (select distinct top_level_execution_audit_id id
from ALL_RT_AUDIT_EXECUTIONS
where updated_on

OWB10g Paris : Installation and Security

The OWB team has clearly heard the masses and responded to their difficulty in installing and getting “up and running” with OWB. I have to think back to those first few days of head scratching over the installation and “quick” start guides to remember how tough it was to just get to a point where you’re building tables and composing mappings. Thinking back it seems painless but as I’ve observed at customers and conferences alike there are great difficulties and a learning curve to installing OWB. Smart, intelligent DBAs and BI architects had to spend some amount of time to understand the concepts to get going.

Previously there were numerous steps (I believe the OWB team mentioned a total of 60 steps, but those included “double click on this” and the like). They’ve been simplified by the consolidation of the two repositories. OWB veterans are aware there were previously two primary repositories that one interacts with to get things done. There was the DESIGN repository which held the metadata, ETL definitions, table structures, etc Then there was the RUNTIME repository which invoked and audited the acutal flows, mappings, and deployments being executed. Beyond that, every schema that was going to execute any mappings, or have any data objects also required the installation of a TARGET runtime components so the runtime repository could execute and monitor ETL in that schema. In a stage/warehouse/multiple mart setting, this sometimes meant stepping through the Repository Assistant many times (24 times at one of my customers for a set of development/stage/production environments).

OWB10g Paris consolidates the two primary repositories (DESIGN and RUNTIME) into one unified repository. This greatly eases the burden of installation, which can be done from a simple “New User” wizard on the client. During my testing of the product, we were working with a single database so I was not able to test a multiple database installation. The OWB product team indicates that the original topology is still available for customers wishing to use only portions of certain repositories (ie, use one design, with development/testing/production runtimes).

There is also an augmented OWB user security, permissions, and role based system that supports another great feature of OWB installation. Target schemas (which often make up the bulk of installation time in new environments) allow for new target schemas to be added as a simple afterthought with on the last screen of an add User wizard. OWB objects now have permissions and can be assigned throttled permissions (full ctl, edit, read) by both user and role.

It was noted by OWB product team that the security provided in OWB is enforced for access to the metadata, not the actual deployed objects. If one restricts the “Customer” table in OWB to only those with “INTERNAL_EMPLOYEES” this means that only those users in INTERNAL_EMPLOYEES are allowed to utilized, examine, and otherwise edit the CUSTOMER OWB table. When that table is generated and deployed, it is then subject to the normal Oracle permissions and OWB does not have a method for managing that.

My fellow testers echoed the sentiment that if OWB is really positioning itself to be an inclusive, central location for BI work in an enterprise it will also need to figure out how to capture security and access rules. OWB team product team acknowledged need… We’ll see if it ends up in a major revision or two out.

This blog is part of the OWB Paris Early Review series which reviews and comments on several new Paris features.

owa_pattern.amatch gotcha

I’ve been using regular expressions for years. A few handy cut, sorts, egreps, and uniqs work wonders on datafiles for quick easy troubleshooting. I spent an hour or so tracking down what I think is a funny implementation of regular expression matching. Thought I would quickly post it, and hope google picks it up so that perhaps I can save some poor soul the couple hours I just lost to it.

There is limited information about owa_pattern available; one of the better resources I’ve found is this page on stormloader.com. One of the bits of advice on that page is to:

Or use the second function in owa_pattern, amatch. This function has an IN parameter that dictates where you want to start to match. For our purpose, we don’t care so set it to 0. What we want from it is the type of the return value, integer.

Examine the next two commands and their results, and perhaps you can glean what the issue is with setting the offset to 0
select DATE_UTC, owa_pattern.amatch(date_utc, 0, ‘2004’) from xxxx yields

[12/Jul/2004:00:07:27	0
[12/Jul/2004:00:07:37	0

select DATE_UTC, owa_pattern.amatch(date_utc, 9, ‘2004’) from xxxx yields

[12/Jul/2004:00:07:27	13
[12/Jul/2004:00:07:37	13

The offset, or where to start, doesn’t necessarily mean that that is where the pattern matching will begin to scan for the pattern. It most literally means that the pattern must start at that offset. I’m not a huge expert on regex, but I wonder why a function that is supposed to tell YOU where the regular expression exists in the string requires that it start on an offset specified. Anyone have a perspective or a different experience (perhaps I’m looking at this improperly)?

We've come a long way

I’ve had some moments over the past few days that have all served to solidify that the world has come a very long way in the last 10 years. I recall 1994 where I was about the only person I ran with socially that had an email address, and the idea that I had a website (hosted by a research project for Sun) with my resume was very fringe and nerdy.

How all that has changed… As I write this, I am sitting at a corner seat at Tully’s coffee shop in Seattle, WA with only a power cord connected to my laptop(WiFi is brilliant). I am connected remotely and securely to a customer location working on building a web traffic and customer analysis Data Mart. I have just ordered a 6.0/768 Mbps line for my new home office that will be provisioned in one week. I have ordered Voice service over that dedicated data line for just $30/mo more. The progress is astounding… I am listening to a radio station from the UK clearer than local FM. Put it all together and I, along with others, expect the next few decades of human to truly transform society and life on this planet. These are great times we live in…

OWB10g Paris : UI Improvements

Of course software vendor product teams products shots always show their UI configured in the most efficient way possible leading you to below that the development tool their pitching will be the greatest thing for your efficiency since Venti quadruple non-fat lattes. Many times, the actual product usually falls short and requires a significant curve to “grock” how the product designers thought that YOU would be going about your job. The OWB product team has experience with doing a significant rewrite of their GUI (9.0.3 to 9.2 was significantly different). If those OWB verterans thought that was significant, prepare to be shocked.

The OWB10gR2 GUI will delight those that have not used OWB before as it has embraced some UI paradigms of other IDEs. Those familiar with JDeveloper, JBuilder, NetBeans, etc. will have a much easier time learning OWB. The OWB product team has accomplished this by making two significant UI changes:

Smaller Fonts are pervasive through the product. : I dread having to return the OWB 10gR1 with it’s monstrous fonts that make displaying even the most simple of mappings difficult. This smaller fonts on dialogs, wizards, editors, and navigation are a great enhancement to the product. In the short list of features I would have submitted for the 10gR2 version of the product this would not have been on it. I am very thankful that I was not the one composing the shortlist; the smaller fonts make the entire product much more usable on the whole.
Dockable Panels throughout : The editors and OWB navigator have created a much improved display to support the real work that occurs in OWB. They have created a set of screens that allow the focus to be on the primary work at hand (process flow, mapping, table definition, etc) but allow for the peripheral, oft utilized parts of the definition available in a fast refreshing set of panels surrounding the main event. One panel display the properties for the objects highlighted thus eliminating MANY MANY redudant clicks and scrolls, and highlights, and drills to the attribute. Another panel displays a palette of objects that can be added to the process or mapping. The panels are dockable (make them float), and collapsable (running out of space quickly expand and collapse them).

While most of the benefit of their UI efforts are captured in the two major changes there are other beneficial UI changes. Overall the OWB feels like it is quicker, and more attuned to the work at hand. Birds Eye Views of your edited object, automatic layout buttons all add a certain refinement and really lead me to believe that OWB is really maturing into a robust comprehensive product. Of course, it must be for us BI consultants, ETL developers, architects, DBAs to spend entire days with our noses plugged to the screen.

This blog is part of the OWB Paris Early Review series which reviews and comments on several new Paris features.

Open source ETL

I was visiting with someone the other day who mentioned they might eventually be moving to an area that, because of a variety of factors, is predominantly inexpensive and free software solutions. This person is very knowledgable about databases, ETL, etc. and would probably like to continue their work in this regard. I’ve seen some mention of this before. While poking around the other day, I ran across an open source package for ETL. While it’s simple, and somewhat limited it could serve as a starting point or a very inexpensive alternative for a small MySQL based mart.

Six Apart gets nod from Investors

The makers of Typepad and Moveable Type have long been regarded as the professionals blogger tool. There are copious amount of good, free, open source blogging tools available which is fine for a great many purposes. However, the quality and support from a nominal license fee ensures that those wishing to do a professional looking and operating blog have a source. Ran across someone who thinks highly of the company, as well as the technology on a blog on ventureblog.com.

Of course, there’s more to Six Apart than powerful software. As elegant as I think both Movable Type and TypePad are, to quote myself again, “it’s the people, stupid!” The Six Apart team is phenomenal and getting better every week. I don’t invest in products or ideas or business models.

Note: The blogger has invested in the company… 🙂

Goodman on BI

Thoughts on Open Source, Analytics