Snapshots on Snapshots

Copyright 1999-2007 David Russell

San Diego-based Gateway hired Andersen Consulting to write their web application: the catalog behind their internet ordering system.  Andersen had close to 200 people on the premises when I came on board, doing everything necessary to complete the project on time.  There were two issues against the database: data was not propagating, and data connections on the web servers were hanging.

 

In order for the data to propagate, the application required data to be staged in five separate Oracle databases.  Data had to be editable at three of those five databases, with the data getting better the closer it got to memory, and it had to be correctable at the last minute, at the point closest to where it was next needed.  At midnight the entire catalog was loaded into RAM for the quickest web access.  It was no longer Oracle at that point.

 

Nobody knew I was coming.  The lady who had hired me into Gateway had gone, and the two positions above me were open.  The next person in the chain was too far up the ladder to even talk to me.  The two existing DBAs had no experience with snapshots, and they were busy with day-to-day operations.  I was turned over to an Andersen consultant and reported directly to him.  The timing was perfect.  They needed data and I knew how to give it to them.

 

The project was four months away from completion and they had nothing more than an idea as to how this would work.  In order to develop the application they were using a full database at each node.  There was no refresh and no propagation during development… at least until I got there.

 

… And success hinged on this tiered approach.

 

Snapshots on snapshots worked perfectly.  At least they worked once you got past the Oracle bugs with snapshots (materialized views), context indices, and database connections.

 

Context indices (now a component of InterMedia) were expected to assist in local searches.  These bugs were side-stepped by dropping the indices prior to doing a snapshot refresh, and rebuilding them after the refresh was finished.
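
The drop, refresh, rebuild sequence can be sketched as an ordered list of statements.  This is only an illustration in Python, not the scripts used on the project: the snapshot name CATALOG_SNAP, the index name CATALOG_TEXT_IDX, and the column DESCRIPTION are hypothetical, the complete-refresh method 'C' is an assumption, and the syntax shown (DBMS_MVIEW.REFRESH and the CTXSYS.CONTEXT index type) is that of later Oracle releases and may differ from what was in use at the time.

```python
def refresh_plan(snapshot, text_indexes, method="C"):
    """Return the SQL statements, in order, for refreshing one snapshot.

    snapshot     -- snapshot (materialized view) name; hypothetical here
    text_indexes -- dict mapping a text index name to the column it covers
    method       -- refresh method passed to DBMS_MVIEW.REFRESH
                    ('C' = complete is an assumption, not from the project)
    """
    plan = []

    # 1. Drop every text index first so the refresh never touches one.
    for index_name in text_indexes:
        plan.append(f"DROP INDEX {index_name}")

    # 2. Refresh the snapshot itself.
    plan.append(f"BEGIN DBMS_MVIEW.REFRESH('{snapshot}', '{method}'); END;")

    # 3. Rebuild the text indexes only after the refresh has finished.
    for index_name, column in text_indexes.items():
        plan.append(
            f"CREATE INDEX {index_name} ON {snapshot} ({column}) "
            "INDEXTYPE IS CTXSYS.CONTEXT"
        )
    return plan


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for statement in refresh_plan("CATALOG_SNAP", {"CATALOG_TEXT_IDX": "DESCRIPTION"}):
        print(statement)
```

The point of building the whole plan up front is that the ordering is explicit: no index exists while the refresh runs, and nothing is rebuilt until the refresh statement has completed.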

 

The data connections were doctored a bit with scripts to kill dead connections on one server while the other two assumed the balanced load.  This was repeated on each server periodically to clean out the threads.  Some variation of this problem has repeated itself several times - and I have yet to see the real solution.
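
The rotation described above can be sketched roughly as follows.  This is a minimal Python outline under stated assumptions, not the original scripts: the server names and the kill_dead_connections callable are hypothetical stand-ins for whatever actually cleared the hung connections, and how load shifted to the other servers is not modeled.

```python
def rolling_cleanup(servers, kill_dead_connections):
    """Clean dead connections on one server at a time.

    servers               -- list of server names (hypothetical)
    kill_dead_connections -- callable taking a server name and returning
                             how many connections it killed (hypothetical)

    Returns a list of (server, servers_carrying_load, killed) tuples.
    """
    if len(servers) < 2:
        # With a single server there is nobody left to carry the load.
        raise ValueError("need at least two servers for a rolling cleanup")

    log = []
    for server in servers:
        # Every other server keeps serving while this one is cleaned.
        carrying_load = [s for s in servers if s != server]
        killed = kill_dead_connections(server)
        log.append((server, carrying_load, killed))
    return log


if __name__ == "__main__":
    # Stand-in that pretends each server had two dead connections.
    for entry in rolling_cleanup(["web1", "web2", "web3"], lambda server: 2):
        print(entry)
```

Run periodically, a loop like this only treats the symptom, which matches the experience above: it clears the threads without explaining why they hung.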

 

Oracle is a very reasonable company to work with in a lot of ways.  By the time of the second TAR (Technical Assistance Request) the words “not a supported configuration” had been used two too many times.  They agreed that it was a supported configuration after we specifically addressed a group of their technical best with details of where we were using logs, where the databases were read-only, and so on.

 

Oracle's Silver support plan used a pool of people.  Not only was there no way for whoever answered the phone to know your system, but within Oracle there was no way to pass that information along gracefully.  Gold support used dedicated agents, and they kept a log of configurations.  Amid all our other problems, we now had to negotiate a procedural change within Oracle Corporation or maintenance would forever be a nightmare.  All problems were resolved.  Oracle Silver support inherited the form that Gold support had used.

 

When the application was finished it passed from Andersen Consulting in San Diego to Lockheed Martin in Colorado Springs, where it would be maintained.  Lockheed sent an employee from North Dakota to San Diego to be taught how to build the setup from scratch, maintain it, and rebuild it.  The Lockheed employee stayed with me for 4½ days, left on Friday, and arrived in Colorado on Monday, where he maintained the system for the next six or seven months – until he quit and used me as a reference for his next job.

 

This was a fun project.  It took some design and debugging to make it work, but it worked well and the project was completed on time.

Last Revised: April 2007