My photo
Auckland, New Zealand
Smurf sized geeky person with a penchant for IT, gaming, music and books. Half of industrial duo 'the craze jones'. Loves data, learning new things, teaching new things and being enthusiastic.

Tuesday, 24 August 2010

Adventures on the dark side

Last week I trundled along to a full day event covering various aspects of Oracle DB schemas, indexing and database optimisation.   

Why did I do this?  From a personal point of view I like to understand SQL from all the different viewpoints.   I spent 2 years as a Sybase developer, I am currently studying for Microsoft SQL Server 2008 exams and have recently been focusing on performance tuning and maintaining indexes in MS SQL.  I was curious what sort of differences, if any, there would be between the Oracle and MS optimisation practices. Quite a lot it seems.

I was expecting to find more similarities than differences, so was mildly surprised by what I heard.  

For the intial stages, Oracle and MS devs are in agreement.  

1. Before you start coding, implement a strategy for handling errors that all the developers/DBAs will adhere to.  Try..Catches that developers add willy nilly are not a strategy.  You need to ensure that errors are captured, reported, handled correctly, and that an audit trail is available.  

2. Spend more time planning your schema and getting data entry optimized to speed up data retrieval. You only insert data once, but you retrieve it multiple times.  It's pretty pointless creating your database and tables and then trying to fix the performance afterwards.  Put more effort into the planning and design to ensure it works. As Einstein said ""If I had one hour to save the World, I would spend fifty-five minutes defining the problem.".  Then again, how many of us work for companies who seem to have no time available for defining and implementing a solution, but all the time in the World for fixing problems and performance tuning after the fact?  There's a lot of very poor planning going on out there.

3. Indexing. This is where ideas started to diverge a little.  The unique index is pretty similar in both DBs, however, when we get into Reverse Keys things start to diverge.   

With MS SQL, one of the main performance maintenance tasks for indices is checking the fragmentation on a regular basis and rebuilding/reorganzing them to get nice tidy leaf blocks.  Oracle Reverse Keys purposefully spread data out across multiple leaf blocks, though they call it dispersement, not fragmentation.  With Oracle indexing, this technique speeds things up as it reduces the chances of multiple people hitting the same leaf block at the same time during a scan.  Reverse keys prevent most range scanning as the data is not stored in a scannable order, so a full table scan occurs each time.   

I asked how sizing works with this type of index and whether the indices were sized dynamically, and they are, which brought me to my next question, doesn't that defeat the object of this type of index?   If you're dynamically sizing, won't it start with 1 leaf block and add the rows randomly to that leaf block to fill it up before creating a new one to enlarge the index?  Won't this strategy only start to work once you have a long standing index where you have started deleting rows on earlier leaf blocks to create space for the random leaf block writing?  Perhaps an Oracle dev/DBA can answer this for me as it's not making much sense at the moment.  

Bitmap indices were the next topic covered. Bitmap indices appear to be the Oracle tool for creating cubes/pivot table type data. These sound great, though I can see many potential issues for them if not used correctly, or misused by people not understanding what they should be used for.  Bitmap indices should NOT be used on tables that are used for transactional processing.  You should only use bitmap indices on tables that process in bulk, preferably using array inserts (i.e. insert/delete thousands of rows at a time) and tables with no contention issues. A reporting server would be a great use for these type of indices; read only most of the time, with scheduled bulk update processing to bring the data up to date.  

4. Brain overloading by now - a full day of something  new but similar at the same time is quite mentally tiring it seems.  Anyhoo... clustering   B*Tree clustering, yup, SELECT * from blah... and just like MS SQL and Sybase, if you have a dev with no brain it can be royally screwed up.  Main thing is to make sure you don't use uneccessary joins/overkill, try to simplify as much as possible to ensure good performance.  This means you need to have a good understanding of your primary and foreign keys.  So, like I said, same in Oracle as it is in MS SQL.  

Hash clusters - I need to find out more about these as they sound quite interesting.  Would be good to know if there is a MS SQL equivalent.  Hash clustering knocks out 75% of IO usage by getting rid of range scanning and going direct to the required table. 

5. Planning, planning and planning - check out this link to see why it's important:  http://www.engadget.com/2009/12/17/led-traffic-lights-dont-melt-snow-do-cause-accidents/ :) 
Anyway, that's it from my foray into Oracle World yesterday.  It was very interesting and I'm considering taking some courses to delve further into things.