SQL Server VS Oracle: Index Monitoring

Jealousy is not an emotion I generally feel for the SQL Server technology stack. I covet their Twitter camaraderie, but I ‘know’ that Oracle can counter just about anything thrown at it.

People that Helped Me with this Post

I was reminded that even old dogs can be taught new tricks while monitoring the #SQLSaturday conversation going on in Chicago this weekend. The subject of monitoring INDEXES came up, and it was casually mentioned that this was pretty straightforward using a new Dynamic Management View (DMV) in SQL 2005.

[sql]
SELECT indexes.[name],
dm_db_index_usage_stats.last_system_lookup,
dm_db_index_usage_stats.user_seeks,
dm_db_index_usage_stats.user_scans,
dm_db_index_usage_stats.user_lookups,
dm_db_index_usage_stats.user_updates
FROM AdventureWorks.sys.indexes indexes
INNER JOIN
AdventureWorks.sys.dm_db_index_usage_stats dm_db_index_usage_stats
ON (indexes.[object_id] = dm_db_index_usage_stats.[object_id])
AND (indexes.index_id = dm_db_index_usage_stats.index_id);
[/sql]

index_usage1 — This is about as sexy as metadata can get!

If you’re wondering why the indexes are showing up multiple times (so did I), @BrentO and @Mike_Fal remind me that index names do not have to be unique, and that partitioned indexes will show up under same name for each partition in this DMV.

The data comes courtesy of SYS.DM_DB_INDEX_USAGE_STATS (docs) – You can find the column descriptions defined there, although they are self-explanatory.

So, assuming you are on at least version 2005 of SQL Server, you have a lot of data at your fingertips the next time a developer asks you if an INDEX is all that important, and is it OK if we drop it from the next build of our application.

What does Oracle give us?
We get a binary, YES or NO. We ask Oracle to start monitoring our INDEX(ES), then later after our workload has ran, we can see if it was touched or not.

V$OBJECT_USAGE (docs)

And there’s always a big BUT
Richard Foote (Twitter) blogs on how this data can be very misleading as the optimizer can use the index for building plans yet not have the index marked as used. Even worse, updating STATS on the index automatically causes it to be flagged. (Blog)

Surely we can do better than that?

Trading tweets with @fuadar, @leight0nn, @arupnanda has confirmed my suspicions: there is no magic bean in Oracle.

There is data you can mine. For example, if an index is being used frequently by your application, there’s a good chance the plan has been cached in the System Global Area (SGA). We could query the indexes from V$SQL_PLAN.

[sql]
select object_name, count(sql_id)
from v$sql_plan
where object_type like ‘INDEX%’
group by object_name
order by 2 desc;
[/sql]

Better than nothing, but I’d hate to make a major indexing change based on what’s in SGA – an extremely volatile space. If the database has been bounced recently, then our data is even more ‘suspect.’ We could also mine the Automatic Workload Repository (AWR), assuming you have paid for the Diagnostic Pack, have licensed Enterprise Edition, and are running at least Oracle 10g. A lot of ifs, and even more holes here as it only contains approximately 10% of the workload observed on a minute by minute basis.

So what’s a data-greedy DBA to do? @arupnanda recommends building a repository of the SGA data. Periodically write it out to a table so you don’t lose anything. Not a bad idea, but you do have to do that yourself. Sure would be nice if Oracle borrowed a page from the SQL Server team. Just saying…

Hey, what about v$segment_statistics?
Well, that’s a new one for me. @leight0nn says that I should check that out. Well, the data looks awesome. According to Oracle, this is available in version 9i and higher, and tracks usage of each segment without enabling any sort of monitoring. The data goes back to when the instance is started.

I found a blog by Rohan Raj Kalra from 2008 that talks about using this data for just such an exercise! (Blog)

[sql]
SELECT object_name, sum(value), statistic_name
FROM v$segment_statistics
WHERE object_type LIKE ‘INDEX%’ AND owner = ‘QUEST_OPTI’
GROUP BY OBJECT_NAME, statistic_name
ORDER BY OBJECT_NAME, 1 DESC;
[/sql]

But Wait!
@syd_oracle reminds me that this data would include any of the IO work that would happen for an index to be maintained. In other words, the reads and writes that happen when a row is inserted into a table and subsequently gets added to the index would be accounted for in this view. Furthermore, other Oracle experts are telling me that the data tracked there is ONLY for DDL work on the indexes. So I’m back to where I started – no reliable method for determining INDEX usage.

The Best Way to Determine Who Will Miss Something, is to Simply Take it Away
In the software industry, we try to have a good idea of what our target users are doing with our products. That way we can deliver value. And, if we ever need to remove a feature, we can estimate the impact and weigh that against the cost of maintaining a feature that ‘no one uses.’ However, every time I have ever seen a feature deprecated, users come out in droves immediately thereafter claiming that was their ‘#1 Favorite Widget!’

DBAs and developers face a similar quandary. While indexing every column seems like a good idea at the time (this is mostly a joke, mostly), the cost of maintaining those indexes is extremely high. You start to ask, if I remove this index, will anyone notice? Oracle isn’t giving up this information very easily. So, let’s just drop it!

Wait!

11g gives us the ‘Invisible Index’ (docs). The benefit here is that once invisible, the optimizer won’t see it for building plans, but, it will continue to be maintained as the source data is changed.

This way, if we decide that it would be a huge mistake to drop the index, we can simply make it visible again. It can takes hours (if not days) to re-create indexes. Thanks @fuadar for the tip!

Have I mentioned how nice it is to have Oracle and SQL Server experts available at 11:30PM on a Sunday for free advice and help?