We’re Gonna Need a Bigger TEMP Tablespace

thatjeffsmith Database Stuff 17 Comments

Tell Others About This Story:

It’s one of the most iconic scenes in American film – Roy Schneider’s character is chumming the water and sees the shark they are hunting. The massive size of the fish causes him to comment, “We’re gonna need a bigger boat!” It’s shark week, and I was struggling with inspiration for blog posts, so apologies if this metaphor is stretched beyond your imagination.

When Oracle (and SQL Server) need to do some expensive math operations, primarily SORTS, it needs a lot of disk space. When you want to join or order data, the database engine gets to do all the low-level yucky math stuff that you ignored in college (or at least I did). Instead of going to the OS for disk space to use this, the database has its own TEMP tablespace. In Oracle you can even have multiple tablespaces set aside for TEMP operations. But my questions are, ‘How much is enough?’ and ‘How much is too much?’

Wait, why is this bad?

I had never really considered having too much of anything as a problem before, but an offhand tweet last week by @erinstellato gave me pause. My current database has a 5+GB TEMP tablespace. The image it runs on only has 30GB of space and runs pretty full most of the time. So maybe my boat is really way too big? Or, maybe I can get greedy and steal back some of the space for my OS.

Instead of researching and finding out the best way to determine TEMP size, I’m just going to re-size this to 2GB and see what happens. I’ll wait and watch for

ORA-01652: unable to extend temp segment by string in tablespace string

I’ve additionally enabled auto-extend with 512MB increments.

How Do I Properly Size My TEMP Tablespace?

Is there a magic rule or ratio based on the size of your database or the number of transactions or users running amok in your system? Back in the day when I was managing customer databases, we had a script to build the database — which included TEMP — and then going forward we would just add space as necessary. Is this really the best way? I’m guessing not. So let’s jump to the innerwebs to see what the experts have to say!

It's official - You Suck at Google

  • Creation and Maintenance of TEMP (Oracle Docs) – good stuff, but nothing about sizing guidelines

And that’s all I could really find. Multiple Google’s on ‘sizing recommendations’ and ‘how to size’ really didn’t give me much. So I’m waiting for real production DBAs and other people smarter than me to share their thoughts here.

I’m also happy to learn more about sizing TEMPDB too if you’re a SQL Server type!

I’ll be happy to report back on the ‘magic number’ I end up with on my own personal Oracle 11gR2 database. Hopefully I come up with that number based on trial and error or science – and not on guessing. But I’ll be happy with a number that doesn’t starve the OS or the database. I think a lot of folks really only care about the end result and not how they got there. Is that bad when it comes to managing TEMP?

I found a few good links for stories and advice around being sure to tune your queries and having proper stats on your objects so you don’t slam TEMP. But how do you know when you’ve reached the limit of database tuning? Or

Is Your Server So Big You Can Afford to Throw Tons of Space at TEMP and Not Worry About It?

8/3/2011 Update
Enkitec consultant and Oracle expert Kellyn Pot’Vin did more than share a link, she authored a topic on the subject! Here it is in all its glory!

Warp Speed with Temp Tablespace Groups

Tell Others About This Story:

Comments 17

  1. Ironic, but had issue today where waits on temp read/write across SSD was a performance hit, (sequential reads were first seen as hit, then upon investigation, the IO hit wait was from the temp tablespace groups!) The performance difference between two processes, same CTAS, same explain plan, one only using 6MB of TEMP, the second 13GB of TEMP was over 32 fold.
    My goal is to figure out a way to redesign the code or find a way to guarantee more PGA towards the process. Fix the temp usage, fix the performance… πŸ™‚

  2. You got my mind chewing on this.
    Another aspect to consider is how many sessions/transactions are currently using temp space. The contents are private so it is inherently ‘selfish’ (That is MY space). Multiple sessions using temp space may be a sign of a problem (which is an extension of OLTP apps shouldn’t need temp).

  3. I have many adhoc users and due to the demands on the business, a query can utilize huge amounts of temp from poorly written SQL and designed objects. A couple of the analysts/developers thought it was a competition, with the worst being 3TB of temp tablespace used for one PROCESS!! Understanding how temp is used vs. the size of the data set is incredibly important and how and when to use “step through tables” during an ETL process can save 100’s GB in TEMP. This can not only save wasted disk space but also enhance performance, often taking processes that sorted for hours down to mere minutes. Sorting and hashing to temp should be seen no different than swapping to disk when you are out of memory. This is an expensive process as you are no longer in fast memory, but slow disk. Faster disk, such as SSD can render some of this obsolete in our current configurations, but we still need to investigate the costs. All explain plans for DSS systems especially, should show the temp usage and IO cost, not just the CPU cost, cardinality and object information.

  4. Another thought.
    TEMP can be a trade-off.

    If an occasional report requires sorted data, that sorting can be done at the time, requiring temp, or perhaps through an index. The latter incurs the performance penalty on writes, the former on reads.

    And the other obvious trade off is doing a sort in memory (sucking up PGA and potentially SGA depending on how your memory limits are set) against a spill to disk. The latter is lower for that process/query but might be friendlier to the rest of the application.

    Different users can be set up with different temp tablespaces (and hence limits) to isolate them according to what is expected of them

  5. #!/usr/bin/ksh
    #############################################################################
    # chk_tmp_undo.ksh
    # Kellyn Pot'Vin
    # Usage: ./chk_tmp_undo.ksh   
    # Good Base for Thresholds is 50% of Temp/Undo tablespace size. That will give you the warning when something large is running
    # Threshold is in MB- not GB!!
    #############################################################################
    
    #----------------------------------------------------------------------------
    # Verify that the ORACLE_SID has been specified on the UNIX command-line...
    #----------------------------------------------------------------------------
    if (( $# != 3 ))
    then
    echo "usage: $0 SID tmp_pct undo_pct"
    exit 1
    fi
    #
    #----------------------------------------------------------------------------
    # Set up Oracle environment...
    #----------------------------------------------------------------------------
    
    export ORACLE_SID=$1
    export TMP_THRSHLD=$2
    export UNDO_THRSHLD=$3
    echo "Oracle SID: "${ORACLE_SID}
    export ORACLE_BASE=/u01/app/oracle
    export ORACLE_HOME=`grep -i ^"$ORACLE_SID" /etc/oratab | awk -F: '{print $2}'`
    export PATH=$PATH:${ORACLE_HOME}/bin
    
    
    export EXEC_DIR=/common/admin
    export SQL_DIR=/common/sql
    export LOG_DIR=/common/logs
    export ML_LIST=""
    
    export UNDO_FL=${LOG_DIR}/${ORACLE_SID}_undo.log
    export TMP_FL=${LOG_DIR}/${ORACLE_SID}_tmp.log
    export OU_FL=${LOG_DIR}/${ORACLE_SID}_u.out
    export OT_FL=${LOG_DIR}/${ORACLE_SID}_t.out
    
    echo "UNDO_Threshold: $UNDO_THRSHLD"
    sqlplus -s < < EOF | read GET_UNDO
    / as sysdba
    set feedback off
    set head off
    set pagesize 0
    select sum(round((vu.undoblks*32768)/(1024*1024))) "MB of UNDO"
    from v$undostat vu
    where ((vu.undoblks*32768)/(1024*1024)) > 0;
    EOF
    export $GET_UNDO
    
    sqlplus -s < < EOF > $OU_FL
    / as sysdba
    set head on
    set feedback on
    set serveroutput on
    select distinct vt.sql_text, vs.sid,vs. osuser, vs.sql_id, sum(round((vu.undoblks*32768)/(1024*1024)))MB
    from v$sqltext vt, v$undostat vu, v$session vs
    where vu.maxqueryid in vs.sql_id
    and vs.sql_id=vt.sql_id
    and vt.piece=0
    and ((vu.undoblks*32768)/(1024*1024)) > 0
    group by vt.sql_text,vs.sid,vs. osuser,vs.sql_id, vu.undoblks
    order by MB desc;
    
    select sum(round((vu.undoblks*32768)/(1024*1024))) "MB of UNDO"
    from v$undostat vu
    where ((vu.undoblks*32768)/(1024*1024)) > 0;
    
    EOF
    
    if [[ $GET_UNDO -gt $UNDO_THRSHLD ]]
    then
    echo|mail -s "${ORACLE_SID}: $UNDO_THRSHLD MB Threshold of Undo Usage Surpassed" $ML_LIST < $OU_FL
    fi
    
    echo "TEMP_Threshold: $TMP_THRSHLD"
    sqlplus -s << EOF | read GET_TMP
    / as sysdba
    set head off
    set pagesize 0
    select sum(trunc(swa.tempseg_size/1024/1024))"TEMP TOTAL MB"
    from v$sql_workarea_active swa;
    EOF
    export $GET_TMP
    
    sqlplus -s << EOF > $OT_FL
    / as sysdba
    set head on
    set serveroutput on
    col sid format 9999990 heading 'SID'
    col osuser format 999,999,999,999 heading 'OSUSER'
    col tmpseg_size format 999999990 heading 'TEMP TOTAL MB'
    col sql_id format 999,999,999,999,999,999 heading 'Active SQL_ID'
    select swa.sid, vs.process, vs.osuser, vs.machine,vst.sql_text, vs.sql_id "Session SQL_ID", 
    swa.sql_id "Active SQL_ID", trunc(swa.tempseg_size/1024/1024)"TEMP TOTAL MB"
    from v$sql_workarea_active swa, v$session vs, v$sqltext vst
    where swa.sid=vs.sid
    and vs.sql_id=vst.sql_id
    and piece=0
    and swa.tempseg_size is not null
    order by "TEMP TOTAL MB" desc;
    
    select sum(trunc(swa.tempseg_size/1024/1024))"TEMP TOTAL MB"
    from v$sql_workarea_active swa;
    EOF
    
    if [[ $GET_TMP -gt $TMP_THRSHLD ]]
    then
    echo|mail -s "${ORACLE_SID}: $TMP_THRSHLD MB Threshold of TEMP Usage Surpassed" $ML_LIST < $OT_FL
    fi
    
    rm -f $OU_FL
    rm -f $OT_FL
    exit 0
    
    1. JeffS Post
      Author
    2. JeffS Post
      Author
  6. Now having a tempDB quite large in SQL Server is a harsh issue, but I don’t consider this a feature that can be compared apples to apples with Oracle in the way “temp” space is utilized. Nothing shows poor performance like a huge TEMP tablespace in Oracle, where in SQL Server, it will show itself in a large transaction log. Tune the queries and in Oracle, TEMP tablespace usage shrinks as PGA is able to aborb the sorting and hash work that would have “spilled over” to the TEMP tablespace and in SQL Server, you can shrink the tran log and it will actually stay a reasonable size!
    In Oracle, if you want to know how to size it or even better yet, what needs to be tuned in an environment BEFORE it goes production, monitor the memory work area and see what is spilling over from PGA to temp, along with the majority of temp usage. Checking the PGA cache hit ratio will also tell you if you are utilizing TEMP too often for spill over and need to resize it to stop the TEMP work.

  7. You should distinguish between OLTP and DSS systems (or at least queries):
    In a perfect OLTP system, no TEMP should be needed at all: If any statement does so much sort which can not be handled in memory, the statement might be worth tuning (or worse: the design is inappropriate). In a real world you should be able to know these few statements and estimate how many of these might run in parallel.
    In a DSS system you will need TEMP for sure. But not only TEMP, also CPU, IO (also for TEMP) and probably other resources for the statements. Again you should know at least those statements, which need most of them. As it makes no sense to run more statements than resources are available, the first saturated resource limits the total required resources – also TEMP.

    1. JeffS Post
      Author

      How many perfect OLTP systems have you encountered? Plus you’re assuming there’s plenty of memory…I’m thinking there’s a lot of AD HOC/reporting happening inside the OLTP database. That’s a much bigger ‘problem’ I suppose.

      As a DBA how do you track/ID queries using TEMP? Is that a report you run on a regular basis, or just something you look for when you get a performance ticket?

    2. While the OLTP application might not need TEMP tablespace, some DBA operations may. The obvious example is building a new index.

      So a rough and ready solution is, your initial data load (eg some migrated data from an old system) will probably keep failing until you have enough TEMP space. Then you leave it there and your OLTP shouldn’t have a problem.

  8. For new customers, I try to understand how many users they will have, how busy the system will be, and size from there *to start*. It’s a best guess…and can anywhere from 2GB to 20GB, it just depends. Then, I tell them how to monitor it and explain that the size may need to increase at some point as they bring on more users or add more functionality.

    Yes, when you bounce the instance it will come up with tempdb sized at 8 MB UNTIL you increase the size. Once you do that, it will recreate tempdb at that size when you restart.

    1. JeffS Post
      Author

      Your best guess is really based on years of understanding your users and your app. So it’s not really a guess anymore πŸ™‚

      Thanks for the clarification on the restart. That’s actually a very nice feature!

  9. There is no magic number for tempdb size in SQL Server πŸ™‚ just as there’s no magic size for the transaction log. Here’s a couple MS articles on tempdb:

    http://msdn.microsoft.com/en-us/library/ms345368.aspx
    http://blogs.msdn.com/b/sqlserverstorageengine/archive/2009/01/04/managing-tempdb-in-sql-server-tempdb-configuration.aspx

    The general recommendation is to run the system under normal production load (if you can recreate this in advance of go-live in a TEST/DEV environment, that’s great…but it really doesn’t happen too often) and see how large tempdb grows. Then, size it accordingly in Production. In SQL Server, when you restart the service, tempdb goes back to it’s default size (8 MB). For a Production instance, this is usually way too small. In addition, you can create multiple tempdb file to enhance concurrency, which is an entirely separate discussion πŸ™‚

    1. JeffS Post
      Author

      I don’t see – in general – folks running Test/Dev with the same amount of data or users. So while that’s great advice, I’m betting it won’t do much for 80-90% of the users out there. When you deploy a new customer site, how do you size their TEMPDB? Have you come up with a Goldilocks, not too cold, not too hot number? Or do you run in test dev onsite and then tune the production instance as appropriate?

      Are you saying after a server bounces it will always come up with a 8MB tempDB in SQL Server? Or is there a way to keep it at a higher low-water-mark?

Leave a Reply

Your email address will not be published. Required fields are marked *