So while hanging out with a bunch of co-workers last week, someone mentioned they had this trick. The challenge is knowing how many rows you have in your result set before you actually read them all. You see, Oracle has no way of telling us how many records are in our data set before we actually fetch the entire data set.
Yes, Oracle gives us %ROWCOUNT, but that only tells us ‘…how many rows affected so far.’ So when you start to fetch the records back, you know how many you have processed. And, you know there are more left.
Wouldn’t it be nice if you could get the size of the record set and have that available as you process the records?
Well, maybe you do.
Of course, nothing is free. Especially something as ‘cool’ as this.
Welcome to the world of analytics. Tom Kyte, of AskTom fame, has said this of analytics –
Analytics rock
Analytics rollbeen thinking about writing a book just about analytics…
Analytic functions are usually the answer to some challenging set based questions and puzzles. I’ll know when the answer involves analytics, and jump over to Google for a quick example. How they work is a bit beyond the scope of a quick blog post, but I’ll try a bit.
“Analytic functions compute an aggregate value based on a group of rows. They differ from aggregate functions in that they return multiple rows for each group. The group of rows is called a window and is defined by the analytic_clause. For each row, a sliding window of rows is defined. The window determines the range of rows used to perform the calculations for the current row. Window sizes can be based on either a physical number of rows or a logical interval such as time.
Analytic functions are the last set of operations performed in a query except for the final ORDER BY clause. All joins and all WHERE, GROUP BY, and HAVING clauses are completed before the analytic functions are processed. Therefore, analytic functions can appear only in the select list or ORDER BY clause.
Analytic functions are commonly used to compute cumulative, moving, centered, and reporting aggregates.”
You should really, really KNOW about analytic functions.
I feel the same way about regular expressions. While it’s not expected that you master them, you need to know when they are called for and how to figure out how to use them as needed. It’s what separates the intermediate hackers from the professionals.
So here’s the ‘trick code’ and an example:
SELECT MainSQL.* FROM (
SELECT t.*,
ROW_NUMBER() OVER ( ORDER BY 1 ) Row_Number,
COUNT(1) OVER ( ORDER BY 1 RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) Row_Count
FROM (
select * from beer -- YOUR QUERY HERE
) t
ORDER BY 1
) MainSQL
WHERE Row_Number >= 1
AND Row_Number < ( 1 + 50 )
ORDER BY Row_Number;
So everything HAS to happen in the SELECT. What we have is a window being defined for the entire dataset. Or that’s how I read the ‘RANGED BETEWEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING’ bit. So how does this look when we run it?
So the ‘cool’ part is seeing the ‘Row Number’ and the ‘Row Count’ as the dataset is retrieved.
9 Comments
how do you do: count(*) over() total_rows, in ORDS?
Thanks Jeff, that template sql for row_number() solved my pagination issue π
Just one note to your co-workers statement that this will be fast for almost all real web applications π :
My company has a webshop with a couple hundred thousand products. We have a search facility based on Oracle Text.
Suppose I as a customer happen to start a query on a common word so the “where contains()” clause will find 132788 products.
If I place such a query as “mainsql” with an analytic construct to find the total count, Oracle Text will have to access all those thousands of products…
As a customer in such a case I would not like to wait even 3-4 seconds (web users have no patience) to get “Now showing 1-50 of 132788 products”.
I would prefer to blindingly fast get “Now showing 1-50 of many products – we suggest you add an extra word to narrow your search.”
Anyway – that’s opinion π
Oracle Text doesn’t count π
Sure it counts:
http://docs.oracle.com/cd/E11882_01/text.112/e24436/cqrypkg.htm#i996910
π
So use THAT instead, right?
Kim, I think you win for the longest comment reply ever on my blog! Thanks so much for sharing your insight and pointing out Tom’s take on pagination.
Your note regarding the engine having to access all the rows in order to count them is something I hope no one glosses over when they read these ‘tricks.’ It’s not really a trick then, is it?
My co-worker who originally shared the analytic approach with me for this row count puzzle supplied a simple, yet very important caveat: ‘For almost all queries for real web applications, this is so fast and does not make a time difference in terms of with or without the windowing.’ Yeah, don’t build this into your data warehouse reports without doing a bit of testing!
Huh? Where did my indentation go in my post above? I am sorry about that – I am pretty sure I used code tags π
Hi, Jeff
Analytics sure rock’n’roll π
You could combine your trick with Tom Kyte’s (and others) way to “paginate” result sets:
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:127412348064
Something like this:
SQL> variable from_row number
SQL> variable to_row number
SQL> begin
2 :from_row := 1;
3 :to_row := 3;
4 end;
5 /
PL/SQL procedure successfully completed.
SQL> select s2.*
2 from (
3 select s1.*
4 , rownum rnum
5 from (
6 select main.*
7 , count(*) over () row_count
8 from (
9 select empno
10 , ename
11 from scott.emp
12 ) main
13 order by main.ename
14 ) s1
15 where rownum = :from_row
18 ;
EMPNO ENAME ROW_COUNT RNUM
---------- ---------- ---------- ----------
7876 ADAMS 14 1
7499 ALLEN 14 2
7698 BLAKE 14 3
SQL> begin
2 :from_row := 4;
3 :to_row := 6;
4 end;
5 /
PL/SQL procedure successfully completed.
SQL>
SQL> select s2.*
2 from (
3 select s1.*
4 , rownum rnum
5 from (
6 select main.*
7 , count(*) over () row_count
8 from (
9 select empno
10 , ename
11 from scott.emp
12 ) main
13 order by main.ename
14 ) s1
15 where rownum = :from_row
18 ;
EMPNO ENAME ROW_COUNT RNUM
---------- ---------- ---------- ----------
7782 CLARK 14 4
7902 FORD 14 5
7900 JAMES 14 6
SQL> explain plan for
2 select s2.*
3 from (
4 select s1.*
5 , rownum rnum
6 from (
7 select main.*
8 , count(*) over () row_count
9 from (
10 select empno
11 , ename
12 from scott.emp
13 ) main
14 order by main.ename
15 ) s1
16 where rownum = :from_row
19 ;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 4140508541
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 644 | 3 (0)| 00:00:01 |
|* 1 | VIEW | | 14 | 644 | 3 (0)| 00:00:01 |
|* 2 | COUNT STOPKEY | | | | | |
| 3 | VIEW | | 14 | 462 | 3 (0)| 00:00:01 |
| 4 | WINDOW SORT | | 14 | 140 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL| EMP | 14 | 140 | 3 (0)| 00:00:01 |
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("S2"."RNUM">=TO_NUMBER(:FROM_ROW))
2 - filter(ROWNUM<=TO_NUMBER(:TO_ROW))
A couple points here:
“COUNT(*) OVER ()” is a short cut to avoid writing range between unbounded preceding and unbounded following. It also makes the order by unnecessary and makes it clear you simply wish a total count without any regard to a particular “window” of data.
Using ROWNUM for the filtering rather than ROW_NUMBER gives Oracle the possibility of doing COUNT STOPKEY in the access plan. (I believe maybe the optimizer is sufficiently intelligent in recent versions that it may do the same thing with ROW_NUMBER – I am not sure π
But beware…
Even though the ROWNUM (or ROW_NUMBER) pagination/filtering makes it possible not to FETCH all the data in order to know the total count of rows, the Oracle SQL engine still has to ACCESS all the rows in order to count them. In many cases it will still be expensive, but for example if your query has columns with expensive function calls then it is possible that the counting of the rows is relatively cheap compared with what you save because only the rows up to :TO_ROW will actually call the functions.
So in general I personally would advise using this magic trick with caution. Don’t use it all over your application simply because your web developer says “it’s cool to be able to write page 1 of 2705432 pages” π