Monday, September 10, 2007

Oracle - Counting the rows of a table

Most people use the oracle count function in the following way to count the number of rows in a table:

select count(*) from my_table;


While this does technically work, on larger tables it can be grossly inefficient to perform this query. It should also be noted that using the count() function on a single column can also be inaccurate because Oracle aggregate functions (avg, mean, max, min, count, etc.) will ignore null values. For example, the query:

select count(name) from my_table;

will return only the number of rows that have a value for name. Which may be what you want. If, however, you are looking to count the number of rows in a table you will need to supply a column that is specified as not null or a primary key. If your table doesn't have either of these constraints you can use:

select count(rowid) from my_table;

The rowid keyword is an internally used unique identifier that Oracle uses for each row.


8 comments:

ddf said...

Can you prove your assertion? In 10.2.0.3 using rowid in place of * may not be more efficient:

SQL> select count(*)
2 from test;

COUNT(*)
----------
11007


Execution Plan
----------------------------------------------------------
Plan hash value: 3467505462

-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 (8)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| TEST | 11007 | 13 (8)| 00:00:01 |
-------------------------------------------------------------------

Note
-----
- dynamic sampling used for this statement


Statistics
----------------------------------------------------------
5 recursive calls
0 db block gets
91 consistent gets
42 physical reads
0 redo size
210 bytes sent via SQL*Net to client
245 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL>
SQL> select count(rowid)
2 from test;

COUNT(ROWID)
------------
11007


Execution Plan
----------------------------------------------------------
Plan hash value: 3467505462

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12 | 13 (8)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 12 | | |
| 2 | TABLE ACCESS FULL| TEST | 11007 | 128K| 13 (8)| 00:00:01 |
---------------------------------------------------------------------------

Note
-----
- dynamic sampling used for this statement


Statistics
----------------------------------------------------------
27 recursive calls
0 db block gets
93 consistent gets
0 physical reads
0 redo size
231 bytes sent via SQL*Net to client
245 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL>

I'll let you prove otherwise.


David Fitzjarrell

ddf said...

Can you prove your contention? In 10.2.0.3 using rowid in place of * may not be more efficient:

SQL> select count(*)
2 from test;

COUNT(*)
----------
11007


Execution Plan
----------------------------------------------------------
Plan hash value: 3467505462

-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 (8)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| TEST | 11007 | 13 (8)| 00:00:01 |
-------------------------------------------------------------------

Note
-----
- dynamic sampling used for this statement


Statistics
----------------------------------------------------------
5 recursive calls
0 db block gets
91 consistent gets
42 physical reads
0 redo size
210 bytes sent via SQL*Net to client
245 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL>
SQL> select count(rowid)
2 from test;

COUNT(ROWID)
------------
11007


Execution Plan
----------------------------------------------------------
Plan hash value: 3467505462

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12 | 13 (8)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 12 | | |
| 2 | TABLE ACCESS FULL| TEST | 11007 | 128K| 13 (8)| 00:00:01 |
---------------------------------------------------------------------------

Note
-----
- dynamic sampling used for this statement


Statistics
----------------------------------------------------------
27 recursive calls
0 db block gets
93 consistent gets
0 physical reads
0 redo size
231 bytes sent via SQL*Net to client
245 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL>

Notice only 5 recursive calls for 'select count(*)' versus 27 recursive calls for 'select count(rowid)'. And both return in roughly the same time.


David Fitzjarrell

Shakespeare said...

This is utter nonsense. Count(*) will give the correct result too, and even for view, which has NO rowid at all.

Richard Drouillard said...

Thanks for the leaving the comments Dave,

I was taking my information from the Oracle Press Book: Oracle 10g: SQL

on page 91 it states:

'You should avoid using the asterisk (*) with the COUNT() function as it may take longer for COUNT() to return the result. Instead you should use a column in the table or the ROWID....'


I did some research and I didn't realize that this was such a hot button topic among people on the net. However, in my own defense of the comment you are replying to, I merely said that this is the Oracle recommended method, not necessarily the correct one. I'm sure you can speak to the fact that the two don't always meet.

Richard Drouillard said...

Hey Shakespeare,
Thanks for taking the time to write.

I wouldn't say it's nonsense. I never claimed that count(*) wouldn't work, only that it could be inefficient (see the other comment I just posted to Dave).

I did claim that the result is inaccurate if you use count() with a column that has null rows when attempting to get the size of a table, but that's a fish of a different colour isn't it.

Guillaume said...

The difference is IO: a count(*) might have to do a lot more IOs to read all data for each rows; with a rowid or something fixed like count(1) the database will only have to read the blocks containing the beginning of the rows.
Does it make a difference today ? count(*) is so usual that database vendors might have created some optimizations for this specific cases ...

Anonymous said...

I found this site using [url=http://google.com]google.com[/url] And i want to thank you for your work. You have done really very good site. Great work, great site! Thank you!

Sorry for offtopic

Anonymous said...

[url=http://platinconne.freehostia.com/map.html]free movies download[/url] parampampam!