Issue Details (XML | Word | Printable)

Key: CORE-214
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Dmitry Yemanov
Reporter: skopalik
Votes: 0
Watchers: 1
Operations

If you were logged in you would be able to see more operations.
Firebird Core

Count ( DISTINCT ... ) is too slow

Created: 17/Jan/03 12:00 AM   Updated: 02/Jun/15 03:08 PM
Component/s: Engine
Affects Version/s: None
Fix Version/s: 3.0 Beta 2, 2.5.5

SF_ID: 669770
QA Status: Done successfully
Test Details:
Multiple measures show that ratio in 2.5.5 and 3.0 is about 1.05 - 1.10 (this is how query "select count(*) from ( select distinct id from ... )" is SLOWER than "select count(distinct id) from ..."). In rare cases, when number of unique values is 100'000, this ratio was ~1.5.
Threshold was selected to be 1.8.


 Description  « Hide
SFID: 669770#
Submitted By: skopalik

select distinct is about 10x to 100x faster then
select count(distinct ).
Tested version: FB 1.0.2.908 and FB 1.5.0.1761

test case:
create table a ( a int);

create procedure aset( cnt int) as
declare variable i int;
begin
  i=0;
  while(i<cnt)do begin
   insert into a VALUES (:cnt/1000)
    i=i+1;
  end
end

execute procedure aset (10000)

PC: Notebokk Celeron 500/ 128 MB RAM
There are result captured from wisql:

select distinct * from a

PLAN SORT ((A NATURAL))

          A
===========

         10

Current memory = 0
Delta memory = 0
Max memory = 0
Elapsed time= 0.13 sec
Buffers = 75
Reads = 0
Writes 0
Fetches = 20072


select count (distinct a) from a

PLAN (A NATURAL)

      COUNT
===========

          1

Current memory = 0
Delta memory = 0
Max memory = 0
Elapsed time= 10.86 sec
Buffers = 75
Reads = 0
Writes 0
Fetches = 20071

 All   Comments   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Alice F. Bird added a comment - 14/Jun/06 09:33 AM
Date: 2005-10-14 18:27
Sender: fuerchau
Logged In: YES
user_id=1300926

The time of "count(distinct xxx)" depends on the number of
different values in the database.
So i have now tested with a table of 30000 records.

If each record consists only 1 value, the time is ca. 10
seconds.
If i have perhaps 20 different values, the time is ca. 6
seconds.
if i have 4000 different values the result is immediatly
avalable !

Firebird 1.5.2.4731 and tested with IBExpert.

Because my application must work with millions of records
and i need this function, i would by happy to get a solution.

Alice F. Bird added a comment - 14/Jun/06 09:33 AM
Date: 2005-10-14 18:10
Sender: fuerchau
Logged In: YES
user_id=1300926

The time of "count(distinct xxx)" depends on the number of
different values in the database.
So i have now tested with a table of 30000 records.

If each record consists only 1 value, the time is ca. 10
seconds.
If i have perhaps 20 different values, the time is ca. 6
seconds.
if i have 4000 different values the result is immediatly
avalable !

Firebird 1.5.2.4731 and tested with IBExpert.

Because my application must work with millions of records
and i need this function, i would by happy to get a solution.