Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distinct aggregates return wrong (duplicated) data [CORE3302] #3669

Closed
firebird-automations opened this issue Jan 7, 2011 · 15 comments
Closed

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: Paul Tessman (insignis)

Is related to QA451

We've noticed a discrepancy between our Firebird 2.5.0 and 2.1.2 servers. The following is the query in question:

select
list(distinct case extract(weekday from date_of_task) when 0 then 'Sun' when 1 then 'Mon' when 2 then 'Tue' when 3 then 'Wed' when 4 then 'Thu' when 5 then 'Fri' when 6 then 'Sat' end)
from tasks where date_of_task is not null;

"tasks" being any table containing a date field, and "date_of_task" being that date field.

On 2.1.2, this query returns:

Fri,Mon,Sat,Sun,Thu,Tue,Wed

as one would expect. On 2.5.0, however, this query returns:

Fri,Fri,Fri,Fri,Fri,Fri,Mon,Mon,Mon,Mon,Mon,Mon,Sat,Sat,Sat,Sat,Sat,Sat,Sun,Sun,Sun,Sun,Sun,Sun,Thu,Thu,Thu,Thu,Thu,Thu,Tue,Tue,Tue,Tue,Tue,Tue,Wed,Wed,Wed,Wed,Wed,Wed

Let me know if I can provide further information of help.

Commits: 46b8ec1 f225c66

====== Test Details ======

Note: LIST() does not guarantee that returned values will be sorted so we can only COUNT words in the resulting string and compare it with checked number.

@firebird-automations
Copy link
Collaborator Author

Commented by: @asfernandes

Full test case, please.

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

I can reproduce it easily with any table containing the date field. It seems that the problem lies somewhere inside the SortAggregate handling, i.e. the internal sorting supporting the DISTINCT clause. Six subsequent SORT_get() calls return the same value and only then switch to another one.

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

I confirm that the problem is not with LIST but with any aggregate function. COUNT(DISTINCT) returns 42 instead of 7.

@firebird-automations
Copy link
Collaborator Author

Commented by: @asfernandes

I'm testing 2.5.0. This is my test which ran ok:

SQL> show table tasks;
DATE_OF_TASK DATE Nullable
SQL> select * from tasks;

DATE_OF_TASK

2011-01-07
2011-01-07
2011-01-07
2011-01-06
2011-01-06
2011-01-06
2011-01-08
2011-01-08
2011-01-08

SQL> select count(distinct date_of_task) from tasks;

   COUNT

============
3

SQL> select
CON> list(distinct case extract(weekday from date_of_task) when 0 then 'Sun' when 1 then 'Mon' when 2 then 'Tue' when
3 then 'Wed' when 4 then 'Thu' when 5 then 'Fri' when 6 then 'Sat' end)
CON> from tasks where date_of_task is not null;

         LIST

=================
0:1

LIST:
Fri,Sat,Thu

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

assignee: Dmitry Yemanov [ dimitr ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

I have found the reason of the failure, already testing.

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

I was testing with a rather big table (millions of rows). And this is a real regression, v2.1 has the correct code.

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

summary: LIST() returns extra values => Distinct aggregates return wrong (duplicated) data

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 2.5.1 [ 10333 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

The problem was that MOVE_CLEAR() zapped only first part of the sort key, as asb_length was not rounded up. Prior versions used ROUNDUP_LONG for both SORT_init() and SORT_put(), but v2.5 missed that rounding where SORT_put() is called.

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

Version: 3.0 Initial [ 10301 ]

Fix Version: 3.0 Alpha 1 [ 10331 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @pcisar

Link: This issue is related to QA451 [ QA451 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Resolved [ 5 ]

QA Status: Done successfully

Test Details: Note: LIST() does not guarantee that returned values will be sorted so we can only COUNT words in the resulting string and compare it with checked number.

@firebird-automations
Copy link
Collaborator Author

Commented by: @pcisar

Test created.

@firebird-automations
Copy link
Collaborator Author

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment