Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subselect with distinct takes very long time [CORE3608] #3962

Closed
firebird-automations opened this issue Sep 24, 2011 · 4 comments
Closed

subselect with distinct takes very long time [CORE3608] #3962

firebird-automations opened this issue Sep 24, 2011 · 4 comments

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: Matthias Philipps (mphilipps)

I've found some strange behaviour with destinct in a subselect: For both statements the result of data ist the same, but execution time for the second one is worse, where I expected to be rather better.

select
GPST_CHANGEDATE, GPST_CHANGEUSER, GPST_CREATEDATE, GPST_CREATEUSER, GPST_DATE, GPST_FILENAME, GPST_ID, GPST_NAME, GPST_TRACK, GPST_USERGROUP_ID,
min(GPSW_TIME) AS MINTIME, max(GPSW_TIME) AS MAXTIME, count(*) AS WEGPUNKTE
from GPSTRACKS T
join GPSWaypoints W on W.GPSW_GPSTID = T.GPST_ID
where T.GPST_ID in
(
select ST.GPST_ID from GPSTRACKS ST
join GPSWaypoints SW on SW.GPSW_GPSTID = ST.GPST_ID
where SW.GPSW_TIME > '25.08.2011 09:30' and SW.GPSW_TIME < '25.08.2011 10:30'
)
group By
GPST_CHANGEDATE, GPST_CHANGEUSER, GPST_CREATEDATE, GPST_CREATEUSER, GPST_DATE, GPST_FILENAME, GPST_ID, GPST_NAME, GPST_TRACK, GPST_USERGROUP_ID

takes 0,043 sec to execute whereas

select
GPST_CHANGEDATE, GPST_CHANGEUSER, GPST_CREATEDATE, GPST_CREATEUSER, GPST_DATE, GPST_FILENAME, GPST_ID, GPST_NAME, GPST_TRACK, GPST_USERGROUP_ID,
min(GPSW_TIME) AS MINTIME, max(GPSW_TIME) AS MAXTIME, count(*) AS WEGPUNKTE
from GPSTRACKS T
join GPSWaypoints W on W.GPSW_GPSTID = T.GPST_ID
where T.GPST_ID in

(
select <b>distinct</b> ST.GPST_ID from GPSTRACKS ST
join GPSWaypoints SW on SW.GPSW_GPSTID = ST.GPST_ID
where SW.GPSW_TIME > '25.08.2011 09:30' and SW.GPSW_TIME < '25.08.2011 10:30'
)

group By
GPST_CHANGEDATE, GPST_CHANGEUSER, GPST_CREATEDATE, GPST_CREATEUSER, GPST_DATE, GPST_FILENAME, GPST_ID, GPST_NAME, GPST_TRACK, GPST_USERGROUP_ID

takes over 8 Sec. with the same data

greetings
matthias

@firebird-automations
Copy link
Collaborator Author

Commented by: Sean Leyne (seanleyne)

This is a support issue, and should be discussed in the Firebird support mailing list at YahooGroups.

The problem is your poorly constructed SQL:

1 - Should be using EXIST() predicate not IN()

2 - There is no need for the reference to GPSTRACTS table in the sub-select, you can query the GPSWaypoints table directly;
select 1 from GPSWaypoints SW WHERE SW.GPSW_TIME BETWEEN '25.08.2011 09:30' and '25.08.2011 10:30' AND SW.GPSW_GPSTID = T.GPST_ID <---- reference to outer GPSTracks table

3 - DISTINCT is slow because the engine it must build a list of all possible values, meeting the criteria, and then sort/dedup the list

@firebird-automations
Copy link
Collaborator Author

Modified by: Sean Leyne (seanleyne)

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Won't Fix [ 2 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

It also does a lot of sense to include the query plans when reporting about performance problems.

@firebird-automations
Copy link
Collaborator Author

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant