Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New index structures to eliminate data page reads [CORE5909] #6167

Open
firebird-automations opened this issue Sep 8, 2018 · 8 comments
Open

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: @livius2

Votes: 1

Will be good to see index only operations in situation like this:

table with index on SOURCE_ID and the query

SELECT
GS.SOURCE_ID, COUNT(*)
FROM
GAIA_SOURCE GS
GROUP BY GS.SOURCE_ID
HAVING COUNT(*)>1

should go throught index only without visiting table data pages

it is important when working with big databases
e.g. index size can be relatively small compared to table itself.
In my real database index have only 18GB but table have 720GB.
As you can see accessing only 18GB is not the problem - but accessing 720GB is.

same in other places join between tables (using indexed fields) without returning any other field values from some joined tables.
I suppose this is big change and require REAL TRANSACTIONAL INDEXES.
Current indexes benefit only for intensive updating application.

@firebird-automations
Copy link
Collaborator Author

Modified by: Sean Leyne (seanleyne)

description: Will be good to see index only operations in situation like this:

table with index on SOURCE_ID and the query

SELECT
GS.SOURCE_ID, COUNT(*)
FROM
GAIA_SOURCE GS
GROUP BY GS.SOURCE_ID
HAVING COUNT(*)>1

should go throught index only without visiting table data pages

it is important when working with big databases
e.g. index size can be relatively small compared to table itself.
In my real database index have only 18GB but table have 720GB.
As you can see accessing only 18GB is not the problem - but accessing 720GB is.

same in other places join between tables (using indexed fields) without returning any other field values from some joined tables.
I suppose this is big change and require REAL TRANSACTIONAL INDEXES.
Current indexes benefit only for intensive updating application.

=>

Will be good to see index only operations in situation like this:

table with index on SOURCE_ID and the query

SELECT
GS.SOURCE_ID, COUNT(*)
FROM
GAIA_SOURCE GS
GROUP BY GS.SOURCE_ID
HAVING COUNT(*)>1

should go throught index only without visiting table data pages

it is important when working with big databases
e.g. index size can be relatively small compared to table itself.
In my real database index have only 18GB but table have 720GB.
As you can see accessing only 18GB is not the problem - but accessing 720GB is.

same in other places join between tables (using indexed fields) without returning any other field values from some joined tables.
I suppose this is big change and require REAL TRANSACTIONAL INDEXES.
Current indexes benefit only for intensive updating application.

@firebird-automations
Copy link
Collaborator Author

Commented by: Sean Leyne (seanleyne)

Karol,

I understand the goal you are trying to describe but the Summary of the issue is ... useless. far too vague to provide any meaning to third-parties.

Please suggest an alternate description (eg. New index structures to eliminate data page reads).

@firebird-automations
Copy link
Collaborator Author

Commented by: @livius2

@sean,

feel free to change title and description as my English is not so good

@firebird-automations
Copy link
Collaborator Author

Modified by: Sean Leyne (seanleyne)

summary: Index only operations => New index structures to eliminate data page reads

@firebird-automations
Copy link
Collaborator Author

Commented by: Ann Harrison (awharrison)

Yes this is a big job and will significantly increase index size and degrade performance in many cases. Each index entry will need to identify the transaction number of the transaction that created the entry and the transaction that invalidated it. There are optimizations like dropping the identy of the creatimg index when an entry is fully mature and visible to all transactions but they cause even more index updates.

On the other hand, this is not an "either or" choice and could be implemented as a new index type, dropping the optimization choice in the designers lap.

@firebird-automations
Copy link
Collaborator Author

Commented by: @pavel-zotov

> will significantly increase index size and degrade performance in many cases
Anyone did benchmark to compare performance ? (may be there is/was special build with such feature like Karol asks?)

> need to identify the transaction number
When database is read-only (e.g. replication target or other DB for reports/OLAP) then we can return key from index immediately, without knowing its transaction creator. Isn't ?

@firebird-automations
Copy link
Collaborator Author

Commented by: @pavel-zotov

PS. When user does any kind of updates with GTT that was created 'on commit delete' then it is also no need to extract info from data pages if it is found in index key.

@firebird-automations
Copy link
Collaborator Author

Commented by: @livius2

if you consider implementing new index structure take also into account:
1. Bi directorial navigation
2. Different navigations in compound index first field asc second desc (maybe not alowed in bi directorial indexes)
3. partial navigation e.g. compound index (Field1, Field2, Field3) and in the where clause you specify ony (Field1 and Field3) or only FIeld3 and it should use also index (this work in PostgresSQL)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant