Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for reporting relation/table summary/total Index Page usage statistics [CORE3906] #4242

Open
firebird-automations opened this issue Aug 24, 2012 · 10 comments

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: Sean Leyne (seanleyne)

The current statistics do not include details regarding the page usage related to index structures linked to data rows of the relation/table.

Details concerning Index page usage are important when trying to determine the storage 'profile' of a database.

@firebird-automations
Copy link
Collaborator Author

Modified by: Sean Leyne (seanleyne)

summary: Add support for reporting Relation Index Page usage statistics => Add support for reporting relation/table Index Page usage statistics

@firebird-automations
Copy link
Collaborator Author

Commented by: @dyemanov

What do you mean here? Index pages are reported by GSTAT. Or did you mean pointer pages instead?

@firebird-automations
Copy link
Collaborator Author

Commented by: Sean Leyne (seanleyne)

I am saying that GSTAT should report all details on all structures (even system structures), so that a complete picture is available.

@firebird-automations
Copy link
Collaborator Author

Commented by: Sean Leyne (seanleyne)

Here is the current GSTAT output:

XXXX \(128\) 
    Primary pointer page: 304, Index root page: 305 
    Average record length: 97\.12, total records: 2177650 
    Average version length: 0\.00, total versions: 0, max versions: 0 
    Data pages: 36868, data page slots: 36868, average fill: 83% 
    Fill distribution: 
         0 \- 19% = 0 
        20 \- 39% = 0 
        40 \- 59% = 0 
        60 \- 79% = 604 
        80 \- 99% = 36264  

All I see is the number of "Data Pages". "Data pages" means pages used to store primary data.

Where are the number of index pages?

@firebird-automations
Copy link
Collaborator Author

Commented by: Sean Leyne (seanleyne)

Although you get extract the index page usage details:

Index xxxxxxx \(1\) 
    Depth: 3, leaf buckets: 1347, nodes: 2177650 
    Average data length: 0\.01, total dup: 2175032, max dup: 14545 
    Fill distribution: 
         0 \- 19% = 1 
        20 \- 39% = 0 
        40 \- 59% = 0 
        60 \- 79% = 0 
        80 \- 99% = 1346 

Unlike the data page output there is no summary (as the Data pages have), ie:
Pages: 36868, page slots: 36868, average fill: 83%

I also think that a simple summary at the relation/table level would be appropriate, ie:
XXXX (128)
Primary pointer page: 304, Index root page: 305
Average record length: 97.12, total records: 2177650
Average version length: 0.00, total versions: 0, max versions: 0
-> Indexes: {Count of Indexes}, total index pages: {nnn}, total index slots: {nnnn}, average fill: nn%
Data pages: 36868, data page slots: 36868, average fill: 83%

To save from having to add up all the fill distributions for all of the relation/table indexes.

@firebird-automations
Copy link
Collaborator Author

Modified by: Sean Leyne (seanleyne)

summary: Add support for reporting relation/table Index Page usage statistics => Add support for reporting relation/table summary/total Index Page usage statistics

@firebird-automations
Copy link
Collaborator Author

Commented by: @ibaseru

Sean, I'm against this
-> Indexes: {Count of Indexes}, total index pages: {nnn}, total index slots: {nnnn}, average fill: nn%

because yes, you can extract index root pages for table at once, but total index pages you can determine only visiting that pages.
Thus gstat scan will be very slow, or, will need to keep lot of information in memory before wrigting it to output.

Really, all these numbers can be parsed and computed, giving additional info line size in megabytes, etc, by any other program that analyzes gstat output. I wrote my own (and I think you know it's name), not to dig into numbers, especially for the large databases.
So, please, leave gstat ability to scan one object at a time :-)
What is also valuable, is that current gstat can scan broken databases. If it will do "forward scanning", the output will be broken even at objects, that could be read successfuly.

Also "count of indices" is not necessary, because indices are numbered from 1 to N for table, and gstat output places this info between current and next table.

@firebird-automations
Copy link
Collaborator Author

Commented by: Sean Leyne (seanleyne)

While the current output includes a index number for a table, the output in not in index number order:

Index XXXX \(41\)
    Depth: 2, leaf buckets: 91, nodes: 195124
    Average data length: 0\.00, total dup: 195123, max dup: 195123
    Fill distribution:
         0 \- 19% = 0
        20 \- 39% = 0
        40 \- 59% = 0
        60 \- 79% = 1
        80 \- 99% = 90

Index XXXX \(29\)
    Depth: 2, leaf buckets: 91, nodes: 195124
    Average data length: 0\.00, total dup: 195123, max dup: 195123
    Fill distribution:
         0 \- 19% = 0
        20 \- 39% = 0
        40 \- 59% = 0
        60 \- 79% = 1
        80 \- 99% = 90

Index XXXX  \(22\)
    Depth: 2, leaf buckets: 91, nodes: 195124
    Average data length: 0\.00, total dup: 195123, max dup: 195123
    Fill distribution:
         0 \- 19% = 0
        20 \- 39% = 0
        40 \- 59% = 0
        60 \- 79% = 1
        80 \- 99% = 90

So, it is not obvious how many indexes exist.

@firebird-automations
Copy link
Collaborator Author

Commented by: Sean Leyne (seanleyne)

kdv,

It should not require a third-party tool in order for a user to gather details about a database, this is something which is engine should provide.

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

assignee: Vlad Khorsun [ hvlad ]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment