Issue Details (XML | Word | Printable)

Key: CORE-1686
Type: New Feature New Feature
Status: Open Open
Priority: Major Major
Assignee: Dmitry Yemanov
Reporter: Timo Partanen
Votes: 12
Watchers: 11
Operations

If you were logged in you would be able to see more operations.
Firebird Core

Support value distribution histograms

Created: 12/Jan/08 04:58 AM   Updated: 22/Jan/19 05:37 PM
Component/s: Engine
Affects Version/s: None
Fix Version/s: None

Issue Links:
Duplicate
 
Relate
 


 Description  « Hide
To provide better performance for queries, Firebird should support value distribution histograms. A histogram is a statistical report that shows the frequency of values within steps or ranges of values that fall between a certain minimum and maximum. Generally, histograms improve the performance of queries because the optimizer can use them to estimate the selectivity of conditions better than from the selectivity of indexes.

 All   Comments   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Dmitry Yemanov added a comment - 14/Jan/10 10:01 AM
Suggested by Bill Oliver:

When this is added I would like to see the histogram information available through the monitoring tables, too. Information that might be surfaced might include:

- Column's Most Common Value (MCV)
- The frequency of the most common value, call this Most Common Frequency (MCF)
- Histogram information

Philip Williams added a comment - 06/May/11 06:06 PM
Is NULL handled as a special case in the histogram?

Can other values be requested to be treated as special, for situations where you know a field is being abused with special values (e.g. the field has a very regular distribution, except for the values -1, 0, and 1, which are each individually special and very different from > 1.)

Kjell Rilbe added a comment - 14/Aug/12 08:02 AM
In ref to Philip Williams' comment: I vote strongly for treating null as a special case in the histogram, i.e. always keep a specific frequency/selectivity figure for null, if the indexed column allows null. For a compound index, I'm not quite sure how to handle cases where some columns are null and some are not, but i assume that the all-null case should be treated separately, i.e. like null for a single-column index.

Dmitry Yemanov added a comment - 14/Aug/12 08:06 AM
Yes, nulls are to be treated separately.

Atri Sharma added a comment - 01/Mar/16 06:00 PM
Hi,

I am a relational database developer, focused on many areas.

I would like to start hacking Firebird,with this issue, if possible.

Please advice.

Ann Harrison added a comment - 01/Mar/16 09:02 PM
Histograms are less useful in database like Firebird that optimize queries
at compile time rather than run time. Knowing the value distribution
in an index is of little use to the optimizer if it doesn't know what value will
be searched. Not that this feature shouldn't be implemented, just don't
expect it to improved performance dramatically in cases like this:

Select c.zipcode
       from customers c
                inner join orders o on o.customerId = c.customerId
                inner join orderItems oi on oi.orderId = o.orderId
                where oi.item = ? and c.state = ?