Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indexing big table causes sort error: not enough memory [CORE2525] #2935

Open
firebird-automations opened this issue Jun 25, 2009 · 17 comments

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: @ibaseru

Votes: 2

Creating index on a huge table (3.7 billion records). After few hours error happens:

Statement failed, SQLCODE = -904
sort error: not enough memory

At that time there are lot of free space in TEMP, and also 1.5gb RAM available.

Index on a table with less records was created successfuly, with temp file 43gb size.

@firebird-automations
Copy link
Collaborator Author

Modified by: @hvlad

assignee: Vlad Khorsun [ hvlad ]

@firebird-automations
Copy link
Collaborator Author

Commented by: Sean Leyne (seanleyne)

How big do you estimate that the temp file would be for the index? (i.e. how many rows and what is the total size of the index fields)

What is a "lot of free space"?

What are the sort memory settings in firebird.conf?

What version of FB are you running? Classic/SuperServer?

@firebird-automations
Copy link
Collaborator Author

Commented by: @ibaseru

Vlad made special build of FB 2.1.2, so I could create this index. See comments below.

>How big do you estimate that the temp file would be for the index?
it was 183 gigabytes. The error on released FB 2.1.2 happens I think near 100-120 gigabytes of temp file size (at 5 hours from the start). Unfortunately, I wasn't able to watch on TEMP directory to cach the problematic file size, and also, when error happens, temp file is being deleted immediately.
Successful index creation took 12 hours.

>(i.e. how many rows and what is the total size of the index fields)
3.7 billion. 4 fields - integer, smallint, smallint, smallint. It's a primary key.

>What is a "lot of free space"?
375 gigabytes at first try, and ~500 gigabytes at second try. After that I understood that the cause of the error is not TEMP space or RAM.

>What are the sort memory settings in firebird.conf?
Default. Only database cache pages was changed to 16384 (db has 16k page size), and TEMP. Nothing else.
Vlad said that this problem can't be fixed by changing firebird.conf parameters.

>What version of FB are you running? Classic/SuperServer?
Superserver.

I think Vlad can make comment on this, because his source change allowed this index to be created, but it slowered index creation itself for about ~30%. So, this temporary "fix" can't be considered as a fix for that problem. Of course, if indexing such a huge tables is the problem at all :-)

@firebird-automations
Copy link
Collaborator Author

Commented by: @asfernandes

Hum... 32 bits version?

@firebird-automations
Copy link
Collaborator Author

Commented by: @ibaseru

yes, 32 bit Firebird. Windows XP SP3, 32bit too.

@firebird-automations
Copy link
Collaborator Author

Commented by: @hvlad

Sorting have 3 main stages :

a) accumulate data from the data source (SORT_put)
b) prepare results for fetch (SORT_sort)
c) fetch sorted data (SORT_get)

At the stage (a) data is sorted in some blocks (sort run's) and this run's is stored in temporary space. Initially sort run's have size of 1MB each. When 8 such run's is created they merged into one bigger run of size 8MB. Again, when 8 run's of 8MB is created they merged into bigger run of 64MB. Therefore after the stage (a) we have a set of sorted blocks (run's) with size 1, 8, and 64MB. Few key parameters of this algorithm is hardcoded constants :

const USHORT RUN_GROUP = 8; // number of run's to merge
const USHORT MAX_MERGE_LEVEL = 2; // max size of run which could be merged

At the stage (b) the merge tree is created for all run's created at stage (a). For every tree item we need a buffer to read part of the corresponding run from disk. By default this buffers first allocated from the existing free sort memory used in our temporary space (and limited by TempCacheLimit setting). Size of each buffer is 1MB. For the run's which have no buffers we allocate additional memory from the OS. If allocation fails, we divide buffer size by 2 and try again. Until buffer size >= record_size * 2. At this point we throw out of memory error

At the stage (c) sorted data is read from run's, merged on the fly and returned to the caller

Look at current issue. Temporary space after stage (a) was about 180GB. This is near 180GB/64MB = 2880 runs of level 2 (maximum allowed run level). I see as expected that additional memory allocation failed for such big amount of run's. Even if it was not fail i doubt fetching phase (c) will run quickly, as we'll have very random IO - imagine 2880 IO requests at the start of the phase (c) and during it.

So, all i made for KDV's private build was to raise MAX_MERGE_LEVEL constant up to 4, allowing runs of size 512MB and 4GB. Also i added some logging at phase (b) to know number of run's (it was 56) and how many additional memory was allocated (0). As you know it allows sorting to proceed with success.

But allowing additional merge level introduced additional work fro both IO and CPU. This is the reason why whole process is slower than before. Therefore we can't just raise MAX_MERGE_LEVEL constant. We need more adaptive dynamic algorithm to determine optimal number of run's and its sizes. This is a task for the next Firebird version.

@firebird-automations
Copy link
Collaborator Author

Commented by: @asfernandes

Vlad, your explanation of how sort works would be appreciated as a comment in sort.cpp (after the license header). It's just described as per function comments there.

@firebird-automations
Copy link
Collaborator Author

Commented by: @AlexPeshkoff

Vlad, what do you think about merging not 8, but 64 runs into next level run?

@firebird-automations
Copy link
Collaborator Author

Commented by: @hvlad

When i worked on SORT for v2.1 i tried to change RUN_GROUP to different values but results was not good\stable.
Also think about IO - the more run's we have to merge, the more randomness we have.

I think at SORT_sort stage (when we know amount of data to sort) we should compute RUN_GROUP and MAX_MERGE_LEVEL dynamically.

@firebird-automations
Copy link
Collaborator Author

Commented by: @AlexPeshkoff

I have nothing against dynamic calculation, but looks like we need to know this values before we know the amount of data to sort. We need to merge first group of level0 runs into level1 run right after fetching first N records from the stream. How can we know that stream size? (Imagine stream is generated by SP).

@firebird-automations
Copy link
Collaborator Author

Commented by: @hvlad

My point is that except the final merge current hard-coded values of RUN_GROUP and MAX_MERGE_LEVEL is OK in most cases.

The problem is when at final stage we have a big amount of run's to merge. It can require additional memory to buffer disk reads for every run (as in this ticket case) and also make IO very random and thus not effective. I'd said that maximum effective value of number of run's to merge depends on random IO capacity of device with temp files and amout of free memory.

Before the SORT_sort i see no good reason to change RUN_GROUP or MAX_MERGE_LEVEL values.

Also, i think, we can try to define different RUN_GROUP values for different MERGE_LEVEL's. It seems good if lower-level run's will be merged later (i.e. make RUN_GROUP bigger for low-level run's) as in this case most of them reside in memory and we have almost no IO penalty. Also it allow to eliminate not needed merge's for small and medium sorts.

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

Fix Version: 3.0 Alpha 1 [ 10331 ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

Fix Version: 3.0 Beta 1 [ 10332 ]

Fix Version: 3.0 Alpha 1 [ 10331 ] =>

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

Fix Version: 3.0 Beta 2 [ 10586 ]

Fix Version: 3.0 Beta 1 [ 10332 ] =>

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

Fix Version: 3.0 RC 1 [ 10584 ]

Fix Version: 3.0 Beta 2 [ 10586 ] =>

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

Fix Version: 3.0.0 [ 10048 ]

Fix Version: 3.0 RC 1 [ 10584 ] =>

@firebird-automations
Copy link
Collaborator Author

Modified by: @dyemanov

Fix Version: 3.0 RC2 [ 10048 ] =>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants