
Database file grows when Blob variables used in Stored Procedure [CORE5114] #5398

Open
firebird-automations opened this issue Feb 23, 2016 · 10 comments


@firebird-automations
Collaborator

Submitted by: Holger Klemt (klemmo)

Attachments:
db1.fbk

Votes: 3

Today, at our Firebird software training held in Munich, I demonstrated to some customers some techniques they should use to improve their work and some they should avoid. I know that using blobs as variables or parameters inside a stored procedure has some really big disadvantages, but I was not aware of this part of the result.

The facts: we used the IBExpert demo database and created 10000 records as test data in the table PRODUCT. To demonstrate the bad way, we added the stored procedure below, which uses blob variables and manipulates them for every fetched record. This is slow and needs a lot of memory; the basic reasons for these problems are known to me. Procedure execution time was around 45 seconds, even though not a single record should have been written to disk.

The really bad thing: right after executing this stored procedure, the database file size grew from about 25 MB to more than 3 GB. The SP does not perform a single update or insert. You can also see from the statistics that the engine wrote about 190000 pages.

create or alter procedure GETPRODUCTS
returns (
RES blob sub_type 1 segment size 80)
as
declare variable ACTOR blob sub_type 1 segment size 80;
begin
res='';
for
select product.actor
from product
into :actor
do
begin
res=res||'
'||actor;
end
suspend;
end

Query Time
------------------------------------------------
Prepare : 16,00 ms
Execute : 44.906,00 ms
Avg fetch time: 44.906,00 ms

Memory
------------------------------------------------
Current: 96.756.864
Max : 106.806.792
Buffers: 5.000

Operations
------------------------------------------------
Read : 85
Writes : 189.558
Fetches: 799.256
Marks : 583.503

Enhanced Info:
+-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+
| Table Name | Records | Indexed | Non-Indexed | Updates | Deletes | Inserts | Backouts | Purges | Expunges |
| | Total | reads | reads | | | | | | |
+-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+
|PRODUCT | 0 | 0 | 10000 | 0 | 0 | 0 | 0 | 0 | 0 |
+-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+

Workaround for users

This is the procedure I demonstrated as a workaround. It does basically the same thing, but collects the data in a varchar variable as long as it fits, and only copies it to the result when there is no more free space in the varchar. The execution takes only 350 ms and does not affect the size of the database, since it does not perform a single page write (as expected).

Basically, the first SP is definitely a valid way to write an SP, but there must be something very strange going on inside Firebird.

The test was made with Firebird 2.5.5 on Windows 64-bit and also with Firebird 3 RC1; both showed exactly the same strange behaviour, extending the size of the database to 3 GB without any acceptable reason.

create or alter procedure GETPRODUCTS
returns (
RES blob sub_type 1 segment size 80)
as
declare variable ACTOR varchar(200);
declare variable Z varchar(32000);
begin
res='';
z='';
for
select product.actor
from product
into :actor
do
begin
z=z||'
'||actor;
if (char_length(z)>31000) then
begin
res=res||z;
z='';
end
end
if (z<>'') then
begin
res=res||z;
z='';
end
suspend;
end

Query Time
------------------------------------------------
Prepare : 16,00 ms
Execute : 359,00 ms
Avg fetch time: 359,00 ms

Memory
------------------------------------------------
Current: 87.044.576
Max : 87.230.256
Buffers: 5.000

Operations
------------------------------------------------
Read : 79
Writes : 0
Fetches: 20.751
Marks : 365

Enhanced Info:
+-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+
| Table Name | Records | Indexed | Non-Indexed | Updates | Deletes | Inserts | Backouts | Purges | Expunges |
| | Total | reads | reads | | | | | | |
+-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+
|PRODUCT | 0 | 0 | 10000 | 0 | 0 | 0 | 0 | 0 | 0 |
+-------------------------------+-----------+-----------+-------------+---------+---------+---------+----------+----------+----------+

@firebird-automations
Collaborator Author

Commented by: Holger Klemt (klemmo)

Attached is the fb25 backup file I used for the test.

@firebird-automations
Collaborator Author

Modified by: Holger Klemt (klemmo)

Attachment: db1.fbk [ 12905 ]

@firebird-automations
Collaborator Author

Commented by: Sean Leyne (seanleyne)

Holger,

IIRC, blobs don't really have an in-memory-only representation; they always have an on-disk footprint. Further, when a blob is revised, a new object is actually allocated; the existing object is never re-used. The result is that each intermediate blob that is created is, currently, written to disk. Which is why the database size explodes.

The 2nd SP does not change the database size, since no single return value ever got larger than VARCHAR(32000), so no intermediate blob page was created. (The engine recognizes that a VARCHAR converted to a blob does not need intermediate storage, since it has a maximum size.)

@firebird-automations
Collaborator Author

Modified by: Sean Leyne (seanleyne)

summary: Strange Stored Proc Behaviour with Blob Blob Variables => Database file grows when Blob variables used in Stored Procedure

@firebird-automations
Collaborator Author

Commented by: @asfernandes

We need something like Oracle's DBMS_LOB (https://docs.oracle.com/cd/B28359_01/appdev.111/b28419/d_lob.htm) to manipulate blobs (read, write functions) by blob id.

Fortunately, it's possible to write these functions with the new external routines API.

Blob garbage collection should also be implemented, but it will never be faster than good use of the blob API.

@firebird-automations
Collaborator Author

Commented by: @sim1984

Holger Klemt,

When using the LIST function, the DB file will not grow so fast.

create or alter procedure GETPRODUCTS
returns (
RES blob sub_type 1 segment size 80)
as
begin
select LIST(product.actor, ' ')
from product
into :RES;
suspend;
end

@firebird-automations
Collaborator Author

Commented by: Holger Klemt (klemmo)

Yes, I know; LIST would be one possible workaround, but when you generate, for example, CSV, JSON or XML using a stored procedure, you need some more logic. My workaround of using varchars and putting only intermediate results into the blob also works fine.

I would perhaps vote for a kind of new in-memory blob variable that allows sub_type text operations and could be used to copy the final result to a standard blob. This could be a good solution for similar problems, if this new memory blob variable always used the current instance instead of creating a new version on disk with every change. To avoid problems, it could be defined with a varchar-like syntax that simply also allows more than 32k of data, for example

declare variable mblob(100000)

to have space for 100000 characters.

This is just an idea.

@NilsHaase

We have also encountered the phenomenon described by Mr. Klemt in the current versions:

  • 3.0.10.33601
  • 4.0.4.3010-0 x64

We create EDI data directly in the database and need a bit of logic for this. With a pure varchar field we could not cope with the character limits, which is why we switched directly to blobs.

We have a different solution in place now, but if something improves here in the future, we would be very pleased.

@hvlad
Member

hvlad commented Dec 19, 2023

Consider using the new BLOB_APPEND function (since v4.0.2):
https://firebirdsql.org/file/documentation/release_notes/html/en/4_0/rlsnotes40.html#rnfb40-dml-new-blob-append

and/or the new BLOB_UTIL package (since v5):
https://firebirdsql.org/file/documentation/release_notes/html/en/5_0/rlsnotes50.html#rnfb50-engine-blob-util
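
For reference, here is a minimal, untested sketch of how the original GETPRODUCTS procedure from this issue might look with BLOB_APPEND (the table and column names PRODUCT and ACTOR are taken from the examples above). BLOB_APPEND appends to the same open temporary blob on each call instead of allocating a new blob for every concatenation, which is exactly the allocation pattern that made the database file grow:

create or alter procedure GETPRODUCTS
returns (
RES blob sub_type 1 segment size 80)
as
declare variable ACTOR varchar(200);
begin
/* RES starts as NULL; BLOB_APPEND(NULL, ...) creates a new blob,
   and every later call appends to that same blob */
for
select product.actor
from product
into :actor
do
begin
res = blob_append(res, ascii_char(10), actor);
end
suspend;
end

Unlike res = res || ... in the original procedure, this should avoid creating a new intermediate blob per record, so neither the execution time nor the database file size should explode.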

@NilsHaase

OK, thanks, I will try it out.
