Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for text BLOBs >= 32K as first argument for TRIM() [CORE3234] #3606

Closed
firebird-automations opened this issue Nov 13, 2010 · 10 comments

Comments

@firebird-automations
Copy link
Collaborator

Submitted by: @paulvink

In some situations, it would be practical if one could TRIM long blobs from another blob, e.g. to remove standard T&C sections, EULAs etc.

Currently, TRIM(sub from string) allows blobs for both arguments, but the first argument must be under 32 KB.

Commits: 4e509cb 62054b8 FirebirdSQL/fbt-repository@c2b6365

====== Test Details ======

For single-byte (ascii) charset this test was run also two blobs with length 50'000'000 (+3) bytes: result was OK.
Current settings check work of TRIM() on blobs with non-ascii characters with octet_length ~ 1'000'000 bytes.

@firebird-automations
Copy link
Collaborator Author

Modified by: @asfernandes

assignee: Adriano dos Santos Fernandes [ asfernandes ]

@firebird-automations
Copy link
Collaborator Author

Modified by: @asfernandes

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 3.0 Alpha 1 [ 10331 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @pavel-zotov

> the first argument must be under 32 KB.

It seems that currently max length of the FIRST argument is limited to 65535 bytes.

Test:

show version;

recreate table utf8text(b_trim_what blob, b_trim_from blob);
commit;

insert into utf8text(b_trim_what) values( rpad('',32760, uuid_to_char(gen_uuid()) ) );
update utf8text set b_trim_what = b_trim_what || rpad('',32760, uuid_to_char(gen_uuid()) );
update utf8text set b_trim_what = b_trim_what || '123456789012345';

update utf8text set b_trim_from = b_trim_what || 'END';
commit;

set list on;

select
octet_length( b_trim_what ) trim_what_oct_len
,octet_length( b_trim_from ) trim_from_oct_len
,octet_length( trim( b_trim_what from b_trim_from ) ) last_word_oct_len
from utf8text;

update utf8text set b_trim_what = b_trim_what || '!'; --------------------- this makes b_trim_what to have length = 65535 + 1
update utf8text set b_trim_from = b_trim_what || 'END';
commit;

select
octet_length( b_trim_what ) trim_what_oct_len
,octet_length( b_trim_from ) trim_from_oct_len
,octet_length( trim( b_trim_what from b_trim_from ) ) last_word_oct_len
from utf8text;

Output:

ISQL Version: WI-T3.0.0.31780 Firebird 3.0 Beta 2
Server version:
Firebird/Windows/Intel/i386 (access method), version "WI-T3.0.0.31780 Firebird 3.0 Beta 2"
Firebird/Windows/Intel/i386 (remote server), version "WI-T3.0.0.31780 Firebird 3.0 Beta 2/tcp (csprog)/P13"
Firebird/Windows/Intel/i386 (remote interface), version "WI-T3.0.0.31780 Firebird 3.0 Beta 2/tcp (csprog)/P13"
on disk structure version 12.0

TRIM_WHAT_OCT_LEN 65535
TRIM_FROM_OCT_LEN 65538
LAST_WORD_OCT_LEN 3

TRIM_WHAT_OCT_LEN 65536
TRIM_FROM_OCT_LEN 65539
LAST_WORD_OCT_LEN 65539 -- <<<<<<<<<<<<<<< ??? <<<<<<<<<

@firebird-automations
Copy link
Collaborator Author

Commented by: @pavel-zotov

PS.

Sorry, I forgot to put here my question.
There is issue in 3.0 Release Notes, page 73:

===
TRIM() BLOB Arguments Lose 32 KB limit
Adriano dos Santos Fernandes
In prior versions, TRIM(substring from string) allowed BLOBs for both arguments, but the first argument had
to be smaller than 32 KB. Now both arguments can take BLOBs of up to 4 GB.

What about: "Now *BOTH* arguments can take BLOBs of up to 4 GB" - ?

@firebird-automations
Copy link
Collaborator Author

Commented by: @asfernandes

Thanks Pavel, please check again.

@firebird-automations
Copy link
Collaborator Author

Modified by: @asfernandes

status: Resolved [ 5 ] => Reopened [ 4 ]

Fix Version: 3.0 Beta 2 [ 10586 ]

resolution: Fixed [ 1 ] =>

Fix Version: 3.0 Alpha 1 [ 10331 ] =>

@firebird-automations
Copy link
Collaborator Author

Modified by: @asfernandes

status: Reopened [ 4 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

@firebird-automations
Copy link
Collaborator Author

Commented by: @pavel-zotov

At least for two blobs with length 50 Mb now it works fine. But elapsed time on pretty good server is singnificant.

Test:

t list on;
set stat on;
set term ^;
execute block returns(b1len int, b2len int, trlen int) as
declare b1 blob = '';
declare b2 blob;
declare s varchar(32760);
declare n int = 50000000;
begin
while( n > 0 ) do
begin
b1 = b1 || rpad( '', 32760, uuid_to_char(gen_uuid()) );
n = n - 32760;
end
b2 = b1 || 'end';
b1len=octet_length(b1);
b2len=octet_length(b2);
trlen=octet_length( trim(b1 from b2) );
suspend;
end
^
set term ;^

Output:

B1LEN 50024520
B2LEN 50024523
TRLEN 3

Current memory = 40783200
Delta memory = 1606072
Max memory = 247443288
Elapsed time= 483.348 sec
Buffers = 8192
Reads = 8463794
Writes = 9509927
Fetches = 55413115

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Resolved [ 5 ]

QA Status: Done successfully

Test Details: For single-byte (ascii) charset this test was run also two blobs with length 50'000'000 (+3) bytes: result was OK.
Current settings check work of TRIM() on blobs with non-ascii characters with octet_length ~ 1'000'000 bytes.

@firebird-automations
Copy link
Collaborator Author

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Closed [ 6 ]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment