Issue Details (XML | Word | Printable)

Key: CORE-3262
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Adriano dos Santos Fernandes
Reporter: Paul Vinkenoog
Votes: 0
Watchers: 1

If you were logged in you would be able to see more operations.
Firebird Core

LIST() may overwrite last part of output with zero characters

Created: 28/Nov/10 02:32 PM   Updated: 23/Jan/16 11:42 AM
Component/s: Engine
Affects Version/s: 2.1.3
Fix Version/s: 2.1.4

QA Status: Done successfully

 Description  « Hide
LIST overwrites the last portion of the result with zero characters when the length gets above 4030:

create table vc (s varchar(8000));

insert into vc values (cast('A' as char(4000)) || 'B');
select char_length(s), position('A' in s), position('B' in s) from vc;
-- returns (4001, 1, 4001) - good
with q (l) as (select list(s) from vc)
  select char_length(l), position('A' in l), position('B' in l) from q;
-- returns (4001, 1, 4001) - good

update vc set s = (cast('A' as char(5000)) || 'B');
select char_length(s), position('A' in s), position('B' in s) from vc;
-- returns (5001, 1, 5001) - good
with q (l) as (select list(s) from vc)
  select char_length(l), position('A' in l), position('B' in l) from q
-- returns (5001, 1, 0) - wrong

To test if it's not the position function screwing up:

with q (l) as (select reverse(list(s)) from vc)
  select char_length(l), position('A' in l), position('B' in l) from q;
-- returns (4066, 4066, 0)

It's still B that's not found, even though it ought to be at position 1 now.
But as we see, the list result has been truncated before (or during) the reversal.
So we examine the cut point in the list result:

with q (l) as (select list(s) from vc)
  select ascii_val(substring(l from 4066 for 1)), ascii_val(substring(l from 4067 for 1)) from q
-- returns (32, 0)

Testing with different lengths of s reveals:
- Results are correct up to and including a length of 4030,
- In the range 4031-4066, querying position('A'/'B' in list(s)) leads to:
    Statement failed, SQLSTATE = -902
    Unable to complete network request to host "".
    -Error reading data from the connection.
  and you have to reconnect.
- For lengths >= 4067, no error is raised, but you get the wrong results as described above,
  due to null characters in the list() output, starting at position 4067.

If the table contains two records, the position of the (first) null character is 8131,
so char_length(reverse(list(s))) becomes 8130. 'A' and 'B' from the first record are now
both found in list(s).
With three records, the first \0 is inserted at 16263. With four records, at 20327.

With UTF8, the behaviour is slightly different: here, already char_length(list(s))
is reported as having the truncated value, and hence equal to char_length(reverse(list(s))).
octet_length(list(s)) has the non-truncated value though, and only becomes truncated
after reverse().

Three last remarks:

1) With CHAR or BLOB SUB_TYPE TEXT, the same errors happen
   (with BLOBs possibly at different points).

2) In PSQL, these errors do NOT occur. No zeroes are inserted and even the
   last character is found with position().

3) In 2.5, these errors don't occur at all (in as much as I tested them).

 All   Comments   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Adriano dos Santos Fernandes added a comment - 28/Nov/10 05:08 PM
Please test on the latest 2.1 snapshot. I can't reproduce the problem.

I don't remember what change may have fixed this.

Paul Vinkenoog added a comment - 28/Nov/10 11:52 PM
With (Win32 Superserver), everything works fine.

The original tests were made on (x86-64 Classic server on Fedora 11).

To make sure it wasn't the package or 32 vs. 64 bit, I uninstalled 2.1.4 from the above Windows box and installed a fresh 32-bits Superserver.
This gave the exact same errors as in the original issue description.

All the tests were run on freshly created Dialect 3 databases, without any "fancy" settings.

So, *something* must have changed (for the better) between builds 18185 and 18370.

I'll try a 2.1.0 tomorrow.

This kind of bug can go unnoticed for a long time, because as soon as you have multiple records, the position at which the \0 characters start increases rapidly.
And with a single record, where it starts at 4067 - well, you don't LIST single records often, let alone check their tail ends...

Paul Vinkenoog added a comment - 29/Nov/10 12:37 AM
Just tried 2.1.0 release. Issue was already present there.

Paul Vinkenoog added a comment - 08/Dec/10 01:05 PM
For the record: I suspected that the vast number of spaces in the strings might have something to do with it, but it doesn't.
After filling up all the spaces with '0123456789', the problem persists.

Adriano dos Santos Fernandes added a comment - 08/Dec/10 01:14 PM
What about mark this as fixed in 2.1.4?

I'd hate to have to checkout old version to see what commit fixed this. If someone has a clue about what migh be...

Dmitry Yemanov added a comment - 08/Dec/10 01:18 PM
And who will be the author of the fix in the release notes? ;-)

Paul Vinkenoog added a comment - 08/Dec/10 02:49 PM
> And who will be the author of the fix in the release notes? ;-)

Providence? ;-)

Ideally, we should know what caused this and what fixed it.
But since it seems fixed in 2.1.4 and 2.5 and you guys have enough work to do on bugs that *aren't* fixed, I don't mind if you close this one.

Just one suggestion: The threshold of roughly 4K around which things start to go wrong made me think of CORE-3228 (RIGHT),
which concerned a problem starting at 1024 *characters* in UTF8. Multiplying this with UTF's max. bpc value, you also arrive at 4K.

CORE-3228 was fixed for 2.1.4 and 2.5.1. The fact that it existed in 2.5.0 may plead against it having the same cause as this one,
but maybe it's still worth having a look at exactly *how* it was fixed between 2.1.3 and 2.1.4, and if this fix might also affect the
current bug.

Another thing that changed between 2.1.3 and 2.1.4 is that LIST()'s second parameter may now be any string expression.
Actually, this may be a better candidate to look at, because this improvement is also there in 2.5.0. And it concerns LIST, not
RIGHT. So maybe this code change may also have fixed the truncation bug. BTW, I don't know if the truncation bug has anything
to do with the separator - all my tests were done without explicit separator. And the bug also works if there's only one row and
the separator isn't used anyway. Or... could it be that in the bugged versions, space is set aside for the separator strings, which
somehow is substracted from the (working) space or maximum copying length for the strings themselves? (I'm just thinking
out loud here - this may be utter nonsense.)

Dmitry Yemanov added a comment - 08/Dec/10 08:43 PM
It's a side effect of the commit by 02-Sep-2009 (Backported fix for CORE-1658) and 24-Sep-2009 (Backport fixes from HEAD related to CORE-1658), in particular the changes regarding the location of the BLB_close() call inside EVL_group() and handling the BLB_closed flag in BLB_open(). Together they solve this issue.

Paul Vinkenoog added a comment - 08/Dec/10 11:13 PM
Thanks, it's good to know that!