New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DISTINCT and UNION no longer work for BLOB fields [CORE1345] #1764
Comments
Commented by: @dyemanov This is an intentional change. Blobs cannot be sorted by, including the GROUP BY and DISTINCT operations. Prior versions allowed that but produced unexpected results as in fact BLOB IDs were sorted. |
Modified by: @pcisarstatus: Resolved [ 5 ] => Closed [ 6 ] |
Commented by: Bruno Depero (bdepero) I tried to restore few databases from 2.0.2 to 2.1 but gbak finished before completing due to blob field that with 2.1.x cannot be sorted. Thank you in advance ! |
Commented by: @samofatov We have encountered the issue with with pretty much all commercial Firebird applications that we tried to port to 2.1, and the needed fixes for applications are sometimes non-trivial or even impossible, and very difficult to explain to the customer. The issue had to be ranked as critical and the release blocker for us internally. CORE1250 is another report on the same issue. The access frameworks routinely generate queries like "select distinct * from <sometable>" and "select ...many fields... from <sometable> The previous engine used to treat two non-null BLOBs as DISTINCT if they came from different tables or the different fields of the same table, which was correct behavior (meaning it was consistent and acceptable). Changing DISTINCT and UNION behavior with BLOBs breaks a big percentage of applications for no logical reason. We may forbid using BLOBs in ORDER BY clause, because it produces incorrect results, but breaking DISTINCT and UNION cannot be justified. |
Modified by: @samofatovpriority: Minor [ 4 ] => Major [ 3 ] summary: No support for BLOB sorting => DISTINCT and UNION no longer work for BLOB fields |
Commented by: @dyemanov create table blb1 (id int, col blob); insert into blb1 values (1, 'asd'); select distinct * from blb1; select * from blb1 Do you call it a correct result? |
Commented by: @samofatov And the result is: >isql -i a_test.sql
============ =================
This is not too bad result, IMO. Much better than returning a error and breaking existing programs. Not many programs rely on ORDER BY for blob fields, but using DISTINCT and UNION with blobs is the standard practice, you like it or not. Consider this example: create database 'aa.fdb' user 'sysdba' password 'masterkey'; create table products ( create table orders ( insert into products values(1, 'Red Database'); select distinct http://products.id, products.dsc How can you justify to the user that this query which has produced expected and consistent results before has to be fixed? Old style and analytical queries very commonly join a whole bunch of tables (leveling many-many relationships), filter it in a WHERE clause and than use DISTINCT to figure out the rows that they needed. The change breaks this perfectly legal approach. |
Commented by: @asfernandes As we discussed in fb-devel, others DBMS doesn't allow this query either. But, we can make it work based on primary key/unique, i.e., if any primary key/unique of the same table of a blob is present (as in your example), this record will be always distinct. |
Commented by: @AlexPeshkoff Nickolay, your sample with: I agree with Adriano's suggestion - let's use PK (or any other unique constraint if present), this will let most of old queries run. But reanimating old _bugs_ (like one mentioned in your sample) for backward-compatibility reasons is not a good way to go. |
Commented by: @samofatov Guys, are we making the best engine in the world or a patchy mess? 1. Absence/presence/active status of constraints or indexes should not change SELECT statement validity or results (remember DDL dependencies and other stuff?) 2. Compatibility with existing installed base is hugely important, and if we are going to inflict migration costs, that has to be justified with something other than calling old predictable and consistent behavior "bugs". Either change the functionality properly or don't touch it. Breaking applications for the sake of breakage is not the way forward. 3. If we want to attack the issue seriously so it compares BLOB values in generic scenario correctly we have to implement SORT callbacks or something similar to compare BLOB values and a little bunch of optimizations to avoid performance regressions with old queries. 4. Regarding the other servers not supporting the feature, please come on, don't be silly: mysql> create table a (b longtext); mysql> insert into a values('test'); mysql> insert into a values('test'); mysql> select distinct * from a; ======== postgres=# create table aa (b text); |
Commented by: @AlexPeshkoff If we can provide postgres-like behavior, I see no problems. But certainly this will affect performance - at least by reading of blob pages from disk. BTW, presence of active unique constraint my become a serious optimization here. |
Modified by: @pcisarWorkflow: jira [ 12510 ] => Firebird [ 14867 ] |
Commented by: Philip Williams (unordained) Blobs aren't always stored on separate pages, if room exists to fit them on the main record page, right? I, for one, often use blobs when I'm not sure a large varchar will handle outlier cases, despite knowing that 99% of the time, the value will be very small and even a small varchar would be sufficient. Memo fields, for example, need to be large for rare cases, but most of the time users won't take the time to even write complete sentences, let alone 4kb of text. So there *may* be a performance cost in looking at the blob value, but you shouldn't assume it'll be catastrophic. And for those few users who do want to run a distinct on blobs because they're trying to eliminate duplicate photos in their collections (!), I expect they know who they are and will be okay with exchanging speed for features. |
Commented by: @livius2 If I were you i will implement it as: This will in most of cases do not tought blob data at all even if no unique constraint is specified in the query. |
Submitted by: Rene Pijnacker (rpijnacker)
Votes: 1
I tried to run our application through BDE using firebird 2.1 beta
on opening of some screen I get error message No support for blob sorting.
This works on all other versions of firebird
The text was updated successfully, but these errors were encountered: