Issue Details (XML | Word | Printable)

Key: CORE-2812
Type: Improvement Improvement
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Dmitry Yemanov
Reporter: Dmitry Yemanov
Votes: 0
Watchers: 1
Operations

If you were logged in you would be able to see more operations.
Firebird Core

Prohibit any improper mixture of explicit and implicit joins

Created: 15/Jan/10 09:34 AM   Updated: 25/Sep/15 12:43 PM
Component/s: Engine
Affects Version/s: None
Fix Version/s: 3.0 Alpha 1

Issue Links:
Relate

QA Status: Done successfully
Test Details:
See also (in russian): sql.ru/forum/actualutils.aspx?action=gotomsg&tid=804601&msg=9751146
Additional explanations about why following WORKS in FB 3.0:
select 'case-4' msg, a.id from t1 b, t1 a join t1 c on a.id=c.id where a.id=b.id;
-- see in the letter by dimitr, 02-apr-2015 19:34


 Description  « Hide
Consider this query:

    select *
    from TA, TB
        left join TC on TA.COL1 = TC.COL1
    where TA.COL2 = TB.COL2

You can see that the outer join condition references the table from the implicit join. Currently, such queries either throw a famous "no current record to fetch" error or return incorrect result set, depending on the execution plan. Perhaps one lucky day they could even deliver a correct answer, I'm not sure ;-)

So far it has been considered being an optimizer bug and we have a few related tickets in the tracker. However, I was recently wondering whether this query is correct at all. I've found that both MSSQL and PostgreSQL throw an "invalid column reference" error for this query. The SQL specification mentions that a "joined_table" (i.e. an explicit join) is a table derived from the join expression. But we know that a derived table is prepared in its own (independent) context and cannot reference fields from the outer contexts. For me this clearly means that implicitly joined tables cannot be visible from the explicitly joined ones.

So I'm proposing to prohibit such field references as being invalid. Below are examples that should still work:

    select *
    from TA
        join TB on TA.COL2 = TB.COL2
        left join TC on TA.COL1 = TC.COL1

    select *
    from TB, TA
        left join TC on TA.COL1 = TC.COL1
    where TA.COL2 = TB.COL2

    select *
    from TA
        left join TC on TA.COL1 = TC.COL1
        , TB
    where TA.COL2 = TB.COL2

I doubt this limitation could badly affect any existing applications, but theoretically this is possible and should be taken into account. I have to admit though that I wouldn't feel comfortable with a yet another legacy compatibility configuration switch :-)


 All   Comments   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Karol Bieniaszewski added a comment - 31/Jan/13 07:32 AM
a think about this but is this really wrong query?

look at adapted same query but you explicitly tell optimizer how to build streams by derived table

################### orginal query ######################

select *
    from TA, TB
        left join TC on TA.COL1 = TC.COL1
    where TA.COL2 = TB.COL2

PLAN MERGE (SORT (TA NATURAL), SORT (JOIN (TB NATURAL, TC NATURAL)))

this work wrong - resultset is wrong TA.col1=1 <> TC.col1=4

################### addapted query ######################
without indexes

select TX.TA_COL1, TX.TA_COL2, TX.TB_COL1, TX.TB_COL2, TC.COL1 AS TC_COL1, TC.COL2 AS TC_COL2
    from (SELECT TA.COL1 AS TA_COL1, TA.COL2 AS TA_COL2, TB.COL1 AS TB_COL1, TB.COL2 AS TB_COL2 FROM TA, TB) TX
        left join TC on TX.TA_COL1 = TC.COL1
    where TX.TA_COL2 = TX.TB_COL2

PLAN JOIN (JOIN (TX TA NATURAL, TX TB NATURAL), TC NATURAL)

this query work ok for me - resultset look ok

Karol Bieniaszewski added a comment - 31/Jan/13 07:39 AM
with indexes on every column
first original query throw error "..not positioned.."

addapted query work ok with plan
PLAN JOIN (JOIN (TX TA NATURAL, TX TB INDEX (IDX_TB2)), TC INDEX (IDX_TC1))

Dmitry Yemanov added a comment - 31/Jan/13 07:51 AM
As far as I understand, your "original" query is wrong but the "adapted" one is correct. This is from the SQL spec POV.

Karol Bieniaszewski added a comment - 31/Jan/13 04:00 PM
"orginal" is your query from description and you say that should be prohibited. my addapted query is "the same" only difference that i change TA, TB to derived tabe - may be optimizer should do this?

Dmitry Yemanov added a comment - 31/Jan/13 05:33 PM
First of all, SQL is declarative. And if some declaration is defined to be invalid, I don't see any point in "fixing" it at the optimizer level, which is an implementation detail. Second, I don't see these two queries as equivalent, they're TA JOIN (TB LEFT JOIN TC) vs (TA JOIN TB) LEFT JOIN TC.

Karol Bieniaszewski added a comment - 31/Jan/13 06:33 PM
I do not say that this should to be "fixed"
i only think about this in this way:

select *
    from (TA, TB) make derived table here if tables are referenced outside
        left join TC on TA.COL1 = TC.COL1
    where TA.COL2 = TB.COL2

but i do not see that any other db support this way
- i test this in:
-MSSQL
-Ingress
-PostgreSQL

ingress have very good info about failed execution "Relation 'ta' appearing in join_search condition is not involved in the join"
may be FB can have the same info :)


Dmitry Yemanov added a comment - 31/Jan/13 06:44 PM
I've analyzed Oracle/MSSQL/Postgres before changing our behaviour and they all "agree" with the new rules. Your mistake is that your treat A, B join C as (A, B) join C while in fact this is A, (B join C). That said, I completely agree that a better error reporting would be great. I just wish it would be easier to accomplish :-)

Karol Bieniaszewski added a comment - 31/Jan/13 09:46 PM
i say that (A join B) join C becaouse of derived table. as you can saw in plan returned for addapted query. if you go in that path query work ok. but i do not see simple possibility to make this choice in optimizer.

Pavel Cisar added a comment - 25/Sep/15 12:43 PM
Test created.