Consultor Eletrônico



Kbase 15029: SQL Optimization in Progress
Autor   Progress Software Corporation - Progress
Acesso   Público
Publicação   10/05/1998
SQL Optimization in Progress


VERSION 8.0+ SQL JOIN OPTIMIZATION

The total algorithm for joining/optimization is now as follows:

First, try to re-order the FROM list if its a join. This is
done by going thru all the ANDed expressions in the SQL
where clause and picking ones with the form:
<from-column operator "external">. Something is external if
it refers to elements above or outside the current tables
in the join. E.G. customer.cust-num = 5. (The 5 is external).
If there are two or more such anded items, we pick the best
-- operator equals is best.
If there are two with equals, the one with unique index is best.
Then range operators .. <, >, IN etc.. Once the best anded
item is chosen, its table is moved to the top. At that point,
all references to that table in the remaining anditems are
marked "external" and we iterate the same logic
on the remaining tables.

Here is an example for the re-ordering of the from clause:

SELECT .. FROM item,customer,order,order-line WHERE
order.order-num = order-line.order-num AND
item.item-num = order-line.item-num AND
customer.cust-num = order.cust-num AND
customer.cust-num = 5.


this will become:

SELECT .. FROM customer,order, order-line, item ...

because customer.cust-num = 5 is fld=global for the
customer file, so customer becomes first.

then customer.cust-num = order.cust-num gets changed to
global = order.cust-num so
order comes second...

and so on.


NEXT:

Examine all ANDed expressions in the SQL where clause, and
associate each one with the highest position possible
among the tables in the join.

Example:

SELECT customer.cust-num, order.order-num
FROM customer,order
WHERE customer.cust-num = order.cust-num AND
customer.cust-num > 10.


The above gets translated to the QUERY:

OPEN QUERY _SQL-qry1
FOR EACH customer WHERE customer.cust-num > 10,
EACH order WHERE customer.cust-num =
order.cust-num.

Then the PROGRESS optimization takes over -- each
WHERE clause is inspected for index references and
the best index is chosen according to documented rules
(eg. equality of unique ix fields is best, then
equality of non-unique ix fields, then range
specifications, then sort order etc). The QUERY is
assumed to be distributed in PROGRESS, so if there is no
PASS-THRU, on execution, the first record from the first
level is retrieved (with the server having done
bracketing and selection using the first level's WHERE
expression). Then the join fields are extracted by the
client, and at that point the first record from the
second level is retrieved (with a potentially different
server having done bracketing and selection using the
second level's WHERE expression).
Only when all second level+ records have been
retrieved, the second record from the first level
is retrieved, and so on.


Progress Software Technical Support Note # 15029