
Session: A1

How to Influence the DB2 Query Optimizer Using Optimization Profiles

Tom Eliaz
teliaz@us.ibm.com
IBM
Monday, May 8, 2006 • 10:20 a.m. – 11:30 a.m.

Platform: DB2 for Linux, UNIX, Windows

How to Influence the DB2 Query Optimizer Using Optimization Guidelines


Abstract: The DB2 optimizer is one of the most sophisticated in the industry;
however, even after perfect performance tuning, it can occasionally "get it wrong."
This talk teaches you how to use optimization guidelines to directly influence the
optimizer's choice of access method, join method, and join order.
Objective 1: How to influence the optimizer's choice of access method, join method, and join order
Objective 2: How to influence the query rewrite transformations applied by the optimizer
Objective 3: How to put optimization guidelines into effect without making application changes
Objective 4: How to determine which optimization guidelines were applied, and why others were not
Objective 5: Basic optimizer tuning tips

Note: Most of this material will be presented as a demonstration during the IDUG talk.
The slides are included for your reference during and after the demonstration. They
cover the demonstration material in more detail.

1
How to Influence the DB2 Query Optimizer
Using Optimization Profiles
• How to influence the optimizer's choice of access method, join method, and join order
• How to influence the query rewrite transformations applied by the optimizer
• How to put optimization guidelines into effect without making application changes
• How to determine which optimization guidelines were applied, and why others were not
• Some basic optimizer tuning tips

Hi. Thank you for reading the presentation notes for my talk. The majority of this
talk will be presented as a demonstration, and the slides are available for your
reference during and after the talk. They cover the demonstration material in more
detail. I hope you enjoy the demonstration, the slides and the presentation notes.
These notes are informal. I recommend reading the optimization profile
documentation for more details about advanced syntax.

Have you ever wanted to tell DB2 exactly which index to use?
Optimization profiles help you control the rewrites and optimizations applied to
your query.
I will describe the power of DB2's optimization profiles and the different areas
you can influence with them:
-Targeting a profile at a specific query
-Targeting a profile at all queries executed while it is enabled
-Cost-based guidelines (access and join methods and plan topology)
-Rewrite guidelines (subquery-to-join, inlist-to-join)
-Global guidelines (optimization class, reopt, MQTs, query degree)

I will describe how to write, enable, manage, and debug optimization profiles.
Along the way, I will provide basic optimizer tuning tips which should help you
avoid the need for profiles in all but the most desperate circumstances.

2
Motivation
• The DB2 optimizer is responsible for choosing the best access plan for your query
• It always does a fabulous job… usually
• Your quiver of performance tools today: explain, runstats, optimization class, db and dbm configs, etc.

Goal of this talk: Add another, very powerful, tool to your quiver

Optimization Profiles
Plus a little something extra, watch for the diamond icon
3

The running analogy in the talk is the quiver and the arrow. You have an existing set
of performance tools to optimize your queries. These are arrows in your quiver. I
hope to add optimization profiles as another tool, to be used only as a last resort.

For those reading the notes, I can let you in on the secret. The diamond represents
XML. You’ll find out why soon!

3
The Big Picture – Optimization Profiles

You can influence the optimizations and plan for a query by providing a profile of
your desired optimizations and plan elements to DB2.

Admissions:
I like to have fun
ILIA Founder and Member
(I Like Icons Association)

ILIA is the I Like Icons Association. As you can already see, I use icons everywhere
in my slides. I think they’re fun and I hope you get a kick out of them.

4
Optimization Profile Overview

• Used to directly influence the optimization of a DML statement
  - Disable transform of an IN list predicate to a join
  - Use index ISUPPKEY to access SUPPLIERS in the subquery
  - Alter join order of PARTS and PARTSUPP
• Composed using a simple XML specification
• Can be used to impact queries without touching the application
  - Optimization profiles are stored in the database
  - A profile can contain guidelines for one or more statements
  - Incoming queries are dynamically mapped to stored profiles and guidelines

These are the main talking points for optimization profiles.

The most important thing to remember is that you can impact your applications without altering them.
The profiles you write are managed in the database, so you can query and organize them.

A small rant: One of the main advances of SQL was the separation of the application from the
schema through a declarative query language. We worked hard to keep profiles out of the
application. Profiles talk about your schema, your indexes, your data. Good application design
means keeping these profiles out of your queries.

5
Optimization Profile Overview

• Detailed diagnostics via explain
  - Query always executes; an invalid or inapplicable profile registers warnings
  - Diagnostics using db2exfmt and…
  - Diagnostics using a SQL function
• Should only be used after all other tuning options have been explored
  - Analyze the query using db2exfmt or visual explain
  - runstats, indexes, explain, optimization class, db and dbm configs, etc.
  - Warning: circumvents the usual cost-based optimization

Profiles are powerful things, and should be used with caution. You want to make sure you understand
what’s wrong with your query before you twist the compiler’s arm.

6
Agenda
• Big Picture
• Agenda
• Optimization profiles: first look
• Types of optimization profiles
  - Access (index…)
  - Join
  - Rewrites (subquery to join…)
  - Global (opt class, reopt, MQTs)
• Writing optimization profiles
• Putting optimization profiles into effect
• Managing optimization profiles
• Diagnostics

Lots of fun stuff to cover, but hopefully you can get the high level view fairly
quickly.

7
Overview – How Can You Nudge The Optimizer
• Understand what's wrong with your query
  - Use today's tools to tune DB2
  - If those don't work, use a profile as a last resort
• Write your profile
  - Create an XML file with a few optimization guidelines
  - If needed, include information to tie a guideline to your query
• Insert your profile into the user-managed profile table
• Enable your profile using a special register/bind option
• Use explain and other tools to make sure you've fixed the problem
• Margarita!

Margarita: When things are so bad that you need a profile, then only a profile will
do. And once you’ve overcome your major problem with a tiny bit of text, you’ll
have enough time to relax at the pub. Enjoy your Margarita, it’s on the profiles
team!

8
An Example Query
SELECT S.S_NAME, S.S_ADDRESS, S.S_PHONE, S.S_COMMENT
FROM TPCD.PARTS, TPCD.SUPPLIERS S, TPCD.PARTSUPP PS
WHERE P_PARTKEY = PS.PS_PARTKEY AND
S.S_SUPPKEY = PS.PS_SUPPKEY AND
P_SIZE IN (39,40,45,48) AND
P_TYPE = 'BRASS' AND
S.S_NATION IN ('MOROCCO', 'SPAIN') AND
PS.PS_SUPPLYCOST = (SELECT MIN(PS1.PS_SUPPLYCOST)
FROM TPCD.PARTSUPP PS1, TPCD.SUPPLIERS S1
WHERE TPCD.PARTS.P_PARTKEY = PS1.PS_PARTKEY
AND S1.S_SUPPKEY = PS1.PS_SUPPKEY AND
S1.S_NATION = S.S_NATION)
ORDER BY S.S_NAME
Callouts on TPCD.SUPPLIERS S: index isn't used; should be first table


9

This will be my demonstration query for most of the presentation.


We have some initial complaints:
Why isn't an index used on SUPPLIERS.S_NATION, and why isn't SUPPLIERS accessed
first in the query block? If only I had that!

We can fix that right up, and much more, using profiles.

9
An Example Query
SELECT S.S_NAME, S.S_ADDRESS, S.S_PHONE, S.S_COMMENT
FROM TPCD.PARTS, TPCD.SUPPLIERS S, TPCD.PARTSUPP PS
WHERE P_PARTKEY = PS.PS_PARTKEY AND S.S_SUPPKEY = PS.PS_SUPPKEY AND
P_SIZE IN (39,40,45,48) AND P_TYPE = 'BRASS' AND
S.S_NATION IN ('MOROCCO', 'SPAIN') AND
PS.PS_SUPPLYCOST = (SELECT MIN(PS1.PS_SUPPLYCOST)
FROM TPCD.PARTSUPP PS1, TPCD.SUPPLIERS S1
WHERE TPCD.PARTS.P_PARTKEY = PS1.PS_PARTKEY AND
S1.S_SUPPKEY = PS1.PS_SUPPKEY AND
S1.S_NATION = S.S_NATION)
ORDER BY S.S_NAME

• Access the SUPPLIERS table using the S_NATIONREGION index, and do it first

<IXSCAN TABLE='S' INDEX='S_NATIONREGION' FIRST='TRUE'/>

• Use index ORing to get start-stop keys on nation

<IXOR TABLE='S' FIRST='TRUE'/>

10

Here I show pieces of optimization profile guidelines. Individual guidelines make
up a profile. More on that later.

For now, the first bit of XML tells us all these things:
-For the table SUPPLIERS (exposed as S)
-Use the index S_NATIONREGION
-Make SUPPLIERS the first table accessed in the query block

Uh oh, looking at the explain of the query after adding that first guideline shows
that we don't have start-stop keys to use on the index.

We can fix that with index ORing, and we still want the table accessed first of course.
DB2 subdivides the IN predicate, and each leg of the index OR uses start-stop
keys. Wow! All with one line:

<IXOR TABLE='S' FIRST='TRUE'/>

Also, here is our diamond back again, letting us know about XML. XML is great for profiles because
you'll be able to store and query your profiles using DB2.

10
Create The Profile XML File
<?xml version="1.0" encoding="UTF-8"?>
<OPTPROFILE VERSION='9.0'>
  <!-- Which query should we change? -->
  <STMTPROFILE ID='TPCD Q1'>
    <STMTKEY>SELECT S.S_NAME, S.S_ADDRESS, S.S_PHONE,
S.S_COMMENT…</STMTKEY>
    <!-- What should we do to it? -->
    <OPTGUIDELINES>
      <IXSCAN TABLE='S' INDEX='S_NATIONREGION' FIRST='TRUE'/>
    </OPTGUIDELINES>
  </STMTPROFILE>
</OPTPROFILE>

11

This is what a complete optimization profile looks like when it modifies a single
statement.

Most of this is template material that we provide in the documentation.
You drop in your query text and your small guideline, and we do the work of
matching up your guideline to the exact query you wanted to impact.

There is great power stored in the extra XML, and I'll describe more of it later in
the presentation. For now let's start off with the simple, common uses.

11
Insert and Use the Profile
• IMPORT FROM profile_file OF DEL MODIFIED BY LOBSINFILE INSERT INTO SYSTOOLS.OPT_PROFILE
• SET CURRENT OPTIMIZATION PROFILE TPCD.OPTPROF
  - Or BIND option, CLI option, etc.
• Re-execute query and check plan with explain


SELECT S.S_NAME, S.S_ADDRESS, S.S_PHONE, S.S_COMMENT
FROM TPCD.PARTS, TPCD.SUPPLIERS S, TPCD.PARTSUPP PS
WHERE P_PARTKEY = PS.PS_PARTKEY AND S.S_SUPPKEY = PS.PS_SUPPKEY AND
P_SIZE IN (39,40,45,48) AND P_TYPE = 'BRASS' AND
S.S_NATION IN ('MOROCCO', 'SPAIN') AND
PS.PS_SUPPLYCOST = (SELECT MIN(PS1.PS_SUPPLYCOST)
FROM TPCD.PARTSUPP PS1, TPCD.SUPPLIERS S1
WHERE TPCD.PARTS.P_PARTKEY = PS1.PS_PARTKEY AND
S1.S_SUPPKEY = PS1.PS_SUPPKEY AND
S1.S_NATION = S.S_NATION)
ORDER BY S.S_NAME

12

After you write a profile, you load it into a user-managed table in your database and
enable it.
You can enable the profile for a particular bind, from the CLP, in CLI, in SQLJ: in all
of our DB2 interfaces.

Once you've enabled the profile, re-execute the query and make sure everything
worked as you wanted. If you get warning 437 with RC (reason code) 13, something went
wrong and a guideline was not applied. I will describe diagnostics near the end of this talk.

12
Recap: Why Are Profiles Great?
• Get the plan you know you need
• Use them without changing the application!
• In database
  - Centrally managed, queryable, adapted as the database changes
  - Easy to enable and disable as needed
• Flexible
  - Target specific queries or all queries running
  - Change access and join, change rewrites
• High quality diagnostics, warnings, and you always get a plan
• Margarita!
13

Why are profiles great?


Everything, and then of course the margarita.

13
Anatomy of an Optimization Profile
<?xml version="1.0" encoding="UTF-8"?>
<OPTPROFILE VERSION='9.0'>
  <!-- Global Guidelines -->
  <OPTGUIDELINES><QRYOPT VALUE='3'/></OPTGUIDELINES>

  <!-- Statement Specific Guidelines -->
  <STMTPROFILE ID='TPCD Q1'>
    <!-- Statement Key -->
    <STMTKEY>SELECT S.S_NAME, S.S_ADDRESS, S.S_PHONE,
S.S_COMMENT…</STMTKEY>
    <!-- Statement Guidelines -->
    <OPTGUIDELINES>
      <IXSCAN TABLE='S' INDEX='S_NATIONREGION' FIRST='TRUE'/>
    </OPTGUIDELINES>
  </STMTPROFILE>
</OPTPROFILE>

Global Guidelines
Statement Specific Guidelines
  • Statement Identification
  • Rewrite Guidelines
  • Cost-based Guidelines

14

This is an animated slide that shows that an optimization profile is composed of two
major sections.
First there is an optional Global Guidelines section, where you put guidelines that
impact all queries that are run while the profile is enabled.
Next, there are zero or more statement profiles, each of which modifies a single
statement.
DB2 matches the statement key you provide in the statement profile against incoming
queries in much the same way that the package cache maps compiled queries to
incoming statements.
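
As a rough sketch of that two-section layout, a profile could carry one global section followed by more than one statement profile. The element and attribute names are the ones used on the slides; the two statement texts and the profile IDs are made up for illustration, and the index names are borrowed from other slides in this talk.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<OPTPROFILE VERSION='9.0'>
  <!-- Optional global section: applies to every query run while the profile is enabled -->
  <OPTGUIDELINES><QRYOPT VALUE='3'/></OPTGUIDELINES>

  <!-- First statement profile (hypothetical statement text) -->
  <STMTPROFILE ID='Parts by size'>
    <STMTKEY><![CDATA[SELECT P_PARTKEY FROM TPCD.PARTS WHERE P_SIZE = 39]]></STMTKEY>
    <OPTGUIDELINES>
      <IXSCAN TABLE='TPCD.PARTS' INDEX='IPTKY'/>
    </OPTGUIDELINES>
  </STMTPROFILE>

  <!-- Second statement profile (hypothetical statement text) -->
  <STMTPROFILE ID='Suppliers by nation'>
    <STMTKEY><![CDATA[SELECT S_NAME FROM TPCD.SUPPLIERS S WHERE S_NATION = 'SPAIN']]></STMTKEY>
    <OPTGUIDELINES>
      <IXSCAN TABLE='S' INDEX='S_NATIONREGION'/>
    </OPTGUIDELINES>
  </STMTPROFILE>
</OPTPROFILE>
```

Only a query whose text matches one of the two statement keys picks up the corresponding index guideline; every other query run under this profile sees just the global QRYOPT setting.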

14
Types of Optimization Guidelines
• Cost-based Guidelines (zero or more per query)
  - Access methods
    - Use this index for this table, index-anding, access this table first, etc.
  - Join methods
  - Join topology
• Query Rewrite Guidelines (zero or more per query)
  - Affect transformations applied to the original query to produce the optimized statement
  - E.g. block in-list to join transformation
  - Enable/disable query reoptimization, query opt level, degree
• Global Guidelines (at most one per profile)
  - Used to set global optimization parameters
  - Enable/disable query reoptimization, query opt level, degree

15

Optimization profiles include the following types of guidelines.


Each is described in more detail in the presentation.
Those in the blue square appear only in the statement level guidelines.
Those with a star can appear in both the statement level and the Global guidelines.

15
First The Pieces, Then The Whole

16

In the next slides I will walk you through all the sections of the optimization
profiles. On each slide I will show just the small piece of XML that corresponds to
that topic. We will bring everything together near the end, but since each part of the
profile is optional, your profile doesn’t need to be much more than any one of these
fragments.

16
Cost-based Optimization Guidelines
Access Requests
• Applied for a specific query
• Specify how to access a table
• Correspond to DB2 access methods
• Table reference identified using exposed names in either the original or the optimized statement
• IXSCAN, LPREFETCH, IXAND, IXOR, TBSCAN, ACCESS

These should look familiar (explain)

17

Access guidelines are probably the most useful profiles. We have an access
guideline for each of our main access methods in DB2.

“Hey DB2, USE THIS INDEX”

17
Index Scan Guidelines
SELECT S.S_NAME, S.S_ADDRESS, S.S_PHONE, S.S_COMMENT
FROM TPCD.PARTS, TPCD.SUPPLIERS S, TPCD.PARTSUPP PS
WHERE P_PARTKEY = PS.PS_PARTKEY AND S.S_SUPPKEY = PS.PS_SUPPKEY AND
P_SIZE = 39 AND P_TYPE = 'BRASS' AND S.S_NATION IN ('MOROCCO', 'SPAIN')
AND PS.PS_SUPPLYCOST =
(SELECT MIN(PS1.PS_SUPPLYCOST)
FROM TPCD.PARTSUPP PS1, TPCD.SUPPLIERS S1
WHERE TPCD.PARTS.P_PARTKEY = PS1.PS_PARTKEY AND
S1.S_SUPPKEY = PS1.PS_SUPPKEY AND
S1.S_NATION = S.S_NATION)
ORDER BY S.S_NAME

Yes, you can tell us which index to use (in a crunch)!
Take a break… relax… when the sky is falling, Profiles Come Calling!

• Access the PARTS table in the main subselect using the IPTKY index
<IXSCAN TABLE='TPCD.PARTS' INDEX='IPTKY'/>

• Access the PARTS table in the main subselect using an index
  - Let the optimizer choose the index, but access it first
<IXSCAN TABLE='TPCD.PARTS' FIRST='TRUE'/>

• You can tell us as much or as little as you like in your profile
  - Unspecified parts of the plan are determined by the optimizer as normal
18

And here’s the big party! It’s very easy to tell us to use a specific index, any index,
and even tell us to access a table first in the query block.

If you leave something out like an index name, DB2 will pick the best.
Your profile need not talk about the whole query. In fact we suggest you give very
few guidelines, as few as are required to nudge DB2 into the right direction.
Anything you do not talk about will be optimized as normal.
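
TBSCAN is in the list of access requests but does not get a slide of its own. Assuming it takes the same TABLE attribute as the other access requests, a request to scan SUPPLIERS instead of using an index might look like this sketch:

```xml
<OPTGUIDELINES>
  <!-- Assumed by analogy with IXSCAN: scan the table exposed as S, no index involved -->
  <TBSCAN TABLE='S'/>
</OPTGUIDELINES>
```

As with the index guidelines above, everything this fragment does not mention is still left to the optimizer.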

18
Index Anding Guidelines
SELECT S.S_NAME, S.S_ADDRESS, S.S_PHONE, S.S_COMMENT
FROM TPCD.PARTS, TPCD.SUPPLIERS S, TPCD.PARTSUPP PS
WHERE P_PARTKEY = PS.PS_PARTKEY AND S.S_SUPPKEY = PS.PS_SUPPKEY AND
P_SIZE = 39 AND P_TYPE = 'BRASS' AND S.S_NATION IN ('MOROCCO', 'SPAIN')
AND PS.PS_SUPPLYCOST =
(SELECT MIN(PS1.PS_SUPPLYCOST)
FROM TPCD.PARTSUPP PS1, TPCD.SUPPLIERS S1
WHERE TPCD.PARTS.P_PARTKEY = PS1.PS_PARTKEY AND
S1.S_SUPPKEY = PS1.PS_SUPPKEY AND
S1.S_NATION = S.S_NATION)
ORDER BY S.S_NAME

• Access the PARTSUPP table in the subquery using index ANDing of the PS_PARTKEY and PS_SUPPKEY indexes
<IXAND TABLE='PS1'>
<INDEX IXNAME='PS_PARTKEY'/><INDEX IXNAME='PS_SUPPKEY'/>
</IXAND>

• Why do you use PS1 here, instead of PARTSUPP?
  - Use the exposed name. More on that at the end!
19

We need to use the table name PS1 here because it is the name you’ve exposed for
that table. The last slides describe the advanced topic of creating table references in
complex situations such as views.

In general, use the exposed name if you’ve created one, else use the table name.

19
List Prefetch Guidelines
• Access the PARTS table in the main subselect using a list prefetch on the IPTKY index

<LPREFETCH TABLE='TPCD.PARTS' INDEX='IPTKY'/>

Generic Access Guideline

• Access the PARTS table in any way you want, but do it first

<ACCESS TABLE='TPCD.PARTS' FIRST='TRUE'/>

Feeling pretty good yet?


20

The generic access guideline is very useful when building join queries.

20
Cost-based Optimization Guidelines
Join Requests
• Specify a join method and join order
• Correspond to DB2 join methods
• Composed of other access or join requests
• NLJOIN, HSJOIN, MSJOIN, JOIN

21

Like access requests, we have a join guideline for each of our main join methods in
DB2.

21
Nested Loop Join Guidelines
SELECT S.S_NAME, S.S_ADDRESS, S.S_PHONE, S.S_COMMENT
FROM TPCD.PARTS, TPCD.SUPPLIERS S, TPCD.PARTSUPP PS
WHERE P_PARTKEY = PS.PS_PARTKEY AND S.S_SUPPKEY = PS.PS_SUPPKEY AND
P_SIZE = 39 AND P_TYPE = 'BRASS' AND
S.S_NATION IN ('MOROCCO', 'SPAIN') AND
PS.PS_SUPPLYCOST = (SELECT MIN(PS1.PS_SUPPLYCOST)
FROM TPCD.PARTSUPP PS1, TPCD.SUPPLIERS S1
WHERE TPCD.PARTS.P_PARTKEY = PS1.PS_PARTKEY AND
S1.S_SUPPKEY = PS1.PS_SUPPKEY AND
S1.S_NATION = S.S_NATION)
ORDER BY S.S_NAME

• Join the PARTS and PARTSUPP tables in the main subselect using a nested loop join
  - Access the tables using the best index

<NLJOIN><IXSCAN TABLE='TPCD.PARTS'/><IXSCAN TABLE='PS'/></NLJOIN>

• Reverse the order

<NLJOIN><IXSCAN TABLE='PS'/><IXSCAN TABLE='TPCD.PARTS'/></NLJOIN>

22

Reversing the join order is great. In this case we specified that index accesses be
used for the legs of the join, but we could have used an ACCESS guideline to give
DB2 full freedom.
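
For comparison, here is a sketch of the same nested loop join with generic ACCESS requests for both legs. It only combines the NLJOIN and ACCESS elements shown on earlier slides; the join method and the order of the two tables stay fixed, while DB2 is free to choose an index scan, table scan, or any other access method for each table.

```xml
<OPTGUIDELINES>
  <!-- Same join method and table order as above, but the access method is left to DB2 -->
  <NLJOIN><ACCESS TABLE='TPCD.PARTS'/><ACCESS TABLE='PS'/></NLJOIN>
</OPTGUIDELINES>
```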

22
Hash Join Guidelines
SELECT S.S_NAME, S.S_ADDRESS, S.S_PHONE, S.S_COMMENT
FROM TPCD.PARTS, TPCD.SUPPLIERS S, TPCD.PARTSUPP PS
WHERE P_PARTKEY = PS.PS_PARTKEY AND S.S_SUPPKEY = PS.PS_SUPPKEY AND
P_SIZE = 39 AND P_TYPE = 'BRASS' AND
S.S_NATION IN ('MOROCCO', 'SPAIN') AND
PS.PS_SUPPLYCOST = (SELECT MIN(PS1.PS_SUPPLYCOST)
FROM TPCD.PARTSUPP PS1, TPCD.SUPPLIERS S1
WHERE TPCD.PARTS.P_PARTKEY = PS1.PS_PARTKEY AND
S1.S_SUPPKEY = PS1.PS_SUPPKEY AND
S1.S_NATION = S.S_NATION)
ORDER BY S.S_NAME

• Join the SUPPLIERS and PARTSUPP tables in the subselect using a hash join
  - Use the best index for PARTSUPP, and the best access for SUPPLIERS

<HSJOIN><ACCESS TABLE='S1'/><IXSCAN TABLE='PS1'/></HSJOIN>

23

You may not always be able to get a hash join. For example, your query
optimization class may disallow hash joins, you may not have identical types on the
two sides of your predicate, etc. You can read more about hash join in the DB2
performance guides.

If you request an impossible join, or in general if a guideline is not applied, DB2 will
return a warning 437 with RC 13. It will also log detailed diagnostics about the
warning into a diagnostics table. You can use explain or SQL to read these
diagnostics. Details are found later in the presentation.

23
Merge Join Guidelines
SELECT S.S_NAME, S.S_ADDRESS, S.S_PHONE, S.S_COMMENT
FROM TPCD.PARTS, TPCD.SUPPLIERS S, TPCD.PARTSUPP PS
WHERE P_PARTKEY = PS.PS_PARTKEY AND S.S_SUPPKEY = PS.PS_SUPPKEY AND
P_SIZE = 39 AND P_TYPE = 'BRASS' AND
S.S_NATION IN ('MOROCCO', 'SPAIN') AND
PS.PS_SUPPLYCOST = (SELECT MIN(PS1.PS_SUPPLYCOST)
FROM TPCD.PARTSUPP PS1, TPCD.SUPPLIERS S1
WHERE TPCD.PARTS.P_PARTKEY = PS1.PS_PARTKEY AND
S1.S_SUPPKEY = PS1.PS_SUPPKEY AND
S1.S_NATION = S.S_NATION)
ORDER BY S.S_NAME

• Join the PARTS and PARTSUPP tables in the main subselect using a nested loop join
• Join that result with the SUPPLIERS table using a merge join
• Choose the best index for all table accesses
<MSJOIN>
<NLJOIN><IXSCAN TABLE='TPCD.PARTS'/><IXSCAN TABLE='PS'/></NLJOIN>
<IXSCAN TABLE='S'/>
</MSJOIN>

LOOKIN' GOOD!

24

Even complex graphs can be expressed very naturally in our XML syntax.
Joins contain other accesses and joins.
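
JOIN is listed among the join requests but is not demonstrated on a slide. By analogy with the generic ACCESS request, it presumably pins down which tables are joined and in what order while leaving the join method to the optimizer. Under that assumption, the same plan shape as above, with every method left open, could be sketched as:

```xml
<OPTGUIDELINES>
  <!-- Assumed generic form: fixes the join order only, not the join or access methods -->
  <JOIN>
    <JOIN><ACCESS TABLE='TPCD.PARTS'/><ACCESS TABLE='PS'/></JOIN>
    <ACCESS TABLE='S'/>
  </JOIN>
</OPTGUIDELINES>
```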

24
Types of Optimization Guidelines
• Cost-based Guidelines
  - Access methods
    - Use this index for this table, index-anding, access this table first, etc.
  - Join methods
  - Join topology
• Query Rewrite Guidelines
  - Affect transformations applied to the original query to produce the optimized statement
  - E.g. block in-list to join transformation
  - Enable/disable query reoptimization, query opt level, degree
• Global Guidelines
  - Used to set global optimization parameters
  - Enable/disable query reoptimization, query opt level, degree

25

Now we will look into the Query Rewrite Guidelines

25
Query Rewrite Guidelines
• Enable or disable rewrite rules applied to a specific query
• In-list to join (INLIST2JOIN)
  - Transforms the constant list in an IN predicate to a table expression
  - Can enable/disable for a statement, or target a specific predicate
• Subquery to join (SUBQ2JOIN)
  - Applies to subqueries quantified by EXISTS, IN, =SOME, =ANY, <>SOME, <>ANY
• Not-Exists to anti-join (NOTEX2AJ)
  - Applies to subqueries quantified by NOT EXISTS
• Not-In subquery to anti-join (NOTIN2AJ)
  - Applies to subquery predicates quantified by NOT IN

26

Query rewrite guidelines help you alter the rewrite rules that transform your query.
These can have a tremendous impact on your query, sometimes making other
profiles inapplicable.

26
In-List to Join
SELECT S.S_NAME, S.S_ADDRESS, S.S_PHONE, S.S_COMMENT
FROM TPCD.PARTS, TPCD.SUPPLIERS S, TPCD.PARTSUPP PS
WHERE P_PARTKEY = PS.PS_PARTKEY AND S.S_SUPPKEY = PS.PS_SUPPKEY AND
P_SIZE IN (39,40,45,48) AND P_TYPE = 'BRASS' AND
S.S_NATION IN ('MOROCCO', 'SPAIN') AND
PS.PS_SUPPLYCOST = (SELECT MIN(PS1.PS_SUPPLYCOST)
FROM TPCD.PARTSUPP PS1, TPCD.SUPPLIERS S1
WHERE TPCD.PARTS.P_PARTKEY = PS1.PS_PARTKEY AND
S1.S_SUPPKEY = PS1.PS_SUPPKEY AND
S1.S_NATION = S.S_NATION)
ORDER BY S.S_NAME

• Transform the list of constants in the predicate P_SIZE IN (39, 40, 45, 48) to a table expression

<INLIST2JOIN TABLE='TPCD.PARTS'/>

• Ambiguous if there are multiple IN predicates on a target table, so add the COLUMN qualifier

<INLIST2JOIN TABLE='TPCD.PARTS' COLUMN='P_SIZE'/>


27

In-list to join leaves the in-list as a predicate or transforms it to a join with a table
expression. It can take a COLUMN attribute to target it to a specific predicate.

27
In-List to Join - Disable
SELECT S.S_NAME, S.S_ADDRESS, S.S_PHONE, S.S_COMMENT
FROM TPCD.PARTS, TPCD.SUPPLIERS S, TPCD.PARTSUPP PS
WHERE P_PARTKEY = PS.PS_PARTKEY AND S.S_SUPPKEY = PS.PS_SUPPKEY AND
P_SIZE IN (39,40,45,48) AND P_TYPE = 'BRASS' AND
S.S_NATION IN ('MOROCCO', 'SPAIN') AND
PS.PS_SUPPLYCOST = (SELECT MIN(PS1.PS_SUPPLYCOST)
FROM TPCD.PARTSUPP PS1, TPCD.SUPPLIERS S1
WHERE TPCD.PARTS.P_PARTKEY = PS1.PS_PARTKEY AND
S1.S_SUPPKEY = PS1.PS_SUPPKEY AND
S1.S_NATION = S.S_NATION)
ORDER BY S.S_NAME

• Disable the INLIST2JOIN transformation on P_SIZE

<INLIST2JOIN TABLE='TPCD.PARTS' COLUMN='P_SIZE' OPTION='DISABLE'/>

• All the rewrite rules have an optional OPTION attribute
  - OPTION='ENABLE' is the default
  - OPTION='DISABLE' disables the rule, but may not work in all cases

28

Each of the rewrite guidelines can take an optional OPTION attribute, which allows
you to DISABLE that transformation. ENABLE is the default if no OPTION is
specified.

28
Subquery to Join

• Perform subquery to join transformations in the query if possible
  - Cannot be targeted to a specific subquery

<SUBQ2JOIN/>

• Disable any subquery to join transformations in the query

<SUBQ2JOIN OPTION='DISABLE'/>

• Identical to subquery to join:
  - Not-Exists to anti-join (NOTEX2AJ)
  - Not-In subquery to anti-join (NOTIN2AJ)

29

Due to space limitations, I don’t give the details for Not-Exists to anti-join and Not-
In to anti-join. They are similar to subquery to join and can be found in our
documentation.
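
Since the slides say these two rules behave like SUBQ2JOIN and that every rewrite rule accepts the optional OPTION attribute, a minimal sketch might look like the following. Check the documentation for any further attributes these elements accept.

```xml
<OPTGUIDELINES>
  <!-- Ask for the NOT EXISTS to anti-join rewrite where possible (ENABLE is the default) -->
  <NOTEX2AJ OPTION='ENABLE'/>
  <!-- Turn off the NOT IN to anti-join rewrite for this statement -->
  <NOTIN2AJ OPTION='DISABLE'/>
</OPTGUIDELINES>
```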

29
Rewrite and Global Guidelines
• When found at the statement level, applied only to the matched statement
  - Overrides the values at the Global level, special register, bind option, etc.
• REOPT
  - When to reoptimize a query with parameters
• Query Optimization Level
  - 0, 1, 2, 3, 5, 7, 9
• Degree
  - Query parallelism

Global Guidelines
Statement Specific Guidelines
  • Statement Identification
  • Rewrite Guidelines
  • Cost-based Guidelines
30

So far I’ve shown you guidelines that only apply to individual statements, in the
statement guidelines.
These next few can be placed in a statement specific guideline, OR in the Global
guidelines section to apply to all queries that are executed while the profile is active
(all queries in a package if you bind with a profile, for example). I describe
precedence rules later in the talk.
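
To make the override concrete, here is a sketch of a profile that sets one optimization class globally and a different one for a single statement. The statement text and profile ID are hypothetical; QRYOPT and REOPT are the elements described on the following slides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<OPTPROFILE VERSION='9.0'>
  <!-- Global section: optimization class 3 for every query run under this profile -->
  <OPTGUIDELINES><QRYOPT VALUE='3'/></OPTGUIDELINES>

  <!-- Statement level: these values apply only to the matched statement -->
  <STMTPROFILE ID='Hypothetical supplier lookup'>
    <STMTKEY><![CDATA[SELECT S_NAME FROM TPCD.SUPPLIERS S WHERE S_NATION = ?]]></STMTKEY>
    <OPTGUIDELINES>
      <REOPT VALUE='ONCE'/>
      <QRYOPT VALUE='5'/>
    </OPTGUIDELINES>
  </STMTPROFILE>
</OPTPROFILE>
```

Under this profile the matched statement would compile at class 5 with REOPT ONCE, and every other query at class 3.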

30
REOPT
• Overrides the REOPT bind option
• REOPT NONE
  - No reoptimization of the query is done; the optimizer chooses default values.
<REOPT VALUE='NONE'/>
• REOPT ONCE
  - The execution plan is picked at the first OPEN for a query. Useful if the initial values are representative of the following executions of the query.
<REOPT VALUE='ONCE'/>
• REOPT ALWAYS
  - Reoptimize the query on every execution. Chosen plans are optimal, at the cost of compilation time and package cache activity.
<REOPT VALUE='ALWAYS'/>
31

REOPT is great for queries with parameter markers/host variables. Look into our
documentation to find out more.

31
Query Optimization
• Overrides the query optimization class (dft_queryopt)
• You can set the query optimization class PER QUERY using profiles
• QRYOPT 0
  - Minimal query optimization
• QRYOPT 1
• QRYOPT 3
• QRYOPT 5
  - Default. Significant optimization with limiting heuristics
• QRYOPT 7
• QRYOPT 9
  - Maximal query optimization

<QRYOPT VALUE='5'/>
32

0 - minimal query optimization.
1 - roughly comparable to DB2 Version 1.
2 - slight optimization.
3 - moderate query optimization.
5 - significant query optimization with heuristics to limit the effort expended on
selecting an access plan. This is the default.
7 - significant query optimization.
9 - maximal query optimization.

32
Default Degree
• Overrides the Current Degree (dft_degree)
• ANY, -1 – Degree of query intra-partition parallelism determined by DB2

<DEGREE VALUE='ANY'/>

• 0 – Parallelism determined by value of CURRENT DEGREE register

<DEGREE VALUE='0'/>

• 1 – No intra-partition parallelism for this query

<DEGREE VALUE='1'/>

• 2 – 32767 – Amount of intra-partition parallelism

<DEGREE VALUE='10'/>

33

Please refer to the infocenter documentation to learn more about the degree
parameter.

33
Types of Optimization Guidelines
• Cost-based Guidelines
  - Access methods
    - Use this index for this table, index-anding, access this table first, etc.
  - Join methods
  - Join topology
• Query Rewrite Guidelines
  - Affect transformations applied to the original query to produce the optimized statement
  - E.g. block in-list to join transformation
  - Enable/disable query reoptimization, query opt level, degree
• Global Guidelines
  - Used to set global optimization parameters
  - Enable/disable query reoptimization, query opt level, degree

34

We will now look into global guidelines that impact all queries that are executed
while the profile is enabled.

34
Global Guidelines
• Up to now, all guidelines impacted a specific statement
• Global guidelines are applied to all queries issued while the profile is in effect
• Query Optimization Level
• Degree
• REOPT
• MQT Optimization Choices
  - Include or exclude MQTs from query matching

Global Guidelines
Statement Specific Guidelines
  • Statement Identification
  • Rewrite Guidelines
  • Cost-based Guidelines
35

We have only a few global guidelines that cannot be applied to an individual
statement. The one I will discuss is MQT optimization choices. Another lets you
determine on which computational partition group your federated query executes.

35
MQT Optimization Choices Guideline
• Enable or disable the use of MQTs in your query
• Provide a list of MQTs considered for the query
• Useful if you experience compile time problems with MQTs
• Using the MQT is still a cost-based decision; this adds/removes them from the pool

• Disable all MQTs

<MQTOPT OPTION='DISABLE'/>

• Consider only a set of MQTs

<MQT NAME='TPCD.PARTMQT'/>
<MQT NAME='COLLEGE.STUDENTS'/>

36

MQT Optimization choices can conflict (you can disable and then also list some
MQTs). In these situations we try to take the most restrictive directive (disable) but
in general this is not supported and you should not write such guidelines.
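
Because MQT choices are global guidelines, the fragments above belong in the global OPTGUIDELINES section at the top of the profile, not inside a statement profile. A minimal sketch, reusing the two MQT names from the slide and deliberately not combining them with MQTOPT:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<OPTPROFILE VERSION='9.0'>
  <!-- Global section: only these two MQTs are considered for query matching -->
  <OPTGUIDELINES>
    <MQT NAME='TPCD.PARTMQT'/>
    <MQT NAME='COLLEGE.STUDENTS'/>
  </OPTGUIDELINES>
</OPTPROFILE>
```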

36
Statement Key, Statement What?
<?xml version="1.0" encoding="UTF-8"?>
<OPTPROFILE VERSION='9.0'>
  <OPTGUIDELINES><QRYOPT VALUE='3'/></OPTGUIDELINES>

  <STMTPROFILE ID='TPCD Q1'>
    <STMTKEY>SELECT S.S_NAME, S.S_ADDRESS, S.S_PHONE,
S.S_COMMENT…</STMTKEY>

    <OPTGUIDELINES>
      <IXSCAN TABLE='TPCD.PARTS' INDEX='IPTKY'/>
      <INLIST2JOIN TABLE='TPCD.PARTS' COLUMN='P_SIZE' OPTION='DISABLE'/>
    </OPTGUIDELINES>
  </STMTPROFILE>
</OPTPROFILE>

37

An optimization profile can contain statement profiles for multiple statements. We
use the key to map a statement level profile to a specific query.
Normally, only the statement text is required, and you can grab it from your application
or from the Original Statement section of the db2exfmt output.

37
Statement Key
• Each statement level profile must identify a corresponding query using a statement key
• Statement Text (required)
  - White space normalization
• Default Schema (optional)
  - Useful if single part tables are referenced in the query
• Function Path (optional)
  - Useful if functions are referenced in the query

38

The analogy is to the package cache: we make sure we get the right statement matched up.
If you don't provide a value for the schema or function path, then we do not use it
as a criterion for matching. If you do provide a value, then it must match the
incoming statement's schema or function path exactly.

38
Statement Key
<STMTPROFILE ID='TPCD SIMPLE QUERY'>
  <STMTKEY SCHEMA='TPCD'
           FUNCTIONPATH='SYSIBM,SYSFUN,SYSPROC,SYSIBMADM,TE'>
    <![CDATA[SELECT C_NAME FROM CUSTOMERS]]></STMTKEY>

  <OPTGUIDELINES>
    <IXSCAN TABLE='TPCD.PARTS' INDEX='IPTKY'/>
    <INLIST2JOIN TABLE='TPCD.PARTS' COLUMN='P_SIZE' OPTION='DISABLE'/>
  </OPTGUIDELINES>
</STMTPROFILE>

• Only queries that exactly match all the statement key information provided will match the statement profile
• Use CDATA around the statement text to avoid XML parsing of comparison operators such as '<'

39

Here is an example of a statement key.

39
Create the Profile Tables
• Profile Table DDL

CREATE TABLE SYSTOOLS.OPT_PROFILE (
  SCHEMA  VARCHAR(128) NOT NULL,
  NAME    VARCHAR(128) NOT NULL,
  PROFILE BLOB (2M)    NOT NULL,
  PRIMARY KEY ( SCHEMA, NAME ) );

• SCHEMA - Specifies the schema qualifier of the optimization profile.
• NAME - Specifies the base name of the optimization profile.
• PROFILE - The XML document defining the optimization profile.

40

The DB2 documentation provides all of the DDL I use in this presentation. This
table is not created automatically, as most users do not use profiles.

Insert Into the Profile Table
• Any way you want to load it is fine!
• I will show you how to use the IMPORT utility
  • Create a file on the client: profile_file.load
    "TPCD","OPTPROF","tpcd_profile.prof"
  • Create the profile on the client in a file named tpcd_profile.prof
  • IMPORT FROM profile_file.load OF DEL MODIFIED BY LOBSINFILE INSERT INTO SYSTOOLS.OPT_PROFILE
  • Check the documentation for more IMPORT options


41

There are so many ways of inserting a LOB. My favorite is with a table function, but
I chose to describe IMPORT here because it works from client to server.
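
If you script this step, the one-line input file for IMPORT can be generated rather than typed. The following Python sketch builds the delimited record shown on the slide; `del_record` is a hypothetical helper, and the assumption that Python's `csv` quoting (every field quoted, embedded quotes doubled) is acceptable to the DEL file format should be checked against the IMPORT documentation.

```python
import csv
import io


def del_record(schema: str, name: str, lob_file: str) -> str:
    """Build one delimited record: schema, profile name, and LOB file name.

    With LOBSINFILE, the third column holds the name of the file that
    contains the profile XML rather than the XML itself.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow([schema, name, lob_file])
    return buffer.getvalue()
```

Writing the returned string to profile_file.load, next to tpcd_profile.prof, reproduces the input that the IMPORT command on the slide expects.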

Enabling DB2 for Profiles
• Enable the DB2_OPTPROFILE registry variable
  • Profile-related commands are not recognized without this set
  • db2set DB2_OPTPROFILE=YES
  • db2set DB2_OPTPROFILE=NO
• DB2 restart required
• Value listed via db2set -all

42

This step is often forgotten. DB2 will complain about the syntax of your other
profile commands unless you tell it that profiles are enabled.

Enabling a Specific Profile
• You may have many profiles in your SYSTOOLS.OPT_PROFILE table
• One profile enabled at a time
• BIND option: OPTPROFILE SCHEMA.NAME
  • Sets the profile for all static SQL in the package
  • Sets the default for all dynamic queries
• Special register
  • SET CURRENT OPTIMIZATION PROFILE SCHEMA.NAME
  • Sets the profile for dynamic queries following it
  • "String", host variable, or NULL (no profile in use, overrides bind option)
• Also CLI (DB2_OPTPROF), SQLJ (OPTPROF), etc.


43

When you bind an application with a profile, it is the profile for all static queries.
If you have dynamic queries in your application, it is also the default for those.
You can specify a profile in the application using the special register, and that will
change the profile for the dynamic queries that follow.

You can disable profiles by setting the current profile to NULL.

All DB2 interfaces either support profile options explicitly, as in the cli.ini
file, or pass profile settings through as needed.
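
The precedence between the BIND option and the special register can be modeled in a few lines of Python. This is a sketch of the rules as stated on the slide, not DB2 code; `effective_profile` and the `UNSET` sentinel are hypothetical names, with `None` standing in for the SQL NULL value.

```python
UNSET = object()  # sentinel: the special register was never assigned


def effective_profile(is_static, bind_profile=None, register=UNSET):
    """Return the profile name in effect for a statement, or None for no profile."""
    if is_static:
        # Static SQL always uses the profile named at BIND time.
        return bind_profile
    if register is UNSET:
        # Dynamic SQL falls back to the BIND option by default.
        return bind_profile
    # An explicit register value wins; None (NULL) means no profile in use.
    return register
```

For example, a dynamic query issued after the register is set to NULL runs with no profile even though the package was bound with one.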

How Can Things Go Wrong?
• Reference a nonexistent table or index
• Syntax error in your guideline
• Guidelines in the profile may become inapplicable
  • Runstats has caused the query plan to change drastically
  • Rewrites have caused the query plan to change drastically
• When DB2 cannot apply a guideline, it returns warning SQL0437W with reason code 13
  • Messages are written to the new EXPLAIN_DIAGNOSTIC tables
• DB2 will apply as much of your profile as possible

44

Sometimes things go wrong, or don't go as planned. Perhaps you wrote a bad
guideline, or perhaps the query has changed since you last explained it.

W 437 RC 13 example

S_NAME                    S_ADDRESS                 S_PHONE         S_COMMENT
------------------------- ------------------------- --------------- -----------------
SQL0437W Performance of this complex query may be sub-optimal. Reason code:
"13". SQLSTATE=01602

Supplier#00004            pQskYdiqymGCKMpdh2rV3KLe2 25-8267647199   azed
agedamwhofor

1 record(s) selected with 1 warning messages printed.

45

This is an example of a profile gone wrong, as seen on the DB2 CLP. Note that the
SQL0437W warning still allows results to be returned. Profile problems are always
reported with reason code 13; check the Information Center for the other reason
codes associated with SQL0437W.
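
A script that captures CLP output can watch for this warning. The Python sketch below pulls the reason code and SQLSTATE out of message text laid out like the example on the slide; `guideline_warning` is a hypothetical helper, and the regular expression assumes the message wording shown above.

```python
import re

# Matches the SQL0437W text even when the CLP wraps it across lines.
_SQL0437W = re.compile(
    r'SQL0437W\b.*?Reason code:\s*"?(\d+)"?.*?SQLSTATE=(\w{5})',
    re.DOTALL,
)


def guideline_warning(clp_output: str):
    """Return details of the first SQL0437W warning, or None if there is none."""
    match = _SQL0437W.search(clp_output)
    if match is None:
        return None
    reason_code = int(match.group(1))
    return {
        "reason_code": reason_code,
        "sqlstate": match.group(2),
        # Per the notes above, profile problems are reported with reason code 13.
        "guideline_problem": reason_code == 13,
    }
```

When `guideline_problem` is true, the next step is to explain the statement and read the diagnostic messages.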

Warning, Now What?
• Explain the query with the warning
  • Human-readable details in a new section!
• Use a table function provided in DB2

SELECT MSG
FROM TABLE( EXPLAIN_GET_MSGS(
       'TPCDUSER',                                 /* EXPLAIN_REQUESTER */
       TIMESTAMP( '2006-01-01-20.42.54.977268' ),  /* EXPLAIN_TIME */
       'SQLC2E03',                                 /* PACKAGE NAME */
       CAST( NULL AS VARCHAR(128) ),               /* SOURCE_SCHEMA */
       CAST( NULL AS VARCHAR(64) ),                /* SOURCE_VERSION */
       CHAR( 'P' ),                                /* EXPLAIN_LEVEL */
       1,                                          /* STMTNO */
       CAST( NULL AS INTEGER ),                    /* SECTNO */
       'en_US' ) ) WARNINGS                        /* LOCALE */
ORDER BY DIAGNOSTIC_ID

• Query the EXPLAIN_DIAGNOSTIC tables
  • Diagnostics available via SQL to your applications!
  • The profile and statement guideline used are inserted as arguments to the RETURN operator

46

This table function is provided in our documentation.


It is great to be able to access human-readable diagnostic messages using SQL.
However, I normally use the db2exfmt explain tool.

Flushing?
• Once used, profiles are stored in an internal cache
• When updating or deleting a profile, the profile must be flushed from the cache
  • FLUSH OPTIMIZATION PROFILE CACHE SCHEMA.NAME
  • FLUSH OPTIMIZATION PROFILE CACHE ALL
• Invalidates cached DML statements
  • Packages using profiles are not rebound automatically; they must be rebound explicitly
• Remember to flush? We'll do it for you!
  • DB2: Helping you avoid domestic disputes

47

Quick tip for note readers: don't worry about flushing! I have a trigger on the
next slide that does it all for you. It's also in our documentation.

Flushing the Easy Way
CREATE PROCEDURE SYSTOOLS.OPT_FLUSH_CACHE( IN SCHEMA VARCHAR(128),
                                           IN NAME VARCHAR(128) )
LANGUAGE SQL
MODIFIES SQL DATA
BEGIN ATOMIC
  -- FLUSH stmt (33) + quoted schema (130) + dot (1) + quoted name (130) = 294
  DECLARE FSTMT VARCHAR(294) DEFAULT 'FLUSH OPTIMIZATION PROFILE CACHE '; --
  -- Setup error handler to ignore error in case DB2_OPTPROFILE is not set
  DECLARE CONTINUE HANDLER FOR SQLSTATE VALUE '42601' BEGIN END; --
  IF NAME IS NOT NULL THEN
    IF SCHEMA IS NOT NULL THEN
      SET FSTMT = FSTMT || '"' || SCHEMA || '".'; --
    END IF; --
    SET FSTMT = FSTMT || '"' || NAME || '"'; --
    EXECUTE IMMEDIATE FSTMT; --
  END IF; --
END;
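CREATE TRIGGER SYSTOOLS.OPT_PROFILE_UTRIG AFTER UPDATE ON SYSTOOLS.OPT_PROFILE
REFERENCING OLD AS O
FOR EACH ROW
-- Assumed body (the slide is cut off here): flush the cached copy of the updated profile
CALL SYSTOOLS.OPT_FLUSH_CACHE( O.SCHEMA, O.NAME );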

48

Here it is! Using this trigger you can forget there is any such thing as flushing the
profile cache.
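
The string assembly inside OPT_FLUSH_CACHE is easy to check outside the database. This Python sketch mirrors the procedure's logic for building the FLUSH statement; `build_flush_statement` is a hypothetical helper, `None` stands in for SQL NULL, and, like the procedure, it does not escape double quotes embedded in the names.

```python
FLUSH_PREFIX = "FLUSH OPTIMIZATION PROFILE CACHE "  # 33 characters


def build_flush_statement(schema, name):
    """Return the FLUSH statement for one profile, or None if there is no name."""
    if name is None:
        # The procedure issues nothing when NAME is NULL.
        return None
    statement = FLUSH_PREFIX
    if schema is not None:
        # Quote the schema so mixed-case identifiers survive, then add the dot.
        statement += '"' + schema + '".'
    statement += '"' + name + '"'
    return statement
```

With two 128-character identifiers the result is exactly 294 characters long, which is where the VARCHAR(294) in the procedure comes from.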

• Optimization guidelines can refer to tables, views, aliases, and table expressions using either
  • their exposed names in the original statement (TABLE attribute)
  • their corresponding unique correlation names in the optimized statement (TABID attribute)

SELECT S.S_NAME, S.S_ADDRESS, S.S_PHONE, S.S_COMMENT
FROM SM_TPCD.PARTS, SM_TPCD.SUPPLIERS S, SM_TPCD.PARTSUPP PS
WHERE P_PARTKEY = PS.PS_PARTKEY AND S.S_SUPPKEY = PS.PS_SUPPKEY AND
      P_SIZE = 39 AND P_TYPE = 'BRASS' AND S.S_NATION IN ('MOROCCO', 'SPAIN') AND
      PS.PS_SUPPLYCOST = (SELECT MIN(PS1.PS_SUPPLYCOST)
                          FROM SM_TPCD.PARTSUPP PS1, SM_TPCD.SUPPLIERS S1
                          WHERE SM_TPCD.PARTS.P_PARTKEY = PS1.PS_PARTKEY AND
                                S1.S_SUPPKEY = PS1.PS_SUPPKEY AND S1.S_NATION = S.S_NATION)

<OPTGUIDELINES>
<HSJOIN> <TBSCAN TABLE='S1'/> <IXSCAN TABID='Q2'/> </HSJOIN>
<IXSCAN TABLE='SM_TPCD.PARTS'/>
</OPTGUIDELINES>

Optimized statement:

SELECT Q6.S_NAME AS "S_NAME", Q6.S_ADDRESS AS "S_ADDRESS", Q6.S_PHONE AS "S_PHONE",
       Q6.S_COMMENT AS "S_COMMENT"
FROM (SELECT MIN(Q4.$C0)
      FROM (SELECT Q2.PS_SUPPLYCOST
            FROM SM_TPCD.SUPPLIERS AS Q1, SM_TPCD.PARTSUPP AS Q2
            WHERE Q1.S_NATION = 'MOROCCO' AND Q1.S_SUPPKEY = Q2.PS_SUPPKEY AND Q7.P_PARTKEY = Q2.PS_PARTKEY
           ) AS Q3
     ) AS Q4, SM_TPCD.PARTSUPP AS Q5, SM_TPCD.SUPPLIERS AS Q6, SM_TPCD.PARTS AS Q7
WHERE Q7.P_SIZE = 39 AND Q5.PS_SUPPLYCOST = Q4.$C0 AND Q6.S_NATION IN ('MOROCCO', 'SPAIN') AND
      Q7.P_TYPE = 'BRASS' AND Q6.S_SUPPKEY = Q5.PS_SUPPKEY AND Q7.P_PARTKEY = Q5.PS_PARTKEY
49

One important detail of optimization profiles is the creation of table references to
link your guideline to a portion of your query. I will discuss these here through
examples, but please also look at the documentation to learn the full story.

References Through Views
• Can use a sequence of exposed names to qualify references to tables in views

CREATE VIEW "Hamid".V1 AS (SELECT * FROM EMPLOYEE WHERE SALARY > 50000)
CREATE VIEW "Laura".V2 AS (SELECT * FROM "Hamid".V1 WHERE DEPTNO IN ('52', '53', '54'))

SELECT *
FROM "Laura".V2 A
WHERE SALARY > (SELECT AVG(SALARY) FROM EMPLOYEE)

<OPTGUIDELINES><IXSCAN TABLE='A/"Hamid".V1/EMPLOYEE'/></OPTGUIDELINES>

• Can refer to tables indirectly referenced by views if the reference is not ambiguous

CREATE VIEW "Rick".V1 AS (SELECT * FROM EMPLOYEE E WHERE SALARY > 50000)
CREATE VIEW "Gustavo".V2 AS (SELECT * FROM "Rick".V1 WHERE DEPTNO IN ('52', '53', '54'))

SELECT *
FROM "Gustavo".V2 A
WHERE SALARY > (SELECT AVG(SALARY) FROM EMPLOYEE)

<OPTGUIDELINES><IXSCAN TABLE='E'/><IXSCAN TABLE='EMPLOYEE'/></OPTGUIDELINES>

50

References through views.

Summary

• You can use Optimization Profiles to impact the rewrites and plans of the DB2 for Linux, UNIX, and Windows optimizer
• You don't need to change your applications!
• Optimization Profiles can influence many queries or be targeted at specific queries
• Optimization Profiles are safely managed in the database
• You have the power

51

I hope you enjoyed this presentation. For advanced topics I urge you to read our
great documentation.

Profiles are a simple way to impact your query plans. We've worked hard to let you
alter an application's query plans without modifying the application. Also, we've
tried to support the best application design by separating the profiles from the
queries themselves, and letting you manage and query the profiles as the extremely
important objects they are.

Session: A1
How to Influence the DB2 Query Optimizer Using Optimization Profiles

Tom Eliaz
IBM
teliaz@us.ibm.com

Look for them in the Viper release!

52

Once again, I hope you enjoyed this presentation, and I look forward to hearing your
feedback. Check out optimization profiles in DB2 Viper for Linux, UNIX, and Windows!
