
SYBASE TRAINING

Sybase Architecture

What is Sybase Server?


Scalable High Performance Database
Client/Server Architecture
Multithreaded Server
User connections implemented as threads
RAM requirement per user = 50KB
Scalability
Multithreaded operation not possible
Parallel processing not possible

System Databases
master database
model database
sybsystemprocs
tempdb
sybsecurity
sybsyntax

Master Database
User accounts (in syslogins)
Remote user accounts (in sysremotelogins)
Remote servers that this server can interact with (in sysservers)
Ongoing processes (in sysprocesses)
Configurable environment variables (in sysconfigures)
System error messages (in sysmessages)
Databases on SQL Server (in sysdatabases)

Master Database
Storage space allocated to each database (in sysusages)
Tapes and disks mounted on the system (in sysdevices)
Active locks (in syslocks)
Character sets (in syscharsets) and languages (in syslanguages)
Users who hold server-wide roles (in sysloginroles)

Model Database
Provides a template for new databases
Default size is 2MB
Databases cannot be smaller than the model database
Common changes to model:
Adding user-defined datatypes, rules, or defaults
Adding users who should have access to all databases on SQL Server
Granting default privileges, particularly for guest accounts

Tempdb database
Storage area for temporary tables and other temporary working storage needs (for example, intermediate results of group by and order by)
Space shared among all users
Default size is 2MB

Tempdb contd
Default size is 2MB
Restart of the server clears tempdb
At server restart, model is copied to tempdb
Size can be altered with ALTER DATABASE

Sybsecurity database
Contains the audit system for SQL Server. Consists of:
sysaudits table, which contains the audit trail. All audit records are written into sysaudits
sysauditoptions table, which contains rows describing the global audit options
All other default system tables that are derived from model

Sybsyntax database
Contains syntax help for:
Transact-SQL commands
System procedures
SQL Server utilities
e.g. sp_syntax "select"

System Tables
Track information
Server wide
Database specific

Define database structure
The master database contains all system tables (31)
User databases contain a subset of system tables (17)

System tables in user databases


sysusers
sysobjects
sysprocedures
sysindexes
sysconstraints
sysreferences

Database Components
Database Components                                                      System Tables
Objects (table, view, default, rule, stored procedure, and trigger)     sysobjects
Indexes                                                                  sysindexes
Datatypes                                                                systypes
Constraints                                                              sysconstraints, sysreferences

System Procedures
An easy way to query system tables
A system procedure is a precompiled collection of SQL statements
Located in sybsystemprocs but can be executed from any database

System Procedures

sp_help [objname]
sp_helpdb [dbname]
sp_helpindex tabname
sp_spaceused [objname]

Allocating Space

Allocating Storage
Device
Database
Allocation Unit
Extent
Page

Devices
Devices are hard disk files that store databases, transaction logs, and backups
One device can hold many databases and one database can span multiple devices
Only SA can create devices

Creating a Device
DISK INIT
    NAME = logical_name,
    PHYSNAME = physical_name,
    VDEVNO = virtual_dev_no,
    SIZE = num_of_2K_blocks
    [, VSTART = virtual_address]

DISK INIT
DISK INIT
    NAME = "hcl_dev1",
    PHYSNAME = "C:\HCL\DATA\hcl_dat",
    VDEVNO = 3,
    SIZE = 8192

DISK INIT
    NAME = "hcl_dev2",
    PHYSNAME = "C:\HCL\DATA\hcl_log",
    VDEVNO = 4,
    SIZE = 1024
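
Because SIZE is given in 2K blocks, it helps to convert between blocks and megabytes before running DISK INIT. The following is a minimal Python sketch of that arithmetic only; the function names are hypothetical and not part of any Sybase tool.

```python
BLOCK_SIZE_KB = 2  # DISK INIT sizes are expressed in 2K blocks


def blocks_to_mb(num_blocks: int) -> float:
    """Convert a DISK INIT size (number of 2K blocks) to megabytes."""
    return num_blocks * BLOCK_SIZE_KB / 1024


def mb_to_blocks(megabytes: int) -> int:
    """Convert a size in megabytes to the number of 2K blocks."""
    return megabytes * 1024 // BLOCK_SIZE_KB


# Sizes taken from the two DISK INIT examples above
print(blocks_to_mb(8192))  # size of hcl_dev1 in MB
print(blocks_to_mb(1024))  # size of hcl_dev2 in MB
```

Under this conversion the data device in the example is 16 MB and the log device is 2 MB.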

DISK INIT
Maps the specified physical disk or operating system file to a database device name
Lists the new device in master..sysdevices
Prepares the device for database storage

vdevno
Used to map sysdevices, sysusages and sysdatabases
Must be less than the "number of devices" configuration parameter
Total no. of devices available is 255

Memory Allocation for devices


Memory is allocated at server startup
50KB per device
Over-configuring can be a waste of memory
20 devices will take up about 1 MB of RAM

Devices
To see the configured value:
sp_configure "number of devices"
To see the values of vdevno already in use:
select distinct low/16777216 from sysdevices order by low
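
The query above recovers the device number by dividing the low column by 16777216 (2 to the power 24). A small Python sketch of the same calculation follows; the sample low values are made up for illustration and do not come from a real sysdevices table.

```python
VDEVNO_DIVISOR = 16777216  # 2**24, the divisor used in the query above


def vdevno_from_low(low: int) -> int:
    """Integer-divide a sysdevices.low value to get its vdevno."""
    return low // VDEVNO_DIVISOR


def vdevnos_in_use(low_values):
    """Return the distinct vdevno values, sorted, like the select distinct query."""
    return sorted({vdevno_from_low(low) for low in low_values})


# Hypothetical low values for devices with vdevno 0, 3 and 4
sample_low_values = [0, 3 * VDEVNO_DIVISOR, 4 * VDEVNO_DIVISOR, 3 * VDEVNO_DIVISOR]
print(vdevnos_in_use(sample_low_values))
```

Any vdevno not in the returned list, and below the configured number of devices, is free for the next DISK INIT.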

Info on devices
sp_helpdevice devicename
e.g. sp_helpdevice master
Which table is used for storing devices?

Managing Devices
Setting up a default device
sp_diskdefault database_device, {defaulton | defaultoff}
Dropping a device
sp_dropdevice logical_name

Managing Devices
sp_diskdefault master, defaultoff
sp_diskdefault def_1, defaulton
sp_dropdevice tapedump1

Default devices
Devices not to be used as default devices:
Master device
Device used for sybsecurity
Devices used for transaction logs

Dropping of devices
A device in use cannot be dropped
The server has to be restarted after dropping a device
The corresponding file has to be removed at OS level

Creation of Database And setting options

Creating databases and logs on devices


create database database_name
    [on {default | database_device} [= size]
        [, database_device [= size]]...]
    [log on database_device [= size]
        [, database_device [= size]]...]

Create Database
Verifies that the database name specified in the statement is unique.
Makes sure that the database device names specified in the statement are available.
Finds an unused identification number for the new database.
Assigns space to the database on the specified database devices and updates sysusages to reflect these assignments.
Inserts a row into sysdatabases.
Makes a copy of the model database in the new database space, thereby creating the new database's system tables.

Create database
Create database newdb on alpha_disk = 10, beta_disk = 10, delta_disk =10, gamma_disk = 50

Create Database
create database newpubs on default = 4
Multiple default devices can be used
e.g. create database newdb on default = 100 could use more than one device

Transaction logs
Every database has a write-ahead log
The transaction is written to the log first
The log is the system table syslogs
It is essential to have a log

Log on separate device


create database newdb on mydata = 8, newdata = 4 log on tranlog = 3

Estimating Log Size


Amount of update activity in the associated database
Frequency of transaction log dumps
Rule of thumb is 25% of the database size
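
The 25% rule of thumb can be written as a one-line calculation. The Python sketch below is only that arithmetic; the function name and the default fraction parameter are assumptions for illustration, and the result should be adjusted for update activity and dump frequency as noted above.

```python
def estimate_log_size_mb(data_size_mb: float, fraction: float = 0.25) -> float:
    """Rule-of-thumb log size: a fixed fraction (default 25%) of the data size."""
    return data_size_mb * fraction


# The earlier example creates newdb on mydata = 8 and newdata = 4 (12 MB of data)
print(estimate_log_size_mb(8 + 4))
```

For the earlier newdb example (8 MB + 4 MB of data) this gives 3 MB, which matches the log on tranlog = 3 clause used there.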

Checking Log Size


use database
go
dbcc checktable(syslogs)
OR
select count(*) from syslogs

Alter database and Drop database


alter database newpubs
    on pubsdata1 = 2, pubsdata2 = 3
    log on tranlog

drop database newpubs

Getting info about database storage


To find the names of the devices on which a database resides:
sp_helpdb database_name
To find the space used by a database, run sp_spaceused after using the database

Sp_dboption
Sets options for databases
Displays a complete list of the database options when it is used without a parameter
Changes a database option when used with parameters
Options can be changed only for user databases

Database Options
sp_helpdb in a database shows the options set for that database
Only SA or dbo can change the options
None of the master database options can be changed

Database Options
To use sp_dboption to change the pubs2 database to read only:
use master
go
sp_dboption pubs2, "read only", true
go
use pubs2
go
checkpoint
go

Creation of Database Objects

Sybase Defined Datatypes


Exact numeric: decimal(p,s), numeric(p,s)
Approximate numeric: float(n)
Character: char(n), varchar(n)
Money: money, smallmoney
Date and time: datetime, smalldatetime
Binary: binary(n), varbinary(n)
Text and image: text, image

User Defined Datatypes


Subset of a system defined datatype
Can be used for creating datatypes that are frequently used
Adding a datatype:
sp_addtype datatypename, phystype [(length) | (precision [, scale])] [, "identity" | nulltype]
Example:
sp_addtype tid, "char(6)", "not null"

User Defined Datatypes


sp_help datatype gives information about that datatype
e.g. sp_help tid gives information about the datatype tid
sp_droptype datatype drops the datatype
A datatype in use cannot be dropped

Tables
An entity is represented as a table
2 billion tables per database
250 columns per table
Column names have to be unique in a table

Create table
create table titles
(title_id     tid,
 title        varchar(80) not null,
 type         char(12),
 pub_id       char(4) null,
 price        money null,
 advance      money null,
 royalty      int null,
 total_sales  int null,
 pubdate      datetime)

Indexes

Indexes
Enforce uniqueness
Speed up joins
Speed up data retrieval
Speed up ORDER BY and GROUP BY

Indexing
Columns to consider for indexing
Primary key
Columns frequently used in joins
Columns frequently searched in ranges
Columns retrieved in sorted order

Indexing (contd)
Columns that should not be indexed
Columns seldom referenced in queries
Columns that contain few unique values
Columns defined with text, image, or bit datatypes
When UPDATE performance has a higher priority than SELECT performance

Creating An Index
create [unique] [clustered | nonclustered] index index_name on [[database.]owner.]table_name ( column_name [, column_name]...) [with {{fillfactor |

Types and Characteristics of Indexes


Types of Indexes
Clustered
Nonclustered

Clustered Indexes
Physical order = indexed order
Leaf level = actual data pages of the table
Only one clustered index per table
Requires 1.21 * table size of space for creation
Should be created on the PK or on column(s) searched for a range of values

Nonclustered Indexes
Physical order is not the same as index order
The leaf level contains pointers to the rows on the data pages
Pointers add a level between index and data
249 nonclustered indexes per table can be created

Fill factor
A low fillfactor means free space on index pages
Not maintained by Sybase
Has to be maintained by dropping and recreating the index
A fillfactor of 0 means data and leaf pages are completely filled and nonleaf pages are filled to 75%

Creating and Using Segments

Segments
Subsets of database devices
Can be used in create table and create index commands
Every database can have up to 32 segments

System defined segments


system
logsegment
default

Creating Segments
Initialize the physical device with disk init
Make the database device available to the database by using the on clause to create database or alter database
Create the segment:
sp_addsegment segname, dbname, devname

Example
This statement creates the segment seg_mydisk1 on the database device mydisk1:
sp_addsegment seg_mydisk1, mydata, mydisk1

Creating Objects on Segments


create table table_name (col_name datatype)
    [on segment_name]

create [clustered | nonclustered] index index_name
    on table_name (col_name)
    [on segment_name]

Creating objects on segments


1. Start by using the master database.
2. Initialize the physical disks.
3. Allocate the new database devices to a database.
4. Use the database.
5. Create new segments that each point to one of the new devices.
6. Reduce the scope of the default and system segments so that they do not point to the new devices.
7. Create the objects, giving the new segment names.

Commands to create objects on segments


use master
disk init to create devices
alter database to add devices
sp_addsegment to create segments
sp_dropsegment to drop default and system segments
create table / create index to create objects

Reducing the scope of default and system segments

sp_dropsegment "default", mydata, mydisk1
sp_dropsegment system, mydata, mydisk1

Dropping a segment
sp_dropsegment segname, dbname
Drops the segment from the specified database

Getting info on segments


sp_helpsegment                    info about all segments in the database
sp_helpsegment "default"          info about a named segment
sp_helpsegment seg1
sp_helpdb dbname                  all segments for that database
sp_help tablename                 segments used by the table
sp_helpindex table                segments used by indexes

Clustered Indexes
Table and clustered index on the same segment
If you have placed a table on a segment, and you need to create a clustered index, be sure to use the on segment_name clause, or the table will migrate to the default segment.

Object Placement
Log on a separate device
Spread large, heavily used tables across devices
Tables and nonclustered indexes on separate devices
Tempdb on a separate device

Problems due to data storage


Single-user performance is satisfactory, but response time increases as the no. of processes increases
Query performance degrades as system table activity increases
Maintenance activities seem to take a long time
Stored procedures seem to slow down as they create temporary tables
Insert performance is poor on heavily used tables

How Indexes affect performance


Avoid table scans when accessing data
Target specific data pages for point queries
Avoid data pages completely when an index covers a query
Use ordered data to avoid sorts

Index Requirements
Only one clustered index per table, since the data for a clustered index is ordered by index key
You can create a maximum of 249 nonclustered indexes per table
A key can be made up of as many as 31 columns
The maximum number of bytes per index key is 600

Choosing Indexes
What indexes are currently associated with a given table?
What are the most important processes that make use of the table?
What is the ratio of select operations to data modifications performed on the table?
Has a clustered index been created for the table?
Can the clustered index be replaced by a nonclustered index?
Do any of the indexes cover one or more of the critical queries?
Is a composite index required to enforce the uniqueness of a compound primary key?
What indexes can be defined as unique?
What are the major sorting requirements?
Do some queries use descending ordering of result sets?
Do the indexes support joins and referential integrity checks?

Logical Keys and Indexing Keys


Logical keys define the relationship between tables
Logical keys may not be used for indexing
Create indexes on columns that support the joins, search arguments and ordering requirements in queries

Clustered Indexes
Clustered indexes provide very good performance for range queries
In a high transaction environment, do not create a clustered index on a steadily increasing value such as an IDENTITY column

Index Usage Criteria


An index will be used if either of the following is true:
The query contains a column in a valid search argument
The query contains a column that matches at least the first column of the index

Index Covering
A mechanism for using the leaf level of a nonclustered index the way the data pages of a clustered index would work
Index covering occurs when all columns referenced in the query are contained in the index itself
The leaf level of the index has all the required data

Index Covering
As the leaf index rows are much smaller than data rows, a nonclustered index that covers a query is faster than a clustered index
e.g. an index on royalty and price will cover the following query:
select royalty from titles where price between $10 and $20
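
The covering condition is a subset test: every column the query references must appear in the index. The Python sketch below models only that test, not the optimiser; the function name is hypothetical.

```python
def index_covers(index_columns, referenced_columns) -> bool:
    """True when every column referenced by the query is part of the index."""
    return set(referenced_columns) <= set(index_columns)


# Index on (royalty, price); the example query references royalty and price
print(index_covers(["royalty", "price"], ["royalty", "price"]))

# Adding title to the select list means the data pages must be read as well
print(index_covers(["royalty", "price"], ["title", "royalty", "price"]))
```

The first call reports a covered query and the second does not, because title is not stored in the index.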

Index Covering
Provides performance benefits for queries containing aggregates
Typically the first column of the index has to be used in the where clause for the index to be used, but aggregates don't have to satisfy this condition

Index Covering
select avg(price) from titles
can use an index on price and scan all leaf pages
select count(*) from titles where price > $7.95
will use an index on price to find the first leaf row where price > $7.95 and then just scan to the end of the index, counting the number of leaf rows

Index Covering
select count(*) from titles
is also satisfied by a nonclustered index, as the number of rows in any index will be the same as the total no. of rows in the table
The Sybase optimiser uses the nonclustered index with the smallest row size

Composite Indexes vs Multiple Indexes


At times it is better to have many narrow indexes than large composite indexes
More indexes give the optimiser more alternatives to look at in deriving the optimal plan
If the first column is not used in the where clause, the index will not be used

Composite Indexes vs Multiple Indexes


The flip side of multiple indexes is the overhead of maintaining many indexes
All queries in the database must be examined and indexes should be designed accordingly

Stored Procedure Optimization

Stored Procedures

SQL Query:
Parse
Validate Names
Check Protection
Optimize
Compile
Execute

Stored Procedure Call:
Locate Procedure
Check Protection
Substitute Parameters
Execute

Stored Procedures
Main performance gain is the capability of Sybase server to save the optimised query plan generated by the first execution of the stored procedure in procedure cache and to reuse it for further execution

Execution of SP
First execution
Locate SP on disk and load into cache
Substitute parameter values
Develop optimisation plan
Compile optimisation plan
Execute from cache

Execution of SP
Subsequent executions
Locate SP in cache
Substitute parameter values
Execute from cache
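
The difference between the first and subsequent executions can be pictured as a cache lookup. The Python sketch below is a toy model under that assumption; the class, its fields and the plan strings are hypothetical and do not reflect the server's internal structures.

```python
class ProcedureCache:
    """Toy model: the plan is built on first execution and reused afterwards."""

    def __init__(self):
        self._plans = {}        # procedure name -> cached plan
        self.compilations = 0   # how many times a plan had to be built

    def execute(self, proc_name, *params):
        plan = self._plans.get(proc_name)
        if plan is None:
            # First execution: optimise and compile using these parameter values
            self.compilations += 1
            plan = f"plan for {proc_name} optimised for {params}"
            self._plans[proc_name] = plan
        # Subsequent executions: only substitute parameters and run the cached plan
        return plan


cache = ProcedureCache()
first = cache.execute("get_orders_by_price", 10)
second = cache.execute("get_orders_by_price", 5000)
print(first == second, cache.compilations)
```

Note that the second call reuses the plan that was built for the first call's parameter value, even though its own value is different.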

Stored Procedures
The advantage of a cost-based optimizer is that it can generate the optimal query plan for each query based on its search criteria
For certain types of queries (e.g. range queries) the optimiser may at times generate different plans

Stored Procedures
In situations where parameter values can be different for every execution, use CREATE PROCEDURE with the WITH RECOMPILE option
In case a particular execution has to use a different plan, use EXECUTE with the WITH RECOMPILE option

Stored Procedures
If an index used by a stored procedure is dropped, Sybase detects it and recompiles the procedure
Adding additional indexes or running UPDATE STATISTICS does not cause automatic recompilation

Stored Procedures
Updating statistics has to be followed by sp_recompile <table name> to generate a new query plan
Adding an index also has to be followed by sp_recompile

Stored Procedures
create proc get_order_data (@flag tinyint, @value smallint)
as
if @flag = 1
    select * from orders where price = @value
else
    select * from orders where qty = @value

should be converted to ...

Stored Procedures
create proc get_orders_by_price (@price smallint)
as
select * from orders where price = @price

create proc get_orders_by_qty (@qty smallint)
as
select * from orders where qty = @qty

Stored Procedures
A separate procedure calls the appropriate procedure depending on the value of flag:
create proc get_order_data (@flag tinyint, @value smallint)
as
if @flag = 1
    exec get_orders_by_price @value
else
    exec get_orders_by_qty @value

Triggers


Why Triggers
Cascading actions are not available with DRI, i.e. when a primary key is updated or deleted, the corresponding foreign keys do not get changed automatically
Maintaining duplicate data
Keeping derived columns current


Special Tables for Triggers


Inserted and Deleted
Are available only to triggers
Have the same structure as the trigger table
Can be joined to other tables in the database


INSERT Trigger

[Diagram: Trigger Table and Inserted table]

INSERT Trigger
e.g.
CREATE TRIGGER loan_ins ON loan
FOR INSERT
AS
UPDATE copy
SET ON_LOAN = 'y'
FROM copy, inserted
WHERE copy.isbn = inserted.isbn
AND copy.cop_no = inserted.copy_no


DELETE Trigger
[Diagram: Trigger Table and Deleted table]

UPDATE Trigger
[Diagram: updated Trigger Table with Inserted and Deleted tables]

UPDATE Trigger
CREATE TRIGGER mem_upd ON member
FOR UPDATE
AS
IF UPDATE(MEMBER_NO)
BEGIN
    RAISERROR ('Transaction cannot be processed. **** Member cannot be updated.', 10, 1)
    ROLLBACK TRANSACTION
END


Transaction control in Triggers


rollback transaction in a trigger rolls back the entire transaction
rollback to a savepoint name rolls back to the savepoint
rollback trigger rolls back the data modification that fired the trigger and any statements in the trigger that are part of the transaction

Trigger Considerations
Overhead is very low
Inserted and Deleted tables are in memory
The location of other tables referenced by the trigger determines the amount of time required
An INSERT, DELETE or UPDATE in the trigger is a part of the transaction
Nested triggers are set to true by default
Self recursion of triggers does not happen unless set

Cursors

Benefits of Cursors
Allow a program to take action on each row of a query result set rather than on the entire set of rows
Provide the ability to delete or update a row in a table based on cursor position


Cursors
A cursor consists of the following parts:
Cursor result set: the set of rows resulting from execution of the associated select statement
Cursor position: a pointer to one row within the cursor result set


Cursor scope
Session: starts when a client logs into SQL Server and ends on log out
SP: starts when the SP begins execution and ends when the SP completes execution
Trigger: starts when the trigger begins execution and ends when it completes execution

Cursors
Declare: declares the cursor for a select statement. Checks SQL syntax
Open: executes the query and creates the result set. Positions the cursor before the first row of the result set
Fetch: fetches the row to which the cursor points
Close: closes the result set, but the compiled query plan remains in memory
Deallocate: drops the query plan from memory

Resource requirements of Cursors


Memory is allocated at the time of declaration of the cursor
On open, an intent table lock is acquired
When a row is fetched, a page lock is acquired

Read Only Cursors


Read-only mode uses shared page locks. It is in effect if you specify for read only or if the cursor's select statement uses distinct, group by, union, or aggregate functions

Updatable Cursors
Update mode uses update page locks. It is in effect if:
You specify for update. The select statement does not include distinct, group by, union, a subquery, aggregate functions,

Index requirements
For read only cursors any index can be used
The same query result is obtained as for the equivalent select statement

Index Requirements
Updatable cursors require unique indexes on the tables whose columns are being updated

Performance issues with cursors


Cursor performance issues are:
Locking at the page and table level
Network resources
Overhead of processing instructions

Use cursors only if necessary

Optimizing Tips for cursors


Optimize cursor selects using the cursor, not ad hoc queries.
Use union or union all instead of or clauses or in lists.
Declare the cursor's intent.
Specify column names in the for update clause.
Use the shared keyword for tables.

Query Optimisation

Goals and steps of Query Optimisation


Goals
Minimise logical and physical page accesses
Steps
1) Parse and normalise the query, validating syntax and object references
2) Optimise the query and generate the plan
3) Compile the query plan
4) Execute the query and return the result to the user

Optimisation Step
Phase 1 Query Analysis
Find the SARGs
Find the ORs
Find the joins

Phase 2 Index Selection


Choose the best index for each SARG
Choose the best method for ORs
Choose the best indexes for any join clause
Choose the best index to use for each table

Phase 3 Join Order Selection

SARG
A SARG (search argument) enables the optimiser to limit the rows searched to satisfy a query
A SARG is matched with an index
A SARG is a where clause comparing a column to a constant:
column operator constant_expression
Valid operators are =, >, <, >=, <=
!= or <> cannot be used to match a value against an index unless the query is covered

Search arguments
Valid SARGs
flag = 7
salary > 10000
city = "Pune" and state = "MH"
Invalid SARGs
lname = fname
ytd/months > 1000
ytd/12 = 1000
flag != 0

SARGs Improved by Sybase


BETWEEN becomes >= and <=
price between $10 and $20
becomes
price >= $10 and price <= $20
LIKE becomes >= and <
e.g. au_lname like "Sm%"
becomes
au_lname >= "Sm" and au_lname < "Sn"
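
The two rewrites can be mimicked with simple string handling. The Python sketch below is an illustration only: the function names are hypothetical, the upper bound for LIKE is formed by bumping the last character of the prefix, and only patterns of the form prefix followed by a single trailing % are handled.

```python
def between_to_range(column: str, low: str, high: str) -> str:
    """Rewrite 'column between low and high' as a pair of range SARGs."""
    return f"{column} >= {low} and {column} <= {high}"


def like_prefix_to_range(column: str, pattern: str) -> str:
    """Rewrite 'column like "prefix%"' as 'column >= prefix and column < next'."""
    if not pattern.endswith("%"):
        raise ValueError("only trailing-% patterns are handled in this sketch")
    prefix = pattern[:-1]
    if not prefix or "%" in prefix or "_" in prefix:
        raise ValueError("prefix must be a non-empty literal without wildcards")
    # Upper bound: same prefix with its last character incremented by one
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return f'{column} >= "{prefix}" and {column} < "{upper}"'


print(between_to_range("price", "$10", "$20"))
print(like_prefix_to_range("au_lname", "Sm%"))
```

A pattern with a leading wildcard, such as "%son", has no literal prefix and so cannot be turned into a range this way.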

OR clauses
The format of an OR clause is
SARG or SARG [or ...]
with all columns involved in the OR belonging to the same table
column in (const1, const2, ...) is also treated as OR
Examples of OR:
where au_lname = "X" or au_fname = "Y"
where (type = "ty" and price > $25) or pub_id = "1"
where au_lname in ("A", "B", "C")

The OR strategy
An OR clause is handled either by a table scan or by using the OR strategy
The OR strategy is used only when indexes exist on both columns
The OR strategy breaks the query into two parts. Each query is executed and row ids are kept in a work table. Duplicates are removed and qualifying rows are retrieved from the work table
The OR strategy is used only if the total I/O is less than the I/O required for a table scan
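
The OR strategy steps can be traced with a small in-memory model. In the Python sketch below, dictionaries stand in for the table and its two indexes and a list stands in for the work table; all names and sample rows are hypothetical, and the cost comparison against a table scan is not modelled.

```python
def build_index(rows, position):
    """Map each value in the given column position to the row ids holding it."""
    index = {}
    for row_id, row in rows.items():
        index.setdefault(row[position], []).append(row_id)
    return index


def or_strategy(rows, index_a, value_a, index_b, value_b):
    """Run each side of the OR through its index, then dedupe the row ids."""
    work_table = []
    work_table.extend(index_a.get(value_a, []))  # first part of the query
    work_table.extend(index_b.get(value_b, []))  # second part of the query
    unique_row_ids = sorted(set(work_table))     # remove duplicate row ids
    return [rows[row_id] for row_id in unique_row_ids]


# Hypothetical authors rows: row id -> (au_lname, au_fname)
authors = {1: ("X", "A"), 2: ("X", "Y"), 3: ("B", "Y"), 4: ("C", "D")}
lname_index = build_index(authors, 0)
fname_index = build_index(authors, 1)

# where au_lname = "X" or au_fname = "Y"
print(or_strategy(authors, lname_index, "X", fname_index, "Y"))
```

Row 2 satisfies both conditions, so it lands in the work table twice and is returned once after duplicate removal.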

Join Clause
A join clause is a where clause in the format
table1.column operator table2.column

Index Selection
All SARGs, OR clauses and join clauses are matched with available indexes and I/O costs are estimated
Index I/O costs are compared with each other and against the cost of a table scan to determine the least expensive access path
If no useful index is found, a table scan must be performed

ORDER BY, GROUP BY and DISTINCT clauses


The optimiser determines whether work tables are needed by these clauses
GROUP BY always needs a work table
DISTINCT: if a unique index exists on the table and all columns in the unique index are included in the result set, a work table is not created
ORDER BY: in case of a clustered index on the column used in ORDER BY, no work table is created

Potential Optimiser problems and solutions


Make sure statistics are up to date (run update statistics)
Check SARGs
Check stored procedures for current parameters

Guidelines for creating SARGs


Avoid functions, arithmetic operations, and other expressions on the column side of search clauses.
Avoid incompatible datatypes.
Use the leading column of a composite index. The optimization of secondary keys provides less performance.
Use all the search arguments you can to give the optimizer as much as possible to work with.

Adding SARG to help optimiser


1. select au_lname, title
   from titles t, titleauthor ta, authors a
   where t.title_id = ta.title_id
   and a.au_id = ta.au_id
   and t.title_id = "T81002"

2. select au_lname, title
   from titles t, titleauthor ta, authors a
   where t.title_id = ta.title_id
   and a.au_id = ta.au_id
   and ta.title_id = "T81002"

3. select au_lname, title
   from titles t, titleauthor ta, authors a
   where t.title_id = ta.title_id
   and a.au_id = ta.au_id
   and t.title_id = "T81002"
   and ta.title_id = "T81002"

Joins
The process of creating the result set for a join is to nest the tables, and to scan the inner tables repeatedly for each qualifying row in the outer table.

Choice of inner and outer tables


The outer table is usually the one that has:
Smallest number of qualifying rows, and/or
Largest number of reads required to locate rows.
The inner table usually has:
Largest number of qualifying rows, and/or
Smallest number of reads required to locate rows.
For example, when you join a large, unindexed table to a smaller table with indexes on the join key, the optimizer chooses:
The large table as the outer table. It will only have to read this large table once.
The indexed table as the inner table. Each time it needs to access the inner table, it will take only a few reads to find rows.
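
The reasoning in the example can be checked with a rough read count: the outer table is read once, and the inner table is accessed once per qualifying outer row. The Python sketch below uses that simplified cost model; the function name and all page and row counts are invented for illustration and are not the optimiser's actual formula.

```python
def nested_join_reads(outer_pages, outer_qualifying_rows, inner_reads_per_lookup):
    """Reads to scan the outer table once plus one inner access per outer row."""
    return outer_pages + outer_qualifying_rows * inner_reads_per_lookup


# Hypothetical sizes: a large unindexed table and a small indexed table
LARGE_PAGES, LARGE_ROWS = 10_000, 100_000
SMALL_PAGES, SMALL_ROWS = 100, 1_000
INDEX_LOOKUP_READS = 3          # a few reads through the index on the join key
LARGE_TABLE_SCAN = LARGE_PAGES  # no index: every inner access is a full scan

large_as_outer = nested_join_reads(LARGE_PAGES, LARGE_ROWS, INDEX_LOOKUP_READS)
small_as_outer = nested_join_reads(SMALL_PAGES, SMALL_ROWS, LARGE_TABLE_SCAN)
print(large_as_outer, small_as_outer)
```

With these numbers, putting the large unindexed table on the outside costs far fewer reads, because the alternative rescans the large table once for every row of the small one.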

Combining max and min aggregates


When used separately, min and max on indexed columns use special processing if there is no where clause
Min aggregates retrieve the first value on the root page of the index, performing a single read to find the value
Max aggregates follow the last entry on the last page at each index level until they reach the leaf level
For a clustered index, the number of reads required is the height of the index tree plus one read for the data page
For a nonclustered index, the number of reads required is the height of the index tree

Update Operation
Direct updates
Deferred updates
Direct updates are faster and are performed whenever possible

Direct Updates
Sybase performs direct updates in a single pass, as follows:
Locates the affected index and data rows
Writes the log records for the changes to the transaction log
Makes the changes to the data pages and any affected index pages

Deferred Updates
The steps involved in deferred updates are:
Locate the affected data rows, writing the log records for deferred delete and insert of the data pages as rows are located
Read the log records for the transaction. Perform the deletes on the data pages and delete any affected index rows
At the end of the operation, re-read the log, and make all inserts on the data pages and insert any affected index rows

Guidelines to avoid deferred updates


Create at least one unique index on the table to encourage more direct updates
If null values are not used, use not null in the table definition
Use the char datatype instead of varchar wherever possible

T-SQL Performance Tips

Greater Than Query


This query, with an index on int_col:
select * from table where int_col > 3
uses the index to find the first value where int_col equals 3, and then scans forward to find the first value greater than 3. If there are many rows where int_col equals 3, the server has to scan many pages to find the first row where int_col is greater than 3.
A more efficient way to write this query is:
select * from table where int_col >= 4
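
The effect can be modelled with a sorted list standing in for the index leaf level. The Python sketch below follows the behaviour described above as a simplified model, not the server's actual algorithm; the function names and the sample keys are hypothetical, and the rewrite is only equivalent for integer columns.

```python
import bisect


def scan_greater_than(sorted_keys, value):
    """Position on the first entry equal to value, then step past the equal entries."""
    start = bisect.bisect_left(sorted_keys, value)
    pos = start
    while pos < len(sorted_keys) and sorted_keys[pos] <= value:
        pos += 1
    return pos, pos - start  # (first qualifying position, entries stepped over)


def scan_greater_equal(sorted_keys, value):
    """Position directly on the first entry that is >= value."""
    return bisect.bisect_left(sorted_keys, value), 0


# Hypothetical int_col values: a thousand rows equal to 3
keys = [1, 2] + [3] * 1000 + [4, 5]
print(scan_greater_than(keys, 3))   # int_col > 3
print(scan_greater_equal(keys, 4))  # int_col >= 4
```

Both forms arrive at the same first qualifying entry, but the > 3 form steps over every entry equal to 3 on the way.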

Not Exists Test


In subqueries and if statements, exists and in perform faster than not exists and not in when the values in the where clause are not indexed. For exists and in, SQL Server can return TRUE as soon as a single row matches. For the negated expressions, it must examine all values to determine that there are no matches.

Not Exists Test


Not exists test:
if not exists (select * from table where...)
begin
    /* Statement Group 1 */
end
else
begin
    /* Statement Group 2 */
end

Not Exists Test


Can be rewritten as:
if exists (select * from table where...)
begin
    /* Statement Group 2 */
end
else
begin
    /* Statement Group 1 */
end

Variable vs Parameter values in where clause


The optimizer knows the value of a parameter to a stored procedure at compile time, but it cannot predict the value of a declared variable.
Providing the optimizer with the values of search arguments in the where clause of a query can help the optimizer make better choices.
Often, the solution is to split up stored procedures:
Set the values of variables in the first procedure.
Call the second procedure and pass those variables as parameters to the second procedure.

Variable vs Parameter values in where clause


For example, the optimizer cannot optimize the final select in the following procedure, because it cannot know the value of @x until execution time:
create procedure p
as
declare @x int
select @x = col from tab where ...
select * from tab2 where indexed_col = @x

Variable vs Parameter values in where clause


The following example shows the procedure split into two:
create procedure base_proc
as
declare @x int
select @x = col from tab where ...
exec select_proc @x

create procedure select_proc @x int
as
select * from tab2 where col2 = @x

Count vs Exists
Do not use the count aggregate in a subquery to do an existence check:
select * from tab where 0 < (select count(*) from tab2 where ...)
Instead, use exists (or in):
select * from tab where exists (select * from tab2 where ...)

Count vs Exists
When you use count, SQL Server does not know that you are doing an existence check. It counts all matching values.
When you use exists, SQL Server knows you are doing an existence check. When it finds the first matching value, it returns TRUE and stops looking.
The same applies to using count instead of in or any.
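
The difference is the same as between counting every match and stopping at the first one. The Python sketch below illustrates this with a plain list in place of tab2; the function names and data are hypothetical, and the number of rows examined is tracked only to make the contrast visible.

```python
def count_check(rows, predicate):
    """Existence check via counting: every row is examined."""
    examined = 0
    matches = 0
    for row in rows:
        examined += 1
        if predicate(row):
            matches += 1
    return matches > 0, examined


def exists_check(rows, predicate):
    """Existence check via exists: stop at the first matching row."""
    examined = 0
    for row in rows:
        examined += 1
        if predicate(row):
            return True, examined
    return False, examined


rows = list(range(1000))


def is_multiple_of_ten(value):
    return value % 10 == 0


print(count_check(rows, is_multiple_of_ten))
print(exists_check(rows, is_multiple_of_ten))
```

When no row matches, both versions examine every row, which is why the negated forms (not exists, not in) cannot stop early.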

Aggregates
SQL Server uses special optimizations for the max and min aggregates when there is an index on the aggregated column
For min, it reads the first value on the root page of the index
For max, it goes directly to the end of the index to find the last row

Aggregates
min and max optimizations are not applied if:
The expression inside the max or min is anything but a column. Compare max(numeric_col*2) and max(numeric_col)*2, where numeric_col has a nonclustered index. The second uses max optimization; the first performs a scan of the nonclustered index.
The column inside the max or min is not the first column of an index. For nonclustered indexes, it can perform a scan on the leaf level of the index; for clustered indexes, it must perform a table scan.
There is another aggregate in the query
There is a group by clause

Aggregates
Do not write two aggregates together, e.g.
select max(price), min(price) from titles
results in a full scan of titles, even if there is an index on price
Rewriting the query as:
select max(price) from titles
select min(price) from titles
uses the index for both queries

Joins and datatypes


When joining two columns of different datatypes, one of the columns must be converted to the type of the other
The column whose type is lower in the hierarchy is converted
An index cannot be used for the converted column

Joins and datatypes


select * from small_table, large_table
where small_table.float_column = large_table.int_column
In this case, SQL Server converts the integer column to float, because int is lower in the hierarchy than float. It cannot use an index on large_table.int_column, although it can use an index on small_table.float_column

Null vs not null character columns


char null is really stored as varchar
Joining char not null with char null involves a conversion
Best to have the same datatypes for frequently joined columns, including acceptance of nulls
Can be implemented using user defined datatypes

Forcing the Conversion to the other side of the join

If a join between different datatypes is unavoidable, and it hurts performance, you can force the conversion to the other side of the join

Forcing the Conversion to the other side of the join


In the following query, varchar_col must be converted to char, so no index on varchar_col can be used, and huge_table must be scanned:
select * from small_table, huge_table
where small_table.char_col = huge_table.varchar_col

Forcing the conversion to the other side of the join


Performance would be improved if the index on huge_table could be used. Using the convert function on the char column of the small table allows the index on the large table to be used while the small table is table scanned:
select * from small_table, huge_table
where convert(varchar(50), small_table.char_col) = huge_table.varchar_col

Parameters and Datatypes


The query optimizer can use the values of parameters to stored procedures to help determine costs. If a parameter is not of the same type as the column in the where clause to which it is being compared, SQL Server has to convert the parameter. The optimizer cannot use the value of a converted parameter. Make sure that parameters are of the same type as the columns they are compared to.

Parameters and Datatypes


e.g.
create proc p @x varchar(30)
as
select * from tab where char_column = @x
may get a less optimal query plan than:
create proc p @x char(30)
as
select * from tab where char_column = @x

Commands to see how optimiser is working


set showplan on: used to see the execution plan for a particular query
set noexec on: generally used with set showplan on. It is a toggle and should be set to off after the plan has been studied
set statistics time on: used to see where time is spent
dbcc traceon(302) and dbcc traceon(310): used to see which indexes are chosen