
Questionnaire

Category - Database Design


1) What is denormalization and when would you go for it?
- As the name indicates, denormalization is the reverse process of normalization. It's the controlled
introduction of redundancy into the database design. It helps improve query performance as the number
of joins can be reduced.

2) How do you implement one-to-one, one-to-many and many-to-many relationships while designing
tables?
- One-to-One relationship can be implemented as a single table and rarely as two tables with primary and
foreign key relationships.
- One-to-Many relationships are implemented by splitting the data into two tables with primary key and
foreign key relationships.
- Many-to-Many relationships are implemented using a junction table with the keys from both the tables
forming the composite primary key of the junction table.
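A minimal sketch of a junction table for a many-to-many relationship (table and column names are illustrative):

```sql
-- Illustrative many-to-many design: students and courses
CREATE TABLE Student (student_id int NOT NULL PRIMARY KEY, name varchar(50))
CREATE TABLE Course  (course_id  int NOT NULL PRIMARY KEY, title varchar(50))

-- Junction table; the composite primary key enforces uniqueness of each pair
CREATE TABLE StudentCourse
(
    student_id int NOT NULL REFERENCES Student(student_id),
    course_id  int NOT NULL REFERENCES Course(course_id),
    PRIMARY KEY (student_id, course_id)
)
```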

3) What's the difference between a primary key and a unique key?


- Both primary key and unique key enforce uniqueness of the column on which they are defined. But by
default a primary key creates a clustered index on the column, whereas a unique key creates a nonclustered
index by default. Another major difference is that a primary key doesn't allow NULLs, but a unique key
allows one NULL only.
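The default index behavior can be seen in a sketch like this (table and column names are illustrative):

```sql
CREATE TABLE Orders
(
    order_id  int NOT NULL PRIMARY KEY,   -- creates a clustered index by default
    order_ref char(10) NULL UNIQUE        -- creates a nonclustered index by default; one NULL allowed
)
```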

4) What are user defined datatypes and when you should go for them?
- User defined datatypes let you extend the base SQL Server datatypes by providing a descriptive name,
and format to the database. Take for example, in your database, there is a column called Flight_Num which
appears in many tables. In all these tables it should be varchar(8). In this case you could create a user
defined datatype called Flight_num_type of varchar(8) and use it across all your tables.
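Following the Flight_Num example above, a sketch of creating and using such a type (the table name here is illustrative):

```sql
-- Create the user defined datatype
EXEC sp_addtype Flight_num_type, 'varchar(8)', 'NOT NULL'

-- Use it like any base datatype
CREATE TABLE Flight_Schedule
(
    Flight_Num Flight_num_type,
    Departure  datetime
)
```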

5) What is bit datatype and what's the information that can be stored inside a bit column?
- The bit datatype is used to store boolean information like 1 or 0 (true or false). Until SQL Server 6.5 the bit
datatype could hold either a 1 or a 0 and there was no support for NULL. But from SQL Server 7.0 onwards,
the bit datatype can represent a third state, which is NULL.

6) Define candidate key, alternate key, composite key.


- A candidate key is one that can identify each row of a table uniquely. Generally a candidate key becomes
the primary key of the table. If the table has more than one candidate key, one of them will become the
primary key, and the rest are called alternate keys.
- A key formed by combining at least two or more columns is called composite key.

7) What are defaults? Is there a column to which a default can't be bound?


- A default is a value that will be used by a column, if no value is supplied to that column while inserting
data. IDENTITY columns and timestamp columns can't have defaults bound to them. See CREATE
DEFAULT in books online.
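A sketch of creating a default and binding it to a column (the Customers table and country column are illustrative):

```sql
-- Create a named default object
CREATE DEFAULT country_default AS 'USA'
GO
-- Bind it to a column; inserts that omit the column will get 'USA'
EXEC sp_bindefault 'country_default', 'Customers.country'
```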

Category - SQL Server Architecture


1) What is a transaction and what are ACID properties?
- A transaction is a logical unit of work in which, all the steps must be performed or none. ACID stands for
Atomicity, Consistency, Isolation, Durability. These are the properties of a transaction.

2) Explain different isolation levels


- An isolation level determines the degree of isolation of data between concurrent transactions. The default
SQL Server isolation level is Read Committed. Here are the other isolation levels (in the ascending order of
isolation): Read Uncommitted, Read Committed, Repeatable Read, Serializable.
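For example, the isolation level of the current connection can be changed like this (the titles table is from the pubs sample database):

```sql
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ

BEGIN TRAN
    SELECT * FROM titles   -- shared locks are now held until the transaction ends
COMMIT TRAN
```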

3) CREATE INDEX myIndex ON myTable(myColumn) - What type of Index will get created after
executing the above statement?
- Non-clustered index. Important thing to note: By default a clustered index gets created on the primary
key, unless specified otherwise.

4) What's the maximum size of a row?


- 8060 bytes. Don't be surprised with questions like 'what is the maximum number of columns per table'.

5) Explain some cluster configurations


- Two of the clustering configurations are Active/Active and Active/Passive.

6) What is lock escalation?


- Lock escalation is the process of converting a lot of low level locks (like row locks, page locks) into
higher level locks (like table locks). Every lock is a memory structure, so too many locks would mean more
memory being occupied by locks. To prevent this from happening, SQL Server escalates the many fine-
grain locks to fewer coarse-grain locks. The lock escalation threshold was definable in SQL Server 6.5, but
from SQL Server 7.0 onwards it's dynamically managed by SQL Server.

7) What's the difference between the DELETE and TRUNCATE TABLE commands?
- DELETE is a logged operation, so the deletion of each row gets logged in the transaction log,
which makes it slow. - TRUNCATE TABLE also deletes all the rows in a table, but it won't log the deletion
of each row; instead it logs the deallocation of the data pages of the table, which makes it faster. Of course,
TRUNCATE TABLE can be rolled back when executed inside a transaction.
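A quick sketch showing that TRUNCATE TABLE is rolled back like any other operation when it runs inside an explicit transaction (the table name is illustrative):

```sql
BEGIN TRAN
    TRUNCATE TABLE tbl_test   -- page deallocations are logged
ROLLBACK TRAN                 -- the rows are back after the rollback
```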

8) Explain the storage models of OLAP


- MOLAP, ROLAP and HOLAP

9) What are constraints? Explain different types of constraints.


- Constraints enable the RDBMS to enforce the integrity of the database automatically, without needing you
to create triggers, rules or defaults.
- Types of constraints: NOT NULL, CHECK, UNIQUE, PRIMARY KEY, FOREIGN KEY

10) What is an index? What are the types of indexes? How many clustered indexes can be created on a
table? If I create a separate index on each column of a table, what are the advantages and disadvantages of
this approach?

- Indexes in SQL Server are similar to the indexes in books. They help SQL Server retrieve the data
quicker.

- Indexes are of two types: clustered indexes and non-clustered indexes. When you create a clustered index
on a table, all the rows in the table are stored in the order of the clustered index key. So, there can be only
one clustered index per table. Non-clustered indexes have their own storage separate from the table data
storage. Non-clustered indexes are stored as B-tree structures (so are clustered indexes), with the leaf level
nodes having the index key and its row locator. The row locator could be the RID or the clustered index
key, depending on the absence or presence of a clustered index on the table.

- If you create an index on each column of a table, it improves the query performance, as the query
optimizer can choose from all the existing indexes to come up with an efficient execution plan. At the same
time, data modification operations (such as INSERT, UPDATE, DELETE) will become slow, as every time
data changes in the table, all the indexes need to be updated. Another disadvantage is that indexes need
disk space; the more indexes you have, the more disk space is used.
Category - Database Administration
1) What is RAID and what are different types of RAID configurations?
- RAID stands for Redundant Array of Inexpensive Disks, used to provide fault tolerance to database
servers. There are six RAID levels, 0 through 5, offering different levels of performance and fault tolerance.

2) What are the steps you will take to improve performance of a poor performing query?
- This is a very open ended question and there could be a lot of reasons behind the poor performance of a
query. But some general issues that you could talk about would be: No indexes, table scans, missing or out
of date statistics, blocking, excess recompilations of stored procedures, procedures and triggers without
SET NOCOUNT ON, poorly written query with unnecessarily complicated joins, too much normalization,
excess usage of cursors and temporary tables.

- Some of the tools/ways that help you troubleshooting performance problems are: SET SHOWPLAN_ALL
ON, SET SHOWPLAN_TEXT ON, SET STATISTICS IO ON, SQL Server Profiler, Windows NT /2000
Performance monitor, Graphical execution plan in Query Analyzer.

3) What are the steps you will take, if you are tasked with securing an SQL Server?
- Again this is another open ended question. Here are some things you could talk about: Preferring NT
authentication, using server, database and application roles to control access to the data, securing the
physical database files using NTFS permissions, using an unguessable SA password, restricting physical
access to the SQL Server, renaming the Administrator account on the SQL Server computer, disabling the
Guest account, enabling auditing, using multiprotocol encryption, setting up SSL, setting up firewalls,
isolating SQL Server from the web server etc.

4) What is a deadlock and what is a live lock? How will you go about resolving deadlocks?
- Deadlock is a situation when two processes, each having a lock on one piece of data, attempt to acquire a
lock on the other's piece. Each process would wait indefinitely for the other to release the lock, unless one
of the user processes is terminated. SQL Server detects deadlocks and terminates one user's process.

- A livelock is one where a request for an exclusive lock is repeatedly denied because a series of
overlapping shared locks keeps interfering. SQL Server detects the situation after four denials and refuses
further shared locks. A livelock also occurs when read transactions monopolize a table or page, forcing a
write transaction to wait indefinitely.

5) What is blocking and how would you troubleshoot it?


- Blocking happens when one connection from an application holds a lock and a second connection
requires a conflicting lock type. This forces the second connection to wait, blocked on the first.

6)How to restart SQL Server in single user mode? How to start SQL Server in minimal configuration
mode?
- SQL Server can be started from the command line, using SQLSERVR.EXE. This EXE has some very
important parameters with which a DBA should be familiar. -m is used for starting SQL Server in
single user mode and -f is used to start SQL Server in minimal configuration mode.

7) As a part of your job, what are the DBCC commands that you commonly use for database maintenance?
- DBCC CHECKDB, DBCC CHECKTABLE, DBCC CHECKCATALOG, DBCC CHECKALLOC,
DBCC SHOWCONTIG, DBCC SHRINKDATABASE, DBCC SHRINKFILE etc. But there are a whole
load of DBCC commands which are very useful for DBAs.

8) What are statistics, under what circumstances they go out of date, how do you update them?
- Statistics determine the selectivity of the indexes. If an indexed column has unique values then the
selectivity of that index is higher, as opposed to an index with non-unique values. The query optimizer uses
this selectivity information in determining whether or not to choose an index while executing a query.
Some situations under which you should update statistics:
a) If there is significant change in the key values in the index
b) If a large amount of data in an indexed column has been added, changed, or removed (that is, if the
distribution of key values has changed), or the table has been truncated using the TRUNCATE TABLE
statement and then repopulated
c) Database is upgraded from a previous version
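Statistics can be refreshed manually; a sketch with illustrative table and index names:

```sql
-- Update all statistics on one table
UPDATE STATISTICS tbl_emp

-- Update the statistics of a single index on that table
UPDATE STATISTICS tbl_emp idx_emp_salary

-- Or update statistics for all user tables in the current database
EXEC sp_updatestats
```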

9) What are the different ways of moving data/databases between servers and databases in SQL Server?
- There are lots of options available, you have to choose your option depending upon your requirements.
Some of the options you have are: BACKUP/RESTORE, detaching and attaching databases, replication,
DTS, BCP, log shipping, INSERT...SELECT, SELECT...INTO, creating INSERT scripts to generate data.

10) Explain the different types of BACKUPs available in SQL Server. Given a particular scenario, how
would you go about choosing a backup plan?
- Types of backups you can create in SQL Server 7.0+ are full database backup, differential database
backup, transaction log backup, filegroup backup. Check out the BACKUP and RESTORE commands in
SQL Server books online.

11) What is database replication? What are the different types of replication you can set up in SQL Server?
- Replication is the process of copying/moving data between databases on the same or different servers.
SQL Server supports the following types of replication scenarios:

a) Snapshot replication
b) Transactional replication (with immediate updating subscribers, with queued updating subscribers)
c) Merge replication

12) How to determine the service pack currently installed on SQL Server?
- The global variable @@VERSION stores the version string and build number of sqlservr.exe, which can
be used to determine the service pack installed.
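For example:

```sql
SELECT @@VERSION   -- returns the full version string, including the build number

-- In SQL Server 2000, SERVERPROPERTY is more convenient:
-- ProductLevel returns values like 'RTM' or 'SP3'
SELECT SERVERPROPERTY('ProductVersion'), SERVERPROPERTY('ProductLevel')
```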

Category - Database Programming


1) What are cursors? Explain different types of cursors. What are the disadvantages of cursors? How can
you avoid cursors?

Cursors allow row-by-row processing of the resultsets.

Types of cursors: Static, Dynamic, Forward-only, Keyset-driven. See books online for more information.

Disadvantages of cursors: Each time you fetch a row from the cursor, it results in a network roundtrip,
whereas a normal SELECT query makes only one roundtrip, however large the resultset is. Cursors are
also costly because they require more resources and temporary storage (resulting in more IO operations).
Further, there are restrictions on the SELECT statements that can be used with some types of cursors.

Most of the times, set based operations can be used instead of cursors. Here is an example:

If you have to give a flat hike to your employees using the following criteria:

Salary between 30000 and 40000 -- 5000 hike


Salary between 40000 and 55000 -- 7000 hike
Salary between 55000 and 65000 -- 9000 hike
In this situation many developers tend to use a cursor, determine each employee's salary and update his
salary according to the above formula. But the same can be achieved by multiple update statements or can
be combined in a single UPDATE statement as shown below:

UPDATE tbl_emp SET salary =


CASE WHEN salary BETWEEN 30000 AND 40000 THEN salary + 5000
WHEN salary BETWEEN 40000 AND 55000 THEN salary + 7000
WHEN salary BETWEEN 55000 AND 65000 THEN salary + 9000
END

Another situation in which developers tend to use cursors: You need to call a stored procedure when a
column in a particular row meets certain condition. You don't have to use cursors for this. This can be
achieved using WHILE loop, as long as there is a unique key to identify each row. For examples of using
WHILE loop for row by row processing, check out the 'My code library' section of my site or search for
WHILE.
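A sketch of this WHILE-loop pattern, reusing the tbl_emp table from the earlier example and assuming empid is its unique key (the procedure name is illustrative):

```sql
DECLARE @empid int
SELECT @empid = MIN(empid) FROM tbl_emp

WHILE @empid IS NOT NULL
BEGIN
    -- process one row here, e.g. call a stored procedure for this employee
    EXEC usp_process_employee @empid   -- illustrative procedure name

    -- move to the next row using the unique key
    SELECT @empid = MIN(empid) FROM tbl_emp WHERE empid > @empid
END
```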

2) Write down the general syntax for a SELECT statements covering all the options.

Here's the basic syntax: (Also checkout SELECT in books online for advanced syntax).

SELECT select_list
[INTO new_table]
FROM table_source
[WHERE search_condition]
[GROUP BY group_by_expression]
[HAVING search_condition]
[ORDER BY order_expression [ASC | DESC] ]

3) What is a join and explain different types of joins.

Joins are used in queries to explain how different tables are related. Joins also let you select data from a
table depending upon data from another table.

Types of joins: INNER JOINs, OUTER JOINs, CROSS JOINs. OUTER JOINs are further classified as
LEFT OUTER JOINS, RIGHT OUTER JOINS and FULL OUTER JOINS.

4) Can you have a nested transaction?

Yes, very much. Check out BEGIN TRAN, COMMIT, ROLLBACK, SAVE TRAN and
@@TRANCOUNT
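A minimal sketch of how @@TRANCOUNT behaves with nested transactions:

```sql
BEGIN TRAN            -- @@TRANCOUNT = 1
    BEGIN TRAN        -- @@TRANCOUNT = 2 (nested)
        SELECT @@TRANCOUNT
    COMMIT TRAN       -- only decrements the counter; nothing is durable yet
COMMIT TRAN           -- @@TRANCOUNT = 0; the outer commit makes the work durable
-- Note: a ROLLBACK at any level rolls back ALL levels, not just the inner one
```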

5) What is an extended stored procedure? Can you instantiate a COM object by using T-SQL?

An extended stored procedure is a function within a DLL (written in a programming language like C, C++
using Open Data Services (ODS) API) that can be called from T-SQL, just the way we call normal stored
procedures using the EXEC statement. See books online to learn how to create extended stored procedures
and how to add them to SQL Server.

Yes, you can instantiate a COM (written in languages like VB, VC++) object from T-SQL by using
sp_OACreate stored procedure.

6) What is the system function to get the current user's user id?

USER_ID(). Also check out other system functions like USER_NAME(), SYSTEM_USER,
SESSION_USER, CURRENT_USER, USER, SUSER_SID(), HOST_NAME().
7) What are triggers? How many triggers you can have on a table? How to invoke a trigger on demand?

Triggers are special kind of stored procedures that get executed automatically when an INSERT, UPDATE
or DELETE operation takes place on a table.

In SQL Server 6.5 you could define only 3 triggers per table, one for INSERT, one for UPDATE and one
for DELETE. From SQL Server 7.0 onwards, this restriction is gone, and you could create multiple triggers
per each action. But in 7.0 there's no way to control the order in which the triggers fire. In SQL Server 2000
you could specify which trigger fires first or fires last using sp_settriggerorder
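A sketch of sp_settriggerorder usage in SQL Server 2000 (the trigger name is illustrative):

```sql
-- Make a particular INSERT trigger fire first among the table's triggers
EXEC sp_settriggerorder
    @triggername = 'trg_emp_audit',
    @order       = 'First',
    @stmttype    = 'INSERT'
```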

Triggers can't be invoked on demand. They get triggered only when an associated action (INSERT,
UPDATE, DELETE) happens on the table on which they are defined.

Triggers are generally used to implement business rules, auditing. Triggers can also be used to extend the
referential integrity checks, but wherever possible, use constraints for this purpose, instead of triggers, as
constraints are much faster.

Till SQL Server 7.0, triggers fire only after the data modification operation happens. So in a way, they are
called post triggers. But in SQL Server 2000 you could create pre triggers also.

9) There is a trigger defined for INSERT operations on a table, in an OLTP system. The trigger is written to
instantiate a COM object and pass the newly inserted rows to it for some custom processing. What do you
think of this implementation? Can this be implemented better?

Instantiating COM objects is a time consuming process and since you are doing it from within a trigger, it
slows down the data insertion process. The same is the case with sending emails from triggers. This scenario
can be better implemented by logging all the necessary data into a separate table, and having a job which
periodically checks this table and does the required processing.

10) What is a self join? Explain it with an example.

A self join is just like any other join, except that two instances of the same table are joined in the query.
Here is an example: an Employees table contains rows for normal employees as well as managers. So,
to find out the managers of all the employees, you need a self join.

CREATE TABLE emp


(
empid int,
mgrid int,
empname char(10)
)

INSERT emp SELECT 1,2,'Vyas'


INSERT emp SELECT 2,3,'Mohan'
INSERT emp SELECT 3,NULL,'Shobha'
INSERT emp SELECT 4,2,'Shridhar'
INSERT emp SELECT 5,2,'Sourabh'

SELECT t1.empname [Employee], t2.empname [Manager]


FROM emp t1, emp t2
WHERE t1.mgrid = t2.empid

Here's an advanced query using a LEFT OUTER JOIN that even returns the employees without managers
(super bosses)

SELECT t1.empname [Employee], COALESCE(t2.empname, 'No manager') [Manager]


FROM emp t1
LEFT OUTER JOIN
emp t2
ON t1.mgrid = t2.empid
SQL Servers
• What is a major difference between SQL Server 6.5 and 7.0 platform wise?
SQL Server 6.5 runs only on Windows NT Server. SQL Server 7.0 runs on Windows NT Server,
workstation and Windows 95/98.

• Is SQL Server implemented as a service or an application?


It is implemented as a service on Windows NT server and workstation and as an application on
Windows 95/98.

• What is the difference in Login Security Modes between v6.5 and 7.0?
7.0 doesn't have Standard Mode, only Windows NT Integrated mode and Mixed mode that
consists of both Windows NT Integrated and SQL Server authentication modes.

• What is a traditional Network Library for SQL Servers?


Named Pipes

• What is a default TCP/IP socket assigned for SQL Server?


1433

• If you encounter this kind of an error message, what you need to look into to solve this problem?
"[Microsoft][ODBC SQL Server Driver][Named Pipes]Specified SQL Server not found."
1. Check if the MS SQL Server service is running on the computer you are trying to log into.
2. Check the Client Configuration utility. Client and server have to be in sync.

• What are the two options the DBA has to assign a password to sa?
a) use a SQL statement:
USE master
EXEC sp_password NULL, 'new_password', 'sa' -- 'new_password' is a placeholder
b) use the Query Analyzer utility

• What is the new philosophy for database devices in SQL Server 7.0?
There are no devices anymore in SQL Server 7.0. It is file based now.

• When you create a database how is it stored?


It is stored in two separate files: one file contains the data, system tables, other database objects,
the other file stores the transaction log.

• Let's assume you have data that resides on SQL Server 6.5. You have to move it to SQL Server 7.0.
How are you going to do it?
You can use the SQL Server Upgrade Wizard, or transfer the data using DTS.

DirectConnect
• Have you ever tested 3 tier applications?

• Do you know anything about DirectConnect software? Who is a vendor of the software?
Sybase.
• What platform does it run on?
UNIX.

• How did you use it? What kind of tools have you used to test connection?
SQL Server or Sybase client tools.

• How to set up a permission for 3 tier application?


Contact System Administrator.

• What UNIX command do you use to connect to UNIX server?


FTP Server Name

• Do you know how to configure DB2 side of the application?


Set up an application ID, create RACF group with tables attached to this group, attach the ID to
this RACF group.
• Differences between SET and SELECT!
Are standards important to you? If your answer is 'yes', then you should be using SET. This is
because, SET is the ANSI standard way of assigning values to variables, and SELECT is not.

Another fundamental difference between SET and SELECT is that, you can use SELECT to assign
values to more than one variable at a time. SET allows you to assign data to only one variable at a
time. Here's how:

/* Declaring variables */
DECLARE @Variable1 AS int, @Variable2 AS int

/* Initializing two variables at once */


SELECT @Variable1 = 1, @Variable2 = 2

/* The same can be done using SET, but two SET statements are needed */
SET @Variable1 = 1
SET @Variable2 = 2

What is the output of following Sql query ?


select datepart(yy,'2/2/50'), datepart(yy,'12/31/49'), datepart(yy,getdate())
Note: The current year is assumed to be 2005.
Select Answer:
1. 2050 2049 2005
2. 2050 1949 2005
3. 1950 1940 2005
4. 1950 1940 1995
5. 1950 2049 2005
(Correct answer: 5. With the default two digit year cutoff of 2049, '50' is interpreted as 1950 and '49' as 2049.)

What is the purpose of clustering: Allowing physical server to take task in case of failure of other
server
New terminology used in SQL server 7.0 for DUMP,LOAD: BACKUP,RESTORE
What is the Default ScriptTimeOut for Server Object? 20 Sec
How many machine config file possible in system? Only 1
How many webconfig file possible in one system? Depend on no of web application on the
System
How many Global.asax file is possible in one application? 1
What is the use of the following statement: Response.Expires=120? The page will be removed from the
cache after 120 minutes
What should the developer use in order to have an Active Server Page (ASP) invokes a
stored procedure on a SQL Server database? ADO
Write the query to get the full details of the currently installed SQL server in the machine
execute master..xp_msver
Which choice is NOT an ADO collection? Records
Which is the default Scripting Language on the client side in ASP? JavaScript
What is the query to get the version of the SQL Server? select @@version
Which version of SQL Server is not supported by System.Data.SqlClient? SQL Server 6.5
What is the use of SQL-DMO? It is a set of programmable objects used to programmatically administer
the database
What does SQL-DMO stand for? SQL Distributed Management Objects
What is the use of BCP? It is used to export and import large amounts of data into and out of SQL Server
Expand BCP? Bulk Copy Program
1. How do you read transaction logs?

2. How do you reset or reseed the IDENTITY column?

3. How do you persist objects, permissions in tempdb?

4. How do you simulate a deadlock for testing purposes?

5. How do you rename an SQL Server computer?

6. How do you run jobs from T-SQL?

7. How do you restore single tables from backup in SQL Server 7.0/2000? In SQL Server 6.5?

8. Where to get the latest MDAC from?

9. I forgot/lost the sa password. What do I do?

10. I have only the .mdf file backup and no SQL Server database backups. Can I get my database back
into SQL Server?

11. How do you add a new column at a specific position (say at the beginning of the table or after the
second column) using ALTER TABLE command?

12. How do you change or alter a user defined data type?

13. How do you rename an SQL Server 2000 instance?

14. How do you capture/redirect detailed deadlock information into the error logs?

15. How do you remotely administer SQL Server?

16. What are the effects of switching SQL Server from ‘Mixed mode’ to ‘Windows only’
authentication mode? What are the steps required, to not break existing applications?

17. Is there a command to list all the tables and their associated filegroups?

18. How do you ship the stored procedures, user defined functions (UDFs), triggers, views of my
application, in an encrypted form to my clients/customers? How do you protect intellectual
property?

19. How do you archive data from my tables? Is there a built-in command or tool for this?

20. How do you troubleshoot ODBC timeout expired errors experienced by applications accessing
SQL Server databases?
21. How do you restart SQL Server service automatically at regular intervals?

22. What is the T-SQL equivalent of IIF (immediate if/ternary operator) function of other
programming languages?

23. How do you programmatically find out when the SQL Server service started?

24. How do you get rid of the time part from the date returned by GETDATE function?

25. How do you upload images or binary files into SQL Server tables?

26. How do you run an SQL script file that is located on the disk, using T-SQL?

27. How do you get the complete error message from T-SQL while error handling?

28. How do you get the first day of the week, last day of the week and last day of the month using T-
SQL date functions?

29. How do you pass a table name, column name etc. to the stored procedure so that I can dynamically
select from a table?

30. Error inside a stored procedure is not being raised to my front-end applications using ADO. But I
get the error when I run the procedure from Query Analyzer.

31. How do you suppress error messages in stored procedures/triggers etc. using T-SQL?

32. How do you save the output of a query/stored procedure to a text file?

33. How do you join tables from different databases?

34. How do you join tables from different servers?

35. How do you convert timestamp data to date data (datetime datatype)?

36. Can I invoke/instantiate COM objects from within stored procedures or triggers using T-SQL?

37. Oracle has a rownum to access rows of a table using row number or row id. Is there any equivalent
for that in SQL Server? Or How do you generate output with row number in SQL Server?

38. How do you specify a network library like TCP/IP using ADO connect string?

39. How do you generate scripts for repetitive tasks like truncating all the tables in a database,
changing owner of all the database objects, disabling constraints on all tables etc?

40. Is there a way to find out when a stored procedure was last updated?

41. How do you find out all the IDENTITY columns of all the tables in a given database?

42. How do you search the code of stored procedures?

43. How do you retrieve the generated GUID value of a newly inserted row? Is there an @@GUID,
just like @@IDENTITY?

Q. What is the purpose of database links in Oracle?

Database links are created to establish communication between different databases, or different environments
such as development, test and production of the same database. Database links are usually designed to be
read-only to access other databases' information. They are also useful when you want to copy production data
into a test environment for testing.

Q. What is Oracle's data dictionary used for?

Data dictionary in Oracle contains information about all database objects such as tables, triggers, stored
procedures, functions, indexes, constraints, views, users, roles, monitoring information, etc.

Q. Which data dictionary objects are used to retrieve the information about the following objects from
a given schema?
1) tables
2) views
3) triggers
4) procedures
5) constraints
6) all of the above mentioned objects

The objects used are:


a> user_tables or tabs
b> user_views
c> user_triggers
d> user_procedures
e> user_constraints
f> user_objects

Q. Assume a developer has access to XYZ schema in an Oracle database and this is the main schema
where every developer puts finalized version of their procedures, triggers, views, etc as part of
building a database application. How will you retrieve the following data dictionary objects under
XYZ schema?
a> tables
b> views
c> triggers
d> procedures
e> constraints
f> all of the above mentioned objects

Use the following objects to retrieve the objects:


a> all_tables
b> all_views
c> all_triggers
d> all_procedures
e> all_constraints
f> all_objects

Note: When the information is retrieved from all_* objects then do not forget to include owner as part of the
where clause. E.g. select * from all_tables where owner = ‘XYZ’.

Q. What is a synonym and why would you use it?

A synonym is an alias given to a table in Oracle. There are two main reasons for using synonyms: 1) For
security reasons, the tables are created with specific names by the DBA and public synonyms (aliases) are
assigned to them that differ from the original names. If security is not an issue then synonyms and table names
can have the same name. 2) Synonyms eliminate the need to qualify the table with the owner name when
forming SQL queries.
Q. What is a REF cursor?
A REF cursor falls under explicit cursor category (also refer to Explicit and Implicit cursor question).
However the explicit SQL statement is not associated with it at compile time. At run-time the SQL statement
is dynamically formed and then assigned to the REF cursor variable. Thereafter it follows the same
procedures of opening cursor, fetching from cursor and closing cursor. It can be associated with variety of
different SQL queries in the same PL/SQL program vs. design time declared explicit cursors with an
association to only one query.

Q. You want to view the top 50 rows from an Oracle table. How do you do this?
Use ROWNUM, the pseudo column, in the WHERE clause as follows:
WHERE ROWNUM < 51
After execution of the query and before displaying the output to the user, Oracle internally
assigns sequential numbers to each row in the output. These numbers are held in the hidden or
pseudo column ROWNUM. Now it is simple to apply the above logical condition, as you
would to any other column of the table.
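The complete query would look like this (assuming a table named customer):

```sql
-- Return the first 50 rows Oracle produces for this query
SELECT *
FROM customer
WHERE ROWNUM < 51;
```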

Q. Assume you have a table CUSTOMER with customer_id as its primary key and its data type is
number. The table has 1000 existing customers and its primary key values range from 1 to 1000. How do
you select the customers ranging from 300 to 400 using ROWNUM?
The obvious answer is: rownum >= 300 and rownum <= 400
However this is not possible because Oracle does not allow you to select such a range based on ROWNUM.
You can apply a WHERE clause that selects rows 1 to 400 with rownum < 401, or not use the
ROWNUM pseudo column at all and instead use customer_id >= 300 and customer_id <= 400 as the WHERE clause.
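Sketches of both workable options; the inline-view form is a commonly used Oracle pattern when a true ROWNUM-based range is needed:

```sql
-- Option 1: filter on the key column itself
SELECT * FROM customer WHERE customer_id BETWEEN 300 AND 400;

-- Option 2: materialize ROWNUM in an inline view first, then filter on it
SELECT *
FROM (SELECT c.*, ROWNUM rn FROM customer c)
WHERE rn BETWEEN 300 AND 400;
```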

Q. Why would you use “CREATE OR REPLACE PROCEDURE procedure name” clause when
writing stored procedures? What is the prerequisite to use such a clause?
The above clause replaces an existing stored procedure in place instead of requiring you to drop and recreate
it with the new changes. Oracle applies the changes only if the procedure is created using the above clause. It
also offloads the DBA from constantly making changes to the procedure on developers' requests.
The prerequisite for such replacement is that permission to replace must be granted by the DBA. This option is
only good for a development environment.

Q. What are the different types of triggers in Oracle?


The following are the different types of triggers supported by Oracle:

Statement Level Triggers: They execute only once per triggering statement, regardless of how many rows are affected. They are often used to enforce additional security measures.

Row Level Triggers: They execute for every row in a table that is affected by an insert, delete or update operation. They are further classified by the time they occur: the BEFORE and AFTER keywords describe the timing of the event, e.g. before inserting a row and after inserting the row in the table.

Instead Of Triggers: These triggers are useful when you want to control changes to the tables beneath a view. They dictate how insert, delete or update commands are performed on the tables involved in views.

Schema Triggers: These triggers execute for DDL operations within the schema, such as CREATE TABLE and DROP TABLE.

Database Level Triggers: They are fired during database events such as startup or shutdown of the database and are used by DBAs to maintain logs or notify users of the events.

Q. How do you control unwanted updates to a table from happening when an update SQL query is
issued?

Create a trigger on the table that handles the BEFORE UPDATE event. In this trigger code you have access to the old and new values of the row before the update takes place. Apply a logical condition that compares the old and new value(s) of the column(s), and based on the outcome determine whether the new value(s) should be retained, replaced by the old value(s), or replaced by any other appropriate value(s).
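As a runnable sketch of the idea, here is a BEFORE UPDATE trigger in SQLite (via Python). SQLite triggers cannot reassign NEW values the way Oracle's can, so this illustration rejects the unwanted update instead; the table, column, and business rule are all hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, sal NUMERIC)")
conn.execute("INSERT INTO emp VALUES (1, 1000)")

# Reject any update that would lower a salary (illustrative condition).
conn.execute("""
    CREATE TRIGGER emp_no_pay_cut BEFORE UPDATE ON emp
    FOR EACH ROW WHEN NEW.sal < OLD.sal
    BEGIN
        SELECT RAISE(ABORT, 'salary decrease not allowed');
    END""")

conn.execute("UPDATE emp SET sal = 1200 WHERE id = 1")     # allowed
try:
    conn.execute("UPDATE emp SET sal = 900 WHERE id = 1")  # blocked by trigger
except sqlite3.IntegrityError as e:
    print("blocked:", e)
print(conn.execute("SELECT sal FROM emp").fetchone()[0])   # 1200
```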

Q. How do you reference column values in BEFORE and AFTER insert and delete triggers?

The BEFORE and AFTER insert triggers can reference the inserted column values through the new pseudo-record using ":new.column_name". The BEFORE and AFTER delete triggers can reference the deleted column values through the old pseudo-record using ":old.column_name".

Q. Can you change the inserted value in one of the columns in AFTER insert trigger code?

This is not possible, as the column values supplied by the insert SQL query are already inserted into the table. If you try to assign a new value to the column in the AFTER insert trigger code, an Oracle error is raised. To alter any values supplied by the insert SQL query, create a BEFORE insert trigger instead.

Q. Explain use of SYSDATE and USER keywords.

SYSDATE is a pseudo column that refers to the current server system date. USER is a pseudo column that refers to the current user logged onto the Oracle session. These values come in handy when you want to monitor changes happening to a table.

Q. Assume you are running a PL/SQL program that involves a huge transaction. For some reason,
Oracle cannot handle it and gives a typical error ROLLBACK SEGMENT FULL. What would you do
to get rid of this error?

The rollback segment holds the information about an active transaction until it is either committed or rolled back. Generally this type of error occurs because of the size of the transaction, because insufficient extents (data blocks) are allocated for the rollback segment, because many active transactions are using the same rollback segment at the same time, or because the rollback segment is not freed even when there are no active transactions. This type of error must be reported to the DBA; most of the time, bouncing the server or temporarily extending the rollback segment with more space helps. PL/SQL programs that spawn a big transaction should run as batch jobs during off hours to minimize this error.

Q. Which functions do you use to report error number and message in PL/SQL program?

Use the SQLCODE function to report the Oracle error number and the SQLERRM function to retrieve the error message related to that number. Programmers use these functions in the EXCEPTION section of the program to pass an error message to the calling procedure or client application.

Q. What is the difference between explicit cursor and implicit cursor?

When a single insert, delete or update statement is executed within a PL/SQL program, Oracle creates an implicit cursor for it, executes the statement, and closes the cursor. You can check the result of the execution using the SQL%ROWCOUNT attribute.

Explicit cursors are created programmatically. A cursor is declared and associated with a SQL query. The program then opens the cursor, fetches column information into variables or a record type variable, and closes the cursor after all records are fetched. To check whether an explicit cursor is open, use the cursor_name%ISOPEN attribute, and to check whether there are records still to be fetched, use the cursor_name%FOUND attribute (the implicit-cursor equivalents are SQL%ISOPEN and SQL%FOUND).

Q. What is decode function used for?

The DECODE function implements if-then-else logic within a SQL query. It evaluates a SQL expression and compares its result with a series of values. When the result matches one of the values, DECODE outputs the result associated with that value; when there is no match, it falls back to the specified default.

Example:
Select employee_name, decode(designation, 'P', 'Programmer', 'PL', 'Project Leader', 'PM', 'Project Manager', 'Clerk')
From Employee

The above query prints employee names and their designation. If the designation column value does not
match P or PL or PM values then the designation of an employee is Clerk.

Q. What is CASE statement used for?

The CASE expression is a more flexible version of DECODE: more readable and simpler to code. It implements if-then-else logic within SQL queries and PL/SQL program blocks. The DECODE function was heavily used prior to Oracle 9i; its downside was extra complexity and difficult maintenance, especially with nested DECODE calls.

Example:
Select employee_name,
(Case designation
When 'P' then 'Programmer'
When 'PL' then 'Project Leader'
When 'PM' then 'Project Manager'
Else 'Clerk'
End) as designation_description
From Employee

The above query prints employee names and their designation. If the designation column value does not
match P or PL or PM values then the designation of an employee is 'Clerk'.
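The same CASE expression runs unchanged on most SQL engines. A minimal runnable sketch using SQLite via Python, with hypothetical employee rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employee_name TEXT, designation TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [("Asha", "P"), ("Ravi", "PM"), ("Kiran", "X")])

# Simple CASE: compares designation against each WHEN value in turn.
rows = conn.execute("""
    SELECT employee_name,
           CASE designation
               WHEN 'P'  THEN 'Programmer'
               WHEN 'PL' THEN 'Project Leader'
               WHEN 'PM' THEN 'Project Manager'
               ELSE 'Clerk'
           END AS designation_description
    FROM employee ORDER BY rowid""").fetchall()
print(rows)  # [('Asha', 'Programmer'), ('Ravi', 'Project Manager'), ('Kiran', 'Clerk')]
```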

Q. What are the two common integrity constraints when you deal with database tables?

The primary key constraint allows table to have only unique rows identified by primary key values. The
foreign key constraint establishes parent-child relationship between two tables. This constraint restricts child
table from having orphan rows. In simple terms you need to insert a row in the parent table first and then insert a
row in the child table with the same parent key. A classic example of this relationship is the Customer and
Orders tables: CustomerID, which is the primary key in the Customer table, is referenced in the Orders table,
so every row in the Orders table has a CustomerID that exists in the Customer table.
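The Customer/Orders example above can be demonstrated end to end. A sketch using SQLite via Python (note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON; table names follow the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # FK enforcement is off by default in SQLite
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY)")
conn.execute("""CREATE TABLE orders (
                    order_id INTEGER PRIMARY KEY,
                    customer_id INTEGER REFERENCES customer(customer_id))""")

conn.execute("INSERT INTO customer VALUES (1)")
conn.execute("INSERT INTO orders VALUES (10, 1)")       # parent exists: accepted
try:
    conn.execute("INSERT INTO orders VALUES (11, 99)")  # orphan row: rejected
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```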

Q. What is a package?

A package is like an object that has methods and properties. The methods of the package are PL/SQL
functions or procedures. The (public) properties of the package are Oracle-supported data types such as cursors,
variables, records, etc. The methods and properties are logically grouped into the package.

The package has two parts: package specification that defines the interface to the consumers of the package
and package body that has the real implementation of procedures and functions.

Q. What is a function?

A function is a PL/SQL program that can return a value to the calling program, and it can be used in SQL queries.
It returns the value using the RETURN statement.

Q. Assume you have a table that is contains some rows and you want to add a new column to it. There
are a few tables that are dependent on the table to which the column needs to be added and foreign
key constraints are defined on them. How would you do this?

Simply adding a new, nullable column requires only an "alter table … add" statement; this works even when the table contains rows and other tables reference it through foreign key constraints. The copy-and-reload approach is needed only for more invasive restructuring: create a copy of the table using a "create table … as select" query, disable the foreign key constraints, delete all rows from the original table and alter it, re-enable the constraints, repopulate the table with an "insert … select" from the copy, and finally drop the copy.

Q. What data types can you return from a function?

The function can return any valid PL/SQL data type. This includes scalar data types such as INTEGER,
NUMBER, DATE and VARCHAR2; composite data types such as RECORD, VARRAY and TABLE; reference types
such as REF CURSOR; and LOB types such as BLOB and BFILE.

Q. How do you handle errors in Pl/SQL?

You add an EXCEPTION section to the PL/SQL block to handle errors, as follows:

DECLARE
   -- variable declarations
BEGIN
   -- PL/SQL statements
EXCEPTION
   WHEN exception1 THEN
      -- error handling code
   WHEN exception2 THEN
      -- error handling code
   WHEN OTHERS THEN
      -- all errors not handled by the exceptions defined above
END;

To handle a system-defined exception, add a "WHEN exception_name" handler for it. To handle a user-defined exception, declare it in the DECLARE section of the program and handle it in the exception section with "WHEN user_defined_exception_name"; user-defined exceptions must be raised by the program when certain conditions are met. To handle any remaining exception, use WHEN OTHERS THEN. After an exception is handled, control does not return to the line following the one that raised the exception; execution proceeds to the END; statement of the block.

Q. Explain the difference between these joins supported by Oracle: Natural Join, Inner Join, Left
Outer Join, Right Outer Join and Full Outer Join.

When two tables are joined using a natural join, the join is based on identically named columns, and the
matching rows common to the two tables are retrieved.

When two tables are joined using an inner join, the join condition is based on specified columns, and the
matching rows common to the two tables are retrieved.

In a left outer join, the matching rows common to the two tables plus the non-matching rows from the table
on the left side of the join condition are retrieved.

In a right outer join, the matching rows common to the two tables plus the non-matching rows from the table
on the right side of the join condition are retrieved.

The full outer join is nothing but the union of the left outer join and right outer join queries: the matching
rows common to the two tables plus the non-matching rows from the tables on both sides of the join condition
are retrieved.
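The join behaviors above can be compared side by side. A sketch using SQLite via Python with two hypothetical tables; since older SQLite versions lack RIGHT and FULL OUTER JOIN, the full outer join is emulated exactly as described above, as a union of left joins in both directions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, name TEXT);
    CREATE TABLE b (id INTEGER, dept TEXT);
    INSERT INTO a VALUES (1, 'x'), (2, 'y');
    INSERT INTO b VALUES (2, 'sales'), (3, 'hr');
""")

inner = conn.execute("SELECT a.id FROM a JOIN b ON a.id = b.id").fetchall()
left  = conn.execute("SELECT a.id, b.id FROM a LEFT JOIN b ON a.id = b.id").fetchall()
# Full outer join emulated as the union of the two left joins.
full = conn.execute("""
    SELECT a.id, b.id FROM a LEFT JOIN b ON a.id = b.id
    UNION
    SELECT a.id, b.id FROM b LEFT JOIN a ON a.id = b.id""").fetchall()
print(len(inner), len(left), len(full))  # 1 2 3
```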

Q. Explain the different types of optimizers Oracle uses to optimize SQL queries.

Oracle uses the Cost Based Optimizer (CBO) and the Rule Based Optimizer (RBO) to optimize SQL queries.

The RBO evaluates different execution paths based on a series of syntactical rules and picks the best-ranked path to execute the query. This optimizer is not used much in new applications but still supports applications developed in earlier versions of Oracle.

The CBO (selected, for example, when OPTIMIZER_MODE is set to CHOOSE and statistics are available) evaluates different execution paths based on cost and uses the path with the lowest relative cost to execute the query. The cost of execution depends on up-to-date table, index, partition and other statistics. These statistics are gathered using the ANALYZE command (retained for backward compatibility) or the DBMS_STATS package.

Q. Why does a query in Oracle run faster when ROWID is used as a part of the where clause?

ROWID is a pseudo column containing the physical address of a row; it is not an actual column stored in the
table. It is composed of the file number, the data block number and the row number within the data block.
Because the row's location is known exactly, I/O is minimized when retrieving the row, resulting in a faster query.

Q. A PL/SQL program executes an update statement and it wants to know the number of rows
affected in the database. How can it check this?

Use the SQL%ROWCOUNT attribute to check the number of rows affected, whether the statement is an update,
insert or delete.
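Most database APIs expose the same idea. As a portable illustration of "rows affected by the last DML statement", here is a sketch using Python's sqlite3 cursor.rowcount (table and data hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER, sal NUMERIC)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", [(1, 100), (2, 200), (3, 300)])

# cursor.rowcount plays the role of SQL%ROWCOUNT after a DML statement.
cur = conn.execute("UPDATE emp SET sal = sal * 1.1 WHERE sal >= 200")
print(cur.rowcount)  # 2
```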

Q. What type of exception will be raised in the following situations:

a> select..into statement returns more than one row.

b> select..into statement does not return any row.

c> insert statement inserts a duplicate record.

The errors returned are:


a> TOO_MANY_ROWS

b> NO_DATA_FOUND

c> DUP_VAL_ON_INDEX
Q. Which Oracle object would you use when you want to have incremental (sequential) numbers in a
particular column?

Use a Sequence object for each column that needs incremental values. When the sequence is created, you can
specify its starting value, maximum value and whether it cycles. The PL/SQL program using the sequence
calls the sequence's NEXTVAL function to obtain the next number and inserts it into the column that needs it,
along with the other column values, when inserting a row into the table.
How do you delete duplicate rows (without deleting both copies of each duplicate)?

SET ROWCOUNT 1
DELETE yourtable
FROM yourtable a
WHERE (SELECT COUNT(*) FROM yourtable b
       WHERE b.name1 = a.name1 AND b.age1 = a.age1) > 1
WHILE @@rowcount > 0
DELETE yourtable
FROM yourtable a
WHERE (SELECT COUNT(*) FROM yourtable b
       WHERE b.name1 = a.name1 AND b.age1 = a.age1) > 1
SET ROWCOUNT 0
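A single-statement alternative keeps one row per duplicate group by row identifier rather than looping with SET ROWCOUNT. A sketch using SQLite via Python, where rowid plays that role (table and columns follow the T-SQL example above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (name1 TEXT, age1 INTEGER)")
conn.executemany("INSERT INTO yourtable VALUES (?, ?)",
                 [("a", 1), ("a", 1), ("b", 2), ("a", 1)])

# Keep only the row with the smallest rowid in each (name1, age1) group.
conn.execute("""
    DELETE FROM yourtable
    WHERE rowid NOT IN (SELECT MIN(rowid) FROM yourtable GROUP BY name1, age1)""")
print(conn.execute("SELECT COUNT(*) FROM yourtable").fetchone()[0])  # 2
```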

Find top salary among two tables


SELECT TOP 1 sal
FROM (SELECT MAX(sal) AS sal
FROM sal1
UNION
SELECT MAX(sal) AS sal
FROM sal2) a
ORDER BY sal DESC

How to find nth highest salary?


SELECT TOP 1 salary
FROM (SELECT DISTINCT TOP n salary
FROM employee
ORDER BY salary DESC) a
ORDER BY salary
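The nested TOP n / ORDER BY trick above translates directly to LIMIT/OFFSET on engines that support it. A runnable sketch using SQLite via Python, with hypothetical salaries:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?)",
                 [(100,), (300,), (300,), (200,), (500,)])

n = 3  # the 3rd highest distinct salary
# Sort distinct salaries descending, skip n-1 of them, take the next one.
row = conn.execute(
    "SELECT DISTINCT salary FROM employee "
    "ORDER BY salary DESC LIMIT 1 OFFSET ?", (n - 1,)).fetchone()
print(row[0])  # 200
```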

How to know how many tables contains "Col1" as a column in a database?


SELECT COUNT(*) AS Counter
FROM syscolumns
WHERE (name = 'Col1')

SQL Server Performance Enhancement Tips

SQL Server 2000 has many features which when used wisely, will improve the performance of the queries
and stored procedures considerably.

Index Creation

Table indexing boosts the performance of queries considerably. SQL Server can perform a table scan, or it can use an index. When performing a table scan, SQL Server starts at the beginning of the table, goes row by row, and extracts the rows that meet the criteria of the query. When SQL Server uses an index, it finds the storage location of the rows needed by the query and extracts only the needed rows.

Avoid creating indexes on small tables, since it will take more time for SQL Server to perform an index scan than to perform a simple table scan.

If a table has many transactions happening on it (INSERT/UPDATE/DELETE), keep the number of
indexes minimal. For each transaction, the indexes on the table are reorganized by SQL Server, which
reduces performance.

The Index Tuning Wizard, available in SQL Server Enterprise Manager, is a good tool to create
optimized indexes. You can find it under Tools -> Wizards -> Management -> Index Tuning Wizard
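The table-scan versus index-lookup difference can be observed directly in a query plan. A sketch using SQLite via Python (hypothetical table; EXPLAIN QUERY PLAN is SQLite's plan inspection command, the analog of viewing an execution plan in SQL Server):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(i, f"c{i % 100}") for i in range(1000)])
conn.execute("CREATE INDEX idx_orders_customer ON orders(customer)")

# With the index in place, the plan reports an index search, not a full scan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer = 'c7'").fetchall()
print(plan)
```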

Avoid using “Select *”

Select only the columns that are necessary from the table. Select * always degrades performance of SQL
queries.

Avoid using UNION and try to use UNION All wherever possible

When a UNION operation is performed, SQL Server combines the result sets of the participating queries and
then performs a "SELECT DISTINCT" on them to eliminate duplicate rows.

If UNION ALL is used instead, the "SELECT DISTINCT" step is skipped, which improves performance
considerably. So if you know that your participating queries will not return duplicate rows, use UNION ALL.
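The difference is easy to see in miniature. A sketch using SQLite via Python with two hypothetical tables that share one value:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (v INTEGER);
    CREATE TABLE t2 (v INTEGER);
    INSERT INTO t1 VALUES (1), (2);
    INSERT INTO t2 VALUES (2), (3);
""")

# UNION eliminates the duplicate value 2; UNION ALL keeps both copies.
u    = conn.execute("SELECT v FROM t1 UNION SELECT v FROM t2").fetchall()
uall = conn.execute("SELECT v FROM t1 UNION ALL SELECT v FROM t2").fetchall()
print(len(u), len(uall))  # 3 4
```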

Usage of stored procedures instead of large complex queries

If stored procedures are used instead of very large queries, network traffic is reduced significantly, since only the stored procedure name and its parameters are passed from the application to SQL Server rather than the entire query.

Further, when a stored procedure is executed, SQL Server creates an execution plan and keeps it in memory. Any further execution of the stored procedure is fast because the execution plan is readily available. If ad hoc queries are used instead of stored procedures, an execution plan may be generated by SQL Server each time, which slows the process down.

Restrict number of rows by using Where clause

Whenever the number of rows can be restricted, use a WHERE clause to reduce network traffic and enhance performance.

Instead of Temporary Tables use Table Variables

Table variables require less locking and logging, so they are more efficient than temp tables (# tables).

Avoid using Temp Tables inside stored procedures

If temporary tables are used inside stored procedures, SQL Server may not reuse the execution plan each
time the stored procedure is called. So this will reduce performance.

• What Stored Procedure means

A stored procedure is a database object that contains one or more SQL statements. This section explains how
to create stored procedures and highlights how to use them.

The first time a stored procedure is executed, each SQL statement it contains is compiled and executed to
create an execution plan. The procedure is then stored in compiled form within the database. For each
subsequent execution, the SQL statements are executed without compilation, because they are precompiled.
This makes the execution of a stored procedure faster than the execution of an equivalent SQL script.

To execute a stored procedure you can use EXEC statement.


CREATE PROC spGetAuthors
AS
SELECT * FROM AUTHORS

When you run this script in Pubs database you will get the following message in Query Analyzer.
The Command(s) completed successfully.

Now you are ready to call/execute this procedure from Query Analyzer.

EXEC spGetAuthors

This stored procedure creates a result set and returns it to the client.

You can call a stored procedure from within another stored procedure. You can even call a stored procedure
from within itself; this technique is called recursion. One advantage of using stored procedures is that
application programmers and end users don't need to know the structure of the database or how to code SQL.
Another advantage is that they can restrict and control access to a database. Nowadays everyone is familiar
with SQL injection attacks, and stored procedures are one way to help prevent this kind of malicious attack.

How to Create a Stored Procedure

When the CREATE PROCEDURE statement is executed, the syntax of the SQL statements within the
procedure is checked. If you have made a coding error the system responds with an appropriate message
and the procedure is not created.

The Syntax of the CREATE PROCEDURE statement


CREATE {PROC|PROCEDURE} Procedure_name
[Parameter_declaration]
[WITH {RECOMPILE|ENCRYPTION|RECOMPILE, ENCRYPTION}]
AS sql_statements

You use the CREATE PROCEDURE statement to create a stored procedure in the database. The name of
the stored procedure can be up to 128 characters and is typically prefixed with the letters sp.
The options AS, RECOMPILE and ENCRYPTION each have a specific meaning.
The AS clause contains the SQL statements to be executed by the stored procedure; a stored procedure
must consist of a single batch.
RECOMPILE is used when you want to compile the stored procedure every time it is called, i.e. when you
do not want the execution plan of the stored procedure to be cached in the database.
ENCRYPTION means you want to hide the code so that no one can see it. This is important when you want
to distribute the code across the globe or sell it to other vendors. But make sure you keep the original
source, because once the procedure is encrypted no one can decrypt it.

Apart from stored procedures stored permanently in the database, you can create stored procedures scoped
to your session. As long as the session is alive, the stored procedure is available in the database; once the
session ends, the stored procedure vanishes. Which behavior you get depends on the type of stored
procedure you choose to create.

Stored procedures provide for two different types of parameters: input parameters and output parameters.
An input parameter is passed to the stored procedure from the calling program. An output parameter is
returned to the calling program from the stored procedure. You identify an output parameter with the
OUTPUT keyword; if this keyword is omitted, the parameter is assumed to be an input parameter.
You can declare an input parameter so that it requires a value or so that its value is optional. The value of a
required parameter must be passed to the stored procedure from the calling program, or an error occurs. The
value of an optional parameter doesn't need to be passed from the calling program. You identify an optional
parameter by assigning a default value to it; then, if a value isn't passed from the calling program, the
default value is used. You can also use output parameters as input parameters, that is, you can pass a value
from the calling program to the stored procedure through an output parameter. However, it is not advisable
to pass values in through output parameters.

The syntax for declaring the parameters

@Parameter_name_1 data_type [= default] [OUTPUT]


[, @Parameter_name_2 data_type [= default] [OUTPUT]…

Parameter declarations
@FirstName varchar(50) -- Input parameter that accepts a string.
@LastName varchar(50) -- Output Parameter that returns a string.
Create Procedure statement that uses an input and an output parameter.

CREATE PROC spGetAuthors

@FirstName varchar(50),
@LastName varchar(50) OUTPUT
AS
SELECT @LastName = ln_Name
FROM AUTHORS
WHERE fn_name = @FirstName
Create procedure statement that uses an optional parameter.

CREATE PROC spGetAuthors

@LastName varchar(50) OUTPUT,
@FirstName varchar(50) = 'vijay'
AS
SELECT @LastName = ln_Name
FROM AUTHORS
WHERE fn_name = @FirstName

A stored procedure can declare up to 2100 parameters. If you declare two or more parameters, the
declarations must be separated by commas.

Calling stored procedure with Parameters

To pass parameter values to a stored procedure, you code the values in the EXEC statement after the
procedure name. You can pass the parameters either by position or by name.

Passing parameters by Name:

Write the following code in Query Analyzer


DECLARE @LN VARCHAR(100)
EXEC spGetAuthors @FirstName = 'krishna', @LastName = @LN OUTPUT

Passing parameters by Position:


DECLARE @LN VARCHAR(100)
EXEC spGetAuthors @LN OUTPUT, 'krishna'

In fact you can use both notations to pass parameters to stored procedures when you are calling. To pass
parameters by position, list them in the same order as they appear in the CREATE PROCEDURE statement
and separate them with commas. When you use this technique, you can omit optional parameters only if
they are declared after any required parameters.

To use an output parameter in the calling program, you must declare a variable to store its value. Then you
use the name of the variable in the EXEC statement and you code the OUTPUT keyword after it to identify
it as an output parameter.

Handling error in stored procedure

In addition to passing output parameters back to the calling program, stored procedures also pass back a
return value. By default, this value is zero. If an error occurs during the execution of a stored procedure you
may want to pass a value back to the calling environment that indicates the error that occurred. To do that
you use the RETURN statement and the @@ERROR function.

The @@ERROR system function returns the error number that’s generated by the execution of the most
recent SQL statement. If the value is zero, it means that no error has occurred. The stored procedure listed
below uses this function to test whether a DELETE statement that deletes a row from authors table is
successful.
CREATE PROC spDeleteAuthors @FirstName varchar(50)
AS
DECLARE @ErrorVar int
DELETE FROM AUTHORS WHERE fn_name = @FirstName
SET @ErrorVar = @@ERROR
IF @ErrorVar <> 0
BEGIN
PRINT 'An Unknown Error Occurred'
RETURN @ErrorVar
END

The RETURN statement immediately exits the procedure and returns an optional integer value to the calling
environment. If you don't specify a value in this statement, the return value is zero.

How to delete or change a stored procedure

You use DROP PROC statement to delete one or more stored procedures from database. To redefine the
stored procedure you use ALTER PROC.

The syntax of the DROP PROC statement


DROP {PROC|PROCEDURE} Procedure_name [, …]

The syntax of the ALTER PROC statement


ALTER {PROC|PROCEDURE} Procedure_name
[Parameter_declaration]
[WITH {RECOMPILE|ENCRYPTION|RECOMPILE, ENCRYPTION}]
AS sql_statements

When you delete a procedure, any security permissions that are assigned to the procedure are also deleted. In
that case you will want to use the ALTER PROC statement instead, to modify the procedure while preserving
its permissions.

• Based on this, stored procedures can be divided into 4 types:


System stored procedures
Local stored procedures
Temporary stored procedures
Extended stored procedures

System Stored Procedures:

System stored procedures are stored in the Master database and are typically named with an sp_ prefix. They
can be used to perform a variety of tasks: supporting external application calls for data in the system tables,
general system procedures for database administration, and security management functions.
For example, you can view the text of a stored procedure by calling
sp_helptext [StoredProcedure_Name].

There are hundreds of system stored procedures included with SQL Server. For a complete list of system
stored procedures, refer to "System Stored Procedures" in SQL Server Books Online.

Local stored procedures

Local stored procedures are usually stored in a user database and are typically designed to complete tasks in
the database in which they reside. When coding these procedures, don't use the sp_ prefix for your stored
procedure names, as it creates a performance bottleneck: when you call any procedure prefixed with sp_,
SQL Server first looks for it in the master database and only then in the local user database.
Temporary stored procedures

A temporary stored procedure is almost equivalent to a local stored procedure, but it exists only as long as
SQL Server is running or until the connection that created it is closed. The stored procedure is deleted
at connection termination or at server shutdown. This is because temporary stored procedures are stored in
the TempDB database, and TempDB is re-created when the server is restarted.

There are three types of temporary stored procedures: local, global, and stored procedures created directly
in TempDB.
A local temporary stored procedure always begins with #, and a global temporary stored procedure always
begins with ##. The execution scope of a local temporary procedure is limited to the connection that created
it. All users who have connections to the database, however, can see the stored procedure in Query
Analyzer. There is no chance of name collision between other connections that are creating temporary
stored procedures. To ensure uniqueness, SQL Server appends the name of a local temporary stored
procedure with a series of underscore characters and a connection number unique to the connection.
Privileges cannot be granted to other users for the local temporary stored procedure. When the connection
that created the temporary stored procedure is closed, the procedure is deleted from TempDB.

Any connection to the database can execute a global temporary stored procedure. This type of procedure
must have a unique name, because all connections can execute the procedure and, like all temporary stored
procedures, it is created in TempDB. Permission to execute a global temporary stored procedure is
automatically granted to the public role and cannot be changed. A global temporary stored procedure is
almost as volatile as a local temporary stored procedure. This procedure type is removed when the
connection used to create the procedure is closed and any connections currently executing the procedure
have completed.

Temporary stored procedures created directly in TempDB differ from local and global temporary
stored procedures in the following ways:

You can configure permissions for them.
They exist even after the connection used to create them is terminated.
They aren't removed until SQL Server is shut down.

Because this procedure type is created directly in TempDB, it is important to fully qualify the database
objects referenced by Transact-SQL commands in the code. For example, you must reference the Authors
table, which is owned by dbo in the Pubs database, as pubs.dbo.authors.

--create a local temporary stored procedure.


CREATE PROCEDURE #tempAuthors
AS
SELECT * from [pubs].[dbo].[authors]

--create a global temporary stored procedure.


CREATE PROCEDURE ##tempAuthors
AS
SELECT * from [pubs].[dbo].[authors]

--create a temporary stored procedure that is local to tempdb.


CREATE PROCEDURE directtemp
AS
SELECT * from [pubs].[dbo].[authors]

Extended Stored Procedures

An extended stored procedure uses an external program, compiled as a 32-bit dynamic link library (DLL),
to expand the capabilities of a stored procedure. A number of system stored procedures are also classified
as extended stored procedures. For example, the xp_sendmail program, which sends a message and a query
result set attachment to the specified e-mail recipients, is both a system stored procedure and an extended
stored procedure. Most extended stored procedures use the xp_ prefix as a naming convention. However,
there are some extended stored procedures that use the sp_ prefix, and there are some system stored
procedures that are not extended and use the xp_ prefix. Therefore, you cannot depend on naming
conventions to identify system stored procedures and extended stored procedures.
Use the OBJECTPROPERTY function to determine whether a stored procedure is extended or not.
OBJECTPROPERTY returns a value of 1 for IsExtendedProc, indicating an extended stored procedure, or
returns a value of 0, indicating a stored procedure that is not extended.
USE Master
SELECT OBJECTPROPERTY(object_id('xp_sendmail'), 'IsExtendedProc')
1. To see the current user name
SQL> show user;

2. Change SQL prompt name


SQL> set sqlprompt "Manimara > "
Manimara >
Manimara >

3. Switch to DOS prompt


SQL> host

4. How do I eliminate duplicate rows?


SQL> delete from table_name where rowid not in (select max(rowid) from table_name group by
duplicate_values_field_name);
or
SQL> delete from table_name ta where rowid < (select max(rowid) from table_name tb where
ta.duplicate_values_field_name = tb.duplicate_values_field_name);
Example.
Table Emp
Empno Ename
101 Scott
102 Jiyo
103 Millor
104 Jiyo
105 Smith
delete from emp a where rowid > (select min(rowid) from emp b where a.ename = b.ename);
The output like,
Empno Ename
101 Scott
102 Jiyo
103 Millor
105 Smith

5. How do I display row number with records?


To achieve this, use the ROWNUM pseudocolumn in the query: SQL> select rownum, ename from emp;
Output:
1 Scott
2 Millor
3 Jiyo
4 Smith

6. Display the records between two range


select rownum, empno, ename from emp where rowid in
(select rowid from emp where rownum <=&upto
minus
select rowid from emp where rownum<&Start);
Enter value for upto: 10
Enter value for Start: 7

ROWNUM EMPNO ENAME


--------- --------- ----------
1 7782 CLARK
2 7788 SCOTT
3 7839 KING
4 7844 TURNER

7. I know the NVL function only allows the same data type (i.e. number or char or date, as in
nvl(comm, 0)). If commission is null, I want to display the text "Not Applicable" instead of a blank
space. How do I write the query?

SQL> select nvl(to_char(comm),'NA') from emp;

Output :

NVL(TO_CHAR(COMM),'NA')
-----------------------
NA
300
500
NA
1400
NA
NA

8. Oracle cursor : Implicit & Explicit cursors


Oracle uses work areas called private SQL areas to execute SQL statements.
The PL/SQL construct used to identify each such work area is called a cursor.
For SQL queries returning a single row, PL/SQL declares an implicit cursor.
For queries that return more than one row, a cursor needs to be explicitly declared.

9. Explicit Cursor attributes


There are four cursor attributes used in Oracle
cursor_name%Found, cursor_name%NOTFOUND, cursor_name%ROWCOUNT, cursor_name%ISOPEN

10. Implicit Cursor attributes


Same as explicit cursor but prefixed by the word SQL

SQL%Found, SQL%NOTFOUND, SQL%ROWCOUNT, SQL%ISOPEN

Tips : 1. SQL%ISOPEN is always false, because Oracle automatically closes the implicit cursor after
executing the SQL statement.
: 2. All of these are Boolean attributes except SQL%ROWCOUNT, which returns a number.

11. Find out nth highest salary from emp table


SELECT DISTINCT (a.sal) FROM EMP A WHERE &N = (SELECT COUNT (DISTINCT (b.sal)) FROM
EMP B WHERE a.sal<=b.sal);

Enter value for n: 2


SAL
---------
3700
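The correlated count-of-distinct-higher-salaries pattern is portable. A sketch in SQLite via Python, with sample salaries assumed for illustration (n = 2 finds the second highest distinct salary):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, sal INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?)",
                 [("KING", 5000), ("SCOTT", 3700), ("FORD", 3700),
                  ("BLAKE", 2850), ("CLARK", 2450)])

n = 2  # a salary is nth highest when exactly n distinct salaries are >= it
rows = conn.execute("""
    SELECT DISTINCT a.sal FROM emp a
    WHERE ? = (SELECT COUNT(DISTINCT b.sal) FROM emp b WHERE a.sal <= b.sal)
""", (n,)).fetchall()
print(rows)  # the 2nd highest distinct salary
```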
12. To view installed Oracle version information
SQL> select banner from v$version;

13. Display the number value in Words


SQL> select sal, (to_char(to_date(sal,'j'), 'jsp'))
from emp;
The output looks like:

SAL (TO_CHAR(TO_DATE(SAL,'J'),'JSP'))
--------- -----------------------------------------------------
800 eight hundred
1600 one thousand six hundred
1250 one thousand two hundred fifty
If you want to add some text like,
Rs. Three Thousand only.
SQL> select sal "Salary ",
(' Rs. '|| (to_char(to_date(sal,'j'), 'Jsp'))|| ' only.')
"Sal in Words" from emp
/
Salary Sal in Words
------- ------------------------------------------------------
800 Rs. Eight Hundred only.
1600 Rs. One Thousand Six Hundred only.
1250 Rs. One Thousand Two Hundred Fifty only.
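Oracle does the spelling out server-side via the 'jsp' date mask. As a rough illustration of what that mask computes, here is a minimal Python spell-out for values below one million (the function name and wording style are my own, not Oracle's exact output rules):

```python
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def spell(n):
    """Spell out 0 <= n < 1,000,000 in words, similar to Oracle's 'jsp' mask."""
    if n == 0:
        return "zero"
    parts = []
    if n >= 1000:
        parts.append(spell(n // 1000) + " thousand")
        n %= 1000
    if n >= 100:
        parts.append(ONES[n // 100] + " hundred")
        n %= 100
    if n >= 20:
        word = TENS[n // 10]
        if n % 10:
            word += "-" + ONES[n % 10]
        parts.append(word)
    elif n > 0:
        parts.append(ONES[n])
    return " ".join(parts)

for sal in (800, 1600, 1250):
    print(sal, "Rs. " + spell(sal).title() + " only.")
```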

14. Display odd/even-numbered records


Odd number of records:
select * from emp where (rowid,1) in (select rowid, mod(rownum,2) from emp);
1
3
5
Even number of records:
select * from emp where (rowid,0) in (select rowid, mod(rownum,2) from emp)
2
4
6
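With window functions (SQLite 3.25+ shown from Python as a stand-in), the same alternate-row selection reads more directly: number the rows, then keep positions with the wanted parity:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?)", [(i,) for i in range(1, 7)])

# Pair each row with its position, then filter on position mod 2.
odd = conn.execute("""
    SELECT empno FROM (
        SELECT empno, row_number() OVER (ORDER BY rowid) AS rn FROM emp
    ) WHERE rn % 2 = 1
""").fetchall()
even = conn.execute("""
    SELECT empno FROM (
        SELECT empno, row_number() OVER (ORDER BY rowid) AS rn FROM emp
    ) WHERE rn % 2 = 0
""").fetchall()
print(odd)   # rows in positions 1, 3, 5
print(even)  # rows in positions 2, 4, 6
```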

15. Which date function returns number value?


months_between

16. Any three PL/SQL Exceptions?


Too_many_rows, No_Data_Found, Value_Error, Zero_Divide, Invalid_Number

17. What are PL/SQL Cursor Exceptions?


Cursor_Already_Open, Invalid_Cursor

18. Another way to replace a null query result value with text (a SQL*Plus setting)
SQL> set null 'N/A'
to reset: SQL> set null ''

19. What are the more common pseudo-columns?


SYSDATE, USER, UID, CURRVAL, NEXTVAL, ROWID, ROWNUM

20. What is the output of SIGN function?


1 for positive value,
0 for Zero,
-1 for Negative value.
21. What is the maximum number of triggers that can be applied to a single table?
12 triggers (3 triggering statements x before/after timing x row/statement level).

Q. I want to use a Primary key constraint to uniquely identify all employees, but I also want
to check that the values entered are in a particular range, and are all non-negative. How
can I set up my constraint to do this?

A. It’s a simple matter of setting up two constraints on the employee id column: the primary key
constraint to guarantee unique values, and a check constraint to set up the rule for a valid, non-
negative range; for example:

Create Table Employee ( Emp_Id Integer Not Null Constraint PkClustered Primary Key
Constraint ChkEmployeeId Check ( Emp_Id Between 1 And 1000) )

Or, if you want to create table-level constraints (where more than one column defines the primary
key):

Create Table Employee ( Emp_Id Int Not Null , Emp_Name VarChar(30) Not Null,
Constraint PkClustered Primary Key (Emp_Id),
Constraint ChkEmployeeId Check ( Emp_Id Between 0 And 1000) )
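The two-constraint combination behaves the same way in most engines. A quick SQLite sketch (from Python) shows one in-range insert succeeding while an out-of-range id and a duplicate id are both rejected:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Primary key for uniqueness plus a CHECK constraint for the valid range.
conn.execute("""
    CREATE TABLE Employee (
        Emp_Id   INTEGER NOT NULL PRIMARY KEY
                 CHECK (Emp_Id BETWEEN 1 AND 1000),
        Emp_Name TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO Employee VALUES (1, 'Smith')")  # accepted

rejected = []
# First insert violates the CHECK range, second violates the primary key.
for emp_id, name in [(-5, "Jones"), (1, "Brown")]:
    try:
        conn.execute("INSERT INTO Employee VALUES (?, ?)", (emp_id, name))
    except sqlite3.IntegrityError as exc:
        rejected.append((emp_id, str(exc)))

print(rejected)  # both offending inserts were refused
```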

Q. What’s a quick way of outputting a script to delete all triggers in a database (I don’t
want to automatically delete, just review the scripts, and selectively delete based on the
trigger name)?

A. There are probably a number of ways, but this may be a job for a cursor. Try the following
Transact SQL script:

Declare @Name VarChar (50)


Declare Cursor_ScriptTriggerDrop Cursor
For Select name From sysObjects Where type = 'TR'

Open Cursor_ScriptTriggerDrop
Fetch Next From Cursor_ScriptTriggerDrop Into @Name
While @@Fetch_Status = 0
Begin
Print 'If Exists (Select name From sysObjects Where name = ' + Char(39)
+ @Name + Char(39) + ' and type = ' + Char(39) + 'TR' + Char(39) + ')'
Print 'Drop Trigger ' + @Name
Print 'Go'
Fetch Next From Cursor_ScriptTriggerDrop Into @Name
End
Close Cursor_ScriptTriggerDrop
DeAllocate Cursor_ScriptTriggerDrop
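The underlying idea (read trigger names from the catalog, then emit a guarded DROP script for review rather than immediate execution) can be sketched outside T-SQL as well; in SQLite the catalog table is sqlite_master rather than sysobjects:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A sample table with two triggers, purely for demonstration.
conn.executescript("""
    CREATE TABLE t (x INTEGER);
    CREATE TRIGGER trg_ins AFTER INSERT ON t BEGIN SELECT 1; END;
    CREATE TRIGGER trg_del AFTER DELETE ON t BEGIN SELECT 1; END;
""")

# List trigger names from the catalog, then build a reviewable drop script.
names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'trigger' ORDER BY name")]
script = "\n".join(f"DROP TRIGGER IF EXISTS {name};" for name in names)
print(script)  # one guarded DROP statement per trigger
```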

Q. Is there a function I can use to format dates on the fly as they’re added to a resultset? I
need to support a large range of formatted outputs.

A. Take a deep breath and build it yourself. Here’s a useful function to give you sophisticated
formatting abilities:

Use Pubs
Go
If Object_Id ( 'fn_FormatDate' , 'Fn' ) Is Not Null Drop Function fn_FormatDate
Go
Create Function fn_FormatDate
(
@Date DateTime --Date value to be formatted
,@Format VarChar(40) --Format to apply
)
Returns VarChar(40)
As
Begin

-- Insert Day:
Set @Format = Replace (@Format , 'DDDD' , DateName(Weekday , @Date))
Set @Format = Replace (@Format , 'DDD' ,
Convert(Char(3),DateName(Weekday , @Date)))
Set @Format = Replace (@Format , 'DD' , Right(Convert(Char(6) , @Date,12) ,
2))
Set @Format = Replace (@Format , 'D1' , Convert(VarChar(2) , Convert(Integer
, Right(Convert(Char(6) , @Date , 12) , 2))))

-- Insert Month:
Set @Format = Replace (@Format , 'MMMM', DateName(Month , @Date))
Set @Format = Replace (@Format , 'MMM', Convert(Char(3) , DateName(Month
, @Date)))
Set @Format = Replace (@Format , 'MM',Right(Convert(Char(4) , @Date,12),2))
Set @Format = Replace (@Format , 'M1',Convert(VarChar(2) , Convert(Integer ,
Right(Convert(Char(4) , @Date,12),2))))

-- Insert the Year:


Set @Format = Replace (@Format,'YYYY' , Convert(Char(4) , @Date , 112))
Set @Format = Replace (@Format,'YY' , Convert(Char(2) , @Date , 12))

-- Return function value:


Return @Format
End
Go

-- Examples:

Set NoCount On

Select dbo.fn_FormatDate(Ord_Date,'dddd, mmmm d1, yyyy') From Pubs..Sales
Where stor_id = 6380 AND ord_num = '6871'
Select dbo.fn_FormatDate(Ord_Date,'mm/dd/yyyy') From Pubs..Sales
Where stor_id = 6380 AND ord_num = '6871'
Select dbo.fn_FormatDate(Ord_Date,'mm-dd-yyyy') From Pubs..Sales
Where stor_id = 6380 AND ord_num = '6871'
Select dbo.fn_FormatDate(Ord_Date,'yyyymmdd') From Pubs..Sales
Where stor_id = 6380 AND ord_num = '6871'
Select dbo.fn_FormatDate(Ord_Date,'mmm-yyyy') From Pubs..Sales
Where stor_id = 6380 AND ord_num = '6871'

Set NoCount Off
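The function above is essentially an ordered chain of token replacements, longest tokens first so that 'MMMM' is consumed before 'MM'. A Python sketch of the same technique (uppercase tokens only, since Python's str.replace is case-sensitive, unlike the default T-SQL Replace):

```python
from datetime import date

def format_date(d, fmt):
    """Apply an ordered chain of token replacements, mirroring fn_FormatDate."""
    tokens = [  # longest tokens first so 'MMMM' is consumed before 'MM'
        ("DDDD", d.strftime("%A")),        # full weekday name
        ("DDD",  d.strftime("%a")),        # abbreviated weekday name
        ("DD",   f"{d.day:02d}"),          # zero-padded day
        ("D1",   str(d.day)),              # day without padding
        ("MMMM", d.strftime("%B")),        # full month name
        ("MMM",  d.strftime("%b")),        # abbreviated month name
        ("MM",   f"{d.month:02d}"),        # zero-padded month
        ("M1",   str(d.month)),            # month without padding
        ("YYYY", f"{d.year:04d}"),         # four-digit year
        ("YY",   f"{d.year % 100:02d}"),   # two-digit year
    ]
    for token, value in tokens:
        fmt = fmt.replace(token, value)
    return fmt

print(format_date(date(1994, 9, 14), "DDDD, MMMM D1, YYYY"))
print(format_date(date(1994, 9, 14), "MM/DD/YYYY"))
```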

Q. If SQL Server does on-the-fly caching and parameterisation of ordinary SQL statements,
do I need to build libraries of stored procedures with input parameters that simply wrap
the SQL statement itself?

A. I’m sure this question has come up before, relating to SQL Server 7, but it remains a good
question. The answer’s ‘yes and no’ (though more yes than no). It’s true that both SQL Server 7
and 2000 do ad hoc caching/ parameterization of SQL statements, but there are a couple of
‘gotchas’ attached. First of all the following two statements will generate an executable plan in
SQL Server’s cache with a usage count of two (i.e. the plan is put in cache and reused by the
second statement):

Select Title From Pubs.Dbo.Titles Where Price = £20


Select Title From Pubs.Dbo.Titles Where Price = £7.99

However if the currency indicator is dropped, one plan is marked with an Integer parameter, the
other with a Numeric parameter (as each is auto-parameterized). Worse still, if the case of any
part of the column name(s), database, owner, or object name(s) changes, in a following SQL
statement, SQL Server fails to recognize the structural identity of the next statement and
generates another plan in cache.

To check this out, run both statements (with currency marker and identical case) along with the
following call:

Select CacheObjType , UseCounts , Sql From Master..sysCacheObjects

(If possible run Dbcc FreeProcCache before the three SQL statements, but beware: it drops all
objects in cache).

You should find that the executable plan has been found and reused (the UseCounts value will be
‘2’). Now, either drop the currency indicators or change the case in one of the ‘Select Title … ‘
statements and note that a fresh executable (and compiled) plan is generated for the second
statement.

Now create a stored procedure with Price as an input parameter, run it with or without currency
markers, changing the case of the call at will, and the procedure will be found and reused – check
the UseCounts column value.
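A toy model of the matching behaviour described above: treat the plan cache as a dictionary keyed by the exact statement text. Identical statements share one plan and bump its use count, while a mere change of case yields a different key and therefore a second plan (the names here are illustrative, not SQL Server internals):

```python
plan_cache = {}

def get_plan(sql):
    """Fetch a cached 'plan' for sql, compiling on first sight, counting reuse."""
    entry = plan_cache.setdefault(sql, {"plan": "<compiled>", "use_counts": 0})
    entry["use_counts"] += 1
    return entry["plan"]

get_plan("Select Title From Pubs.Dbo.Titles Where Price = $20")
get_plan("Select Title From Pubs.Dbo.Titles Where Price = $20")   # exact match: reused
get_plan("select title from pubs.dbo.titles where price = $20")   # case change: new plan

print(len(plan_cache))                                  # two distinct plans
print([e["use_counts"] for e in plan_cache.values()])   # first plan reused twice
```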

Q. How can I get a quick list of all the options set for a particular session?

A. Here’s a stored procedure which will give you a list of all options set in a session. If you create
it in Master, with the ‘sp_’ prefix it’ll be available within any database:

Create Procedure sp_GetDBOptions


As
Set NoCount On
/* Create temporary table to hold values */
Create Table #Options
(OptId Integer Not Null , Options_Set VarChar(25) Not Null )

If @@Error <> 0
Begin
Raiserror('Failed to create temporary table #Options',16,1)
Return(@@Error)
End

/* Insert values into the Temporary table */


Insert Into #Options Values (0,'NO OPTIONS SET')
Insert Into #Options Values (1,'DISABLE_DEF_CNST_CHK')
Insert Into #Options Values (2,'IMPLICIT_TRANSACTIONS')
Insert Into #Options Values (4,'CURSOR_CLOSE_ON_COMMIT')
Insert Into #Options Values (8,'ANSI_WARNINGS')
Insert Into #Options Values (16,'ANSI_PADDING')
Insert Into #Options Values (32,'ANSI_NULLS')
Insert Into #Options Values (64,'ARITHABORT')
Insert Into #Options Values (128,'ARITHIGNORE')
Insert Into #Options Values (256,'QUOTED_IDENTIFIER')
Insert Into #Options Values (512,'NOCOUNT')
Insert Into #Options Values (1024,'ANSI_NULL_DFLT_ON')
Insert Into #Options Values (2048,'ANSI_NULL_DFLT_OFF')

If @@Options <> 0
Select Options_Set
From #Options
Where (OptId & @@Options) > 0
Else
Select Options_Set
From #Options
Where Optid = 0

Set NoCount Off
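The procedure above is at heart a bitmask decode, and the same test can be written directly: AND the @@Options value with each flag and keep the hits. A sketch in Python:

```python
# Flag values mirror the #Options rows in the stored procedure above.
OPTIONS = {
    1: "DISABLE_DEF_CNST_CHK",
    2: "IMPLICIT_TRANSACTIONS",
    4: "CURSOR_CLOSE_ON_COMMIT",
    8: "ANSI_WARNINGS",
    16: "ANSI_PADDING",
    32: "ANSI_NULLS",
    64: "ARITHABORT",
    128: "ARITHIGNORE",
    256: "QUOTED_IDENTIFIER",
    512: "NOCOUNT",
    1024: "ANSI_NULL_DFLT_ON",
    2048: "ANSI_NULL_DFLT_OFF",
}

def options_set(value):
    """Return the option names whose bit is set in an @@Options-style integer."""
    if value == 0:
        return ["NO OPTIONS SET"]
    return [name for bit, name in OPTIONS.items() if value & bit]

# e.g. a session with ANSI_WARNINGS, ANSI_PADDING, ANSI_NULLS, QUOTED_IDENTIFIER:
print(options_set(8 + 16 + 32 + 256))
```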

Q. Is it possible to get a list of all of the system tables and views that are in Master only?

A. Perfectly easy, I’ll even order the output by type and name:

Select type, name From Master..sysObjects Where Type In ('S', 'V')


AND name Not In (Select name From Model..sysObjects) order by type, name

Q. Is there an easy way to get a list of all databases a particular login can access?

A. Yes, there’s an undocumented procedure called sp_MsHasDbAccess which gives precisely the
information you want. Use the SetUser command to impersonate a user (SetUser loginname) and
run sp_MsHasDbAccess with no parameters. To revert to your default status, run SetUser with no
parameters.

Other recommended sources for Microsoft SQL FAQs:


http://support.microsoft.com/support/default.asp?PR=sql&FR=0&SD=MSDN&LN=EN-US (for
FAQ and Highlights for SQL)
http://support.microsoft.com/view/dev.asp?id=hl&pg=sql.asp for Microsoft technical advice)
http://support.microsoft.com/support/sql/70faq.asp (UK based)
http://www.mssqlserver.com/faq/ (an independent support company)
http://www.sqlteam.com/ (a good source for answers to SQL questions)
http://www.sql-server-performance.com/default.asp (for SQL performance issues)
http://www.sql-server.co.uk/frmMain.asp (UK SQL user group - need to do free registration)
http://www.advisor.com/wHome.nsf/w/MMB (a useful US based VB/SQL Magazine)
http://www.advisor.com/wHome.nsf/w/JMSS (a useful US based SQL Magazine)
http://secure.duke.com/nt/sqlmag/newsub.cfm?LSV=NunLjpaxugA-hA9hauyikjkarnlkTKgEAQ&Program=9 (another useful SQL magazine)

SQL FAQ
Details of "Frequently Asked Questions" (FAQ) dealing with common SQL Server 7.0 problems.

SQL Server 2000 Frequently Asked Questions

Q. Does SQL Server still use Transact SQL?

A. Yes, Transact SQL is still the language engine for SQL Server. There are a number of
extensions to support the new features of SQL Server 7, but routines written for earlier versions
should still run as before.

SQL Server 7 does automate a number of administrative chores however, so you should check to
see if it’s still necessary to run all of your TSQL scripts.

Q. Does SQL Server support the ANSI standard?

A. Yes, SQL Server 7 complies with the entry level of the ANSI SQL-92 standard. There are, of
course, a number of proprietary extensions which provide extra functionality – for example the
multi-dimensional RollUp and Cube extensions.

Q. Can I still run SQL Server from DOS and Windows 3.x?

A. Yes. Both can still act as client operating systems. Of course there are a number of features
that won’t be available (for example, the Enterprise Manager, which requires a Windows 9.x or
NT client operating system).

Remember that you can run SQL Server 7’s server components on a Windows 9.x machine, but
security is less strictly applied on Windows 9.x systems.

Q. What’s happened to the SQL Executive service?

A. SQL Agent now replaces the SQL Executive service which controlled scheduling operations
under earlier versions of SQL Server. SQL Agent is far more flexible, allowing Jobs (formerly
Tasks) to be run as a series of steps.

MSDB remains the scheduling database, storing all of SQL Agent’s scheduling information.

Q. ISQL seems to have been replaced by OSQL. What’s the difference, if any?

A. ISQL is still there, and it continues to use ODBC to connect to SQL Server. However SQL
Server 7 has been designed to use OLE DB as its data access interface. OLE DB is designed to
be faster and more flexible than ODBC.
While ODBC allows access to SQL data only, OLE DB can work with both structured and
unstructured data. OLE DB is a C-like interface, but developers using Visual Studio have access
to a code wrapper known as Active Data Objects.

Q. I’m presently using the RDO (Remote Data Object) library to work with SQL Server data.
Should I migrate to ADO or wait and see?

A. RDO is a tried and tested data access technology, that’s been tweaked and tuned over two
versions. If you’re developing a new system from scratch, consider using ADO, as its object
model is flatter and more streamlined than RDO, but if you’re supporting a large, complex suite of
applications using RDO to interface with SQL Server, adopt a more cautious, incremental
approach. Consider migrating portions of your application suite to ADO, perhaps encapsulating
particular functions in classes.

Q. How has the Distributed Management Framework changed in SQL Server 7?

A. The Distributed Management Framework uses SQL Server’s own object library to make
property and method calls to SQL Server components. SQL Server 6.5 used the SQLOLE object
library (SqlOle65.dll); SQL Server 7 uses a new version, the SQLDMO library (SqlDmo.Enu).

Certain collections are no longer supported (notably the Devices collection) and others have
changed their name. You can still choose to manipulate SQL Server using the new, snap-in
Enterprise Manager (which makes the SQLDMO calls for you) or make the calls yourself from an
ActiveX-compliant client.

Q. How can I test the Authentication mode of a particular SQL Server? I don’t want to pass
a user Id and password across the network if I don’t have to.

A. If you can call SQLDMO functions directly, you’ll find an Enumeration (a group of SQL Server
constants) called SQLDMO_LOGIN_TYPE. This Enumeration supports three login modes: (1)
SQLDMOLogin_Standard, (2) SQLDMOLogin_NTUser, and (3) SQLDMOLogin_NTGroup.

You need to create an instance of the enumeration, test the login type and pass the appropriate
values.

Q. I have a group of TSQL scripts that query SQL Server’s system tables. Can I continue to
use them with SQL Server 7?

A. Yes, but be aware this isn’t good practice. If Microsoft change the structure of system tables in
a later release your scripts may fail or provide inaccurate information. Consider instead using the
new information schema views (for example information_schema.tables). These are system table
independent, but still give access to SQL Server metadata.

Q. I’ve tried to load SQL Server 7, but the setup insists on loading Internet Explorer 4.01
"or later". I’m perfectly happy with IE3; do I have to upgrade to IE4?

A. Unfortunately, yes. All operating systems require IE4.01 or later to be loaded before SQL
Server 7 can be installed.

Q. Can I speed up SQL Server sort operations by choosing a non-default Sort Order?
A. In short, yes. The default remains Dictionary Order, Case Insensitive. A binary sort will be
faster, but consider that you may not get result sets back in the order you expect. It’s up to you to
check the sequence in which character data is returned.

Note also that changing the sort order after installation requires that you rebuild all your
databases, and you can’t carry out backups and restores between SQL Servers with differing Sort
Orders.

Q. Do I need to set the same Sort Order for both Unicode and non-Unicode data?

A. It’s recommended that you adopt the same Sort Order for both Unicode and non-Unicode data
as you may experience problems migrating non-Unicode data to Unicode. In addition Unicode
and non-Unicode data may sort differently.

Q. I’m having problems sending Email notifications to other SQL Servers. What could be
the problem?

A. It could be a host of problems, but the most common is running under a LocalSystem account
rather than a Domain User account. Under the LocalSystem account you can’t communicate with
another server expecting a Trusted Connection.

Q. Can I still use the Net Start command to start SQL Server in single-user mode or with a
minimal configuration?

A. Yes. To start SQL Server in single-user mode the syntax is net start mssqlserver –m typed at
the Command Prompt. To start SQL Server with the minimum set of drivers loaded, type net start
mssqlserver –f at the Command Prompt.

Q. After registering a SQL Server 7 I noticed the system databases weren’t visible in the
Enterprise Manager. Have they been removed from the Enterprise Manager interface?

A. No, they’re still available to manage via the Enterprise Manager, but by default they’re not
displayed.

Q. Under SQL Server 6.5 I managed Permissions using Grant and Revoke statements. How
does the new Deny statement work?

A. A Revoke used to be the opposite of a Grant – removing a Permission to do something. Under
SQL Server 7 a permission is removed by a Deny statement and the Revoke statement has
become ‘neutral’. In other words you can Revoke a Grant or Revoke a Deny.

Other points to note:

SQL Server 7 places an icon in the System Tray on your TaskBar allowing a visual check of the
current status of either the Server, SQL Agent or the Distributed Transaction Co-ordinator.

SQL Server, SQL Agent or the Distributed Transaction Co-ordinator can all be started, paused,
and stopped using a right mouse-click on the System Tray icon.

The SQL Server Distributed Management Object (SQLDMO) library can be directly programmed
to carry out almost any SQL Server, SQL Agent, or DTC task by manipulating properties and
methods.
To give a visual display of the Execution Plan adopted by SQL Server, type the query into the
Microsoft SQL Server Query Analyser and select the toolbar icon 'Display SQL Execution Plan'.

To output statistics from particular parts of the Execution Plan generated by the SQL Server
Query Analyser, pause the pointer over the appropriate icon. For instance Index statistics,
including the individual and total subtree costs, can be displayed by selecting the icon captioned
with the index name.

The output from the SQL Server Query Analyser can be run into a grid, or output in raw format to
the results pane. To generate the output in a grid, compose the query and click the toolbar icon
'Execute Query into Grid'.

When creating a new SQL Server database you have the option to increase the database size by
fixed increments, or by a percentage of the current size of the database.

SQL Server automatically performs query parallelism with multi-processor computers. Operators
provide process management, data redistribution, and flow control for queries which would
benefit from query parallelism.

To determine who's currently logged on, use the system procedure sp_who.


SQL Server 7 automatically adjusts its memory requirements according to the current load on the
operating system.

SQL Server 7 removes the administrative chore of periodically updating statistics by a
combination of table, row and index sampling.

Connections to SQL Server should now be routed through the Microsoft Active Data Objects. SQL
Server 7 provides the ADO 2.0 library which supersedes ODBC. Connection is made by calling
an ADO Provider.

SQL Server 7 now provides full, cost-based locking: automatically de-escalating to a single-row
lock as well as escalating to a Table lock.

Server Roles allow extensive privileges in one area of SQL Server Administration to be assigned,
e.g. Process Administration, while implicitly denying all other Administration rights. The task of
System Administrator can now be compartmentalised into a number of discrete, mutually-
exclusive roles.

Single queries can now reference 32 tables, and the number of internal work tables used by a
query is no longer limited to 16 as in earlier versions.

SQL Server tables now support multiple triggers of the same type - e.g. several Insert triggers on
the same table. This allows greater flexibility of Business Rule processing.

The SQL Server Page size has increased to 8K removing the upper limit of 1960 bytes per data
row.

SQL Server Extents are now 64K in size (increasing by a factor of eight). Multiple objects can
now share the same Extent until they grow large enough to be allocated their own Extent.
'Uniform' extents are owned by a single object, all eight pages in the extent can only be used by
the owning object. 'Mixed' extents are shared by up to eight objects. A new table or index is
allocated pages from mixed extents. When the table or index grows to the point that it has eight
pages, it is switched to a uniform extent.

SQL Server Tables now support up to 1024 columns.


The SQL Server ODBC Driver now supports the ODBC 3.5 API. The driver has also been
enhanced to support the bulk copy functions originally introduced in DB-Library, and now
possesses the ability to get metadata for linked tables used in distributed queries.

The SqlServer Distributed Management Object (SQLDMO) library can now be called using either
of the two major scripting languages: VBScript or Jscript.

SQL Server 7 now supports Unicode data types: nchar, nvarchar, ntext. These DataTypes are
exactly the same as char, varchar, and text, except for the wider range of characters supported
and the increased storage space used. Data can be stored in multiple languages within one
database, getting rid of the problem of converting characters and installing multiple code pages.

The maximum length of the char, varchar, and varbinary data types has been increased to 8000
bytes, substantially greater than the previous limit of 255 bytes in SQL Server 6.0/6.5. This growth
in size allows Text and Image datatypes to be reserved for very large data values.

SQL Server 7 allows the SubString function to be used in the processing of both text and image
data values.

GUIDs (128-bit Globally Unique Identifiers) are now automatically generated via the
UniqueIdentifier DataType.

Most of SQL Server's functionality is now supported on Windows 95/98. Exceptions are
processes like Symmetric Multiprocessing, Asynchronous I/O, and integrated security, supported
on NT platforms.

SQL Server's new Replication Wizard automates the task of setting up a distributed solution.
Replication is now a simple task, and is significantly easier to set up, administer, deploy, monitor,
and troubleshoot.

SQL Server 7 has introduced a new Replication model 'Merge Replication' allowing 'update
anywhere' capability. Use Merge Replication with care however, as it does not guarantee
transactional consistency, as does the traditional transactional replication.

SQL Server Tasks have now become multistep Jobs, allowing the Administrator to schedule the
job, manage job step flow, and store job success or failure information in one central location.

Indexes are substantially more efficient in SQL Server 7. In earlier versions of SQL Server,
nonclustered indexes used physical record identifiers (page number, row number) as row
locators, even if a Clustered index had been built. Now, if a table has a clustered index (and thus
a clustering key), the leaf nodes of all nonclustered indexes use the clustering key as the row
locator rather than the physical record identifier. Of course, if a table does not have a clustered
index, nonclustered indexes continue to use the physical record identifiers to point to the data
pages.

Setting up SQL Server to use its LocalSystem account restricts SQL Server to local processes
only. The LocalSystem account has no network access rights at all.

When setting up replication, it's sensible to set up the Publisher and all its Subscribers to share
the same account.

Ensure you develop the appropriate standard for SQL Server's character set. If you need to
change it later, you must rebuild the databases and reload the data. Server-to-server activities
may fail if the character set is not consistent across servers within your organisation.

The SQL Server Upgrade Wizard can be run either following SetUp or later, at your convenience.
If appropriate, you can choose to Autostart any or all of the three SQL Server processes: the
MSSQLServer itself, SQLAgent, or the Distributed Transaction Co-ordinator.

Rather than backing up data to a device such as disk or tape, SQL Server backs up data through
shared memory to a virtual device. The data can then be picked up from the virtual device and
backed up by a custom application.

Ensure you select the correct sort order when you first install SQL Server. If you need to change
sort orders after installation, you must rebuild your databases and reload your data.

The simplest and fastest sort order is 'Binary'. The collating sequence for this sort order is based
on the numeric value of the characters in the installed character set. Using the binary sort order,
upper-case Z sorts before lower-case a because the character Z precedes the character a in all
character sets.

Remember that SQL Server's options for dictionary sort orders (Case-sensitive, Case-Insensitive)
carry with them a trade-off in performance.

Since SQL Server 6.0, login passwords have been encrypted. To check this, look in Master's
syslogins table. What appears to be garbled text is actually a textual representation of the binary,
encrypted password.

When using Windows 95/98 clients, client application and server-to-server connections must be
over TCP/IP Sockets instead of Named Pipes. Named Pipes is not an available protocol in
Windows 95/98.

To help trap the cause of SQL Server errors, while the error dialog is still showing, look at Sqlstp.log
in the \Windows or \WinNT directory. Check the last few events recorded in the log to see if any
problems occurred before the error message was generated.

When upgrading from an earlier version of SQL Server, the Upgrade Wizard estimates only the
disk space required. It cannot give an exact requirement.

When upgrading from SQL Server 6.x, remember to set TempDb to at least 25 MB in your SQL
Server 6.x installation.

Forget memory tuning! Unlike SQL Server 6.x, SQL Server 7.0 can dynamically size memory
based on available system resources.

SQL Server's Upgrade Wizard allows the selection of databases for upgrade to SQL Server 7
databases. If you run the SQL Server Upgrade Wizard again after databases have been
upgraded, previously updated databases will default to the excluded list. If you want to upgrade a
database again, move it to the included list.

All tasks scheduled by SQL Executive, for a SQL Server 6.x environment are transferred and
upgraded so that SQL Server 7.0 can schedule and run the tasks in SQL Server Agent.

SQL Server's Quoted_Identifier setting determines what meaning SQL Server gives to double
quotation marks. With 'Set Quoted_Identifier = Off, double quotation marks delimit a character
string, in the same way that single quotation marks do. With 'Set Quoted_Identifier = On, double
quotation marks delimit an identifier, such as a column name.

SQL Server 7 can be installed side-by-side with SQL Server 6.x on the same computer, however
only one version can be run at one time. When the SQL Server Upgrade Wizard is complete, SQL
Server 7 is marked as the active version of SQL Server. If you have enough disk space, it is a
good idea to leave SQL Server 6.x on your computer until you are sure the version upgrade to
SQL Server 7.0 was successful.
Failover Support for SQL Server 7 is designed to work in conjunction with Microsoft Cluster
Server (MSCS). Managing Failover Support provides the ability to install a virtual SQL Server that
is managed through the MSCS Cluster Administrator. MSCS monitors the health of the primary
(active) and secondary (idle) nodes, the SQL Server application, and shared disk resources.
Upon failure of the primary node, services will automatically 'fail over' to the secondary node, and
uncommitted transactions will be rolled back prior to reconnection of clients to the database.

You can launch any Windows NT-based application from SQL Server's Enterprise Manager.
External applications can be added and run from the Tools menu.

With SQL Server 7 exactly the same database engine can be used across platforms ranging from
laptop computers running Microsoft Windows 95/98 to very large, multiprocessor servers running
Microsoft Windows NT, Enterprise Edition.

SQL Server Performance Monitor allows the SQL Server Administrator to set up SQL Server-
specific counters in the NT Performance Monitor, allowing monitoring and graphing of the
performance of SQL Server with the same tool used to monitor Windows NT Servers.

In addition to a programmable library of SQLDMO (SQL Server Distributed Management Objects),
SQL Server 7 also offers access to the Object Library for DTS (Data Transformation Services).

The Windows 95/98 operating systems do not support the server side of the trusted connection
API. When SQL Server is running on Windows 95 or 98, it does not support an Integrated
Security model.

When running an application on the same computer as SQL Server, you can refer to the SQL
Server using the machine name, '(local)', or '.'.

SQL Server 7 uses a new algorithm for comparing fresh Transact-SQL statements with the
Transact-SQL statements that created existing execution plans. If SQL Server 7 finds
that a new Transact-SQL statement matches an existing execution plan, it reuses the plan. This
reduces the relative performance benefit of pre-compiling stored procedures into execution plans.

To check Server Role membership, use sp_helpsrvrole; to extract the specific permissions for
each role execute sp_srvrolepermission.

Every user in a SQL Server database belongs to the Public role. If you want everyone in a
database to be able to have a specific permission, assign the permission to the public role. If a
user has not been specifically granted permissions on an object, they use the permissions
assigned to the Public role.

At the top of every SQL Server 8K Page is a 96 byte header used to store system information
such as the type of page, the amount of free space on the page, and the object ID of the object
owning the page.

Rows still can't span pages in SQL Server. In SQL Server 7, the maximum amount of data
contained in a single row is 8060 bytes, excluding text, ntext, and image data, which are held in
separate pages.

SQL Server 7 will automatically shrink databases that have a large amount of free space. Only
those databases where the AutoShrink option has been set to true will be shrunk. The server
checks the space usage in each database periodically. If a database is found with a lot of empty
space and it has the AutoShrink option set to true, SQL Server will reduce the size of the files in
the database.

SQL Server uses a Global Allocation Map (GAM) to record what extents have been allocated.
Each GAM covers 64,000 extents, or nearly 4 GB of data. The GAM has one bit for each extent in
the interval covered. If the bit is 1, the extent is free; if the bit is 0, the extent is allocated.
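The bit-per-extent bookkeeping is easy to model (a toy sketch, not SQL Server's actual page format): a set bit means the extent is free, a cleared bit means allocated, and 64,000 extents of 64 KB each is 4,096,000 KB, i.e. the "nearly 4 GB" mentioned above:

```python
class AllocationMap:
    """Toy GAM-style bitmap: one bit per extent, 1 = free, 0 = allocated."""

    def __init__(self, extents):
        self.extents = extents
        self.bits = bytearray([0xFF] * ((extents + 7) // 8))  # all free

    def is_free(self, extent):
        return bool(self.bits[extent // 8] & (1 << (extent % 8)))

    def allocate(self, extent):
        self.bits[extent // 8] &= ~(1 << (extent % 8))  # clear bit: allocated

    def free(self, extent):
        self.bits[extent // 8] |= (1 << (extent % 8))   # set bit: free

gam = AllocationMap(64000)  # one GAM interval covers 64,000 extents
gam.allocate(5)
print(gam.is_free(5), gam.is_free(6))  # extent 5 allocated, extent 6 still free
```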
SQL Server Index Statistics are stored as a long string of bits across multiple pages in the same
way that Image data is stored. The column sysindexes.statblob points to this distribution data.
You can use the DBCC SHOW_STATISTICS statement to get a report on the distribution
statistics for an index.
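For example, against the pubs sample database (aunmind is the nonclustered index that ships on the authors table; substitute your own table and index names):

```sql
-- Report distribution statistics for the aunmind index on authors
USE pubs
GO
DBCC SHOW_STATISTICS ('authors', 'aunmind')
GO
```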

Because Non-Clustered indexes store Clustered index keys as their pointers to data rows, it is
important to keep Clustered index keys as small as possible. Do not choose large columns as the
keys to clustered indexes if a table also has Non-Clustered indexes.

In SQL Server 7, individual text, ntext, and image pages are not limited to holding data for only
one occurrence of a text, ntext, or image column. A text, ntext, or image page can hold data from
multiple rows; the page can even have a mix of text, ntext, and image data.

In SQL Server 7, Text and Image data pages are logically organized in a b-tree structure, while in
earlier versions of SQL Server they were linked together in a page chain.

Because SQL Server 7 can store small amounts of text, ntext, or image data in the same Page,
you can insert 20 rows that each have 200 bytes of data in a text column, with the data and all the
root structures fitting onto the same 8K page.

User Connections are cheaper in SQL Server 7. Under SQL Server 6.5 each connection 'cost'
44K of memory; each connection under SQL Server 7 costs only 24K of memory.

If you need to maintain existing processes Pause rather than Stop your SQL Server. Pausing
SQL Server prevents new users from logging in and gives you time to send a message to current
users asking them to complete their work and log out before you Stop the server. If you stop SQL
Server without Pausing it, all server processes are terminated immediately. Stopping SQL Server
prevents new connections and disconnects current users. Note that you can't pause SQL Server
if it was started by running sqlservr. Only SQL Server services started as NT services can be
paused.

If you need to start SQL Server in minimal configuration to correct configuration problems, stop
the SQLServerAgent service before connecting to SQL Server. Otherwise, the SQLServerAgent
service uses the connection and blocks your connection to SQL Server.

When specifying a trace flag with the /T option, make sure you use an uppercase "T" to pass the
trace flag number. A lowercase "t" is accepted by SQL Server, but this sets other internal trace
flags that are required only by SQL Server support engineers.

Be careful when stopping your SQL Server. If you stop SQL Server using Ctrl+C at the command
prompt, it does not perform a CHECKPOINT in every database. Therefore, the recovery time is
increased the next time the server is started.

To shutdown your SQL Server immediately, issue 'SHUTDOWN WITH NOWAIT'. This stops the
server immediately, but it requires more recovery time the next time the server is started because
no CHECKPOINT is issued against any databases.

If you prefer to avoid the command line, data can be transferred into a SQL Server table from a
data file using the 'Bulk Insert' statement. The Bulk Insert statement gives you bcp-style bulk
copying from within a Transact-SQL statement, without calling the bcp utility: e.g. Bulk Insert
pubs..publishers FROM 'c:\publishers.txt' With (DataFileType = 'char').

When running the Bulk Copy Program (BCP), use the -n parameter where possible. Storing
information in native format is useful when information is to be copied from one computer running
SQL Server to another. Using native format saves time and space, preventing unnecessary
conversion of data types to and from character format. However, a data file in native format
cannot be read by any program other than BCP.

Native format BCP can now be used to bulk copy data from one computer running SQL Server to
another running with a different processor architecture. This was impossible with earlier versions
of SQL Server.

When running the Bulk Copy Program (BCP), the new -6 parameter, when used in conjunction
with either native format (-n) or character format (-c), uses SQL Server 6/6.5 data types. Use this
parameter when using data files generated by BCP in native or character format from SQL Server
6/6.5, or when generating data files to be loaded into SQL Server 6/6.5. Note that the -6
parameter is not applicable to the Bulk Insert statement.

With BCP, the SQL Server Char DataType is always stored in the data file as the full length of the
defined column. For example, a column defined as Char(10) always occupies 10 characters in
the data file regardless of the length of the data stored in the column; spaces are appended to the
data as padding. Note that any pre-existing space characters are indistinguishable from the
padding characters added by BCP.

When running the Bulk Copy Program (BCP), choose terminators with care to ensure that their
pattern does not appear in any of the data. For example, when using tab terminators with a field
that contains tabs as part of the data, bcp does not know which tab represents the end of the
field. The bcp utility always looks for the first possible character(s) that matches the terminator it
expects. Using a character sequence with characters that do not occur in the data avoids this
conflict.

When running the Bulk Copy Program (BCP), you may decide to drop the indexes on the table
prior to loading a large amount of data. Conversely, if you are loading a small amount of data relative
to the amount of data already in the table, dropping the indexes may not be necessary because
the time taken to rebuild the indexes can be longer than performing the bulk copy operation.

If, for any reason, a BCP operation aborts before completion, the entire transaction is rolled back,
and no new rows are added to the destination table.

Following a BCP operation, it's necessary to identify any rows that violate constraints or triggers.
To do this run queries or stored procedures that test the constraint or trigger conditions, such as:
UPDATE pubs..authors SET au_fname = au_fname. Although this query does not change data to
a different value, it causes SQL Server to update each value in the au_fname column to itself.
This causes any constraints or triggers to fire, testing the validity of the inserted rows.

During BCP, users often try to load an ASCII file in native format. This leads to misinterpretation of
the hexadecimal values in the ASCII file and the generation of an "unexpected end of file" error
message. The correct method of loading the ASCII file is to represent all fields in the data file as a
character string (i.e character format BCP), and let SQL Server do the data conversion to internal
data types as rows are inserted into the table.

During BCP In, a hidden character in an ASCII data file can cause problems, generating an
"unexpected null found" error message. Many utilities and text editors display hidden characters
which can usually be found at the bottom of the data file. Finding and removing these characters
should resolve the problem.

With BCP, it's possible to specify the number of rows to load from the data file rather than loading
the entire data file. For example, to load only the first 150 rows from a 10,000 row data file,
specify the -L last_row parameter (i.e. -L 150) when loading the data.

After a bulk load using BCP, from a data file, into a table with an index, execute 'Update Statistics'
so that SQL Server can continue to optimise queries made against the table.
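For example, after bulk loading into the publishers table used earlier:

```sql
-- Refresh the distribution statistics on all indexes of the table
UPDATE STATISTICS pubs..publishers
```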

Deleting Duplicate Records
graz on 3/26/2001 in DELETEs
Seema writes "There is a Table with no key constraints. It has duplicate records. The
duplicate records have to be deleted (eg there are 3 similar records, only 2 have to be
deleted). I need a single SQL query for this." This is a pretty common question so I thought
I'd provide some options.
First, I'll need some duplicates to work with. I use this script to create a table called
dup_authors in the pubs database. It selects a subset of the columns and creates some
duplicate records. At the end it runs a SELECT statement to identify the duplicate records:
select au_lname, au_fname, city, state, count(*)
from dup_authors
group by au_lname, au_fname, city, state
having count(*) > 1
order by count(*) desc, au_lname, au_fname
The easiest way I know of to identify duplicates is to do a GROUP BY on all the columns in
the table. It can get a little cumbersome if you have a large table. My duplicates look
something like this:
au_lname        au_fname   city                 state count
--------------- ---------- -------------------- ----- -----
Smith           Meander    Lawrence             KS    3
Bennet          Abraham    Berkeley             CA    2
Carson          Cheryl     Berkeley             CA    2
except there are thirteen additional duplicates identified.
Second, backup your database. Third, make sure you have a good backup of your
database.

Temp Table and Truncate


The simplest way to eliminate the duplicate records is to SELECT DISTINCT into a
temporary table, truncate the original table and SELECT the records back into the original
table. That query looks like this:
select distinct *
into #holding
from dup_authors

truncate table dup_authors

insert dup_authors
select *
from #holding

drop table #holding
If this is a large table, it can quickly fill up your tempdb. This also isn't very fast. It makes a
copy of your data and then makes another copy of your data. Also while this script is
running, your data is unavailable. It may not be the best solution but it certainly works.

Rename and Copy Back


The second option is to rename the original table to something else, and copy the unique
records into the original table. That looks like this:
sp_rename 'dup_authors', 'temp_dup_authors'

select distinct *
into dup_authors
from temp_dup_authors

drop table temp_dup_authors

Handling Errors in Stored Procedures
Garth on 2/5/2001 in Stored Procs
The following article introduces the basics of handling errors in stored procedures. If you
are not familiar with the difference between fatal and non-fatal errors, the system function
@@ERROR, or how to add a custom error with the system stored procedure
sp_addmessage, you should find it interesting.
The examples presented here are specific to stored procedures as they are the desired
method of interacting with a database. When an error is encountered within a stored
procedure, the best you can do (assuming it’s a non-fatal error) is halt the sequential
processing of the code and either branch to another code segment in the procedure or
return processing to the calling application. Notice that the previous sentence is specific to
non-fatal errors. There are two type of errors in SQL Server: fatal and non-fatal. Fatal
errors cause a procedure to abort processing and terminate the connection with the client
application. Non-fatal errors do not abort processing a procedure or affect the connection
with the client application. When a non-fatal error occurs within a procedure, processing
continues on the line of code that follows the one that caused the error.
The following example demonstrates how a fatal error affects a procedure.
USE tempdb
go
CREATE PROCEDURE ps_FatalError_SELECT
AS
SELECT * FROM NonExistentTable
PRINT 'Fatal Error'
go
EXEC ps_FatalError_SELECT
--Results--
Server: Msg 208, Level 16, State 1, Procedure ps_FatalError_SELECT, Line 3
Invalid object name 'NonExistentTable'.
The SELECT in the procedure references a table that does not exist, which produces a fatal
error. The procedure aborts processing immediately after the error and the PRINT
statement is not executed.
To demonstrate how a non-fatal error is processed, I need to create the following table.
USE tempdb
go
CREATE TABLE NonFatal
(
Column1 int IDENTITY,
Column2 int NOT NULL
)
This example uses a procedure to INSERT a row into NonFatal, but does not include a value
for Column2 (defined as NOT NULL).
USE tempdb
go
CREATE PROCEDURE ps_NonFatal_INSERT
@Column2 int =NULL
AS
INSERT NonFatal VALUES (@Column2)
PRINT 'NonFatal'
go
EXEC ps_NonFatal_INSERT
--Results--
Server: Msg 515, Level 16, State 2, Procedure ps_NonFatal_INSERT, Line 4
Cannot insert the value NULL into column 'Column2', table 'tempdb.dbo.NonFatal';
column does not allow nulls. INSERT fails.
The statement has been terminated.
NonFatal
The INSERT fails, but because the error is non-fatal, processing continues and the
PRINT statement still executes.

Transact-SQL Query
SQL Server Performance Tuning Tips

This tip may sound obvious to most of you, but I have seen professional developers, in
two major SQL Server-based applications used worldwide, not follow it. And that is to
always include a WHERE clause in your SELECT statement to narrow the number
of rows returned. If you don't use a WHERE clause, then SQL Server will perform a table
scan of your table and return all of the rows. In some cases you may want to return all
rows, and not using a WHERE clause is appropriate in that case. But if you don't need all
the rows returned, use a WHERE clause to limit the number of rows returned.

By returning data you don't need, you are causing SQL Server to perform I/O it doesn't
need to perform, wasting SQL Server resources. In addition, it increases network traffic,
which can also lead to reduced performance. And if the table is very large, a table scan
will lock the table during the time-consuming scan, preventing other users from
accessing it, hurting concurrency.

Another negative aspect of a table scan is that it will tend to flush out data pages from
the cache with useless data, which reduces SQL Server's ability to reuse useful data in
the cache, which increases disk I/O and hurts performance. [6.5, 7.0, 2000] Updated 4-
17-2003

*****

To help identify long running queries, use the SQL Server Profiler Create Trace
Wizard to run the "TSQL By Duration" trace. You can specify the length of the long
running queries you are trying to identify (such as over 1000 milliseconds), and then
have these recorded in a log for you to investigate later. [7.0]

*****

When using the UNION statement, keep in mind that, by default, it performs the
equivalent of a SELECT DISTINCT on the final result set. In other words, UNION takes the
results of two like recordsets, combines them, and then performs a SELECT DISTINCT in
order to eliminate any duplicate rows. This process occurs even if there are no duplicate
records in the final recordset. If you know that there are duplicate records, and this
presents a problem for your application, then by all means use the UNION statement to
eliminate the duplicate rows.

On the other hand, if you know that there will never be any duplicate rows, or if there
are, and this presents no problem to your application, then you should use the UNION
ALL statement instead of the UNION statement. The advantage of the UNION ALL is that
it does not perform the SELECT DISTINCT function, which saves a lot of unnecessary
SQL Server resources from being used. [6.5, 7.0, 2000] Updated 10-30-2003
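A minimal sketch using the pubs sample tables (authors and publishers both have a city column):

```sql
-- UNION would remove duplicate cities; UNION ALL skips the implicit
-- SELECT DISTINCT and simply appends the second result set
SELECT city FROM authors
UNION ALL
SELECT city FROM publishers
```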

*****

Sometimes you might want to merge two or more sets of data resulting from
two or more queries using UNION. For example:

SELECT column_name1, column_name2
FROM table_name1
WHERE column_name1 = some_value
UNION
SELECT column_name1, column_name2
FROM table_name1
WHERE column_name2 = some_value

This same query can be rewritten, like the following example, and when doing so,
performance will be boosted:

SELECT DISTINCT column_name1, column_name2
FROM table_name1
WHERE column_name1 = some_value OR column_name2 = some_value

And if you can assume that neither criteria will return duplicate rows, you can even
further boost the performance of this query by removing the DISTINCT clause. [6.5, 7.0,
2000] Added 6-5-2003

*****

Carefully evaluate whether your SELECT query needs the DISTINCT clause or
not. Some developers automatically add this clause to every one of their SELECT
statements, even when it is not necessary. This is a bad habit that should be stopped.

The DISTINCT clause should only be used in SELECT statements if you know that
duplicate returned rows are a possibility, and that having duplicate rows in the result set
would cause problems with your application.

The DISTINCT clause creates a lot of extra work for SQL Server, and reduces the
physical resources that other SQL statements have at their disposal. Because of this,
only use the DISTINCT clause if it is necessary. [6.5, 7.0, 2000] Updated 10-30-2003

*****

In your queries, don't return column data you don't need. For example, you should
not use SELECT * to return all the columns from a table if you don't need all the data
from each column. In addition, using SELECT * prevents the use of covered indexes,
further potentially hurting query performance. [6.5, 7.0, 2000] Updated 6-21-2004
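For example, rather than SELECT *, list only the columns you need (pubs authors table assumed):

```sql
-- Returns only two columns, so a covering index on
-- (au_lname, au_fname), if one exists, can satisfy the query
SELECT au_lname, au_fname
FROM authors
WHERE au_lname LIKE 'S%'
```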

*****

If your application allows users to run queries, but you are unable in your application to
easily prevent users from returning hundreds, even thousands of unnecessary rows of
data they don't need, consider using the TOP operator within the SELECT statement.
This way, you can limit how many rows are returned, even if the user doesn't enter any
criteria to help reduce the number of rows returned to the client. For example, the
statement:

SELECT TOP 100 fname, lname FROM customers
WHERE state = 'mo'

limits the results to the first 100 rows returned, even if 10,000 rows actually meet the
criteria of the WHERE clause. When the specified number of rows is reached, all
processing on the query stops, potentially saving SQL Server overhead, and boosting
performance.

The TOP operator works by allowing you to specify a specific number of rows to be
returned, like the example above, or by specifying a percentage value, like this:

SELECT TOP 10 PERCENT fname, lname FROM customers
WHERE state = 'mo'

In the above example, only 10 percent of the available rows would be returned.

In SQL Server 2005, a new argument has been added for the TOP statement. Books
Online specifies:

[
TOP (expression) [PERCENT]
[ WITH TIES ]
]

Example:

USE AdventureWorks
GO
SELECT TOP(10) PERCENT WITH TIES
EmployeeID, Title, DepartmentID, Gender, BaseRate
FROM HumanResources.Employee
ORDER BY BaseRate DESC

What the WITH TIES option does is allow more than the specified number or percent
of rows to be returned if the values of the last group of rows are identical. If you
don't use this option, any tied rows are arbitrarily dropped so that only the exact
number of rows specified by the TOP statement is returned.

In addition to the above new feature, SQL Server 2005 allows the TOP statement to be
used with DML statements, such as DELETE, INSERT and UPDATE. Also, the TOP
statement cannot be used in conjunction with UPDATE and DELETE statements on
partitioned views.

No changes were made to the SET ROWCOUNT statement in SQL Server 2005, and
usually the SET ROWCOUNT value overrides the SELECT statement TOP keyword if the
ROWCOUNT is the smaller value.
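A sketch of that interaction (pubs authors table assumed); the smaller SET ROWCOUNT value wins over TOP:

```sql
SET ROWCOUNT 10                       -- limit all following statements to 10 rows
SELECT TOP 100 au_lname FROM authors  -- returns at most 10 rows here
SET ROWCOUNT 0                        -- turn the limit back off
```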

Keep in mind that using this option may prevent the user from getting the data they
need. For example, the data they are looking for may be in record 101, but they only get
to see the first 100 records. Because of this, use this option with discretion. [7.0, 2000]
Updated 4-7-2005

*****

You may have heard of a SET command called SET ROWCOUNT. Like the TOP operator,
it is designed to limit how many rows are returned from a SELECT statement. In effect,
the SET ROWCOUNT and the TOP operator perform the same function.

While in most cases, using either option works equally efficiently, there are some
instances (such as rows returned from an unsorted heap) where the TOP operator is
more efficient than using SET ROWCOUNT. Because of this, using the TOP operator is
preferable to using SET ROWCOUNT to limit the number of rows returned by a query.
[6.5, 7.0, 2000] Updated 10-30-2003

*****

In a WHERE clause, the various operators used directly affect how fast a
query is run. This is because some operators lend themselves to speed over other
operators. Of course, you may not have any choice of which operator you use in your
WHERE clauses, but sometimes you do.

Here are the key operators used in the WHERE clause, ordered by their performance.
Those operators at the top will produce results faster than those listed at the bottom.

• =

• >, >=, <, <=

• LIKE

• <>
The lesson here is to use = as much as possible, and <> as little as possible. [6.5, 7.0,
2000] Added 5-30-2003
*****
In a WHERE clause, the various operands used directly affect how fast a query
is run. This is because some operands lend themselves to speed over other operands.
Of course, you may not have any choice of which operand you use in your WHERE
clauses, but sometimes you do.
Here are the key operands used in the WHERE clause, ordered by their performance.
Those operands at the top will produce results faster than those listed at the bottom.
• A single literal used by itself on one side of an operator

• A single column name used by itself on one side of an operator; a single
parameter used by itself on one side of an operator

• A multi-operand expression on one side of an operator

• A single exact number on one side of an operator

• Other numeric numbers (other than exact), dates and times

• Character data, NULLs

The simpler the operand, and the more exact the number, the better the overall
performance. [6.5, 7.0, 2000] Added 5-30-2003
*****
If a WHERE clause includes multiple expressions, there is generally no
performance benefit gained by ordering the various expressions in any
particular order. This is because the SQL Server Query Optimizer does this for you,
saving you the effort. There are a few exceptions to this, which are discussed on this
web site. [7.0, 2000] Added 5-30-2003
*****
Don't include code that doesn't do anything. This may sound obvious, but I have
seen this in some off-the-shelf SQL Server-based applications. For example, you may
see code like this:
SELECT column_name FROM table_name
WHERE 1 = 0
When this query is run, no rows will be returned. Obviously, this is a simple example
(most of the cases where I have seen this done have been very long queries), but a
query like this (or part of a larger query) doesn't perform anything useful and
shouldn't be run. It is just wasting SQL Server resources. In addition, I have seen more
than one case where such dead code actually causes SQL Server to throw errors,
preventing the code from even running. [6.5, 7.0, 2000] Added 5-30-2003
*****
By default, some developers, especially those who have not worked with SQL Server
before, routinely include code similar to this in their WHERE clauses when they make
string comparisons:
SELECT column_name FROM table_name
WHERE LOWER(column_name) = 'name'
In other words, these developers are making the assumption that the data in SQL Server
is case-sensitive, which it generally is not. If your SQL Server database is not
configured to be case sensitive, you don't need to use LOWER or UPPER to
force the case of text to be equal for a comparison to be performed. Just leave
these functions out of your code. This will speed up the performance of your query, as
any use of text functions in a WHERE clause hurts performance.
But what if your database has been configured to be case-sensitive? Should you then
use the LOWER and UPPER functions to ensure that comparisons are properly
compared? No. The above example is still poor coding. If you have to deal with ensuring
case is consistent for proper comparisons, use the technique described below, along
with appropriate indexes on the column in question:
SELECT column_name FROM table_name
WHERE column_name = 'NAME' or column_name = 'name'
This code will run much faster than the first example. [6.5, 7.0, 2000] Added 5-30-2003
*****
Try to avoid WHERE clauses that are non-sargable. The term "sargable" (which is in
effect a made-up word) comes from the pseudo-acronym "SARG", which stands for
"Search ARGument," which refers to a WHERE clause that compares a column to a
constant value. If a WHERE clause is sargable, this means that it can take advantage of
an index (assuming one is available) to speed completion of the query. If a WHERE
clause is non-sargable, this means that the WHERE clause (or at least part of it) cannot
take advantage of an index, instead performing a table/index scan, which may cause
the query's performance to suffer.
Non-sargable search arguments in the WHERE clause, such as "IS NULL", "<>", "!=",
"!>", "!<", "NOT", "NOT EXISTS", "NOT IN", "NOT LIKE", and "LIKE '%500'" generally
prevent (but not always) the query optimizer from using an index to perform a search.
In addition, expressions that include a function on a column, expressions that have the
same column on both sides of the operator, or comparisons against a column (not a
constant), are not sargable.
But not every WHERE clause that has a non-sargable expression in it is doomed to a
table/index scan. If the WHERE clause includes both sargable and non-sargable clauses,
then at least the sargable clauses can use an index (if one exists) to help access the
data quickly.
In many cases, if there is a covering index on the table, which includes all of the
columns in the SELECT, JOIN, and WHERE clauses in a query, then the covering index
can be used instead of a table/index scan to return a query's data, even if it has a non-
sargable WHERE clause. But keep in mind that covering indexes have their own
drawbacks, such as producing very wide indexes that increase disk I/O when they are
read.
In some cases, it may be possible to rewrite a non-sargable WHERE clause into one that
is sargable. For example, the clause:
WHERE SUBSTRING(firstname,1,1) = 'm'
can be rewritten like this:
WHERE firstname like 'm%'
Both of these WHERE clauses produce the same result, but the first one is non-sargable
(it uses a function) and will run slow, while the second one is sargable, and will run
much faster.
WHERE clauses that perform some function on a column are non-sargable. On the other
hand, if you can rewrite the WHERE clause so that the column and function are
separate, then the query can use an available index, greatly boosting performance. For
example:
Function Acts Directly on Column, and Index Cannot Be Used:
SELECT member_number, first_name, last_name
FROM members
WHERE DATEDIFF(yy,dateofbirth,GETDATE()) > 21
Function Has Been Separated From Column, and an Index Can Be Used:
SELECT member_number, first_name, last_name
FROM members
WHERE dateofbirth < DATEADD(yy,-21,GETDATE())
Each of the above queries produces the same results, but the second query will use an
index because the function is not performed directly on the column, as it is in the first
example. The moral of this story is to try to rewrite WHERE clauses that have functions
so that the function does not act directly on the column.
WHERE clauses that use NOT are not sargable, but can often be rewritten to remove the
NOT from the WHERE clause, for example:
WHERE NOT column_name > 5
to
WHERE column_name <= 5
Each of the above clauses produce the same results, but the second one is sargable.
If you don't know if a particular WHERE clause is sargable or non-sargable, check out
the query's execution plan in Query Analyzer. Doing this, you can very quickly see if the
query will be using index lookups or table/index scans to return your results.
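In Query Analyzer you can also display the estimated plan as text with SET SHOWPLAN_TEXT (the members table here is the hypothetical one from the examples above):

```sql
SET SHOWPLAN_TEXT ON
GO
-- The plan output shows whether an Index Seek or a Table/Index Scan is used
SELECT member_number, first_name, last_name
FROM members
WHERE dateofbirth < DATEADD(yy, -21, GETDATE())
GO
SET SHOWPLAN_TEXT OFF
GO
```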
With some careful analysis, and some clever thought, many non-sargable queries can
be written so that they are sargable. Your goal for best performance (assuming it is
possible) is to get the left side of a search condition to be a single column name, and
the right side an easy to look up value. [6.5, 7.0, 2000] Updated 6-2-2003
*****
If you run into a situation where a WHERE clause is not sargable because of
the use of a function on the right side of an equality sign (and there is no other way
to rewrite the WHERE clause), consider creating an index on a computed column
instead. This way, you avoid the non-sargable WHERE clause altogether, using the
results of the function in your WHERE clause instead.
Because of the additional overhead required for indexes on computed columns, you will
only want to do this if you need to run this same query over and over in your
application, thereby justifying the overhead of the indexed computed column. [2000]
Updated 6-21-2004
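A sketch for SQL Server 2000, using a hypothetical members table with a case-sensitive last_name column; LOWER is deterministic, so the computed column can be indexed (the required ANSI session settings are assumed):

```sql
-- Add a computed column holding the lowercased value, then index it
ALTER TABLE members ADD last_name_lower AS LOWER(last_name)
CREATE INDEX IX_members_last_name_lower ON members (last_name_lower)

-- The search is now sargable against the computed column's index
SELECT member_number FROM members WHERE last_name_lower = 'smith'
```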
*****
If you currently have a query that uses NOT IN, which offers poor performance
because the SQL Server optimizer has to use a nested table scan to perform this
activity, instead try to use one of the following options instead, all of which offer better
performance:
• Use EXISTS or NOT EXISTS

• Use IN

• Perform a LEFT OUTER JOIN and check for a NULL condition

[6.5, 7.0, 2000] Updated 10-30-2003
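For example, finding authors with no titles in the pubs sample database can be rewritten from NOT IN to a LEFT OUTER JOIN with a NULL check:

```sql
-- NOT IN version (forces a nested scan)
SELECT a.au_id FROM authors a
WHERE a.au_id NOT IN (SELECT ta.au_id FROM titleauthor ta)

-- LEFT OUTER JOIN version (usually faster)
SELECT a.au_id
FROM authors a
LEFT OUTER JOIN titleauthor ta ON a.au_id = ta.au_id
WHERE ta.au_id IS NULL
```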
*****
When you have a choice of using the IN or the EXISTS clause in your Transact-
SQL, you will generally want to use the EXISTS clause, as it is usually more efficient and
performs faster. [6.5, 7.0, 2000] Updated 10-30-2003
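A sketch with the pubs sample tables; the EXISTS form can stop probing as soon as one matching row is found:

```sql
-- IN version
SELECT pub_name FROM publishers
WHERE pub_id IN (SELECT pub_id FROM titles)

-- EXISTS version (generally preferred)
SELECT pub_name FROM publishers p
WHERE EXISTS (SELECT * FROM titles t WHERE t.pub_id = p.pub_id)
```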
*****
If you find that SQL Server uses a TABLE SCAN instead of an INDEX SEEK when
you use an IN or OR clause as part of your WHERE clause, even when those
columns are covered by an index, consider using an index hint to force the Query
Optimizer to use the index.
For example:
SELECT * FROM tblTaskProcesses WHERE nextprocess = 1 AND processid IN (8,32,45)

takes about 3 seconds, while:

SELECT * FROM tblTaskProcesses (INDEX = IX_ProcessID) WHERE nextprocess = 1 AND processid IN (8,32,45)

returns in under a second. [7.0, 2000] Updated 6-21-2004 Contributed by David Ames
*****
If you use LIKE in your WHERE clause, try to use one or more leading character in
the clause, if at all possible. For example, use:
LIKE 'm%'
not:
LIKE '%m'
If you use a leading character in your LIKE clause, then the Query Optimizer has the
ability to potentially use an index to perform the query, speeding performance and
reducing the load on SQL Server.
But if the leading character in a LIKE clause is a wildcard, the Query Optimizer will not
be able to use an index, and a table scan must be run, reducing performance and taking
more time.
The more leading characters you can use in the LIKE clause, the more likely the Query
Optimizer will find and use a suitable index. [6.5, 7.0, 2000] Updated 10-30-2003
*****
If your application needs to retrieve summary data often, but you don't want to
have the overhead of calculating it on the fly every time it is needed, consider using a
trigger that updates summary values after each transaction into a summary table.
While the trigger has some overhead, overall, it may be less than having to calculate the
data every time the summary data is needed. You may have to experiment to see which
method is fastest for your environment. [6.5, 7.0, 2000] Updated 10-30-2003
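A minimal sketch, assuming hypothetical sales_detail and sales_summary tables (all table and column names here are illustrative, not from the source):

```sql
-- Keep a running total per store, maintained as detail rows are inserted
CREATE TRIGGER trg_sales_summary ON sales_detail
FOR INSERT
AS
UPDATE sales_summary
SET total_qty = total_qty +
    (SELECT SUM(i.qty) FROM inserted i
     WHERE i.stor_id = sales_summary.stor_id)
WHERE stor_id IN (SELECT stor_id FROM inserted)
```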
*****
If your application needs to insert a large binary value into an image data
column, perform this task using a stored procedure, not using an INSERT statement
embedded in your application.
The reason for this is because the application must first convert the binary value into a
character string (which doubles its size, thus increasing network traffic and taking more
time) before it can be sent to the server. And when the server receives the character
string, it then has to convert it back to the binary format (taking even more time).
Using a stored procedure avoids all this because all the activity occurs on the SQL
Server, and little data is transmitted over the network. [6.5, 7.0, 2000] Updated 10-30-
2003
*****
When you have a choice of using the IN or the BETWEEN clauses in your
Transact-SQL, you will generally want to use the BETWEEN clause, as it is much more
efficient. For example:
SELECT customer_number, customer_name
FROM customer
WHERE customer_number in (1000, 1001, 1002, 1003, 1004)
is much less efficient than this:
SELECT customer_number, customer_name
FROM customer
WHERE customer_number BETWEEN 1000 and 1004
Assuming there is a useful index on customer_number, the Query Optimizer can locate
a range of numbers much faster (using BETWEEN) than it can find a series of numbers
using the IN clause (which is really just another form of the OR clause). [6.5, 7.0, 2000]
Updated 10-30-2003
*****
If possible, try to avoid using the SUBSTRING function in your WHERE clauses.
Depending on how it is constructed, using the SUBSTRING function can force a table
scan instead of allowing the optimizer to use an index (assuming there is one). If the
substring you are searching for does not include the first character of the column you
are searching for, then a table scan is performed.
If possible, you should avoid using the SUBSTRING function and use the LIKE condition
instead, for better performance.
Instead of doing this:
WHERE SUBSTRING(column_name,1,1) = 'b'
Try using this instead:
WHERE column_name LIKE 'b%'
If you decide to make this choice, keep in mind that you will want your LIKE condition to
be sargable, which means that you cannot place a wildcard in the first position. [6.5,
7.0, 2000] Updated 6-4-2003
*****
Where possible, avoid string concatenation in your Transact-SQL code, as it is not
a fast process, contributing to overall slower performance of your application. [6.5, 7.0,
2000] Updated 10-30-2003
*****
Generally, avoid using optimizer hints in your queries. This is because it is
generally very hard to outguess the Query Optimizer. Optimizer hints are special
keywords that you include with your query to force how the Query Optimizer runs. If you
decide to include a hint in a query, this forces the Query Optimizer to become static,
preventing the Query Optimizer from dynamically adapting to the current environment
for the given query. More often than not, this hurts, not helps performance.
If you think that a hint might be necessary to optimize your query, be sure you first do
all of the following:
• Update the statistics on the relevant tables.

• If the problem query is inside a stored procedure, recompile it.

• Review the search arguments to see if they are sargable, and if not, try to rewrite
them so that they are sargable.
• Review the current indexes, and make changes if necessary.
If you have done all of the above, and the query is not running as you expect, then you
may want to consider using an appropriate optimizer hint.
If you haven't heeded my advice and have decided to use some hints, keep in mind that
as your data changes, and as the Query Optimizer changes (through service packs and
new releases of SQL Server), your hard-coded hints may no longer offer the benefits
they once did. So if you use hints, you need to periodically review them to see if they
are still performing as expected. [6.5, 7.0, 2000] Updated 6-21-2004
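For illustration only (the index name is made up), a hint forcing a specific index might look like this; once hard-coded, the Query Optimizer can no longer pick a better plan as the data changes:

```sql
-- Force the optimizer to use the index named idx_customer_name,
-- even if it would otherwise choose a different access path
SELECT customer_number, customer_name
FROM customer WITH (INDEX(idx_customer_name))
WHERE customer_name LIKE 'S%'
```

If you ever remove the hint, retest the query; the optimizer's unhinted plan may well have become the faster one after a service pack or statistics update.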
*****
If you have a WHERE clause that includes expressions connected by two or
more AND operators, SQL Server will evaluate them from left to right in the
order they are written. This assumes that no parenthesis have been used to change
the order of execution. Because of this, you may want to consider one of the following
when using AND:
• Locate the least likely true AND expression first. This way, if the AND expression
is false, the clause will end immediately, saving time.

• If both parts of an AND expression are equally likely to be false, put the least
complex AND expression first. This way, if it is false, less work will have to be
done to evaluate the expression.
You may want to consider using Query Analyzer to look at the execution plans of your
queries to see which is best for your situation. [6.5, 7.0, 2000] Updated 6-21-2004
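As a sketch of the first suggestion (the status and state columns on the customer table are hypothetical, added here for illustration), the rarer condition goes first so the evaluation can fail fast:

```sql
-- Assume very few customers have status = 'closed', but many are in 'FL'.
-- Putting the rarer condition first lets most rows be rejected
-- after evaluating only one expression.
SELECT customer_number, customer_name
FROM customer
WHERE status = 'closed' AND state = 'FL'
```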
*****
If you want to boost the performance of a query that includes an AND operator
in the WHERE clause, consider the following:
• Of the search criteria in the WHERE clause, at least one of them should be
based on a highly selective column that has an index.

• If at least one of the search criteria in the WHERE clause is not highly
selective, consider adding indexes to all of the columns referenced in the WHERE
clause.

• If none of the columns in the WHERE clause are selective enough to use an index
on their own, consider creating a covering index for this query.
[7.0, 2000] Updated 9-6-2004
*****
The Query Optimizer will perform a table scan or a clustered index scan on a
table if the WHERE clause in the query contains an OR operator and if any of the
referenced columns in the OR clause are not indexed (or do not have a useful index).
Because of this, if you use many queries with OR clauses, you will want to ensure that
each referenced column in the WHERE clause has a useful index. [7.0, 2000] Updated 9-
6-2004
*****
A query with one or more OR clauses can sometimes be rewritten as a series
of queries that are combined with a UNION ALL statement, in order to boost the
performance of the query. For example, let's take a look at the following query:
SELECT employeeID, firstname, lastname
FROM names
WHERE dept = 'prod' or city = 'Orlando' or division = 'food'

This query has three separate conditions in the WHERE clause. In order for this query to
use an index, then there must be an index on all three columns found in the WHERE
clause.

This same query can be written using UNION ALL instead of OR, like this example:

SELECT employeeID, firstname, lastname FROM names WHERE dept = 'prod'
UNION ALL
SELECT employeeID, firstname, lastname FROM names WHERE city = 'Orlando'
UNION ALL
SELECT employeeID, firstname, lastname FROM names WHERE division = 'food'

Each of these queries will produce the same results. If there is only an index on dept,
but not on the other columns in the WHERE clause, then the first version will not use
any index and a table scan must be performed. But the second version of the query
will use the index for part of the query, though not for all of it.

Admittedly, this is a very simple example, but even so, it does demonstrate how
rewriting a query can affect whether or not an index is used. If this query were
much more complex, then the approach of using UNION ALL might be much more
efficient, as it allows you to tune each part of the query individually, something that
cannot be done if you use only ORs in your query.
Note that I am using UNION ALL instead of UNION. The reason for this is to prevent the
UNION statement from trying to sort the data and remove duplicates, which hurts
performance. Of course, if there is the possibility of duplicates, and you want to remove
them, then you can use just UNION.

If you have a query that uses ORs and it is not making the best use of indexes, consider
rewriting it as a UNION ALL, and then testing performance. Only through testing can you
be sure that one version of your query will be faster than another. [7.0, 2000] Updated
9-6-2004
*****
Don't use ORDER BY in your SELECT statements unless you really need to, as it
adds a lot of extra overhead. For example, perhaps it may be more efficient to sort the
data at the client than at the server. In other cases, perhaps the client doesn't even
need sorted data to achieve its goal. The key here is to remember that you shouldn't
automatically sort data, unless you know it is necessary. [6.5, 7.0, 2000] Updated 9-6-
2004
*****
Whenever SQL Server has to perform a sorting operation, additional
resources have to be used to perform this task. Sorting often occurs when any of
the following Transact-SQL statements are executed:
• ORDER BY

• GROUP BY

• SELECT DISTINCT

• UNION
• CREATE INDEX (generally not as critical, as it happens much less often)
In many cases, these commands cannot be avoided. On the other hand, there are a few
ways that sorting overhead can be reduced. These include:
• Keep the number of rows to be sorted to a minimum. Do this by only returning
those rows that absolutely need to be sorted.

• Keep the number of columns to be sorted to the minimum. In other words, don't
sort more columns than required.

• Keep the width (physical size) of the columns to be sorted to a minimum.

• Sort columns with numeric datatypes instead of character datatypes.


When using any of the above Transact-SQL commands, try to keep the above
performance-boosting suggestions in mind. [6.5, 7.0, 2000] Added 6-5-2003
*****
If you have to sort by a particular column often, consider making that column a
clustered index. This is because the data is already presorted for you and SQL Server is
smart enough not to resort the data. [6.5, 7.0, 2000] Added 6-5-2003
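As an illustrative sketch (the table and index names are made up), a clustered index on the frequently sorted column might be created like this:

```sql
-- Data pages are now stored in order_date order, so an
-- ORDER BY order_date no longer needs a separate sort step
CREATE CLUSTERED INDEX cix_orders_order_date
ON orders (order_date)
```

Remember that a table can have only one clustered index, so reserve it for the sort (or range) column that matters most.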
*****
If your WHERE clause includes an IN operator along with a list of values to be
tested in the query, order the list of values so that the most frequently found values are
placed at the beginning of the list, and the less frequently found values are placed at the end
of the list. This can speed performance because the IN option returns true as soon as
any of the values in the list produce a match. The sooner the match is made, the faster
the query completes. [6.5, 7.0, 2000] Updated 4-6-2004
*****
If you need to use the SELECT INTO option, keep in mind that it can lock system tables,
preventing other users from accessing the data they need. If you do need to use
SELECT INTO, try to schedule it when your SQL Server is less busy, and try to keep the
amount of data inserted to a minimum. [6.5, 7.0, 2000] Updated 4-6-2004
*****
If your SELECT statement contains a HAVING clause, write your query so that the
WHERE clause does most of the work (removing undesired rows) instead of having the
HAVING clause remove undesired rows. Using the WHERE clause appropriately
can eliminate unnecessary rows before they get to the GROUP BY and HAVING clause,
saving some unnecessary work, and boosting performance.
For example, in a SELECT statement with WHERE, GROUP BY, and HAVING clauses,
here's what happens. First, the WHERE clause is used to select the appropriate rows that
need to be grouped. Next, the GROUP BY clause divides the rows into sets of grouped
rows, and then aggregates their values. And last, the HAVING clause then eliminates
undesired aggregated groups. If the WHERE clause is used to eliminate as many of the
undesired rows as possible, this means the GROUP BY and the HAVING clauses will have
less work to do, boosting the overall performance of the query. [6.5, 7.0, 2000]
Updated 4-6-2004
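The difference can be sketched with the Northwind sample database used elsewhere in this document (the cutoff value is arbitrary, chosen just for illustration):

```sql
USE Northwind
-- Slower: every order is grouped first, then most groups are discarded
SELECT OrderID, SUM(Quantity) AS TotalQty
FROM [Order Details]
GROUP BY OrderID
HAVING OrderID > 11000
-- Faster: unwanted rows are removed before the GROUP BY does its work
SELECT OrderID, SUM(Quantity) AS TotalQty
FROM [Order Details]
WHERE OrderID > 11000
GROUP BY OrderID
```

Both statements return the same result; only the second one filters before grouping. Reserve HAVING for conditions on aggregates, such as SUM(Quantity) > 100, which cannot be expressed in the WHERE clause.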
*****
If your application performs many wildcard (LIKE %) text searches on CHAR or
VARCHAR columns, consider using SQL Server's full-text search option. The Search
Service can significantly speed up wildcard searches of text stored in a database. [7.0,
2000] Updated 4-6-2004
*****
The GROUP BY clause can be used with or without an aggregate function. But if you
want optimum performance, don't use the GROUP BY clause without an
aggregate function. This is because you can accomplish the same end result by using
the DISTINCT option instead, and it is faster.
For example, you could write your query two different ways:
USE Northwind
SELECT OrderID
FROM [Order Details]
WHERE UnitPrice > 10
GROUP BY OrderID
or
USE Northwind
SELECT DISTINCT OrderID
FROM [Order Details]
WHERE UnitPrice > 10
Both of the above queries produce the same results, but the second one will use less
resources and perform faster. [6.5, 7.0, 2000] Updated 11-15-2004
*****
The GROUP BY clause can be sped up if you follow these suggestions:
• Keep the number of rows returned by the query as small as possible.

• Keep the number of groupings as few as possible.

• Don't group redundant columns.

• If there is a JOIN in the same SELECT statement that has a GROUP BY, try to
rewrite the query to use a subquery instead of using a JOIN. If this is possible,
performance will be faster. If you have to use a JOIN, try to make the GROUP BY
column from the same table as the column or columns on which the set function
is used.

• Consider adding an ORDER BY clause to the SELECT statement that orders by the
same column as the GROUP BY. This may cause the GROUP BY to perform faster.
Test this to see if it is true in your particular situation.
[7.0, 2000] Added 6-6-2003
*****
Sometimes perception is more important than reality. For example, which of the
following two queries is faster:
• A query that takes 30 seconds to run, and then displays all of the required
results.
• A query that takes 60 seconds to run, but displays the first screen full of records
in less than 1 second.
Most DBAs would choose the first option as it takes less server resources and performs
faster. But from many users' point of view, the second one may be more palatable. By
getting immediate feedback, the user gets the impression that the application is fast,
even though in the background, it is not.
If you run into situations where perception is more important than raw performance,
consider using the FAST query hint. The FAST query hint is used with the SELECT
statement using this form:
OPTION(FAST number_of_rows)
where number_of_rows is the number of rows that are to be displayed as fast as
possible.
When this hint is added to a SELECT statement, it tells the Query Optimizer to return the
specified number of rows as fast as possible, without regard to how long it will take to
perform the overall query. Before rolling out an application using this hint, I would
suggest you test it thoroughly to see that it performs as you expect. You may find out
that the query may take about the same amount of time whether the hint is used or not.
If this is the case, then don't use the hint. [7.0, 2000] Updated 11-15-2004
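A complete statement using the hint might look like this (the customer table is the one used in earlier examples; whether the hint actually helps depends on your data and indexes):

```sql
-- Ask the Query Optimizer to return the first 20 rows as quickly
-- as possible, even if the full result set then takes longer overall
SELECT customer_number, customer_name
FROM customer
ORDER BY customer_name
OPTION (FAST 20)
```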
*****
Instead of using temporary tables, consider using a derived table instead. A
derived table is the result of using a SELECT statement in the FROM clause of an
existing SELECT statement. By using derived tables instead of temporary tables, you
can reduce I/O and boost your application's performance. [7.0, 2000] Updated 11-15-2004
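As a sketch (using the Northwind sample database referenced elsewhere in this document), a derived table replaces the create/insert/select/drop cycle of a temporary table with a single statement:

```sql
USE Northwind
-- The inner SELECT acts as a virtual table aliased t, avoiding
-- the I/O of materializing an intermediate table in tempdb
SELECT t.OrderID, t.LineCount
FROM (SELECT OrderID, COUNT(*) AS LineCount
      FROM [Order Details]
      GROUP BY OrderID) AS t
WHERE t.LineCount > 3
```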
*****
It is a fairly common request to write a Transact-SQL query to compare a parent
table and a child table and find out if there are any parent records that don't have a
match in the child table. Generally, there are three ways this can be done:
Using a NOT EXISTS

SELECT a.hdr_key
FROM hdr_tbl a
WHERE NOT EXISTS (SELECT * FROM dtl_tbl b WHERE a.hdr_key = b.hdr_key)
Using a LEFT JOIN
SELECT a.hdr_key
FROM hdr_tbl a
LEFT JOIN dtl_tbl b ON a.hdr_key = b.hdr_key
WHERE b.hdr_key IS NULL
Using a NOT IN

SELECT hdr_key
FROM hdr_tbl
WHERE hdr_key NOT IN (SELECT hdr_key FROM dtl_tbl)
In each case, the above query will return identical results. But, which of these three
variations of the same query produces the best performance? Assuming everything else
is equal, the best performing version through the worst performing version will be from
top to bottom, as displayed above. In other words, the NOT EXISTS variation of this
query is generally the most efficient.
I say generally, because the indexes found on the tables, along with the number of rows
in each table, can influence the results. If you are not sure which variation to use,
try them all and see which produces the best results in your particular
circumstances. [7.0, 2000] Updated 11-15-2004
*****
Be careful when using OR in your WHERE clause, it is fairly simple to
accidentally retrieve much more data than you need, which hurts
performance. For example, take a look at the query below:
SELECT companyid, plantid, formulaid
FROM batchrecords
WHERE companyid = '0001' and plantid = '0202' and formulaid = '39988773'
OR
companyid = '0001' and plantid = '0202'
As you can see from this query, the WHERE clause is redundant, as:
companyid = '0001' and plantid = '0202' and formulaid = '39988773'
is a subset of:
companyid = '0001' and plantid = '0202'
In other words, this query is redundant. Unfortunately, the SQL Server Query Optimizer
isn't smart enough to know this, and will do exactly what you tell it to. What will happen
is that SQL Server will have to retrieve all the data you have requested, then in effect do
a SELECT DISTINCT to remove redundant rows it unnecessarily finds.
In this case, if you drop this code from the query:
OR
companyid = '0001' and plantid = '0202'
then run the query, you will receive the same results, but with much faster
performance. [6.5, 7.0, 2000] Updated 11-15-2004
*****
If you need to verify the existence of a record in a table, don't use SELECT
COUNT(*) in your Transact-SQL code to identify it, which is very inefficient and wastes
server resources. Instead, use the Transact-SQL IF EXISTS to determine if the record in
question exists, which is much more efficient. For example:
Here's how you might use COUNT(*):
IF (SELECT COUNT(*) FROM table_name WHERE column_name = 'xxx')
Here's a faster way, using IF EXISTS:
IF EXISTS (SELECT * FROM table_name WHERE column_name = 'xxx')
The reason IF EXISTS is faster than COUNT(*) is that the query can end immediately
when the test is proven true, while COUNT(*) must go through every record,
whether there is only one, or thousands, before it can be found to be true. [7.0, 2000]
Updated 11-15-2004
*****
Let's say that you often need to INSERT the same value into a column. For
example, perhaps you have to perform 100,000 INSERTs a day into a particular table,
and that 90% of the time the data INSERTed into one of the columns of the table is the
same value.
If this is the case, you can reduce network traffic (along with some SQL Server overhead)
by creating this particular column with a default value of the most common value. This
way, when you INSERT your data, and the data is the default value, you don't INSERT
any data into this column, instead allowing the default value to automatically be filled in
for you. But when the value needs to be different, you will of course INSERT that value
into the column. [6.5, 7.0, 2000] Updated 11-15-2004
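A minimal sketch of this idea (the table and column names are hypothetical):

```sql
-- 90% of inserted rows have country = 'USA', so make it the default
CREATE TABLE customer_addresses
(
    address_id INT IDENTITY(1,1) PRIMARY KEY,
    street VARCHAR(50) NOT NULL,
    country VARCHAR(20) NOT NULL DEFAULT 'USA'
)
GO
-- Common case: omit the column and let the default fill it in
INSERT INTO customer_addresses (street) VALUES ('12 Main St')
-- Exception: supply the value explicitly
INSERT INTO customer_addresses (street, country) VALUES ('5 High St', 'UK')
```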
*****
Performing UPDATEs takes extra resources from SQL Server. When performing an
UPDATE, try to follow as many of the following recommendations as you can
in order to reduce the amount of resources required to perform an UPDATE. The more of
the following suggestions you can follow, the faster the UPDATE will perform.
• If you are UPDATing a column of a row that has a unique index, try to only
update one row at a time.
• Try not to change the value of a column that is also the primary key.

• When updating VARCHAR columns, try to replace the contents with contents of
the same length.

• Try to minimize the UPDATing of tables that have UPDATE triggers.

• Try to avoid UPDATing columns that will be replicated to other databases.

• Try to avoid UPDATing heavily indexed columns.

• Try to avoid UPDATing a column that has a reference in the WHERE clause to the
column being updated.
Of course, you may have very little choice when UPDATing your data, but at least give
the above suggestions a thought. [6.5, 7.0, 2000] Added 7-2-2003
*****
If you have created a complex transaction that includes several parts, one part
of which has a higher probability of rolling back the transaction than the others, better
performance will be provided if you locate the most likely to fail part of the transaction
at the front of the greater transaction. This way, if this more-likely-to-fail part has
to roll back because of a failure, no resources have been wasted on the other, less-
likely-to-fail parts. [6.5, 7.0, 2000] Added 7-2-2003

How do you determine the Nth row in a SQL Server database?

Answer

Consider the Pubs sample database that ships with SQL Server 2000. Our task is to
determine the third-to-last date on which an employee joined the company. Several
approaches are possible here. Let's first have a look at the different methods
available to us, before getting into a basic performance analysis.

1) Using TOP. This is probably the most intuitive one.

SELECT TOP 1
hire_date
FROM
employee
WHERE
hire_date
NOT IN(
SELECT TOP 2
hire_date
FROM
employee
ORDER BY
hire_date DESC)
ORDER BY
hire_date DESC

hire_date
------------------------------------------------------
1994-01-19 00:00:00.000

(1 row(s) affected)

Not much explanation needed here. The NOT IN rules out the TOP 2 hire dates. And
from the remaining rows we take the TOP 1 hire date. This is a straightforward
approach.

2) Here we use the SQL Server feature of assigning the value of the last row
processed to a variable.

DECLARE @dt DATETIME


SELECT TOP 3
@dt = hire_date
FROM
employee
ORDER BY
hire_date DESC
SELECT @dt hire_date

hire_date
------------------------------------------------------
1994-01-19 00:00:00.000

(1 row(s) affected)

The variable @dt is assigned a value for every row in the resultset. But since we force an
ORDER BY, the last row processed contains the date we are interested in, and @dt is
assigned this value. We now only need to SELECT the variable to get the desired
result.

3) Use a temporary table. Below, we show a generic approach that uses a stored
procedure that accepts the desired row as an input parameter and returns the
corresponding hire date:

USE PUBS
GO
CREATE PROC dbo.GetNthLatestEntry (@NthLatest INT)
AS
SET NOCOUNT ON
BEGIN
CREATE TABLE #Entry
(
ID INT PRIMARY KEY NOT NULL IDENTITY(1,1)
, Entry DATETIME NOT NULL
)
INSERT INTO #Entry (Entry) SELECT hire_date FROM employee ORDER BY
hire_date DESC
SELECT
Entry hire_date
FROM
#Entry
WHERE
ID = @NthLatest
DROP TABLE #Entry
END
SET NOCOUNT OFF
GO
EXEC dbo.GetNthLatestEntry 3
DROP PROCEDURE dbo.GetNthLatestEntry

hire_date
------------------------------------------------------
1994-01-19 00:00:00.000

4) Until now we have always used one or another proprietary feature of SQL Server:
either TOP or the IDENTITY property. Now we try to make this portable and use ANSI
SQL.

SELECT
e1.hire_date
FROM
employee AS e1
INNER JOIN
employee AS e2
ON
e1.hire_date <= e2.hire_date
GROUP BY
e1.hire_date
HAVING COUNT(DISTINCT e2.hire_date) = 3

hire_date
------------------------------------------------------
1994-01-19 00:00:00.000

(1 row(s) affected)

If you are interested in how this statement works, we suggest you have a look at the
books by Joe Celko.

So, we now have four different methods to get the same result. Which should we
choose? Well, the classical answer here: It depends! If your goal is to make your SQL
as portable as possible, you will surely choose the ANSI SQL method. If, however, you
are not concerned about portability, you still have three different methods to choose from.
Let's now have a look at the output of SET STATISTICS IO ON. The results below
correspond to the four methods described above.

1. Table 'employee'. Scan count 4, logical reads 8, physical reads 0, read-ahead reads 0.

2. Table 'employee'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0.

3. Table 'employee'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0.

4. Table 'employee'. Scan count 44, logical reads 88, physical reads 0, read-ahead reads 0.
As you can see, one method clearly differs from all the others: the ANSI SQL
method. Portability has its price. The first method was the TOP method. It creates 4
times the IO of the other two methods. Though it is logical IO, it still is IO. So the choice
now is between the temp table approach and the variable assignment approach. The
choice here might depend on how busy your whole system is. The use of temp
tables might cause issues in tempdb. So, for such simple questions, using the
variable assignment method seems to be a fairly reasonable choice. Running Profiler
to measure the duration here is not very meaningful, since the employee table
has just 43 rows in all, so every method executes very fast. On larger tables it is
good practice to build up a test scenario to see how the different methods perform in
your specific environment.

Why can it take so long to drop a clustered index?

Answer

Generally speaking, indexes can speed up queries tremendously. This comes at a
cost, as changes to the underlying data have to be reflected in the indexes when the
index column(s) are modified.

Before we get into the reasons why dropping a clustered index can be time-
consuming, we need to take a short look at the different index structures in SQL
Server.

Every table can have one, and only one, clustered index. A clustered index sorts the
data physically according to its index keys. And since there can only be one
physically sorted order on a table at a time, this sounds pretty obvious. If a table
does not have a clustered index, it is called a heap.

The second index structure is the non-clustered index. You can create non-clustered
indexes on tables with clustered indexes, on heaps, and on indexed views.

The difference between both index structures is at the leaf level of the index. While
the leaf level of a clustered index actually is the table's data itself, you only find
pointers to the data at the leaf level of a non-clustered index. Now we need to
understand an important difference:
• When a table has a clustered index created, the pointers contain the clustered
index keys for that row.

• When a table does not have a clustered index, the pointers consist of the so-
called RowID, which is a combination of FileNumber:PageNumber:Slot.
When you understand this distinction, you can derive the answer to the original
question yourself. When you drop a clustered index, SQL Server will have to recreate
all non-clustered indexes on that table (assuming there are any). During this
recreation, the clustered index keys are replaced by the RowID. It should be obvious
that this is a time-consuming operation, especially on larger tables or tables with
many indexes.
One way around the long delay when dropping a clustered index (assuming there are
also non-clustered indexes) is to first drop the non-clustered indexes, and then drop
the clustered index. Likewise, you should first create the clustered index and then
the non-clustered indexes. Sticking to these recommendations, you won't waste time
and server resources that are spent better otherwise.
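Using hypothetical table and index names, the recommended ordering looks like this (the DROP INDEX syntax shown is the table.index form used through SQL Server 2000):

```sql
-- Dropping: non-clustered indexes first, clustered index last,
-- so the non-clustered indexes are not rebuilt against RowIDs
DROP INDEX orders.ncix_orders_customer
DROP INDEX orders.cix_orders_order_date
-- Creating: clustered index first, non-clustered indexes afterwards,
-- so each non-clustered index is built only once
CREATE CLUSTERED INDEX cix_orders_order_date ON orders (order_date)
CREATE NONCLUSTERED INDEX ncix_orders_customer ON orders (customer_id)
```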

Is it possible to keep the database in memory?

Answer

In a very real sense, SQL Server automatically attempts to keep as much of the
database in memory as it can.

By default, when SQL Server is using memory dynamically, it queries the system
periodically to determine the amount of free physical memory available. If there is
more memory free, SQL Server recommits memory to the buffer cache, which SQL
Server uses to store data for ready access. SQL Server adds memory to the buffer
cache only when its workload requires more memory; a server at rest does not grow
its buffer cache.

SQL Server allocates much of its virtual memory to a buffer cache and uses the cache
to reduce physical I/O. Each instance of SQL Server automatically caches execution
plans in memory based upon available memory. Data is read from the database disk
files into the buffer cache. Multiple logical reads of the data can be satisfied without
requiring that the data be physically read again.

By maintaining a relatively large buffer cache in virtual memory, an instance of SQL
Server can significantly reduce the number of physical disk reads it requires.

Another method of providing performance improvement is using DBCC PINTABLE,
which is used to store tables in memory on a more or less permanent basis. It works
best for small tables that are frequently accessed. The pages for the small table are
read into memory one time, and then all future references to their data do not
require a disk read. SQL Server keeps a copy of the page available in the buffer cache
until the table is unpinned using the DBCC UNPINTABLE statement. This option should
be used sparingly as it can reduce the amount of overall buffer cache available for
SQL Server to use dynamically.
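Pinning a small, frequently accessed table might look like this in SQL Server 2000 and earlier (using the pubs sample database's titles table as an example):

```sql
USE pubs
DECLARE @db_id INT, @tbl_id INT
SET @db_id = DB_ID('pubs')
SET @tbl_id = OBJECT_ID('pubs..titles')
-- Keep the pages of the titles table in the buffer cache
DBCC PINTABLE (@db_id, @tbl_id)
-- ...later, release the pages again
DBCC UNPINTABLE (@db_id, @tbl_id)
```

Note that pinning does not read the table into memory by itself; pages are kept in the cache as they are read by normal queries.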

How can you use IIf in Transact-SQL?


Answer

This is a quite common question. It is usually asked by people arriving at SQL Server
with a background in Microsoft Access. They either want to use SQL Server as a
backend for their Access project, or they are otherwise upsizing from Access to SQL
Server. The answer, however, is usually not much appreciated at first:

There is no IIf in SQL Server's Transact SQL language!

Like it or not, such queries have to be rewritten using the CASE expression. Let's look
at a simple example:
SELECT
Customers.CustomerID
, Customers.CompanyName
, Customers.Country
, IIf([Country]="Germany","0049 " & [Phone],[Phone]) AS Telefon
FROM
Customers
This is a valid query in Access, which evaluates within Access' Northwind sample
database whether a Customer is located in Germany or not. If this is the case (pun
intended!), it automatically adds the international telephone number for Germany in
front of the phone number. If you try to run this in SQL Server's Query Analyzer, you'll
get:
Server: Msg 170, Level 15, State 1, Line 5
Line 5: Incorrect syntax near '='.
That's it. The query stops with this error message. So, as was mentioned above, the
query has to be rewritten using the CASE expression. That might look something like
this:
SELECT
Customers.CustomerID
, Customers.CompanyName
, Customers.Country
, CASE
WHEN Country='Germany'
THEN '0049 ' + Phone
ELSE Phone
END AS Phone
FROM
Customers
This is a valid Transact-SQL query, which SQL Server can understand and execute.
CASE is one of the most powerful commands in the Transact-SQL language. In
contrast to IIf, where you only evaluate one logical expression at a time, this
limitation does not exist for CASE. Try, for example, to put this in one single IIf
expression:
SELECT
Customers.CustomerID
, Customers.CompanyName
, Customers.Country
, CASE Country
WHEN 'Germany'
THEN '0049 ' + Phone
WHEN 'Mexico'
THEN 'Fiesta ' + Phone
WHEN 'UK'
THEN 'Black Pudding (Yuk!) ' + Phone
ELSE Phone
END AS Phone
FROM
Customers
Don't spend too much time here on the sense of this query, but you will get the idea
of what is possible with CASE. And once you are familiar with using CASE, you'll
hardly miss IIf anymore.
What happens when my integer IDENTITY runs out of scope?

Answer

Before we actually look at the answer, let's recall some basics of the IDENTITY
property and SQL Server's numerical data types.

You can define the IDENTITY property on columns of the integer data types and on
DECIMAL columns with scale 0. This gives you a range of:

TINYINT 0 to 255

SMALLINT -32,768 to 32,767

INT -2,147,483,648 to 2,147,483,647

BIGINT -2^63 to 2^63-1

When you decide to use the DECIMAL datatype you have a potential range from
-10^38 to 10^38-1.

So, keeping this in mind, we're now ready to answer the original question here. What
happens when an INTEGER IDENTITY value is about to run out of scope?
CREATE TABLE id_overflow
(
col1 INT IDENTITY(2147483647,1)
)
GO
INSERT INTO id_overflow DEFAULT VALUES
INSERT INTO id_overflow DEFAULT VALUES
SELECT * FROM id_overflow
DROP TABLE id_overflow

(1 row(s) affected)

Server: Msg 8115, Level 16, State 1, Line 2
Arithmetic overflow error converting IDENTITY to data type int.
Arithmetic overflow occurred.
This script creates a simple table with just one column of type INT. We have also
created the IDENTITY property for this column. But instead of adding more than
2 billion rows to the table, we simply set the seed value to the positive maximum
value for an INTEGER. The first row inserted is assigned that value. Nothing unusual
happens. The second insert, however, fails with the above error. Apparently SQL
Server does not start over again or try to fill any existing gaps in the
sequence. Actually, SQL Server does nothing automatically here. You have to do this
yourself. But what can you do in such a case?
Probably the easiest solution is to alter the data type of the column to BIGINT, or
maybe right on to DECIMAL(38,0) like so:
CREATE TABLE id_overflow
(
col1 INT IDENTITY(2147483647,1)
)
GO
INSERT INTO id_overflow DEFAULT VALUES
ALTER TABLE id_overflow
ALTER COLUMN col1 BIGINT
INSERT INTO id_overflow DEFAULT VALUES
SELECT * FROM id_overflow
DROP TABLE id_overflow

col1
--------------------
2147483647
2147483648

(2 row(s) affected)
If you know in advance that your table needs to keep that many rows, you can do:
CREATE TABLE bigint_t
(
col1 BIGINT IDENTITY(-9223372036854775808, 1)
)
GO
INSERT INTO bigint_t DEFAULT VALUES
SELECT * FROM bigint_t
DROP TABLE bigint_t

col1
--------------------
-9223372036854775808

(1 row(s) affected)
Or the DECIMAL(38,0) variation:
CREATE TABLE decimal_t
(
col1 DECIMAL(38,0) IDENTITY(-99999999999999999999999999999999999999, 1)
)
GO
INSERT INTO decimal_t DEFAULT VALUES
SELECT * FROM decimal_t
DROP TABLE decimal_t

col1
----------------------------------------
-99999999999999999999999999999999999999

(1 row(s) affected)
Those negative seed values may offend one's aesthetic sense, but with them you
shouldn't have to worry about running out of values for quite some time.

What is the difference between DELETE and TRUNCATE? Is one faster than the other?

Answer

DELETE logs each row affected by the statement in the transaction log
and physically removes the record from the file. The recording of each affected row
can make your transaction log grow massively. However, when you run your databases
in full recovery mode, this is necessary for SQL Server to be able to recover the
database to the most recent state in case of a disaster. The fact that each row is
logged also explains why DELETE statements can be slow.

TRUNCATE is faster than DELETE due to the way TRUNCATE "removes" the data.
Actually, TRUNCATE does not remove data, but rather deallocates whole data pages
and removes pointers to indexes. The data still exists until it is overwritten or the
database is shrunk. This action does not require great resources and is therefore very
fast. It is a common mistake to think that TRUNCATE is not logged. This is wrong. The
deallocation of the data pages is recorded in the log file. Therefore, BOL refers to
TRUNCATE operations as "minimally logged" operations. You can use TRUNCATE
within a transaction, and when this transaction is rolled back, the data pages are
reallocated and the database is back in its original, consistent state.
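This rollback behavior can be sketched in a small demo (the table name t1 is just for illustration):

```sql
CREATE TABLE t1 (col1 INT)
INSERT INTO t1 VALUES (1)
INSERT INTO t1 VALUES (2)

BEGIN TRANSACTION
TRUNCATE TABLE t1
SELECT COUNT(*) FROM t1 -- 0: the pages are deallocated
ROLLBACK TRANSACTION
SELECT COUNT(*) FROM t1 -- 2: the pages are reallocated
DROP TABLE t1
```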

Some limitations do exist for TRUNCATE.

• You need to be db_owner, ddl_admin, or the owner of the table to be able to issue a
TRUNCATE statement.

• TRUNCATE will not work on tables that are referenced by one or more
FOREIGN KEY constraints.
So if TRUNCATE is so much faster than DELETE, why should one use DELETE at all?
Well, TRUNCATE is an all-or-nothing approach. You cannot specify that only rows
matching a certain criterion be truncated. It's either all rows or none. You can, however,
use a workaround here. Suppose you want to delete more rows from a table than will
remain. In this case you can export the rows that you want to keep to a temporary
table, run the TRUNCATE statement, and finally reimport the remaining rows from the
temporary table. If your table contains a column with the IDENTITY property defined
on it, and you want to keep the original IDENTITY values, be sure to enable
IDENTITY_INSERT on the table before you reimport from the temporary table.
Chances are good that this workaround is still faster than a DELETE operation. You
can also set the recovery mode to "Simple" before you start this workaround, and
then back to "Full" once it is done. However, keep in mind that in this case you might
only be able to recover to the last backup. Ask yourself if this is good enough for
you!
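The workaround might be sketched like this, assuming a table big_table with an IDENTITY column id and a flag column marking the rows to keep (all names here are just for illustration):

```sql
-- export the rows you want to keep
SELECT * INTO #keep FROM big_table WHERE keep_flag = 1

-- deallocate everything quickly
TRUNCATE TABLE big_table

-- reimport, preserving the original IDENTITY values
SET IDENTITY_INSERT big_table ON
INSERT INTO big_table (id, keep_flag)
SELECT id, keep_flag FROM #keep
SET IDENTITY_INSERT big_table OFF

DROP TABLE #keep
```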

My application is very INSERT heavy. What can I do to speed up the performance of INSERTs?
Answer

Here are a variety of tips that can help speed up INSERTs.

1) Use RAID 10 or RAID 1, not RAID 5 for the physical disk array that stores your SQL
Server database. RAID 5 is slow on INSERTs because of the overhead of writing the
parity bits. Also, get faster drives, a faster controller, and consider turning on write
caching on the controller if it is not already turned on (although this has its
disadvantages, such as lost data if your hardware fails).

2) The fewer the indexes on the table, the faster INSERTs will be.

3) Try to avoid page splits. Ways to do this include using an appropriate fillfactor
and pad_index, rebuilding indexes often, and adding a clustered index on an
incrementing key for the table (this forces pages to be added one after another, so
page splits are not an issue).

4) Keep the column widths as narrow as possible.

5) If the data length in a column is consistent, use CHAR columns; if the data length
varies a lot, use VARCHAR columns.

6) Try to batch INSERTs rather than INSERTing one row at a time. But this can also
cause problems if the batch of INSERTs is too large.

None of these suggestions will radically speed up your INSERTs by themselves, but
put together, they all will contribute to overall faster INSERTs.
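Tip 6, for example, can be as simple as wrapping the single-row INSERTs into one explicit transaction, so the log is flushed once per batch instead of once per row (the table name t is just for illustration):

```sql
BEGIN TRANSACTION
INSERT INTO t (col1) VALUES (1)
INSERT INTO t (col1) VALUES (2)
INSERT INTO t (col1) VALUES (3)
-- ... more rows ...
COMMIT TRANSACTION
```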

My SQL Server seems to take memory, but never releases it. Is this normal?

Answer

If you are running SQL Server 7.0 or SQL Server 2000, and have the memory setting
set to dynamically manage memory (the default setting), SQL Server will
automatically take as much RAM as it needs (assuming it is available) from the
available RAM. Assuming that the operating system or other applications running on
the same physical server don't need more RAM, SQL Server will keep control of the
RAM, even if it really doesn't need it. The reason for this is that it is more
resource efficient for SQL Server to keep holding the RAM (even if it doesn't currently
need it) than to release and grab it over and over as memory needs change.

If your SQL Server is a dedicated SQL Server, it is very normal for SQL Server to take
memory, but to never release it.

If you have set SQL Server to use a minimum amount of memory (not a default
setting), once SQL Server grabs this amount, it will not give it up until it is restarted.
This can account for some instances of SQL Server not giving up memory.

If you have a non-dedicated SQL Server, and there are other applications running on
the same physical server, SQL Server will give up some of its memory if needed. But
this may not happen instantly. For example, if SQL Server needs a specific amount of
memory to complete a current task, it won't give up that memory until that task is
complete. In the meantime, your other application may cause your server to have
excessive paging, which can hurt performance. The best solution to the issue of SQL
Server having to fight for memory with other applications is to either add more RAM
to the server, or to move the other applications off the server to another server.
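On such a non-dedicated server you can also cap SQL Server's appetite yourself with sp_configure; a small sketch (the 2048 MB value is just an example):

```sql
EXEC sp_configure 'show advanced options', 1
RECONFIGURE

-- allow SQL Server to use at most 2048 MB of RAM
EXEC sp_configure 'max server memory', 2048
RECONFIGURE
```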
Is there any significant performance difference when joining tables across different
databases on the same server?

Answer

This is very easy to test yourself. For example, make a copy of pubs and call it pubs2.
Then create a query to JOIN two related tables from within pubs. Create one JOIN that
JOINs two tables in the same database, and create a second JOIN, but for one of the
JOINed tables, modify the query so that it points to the table in the other database.
Then run both queries and examine their query plans.

In virtually every case, the execution plans are identical, which tells you that the
performance of the query, whether it runs inside a single database or between two
databases on the same server, is more or less identical.

On the other hand, if the databases are on separate servers, performance will suffer
greatly due to network latency, etc.
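The two test queries might look like this, assuming pubs2 is your copy of pubs:

```sql
-- JOIN inside a single database
SELECT t.title, p.pub_name
FROM pubs..titles t
JOIN pubs..publishers p ON t.pub_id = p.pub_id

-- same JOIN, but one table now comes from the copy
SELECT t.title, p.pub_name
FROM pubs..titles t
JOIN pubs2..publishers p ON t.pub_id = p.pub_id
```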
Is there any performance difference between using SET or SELECT to assign values
in Transact-SQL?

Answer

There is virtually no difference in performance between using SET or SELECT to assign
values. In most cases, you will want to use the ANSI standard, which says you should
use SET. See the SQL Server Books Online for the exact syntax for both options.
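Both forms are shown below; SELECT has the extra ability to assign several variables in a single statement (the query against the pubs authors table is just an example):

```sql
DECLARE @a INT, @b INT

-- assignment with SET (one variable per statement)
SET @a = 1

-- assignment with SELECT (several variables at once)
SELECT @a = 1, @b = 2

-- SELECT can also assign from a query
SELECT @a = COUNT(*) FROM pubs..authors
```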

Which is faster when using SQL Server 2000, temp tables or the new table datatype?

Answer

Generally speaking, if the data you are dealing with is not large, then the table
datatype will be faster than using a temp table. But if the amount of data is large,
then a temp table most likely will be faster. Which method is faster is dependent on
the amount of RAM in your server available to SQL Server, and this of course can vary
from server to server. The greater the RAM in a server, the greater number of records
that can be efficiently stored in a table datatype. You may have to test both methods
to determine which method is best for your situation.

Here are some reasons why the table datatype, when used with reasonable amounts
of data, is generally faster than using a temp table:

• Records are stored in memory, not in a temp table in the tempdb database,
so performance is much faster.
• Table variables act like local variables and have a well-defined scope.
Whenever the batch, function, or stored procedure that created the table
variable goes away, the table variable is automatically cleaned up.

• When a table variable is used inside a stored procedure instead of a temp
table, fewer recompilations occur, reducing server overhead.

• Table variables require less locking and logging resources when compared to
temporary tables, reducing server overhead and boosting concurrency.
If you haven't learned how to use table variables yet, you need to take the time to do
so as soon as you can. They can be powerful tools in the correct situations.
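A minimal sketch comparing the two, for illustration:

```sql
-- table variable: scoped to the batch, cleaned up automatically
DECLARE @t TABLE (id INT, name VARCHAR(30))
INSERT INTO @t VALUES (1, 'one')
SELECT * FROM @t

-- equivalent temp table: lives in tempdb, must be dropped
CREATE TABLE #t (id INT, name VARCHAR(30))
INSERT INTO #t VALUES (1, 'one')
SELECT * FROM #t
DROP TABLE #t
```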

1) Do we need to create a primary key constraint on a table that already has a
clustered unique index? If so, why?

2) If we need a primary key constraint, should we delete the clustered unique
index we created, and then add the primary key constraint? My concern is
that we will end up with two clustered unique indexes on the same
column(s) that will add to database load times, etc.

3) Is there an advantage to using the primary key constraint to automatically
create the clustered unique index as opposed to just outright creating the
clustered unique index if they are on the same column?

Answer

Let's take a look at each of these three questions.


First Question: Technically speaking, a primary key is not required on a table for any
reason. A primary key serves two purposes. First, it enforces entity integrity of the
table, which means it ensures that there are no duplicate records in the table.
Duplicate records in a table can lead to all kinds of problems, and by adding a
primary key to a table, you eliminate this possibility. Second, a primary key can be used
along with a foreign key to ensure that referential integrity is maintained between
tables. Because of these two reasons, it is generally recommended that all tables
have a primary key.
By itself, a primary key does not have a direct effect on performance. But indirectly, it
does. This is because when you add a primary key to a table, SQL Server creates a
unique index (clustered by default) that is used to enforce entity integrity. But as you
have already discovered, you can create your own unique indexes on a table, which
have the same effect on performance. So, strictly speaking, a primary key does not
affect performance, but the index used by the primary key does.
Now to directly answer your question: you don't have to have primary keys on your
tables if you don't care about the benefits that arise from using them. If you like, you
can keep the current indexes you have, and assuming they are good choices for the
types of queries you will be running against the table, performance will be
enhanced by having them. Dropping your current indexes and replacing them with
primary keys will not help performance.
Second Question: I think I answered most of this question above. But I do want to
point out that you cannot have two clustered indexes on the same table. So if you did
want to add primary keys to the tables you currently have, you can, using one of two
techniques. First, you can choose to add a primary key using a non-clustered index,
or second, you can drop your current indexes, and then add a primary key using a
clustered index.
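The first technique might look like this (table and column names are just for illustration); the existing clustered unique index stays in place, and the primary key gets a nonclustered index:

```sql
ALTER TABLE orders
ADD CONSTRAINT pk_orders
PRIMARY KEY NONCLUSTERED (order_id)
```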
Third Question: Here's what I do. I assign primary keys to every table because this is
a best database design practice. But before I do, I evaluate whether or not the
primary key's index should be clustered or non-clustered, and then choose
accordingly. Since you can only have one clustered index, it should be chosen well.
See my clustered tips webpage for more details. Next, I then evaluate the table for
additional indexes that may be needed, and proceed accordingly.
It is not a good idea, from a performance perspective, to accept the default of a
clustered index on a primary key, as it may not be the best choice for the use of a
clustered index. In addition, it is not a good idea to "double up" on indexes. In other
words, don't put a Primary key non-clustered index on a column, and a clustered
index on the same column (this is possible, although never a good idea).
Always think about indexes and why they exist. Only add indexes where they are
needed, and nowhere else. Too many indexes can be as bad for performance as too
few indexes.

1) DELETE FROM emp and DELETE * FROM emp will both work in Oracle, but only
DELETE FROM emp works in SQL Server.

2) SELECT * FROM employee WHERE emp_name LIKE '%C%' displays all records containing a
'C' anywhere in emp_name, e.g. ACF, CGT, TGC.
SELECT * FROM employee WHERE emp_name LIKE '%%C' (equivalent to '%C') displays all records
with 'C' as the last character, e.g. ABC, VBC.

3) To rename a table in SQL Server:

CREATE TABLE mytest (myname CHAR(20))
EXEC sp_rename @objname = 'mytest', @newname = 'Newmytest'

4) Getting the list of user tables from SQL Server:


SELECT *
FROM sysobjects
WHERE type = 'U' order by name
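The same list can also be obtained through the more portable INFORMATION_SCHEMA views:

```sql
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
ORDER BY TABLE_NAME
```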

5) I wrote a procedure in SQL Server (T-SQL) and I want to schedule it to run every
day at 7:00 AM. How do I schedule it?

It is a simple task. Go to the "Management" folder in Enterprise Manager and
choose the "Jobs" option, then:
• Right-click and select "New Job"
• Give the job a name
• Select the "Steps" tab and click the "New" button
• Click the "Open" button and choose the procedure you want to run
• Give the step a name and click OK
• Select the "Schedules" tab and click the "New Schedule" button
• Give the schedule a name and click the "Change" button
• Choose the "Daily" option button and set the daily frequency
• Click OK
