Abstract. Automatic index selection has received significant attention in the
autonomic computing field. Previous work has focused on providing tools and
algorithms to help the DBA choose indices for a given static workload. We
present an approach to index management that works for workloads that may
change dynamically, with no human intervention at all. We automatically
monitor statements submitted to the database with a built-in component that
interacts directly with the DBMS, creating and dropping indices on-the-fly. This
paper presents a mechanism to integrate automatic index management
components into relational databases. We discuss the heuristics considered and the
experimental results observed for a PostgreSQL implementation.
1. Introduction
The database tuning task consists of fine-grained adjustments that aim to obtain better
performance from DBMS-based applications through efficient use of the available
computational resources. It is one of the main maintenance tasks performed by a database
administrator (DBA). Tuning considers hardware configuration, physical design and query
specifications, as well as commercial DBMS parameters.
A good approach to the tuning process is to understand the functioning of
the entire system, but to carry out improvements only at specific points at a time
[Shasha and Bonnet 2003]. Some factors have made the tuning process more complex,
as in the case of parallel machines and systems, which raise new questions such as
data allocation across multiple disks. Moreover, with each new edition of commercial DBMSs,
XXI Simpósio Brasileiro de Banco de Dados
additional operational parameters appear that must be adjusted. The tuning activity thus
becomes even more important and, at the same time, more expensive, as highly specialized
professionals are needed [Chaudhuri and Weikum 2000].
In our work we present an approach to fully automate the index self-tuning
process, with no human intervention at all. We use a tuning component integrated
with the DBMS optimizer to choose good indices and create them when needed. The
component also detects and drops indices that are evaluated as harmful. Our main
goals are, on one hand, to enable the experienced DBA to focus on more complex and
unattended situations and, on the other hand, to offer a completely automatic solution
when no DBA or database expert is present.
It should be noted that some of the existing approaches to index self-tuning,
mostly in commercial DBMSs, are based only on index suggestion for specific, static
workloads. Indeed, it is up to the DBA to choose the right workload and
parameters. Moreover, the DBA also needs to determine when to execute index creation
(or destruction) commands. Our mechanism enables a complete automation of this whole
process for dynamic workloads.
We have used the PostgreSQL [Pos ] DBMS, an open source and full-fledged
relational DBMS, to validate our ideas. We have coded a new version of PostgreSQL
that includes hypothetical indices, which help the DBA with what-if simulations. It also
enables our component to detect when new indices must be built or dropped. We also
observe that no automatic tuning advisors (or wizards) are present in PostgreSQL.
We have tried many different heuristics to support the DBMS's decision on the
automatic creation of an index. We present here the Benefits Heuristic and give some
practical results with a transactional (TPC-C) workload.
The rest of the paper is organized as follows: in the next section, we present and
discuss related work within a classification of self-tuning database systems. A
self-tuning engine that enables autonomic index management is presented in Section 3.
Then, in Section 4, we present implementation and architectural issues, as well as the
practical results obtained. Finally, Section 5 lists our contributions and future and ongoing work.
[Agrawal et al. 2000], materialized views suggestion is also considered. The objective of
index suggestion tools is to generate an index set for a given input workload
obtained by the DBA. The workload is broken into single-statement inputs and candidate
indices for each statement are generated. As not all of the indices evaluated exist
in the database, a separate module enables the consideration of hypothetical indices. The
candidate indices found for a statement are the hypothetical indices that would bring the
largest benefit to its execution if they were materialized. Such candidate indices are
then arranged into configurations and costed by the query optimizer. A greedy algorithm
extends the number of statements and index configurations considered until a best index
configuration is determined for the workload as a whole.
The work of [Lohman et al. 2000, Zilio et al. 2004] suggests that the index se-
lection heuristic be tightly integrated with the optimizer. In [Zilio et al. 2004] the index
selection tool is augmented with materialized views suggestion. The optimizer itself is
extended with an index suggestion mode. In this mode, before the optimization of a given
query, hypothetical indices for all relevant column usages are generated. As the number
of possible multi-column indices for a query can be large, an heuristic, called “Smart
column Enumeration for Index Scans” (SAEFIS), is proposed to limit the enumeration.
Then, optimization proceeds normally. At the end, the indices picked by the optimizer are
recommended for the query. The single query recommendations serve as input to an index
selection heuristic that tries to find the best index configuration for a given workload. In
their strategy, the benefit assigned to each index is the entire benefit of the winning set
of indices found for the query. Then, the problem is viewed as the 0-1 knapsack problem
and the classic greedy heuristic is used to find a solution. In our work, we have used the
SAEFIS heuristic for the hypothetical index enumeration step (see Section 3).
In [Chaudhuri et al. 2004] the authors study the complexity of the index selection
problem and prove it to be both NP-hard and hard to approximate. Reviewing previous
approaches, a new heuristic is proposed for assigning benefits to individual indices using
a linear programming strategy. This strategy is more refined than the one presented in
[Lohman et al. 2000]. The problem of selecting indices for the workload is also viewed
as the 0-1 knapsack problem and the classic greedy heuristic is used to find a solution.
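As an illustration, the classic greedy heuristic referred to above can be sketched as follows. This is an illustrative Python sketch, not the implementation of any of the cited works: the index names, benefit and size values, and the storage-budget formulation are hypothetical; the benefit-per-storage-unit ordering is the classic greedy criterion for the 0-1 knapsack problem.

```python
# Sketch of the classic greedy 0-1 knapsack heuristic for index selection.
# Each candidate is (name, estimated benefit, size); the "knapsack" is a
# storage budget. All concrete values below are hypothetical.

def greedy_index_selection(candidates, storage_budget):
    """Pick indices by descending benefit per unit of storage until the
    storage budget is exhausted."""
    chosen = []
    used = 0
    # Sort by benefit density, the classic greedy criterion for knapsack.
    for name, benefit, size in sorted(
            candidates, key=lambda c: c[1] / c[2], reverse=True):
        if used + size <= storage_budget:
            chosen.append(name)
            used += size
    return chosen

candidates = [
    ("idx_customer_id", 120.0, 40),   # (name, est. benefit, size in MB)
    ("idx_orders_date", 90.0, 10),
    ("idx_stock_item", 30.0, 60),
]
print(greedy_index_selection(candidates, storage_budget=60))
# → ['idx_orders_date', 'idx_customer_id']
```

This greedy solution is not optimal in general for 0-1 knapsack, which is why [Chaudhuri et al. 2004] study more principled benefit-assignment strategies.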
None of these previous proposals addresses the automation of the complete cycle
of workload detection, index selection and actual data structure creation or destruction.
It is important to stress that the heuristics previously presented in the literature do not
operate on-line. Therefore, the presence of a DBA is mandatory during the index tuning
process, in order to characterize the system's workload. Moreover, the final decision on
index management (creation or dropping) also requires human intervention.
Our work here studies a possible solution to these problems through the use of
heuristics embedded in an autonomic system component. Our heuristic decides on-line
which indices should be managed to speed workload execution. This approach was also
adopted in [Sattler et al. 2004], though with a distinct cost model, especially with respect
to the cumulative benefit considered for index creation.
the throughput, making the best possible use of the available computational resources
through the creation of an adequate index design. We do accommodate seasonal changes
in workload patterns as long as there is enough time between changes for our component
to recognize new workload characteristics and modify the index design.
Our architecture uses a tuning component based on software agents that monitors
the system and makes decisions autonomically. Software engineering techniques needed
to build such a component are further discussed in [Costa et al. 2005, Salles 2004]. The
self-tuning component interacts with DBMS components through a generic self-tuning
process with the following stages:
• Information Retrieval: the self-tuning component obtains measurements and in-
formation from DBMS components.
• Situation Evaluation: from the information obtained, the self-tuning component
updates data structures that will guide its tuning decisions.
• Possible Alterations Enumeration: the self-tuning component uses heuristics to
enumerate a set of alternative adjustments that can bring improved system perfor-
mance. During this enumeration, the component can simulate hypothetical sce-
narios using mechanisms previously built in the DBMS components.
• Alterations Accomplishment: the self-tuning component applies the chosen mod-
ifications to system components.
The self-tuning process presented here is based on a feedback control loop. In
the loop, tuning decisions are progressively refined by the evaluation of new measure-
ments extracted from system components. The use of a feedback loop is one of the main
characteristics of self-tuning database architectures proposed so far [Weikum et al. 2002].
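The four-stage feedback loop described above can be sketched in a few lines. This is a minimal illustration, not our actual component: the `dbms` and `heuristic` interfaces and their method names are hypothetical placeholders for the interactions described in the four stages.

```python
# Minimal sketch of the generic self-tuning feedback control loop.
# The dbms and heuristic objects are hypothetical interfaces standing in
# for the DBMS components and the tuning heuristic.

def tuning_loop(dbms, heuristic, iterations):
    for _ in range(iterations):
        # 1. Information Retrieval: obtain measurements from DBMS components.
        stats = dbms.collect_statistics()
        # 2. Situation Evaluation: update structures guiding tuning decisions.
        heuristic.update(stats)
        # 3. Possible Alterations Enumeration: enumerate adjustments, possibly
        #    simulating hypothetical scenarios via DBMS mechanisms.
        actions = heuristic.enumerate_alterations(dbms)
        # 4. Alterations Accomplishment: apply the chosen modifications.
        for action in actions:
            dbms.apply(action)
```

Each iteration refines the previous decisions using the newly extracted measurements, which is precisely the feedback character of the loop.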
One must also remember that our autonomic component runs concurrently with
other DBMS tasks. This implies that the Configuration Selection heuristic must not be
computationally intensive. We also explicitly compute the accumulated benefit of actual
indices present in the database and drop indices when they become harmful to system
performance. Furthermore, due to our built-in mechanism, we may capture all DBMS
statements, including stored procedures, and thus make automatic fine-tuning decisions.
Benefits Heuristic
We first need to introduce the factors that are taken into account:
1. CA : the cost, generated by the optimizer, corresponding to the best query execu-
tion plan over the actual indices configuration for a given query.
2. CH : the cost, generated by the optimizer, for the best execution plan considering
both actual and hypothetical indices for a given query.
3. CN : the cost, also generated by the optimizer, considering a physical configuration
with no indices (neither actual, nor hypothetical) for a given query.
4. CU : the estimated index maintenance cost when an update operation happens.
Usually this cost is not determined by the optimizer. Therefore, we have estimated
it in our implementation according to the cost model of the DBMS being used.
5. BI : the benefit that index I brings to the statement being evaluated. This benefit
is determined by our heuristic, distinguishing actual and hypothetical indices.
6. AccBI : the accumulated benefit that index I brings to all statements already eval-
uated. Again, the heuristic will update the accumulated benefit considering if the
index is hypothetical or actual.
7. CCI : the estimated creation cost for index I. Usually optimizers do not calculate
this factor. Therefore, we have estimated it in our implementation.
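The per-index factors above can be tracked with a small bookkeeping record. The following Python sketch is illustrative only: the field and method names are ours, not the paper's implementation, but the fields mirror the factors AccBI and CCI and the materialization condition used by the heuristic.

```python
# Illustrative per-index bookkeeping for the Benefits Heuristic factors.
from dataclasses import dataclass

@dataclass
class IndexStats:
    name: str
    hypothetical: bool           # True until the index is materialized
    acc_benefit: float = 0.0     # AccB_I: accumulated benefit
    creation_cost: float = 0.0   # CC_I: estimated creation cost

    def credit(self, benefit):
        """Add the benefit B_I a statement derives from this index."""
        self.acc_benefit += benefit

    def debit(self, maintenance_cost):
        """Subtract the maintenance cost C_U charged by an update."""
        self.acc_benefit -= maintenance_cost

    def should_materialize(self):
        # A hypothetical index is created once its accumulated benefit
        # has paid for its estimated creation cost.
        return self.hypothetical and self.acc_benefit > self.creation_cost
```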
When a new SQL operation is submitted to the DBMS, the optimizer generates the
best query plan considering the actual indices present in the database. On the occurrence
of this event, the index tuning component is notified that a new statement is available and
invokes the Candidate Generation heuristic. This heuristic interacts with the optimizer to
obtain the best query plan considering both actual and hypothetical indices. The hypo-
thetical indices selected for this plan are taken as the candidate indices for the statement.
The component also performs an additional interaction with the optimizer to obtain the
cost of the best query plan that uses no indices. After these steps, we will have the values
of the factors CA , CH and CN for the current statement. The Benefits Heuristic can then
be invoked. It treats queries and updates distinctly, as will be described in the following.
Query Evaluation
If the statement submitted is a query, we apply the procedure detailed in Figure 1. For
each candidate index, we calculate its benefit and update its accumulated benefit. The
benefit is obtained as the difference between the cost of the best query plan over the
actual indices configuration and the cost of the best query plan over the configuration that
includes actual and candidate indices. As the costs are obtained for the best query plans,
the benefit calculated is always nonnegative.
If various candidate indices are used in the query plan, we attribute the same
benefit to all of them. As in this case the benefit brought by each index is not independent
of the other indices used, we incur double-counting. The benefit calculation
procedure follows the one proposed in [Lohman et al. 2000]. More sophisticated schemes
can be devised, such as the one in [Chaudhuri et al. 2004].
After calculating the benefit for an index, we create it if and only if its accumulated
benefit surpasses its estimated creation cost. Our objective is to create indices
that are used enough times to compensate for their creation cost.
For each actual index used to process the query, the heuristic calculates the benefit
as the difference between the cost of the best query plan over a configuration with no
indices and the cost of the best query plan over the actual indices configuration. Note that,
once again, the benefit will be nonnegative. As actual indices are already materialized,
we just update their benefits to reflect their usage.
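The query-evaluation step just described can be sketched as follows. This is an illustrative Python rendering of the procedure, not the paper's actual code; the dictionary-based bookkeeping is our own assumption, while the cost values CA, CH and CN come from the three optimizer interactions described above.

```python
# Sketch of the query-evaluation step of the Benefits Heuristic.
# c_a: best-plan cost over actual indices (C_A)
# c_h: best-plan cost over actual + hypothetical indices (C_H)
# c_n: best-plan cost with no indices at all (C_N)

def evaluate_query(c_a, c_h, c_n, candidate_indices, actual_indices):
    """candidate_indices / actual_indices: dicts name -> record with
    'acc' (accumulated benefit) and, for candidates, 'cc' (creation cost)."""
    to_create = []
    # Every candidate index in the winning plan gets the same benefit
    # C_A - C_H (double-counting, as in Lohman et al. 2000); it is
    # nonnegative because best plans are being compared.
    benefit = c_a - c_h
    for name, stats in candidate_indices.items():
        stats["acc"] += benefit
        if stats["acc"] > stats["cc"]:   # benefit has paid for creation
            to_create.append(name)
    # Actual indices used by the plan are credited C_N - C_A, reflecting
    # their usage; they are already materialized, so nothing is created.
    for stats in actual_indices.values():
        stats["acc"] += c_n - c_a
    return to_create
```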
One of the factors we use in our procedure, namely index creation cost, is not cal-
culated by typical query optimizers. In our approach, we have devised a formula for index
creation cost in accordance with the cost model of the DBMS we have used (PostgreSQL):
CCI = 2P + cR log R
In the formula, P represents the number of pages in the underlying table, R is the
number of tuples and c is a factor that measures what percentage of the cost to obtain a
page from disk should be applied to obtain the cost to process a tuple in main memory.
The formula calculates the number of pages touched to scan the table’s data, sort it and
then construct the index in one pass.
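The creation-cost formula can be written directly as a function. One assumption in this sketch: the paper does not specify the logarithm's base, so base 2, typical of sort cost models, is used here.

```python
import math

# CC_I = 2P + cR log R: scan the table's P pages (read-and-write-style 2P
# term) plus sort R tuples at CPU cost factor c per page-equivalent.
# Log base 2 is an assumption; the paper leaves the base unspecified.

def index_creation_cost(pages, tuples, c):
    return 2 * pages + c * tuples * math.log2(tuples)
```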
Considering index creation costs explicitly differentiates our approach from others
that make index suggestions and leave the actual materialization decision to the DBA.
Update Evaluation
After the procedure for query evaluation is applied, we compute index mainte-
nance costs and debit these costs from the accumulated benefit of all candidate and actual
indices affected by the update. To estimate index maintenance costs, we have introduced
the factor CU . In our implementation, we have supposed that the cost to update any of the
indices affected by the statement is identical. This could easily be extended to make the
factor CU a function of the index affected. To calculate CU , we have used a cost formula
that estimates the number of pages written by the update:
CU = 2P(r/R) + cr
Again, this formula was created in accordance with PostgreSQL's cost model.
The r factor represents the number of tuples updated by the command, and R, P and c are
analogous to those discussed before for index creation costs. The first term of the formula
computes the cost to update a number of pages proportional to the fraction of
table records that were updated, multiplied by the table size (in pages). We thereby make
the simple, yet effective, assumption that updated records are uniformly distributed
among the pages of the table. The first term is doubled since it is necessary to read and
write each page. The second term of the formula computes the cost to process each of the
updated records in main memory.
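The maintenance-cost formula and the debit step can be sketched together. As in the discussion above, the same CU is charged to every affected index; the dictionary bookkeeping is an illustrative assumption of ours.

```python
# C_U = 2P(r/R) + cr: read and write the fraction r/R of the table's P
# pages (uniform-distribution assumption), plus CPU cost c per updated tuple.

def update_cost(pages, tuples, updated, c):
    return 2 * pages * (updated / tuples) + c * updated

def debit_update(affected_indices, cost):
    # The same C_U is debited from the accumulated benefit of every
    # candidate and actual index affected by the update.
    for stats in affected_indices.values():
        stats["acc"] -= cost
```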
4. Implementation
We have implemented our component within PostgreSQL 7.4 beta 3 [Pos ], running on
Red Hat Linux 9, kernel 2.4.20-30.9, on a Pentium 4 2GHz server with 512MB of
RAM; all code was compiled with gcc and g++ 3.2.2. We have chosen PostgreSQL not only
because it is an open source DBMS, but also because it is highly modularized and well
documented. Furthermore, it is a robust and widely used DBMS, so practical results on it
can reflect the behavior expected in real deployments.
We have followed an index selection approach based on optimizer estimates,
the same way as considered in other related works (e.g. [Finkelstein et al. 1988,
Lohman et al. 2000]). In these proposals, the database server is extended to allow the
simulation of hypothetical indices. We have coded similar server extensions for Post-
greSQL, but detailing these is beyond the scope of this paper. Our prototype, and the
server extensions, are freely available at [PUCDB ]. Detailed results of our autonomic
index management research work are also presented in [Salles 2004, Morelli 2006].
We consider for our experiments here the Database Test 2 (DBT-2) toolkit pro-
vided by the Open Source Development Labs (OSDL) [OSDL-DBT2 ]. Its origin is the
Transaction Processing Performance Council’s TPC-C benchmark [TPC-C ]. This DBT-2
toolkit simulates a workload that represents a wholesale parts supplier operating out of a
number of warehouses and their associated sales districts. The toolkit, as is usual for
benchmarks, comes with a set of suggested (human-created) indices that improve the
evaluation of the given workload. Some indices result from primary key creation, while
others aim at increasing query processing performance. Due to space limitations,
we refer the reader to [OSDL-DBT2 , Salles 2004] for further details.
In order to evaluate the contribution brought by our index tuning component, we
have established three relevant configurations for running our experiments:
1. A database with no indices at all and our component turned off, which we call
I0C0 (both indices and component zero). Setting aside small variations due to
transaction submission frequency, we expect this to yield the smallest throughput,
since all queries are executed through full scans on the related tables. We remove
all indices, including those that refer to primary keys.
2. No indices at the database and component turned on (I0C1 - indices zero, compo-
nent one). We expect our component to find and materialize, automatically, those
indices that may be beneficial to run the workload in consideration.
3. Component turned off and the database with the indices suggested by the DBT-2
toolkit (I1C0 - indices one, component zero). Here we expect the throughput to be
the greatest, since we use the indices that the DBT-2 toolkit implementers have
chosen and the system does not run the tuning component.
There are still two variables that may affect the experiments' throughput. First,
the number of warehouses used to create the database may be viewed as a scale factor:
the greater the number of warehouses, the greater the tables' sizes. The system load also
grows, since more terminals simulating users are created.
The second variable to be considered is how long the experiment takes. Especially
when the component is active, time may directly influence the observed throughput.
Indeed, the component goes through two steps while analyzing and adapting the
database's indices. The first is the learning step, when the best indices for submitted
commands are detected and materialized. The second is the steady state, in which
the component only verifies whether the existing indices are still adequate; no indices
are materialized anymore. The throughput in the steady-state phase is considerably
greater than during learning periods, leading to a greater average throughput.
Throughput Results
In our experiments we have tried out a few configurations, basically varying the
number of warehouses and the total execution time. In [Salles 2004] we give the
complete test results not shown here due to space limitations.
Figure 3. NOTPM (new-order transactions per minute) by number of warehouses (1 to 4) for the three configurations: no indexes, component off (I0C0); no indexes, component on (I0C1); indexes created, component off (I1C0).
Figure 3 shows the results obtained for a 90-minute test. It is worth noting
that the throughput for the I0C0 configuration decreases as the number of warehouses
increases. This is due to full-scan query processing, which performs worse as
the database size increases. When more terminals submit queries, data contention
quickly becomes a problem as well.
As expected, an intermediate throughput appears for the I0C1 configuration. The
tuning component eventually reaches the end of the learning step and the workload re-
mains active for a while with adequate indices. Therefore, the throughput gets closer to
the I1C0 configuration throughput as the total execution time increases.
Besides the number of indices created by our component, it is also worth evaluating
their quality. We observed that the component chose roughly the same indices in each
experiment, even when we varied the database size. There is a set of 10 indices
repeatedly created, as given in Table 1.
Table 1 shows database tables for which indices were created along with the
columns indexed in each table. In the first line of the Table, we show an index for pri-
mary key enforcement on the customer database table. The index is on the warehouse
id (c w id), the sales district id (c d id), and the customer id (c id). We show the index
as it was created by the toolkit and as it was detected by the tuning component. Note
that the component has no knowledge about semantic constraints on the table, such as
primary and foreign keys. It bases its decisions on which indices are adequate to lower
the workload’s processing cost.
On the tenth line of the table, we present an index on the orders table that was
created for performance enhancement. The index suggested by the toolkit is on the ware-
house id (o w id), the sales district id (o d id), and the customer id (o c id). The compo-
nent suggests an index with the same columns plus the order id (o id).
Indices presented in the seventh and eleventh lines of the Table were created by
the component only when the database was populated with more than one warehouse; all
other indices were created by the component in all database configurations.
It should be noted that most indices suggested by the component and by the toolkit
are very similar, the only difference being the column order. The reason is that, for each
column group detected by the component, the column order is established according to
the corresponding positions in the base table, whereas for the indices created from
the DBT-2 toolkit suggestion, columns are ordered by expected selectivity. We observe,
however, that this difference in column ordering did not change the results obtained for
the workload in consideration.
Besides those differences, we can also note from Table 1 that there is an index
among those suggested by the toolkit that was not created by our component. Indeed, as
our tuning component cares only about indices that improve workload performance,
and the warehouse table is always small (in our experiments it contained at most 4
tuples), from the optimizer's point of view there is no need to create an index for it. The
corresponding toolkit index is, as shown, a primary key constraint enforcement index.
A more interesting situation is the component's decision to create an index for
table new order on column no w id. This index allows us to scan the table ordered by
warehouse id. There is a workload query for which the optimizer estimates that such an
index becomes interesting as the table size increases. Thus, even though this query has
a low frequency with respect to the complete workload, the expected benefits justify the
index creation.
In order to compare the quality of the indices, whether suggested by the DBT-2
toolkit or suggested (and created) by our component, we have used the optimizer to
estimate the execution cost of each workload command in both configurations. Next, we
have weighted these costs by the commands' expected frequencies, obtaining a weighted
cost for each command. The results are given in Figure 4, with the sum of all weighted
costs obtained for the whole workload and a 4-warehouse database.
Figure 4. Sum of weighted costs over the whole workload: component-created indexes vs. toolkit-suggested indexes.
The weighted cost obtained for the indices created by the component was 11.64%
lower than the cost obtained for the indices suggested by the toolkit. The extra index
created by the component for table new order on column no w id has a fundamental role:
without this index, we would instead have obtained a weighted cost 4.79% higher.
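The weighted cost comparison above can be computed as a simple frequency-weighted sum. The cost and frequency values in this Python sketch are hypothetical illustrations, not the paper's measurements.

```python
# Frequency-weighted workload cost: each command's optimizer-estimated
# cost weighted by its expected frequency, summed over the workload.

def weighted_workload_cost(commands):
    """commands: iterable of (estimated_cost, expected_frequency)."""
    return sum(cost * freq for cost, freq in commands)

# Hypothetical illustration: relative reduction of one configuration's
# weighted cost with respect to another's.
component = [(10.0, 0.45), (4.0, 0.43)]   # (cost, frequency) per command
toolkit = [(12.0, 0.45), (4.5, 0.43)]
reduction = 1 - weighted_workload_cost(component) / weighted_workload_cost(toolkit)
```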
5. Conclusions
In this paper we have presented a self-tuning approach that allows autonomic index
management for dynamic workloads. One main contribution is a built-in component
that captures the statements processed by the DBMS, including stored procedures. As a
matter of fact, everything that goes through the optimizer is considered and, consequently,
good tuning decisions can be made. The heuristic embedded in the automatic tuning
component also distinguishes itself from previous approaches by evaluating on-line which
indices should be created and destroyed. The heuristic explicitly considers the balance
between query speedup and index maintenance costs.
We have conducted experiments with a transactional workload to check the vi-
ability of our approach. Our results indicate that the component is capable of altering
the database’s index design dynamically, though some time is needed for it to recognize
the workload’s characteristics and effectively materialize indices. The component
consistently achieved an index design that favors transactional throughput. Another
distinction from previous works is that we do not restrict index tuning to static workloads,
as most tuning wizards usually do.
We are currently studying the extension of this work to deal with more complex
workloads, such as those involving ad hoc queries and decision support applications, and
index updates [Morelli 2006]. We also plan to bring enhancements to the Benefits Heuris-
tic by looking at alternative ways to attribute benefits to individual candidate indices and
by including storage space limitation in the on-line decision procedure.
References
Aboulnaga, A. and Chaudhuri, S. (1999). Self-tuning histograms: building histograms
without looking at data. In Procs ACM SIGMOD Intl. Conf. on Management of Data,
pages 181–192.
Agrawal, R., Chaudhuri, S., Das, A., and Narasayya, V. (2003). Automating layout of
relational databases. In Procs IEEE Intl. Conf. on Data Engineering (ICDE), pages
607–618.
Agrawal, S., Chaudhuri, S., Kollar, L., Marathe, A., Narasayya, V., and Syamala, M.
(2004). Database tuning advisor for Microsoft SQL Server 2005. In Procs Intl. Conf.
Very Large Databases (VLDB), pages 1110–1121.
Agrawal, S., Chaudhuri, S., and Narasayya, V. (2000). Automated selection of material-
ized views and indexes for SQL databases. In Procs Intl. Conf. Very Large Databases
(VLDB), pages 496–505.
Brown, K. P., Mehta, M., Carey, M. J., and Livny, M. (1994). Towards automated per-
formance tuning for complex workloads. In Procs Intl. Conf. Very Large Databases
(VLDB), pages 72–84.
Chaudhuri, S., Datar, M., and Narasayya, V. (2004). Index selection for databases: A
hardness study and a principled heuristic solution. IEEE Transactions on Knowledge
and Data Engineering, 16(11):1313–1323.
Chaudhuri, S. and Narasayya, V. (1998a). AutoAdmin “what-if” index analysis utility. In
Procs ACM SIGMOD Intl. Conf. on Management of Data, pages 367–377.
Chaudhuri, S. and Narasayya, V. (1998b). Microsoft index tuning wizard for SQL Server
7.0. In Procs ACM SIGMOD Intl. Conf. on Management of Data, pages 553–554.
Chaudhuri, S. and Weikum, G. (2000). Rethinking database system architecture: Towards
a self-tuning RISC-style database system. In Procs Intl. Conf. Very Large Databases
(VLDB), pages 1–10.
Chen, C. M. and Roussopoulos, N. (1993). Adaptive database buffer allocation using
query feedback. In Procs 19th Intl. Conf. Very Large Databases (VLDB), pages 342–
353.
Chen, C. M. and Roussopoulos, N. (1994). Adaptive selectivity estimation using query
feedback. In Procs ACM SIGMOD Intl. Conf. on Management of Data, pages 161–172.
188
XXI Simpósio Brasileiro de Banco de Dados
Costa, R., Lifschitz, S., de Noronha, M., and Salles, M. (2005). Implementation of an
agent architecture for automated index tuning. In Procs IEEE Intl. Workshop on Self-
Managing Database Systems (SMDB).
Dias, K., Ramacher, M., Shaft, U., Venkataramani, V., and Wood, G. (2005). Automatic
performance diagnosis and tuning in Oracle. In Procs Biennial Conf. on Innovative
Data Systems Research (CIDR), pages 84–94.
Faloutsos, C., Ng, R. T., and Sellis, T. K. (1995). Flexible and adaptable buffer manage-
ment techniques for database management systems. IEEE Transactions on Computers,
44(4):546–560.
Finkelstein, S., Schkolnick, M., and Tiberio, P. (1988). Physical database design for
relational databases. ACM Transactions on Database Systems, 13(1):91–128.
Frank, M., Omiecinski, E., and Navathe, S. (1992). Adaptive and automated index selec-
tion in RDBMS. In Procs Intl. Conf. on Extending Database Technology (EDBT), pages
277–292.
Ganti, V., Lee, M., and Ramakrishnan, R. (2000). Icicles: Self-tuning samples for ap-
proximate query answering. In Procs Intl. Conf. Very Large Databases (VLDB), pages
176–187.
Johnson, T. and Shasha, D. (1994). 2Q: A low overhead high performance buffer man-
agement replacement algorithm. In Procs Intl. Conf. Very Large Databases (VLDB),
pages 439–450.
Lee, M. L., Kitsuregawa, M., Ooi, B. C., Tan, K.-L., and Mondal, A. (2000). Towards
self-tuning data placement in parallel database systems. In Procs ACM SIGMOD Intl.
Conf. on Management of Data, pages 225–236.
Lifschitz, S. and Milanés, A. (2005). Design and implementation of a global self-tuning
architecture. In Procs Brazilian Symp. on Databases (SBBD), pages 70–84.
Lohman, G., Valentin, G., Zilio, D., Zuliani, M., and Skelley, A. (2000). DB2 Advisor: An
optimizer smart enough to recommend its own indexes. In Procs IEEE Intl. Conf. on
Data Engineering (ICDE), pages 101–110.
Morelli, E. T. (2006). Updating database indices automatically (in Portuguese). Master’s
thesis, Departamento de Informática, PUC-Rio.
O’Neil, E., O’Neil, P., and Weikum, G. (1993). The LRU-K page replacement algorithm for
database disk buffering. In Procs ACM SIGMOD Intl. Conf. on Management of Data,
pages 297–306.
OSDL-DBT2. Open Source Development Labs Database Test 2.
http://www.osdl.org/lab_activities/kernel_testing/
osdl_database_test_suite/osdl_dbt-2/.
Papadomanolakis, S. and Ailamaki, A. (2004). AutoPart: Automating schema design for
large scientific databases using data partitioning. In Procs IEEE Intl. Conf. on Scientific
and Statistical Database Management (SSDBM), pages 383–392.
Pos. PostgreSQL. http://www.postgresql.org.
PUCDB. PUC-Rio database self-tuning group. http://www.inf.puc-rio.br/ postgresql.