
A data model in software engineering is an abstract model that documents and organizes the business data for communication between team members and is used as a plan for developing applications, specifically for how data is stored and accessed.
Data. Generally, data represent a structured codification of single primary entities, as well as of transactions involving two or more primary entities. For example, for a retailer, data refer to primary entities such as customers, points of sale and items, while sales receipts represent the commercial transactions. Under another definition, data are recorded (captured and stored) symbols and signal readings. Data are very important: they can be a main ingredient of a company's success, and network security must be improved to avoid the loss of data.

Information. Information is the outcome of extraction and processing activities carried out on data, and it appears meaningful for those who receive it in a specific domain. For example, to the sales manager of a retail company, the proportion of sales receipts in the amount of over 100 per week, or the number of customers holding a loyalty card who have reduced by more than 50% the monthly amount spent in the last three months, represent meaningful pieces of information that can be extracted from raw stored data.

Knowledge. Information is transformed into knowledge when it is used to make decisions and develop the corresponding actions. Therefore, we can think of knowledge as consisting of information put to work in a specific domain, enhanced by the experience and competence of decision makers in tackling and solving complex problems. For a retail company, a sales analysis may detect that a group of customers, living in an area where a competitor has recently opened a new point of sale, have reduced their usual amount of business. In this sense, one purpose of database systems is to manage knowledge so that it is not lost. The knowledge extracted in this way will eventually lead to actions aimed at solving the problem detected, for example by introducing a new free home delivery service for the customers residing in that specific area. We wish to point out that knowledge can be extracted from data both in a passive way, through the analysis criteria suggested by the decision makers, or through the active application of mathematical models.
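As a rough sketch of how such information can be extracted from raw sales data, the query below computes the weekly share of receipts over 100. The table and column names (sales_receipts, receipt_date, total_amount) are hypothetical, and PostgreSQL-style SQL is assumed.

    -- Hypothetical schema: sales_receipts(receipt_id, receipt_date, total_amount)
    SELECT date_trunc('week', receipt_date) AS sales_week,
           COUNT(*) FILTER (WHERE total_amount > 100) * 1.0 / COUNT(*) AS share_over_100
    FROM   sales_receipts
    GROUP  BY date_trunc('week', receipt_date)
    ORDER  BY sales_week;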

Types of DBMS: Hierarchical Databases

There are four structural types of database management systems: hierarchical, network, relational, and object-oriented.

Hierarchical databases, commonly used on mainframe computers, have been around for a long time. This is one of the oldest methods of organizing and storing data, and it is still used by some organizations for making travel reservations. A hierarchical database is organized in pyramid fashion, like the branches of a tree extending downwards. Related fields or records are grouped together so that there are higher-level records and lower-level records, just like the parents in a family tree sit above the subordinated children. Based on this analogy, the parent record at the top of the pyramid is called the root record. A child record always has only one parent record to which it is linked, just like in a normal family tree. In contrast, a parent record may have more than one child record linked to it. Hierarchical databases work by moving from the top down. A record search is conducted by starting at the top of the pyramid and working down through the tree from parent to child until the appropriate child record is found. Furthermore, each child can also be a parent with children underneath it. The advantage of hierarchical databases is that they can be accessed and updated rapidly because the tree-like structure and the relationships between records are defined in advance. However, this feature is a two-edged sword. The disadvantage of this type of database structure is that each child in the tree may have only one parent, and relationships or linkages between children are not permitted, even if they make sense from a logical standpoint. Hierarchical databases are so rigid in their design that adding a new field or record requires that the entire database be redefined.
Types of DBMS: Network Databases

Network databases are similar to hierarchical databases in that they also have a hierarchical structure. There are a few key differences, however. Instead of looking like an upside-down tree, a network database looks more like a cobweb or interconnected network of records. In network databases, children are called members and parents are called owners. The most important difference is that each child or member can have more than one parent (or owner). Like hierarchical databases, network databases are principally used on mainframe computers. Since more connections can be made between different types of data, network databases are considered more flexible. However, two limitations must be considered when using this kind of database. Similar to hierarchical databases, network databases must be defined in advance. There is also a limit to the number of connections that can be made between records.
Types of DBMS: Relational Databases

In relational databases, the relationship between data files is relational, not hierarchical. Hierarchical and network databases require the user to pass down through a hierarchy in order to access needed data. Relational databases connect data in different files by using common data elements or a key field. Data in relational databases is stored in different tables, each having a key field that uniquely identifies each row. Relational databases are more flexible than either the hierarchical or network database structures. In relational databases, tables or files filled with data are called relations, a tuple designates a row or record, and columns are referred to as attributes or fields. Relational databases work on the principle that each table has a key field that uniquely identifies each row, and that these key fields can be used to connect one table of data to another. Thus, one table might have a row consisting of a customer account number as the key field along with address and telephone number. The customer account number in this table could be linked to another table of data that also includes customer account number (a key field), but in this case contains information about product returns, including an item number (another key field). This key field can be linked to another table that contains item numbers and other product information such as production location, color, quality control person, and other data. Therefore, using this database, customer information can be linked to specific product information. The relational database has become quite popular for two major reasons. First, relational databases can be used with little or no training. Second, database entries can be modified without redefining the entire structure. The downside of using a relational database is that searching for data can take more time than if other methods are used.
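As a minimal sketch of the customer/returns linkage described above, the statements below use hypothetical table and column names and generic SQL:

    -- Customers table: the account number is the key field
    CREATE TABLE customers (
        account_number INTEGER PRIMARY KEY,
        address        VARCHAR(200),
        telephone      VARCHAR(30)
    );

    -- Product returns carry the same key field, linking them back to customers
    CREATE TABLE product_returns (
        return_id      INTEGER PRIMARY KEY,
        account_number INTEGER REFERENCES customers (account_number),
        item_number    INTEGER
    );

    -- Customer information linked to specific return information via the key field
    SELECT c.account_number, c.telephone, r.item_number
    FROM   customers c
    JOIN   product_returns r ON r.account_number = c.account_number;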
Types of DBMS: Object-oriented Databases (OODBMS)

Able to handle many new data types, including graphics, photographs, audio, and video, object-oriented databases represent a significant advance over their other database cousins. Hierarchical and network databases are both designed to handle structured data; that is, data that fits nicely into fields, rows, and columns. They are useful for handling small snippets of information such as names, addresses, zip codes, product numbers, and any kind of statistic or number you can think of. On the other hand, an object-oriented database can be used to store data from a variety of media sources, such as photographs and text, and produce work, as output, in a multimedia format. Object-oriented databases use small, reusable chunks of software called objects. The objects themselves are stored in the object-oriented database. Each object consists of two elements: 1) a piece of data (e.g., sound, video, text, or graphics), and 2) the instructions, or software programs called methods, for what to do with the data. The second part of this definition requires a little more explanation: the instructions contained within the object are used to do something with the data in the object. For example, test scores would be stored within the object, as would the instructions for calculating the average test score.

Object-oriented databases have two disadvantages. First, they are more costly to develop. Second, most organizations are reluctant to abandon or convert from the databases that they have already invested money in developing and implementing. However, the benefits of object-oriented databases are compelling. The ability to mix and match reusable objects provides incredible multimedia capability. Healthcare organizations, for example, can store, track, and recall CAT scans, X-rays, electrocardiograms and many other forms of crucial data.
Knowledge Check

Which of the following is a database management system (DBMS) that works on the principle that each table has a key field that uniquely identifies each row, and that these key fields can be used to connect one table of data to another?

A. Hierarchical databases
B. Network databases
C. Relational databases
D. Object-oriented databases

The first DBMS, developed by Charles Bachman, was based on the network model. The network model is a database model conceived as a flexible way of representing objects and their relationships. Its distinguishing feature is that the schema, viewed as a graph in which object types are nodes and relationship types are arcs, is not restricted to being a hierarchy or lattice.

In the late 1960s, IBM developed IMS, which uses a hierarchical database model: a data model in which the data is organized into a tree-like structure. The structure allows representing information using parent/child relationships: each parent can have many children, but each child has only one parent (also known as a 1-to-many relationship). All attributes of a specific record are listed under an entity type.

Example of a hierarchical model

Referential Integrity Rules

A referential integrity rule is a rule defined on a key (a column or set of columns) in one table that guarantees that the values in that key match the values in a key in a related table (the referenced value). Referential integrity also includes the rules that dictate what types of data manipulation are allowed on referenced values and how these actions affect dependent values. The rules associated with referential integrity are:

Restrict: Disallows the update or deletion of referenced data.
Set to Null: When referenced data is updated or deleted, all associated dependent data is set to NULL.
Set to Default: When referenced data is updated or deleted, all associated dependent data is set to a default value.
Cascade: When referenced data is updated, all associated dependent data is correspondingly updated. When a referenced row is deleted, all associated dependent rows are deleted.
No Action: Disallows the update or deletion of referenced data. This differs from RESTRICT in that it is checked at the end of the statement, or at the end of the transaction if the constraint is deferred. (Oracle uses No Action as its default action.)
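The sketch below shows how these rules are typically declared as referential actions on a foreign key. The table and column names are illustrative, and the exact set of supported actions varies by DBMS.

    CREATE TABLE departments (
        dept_id   INTEGER PRIMARY KEY,
        dept_name VARCHAR(50)
    );

    CREATE TABLE employees (
        emp_id   INTEGER PRIMARY KEY,
        emp_name VARCHAR(50),
        dept_id  INTEGER,
        -- Cascade: deleting or renumbering a department propagates to its employees
        CONSTRAINT fk_emp_dept FOREIGN KEY (dept_id)
            REFERENCES departments (dept_id)
            ON DELETE CASCADE ON UPDATE CASCADE
    );

    -- Actions that could be declared instead of CASCADE:
    --   ON DELETE RESTRICT     disallow deleting a referenced department
    --   ON DELETE SET NULL     dependent rows get dept_id = NULL
    --   ON DELETE SET DEFAULT  dependent rows get the column's default value
    --   ON DELETE NO ACTION    like RESTRICT, but checked at the end of the statement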

Data administration or data resource management is an organizational function, working in the areas of information systems and computer science, that plans, organizes, describes and controls data resources. Data resources are usually stored in databases under a database management system or in other software such as electronic spreadsheets. In many smaller organizations, data administration is performed only occasionally, or is a small component of the database administrator's work.

Programming a Standalone Application

A standalone application is an application that does not call database objects, such as stored procedures, when it executes. When you write the application, you must ensure that certain SQL statements appear at the beginning and end of the program to handle the transition from the host language to the embedded SQL statements.

-> A data model is a collection of concepts for describing data, its relationships, and its constraints. It provides a clearer and more accurate description and representation of data.

The purpose of the relational model is to provide a declarative method for specifying data and queries: users directly state what information the database contains and what information they want from it, and let the database management system software take care of describing data structures for storing the data and retrieval procedures for answering queries.
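For example, a query in the relational model states only which rows are wanted, not how they are to be fetched; the table and column names here are purely illustrative:

    -- "Which customers placed orders over 100?" -- no access paths or loops are specified
    SELECT c.customer_name, o.order_total
    FROM   customers c
    JOIN   orders o ON o.customer_id = c.customer_id
    WHERE  o.order_total > 100;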

Data independence is the separation of data from the many different programs that use it. It is therefore important to distinguish between the data and the programs that operate on that data.

The components of the storage manager include:
Authorization and integrity manager, which tests for the satisfaction of integrity constraints and checks the authority of users to access data.
Transaction manager, which ensures that the database remains in a consistent (correct) state despite system failures, and that concurrent transaction executions proceed without conflicting.
File manager, which manages the allocation of space on disk storage and the data structures used to represent information stored on disk.
Buffer manager, which is responsible for fetching data from disk storage into main memory, and deciding what data to cache in main memory. The buffer manager is a critical part of the database system, since it enables the database to handle data sizes that are much larger than the size of main memory.

The storage manager implements several data structures as part of the physical system implementation:
Data files, which store the database itself.
Data dictionary, which stores metadata about the structure of the database, in particular the schema of the database.
Indices, which provide fast access to data items that hold particular values.


Physical Independence: The logical scheme stays unchanged even though the storage space or type of some data is changed for reasons of optimisation or reorganisation. [The ability to change the physical schema without changing the logical schema is called Physical Data Independence.]

Logical Independence: The external scheme may stay unchanged for most changes of the logical scheme. This is especially desirable as the application software does not need to be modified or newly translated. [The ability to change the logical schema without changing the external schema or application programs is called Logical Data Independence.]



Introduction to Data Integrity

It is important that data adhere to a predefined set of rules, as determined by the database administrator or application developer. As an example of data integrity, consider the tables employees and departments and the business rules for the information in each of the tables, as illustrated in Figure 21-1.

Figure 21-1 Examples of Data Integrity

Description of "Figure 21-1 Examples of Data Integrity"


Definition: A superkey is a combination of attributes that can be used to uniquely identify a database record. A table might have many superkeys. Candidate keys are a special subset of superkeys that do not contain any extraneous information.

Examples: Imagine a table with the fields <Name>, <Age>, <SSN> and <Phone Extension>. This table has many possible superkeys. Three of these are <SSN>, <Phone Extension, Name> and <SSN, Name>. Of those listed, only <SSN> is a candidate key, as the others contain information not necessary to uniquely identify records.

Database design steps:
1) Requirement analysis
2) Conceptual database design
3) Logical database design
4) Schema refinement
5) Physical database design
6) Security design
Physical Database Design
Physical database design translates the logical data model into a set of SQL statements that define the database. For relational database systems, it is relatively easy to translate from a logical data model into a physical database. Rules for translation:

Entities become tables in the physical database.
Attributes become columns in the physical database. Choose an appropriate data type for each of the columns.
Unique identifiers become columns that are not allowed to have NULL values. These are referred to as primary keys in the physical database. Consider creating a unique index on the identifiers to enforce uniqueness.
Relationships are modeled as foreign keys.
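As a sketch of these translation rules under an assumed logical model with a Customer entity, an Order entity, and a places relationship (generic SQL):

    -- The Customer entity becomes a table; its attributes become columns
    CREATE TABLE customer (
        customer_id INTEGER      NOT NULL,   -- unique identifier: NOT NULL, primary key
        full_name   VARCHAR(100),            -- an appropriate data type chosen per column
        email       VARCHAR(100),
        CONSTRAINT pk_customer PRIMARY KEY (customer_id)
    );

    -- A unique index enforces uniqueness of an alternate identifier
    CREATE UNIQUE INDEX ux_customer_email ON customer (email);

    -- The Customer-places-Order relationship is modeled as a foreign key
    CREATE TABLE customer_order (
        order_id    INTEGER NOT NULL,
        customer_id INTEGER NOT NULL,
        order_date  DATE,
        CONSTRAINT pk_order PRIMARY KEY (order_id),
        CONSTRAINT fk_order_customer FOREIGN KEY (customer_id)
            REFERENCES customer (customer_id)
    );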

A relational database schema is the set of tables, columns and relationships that make up a relational database. A schema is often depicted visually in modeling software such as ERwin or drawing software such as Visio.

The Purpose of a Schema

A relational database schema helps you to organize and understand the structure of a database. This is particularly useful when designing a new database, modifying an existing database to support more functionality, or building integration between databases.

Unlike ordinary tables (base tables) in a relational database, a view does not form part of the physical schema: it is a dynamic, virtual table computed or collated from data in the database. Just as functions (in programming) can provide abstraction, so database users can create abstraction by using views. In another parallel with functions, database users can manipulate nested views, so one view can aggregate data from other views.
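As a minimal sketch (reusing the hypothetical customer table from the physical design example above), one view can be defined over a base table and a second view defined over the first:

    -- A virtual table computed from a base table
    CREATE VIEW customers_with_email AS
        SELECT customer_id, full_name
        FROM   customer
        WHERE  email IS NOT NULL;

    -- One view aggregating data from another view
    CREATE VIEW customers_with_email_count AS
        SELECT COUNT(*) AS email_customer_total
        FROM   customers_with_email;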

Collation is information about how strings should be sorted and compared. It covers, for example, case sensitivity (e.g. whether a = A), the treatment of accented and other special characters, and the ordering of characters.
A database trigger is procedural code that is automatically executed in response to certain events on a particular table or view in a database. The trigger is mostly used for maintaining the integrity of the information in the database.

Triggers are commonly used to:

audit changes (e.g. keep a log of the users and roles involved in changes)
enhance changes (e.g. ensure that every change to a record is time-stamped by the server's clock)
enforce business rules (e.g. require that every invoice have at least one line item)

Triggers in Oracle

In addition to triggers that fire when data is modified, Oracle 9i supports triggers that fire when schema objects (that is, tables) are modified and when user logon or logoff events occur. These trigger types are referred to as "Schema-level triggers".

Schema-level triggers

After Creation
Before Alter
After Alter
Before Drop
After Drop
Before Logoff
After Logon

The four main types of triggers in Oracle 9i are:

1. Row Level Trigger: This gets executed before or after any column value of a row changes.
2. Column Level Trigger: This gets executed before or after the specified column changes.
3. For Each Row Type: This trigger gets executed once for each row of the result set caused by an insert/update/delete.
4. For Each Statement Type: This trigger gets executed only once for the entire result set, but fires each time the statement is executed.

There are four types of database triggers:
1. Table-level triggers can initiate activity before or after an INSERT, UPDATE, or DELETE event.
2. View-level triggers define what can be done to the view.
3. Database-level triggers can be activated at startup and shutdown of a database.
4. Session-level triggers can be used to store specific information.
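As a sketch of the time-stamping use mentioned earlier, the Oracle-style PL/SQL trigger below fires at row level before each insert or update; the table and its last_modified column are hypothetical.

    CREATE OR REPLACE TRIGGER trg_order_stamp
        BEFORE INSERT OR UPDATE ON customer_order   -- fires when data is modified
        FOR EACH ROW                                 -- row-level: runs once per affected row
    BEGIN
        :NEW.last_modified := SYSDATE;               -- stamp the change with the server's clock
    END;
    /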

A Relational Database Overview

A database is a means of storing information in such a way that information can be retrieved from it. In simplest terms, a relational database is one that presents information in tables with rows and columns. A table is referred to as a relation in the sense that it is a collection of objects of the same type (rows). Data in a table can be related according to common keys or concepts, and the ability to retrieve related data from a table is the basis for the term relational database. A Database Management System (DBMS) handles the way data is stored, maintained, and retrieved. In the case of a relational database, a Relational Database Management System (RDBMS) performs these tasks. DBMS as used in this book is a general term that includes RDBMS.

An application programming interface (API) is a particular set of rules ('code') and specifications that software programs can follow to communicate with each other. It serves as an interface between different software programs and facilitates their interaction, similar to the way the user interface facilitates interaction between humans and computers.

An API can be created for applications, libraries, and operating systems.

The method Class.forName(String) is used to load the JDBC driver class. The line below causes the JDBC driver from some JDBC vendor to be loaded into the application. (Some JVMs also require the class to be instantiated with .newInstance().)

Class.forName( "com.somejdbcvendor.TheirJdbcDriver" );

When a Driver class is loaded, it creates an instance of itself and registers it with the DriverManager. This can be done by including the needed code in the driver class's static block, e.g. DriverManager.registerDriver(Driver driver).

Now when a connection is needed, one of the DriverManager.getConnection() methods is used to create a JDBC connection.

Connection conn = DriverManager.getConnection(
    "jdbc:somejdbcvendor:other data needed by some jdbc vendor",
    "myLogin",
    "myPassword" );

try {
    /* you use the connection here */
} finally {
    // It's important to close the connection when you are done with it
    try {
        conn.close();
    } catch (Throwable ignore) {
        /* Propagate the original exception instead of this one that you may want just logged */
    }
}

A JDBC driver is a software component enabling a Java application to interact with a database.[1] JDBC drivers are analogous to ODBC drivers, ADO.NET data providers, and OLE DB providers.

To connect with individual databases, JDBC (the Java Database Connectivity API) requires drivers for each database. The JDBC driver gives out the connection to the database and implements the protocol for transferring the query and result between client and database.

A stored procedure is a subroutine available to applications accessing a relational database system. Stored procedures (sometimes called a proc, sproc, StoPro, StoredProc, or SP) are actually stored in the database data dictionary.

A stored procedure is a group of SQL statements that form a logical unit and perform a particular task, and they are used to encapsulate a set of operations or queries to execute on a database server. For example, operations on an employee database (hire, fire, promote, lookup) could be coded as stored procedures executed by application code. Stored procedures can be compiled and executed with different parameters and results, and they can have any combination of input, output, and input/output parameters.
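A minimal sketch of one such operation as an Oracle-style PL/SQL stored procedure with input and output parameters follows; the employees table and its columns are hypothetical.

    -- Encapsulates the "promote" operation behind a single callable unit
    CREATE OR REPLACE PROCEDURE promote_employee (
        p_emp_id  IN  INTEGER,     -- input parameter
        p_raise   IN  NUMBER,      -- input parameter
        p_new_pay OUT NUMBER       -- output parameter returned to the caller
    ) AS
    BEGIN
        UPDATE employees
           SET salary = salary + p_raise
         WHERE emp_id = p_emp_id;

        SELECT salary
          INTO p_new_pay
          FROM employees
         WHERE emp_id = p_emp_id;
    END;
    /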

Synopsis
SAVEPOINT savepoint_name

Description
SAVEPOINT establishes a new savepoint within the current transaction.

A savepoint is a special mark inside a transaction that allows all commands that are executed after it was established to be rolled back, restoring the transaction state to what it was at the time of the savepoint.

Parameters
savepoint_name

The name to give to the new savepoint. Use ROLLBACK TO SAVEPOINT to roll back to a savepoint. Use RELEASE SAVEPOINT to destroy a savepoint, keeping the effects of commands executed after it was established. Savepoints can only be established inside a transaction block. There can be multiple savepoints defined within a transaction.

Examples
To establish a savepoint and later undo the effects of all commands executed after it was established:
BEGIN;
INSERT INTO table1 VALUES (1);
SAVEPOINT my_savepoint;
INSERT INTO table1 VALUES (2);
ROLLBACK TO SAVEPOINT my_savepoint;
INSERT INTO table1 VALUES (3);
COMMIT;

The above transaction will insert the values 1 and 3, but not 2.

To establish and later destroy a savepoint:

BEGIN;
INSERT INTO table1 VALUES (3);
SAVEPOINT my_savepoint;
INSERT INTO table1 VALUES (4);
RELEASE SAVEPOINT my_savepoint;

COMMIT;

The above transaction will insert both 3 and 4.

Update Log Record notes an update (change) to the database. It includes this extra information:

PageID: A reference to the Page ID of the modified page.
Length and Offset: Length in bytes and offset of the page are usually included.
Before and After Images: Includes the value of the bytes of the page before and after the page change. Some databases may have logs which include one or both images.

Compensation Log Record notes the rollback of a particular change to the database. Each corresponds to exactly one Update Log Record (although the corresponding update log record is not typically stored in the Compensation Log Record). It includes this extra information:

undoNextLSN: This field contains the LSN of the next log record that is to be undone for the transaction that wrote the last Update Log Record.

Commit Record notes a decision to commit a transaction.

Transaction Table: The table contains one entry for each active transaction. This includes Transaction ID and lastLSN, where lastLSN describes the LSN of the most recent log record for the transaction.

Dirty Page Table: The table contains one entry for each dirty page that hasn't been written to disk. The entry contains recLSN, where recLSN is the LSN of the first log record that caused the page to be dirty.

In database systems, isolation is a property that defines how and when the changes made by one operation become visible to other concurrent operations. Isolation is one of the ACID (Atomicity, Consistency, Isolation, Durability) properties. Concurrency is a property of systems in which several computations are executing simultaneously, and potentially interacting with each other.

Multiversion concurrency control (abbreviated MCC or MVCC), in the database field of computer science, is a concurrency control method commonly used by database management systems to provide concurrent access to the database and in programming languages to implement transactional memory.[1]

For instance, a database will implement updates not by deleting an old piece of data and overwriting it with a new one, but instead by marking the old data as obsolete and adding the newer "version." Thus there are multiple versions stored, but only one is the latest. This allows the database to avoid the overhead of filling in holes in memory or disk structures, but generally requires the system to periodically sweep through and delete the old, obsolete data objects. MVCC provides each user connected to the database with a "snapshot" of the database for that person to work with. Any changes made will not be seen by other users of the database until the transaction has been committed.

SERIALIZABLE
This is the highest isolation level. It specifies that all transactions occur in a completely isolated fashion, or, in other words, as if all transactions in the system had executed serially, one after the other. The DBMS may execute two or more transactions at the same time only if the illusion of serial execution can be maintained.

Concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.

With a lock-based concurrency control DBMS implementation, serializability requires read and write locks (acquired on selected data) to be released at the end of the transaction. Also, range-locks must be acquired when a SELECT query uses a ranged WHERE clause, especially to avoid the phantom reads phenomenon (see below).

When using non-lock based concurrency control, no locks are acquired; however, if the system detects a write collision among several concurrent transactions, only one of them is allowed to commit. See snapshot isolation for more details on this topic.

REPEATABLE READS
In this isolation level, a lock-based concurrency control DBMS implementation keeps read and write locks (acquired on selected data) until the end of the transaction. However, range-locks are not managed, so the phantom reads phenomenon can occur (see below).

READ COMMITTED

In this isolation level, a lock-based concurrency control DBMS implementation keeps write locks (acquired on selected data) until the end of the transaction, but read locks are released as soon as the SELECT operation is performed (so the non-repeatable reads phenomenon can occur in this isolation level, as discussed below). As in the previous level, range-locks are not managed.

READ UNCOMMITTED

This is the lowest isolation level. In this level, dirty reads are allowed (see below), so one transaction may see not-yet-committed changes made by other transactions.
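As a small sketch of how an application selects one of these levels, the statements below use PostgreSQL-style syntax and the hypothetical sales_receipts table from earlier; defaults and exact behaviour (e.g. whether READ UNCOMMITTED is silently upgraded) differ between systems.

    -- Strongest level: transactions behave as if executed one after another
    BEGIN;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    SELECT SUM(total_amount) FROM sales_receipts;   -- repeatable reads, no phantoms
    COMMIT;

    -- Weakest level: dirty reads of uncommitted changes may be visible
    BEGIN;
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
    SELECT total_amount FROM sales_receipts WHERE receipt_id = 42;
    COMMIT;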

Interleaving is a way to arrange data in a non-contiguous way to increase performance.

It is typically used in error-correction coding, particularly within data transmission, disk storage, and computer memory.

A serial schedule is sequential, with no overlap in time: the actions of the transactions are executed one after another, and the transactions are not interleaved in time.

In databases and transaction processing, a schedule (also called a history) of a system is an abstract model used to describe the execution of transactions running in the system. Often it is a list of operations (actions) ordered by time, performed by a set of transactions that are executed together in the system.

Equivalent schedules: For any database state, the effect (on the set of objects in the database) of executing the first schedule is identical to the effect of executing the second schedule.

Serializable schedule: A schedule that is equivalent to some serial execution of the transactions. If the dependency graph of a schedule is acyclic, the schedule is called conflict serializable. Such a schedule is equivalent to a serial schedule.

Two schedules are conflict equivalent if: they have the same sets of actions, and each pair of conflicting actions is ordered in the same way.

Recoverable
Transactions commit only after all transactions whose changes they read, commit.

These schedules are recoverable. F is recoverable because T1 commits before T2, which makes the value read by T2 correct; then T2 can commit itself. In F2, if T1 aborted, T2 has to abort because the value of A it read is incorrect. In both cases, the database is left in a consistent state.

Unrecoverable

If a transaction T1 aborts, and a transaction T2 commits, but T2 relied on T1, we have an unrecoverable schedule.

In this example, G is unrecoverable, because T2 read the value of A written by T1, and committed. T1 later aborted, therefore the value read by T2 is wrong, but since T2 committed, this schedule is unrecoverable.

A schedule is conflict serializable if it is conflict equivalent to a serial schedule. Note: Some serializable schedules are not conflict serializable!

A precedence graph is used to capture all the conflicts between transactions.
