
CENG 553_2 Presentation mer Burak KARAMER 1891316

A database that stores XML documents. An XML database is a system that allows data to be stored in XML format. These data can then be queried, exported and serialized into the desired format.

The term Relational Database Management System (RDBMS) was coined by E.F. Codd in the early 1970s.
An RDBMS is a table structure with tuples and attributes.
XML was developed in the late 1990s with the advent of the Web.
XML is a rework of SGML to answer some of the problems the Web faced as it was growing.
XML is a tree structure with nodes and branches.

Why do we use XML documents?


Need for a common data format.
Transferring data between partners in business-to-business or web-based applications.
It is suitable for structured, unstructured and semi-structured data (XML documents can be data-centric or document-centric).

Relational databases are data-centric.
XML is document-centric (though it can also be data-centric).
This mismatch is a challenge when converting XML into a relational database.
XML databases are a solution.

Data-centric documents are those containing structured data. The data appear in a regular order. In general, relational databases are efficient enough at storing the data contained in data-centric XML documents.
<Memo>
  <Meeting date="23/09/2005" time="10:30AM">Finance Committee</Meeting>
  <Purpose>Discuss 2006 Budget</Purpose>
  <Location>Room 923</Location>
</Memo>

Document-centric XML documents are those characterized by irregular structure and mixed content.
<Memo>
  Please can Finance Committee members come to <Location>Room 923</Location>
  on <MeetingDate>23/09/2005</MeetingDate> at <MeetingTime>10:30 AM</MeetingTime>
  to <Purpose>discuss the 2006 budget</Purpose>
</Memo>
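One way to see the difference between the two memos is to check for mixed content, that is, elements that hold both text and child elements. The sketch below is only an illustrative heuristic using Python's standard xml.etree.ElementTree module; the function name has_mixed_content is an assumption made for this example, and mixed content is just one indicator of a document-centric document, not a full definition.

```python
import xml.etree.ElementTree as ET

DATA_CENTRIC = """<Memo>
  <Meeting date="23/09/2005" time="10:30AM">Finance Committee</Meeting>
  <Purpose>Discuss 2006 Budget</Purpose>
  <Location>Room 923</Location>
</Memo>"""

DOCUMENT_CENTRIC = """<Memo>Please can Finance Committee members come to
<Location>Room 923</Location> on <MeetingDate>23/09/2005</MeetingDate> at
<MeetingTime>10:30 AM</MeetingTime> to
<Purpose>discuss the 2006 budget</Purpose></Memo>"""


def has_mixed_content(element):
    """Return True if any element holds both child elements and non-blank text."""
    for node in element.iter():
        children = list(node)
        if not children:
            continue
        # Text before the first child, plus the text following each child (its tail).
        pieces = [node.text] + [child.tail for child in children]
        if any(piece and piece.strip() for piece in pieces):
            return True
    return False


data_result = has_mixed_content(ET.fromstring(DATA_CENTRIC))
document_result = has_mixed_content(ET.fromstring(DOCUMENT_CENTRIC))
print(data_result)      # False
print(document_result)  # True
```

The data-centric memo only has whitespace between its child elements, so it is reported as having no mixed content, while the document-centric memo interleaves running text with elements.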

Unstructured data:
data can be of any type
does not necessarily follow any format or sequence
is not predictable
examples include text, video, sound, etc.

Structured data:
data is organized in semantic chunks (entities)
similar entities are grouped together (relations or classes)
entities in the same group have the same descriptions (attributes)
descriptions for all entities in a group (schema) have the same defined format, have a predefined length, are all present, and follow the same order

In the semi-structured model there is no separation between the data and the schema, and the amount of structure used depends on the purpose.

Semi-structured data:
organized in semantic entities
similar entities are grouped together
entities in the same group may not have the same attributes
the order of attributes is not necessarily important
not all attributes may be required
the size of the same attribute may differ within a group
the type of the same attribute may differ within a group

name: Peter Wood
email: ptw@dcs.bbk.ac.uk, p.wood@bbk.ac.uk

name:
  first name: Mark
  last name: Levene
email: mark@dcs.bbk.ac.uk

name: Alex Poulovassilis
affiliation: Birkbeck
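The three records above can be written down as plain Python dictionaries to make the semi-structured properties visible. This is only a sketch: the variable names and the helper attribute_usage are assumptions for illustration, and the grouping of the fields into three records is a reading of the example.

```python
# Three entities of the same group with different attributes and value types.
people = [
    {"name": "Peter Wood",
     "email": ["ptw@dcs.bbk.ac.uk", "p.wood@bbk.ac.uk"]},
    {"name": {"first name": "Mark", "last name": "Levene"},
     "email": "mark@dcs.bbk.ac.uk"},
    {"name": "Alex Poulovassilis", "affiliation": "Birkbeck"},
]


def attribute_usage(records):
    """Map each attribute name to the set of value types seen for it."""
    usage = {}
    for record in records:
        for key, value in record.items():
            usage.setdefault(key, set()).add(type(value).__name__)
    return usage


usage = attribute_usage(people)
for key in sorted(usage):
    print(key, sorted(usage[key]))
```

The name attribute is a string in two records and a nested structure in one, email is single-valued or multi-valued, and affiliation appears in only one record. None of this would fit a fixed relational schema without extra design work.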

There are three different types of XML databases:


1. XML Enabled Database (XEDB)
2. Native XML Database (NXD)
3. Hybrid XML Database (HXD)

XML Enabled Database (XEDB)
Actually a non-XML database.
There is a layer between the XML documents and the database.
The layer translates the data between XML documents and tables (for an RDBMS, for example).
It needs to support querying, updating and storing XML data.
SQL/XML or XQuery can be used to retrieve and modify the data.

Mapping is used to translate data between an XML document and a relational database.
1. table-based mapping:
The XML document has the same structure as a relational database. The data is grouped into rows, and rows are grouped into tables.

2. object-relational mapping:
The XML document is viewed as a set of serialized objects. Objects are mapped to tables, properties are mapped to columns, and inter-object relationships are mapped to primary key / foreign key relationships.

Mostly used when the stored data is well structured and can easily be stored in a relational database. The XML layer is then needed only to translate the data between XML documents and the tables in the database.
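The table-based mapping described above can be sketched with Python's standard sqlite3 and xml.etree.ElementTree modules. Everything here is an assumption for illustration: the airports table, the row, airport_id and airport_code element names, and the sample values are made up, and a real XML-enabled database performs this translation inside its own mapping layer.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical data-centric document: one <row> element per table row.
XML_DOC = """<airports>
  <row><airport_id>123</airport_id><airport_code>DAL</airport_code></row>
  <row><airport_id>678</airport_id><airport_code>RDU</airport_code></row>
</airports>"""


def load_table(conn, xml_text):
    """XML to table: each <row> becomes a row, each child element a column."""
    root = ET.fromstring(xml_text)
    conn.execute(
        "CREATE TABLE airports (airport_id INTEGER PRIMARY KEY, airport_code TEXT)"
    )
    rows = [(int(r.findtext("airport_id")), r.findtext("airport_code"))
            for r in root.findall("row")]
    conn.executemany("INSERT INTO airports VALUES (?, ?)", rows)


def dump_table(conn):
    """Table to XML: rebuild the document from the stored rows."""
    root = ET.Element("airports")
    query = "SELECT airport_id, airport_code FROM airports ORDER BY airport_id"
    for airport_id, code in conn.execute(query):
        row = ET.SubElement(root, "row")
        ET.SubElement(row, "airport_id").text = str(airport_id)
        ET.SubElement(row, "airport_code").text = code
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
load_table(conn, XML_DOC)
print(dump_table(conn))
```

Once loaded, the data can be queried with ordinary SQL, which is the main appeal of this approach for well-structured documents.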

A native XML database (NXD):
Defines a (logical) model for an XML document, as opposed to the data in that document, and stores and retrieves documents according to that model. At a minimum, the model must include elements, attributes, PCDATA, and document order.
Has an XML document as its fundamental unit of (logical) storage, just as a relational database has a row in a table as its fundamental unit of (logical) storage.
Is not required to have any particular underlying physical storage model. For example, it can be built on a relational, hierarchical, or object-oriented database, or use a proprietary storage format such as indexed, compressed files.

Relational database structure:

Airport ID: 123, 456, 678, 890
Airport Code: DAL, CHT, RDU, JFK
Airline ID: 12345, 98765
Airline Name: American Airlines, Delta Airlines
Airplane: Boeing, Airbus

Native XML database structure:

Element        Node type
Flights        Root
Flight         First node
Airline        Second node
Airplane       Second node
Dest-Airport   Second node

Attribute names: AirlineID, CityID

Native XML databases (NXDs) are not meant to replace existing databases or XML; they are intended to provide storage and manipulation of XML documents.
XML provides many characteristics of relational databases: storage in the form of XML documents, schemas in the form of DTDs and XML Schema, query languages such as XQuery and XPath, and APIs such as DOM and JDOM.
XML lacks many of the other characteristics of a DBMS, such as indexing, transactions, data integrity, triggers, normalization and updates.

Storage, Querying, Updates, Indexing, Normalization, APIs

Storage
NXDs store data in the form of XML documents.
This is useful for semi-structured data, since storing semi-structured data in a relational database is difficult.
Advantage: retrieval is faster, as no joins are needed when retrieving a document.
Disadvantage: it is difficult to retrieve a different view of the data.
Example: retrieving a particular flight instance is fast, but retrieving a list of all airline companies whose flights fly to RDU airport is difficult.

Querying
Most NXDs support XPath and XQuery for querying.
XPath is the most commonly used query language for NXDs, but it lacks functionality such as grouping, sorting and cross-document joins.
XQuery overcomes these shortcomings.

Updates
XUpdate is used for updating native XML databases.
It uses XPath to identify a set of nodes and then specifies whether to update or delete these nodes, or to insert new nodes before or after them.
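The select-then-modify idea behind such updates can be sketched in Python with the standard xml.etree.ElementTree module. This is not XUpdate syntax and not the API of any NXD product; the helper names remove_matching and insert_after and the sample flight data are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Made-up sample document for illustration.
doc = ET.fromstring(
    "<Flights>"
    "<Flight><Airline>American Airlines</Airline><Airplane>Boeing</Airplane></Flight>"
    "<Flight><Airline>Delta Airlines</Airline><Airplane>Airbus</Airplane></Flight>"
    "</Flights>"
)


def remove_matching(root, parent_path, child_tag):
    """Delete every child_tag element under the parents selected by parent_path."""
    removed = 0
    for parent in root.findall(parent_path):
        for child in parent.findall(child_tag):
            parent.remove(child)
            removed += 1
    return removed


def insert_after(root, parent_path, anchor_tag, new_tag, text):
    """Insert a new element right after each anchor_tag child of the selected parents."""
    for parent in root.findall(parent_path):
        # Walk backwards so earlier insertions do not shift later positions.
        for index in reversed(range(len(parent))):
            if parent[index].tag == anchor_tag:
                new = ET.Element(new_tag)
                new.text = text
                parent.insert(index + 1, new)


removed = remove_matching(doc, "Flight", "Airplane")
insert_after(doc, "Flight", "Airline", "Dest-Airport", "RDU")
print(removed)
print(ET.tostring(doc, encoding="unicode"))
```

In both helpers a path expression first selects the target nodes, and the modification is then applied relative to each selected node, which mirrors the two-step description above.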

Three types of indexes:

Value indexes
They index text and attribute values.
Used to resolve queries such as "Find all elements or attributes whose value is American Airlines".

Structural indexes
They index the location of elements and attributes.
Used to resolve queries such as "Find all Airline elements".
Value and structural indexes are combined to resolve queries such as "Find all Airline elements whose value is American Airlines".

Full-text indexes
They index individual tokens in text and attribute values.
Used to resolve queries such as "Find all documents that contain the words American Airlines".
Used with structural indexes for queries such as "Find all documents that contain the words American Airlines inside an Airline element".
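The three index types can be sketched as plain Python dictionaries built in one pass over a document. This is a toy illustration under stated assumptions: the sample document is made up, attribute values are left out for brevity, and real NXDs use persistent, far more compact index structures.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Made-up sample document for illustration.
doc = ET.fromstring(
    "<Flights>"
    "<Flight><Airline>American Airlines</Airline><Dest-Airport>RDU</Dest-Airport></Flight>"
    "<Flight><Airline>Delta Airlines</Airline><Dest-Airport>JFK</Dest-Airport></Flight>"
    "</Flights>"
)

value_index = defaultdict(list)       # exact text value -> elements
structural_index = defaultdict(list)  # element name -> elements
fulltext_index = defaultdict(list)    # lowercase token -> elements

for element in doc.iter():
    structural_index[element.tag].append(element)
    text = (element.text or "").strip()
    if text:
        value_index[text].append(element)
        for token in text.lower().split():
            fulltext_index[token].append(element)

# Value + structural: all Airline elements whose value is "American Airlines".
airline_ids = {id(e) for e in structural_index["Airline"]}
hits = [e for e in value_index["American Airlines"] if id(e) in airline_ids]

# Full-text + structural: Airline elements containing both words.
american = {id(e) for e in fulltext_index["american"]}
both_words = [e for e in fulltext_index["airlines"]
              if id(e) in american and id(e) in airline_ids]

print(len(hits), len(both_words))
```

Each query is answered by looking up one or two index entries and intersecting them, without scanning the document again.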

As with relational databases, normalization can also be applied to NXDs, but XML supports multi-valued properties. Thus NXD data can be normalized even when it has multi-valued attributes. The first normal form (1NF) of relational databases is meaningless in the context of NXDs, so normalization is a non-issue for many NXDs.

Native XML databases support APIs. These generally offer an ODBC-like interface, with methods for connecting to the database and retrieving results. Results are returned as an XML string, a DOM tree or an XML reader. Two commonly used APIs are:

XML:DB API from XML:DB.org
The API is intended to be implementable in multiple languages, though it does assume the implementation language is object oriented.
Uses XPath as its query language and is being extended to support XQuery.

JSR 225: XQuery API for Java (XQJ)
It is based on JDBC.
Uses XQuery as its query language.
Developed through Sun's Java Community Process (JCP).

The need to combine the features of both native and XML-enabled databases has led to the creation of a new category of databases called hybrid XML databases. Hybrid XML databases are usually relational database products extended with native XML support. Hybrid XML databases are ideal for applications which at one point stored data in relational form, but now need to move to the XML world; performing such a data transformation within a single DBMS greatly simplifies the task.

Product                Developer         License       Type
Access 2007            Microsoft         Commercial    Relational-XML Enabled
MarkLogic              Mark Logic Corp.  Commercial    Native XML Database
eXist                  Wolfgang Meier    Open Source   Native XML Database
Berkeley DB XML        Oracle            Open Source   Native XML Database
Oracle XML DB          Oracle            Commercial    XML support since version 9i (Hybrid)
IBM DB2 XML Extender   IBM               Commercial    Hybrid XML Database

Ref: XML Database Products, Copyright 2000-2010 by Ronald Bourret. Last updated on: June 20, 2010.

EXAMPLE XML DOCUMENT


This XML document stores information about the books in a bookstore.

<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML <size>100</size> </title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>

/bookstore/book/title

<title lang="en">Everyday Italian</title>
<title lang="en">Harry Potter</title>
<title lang="en">XQuery Kick Start</title>
<title lang="en">Learning XML <size>100</size> </title>

/bookstore/book[price>30]/title

<title lang="en">XQuery Kick Start</title>
<title lang="en">Learning XML <size>100</size> </title>
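The two XPath expressions can be tried out with Python's standard xml.etree.ElementTree module, which supports only a subset of XPath. The sketch below uses an abridged copy of the bookstore document (authors and years are left out), and because ElementTree has no numeric comparison predicates, the price test is written in Python; variable names are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the bookstore document (authors and years omitted).
BOOKSTORE = """<bookstore>
  <book category="COOKING"><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book category="CHILDREN"><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book category="WEB"><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book category="WEB"><title lang="en">Learning XML <size>100</size> </title><price>39.95</price></book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)  # root is the <bookstore> element

# /bookstore/book/title, written relative to the root element.
all_titles = [t.text.strip() for t in root.findall("book/title")]

# /bookstore/book[price>30]/title, with the numeric test done in Python.
expensive = [b.find("title").text.strip()
             for b in root.findall("book")
             if float(b.findtext("price")) > 30]

print(all_titles)
print(expensive)
```

The book priced at exactly 30.00 is excluded from the second list, since the predicate is a strict greater-than comparison.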

for $x in /bookstore/book
where $x/price>30
return $x/title

<title lang="en">XQuery Kick Start</title>
<title lang="en">Learning XML <size>100</size> </title>

This gives the same result as the previous XPath expression.

<ul>
{
  for $x in /bookstore/book
  where $x/price<35
  order by $x
  return <li>{data($x/title)}</li>
}
</ul>

<ul>
  <li>Everyday Italian</li>
  <li>Harry Potter</li>
</ul>

The output is in HTML format. The title element is eliminated and only the data inside it is shown.
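For readers without an XQuery processor at hand, the for / where / order by / return steps of the second query can be mimicked in Python with the standard xml.etree.ElementTree module. This is only an approximation under stated assumptions: the document is an abridged copy of the bookstore example, and the sort key is the title text, which for this data gives the same order as the query.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the bookstore document (authors and years omitted).
BOOKSTORE = """<bookstore>
  <book category="COOKING"><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book category="CHILDREN"><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book category="WEB"><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book category="WEB"><title lang="en">Learning XML <size>100</size> </title><price>39.95</price></book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)

# for $x in /bookstore/book where $x/price < 35
cheap = [b for b in root.findall("book") if float(b.findtext("price")) < 35]

# order by (here: the title text)
cheap.sort(key=lambda b: b.findtext("title"))

# return <li>{data($x/title)}</li>, wrapped in a <ul> element
ul = ET.Element("ul")
for book in cheap:
    ET.SubElement(ul, "li").text = book.findtext("title").strip()

html = ET.tostring(ul, encoding="unicode")
print(html)  # <ul><li>Everyday Italian</li><li>Harry Potter</li></ul>
```

As in the query, only the text of each title ends up inside the list items; the title elements themselves are not copied to the output.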

When to use XML databases?

XML Databases - George Papamarkos, Lucas Zamboulis, Alexandra Poulovassilis, School of Computer Science and Information Systems, Birkbeck College, University of London (http://www.dcs.bbk.ac.uk/~sven/adm08/xmlDBs.pdf)
Native XML Databases vs. Relational Databases in Dealing with XML Documents - Gordana Pavlovic-Lazetic, Kragujevac J. Math. 30 (2007) 181-199
http://www.w3schools.com (Examples)
Keyword Search over Hybrid XML-Relational Databases - Liru Zhang, Tadashi Ohmori and Mamoru Hoshi
http://www.dcs.bbk.ac.uk/~ptw/teaching/ssd/toc.html
