
Emerging Research Directions in DBs/ISs
Outline
ƒ Mobile Databases
ƒ Multimedia Databases
ƒ Geographic Information Systems
ƒ Bioinformatics
ƒ XML
ƒ Data Mining
ƒ Data Warehousing
ƒ Introduction to ASIS Lab

Mobile Databases
ƒ Recent advances in portable and wireless technology led to
mobile computing, a new dimension in data communication
and processing.
ƒ Portable computing devices coupled with wireless
communications allow clients to access data from virtually
anywhere and at any time.
ƒ There are a number of hardware and software problems that
must be resolved before the capabilities of mobile computing
can be fully utilized.
ƒ Some of the software problems – which may involve data
management, transaction management, and database
recovery – have their origins in distributed database
systems.

Mobile Databases(2)

ƒ In mobile computing, the problems are more difficult, mainly
because of:
• The limited and intermittent connectivity afforded by
wireless communications.
• The limited life of the power supply (battery).
• The changing topology of the network.
• In addition, mobile computing introduces new
architectural possibilities and challenges.

Mobile Computing Architecture

Mobile Computing Architecture(2)

ƒ It is a distributed architecture in which a number of
computers, generally referred to as Fixed Hosts
and Base Stations, are interconnected through a
high-speed wired network.
• Fixed hosts are general purpose computers configured to
manage mobile units.
• Base stations function as gateways to the fixed network
for the Mobile Units.
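
As a rough illustration of the roles above, the following sketch models base stations as gateways that register the mobile units in their cell, attached to a wired network. All class names, the `locate` and `handoff` helpers, and the cell identifiers are hypothetical and chosen only for this example; they are not taken from any real system.

```python
from dataclasses import dataclass, field


@dataclass
class MobileUnit:
    name: str


@dataclass
class BaseStation:
    """Gateway between the mobile units in its cell and the fixed network."""
    cell_id: str
    units: dict = field(default_factory=dict)

    def register(self, unit: MobileUnit) -> None:
        self.units[unit.name] = unit

    def deregister(self, unit: MobileUnit) -> None:
        self.units.pop(unit.name, None)


@dataclass
class FixedNetwork:
    """High-speed wired network that interconnects the base stations."""
    stations: dict = field(default_factory=dict)

    def add_station(self, station: BaseStation) -> None:
        self.stations[station.cell_id] = station

    def locate(self, unit_name: str):
        """Return the id of the cell currently serving the unit, or None."""
        for station in self.stations.values():
            if unit_name in station.units:
                return station.cell_id
        return None

    def handoff(self, unit: MobileUnit, src: str, dst: str) -> None:
        """Move a unit from one cell to another (assumed behavior)."""
        self.stations[src].deregister(unit)
        self.stations[dst].register(unit)
```

Locating a unit by scanning every station is the simplest possible choice; it only shows why locating mobile units becomes an extra data management task once units move between cells.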

Data Management Issues
ƒ From a data management standpoint, mobile computing may
be considered a variation of distributed computing. Mobile
databases can be distributed under two possible scenarios:
• The entire database is distributed mainly among the wired
components, possibly with full or partial replication.
→A base station or fixed host manages its own database with a
DBMS-like functionality, with additional functionality for locating
mobile units and additional query and transaction management
features to meet the requirements of mobile environments.
• The database is distributed among wired and wireless
components.
→Data management responsibility is shared among base stations
or fixed hosts and mobile units.

Data Management Issues(2)
ƒ Data management issues as they apply to mobile
databases:
• Data distribution and replication
• Transaction models
• Query processing
• Recovery and fault tolerance
• Mobile database design
• Location-based service
• Division of labor
• Security
ƒ M-Commerce

Multimedia Databases

ƒ In the years ahead, multimedia information systems
are expected to dominate our daily lives.
• Our houses will be wired for bandwidth to handle
interactive multimedia applications.
• Our high-definition TV/computer workstations will have
access to a large number of databases, including digital
libraries, image and video databases that will distribute
vast amounts of multisource multimedia content.

Multimedia Databases (2)

ƒ DBMSs have been constantly adding to the types of
data they support.
ƒ Today many types of multimedia data are available
in current systems.

Multimedia Databases(3)

ƒ Types of multimedia data available in current systems:
• Text: May be formatted or unformatted. For ease of
parsing structured documents, standards like SGML and
variations such as HTML are being used.
• Graphics: Examples include drawings and illustrations
that are encoded using some descriptive standards (e.g.,
CGM, PICT, PostScript).

Multimedia Databases(4)
ƒ Types of multimedia data available in current systems (contd.):
• Images: Includes drawings, photographs, and so forth,
encoded in standard formats such as bitmap, JPEG, and
MPEG. Compression is built into JPEG and MPEG.
→These images are not subdivided into components. Hence
querying them by content (e.g., find all images containing circles)
is nontrivial.
• Animations: Temporal sequences of image or graphic
data.

Multimedia Databases(5)

ƒ Types of multimedia data available in current systems (contd.):
• Video: A set of temporally sequenced photographic data
for presentation at specified rates, for example, 30
frames per second.
• Structured audio: A sequence of audio components
comprising note, tone, duration, and so forth.

Multimedia Databases(6)

ƒ Types of multimedia data available in current systems (contd.):
• Audio: Sample data generated from aural recordings in a
string of bits in digitized form. Analog recordings are
typically converted into digital form before storage.
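
The conversion from an analog signal to a string of bits can be sketched as sampling followed by quantization. The example below is a minimal illustration under assumptions: the function name, the 8 kHz sample rate, and the 8-bit depth are hypothetical choices, and the "analog" source is simply a Python function of time.

```python
import math


def sample_and_quantize(signal, duration_s, sample_rate_hz, bits=8):
    """Sample signal(t), expected in [-1, 1], and map each sample to an integer level."""
    levels = 2 ** bits
    count = int(duration_s * sample_rate_hz)
    samples = []
    for i in range(count):
        t = i / sample_rate_hz
        value = max(-1.0, min(1.0, signal(t)))  # clip to the valid range
        samples.append(round((value + 1.0) / 2.0 * (levels - 1)))
    return samples


def tone(t):
    """A 440 Hz sine wave standing in for an analog recording."""
    return math.sin(2 * math.pi * 440 * t)


digital = sample_and_quantize(tone, duration_s=0.5, sample_rate_hz=8000)
```

Each entry of `digital` fits in one byte, so the half-second recording occupies 4000 bytes before any compression.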

Multimedia Databases(7)

ƒ Types of multimedia data available in current systems (contd.):
• Composite or mixed multimedia data: A combination of
multimedia data types such as audio and video which
may be physically mixed to yield a new storage format or
logically mixed while retaining original types and formats.
Composite data also contains additional control
information describing how the information should be
rendered.

Data Management Issues

ƒ Multimedia applications dealing with thousands of
images, documents, audio and video segments, and
free text data depend critically on:
• Appropriate modeling of the structure and content of data
• Designing appropriate database schemas for storing and
retrieving multimedia information.

Geographic Information Systems

ƒ Geographic information systems (GIS) are used to
collect, model, and analyze information describing
physical properties of the geographical world.

Geographic Information Systems(2)
ƒ The scope of GIS broadly encompasses two types of data:
• Spatial data, originating from maps, digital images,
administrative and political boundaries, roads, and transportation
networks, as well as physical data such as rivers, soil
characteristics, climatic regions, and land elevations.
• Non-spatial data, such as socio-economic data (like census
counts), economic data, and sales or marketing information.
GIS is a rapidly developing domain that offers highly innovative
approaches to meet some challenging technical demands.

Geographic Information Systems(3)

Spatial data

GIS Applications

ƒ It is possible to divide GISs into three categories:


• Cartographic applications
• Digital terrain modeling applications
• Geographic objects applications

GIS Applications(2)

GIS Applications
ƒ Cartographic Applications
• Irrigation
• Crop yield analysis
• Land evaluation
• Planning and facilities management
• Landscape studies
• Traffic pattern analysis
ƒ Digital Terrain Modeling Applications
• Earth science
• Civil engineering and military evaluation
• Soil surveys
• Air and water pollution studies
• Flood control
• Water resource management
ƒ Geographic Objects Applications
• Car navigation systems
• Geographic market analysis
• Utility distribution and consumption
• Consumer product and services – economic analysis

Data Management Requirements of GIS

ƒ The functional requirements of the GIS applications
above translate into the following database
requirements.

Data Management Requirements of GIS (2)

Data Modeling and Representation


ƒ GIS data can be broadly represented in two
formats:
• Vector data represents geometric objects such as points,
lines, and polygons.

Data Management Requirements of GIS (3)

ƒ Data Modeling and Representation (contd.):


• Raster data is characterized as an array of points, where
each point represents the value of an attribute for a
real-world location.
→Informally, raster images are n-dimensional arrays where each
entry is a unit of the image and represents an attribute.
Two-dimensional units are called pixels, while three-dimensional
units are called voxels.
→Three-dimensional elevation data is stored in a raster-based
digital elevation model (DEM) format.
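
The two representations can be contrasted with a small sketch. The coordinates, the elevation values, and the helper names below are made up for illustration: vector data is held as coordinate lists, and raster data as a two-dimensional array whose entries are attribute values.

```python
# Vector data: geometric objects described by their coordinates.
point = (2.0, 3.0)
line = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0)]
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]


def polygon_area(vertices):
    """Area of a simple polygon, computed with the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Raster data: each pixel holds the value of one attribute
# (here, a hypothetical elevation in meters) for a real-world location.
dem = [
    [120, 125, 131],
    [118, 122, 129],
    [115, 119, 124],
]


def elevation_at(raster, row, col):
    """Look up the attribute value stored for one pixel."""
    return raster[row][col]


def highest_elevation(raster):
    """Scan every pixel for the maximum attribute value."""
    return max(max(row) for row in raster)
```

Geometric questions (area, length, containment) are natural on the vector form, while per-location attribute lookups and scans are natural on the raster form.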

Data Management Requirements of GIS (4)

Data Integration
ƒ GISs must integrate both vector and raster data from a
variety of sources.
• Sometimes edges and regions are inferred from a raster image
to form a vector model, or conversely, raster images such as
aerial photographs are used to update vector models.
• Several coordinate systems such as Universal Transverse
Mercator (UTM), latitude/longitude, and local cadastral systems
are used to identify locations.
• Data originating from different coordinate systems requires
appropriate transformations.

Specific GIS Data Operations

ƒ GIS applications are conducted through the use of
special operators such as the following:
• Interpolation
• Interpretation
• Proximity analysis
• Raster image processing
• Analysis of networks

Specific GIS Data Operations(2)
ƒ The functionality of a GIS database is also subject to other
considerations:
• Extensibility
• Data quality control
• Visualization
ƒ Such requirements clearly illustrate that standard RDBMSs
or ODBMSs do not meet the special needs of GIS.
• Therefore it is necessary to design systems that support the
vector and raster representations and the spatial functionality
as well as the required DBMS features.

Bioinformatics
ƒ Bioinformatics: The study of genetics can be divided into
three branches:
• Mendelian genetics is the study of the transmission of traits between
generations
• Molecular genetics is the study of the chemical structure and function
of genes at the molecular level
• Population genetics is the study of how genetic information varies
across populations of organisms
ƒ Bioinformatics addresses information management of
genetic information with special emphasis on DNA sequence
analysis
ƒ Interdisciplinary research field
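
To give a flavor of DNA sequence analysis, the sketch below treats a sequence as a string over the alphabet A, C, G, T and implements three elementary operations. The function names and the sample sequences are chosen for this example only.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Sequence of the opposite strand, read in the reverse direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_content(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_motif(seq, motif):
    """0-based start positions of every occurrence, including overlapping ones."""
    width = len(motif)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == motif]
```

Real sequence databases hold strings that are millions of bases long, which is why indexing and approximate matching, rather than the linear scan in `find_motif`, become data management problems.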

XML: Extensible Markup Language
ƒ Although HTML is widely used for formatting and structuring
Web documents, it is not suitable for specifying structured
data that is extracted from databases.
ƒ A new language, namely XML (eXtensible Markup
Language), has emerged as the standard for structuring and
exchanging data over the Web.
• XML can be used to provide more information about the
structure and meaning of the data in the Web pages rather
than just specifying how the Web pages are formatted for
display on the screen.
ƒ The formatting aspects are specified separately, for
example, by using a formatting language such as XSL
(eXtensible Stylesheet Language).

XML (2)
ƒ Example1:

ƒ Example2:

XML (3)

ƒ The basic object in XML is the XML document.
ƒ There are two main structuring concepts that are
used to construct an XML document:
• Elements
• Attributes
ƒ Attributes in XML provide additional information that
describes elements.
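
The difference between elements and attributes can be seen by parsing a small document with Python's standard `xml.etree.ElementTree` module. The document content (a `project` element with a `number` attribute) is a made-up example, and `describe` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

DOC = """<project number="1">
  <name>ProductX</name>
  <location>Bellaire</location>
</project>"""


def describe(xml_text):
    """Summarize the root element: its tag, its attributes, and its child elements."""
    root = ET.fromstring(xml_text)
    return {
        "element": root.tag,
        "attributes": dict(root.attrib),
        "children": {child.tag: child.text for child in root},
    }


summary = describe(DOC)
```

Here `number` is an attribute that describes the `project` element, whereas `name` and `location` are elements nested inside it.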

XML(4)

ƒ As in HTML, elements are identified in a document by their
start tag and end tag.
• The tag names are enclosed between angle brackets <…>, and
end tags are further identified by a slash </…>.
ƒ Complex elements are constructed from other elements
hierarchically, whereas simple elements contain data
values.
ƒ It is straightforward to see the correspondence between the
XML textual representation and the tree structure.
• In the tree representation, internal nodes represent complex
elements, whereas leaf nodes represent simple elements.
• That is why the XML model is called a tree model or a
hierarchical model.
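
The correspondence between the text and the tree can be made concrete by walking a parsed document and labeling each node. In this sketch the sample document and the `classify` helper are hypothetical; an element with child elements is reported as complex, and one without as simple.

```python
import xml.etree.ElementTree as ET

DOC = ("<projects><project><name>ProductX</name>"
       "<worker><hours>32.5</hours></worker></project></projects>")


def classify(element, depth=0, out=None):
    """Depth-first walk returning (depth, tag, kind) for every element."""
    if out is None:
        out = []
    kind = "complex" if len(element) > 0 else "simple"
    out.append((depth, element.tag, kind))
    for child in element:
        classify(child, depth + 1, out)
    return out


nodes = classify(ET.fromstring(DOC))
```

The internal nodes of the result are exactly the complex elements, and the leaves (`name`, `hours`) are the simple elements that carry data values.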

Definitions of Data Mining

ƒ The discovery of new information in terms of
patterns or rules from vast amounts of data.
ƒ The process of finding interesting structure in data.
ƒ The process of employing one or more computer
learning techniques to automatically analyze and
extract knowledge from data.

Knowledge Discovery in Databases (KDD)
ƒ Data mining is actually one step of a larger process
known as knowledge discovery in databases (KDD).
ƒ The KDD process model comprises six phases:
• Data selection
• Data cleansing
• Enrichment
• Data transformation or encoding
• Data mining
• Reporting and displaying discovered knowledge
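
The six phases can be sketched as a chain of small functions over toy purchase records. Everything in this example is an assumption made for illustration: the records, the ZIP-to-region lookup, the choice of frequent item pairs as the mining step, and the support threshold.

```python
from collections import Counter
from itertools import combinations

RAW = [
    {"customer": "c1", "zip": "10001", "items": ["milk", "bread"]},
    {"customer": "c2", "zip": "20002", "items": ["milk", "bread", "juice"]},
    {"customer": "c3", "zip": "10001", "items": ["milk", "juice"]},
    {"customer": "c4", "zip": None, "items": ["bread"]},
    {"customer": "c5", "zip": "20002", "items": []},
]
REGION_BY_ZIP = {"10001": "east", "20002": "west"}  # hypothetical lookup table


def select(records):
    """1. Data selection: keep only records that contain purchases."""
    return [r for r in records if r["items"]]


def cleanse(records):
    """2. Data cleansing: drop records with a missing ZIP code."""
    return [r for r in records if r["zip"] is not None]


def enrich(records):
    """3. Enrichment: add a region derived from an external source."""
    return [{**r, "region": REGION_BY_ZIP.get(r["zip"], "unknown")} for r in records]


def transform(records):
    """4. Transformation or encoding: one sorted item tuple per basket."""
    return [tuple(sorted(set(r["items"]))) for r in records]


def mine(baskets, min_support=2):
    """5. Data mining: item pairs bought together in at least min_support baskets."""
    counts = Counter(pair for basket in baskets for pair in combinations(basket, 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}


def report(patterns):
    """6. Reporting: render the discovered patterns as readable lines."""
    return [f"{a} + {b}: {n} baskets" for (a, b), n in sorted(patterns.items())]


def kdd(records):
    return report(mine(transform(enrich(cleanse(select(records))))))
```

In this toy run the enrichment result is not used by the mining step; it is included only to show where that phase sits in the chain.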

Data Warehousing

ƒ The data warehouse is a historical database
designed for decision support.
ƒ Data mining can be applied to the data in a
warehouse to help with certain types of decisions.
ƒ Proper construction of a data warehouse is
fundamental to the successful use of data mining.
ƒ W. H. Inmon characterized a data warehouse as:
• “A subject-oriented, integrated, nonvolatile, time-variant
collection of data in support of management’s
decisions.”

Data Warehousing (2)
ƒ Purpose of Data Warehousing
• Traditional databases are not optimized for data access
only; they have to balance the requirement of data access
with the need to ensure integrity of data.
• Most of the time, data warehouse users need only
read access, but they need the access to be fast over a large
volume of data.
• Most of the data required for data warehouse analysis
comes from multiple databases, and these analyses are
recurrent and predictable enough that specific software
can be designed to meet the requirements.
• There is a great need for tools that provide decision
makers with information to make decisions quickly and
reliably based on historical data.
• The above functionality is achieved by Data Warehousing
and Online Analytical Processing (OLAP).
Data Warehousing (3)

ƒ Applications that a data warehouse supports are:
• OLAP (Online Analytical Processing) is a term used to
describe the analysis of complex data from the data
warehouse.
• DSS (Decision Support Systems), also known as EIS
(Executive Information Systems), support an organization’s
leading decision makers in making complex and
important decisions.
• Data Mining is used for knowledge discovery, the
process of searching data for unanticipated new
knowledge.
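
A typical OLAP operation is the roll-up, which aggregates a measure over fewer dimensions. The sketch below is a minimal in-memory illustration; the fact rows, the dimension names, and the `roll_up` helper are all hypothetical.

```python
from collections import defaultdict

SALES = [  # hypothetical fact rows: three dimensions and one measure
    {"year": 2005, "region": "north", "product": "A", "amount": 100},
    {"year": 2005, "region": "south", "product": "A", "amount": 150},
    {"year": 2006, "region": "north", "product": "B", "amount": 200},
    {"year": 2006, "region": "north", "product": "A", "amount": 50},
]


def roll_up(facts, dimensions, measure="amount"):
    """Sum the measure for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


by_year_region = roll_up(SALES, ["year", "region"])
by_year = roll_up(SALES, ["year"])
grand_total = roll_up(SALES, [])
```

Dropping `region` from the dimension list rolls the data up from year-and-region totals to yearly totals; a warehouse precomputes or indexes such aggregates so that they stay fast over large volumes of data.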

Conceptual Structure of Data Warehouse

ƒ Data Warehouse processing involves


• Cleaning and reformatting of data
• OLAP
• Data Mining

Comparison with Traditional Databases
ƒ Data warehouses are mainly optimized for appropriate data
access.
• Traditional databases are transactional and are optimized for both
access mechanisms and integrity assurance measures.
ƒ Data warehouses place more emphasis on historical data, as their
main purpose is to support time-series and trend analysis.
ƒ Compared with transactional databases, data warehouses
are nonvolatile.
ƒ In transactional databases, the transaction is the mechanism
of change to the database. By contrast, information in a data
warehouse is relatively coarse grained, and the refresh policy is
carefully chosen, usually incremental.

Introduction to ASIS_Lab

ƒ Advances in Security & Information Systems Lab
(www.cse.hcmut.edu.vn/~asis)
ƒ Research Directions (2006-2010)
• Information Systems Security:
→Database Security
→Security Issues in E-/M-Commerce
→Security and Privacy in Location-Based Applications
→Security Issues in Outsourced Databases Services
→DBs/ISs Security Visualization
→E-Learning Systems Security
→Digital Watermarking and Steganography
→Privacy and Identity Management

Introduction to ASIS_Lab(2)

ƒ Research Directions (2006-2010) (cont.)


• Advanced Information Systems:
→E-/M-Commerce
→SOA-Based Modern Information Systems
→Large Database Systems
→Web Information Systems
→Modern Information Retrieval Systems
→Stream Data Management*
→Bioinformatics*
ƒ Weekly meeting: (usually) Friday morning, at
HCMUT

Q&A
