
Why use a DTD?


XML provides an application-independent way of sharing data. With a DTD, independent
groups of people can agree on a common format for interchanging data. Your application
can use a standard DTD to verify that data it receives from the outside world is valid.
You can also use a DTD to verify your own data.
 
 
 
XML documents (and HTML documents) are made up of the following building blocks:
Elements, Tags, Attributes, Entities, PCDATA, and CDATA.
This is a brief explanation of each of the building blocks:

 
Elements are the main building blocks of both XML and HTML documents.
Examples of HTML elements are "body" and "table". Examples of XML elements could be
"note" and "message". Elements can contain text, other elements, or be empty. Examples
of empty HTML elements are "hr", "br" and "img".


Tags are used to mark up elements.
A starting tag like <element_name> marks the beginning of an element, and an ending
tag like </element_name> marks the end of an element.
Examples:
A body element: <body>body text in between</body>.
A message element: <message>some message in between</message>



Attributes provide extra information about elements.
Attributes are placed inside the start tag of an element. Attributes come in name/value
pairs. The following "img" element carries additional information about a source file:
<img src="computer.gif" />
The name of the element is "img". The name of the attribute is "src". The value of the
attribute is "computer.gif". Since the element itself is empty it is closed by a " /".
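The element name, attribute name, and attribute value described above can be read back with a parser. This is a minimal sketch using Python's standard xml.etree.ElementTree module; the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# Parse the empty "img" element from the text above.
img = ET.fromstring('<img src="computer.gif" />')

element_name = img.tag          # name of the element
attributes = dict(img.attrib)   # name/value pairs from the start tag
# An empty element has no text and no child elements.
is_empty = img.text is None and len(img) == 0

print(element_name, attributes, is_empty)
```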


PCDATA means parsed character data.
Think of character data as the text found between the start tag and the end tag of an XML
element.
PCDATA is text that will be parsed by a parser. Tags inside the text will be treated as
markup and entities will be expanded.


CDATA also means character data.
CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated
as markup and entities will not be expanded.
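The difference between parsed and unparsed character data can be seen by parsing the same characters twice, once as ordinary element content and once inside a CDATA section. This is a small sketch with Python's standard xml.etree.ElementTree module; the "message" documents are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Parsed character data: <b> becomes a child element and the
# entity reference &amp; is expanded to "&".
parsed = ET.fromstring("<message>Tom <b>&amp;</b> Jerry</message>")

# CDATA section: the same characters are kept as literal text.
literal = ET.fromstring(
    "<message><![CDATA[Tom <b>&amp;</b> Jerry]]></message>"
)

parsed_children = [child.tag for child in parsed]
parsed_child_text = parsed[0].text
literal_children = [child.tag for child in literal]
literal_text = literal.text

print(parsed_children, repr(parsed_child_text))
print(literal_children, repr(literal_text))
```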

  
Entities are variables used to define common text. Entity references are references to
entities.
Most of you will know the HTML entity reference "&nbsp;", which is used to insert an extra
space in an HTML document. Entities are expanded when a document is parsed by an XML
parser.
The following entities are predefined in XML:

Entity reference   Character
&lt;               <
&gt;               >
&amp;              &
&quot;             "
&apos;             '
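The expansion of the predefined entities can be checked with a short script. This is a sketch using Python's standard xml.etree.ElementTree module; the sample "note" text is invented. It also shows that "&nbsp;" is an HTML entity rather than one of the five predefined XML entities, so a plain XML parser rejects it unless a DTD declares it.

```python
import xml.etree.ElementTree as ET

source = (
    "<note>&lt;to&gt; Tom &amp; Jerry say "
    "&quot;hi&quot; &apos;now&apos;</note>"
)
# The parser replaces each predefined entity reference with its character.
expanded = ET.fromstring(source).text

# &nbsp; is not predefined in XML, so parsing it without a declaration fails.
try:
    ET.fromstring("<note>a&nbsp;b</note>")
    nbsp_accepted = True
except ET.ParseError:
    nbsp_accepted = False

print(expanded)
print(nbsp_accepted)
```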

   
In the DTD, XML elements are declared with an element declaration. An element declaration
has the following syntax:
<!ELEMENT element-name (element-content)>

 
Empty elements are declared with the keyword EMPTY, written without parentheses:
<!ELEMENT element-name EMPTY>

example:
<!ELEMENT img EMPTY>

    
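Elements that contain only character data are declared with #PCDATA inside parentheses:
<!ELEMENT element-name (#PCDATA)>
Elements that may have any content are declared with the keyword ANY, written without
parentheses:
<!ELEMENT element-name ANY>
example:
<!ELEMENT note (#PCDATA)>

#PCDATA means that the element contains character data that IS going to be parsed by a
parser.
There is no #CDATA keyword for element content. CDATA is an attribute type, and text that
should not be parsed is written as a CDATA section (<![CDATA[ ... ]]>) in the document itself.
The keyword ANY declares an element that can contain any combination of declared elements
and parsed character data.
An element declared as (#PCDATA) cannot contain child elements. An element that mixes text
and child elements needs a mixed-content declaration, and its child elements must also be
declared.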

Elements with children (sequences)
Elements with one or more children are declared with the names of the child elements
inside the parentheses:
<!ELEMENT element-name (child-element-name)>
or
<!ELEMENT element-name (child-element-name,child-element-name,.....)>
example:
<!ELEMENT note (to,from,heading,body)>

When children are declared in a sequence separated by commas, the children must appear
in the same sequence in the document. In a full declaration, the children must also be
declared, and the children can also have children. The full declaration of the note document
will be:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

Wrapping the DTD in a DOCTYPE definition
If the DTD is to be included in your XML source file, it should be wrapped in a DOCTYPE
definition with the following syntax:
<!DOCTYPE root-element [element-declarations]>
example:
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>

     


Declaring exactly one occurrence of an element
<!ELEMENT element-name (child-name)>
example:
<!ELEMENT note (message)>
The example declaration above declares that the child element message must occur exactly
once inside the note element.

      


Declaring one or more occurrences of an element
<!ELEMENT element-name (child-name+)>
example:
<!ELEMENT note (message+)>
The + sign in the example above declares that the child element message must occur one
or more times inside the note element.

Declaring zero or more occurrences of an element
<!ELEMENT element-name (child-name*)>
example:
<!ELEMENT note (message*)>
The * sign in the example above declares that the child element message can occur zero or
more times inside the note element.

Declaring zero or one occurrence of an element
<!ELEMENT element-name (child-name?)>
example:
<!ELEMENT note (message?)>
The ? sign in the example above declares that the child element message can occur either
not at all or exactly once inside the note element.
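One way to get a feel for the occurrence indicators is to translate a content model into a regular expression, since +, * and ? carry the same meaning in both notations. The sketch below is only an illustration and not how a real validating parser is implemented. The function name sequence_matches is hypothetical, and it handles only comma-separated names with an optional indicator (no nested groups, no | choices).

```python
import re

def sequence_matches(content_model, children):
    """Check a list of child names against a comma-separated content model.

    Only plain names with an optional +, * or ? suffix are supported.
    """
    pattern = ""
    for item in content_model.split(","):
        item = item.strip()
        suffix = item[-1] if item[-1] in "+*?" else ""
        name = item.rstrip("+*?")
        # Each name is followed by a space so names cannot run together.
        pattern += "(?:" + re.escape(name) + " )" + suffix
    text = "".join(child + " " for child in children)
    return re.fullmatch(pattern, text) is not None

print(sequence_matches("message+", ["message", "message"]))
print(sequence_matches("message?", ["message", "message"]))
print(sequence_matches("to,from,heading,body", ["from", "to", "heading", "body"]))
```

The comma in the content model maps to plain concatenation in the regular expression, which is why the children must appear in the declared order.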

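Declaring mixed content

example:
<!ELEMENT note (#PCDATA|to|from|header|message)*>
The example above declares that the element note can contain parsed character data
together with any number of to, from, header, and message child elements, in any order.
In a mixed-content declaration #PCDATA must come first, the alternatives are separated
by |, and the whole group ends with *. Occurrence indicators cannot be applied to the
individual children, so a declaration such as (to+,from,header,message*,#PCDATA) is not
legal.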
The purpose of a DTD is to define the legal building blocks of an XML document. It defines
the document structure with a list of legal elements. A DTD can be declared inline in your
XML document, or as an external reference.

Internal DTD
This is an XML document with a Document Type Definition:
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
The DTD is interpreted like this:
!ELEMENT note (line 2 of the DOCTYPE definition) defines the element "note" as having
four elements: "to,from,heading,body".
!ELEMENT to (line 3) defines the "to" element to be of the type "#PCDATA".
!ELEMENT from (line 4) defines the "from" element to be of the type "#PCDATA",
and so on.

External DTD
This is the same XML document with an external DTD:
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>

<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
This is a copy of the file "note.dtd" containing the Document Type Definition:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

  

In the DTD, XML element attributes are declared with an ATTLIST declaration. An attribute
declaration has the following syntax:
<!ATTLIST element-name attribute-name attribute-type default-value>

As you can see from the syntax above, the ATTLIST declaration defines the element which
can have the attribute, the name of the attribute, the type of the attribute, and the default
attribute value.
The attribute-type can have the following values:

Value            Explanation
CDATA            The value is character data
(eval|eval|..)   The value must be one of an enumerated list of values
ID               The value is a unique id
IDREF            The value is the id of another element
IDREFS           The value is a list of other ids
NMTOKEN          The value is a valid XML name token
NMTOKENS         The value is a list of valid XML name tokens
ENTITY           The value is an entity
ENTITIES         The value is a list of entities
NOTATION         The value is a name of a notation
xml:             The value is predefined

The attribute-default-value can have the following values:

Value            Explanation
"value"          The attribute has a default value
#REQUIRED        The attribute value must be included in the element
#IMPLIED         The attribute does not have to be included
#FIXED "value"   The attribute value is fixed


Attribute declaration example
DTD example:
<!ELEMENT square EMPTY>
<!ATTLIST square width CDATA "0">

XML example:
<square width="100"></square>

In the above example the element square is defined to be an empty element with the
attribute width of type CDATA. The width attribute has a default value of 0.

 
Default attribute value
Syntax:
<!ATTLIST element-name attribute-name CDATA "default-value">

DTD example:
<!ATTLIST payment type CDATA "check">

XML example:
<payment type="check" />

Specifying a default value for an attribute ensures that the attribute will get a value even if
the author of the XML document didn't include it.
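The effect of a default value can be observed with a parser that reads the internal DTD. The sketch below uses the low-level xml.parsers.expat module from Python's standard library, on the assumption that its default settings report attributes supplied by ATTLIST declarations as well as those written in the start tag. The handler name record_start and the "payment" document are only illustrative.

```python
import xml.parsers.expat

# The payment element is written without a type attribute.
document = """<!DOCTYPE payment [
<!ELEMENT payment EMPTY>
<!ATTLIST payment type CDATA "check">
]>
<payment />"""

seen = {}

def record_start(name, attrs):
    # Store the attributes the parser reports for each start tag.
    seen[name] = dict(attrs)

parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = record_start
parser.Parse(document, True)

payment_type = seen["payment"].get("type")
print(payment_type)
```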

Implied attribute

Syntax:
<!ATTLIST element-name attribute-name attribute-type #IMPLIED>
DTD example:
<!ATTLIST contact fax CDATA #IMPLIED>

XML example:
<contact fax="555-667788" />

Use an implied attribute if you don't want to force the author to include an attribute and you
don't have an option for a default value either.

Required attribute

Syntax:
<!ATTLIST element-name attribute-name attribute-type #REQUIRED>
DTD example:
<!ATTLIST person number CDATA #REQUIRED>

XML example:
<person number="5677" />

Use a required attribute if you don't have an option for a default value, but still want to
force the attribute to be present.

Fixed attribute value
Syntax:
<!ATTLIST element-name attribute-name attribute-type #FIXED "value">
DTD example:
<!ATTLIST sender company CDATA #FIXED "Microsoft">

XML example:
<sender company="Microsoft" />

Use a fixed attribute value when you want an attribute to have a fixed value without
allowing the author to change it. If an author includes another value, a validating XML
parser will return an error.

 
Enumerated attribute values
Syntax:
<!ATTLIST element-name attribute-name (eval|eval|..) default-value>
DTD example:
<!ATTLIST payment type (check|cash) "cash">

XML example:
<payment type="check" />
or
<payment type="cash" />

Use enumerated attribute values when you want the attribute values to be one of a fixed set
of legal values.

XML Namespaces
XML Namespaces provide a method to avoid element name conflicts.

Name Conflicts
In XML, element names are defined by the developer. This often results in a conflict when
trying to mix XML documents from different XML applications.
This XML carries HTML table information:
<table>
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>
</table>
This XML carries information about a table (a piece of furniture):
<table>
<name>African Coffee Table</name>
<width>80</width>
<length>120</length>
</table>
If these XML fragments were added together, there would be a name conflict. Both contain
a <table> element, but the elements have different content and meaning.
An XML parser will not know how to handle these differences.

Solving the Name Conflict Using a Prefix
Name conflicts in XML can easily be avoided using a name prefix.
This XML carries information about an HTML table, and a piece of furniture:
<h:table>
<h:tr>
<h:td>Apples</h:td>

<h:td>Bananas</h:td>
</h:tr>
</h:table>

<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
In the example above, there will be no conflict because the two <table> elements have
different names.

The xmlns Attribute

When using prefixes in XML, a so-called namespace for the prefix must be defined.
The namespace is defined by the xmlns attribute in the start tag of an element.
The namespace declaration has the following syntax: xmlns:prefix="URI".

<root
xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="http://www.w3schools.com/furniture">

<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>

<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>

</root>
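A namespace-aware parser resolves each prefix to its URI, so the two table elements end up with different full names. This is a brief sketch with Python's standard xml.etree.ElementTree module, which writes a qualified name as {URI}local-name. The document is a shortened copy of the example above.

```python
import xml.etree.ElementTree as ET

document = """<root
 xmlns:h="http://www.w3.org/TR/html4/"
 xmlns:f="http://www.w3schools.com/furniture">
<h:table><h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr></h:table>
<f:table><f:name>African Coffee Table</f:name></f:table>
</root>"""

root = ET.fromstring(document)

# Both children are called "table", but in different namespaces.
table_tags = [child.tag for child in root]

# Prefixes used in searches are mapped to URIs by the caller.
namespaces = {
    "h": "http://www.w3.org/TR/html4/",
    "f": "http://www.w3schools.com/furniture",
}
cells = [td.text for td in root.findall("h:table/h:tr/h:td", namespaces)]
furniture_name = root.find("f:table/f:name", namespaces).text

print(table_tags)
print(cells, furniture_name)
```

The prefix itself is only shorthand. What identifies the element is the URI it is bound to.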
Uniform Resource Identifier (URI)
A Uniform Resource Identifier (URI) is a string of characters which identifies an Internet
Resource.
The most common URI is the Uniform Resource Locator (URL) which identifies an
Internet domain address. Another, not so common type of URI is the Uniform Resource
Name (URN).
In our examples we will only use URLs.

Default Namespaces
Defining a default namespace for an element saves us from using prefixes in all the child
elements. It has the following syntax:
xmlns="namespaceURI"
This XML carries HTML table information:
<table xmlns="http://www.w3.org/TR/html4/">
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>

</table>
This XML carries information about a piece of furniture:
<table xmlns="http://www.w3schools.com/furniture">
<name>African Coffee Table</name>
<width>80</width>
<length>120</length>
</table>
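With a default namespace, the unprefixed child elements belong to the namespace too. A short check with Python's standard xml.etree.ElementTree module, using the furniture example above, might look like this:

```python
import xml.etree.ElementTree as ET

table = ET.fromstring(
    '<table xmlns="http://www.w3schools.com/furniture">'
    "<name>African Coffee Table</name>"
    "<width>80</width>"
    "<length>120</length>"
    "</table>"
)

# The element itself and every unprefixed child carry the default namespace.
tags = [table.tag] + [child.tag for child in table]
print(tags)
```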

Namespaces in Real Use


XSLT is an XML language that can be used to transform XML documents into other formats,
like HTML.
In the XSLT document below, you can see that most of the tags are HTML tags.
The tags that are not HTML tags have the prefix xsl, identified by the namespace
xmlns:xsl="http://www.w3.org/1999/XSL/Transform":
<?xml version="1.0" encoding="ISO-8859-1"?>

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<table border="1">
<tr>
<th align="left">Title</th>
<th align="left">Artist</th>
</tr>
<xsl:for-each select="catalog/cd">
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="artist"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>

</xsl:stylesheet>
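The split between XSLT instructions and literal HTML tags is made by namespace, not by the spelling of the prefix. The sketch below parses a shortened version of the stylesheet with Python's standard xml.etree.ElementTree module and sorts the element names by namespace. It does not run the transformation, which would need an XSLT processor.

```python
import xml.etree.ElementTree as ET

XSL_NS = "{http://www.w3.org/1999/XSL/Transform}"

stylesheet = """<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html><body><table>
<xsl:for-each select="catalog/cd">
<tr><td><xsl:value-of select="title"/></td></tr>
</xsl:for-each>
</table></body></html>
</xsl:template>
</xsl:stylesheet>"""

root = ET.fromstring(stylesheet)

xsl_names = []   # elements in the XSLT namespace
html_names = []  # literal result elements (no namespace here)
for element in root.iter():
    if element.tag.startswith(XSL_NS):
        xsl_names.append(element.tag[len(XSL_NS):])
    else:
        html_names.append(element.tag)

print(xsl_names)
print(html_names)
```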

XML Schema

What Is an XML Schema?
XML Schema is an XML-based language used to create XML-based languages and data
models. An XML schema defines element and attribute names for a class of XML documents.
The schema also specifies the structure that those documents must adhere to and the type
of content that each element can hold.
XML documents that attempt to adhere to an XML schema are said to be instances of that
schema. If they correctly adhere to the schema, then they are valid instances. This is not
the same as being well formed. A well-formed XML document follows all the syntax rules of
XML, but it does not necessarily adhere to any particular schema. So, an XML document can be
well formed without being valid, but it cannot be valid unless it is well formed.
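The well-formedness half of this distinction can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module, which checks syntax only and never validates against a schema. The helper name is_well_formed and both sample documents are invented for the illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

# Correct syntax, although no schema has been consulted at all.
syntactically_fine = "<note><to>Tove</to><unexpected/></note>"
# The <to> element is never closed, so this is not XML at all.
broken = "<note><to>Tove</note>"

print(is_well_formed(syntactically_fine))
print(is_well_formed(broken))
```

Whether the first document is also valid depends on the schema it is checked against, which this parser does not examine.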
DTDs Compared with XML Schema
You may already have some experience with DTDs. DTDs are similar to XML schemas in that
they are used to create classes of XML documents. DTDs were around long before the
advent of XML. They were originally created to define languages based on SGML, the parent
of XML. Although DTDs are still common, XML Schema is a much more powerful language.
As a means of understanding the power of XML Schema, let's look at the limitations of DTD.
1. DTDs do not have built-in datatypes.
2. DTDs do not support user-derived datatypes.
3. DTDs allow only limited control over cardinality (the number of occurrences of an
   element within its parent).
4. DTDs do not support Namespaces or any simple way of reusing or importing other
   schemas.
A First Look
An XML schema describes the structure of an XML instance document by defining what each
element must or may contain. An element is limited by its type. For example, an element of
complex type can contain child elements and attributes, whereas a simple-type element can
only contain text. The diagram below gives a first look at the types of XML Schema
elements.

Schema authors can define their own types or use the built-in types. Throughout this
course, we will refer back to this diagram as we learn to define elements. You may want to
bookmark this page, so that you can easily reference it.
The following is a high-level overview of Schema types.
1. Elements can be of simple type or complex type.
2. Simple-type elements can only contain text. They cannot have child elements or
   attributes.
3. All the built-in types are simple types (e.g., xs:string).
4. Schema authors can derive simple types by restricting another simple type. For
   example, an email type could be derived by limiting a string to a specific pattern.
5. Simple types can be atomic (e.g., strings and integers) or non-atomic (e.g., lists).
6. Complex-type elements can contain child elements and attributes as well as text.
7. By default, complex-type elements have complex content, meaning that they have
   child elements.
8. Complex-type elements can be limited to having simple content, meaning they only
   contain text. They are different from simple-type elements in that they have
   attributes.
9. Complex types can be limited to having no content, meaning they are empty, but
   they may have attributes.
10. Complex types may have mixed content, a combination of text and child elements.
A Simple XML Schema
Let's take a look at a simple XML schema, which is made up of one complex-type element
with two child simple-type elements.
Code Sample: Author.xsd
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Author">
<xs:complexType>
<xs:sequence>
<xs:element name="FirstName" type="xs:string" />
<xs:element name="LastName" type="xs:string" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Code Explanation
As you can see, an XML schema is an XML document and must follow all the syntax rules of
any other XML document; that is, it must be well formed. XML schemas also have to follow
the rules defined in the "Schema of schemas," which defines, among other things, the
structure of an XML schema and the element and attribute names it may use.
Although it is not required, it is a common practice to use the xs qualifier to identify Schema
elements and types.
The document element of XML schemas is xs:schema. It takes the attribute xmlns:xs with
the value of http://www.w3.org/2001/XMLSchema, indicating that the document should
follow the rules of XML Schema. This will be clearer after you learn about namespaces.
In this XML schema, we see an xs:element element within the xs:schema element.
xs:element is used to define an element. In this case it defines the element Author as a
complex type element, which contains a sequence of two elements: FirstName and
LastName, both of which are of the simple type, string.
Validating an XML Instance Document
In the last section, you saw an example of a simple XML schema, which defined the
structure of an Author element. The code sample below shows a valid XML instance of this
XML schema.
Code Sample: an XML instance of Author.xsd
<?xml version="1.0"?>
<Author xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="Author.xsd">
<FirstName>Mark</FirstName>
<LastName>Twain</LastName>
</Author>
Code Explanation

This is a simple XML document. Its document element is Author, which contains two child
elements: FirstName and LastName, just as the associated XML schema requires.
The xmlns:xsi attribute of the document element indicates that this XML document is an
instance of an XML schema. The document is tied to a specific XML schema with the
xsi:noNamespaceSchemaLocation attribute.
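How the instance points at its schema can be inspected with an ordinary parser. The sketch below uses Python's standard xml.etree.ElementTree module to read the xsi:noNamespaceSchemaLocation attribute and list the child elements. It only reads the document and does not validate it, since schema validation needs a separate XML Schema processor.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

instance = """<Author xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:noNamespaceSchemaLocation="Author.xsd">
<FirstName>Mark</FirstName>
<LastName>Twain</LastName>
</Author>"""

author = ET.fromstring(instance)

# Namespaced attributes are addressed as {URI}local-name.
schema_location = author.get("{" + XSI + "}noNamespaceSchemaLocation")
child_names = [child.tag for child in author]

print(schema_location)
print(child_names)
```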
There are many ways to validate the XML instance. If you are using an XML authoring tool,
it is very likely able to perform the validation for you. Alternatively, a couple of simple
online XML Schema validator tools are listed below.
- http://tools.decisionsoft.com/schemaValidate.html provided by DecisionSoft.
- http://apps.gotdotnet.com/xmltools/xsdvalidator provided by GotDotNet.
Conclusion
In this lesson of the XML tutorial, you have learned to create a very simple XML Schema and
to use it to validate an XML instance document. You are now ready to learn more advanced
features of XML Schema.