
XHTML Tutorial

XHTML is the next generation of HTML and is a hybrid between HTML and XML.
XML was designed to describe data. HTML was designed to display data.
XHTML is much stricter than HTML. Not all browsers support XML, so XHTML provides an intermediate solution that can be interpreted by both XML and HTML browsers.
Start learning now!

What is an XHTML File?

XHTML stands for EXtensible HyperText Markup Language
XHTML is similar to HTML; the main difference is that XHTML is a stricter and cleaner version of HTML
An XHTML file contains small markup tags
These markup tags describe how the page must be displayed in the browser
An XHTML file usually has an .xhtml file extension (.html and .htm are also common)
An XHTML file can be created using any simple text editor

File extension: .xhtml
MIME type: application/xhtml+xml
Type code: TEXT
Developed by: W3C
Type of format: markup language
Standard(s): XHTML 1.0, XHTML-MP 1.0

Working with XHTML

XHTML stands for eXtensible HyperText Markup Language. It is simply HTML 4 rewritten as an XML application, which makes it possible to manipulate HTML documents in the same way as you would work with XML.

The path from HTML to XHTML-MP 1.0

HTML 4 -> XHTML 1.0 -> XHTML Basic -> XHTML Mobile Profile
As the world was going crazy about XML, the most popular markup language could not escape its destiny either: XHTML 1.0 is just a reformulation of HTML 4 in XML. An HTML 4 document simply has to follow a few syntactic rules to be XHTML 1.0:

All tags should be written in lowercase
All tags need to be closed, either with an end tag (<b>bold text</b>) or, for empty elements, with a trailing slash (<br />). A bare line break (<br>) is not syntactically correct
All documents should have a document type definition
All documents should be well-formed
Attribute minimization (shortening of attributes) is not allowed
Tags must be properly nested

How GOOD and how BAD is XHTML?

The two main benefits are extensibility and portability, and XHTML is a standard too.

Extensibility: As XHTML is extensible, we can create and add our own tags.

Portability: New tags that are created and added can be processed by any XML-aware user agent.

Standardization: Unlike HTML, XHTML follows strict standards that act as a template. What is accepted and what is not accepted is defined by that template.

On the other side of the coin, there are a couple of problems.

XHTML is not as easy to pick up in a day as HTML. HTML is like a toy that everyone can play with, but people may lose interest when faced with the rigid rules of XHTML. We can still write HTML as we used to: the XHTML DTDs contain the HTML elements, and the only requirement is an HTML declaration statement at the top of the document.

For that reason, XHTML and XML can go against the guidelines laid down by the W3C to make web content and authoring tools accessible to disabled users.

Why XHTML?

What is wrong with HTML?

HTML is the set of codes (i.e. the "markup language") that a writer puts into a document to make the page displayable on the World Wide Web. HTML (HyperText Markup Language) has been the lingua franca of the World Wide Web since 1990. It has gone through several revisions, and is now at version 4. Although it has been enormously successful, the language is no longer suitable for deploying commercial and industrial web-based applications on the Internet and intranets.
When XHTML 1.0 was published, the W3C did not plan another revision of HTML except as an application of XML, i.e. XHTML. HTML is enormously successful, and XHTML was fully expected to be of great interest to web developers once it became a W3C Recommendation.
At that time no "HTML 5" was planned. Why not? Bear in mind that HTML was originally designed for a different purpose than today's very demanding hi-tech Internet: namely, to exchange data and documents between scientists associated with CERN, the birthplace of the web. Since then the language has been hacked and stretched into an unwieldy monster, and the prevalence of sloppy markup practices makes it hard or impossible for some user agents (such as browsers and spiders) to make sense of the web.
After a decade of use and ad-hoc evolution, there is a strong need for a more extensible and more portable language.

The role of XML


XML (Extensible Markup Language) is a structured set of rules for defining any kind of data that has to be shared on the Web. It is called "extensible" because anyone can invent a particular set of markup for a particular purpose, and as long as everyone uses it, it can be adopted and used for different purposes, including, as it happens, describing the appearance of a Web page.
However, the immediate issue is to ease the transition from HTML for the mass of developers who are already familiar with it. That being the case, it seemed desirable to reframe HTML in terms of XML. The result is XHTML, which is a particular application of XML for "expressing" Web pages.
XHTML is the follow-on version of HTML 4. You could think of it as HTML 5, except that it is called XHTML 1.0. In XHTML, all HTML 4 markup tags and attributes continue to be supported.
With HTML, authors had a fixed set of elements to use, with no variation. XHTML, however, can be extended by anyone who uses it. New tags and attributes can be defined and added to those that already exist, making possible new ways to embed content and programming in a Web page. XHTML 1.0 allows authors to mix and match the known HTML 4 elements with elements from other XML languages, including those developed by the W3C.
Combining HTML with other tag sets extends the functionality of the web: multimedia (Synchronized Multimedia Integration Language - SMIL), mathematical expressions (MathML), two-dimensional vector graphics (Scalable Vector Graphics - SVG), and metadata (Resource Description Framework - RDF).

Why should we go for XHTML


The usual reasons to upgrade to a new version of a language are to take advantage of new bells and whistles, and because problems with the earlier version have been fixed. However, XHTML is just a faithful copy of HTML 4 as far as tag functionality goes, so we can't expect any fancy new tags.
According to the W3C, XHTML should be used because of:

Extensibility
XML documents need to be well-formed. Under HTML (an SGML application), adding a new group of elements requires altering the entire DTD. In an XML-based DTD, all that is required is that the new set of elements be internally consistent and well-formed to be added to an existing DTD. This makes the development and integration of new collections of elements much easier.

Portability
The use of non-desktop devices to access Internet documents is increasing; by some estimates, 75% of Internet access could be carried out on these alternate platforms. Most of these non-desktop devices will not have the computing power of a desktop computer, and they are not designed to handle ill-formed HTML as current browsers tend to do. These non-desktop browsers may not display a document if they do not receive well-formed markup (HTML or XHTML).

While HTML isn't completely lacking those attributes, we're all too familiar with how painfully slow its evolution has been (relative to the pace of Internet development), and how hard it is to make your pages work on a wide range of browsers and platforms. XHTML will help to remedy those problems.

Differences Between XHTML And HTML


What makes XHTML different from HTML

Tag and attribute names must be in lower-case


Elements must be nested properly, no overlapping
Non-empty elements must be closed
Empty elements must be terminated
All attribute values must be quoted
Attribute value pairs cannot be shortened

<script> and <style> elements

Tag and attribute names must be in lower-case


XHTML element and attribute names must be written in lowercase, because XML is case-sensitive. You can no longer get away with the old readability habit of typing element and attribute names in uppercase and the values in lowercase. Attribute values can be in any case you want. For example, the "#ffcc33" value below can also be written as "#FFCC33".
HTML: <TD BGCOLOR="#ffcc33">
XHTML: <td bgcolor="#ffcc33">
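The case-sensitivity rule can be observed with any XML parser. The following is only a minimal sketch using Python's standard xml.etree.ElementTree module; the helper names tag_name and parses are made up for this illustration and are not part of XHTML or of any library.

```python
import xml.etree.ElementTree as ET


def tag_name(markup):
    """Parse a single element and return its tag name exactly as written."""
    return ET.fromstring(markup).tag


def parses(markup):
    """Return True if an XML parser accepts the markup as well-formed."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# An XML parser keeps the case of names, so "TD" and "td" are different elements.
upper = tag_name('<TD BGCOLOR="#ffcc33"></TD>')
lower = tag_name('<td bgcolor="#ffcc33"></td>')
print(upper, lower, upper == lower)

# Start and end tags must also match in case.
print(parses('<td bgcolor="#ffcc33"></TD>'))
```

The parser accepts the uppercase element as well-formed XML, but it is not the td element that the XHTML DTDs define, which is why XHTML requires lowercase names.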

Elements must be nested properly, no overlapping

Browsers don't care about overlapping elements in most cases. For example, if there is a bold tag at the end of a paragraph, it does not matter to them whether the </b> or the </p> is closed first. But with XML and XHTML, the most recently opened element must be closed first, and the first opened element last.
HTML: <p>Be <b>bold!</p></b>
XHTML: <p>Be <b>bold!</b></p>

Overlapping is widely tolerated in HTML, though it is illegal. An XHTML document must be well-formed XML: it must follow the basic XML syntax. If it fails to do so, the XML parser is under no obligation to continue processing the document. An XML parser will not try to recover and "guess" what you meant when the syntax is wrong, as HTML parsers do.
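To see how an XML parser reacts to overlapping elements, here is a small sketch using Python's standard xml.etree.ElementTree module. The function name check_well_formed is an assumption made for this example only.

```python
import xml.etree.ElementTree as ET


def check_well_formed(markup):
    """Return "ok" for well-formed XML, otherwise the parser's error message."""
    try:
        ET.fromstring(markup)
        return "ok"
    except ET.ParseError as err:
        return f"rejected: {err}"


print(check_well_formed("<p>Be <b>bold!</b></p>"))   # properly nested
print(check_well_formed("<p>Be <b>bold!</p></b>"))   # overlapping elements
```

The parser stops at the first mismatched end tag and reports an error instead of guessing at the intended structure.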
Non-empty elements must be closed
All elements must be closed, either explicitly or implicitly. Since <p> is designed to mark the beginning and end of a paragraph, it is a "non-empty" tag, and so it must be closed at the end of the paragraph.
HTML:
First paragraph<p>
Second paragraph<p>
XHTML:
<p>First paragraph</p>
<p>Second paragraph</p>

Affected Elements: <basefont>, <body>, <colgroup>, <dd>, <dt>, <head>, <html>, <li>, <p>, <tbody>/<thead>/<tfoot>, <th>/<td>, <tr>
Empty elements must be terminated

Some tags never contain any content. A <p> tag contains a paragraph and a <b> tag contains text to be bolded, but a <br> tag is "empty" because it never contains any content. Other tags like this are <hr> and <img src="valid.gif">. In XHTML such empty elements must be terminated with a trailing slash.
HTML                     XHTML
<hr>                     <hr />
<br>                     <br />
<input ... >             <input ... />
<param ... >             <param ... />
<img src="valid.gif">    <img src="valid.gif" />

Affected Elements: <area>, <base>, <br>, <col>, <frame>, <hr>, <img>, <input>, <isindex>, <link>, <meta>, <option>, <param>
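As an illustration, an XML serializer writes an element with no content in exactly this terminated form. The sketch below uses Python's standard xml.etree.ElementTree module; the variable names are chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Build elements with no content and let the XML serializer write them out.
line_break = ET.Element("br")
image = ET.Element("img", {"src": "valid.gif"})

print(ET.tostring(line_break, encoding="unicode"))
print(ET.tostring(image, encoding="unicode"))

# Going the other way, a bare <br> inside a paragraph is rejected by an XML parser.
try:
    ET.fromstring("<p>line one<br>line two</p>")
    bare_break_accepted = True
except ET.ParseError as err:
    bare_break_accepted = False
    print("rejected:", err)
```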
All attribute values must be quoted
Forms such as <img ... border=0> are no longer allowed. Attribute values, including numeric values, must be quoted.
HTML: <img ... border=0>
XHTML: <img ... border="0" />

Attribute value pairs cannot be minimized

In HTML, an attribute that has only a single possible value is usually minimized. XML does not allow attribute minimization, so single-valued or stand-alone attributes in XHTML must be expanded (e.g. <td nowrap>text</td> becomes <td nowrap="nowrap">text</td>).
HTML                                   XHTML
<dl compact>                           <dl compact="compact">
<ul compact>                           <ul compact="compact">
<option ... selected>                  <option ... selected="selected">
<td nowrap> text </td>                 <td nowrap="nowrap"> text </td>
<input type="radio" ... checked>       <input type="radio" ... checked="checked" />
<input type="checkbox" ... checked>    <input type="checkbox" ... checked="checked" />
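Both attribute rules (quoted values and no minimization) follow from XML syntax, so an XML parser can be used to check them. This is a minimal sketch with Python's standard xml.etree.ElementTree module; attribute_of is a hypothetical helper written for this example.

```python
import xml.etree.ElementTree as ET


def attribute_of(markup, name):
    """Return the value of attribute `name`.

    Returns None when the markup is not well-formed XML
    (or when the attribute is simply absent).
    """
    try:
        return ET.fromstring(markup).get(name)
    except ET.ParseError:
        return None


print(attribute_of('<img src="valid.gif" border=0 />', "border"))      # unquoted value
print(attribute_of('<img src="valid.gif" border="0" />', "border"))    # quoted value
print(attribute_of('<input type="checkbox" checked />', "checked"))    # minimized
print(attribute_of('<input type="checkbox" checked="checked" />', "checked"))
```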

<script> and <style> elements


In XHTML, the script and style elements are declared as having #PCDATA content. As a result, < and & will be treated as the start of markup, and entities such as &lt; and &amp; will be recognized as entity references by the XML processor and expanded to < and & respectively. We can avoid this by wrapping the content of the script or style element in a CDATA marked section. The only delimiter recognized inside a CDATA section is the string "]]>", which ends the section.
XHTML
<script language="JavaScript" type="text/javascript">
<![CDATA[ document.write("<b>Hello World!</b>"); ]]>
</script>
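The effect of a CDATA section on script content can be sketched with Python's standard xml.etree.ElementTree module. The two script strings below are examples written for this illustration.

```python
import xml.etree.ElementTree as ET

with_cdata = """<script type="text/javascript">
<![CDATA[ document.write("<b>Hello World!</b>"); ]]>
</script>"""

without_cdata = """<script type="text/javascript">
document.write("<b>Hello World!</b>");
</script>"""

# Inside CDATA the "<b>" is plain character data, so the script has no child elements.
protected = ET.fromstring(with_cdata)
print(len(protected), repr(protected.text.strip()))

# Without CDATA the parser sees a real <b> element in the middle of the script.
exposed = ET.fromstring(without_cdata)
print(len(exposed), [child.tag for child in exposed])
```

In the second case the document is still well-formed, but the script text has been split around an element, which is not what the author intended.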
XHTML Syntax Rules
XHTML follows these syntax rules:

Documents in XHTML must always be well-formed

An XHTML document can have only one root element, <html>, and all other elements must be nested within it. The root element is the single parent; its children (sub-elements) are nested within it. Sub-elements must come in pairs and be correctly nested within their parent element. The basic document structure is:

(doctype)
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>...</title>
... </head>
<body>... </body>
</html>
Notice the (doctype) and the "xmlns" attribute on the opening html tag. You should also include a character set meta tag in the <head> element.
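The single root element and the "xmlns" attribute can be inspected with an XML parser. The sketch below uses Python's standard xml.etree.ElementTree module and assumes the standard XHTML namespace URI, http://www.w3.org/1999/xhtml; the document text is a made-up example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

document = f"""<html xmlns="{XHTML_NS}">
<head>
<title>Example</title>
</head>
<body><p>Hello</p></body>
</html>"""

root = ET.fromstring(document)

# ElementTree reports names as "{namespace}local-name".
print(root.tag)
print([child.tag for child in root])
```

Every element nested inside the root inherits the namespace declared on the html tag.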

XHTML elements must be properly nested

Wrong:
<p>This is our site <em>paragraph.</p></em>
Right:
<p>This is our site <em>paragraph.</em></p>

Tag names must be in lowercase, since XML is case-sensitive

Wrong:
<PRE>Some preformatted text.</PRE>
Right:
<pre>Some preformatted text.</pre>

All attribute names must be in lowercase

Wrong:

<a HREF="http://www.vyom.co.in">
Right:
<a href="http://www.vyom.co.in">

All elements must be closed


This includes elements that traditionally do not contain any
content, such as images, form inputs, meta tags, etc.

Wrong:
<p>Welcome to vyom.
<p>This is our website.
Right:
<p>Welcome to vyom.</p>
<p>This is our website.</p>
Elements that don't have closing tags must be closed with a slash inside the tag.
<br> becomes <br />
<hr> becomes <hr />
<input type="text"> becomes <input type="text" />

All attributes values must be quoted

Wrong:
<a href=http://www.vyom.co.in>example link</a>
Right:
<a href="http://www.vyom.co.in">example link</a>

Attribute minimization is forbidden

Wrong:
<input type="checkbox" checked />
Right:
<input type="checkbox" checked="checked" />

All image tags should have "alt" attributes.

Wrong:
<img src="kitten.jpg" />
Right:
<img src="kitten.jpg" alt="an evil kitten" />

The "id" attribute and the "name" attribute
Element ids, which must be unique, will eventually replace element names. For now, to ensure backward compatibility, it is recommended to use both; for example, many form fields are accessed by their names. Ids are used in Cascading Style Sheets and various scripting languages.

Name only:
<img src="kitty.jpg" name="kittypic" alt="an evil kitten" />
Both name and id:
<img src="kitty.jpg" name="kittypic" alt="an evil kitten" id="kittypic" />

The XHTML DTD defines mandatory elements

XHTML Document Type Definitions

The DOCTYPE definition line in an XHTML document specifies the document type by referencing a Document Type Definition (DTD). The DTD specifies the syntax and legal elements of an XHTML document. Three document types are defined:

Strict

This type applies the strictest rules to the document. There should not be any presentational tags in the XHTML; Cascading Style Sheets should be used to control how the data is displayed. The Strict DTD includes the elements and attributes that are not deprecated and do not appear in framesets.

<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

Transitional

This document type is used when the document contains presentational tags. It is useful for browsers that do not support Cascading Style Sheets. The Transitional DTD includes everything in the Strict DTD plus the deprecated elements and attributes.

<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Frameset

This document type is used when frames split the browser window. It includes everything in the Transitional DTD plus the frame elements.

<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">

XHTML Elements

XHTML documents are text files made up of XHTML elements.

XHTML elements are defined using XHTML tags.

What Is An XHTML Element

An XHTML element indicates structure in an XHTML document and provides a way of arranging content hierarchically. XHTML elements have both attributes and content, as specified (allowed and required) by the appropriate XHTML DTD. Elements represent a variety of structures, such as headings, hypertext links, paragraphs, lists, and embedded media.

An XHTML element is constructed from:

1) a start tag that marks the start of the element,
2) any number of attributes and their corresponding values, placed inside the start tag,
3) some content (characters and other elements), and
4) finally, an end tag.

Many XHTML elements carry attributes in their start tags, defining desired behavior or indicating additional element properties. Unlike in HTML, an end tag is always needed. Some elements are not part of any official DTD but are still supported by some browsers and used in some web pages. Such elements may be ignored or displayed improperly by browsers that do not support them.

XHTML Tags
XHTML elements are sometimes referred to as "tags", though many prefer to use the term "tag" strictly for the markup that delimits the start and end of an element.

XHTML tags are used to mark up XHTML elements

In XHTML, tags are always surrounded by < and >

These characters are called angle brackets

XHTML tags always come in pairs, like <b> and </b>

The first tag is called the start tag, and the second tag is the end tag

The text that lies between the start and the end tags is called the element content

XHTML tags are case-sensitive

XHTML, the next generation of HTML, demands lowercase tags, so we should start using lowercase tags now.
XHTML Tag List


DTD: indicates in which XHTML 1.0 DTD the tag is allowed. S=Strict, T=Transitional, and F=Frameset

Tag           Description                                     DTD
<!--...-->    Defines a comment                               STF
<!DOCTYPE>    Defines the document type                       STF
<a>           Defines an anchor                               STF
<abbr>        Defines an abbreviation                         STF
<acronym>     Defines an acronym                              STF
<address>     Defines an address element                      STF
<applet>      Deprecated. Defines an applet                   TF
<area>        Defines an area inside an image map             STF
<b>           Defines bold text                               STF
<base>        Defines a base URL for all the links in a page  STF
<basefont>    Deprecated. Defines a base font                 TF
<bdo>         Defines the direction of text display           STF
<big>         Defines big text                                STF
<blockquote>  Defines a long quotation                        STF
<body>        Defines the body element                        STF
<br>          Inserts a single line break                     STF
<button>      Defines a push button                           STF
<caption>     Defines a table caption                         STF
<center>      Deprecated. Defines centered text               TF
<cite>        Defines a citation                              STF
<code>        Defines computer code text                      STF
<col>         Defines attributes for table columns            STF
<colgroup>    Defines groups of table columns                 STF
<dd>          Defines a definition description                STF
<del>         Defines deleted text                            STF
<dir>         Deprecated. Defines a directory list            TF
<div>         Defines a section in a document                 STF
<dfn>         Defines a definition term                       STF
<dl>          Defines a definition list                       STF
<dt>          Defines a definition term                       STF
<em>          Defines emphasized text                         STF
<fieldset>    Defines a fieldset                              STF
<font>        Deprecated. Defines text font, size, and color  TF
<form>        Defines a form                                  STF
<frame>       Defines a sub window (a frame)                  F
<frameset>    Defines a set of frames                         F
<h1> to <h6>  Defines header 1 to header 6                    STF
<head>        Defines information about the document          STF
<hr>          Defines a horizontal rule                       STF
<html>        Defines an html document                        STF
<i>           Defines italic text                             STF
<iframe>      Defines an inline sub window (frame)            TF
<img>         Defines an image                                STF
<input>       Defines an input field                          STF
<ins>         Defines inserted text                           STF
<isindex>     Deprecated. Defines a single-line input field   TF
<kbd>         Defines keyboard text                           STF
<label>       Defines a label for a form control              STF
<legend>      Defines a title in a fieldset                   STF
<li>          Defines a list item                             STF
<link>        Defines a resource reference                    STF
<map>         Defines an image map                            STF
<menu>        Deprecated. Defines a menu list                 TF
<meta>        Defines meta information                        STF
<noframes>    Defines a noframe section                       TF
<noscript>    Defines a noscript section                      STF
<object>      Defines an embedded object                      STF
<ol>          Defines an ordered list                         STF
<optgroup>    Defines an option group                         STF
<option>      Defines an option in a drop-down list           STF
<p>           Defines a paragraph                             STF
<param>       Defines a parameter for an object               STF
<pre>         Defines preformatted text                       STF
<q>           Defines a short quotation                       STF
<s>           Deprecated. Defines strikethrough text          TF
<samp>        Defines sample computer code                    STF
<script>      Defines a script                                STF
<select>      Defines a selectable list                       STF
<small>       Defines small text                              STF
<span>        Defines a section in a document                 STF
<strike>      Deprecated. Defines strikethrough text          TF
<strong>      Defines strong text                             STF
<style>       Defines a style definition                      STF
<sub>         Defines subscripted text                        STF
<sup>         Defines superscripted text                      STF
<table>       Defines a table                                 STF
<tbody>       Defines a table body                            STF
<td>          Defines a table cell                            STF
<textarea>    Defines a text area                             STF
<tfoot>       Defines a table footer                          STF
<th>          Defines a table header                          STF
<thead>       Defines a table header                          STF
<title>       Defines the document title                      STF
<tr>          Defines a table row                             STF
<tt>          Defines teletype text                           STF
<u>           Deprecated. Defines underlined text             TF
<ul>          Defines an unordered list                       STF
<var>         Defines a variable                              STF
<xmp>         Deprecated. Defines preformatted text

XHTML Standard Attributes

Plain text alone limits what a web page can do. Tags have attributes, and they also accept style rules that modify how they behave; this is what gives XHTML its utility.
The rules of XML and the general rules for attributes should be kept in mind when using attributes.

The rules are as follows:

Attributes must be specified only in the opening tag of an element. The closing tag of an element contains just a forward slash and the tag name.
All attribute names must be in lowercase.
All attributes must have a value.
All attribute values must be enclosed in quotes, normally double quotes. A double quote is not the same as two single quotes.
Attributes must be placed in a list separated by blank space, with no other characters between them.

Attributes are divided into categories, some of which overlap. They can be grouped as follows:

Universal/Core Attributes
These apply to almost all tags that appear in the body, including the <body> tag itself. They are informational, or they associate the element with information elsewhere in the document.

Events
These are used to invoke scripts in response to user or browser actions, for example using the onmouseover attribute to create rollover effects in a document. They are used extensively with JavaScript.

Presentational or Styling Attributes

Presentational or styling attributes change the way an element is displayed on the screen. They are deprecated in favor of style sheets, but we should know them well in order to code for backward compatibility.

Tag-Specific Attributes
Attributes that apply to a given tag or group of tags are called tag-specific attributes. Certain tags, such as <img>, have no meaning without attributes. These attributes are therefore discussed in relation to the tags they apply to.

XHTML Color Names

Colors can be defined using color names.

The table below provides a list of color names.

XHTML Validation
How to Validate XHTML with a DTD
An XHTML document is validated against a Document Type Definition (DTD). Before an XHTML file can be validated properly, the correct DOCTYPE declaration must be added as the first line of the file.

The Strict DTD includes the elements and attributes that are not deprecated and do not appear in framesets:

<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

The Transitional DTD contains everything in the Strict DTD plus deprecated elements and attributes:

<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

The Frameset DTD includes everything in the Transitional DTD plus frames:

<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Frameset//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">

This is a simple XHTML document:

<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>simple document</title>
</head>
<body>
<p>a simple paragraph</p>
</body>
</html>
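As a first step before validation, a document's DOCTYPE can be read back programmatically. The sketch below uses Python's standard xml.dom.minidom module, which is a non-validating parser: it checks well-formedness and exposes the DOCTYPE, but it does not validate against the DTD. The document string assumes the standard W3C identifiers for XHTML 1.0 Strict.

```python
from xml.dom import minidom

document = """<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>simple document</title>
</head>
<body>
<p>a simple paragraph</p>
</body>
</html>"""

dom = minidom.parseString(document)
doctype = dom.doctype

print(doctype.name)       # root element named in the DOCTYPE
print(doctype.publicId)   # public identifier of the DTD
print(doctype.systemId)   # URL of the DTD
print(dom.documentElement.tagName)
```

Full validation against the DTD needs a validating tool, such as an online XHTML validator.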
XHTML Modularization Model
When developing any complex application, it is important to follow a clear conceptual standard for organizing the development. The modular approach breaks the application's functionality down into a number of "building blocks" or "modules". Specific rules are then followed to combine these modules into a complete application. This approach is straightforward and offers several advantages over other approaches.

Why is Modularization required?

Conceptual clarity: Ideas and code can be shared between developers.
Reduces complexity: The application's functionality is divided into modules.
Supports object-oriented design principles: It follows the concepts of encapsulation and data hiding.
Encourages reuse: Well-defined modules can be reused in future applications.
Decreases debugging time: The design localizes errors, so less time is spent debugging.
Increases flexibility and maintainability: Each module can be upgraded or replaced independently of the others.
Eases development, testing, and maintenance: It provides a logical, easy-to-understand, and consistent organization.
Allows the creation of generic rules, methods, and procedures: These help keep development practices consistent.
Creates configurable objects: End users can tailor configurable objects for different purposes.
Supports a variety of end user interface and deployment environments: It follows standardized subsets and supersets.

Design Goals
Here are the design goals of the modularization framework for XHTML:

To create coherent sets of semantically related modules within the XHTML namespace using XML Schema
To support the creation of subsets and supersets of XHTML for specific purposes such as handheld devices and special-purpose appliances
To facilitate future development by allowing modules to be upgraded or replaced independently of other modules
To encourage and facilitate the reuse of common modules by developers.

XHTML 7-BIT ASCII Reference


XHTML uses standard 7-BIT ASCII when transmitting data over the Web.
7-BIT ASCII represents 128 different character values (0-127).
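The relationship between a character and its entity number is simply its character code. Here is a short sketch in Python; entity_number is a hypothetical helper written for this example, and html.unescape comes from the standard library.

```python
import html


def entity_number(character):
    """Return the numeric character reference for a single character."""
    return f"&#{ord(character)};"


print(entity_number("A"))        # uppercase A
print(entity_number("<"))        # less-than
print(html.unescape("&#126;"))   # decode a reference back to its character

# Every 7-bit ASCII character has a code in the range 0-127.
print(all(ord(c) < 128 for c in "XHTML 1.0"))
```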

Printable 7-BIT ASCII Characters

Result  Description           Entity Number
        space                 &#32;
!       exclamation mark      &#33;
"       quotation mark        &#34;
#       number sign           &#35;
$       dollar sign           &#36;
%       percent sign          &#37;
&       ampersand             &#38;
'       apostrophe            &#39;
(       left parenthesis      &#40;
)       right parenthesis     &#41;
*       asterisk              &#42;
+       plus sign             &#43;
,       comma                 &#44;
-       hyphen                &#45;
.       period                &#46;
/       slash                 &#47;
0       digit 0               &#48;
1       digit 1               &#49;
2       digit 2               &#50;
3       digit 3               &#51;
4       digit 4               &#52;
5       digit 5               &#53;
6       digit 6               &#54;
7       digit 7               &#55;
8       digit 8               &#56;
9       digit 9               &#57;
:       colon                 &#58;
;       semicolon             &#59;
<       less-than             &#60;
=       equals-to             &#61;
>       greater-than          &#62;
?       question mark         &#63;
@       at sign               &#64;
A       uppercase A           &#65;
B       uppercase B           &#66;
C       uppercase C           &#67;
D       uppercase D           &#68;
E       uppercase E           &#69;
F       uppercase F           &#70;
G       uppercase G           &#71;
H       uppercase H           &#72;
I       uppercase I           &#73;
J       uppercase J           &#74;
K       uppercase K           &#75;
L       uppercase L           &#76;
M       uppercase M           &#77;
N       uppercase N           &#78;
O       uppercase O           &#79;
P       uppercase P           &#80;
Q       uppercase Q           &#81;
R       uppercase R           &#82;
S       uppercase S           &#83;
T       uppercase T           &#84;
U       uppercase U           &#85;
V       uppercase V           &#86;
W       uppercase W           &#87;
X       uppercase X           &#88;
Y       uppercase Y           &#89;
Z       uppercase Z           &#90;
[       left square bracket   &#91;
\       backslash             &#92;
]       right square bracket  &#93;
^       caret                 &#94;
_       underscore            &#95;
`       grave accent          &#96;
a       lowercase a           &#97;
b       lowercase b           &#98;
c       lowercase c           &#99;
d       lowercase d           &#100;
e       lowercase e           &#101;
f       lowercase f           &#102;
g       lowercase g           &#103;
h       lowercase h           &#104;
i       lowercase i           &#105;
j       lowercase j           &#106;
k       lowercase k           &#107;
l       lowercase l           &#108;
m       lowercase m           &#109;
n       lowercase n           &#110;
o       lowercase o           &#111;
p       lowercase p           &#112;
q       lowercase q           &#113;
r       lowercase r           &#114;
s       lowercase s           &#115;
t       lowercase t           &#116;
u       lowercase u           &#117;
v       lowercase v           &#118;
w       lowercase w           &#119;
x       lowercase x           &#120;
y       lowercase y           &#121;
z       lowercase z           &#122;
{       left curly brace      &#123;
|       vertical bar          &#124;
}       right curly brace     &#125;
~       tilde                 &#126;

7-BIT ASCII Device Control Characters

These characters have no use inside an XHTML document. They were designed to control hardware devices such as printers and tape drives.

Result  Description             Entity Number
NUL     null character          &#00;
SOH     start of header         &#01;
STX     start of text           &#02;
ETX     end of text             &#03;
EOT     end of transmission     &#04;
ENQ     enquiry                 &#05;
ACK     acknowledge             &#06;
BEL     bell (ring)             &#07;
BS      backspace               &#08;
HT      horizontal tab          &#09;
LF      line feed               &#10;
VT      vertical tab            &#11;
FF      form feed               &#12;
CR      carriage return         &#13;
SO      shift out               &#14;
SI      shift in                &#15;
DLE     data link escape        &#16;
DC1     device control 1        &#17;
DC2     device control 2        &#18;
DC3     device control 3        &#19;
DC4     device control 4        &#20;
NAK     negative acknowledge    &#21;
SYN     synchronize             &#22;
ETB     end transmission block  &#23;
CAN     cancel                  &#24;
EM      end of medium           &#25;
SUB     substitute              &#26;
ESC     escape                  &#27;
FS      file separator          &#28;
GS      group separator         &#29;
RS      record separator        &#30;
US      unit separator          &#31;
DEL     delete (rubout)         &#127;

XHTML Entities Reference

Characters like < have a special meaning in HTML, so they cannot be used directly in the text.

An entity is a name for a symbol. Many symbols, such as foreign currency symbols, the trademark sign, or the copyright sign, do not appear on your keyboard. To display them, you write a character entity.

In XHTML, tags are written using the less-than and greater-than symbols, so we need entities to show those symbols themselves on our websites.

A character entity has three parts:

An ampersand (&) at the start

An entity name, or a # and an entity number

A semicolon (;) at the end

The ISO 8859-1 character set is supported by HTML 4.01.

The original 7-BIT ASCII standard is the lower part of ISO-8859-1 (codes 0-127). Many of these characters can be used without a character reference.
All the codes from 160 to 255 (the higher part of ISO-8859-1) can be written using character entity names.
Note that entity names are case-sensitive.
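The three-part structure of an entity, and the link between entity names and entity numbers, can be sketched in Python with the standard html and html.entities modules. The helper both_forms is a made-up name for this illustration.

```python
import html
from html.entities import name2codepoint


def both_forms(name):
    """Return the named and numeric references for a known entity name."""
    return f"&{name};", f"&#{name2codepoint[name]};"


print(both_forms("copy"))     # copyright sign
print(both_forms("Agrave"))   # names are case-sensitive: "agrave" is another character
print(both_forms("agrave"))

# Escape text so that <, > and & are not mistaken for markup, then decode it again.
escaped = html.escape("if a < b && b > c", quote=False)
print(escaped)
print(html.unescape(escaped))
```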

ASCII Entities with new Entity Names

Result  Description     Entity Name                   Entity Number
"       quotation mark  &quot;                        &#34;
'       apostrophe      &apos; (does not work in IE)  &#39;
&       ampersand       &amp;                         &#38;
<       less-than       &lt;                          &#60;
>       greater-than    &gt;                          &#62;

Result Description

Entity Name Entity Number

non-breaking space

&nbsp;

&#160;

inverted exclamation mark

&iexcl;

&#161;

currency

&curren;

&#164;

cent

&cent;

&#162;

pound

&pound;

&#163;

yen

&yen;

&#165;

broken vertical bar

&brvbar;

&#166;

section

&sect;

&#167;

spacing diaeresis

&uml;

&#168;

copyright

&copy;

&#169;

feminine ordinal indicator

&ordf;

&#170;

angle quotation mark (left)

&laquo;

&#171;

negation

&not;

&#172;

soft hyphen

&shy;

&#173;

registered trademark

&reg;

&#174;

trademark

&trade;

&#8482;

spacing macron

&macr;

&#175;

degree

&deg;

&#176;

plus-or-minus

&plusmn;

&#177;

superscript 2

&sup2;

&#178;

superscript 3

&sup3;

&#179;

spacing acute

&acute;

&#180;

micro

&micro;

&#181;

paragraph

&para;

&#182;

middle dot

&middot;

&#183;

spacing cedilla

&cedil;

&#184;

superscript 1

&sup1;

&#185;

masculine ordinal indicator

&ordm;

&#186;

angle quotation mark (right)

&raquo;

&#187;

fraction 1/4

&frac14;

&#188;

fraction 1/2

&frac12;

&#189;

fraction 3/4

&frac34;

&#190;

inverted question mark

&iquest;

&#191;

multiplication

&times;

&#215;

division

&divide;

&#247;

ISO 8859-1 Character Entities


Result Description

Entity Name

Entity Number

capital a, grave accent

&Agrave;

&#192;

capital a, acute accent

&Aacute;

&#193;

capital a, circumflex accent

&Acirc;

&#194;

capital a, tilde

&Atilde;

&#195;

capital a, umlaut mark

&Auml;

&#196;

capital a, ring

&Aring;

&#197;

capital ae

&AElig;

&#198;

capital c, cedilla

&Ccedil;

&#199;

capital e, grave accent

&Egrave;

&#200;

capital e, acute accent

&Eacute;

&#201;

capital e, circumflex accent

&Ecirc;

&#202;

capital e, umlaut mark

&Euml;

&#203;

capital i, grave accent

&Igrave;

&#204;

capital i, acute accent

&Iacute;

&#205;

capital i, circumflex accent

&Icirc;

&#206;

capital i, umlaut mark

&Iuml;

&#207;

capital eth, Icelandic

&ETH;

&#208;

capital n, tilde

&Ntilde;

&#209;

capital o, grave accent

&Ograve;

&#210;

capital o, acute accent

&Oacute;

&#211;

capital o, circumflex accent

&Ocirc;

&#212;

capital o, tilde

&Otilde;

&#213;

capital o, umlaut mark

&Ouml;

&#214;

capital o, slash

&Oslash;

&#216;

capital u, grave accent

&Ugrave;

&#217;

capital u, acute accent

&Uacute;

&#218;

capital u, circumflex accent

&Ucirc;

&#219;

capital u, umlaut mark

&Uuml;

&#220;

capital y, acute accent

&Yacute;

&#221;

capital THORN, Icelandic

&THORN;

&#222;

small sharp s, German

&szlig;

&#223;

small a, grave accent

&agrave;

&#224;

small a, acute accent

&aacute;

&#225;

small a, circumflex accent

&acirc;

&#226;

small a, tilde

&atilde;

&#227;

small a, umlaut mark

&auml;

&#228;

small a, ring

&aring;

&#229;

small ae

&aelig;

&#230;

small c, cedilla

&ccedil;

&#231;

small e, grave accent

&egrave;

&#232;

small e, acute accent

&eacute;

&#233;

small e, circumflex accent

&ecirc;

&#234;

small e, umlaut mark

&euml;

&#235;

small i, grave accent

&igrave;

&#236;

small i, acute accent

&iacute;

&#237;

small i, circumflex accent

&icirc;

&#238;

small i, umlaut mark

&iuml;

&#239;

small eth, Icelandic

&eth;

&#240;

small n, tilde

&ntilde;

&#241;

small o, grave accent

&ograve;

&#242;

small o, acute accent

&oacute;

&#243;

small o, circumflex accent

&ocirc;

&#244;

small o, tilde

&otilde;

&#245;

small o, umlaut mark

&ouml;

&#246;

small o, slash

&oslash;

&#248;

small u, grave accent

&ugrave;

&#249;

small u, acute accent

&uacute;

&#250;

small u, circumflex accent

&ucirc;

&#251;

small u, umlaut mark

&uuml;

&#252;

small y, acute accent

&yacute;

&#253;

small thorn, Icelandic

&thorn;

&#254;

small y, umlaut mark

&yuml;

&#255;

Some Other Entities supported by HTML


Result Description

Entity Name

Entity Number

capital ligature OE

&OElig;

&#338;

small ligature oe

&oelig;

&#339;

capital S with caron

&Scaron;

&#352;

small s with caron

&scaron;

&#353;

capital Y with diaeresis

&Yuml;

&#376;

modifier letter circumflex accent

&circ;

&#710;

small tilde

&tilde;

&#732;

en space

&ensp;

&#8194;

em space

&emsp;

&#8195;

thin space

&thinsp;

&#8201;

zero width non-joiner

&zwnj;

&#8204;

zero width joiner

&zwj;

&#8205;

left-to-right mark

&lrm;

&#8206;

right-to-left mark

&rlm;

&#8207;

en dash

&ndash;

&#8211;

em dash

&mdash;

&#8212;

left single quotation mark

&lsquo;

&#8216;

right single quotation mark

&rsquo;

&#8217;

single low-9 quotation mark

&sbquo;

&#8218;

left double quotation mark

&ldquo;

&#8220;

right double quotation mark

&rdquo;

&#8221;

double low-9 quotation mark

&bdquo;

&#8222;

dagger

&dagger;

&#8224;

double dagger

&Dagger;

&#8225;

horizontal ellipsis

&hellip;

&#8230;

per mille

&permil;

&#8240;

single left-pointing angle quotation mark

&lsaquo;

&#8249;

single right-pointing angle quotation mark

&rsaquo;

&#8250;

euro

XHTML URL-Encoding Reference


Characters that are not allowed in a URL, or that have a special
meaning there, can be sent using hexadecimal values.
Here is a reference list of the ASCII characters in URL-encoded form (hexadecimal format)
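
As a small sketch of how these hexadecimal codes are produced, the example below uses quote and unquote from Python's standard urllib.parse module. Python and the sample strings are assumptions made for illustration. Note that quote writes the hexadecimal digits in upper case, while the tables here use lower case; both forms decode to the same character.

```python
from urllib.parse import quote, unquote

# Unsafe characters become a percent sign plus two hexadecimal digits.
encoded = quote("a b<c>")          # the space, < and > are encoded
print(encoded)

# The hexadecimal digits are case-insensitive when decoding.
print(unquote("%3c"), unquote("%3C"))

# The tables in this reference use one byte per character (ISO-8859-1).
# quote() defaults to UTF-8, so the encoding is passed explicitly here.
latin1 = quote("\u00e9", encoding="latin-1")   # e with acute accent
utf8 = quote("\u00e9")
print(latin1, utf8)
```

The last two lines show why the character encoding matters: the same letter is one encoded byte in ISO-8859-1 and two in UTF-8.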

URL-encoding from %00 to %8f

ASCII Value

URL-encode

ASCII Value

URL-encode

ASCII Value

URL-encode

%00

%30

%60

%01

%31

%61

%02

%32

%62

%03

%33

%63

%04

%34

%64

%05

%35

%65

%06

%36

%66

%07

%37

%67

backspace

%08

%38

%68

tab

%09

%39

%69

linefeed

%0a

%3a

%6a

%0b

%3b

%6b

%0c

<

%3c

%6c

carriage return

%0d

%3d

%6d

%0e

>

%3e

%6e

%0f

%3f

%6f

%10

%40

%70

%11

%41

%71

%12

%42

%72

%13

%43

%73

%14

%44

%74

%15

%45

%75

%16

%46

%76

%17

%47

%77

%18

%48

%78

%19

%49

%79

%1a

%4a

%7a

%1b

%4b

%7b

%1c

%4c

%7c

%1d

%4d

%7d

%1e

%4e

%7e

%1f

%4f

%7f

space

%20

%50

%80

%21

%51

%81

"

%22

%52

%82

%23

%53

%83

%24

%54

%84

%25

%55

%85

&

%26

%56

%86

'

%27

%57

%87

%28

%58

%88

%29

%59

%89

%2a

%5a

%8a

%2b

%5b

%8b

%2c

%5c

%8c

%2d

%5d

%8d

%2e

%5e

%8e

%2f

%5f

%8f

URL-encoding from %90 to %ff


ASCII Value

URL-encode

ASCII Value

URL-encode

ASCII Value

URL-encode

%90

%c0

%f0

%91

%c1

%f1

%92

%c2

%f2

%93

%c3

%f3

%94

%c4

%f4

%95

%c5

%f5

%96

%c6

%f6

%97

%c7

%f7

%98

%c8

%f8

%99

%c9

%f9

%9a

%ca

%fa

%9b

%cb

%fb

%9c

%cc

%fc

%9d

%cd

%fd

%9e

%ce

%fe

%9f

%cf

%ff

%a0

%d0

%a1

%d1

%a2

%d2

%a3

%d3

%a4

%d4

%a5

%d5

%a6

%d6

%a7

%d7

%a8

%d8

%a9

%d9

%aa

%da

%ab

%db

%ac

%dc

%ad

%dd

%ae

%de

%af

%df

%b0

%e0

%b1

%e1

%b2

%e2

%b3

%e3

%b4

%e4

%b5

%e5

%b6

%e6

%b7

%e7

%b8

%e8

%b9

%e9

%ba

%ea

%bb

%eb

%bc

%ec

%bd

%ed

%be

%ee

%bf

%ef

XHTML HTTP Status Messages


When a browser requests a page from a web server, an
error might occur.
The HTTP status messages that the server can return are listed below:

1xx: Information
Message:

Description:

100 Continue

The server has received only part of the request. As long as the request has not been rejected, the client should continue with it

101 Switching Protocols

The server switches protocols

2xx: Successful
Message:

Description:

200 OK

The request is OK

201 Created

The request is complete, and a new resource is created

202 Accepted

The request is accepted for processing, but the processing is not complete

203 Non-authoritative Information

204 No Content

205 Reset Content

206 Partial Content

3xx: Redirection
Message:

Description:

300 Multiple Choices

A link list. The user can select a link and go to that location. Maximum of five addresses

301 Moved Permanently

The requested page has moved to a new URL

302 Found

The requested page has moved temporarily to a new URL

303 See Other

The requested page can be found under a different URL

304 Not Modified

305 Use Proxy

306 Unused

This code was used in a previous version. It is no longer used, but the code is reserved

307 Temporary Redirect

The requested page has moved temporarily to a new URL

4xx: Client Error


Message:

Description:

400 Bad Request

The server did not understand the request

401 Unauthorized

The requested page needs a username and a password

402 Payment Required

This code cannot be used yet

403 Forbidden

Access to the requested page is forbidden

404 Not Found

The server cannot find the requested page

405 Method Not Allowed

The method specified in the request is not allowed

406 Not Acceptable

The server can only generate a response that the client will not accept

407 Proxy Authentication Required

You must authenticate with a proxy server before the request can be served

408 Request Timeout

The request took longer than the server was prepared to wait

409 Conflict

The request could not be completed because of a conflict

410 Gone

The requested page is no longer available

411 Length Required

The "Content-Length" is not defined. The server cannot accept the request without it

412 Precondition Failed

The precondition given in the request was evaluated to false by the server

413 Request Entity Too Large

The server will not accept the request because the request entity is too large

414 Request-URL Too Long

The server will not accept the request because the URL is too long. This occurs when you convert a "post" request to a "get" request with long query information

415 Unsupported Media Type

The server will not accept the request because the media type is not supported

416

417 Expectation Failed
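
The first digit of a status code gives its class, which matches the 1xx to 5xx groupings used in this reference. As a minimal sketch, assuming Python and its standard http.HTTPStatus enumeration (neither is part of the original tutorial), a code can be looked up and classified like this. The describe helper is a hypothetical name made up for the example.

```python
from http import HTTPStatus

# Class names taken from the section headings of this reference.
CLASSES = {1: "Information", 2: "Successful", 3: "Redirection",
           4: "Client Error", 5: "Server Error"}

def describe(code: int) -> str:
    """Return 'code phrase (class)' for a status code known to HTTPStatus."""
    status = HTTPStatus(code)          # raises ValueError for unknown codes
    return f"{status.value} {status.phrase} ({CLASSES[code // 100]})"

for code in (200, 301, 404, 503):
    print(describe(code))
```
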

5xx: Server Error

Message:

Description:

500 Internal Server Error

The request was not completed because the server met an unexpected condition

501 Not Implemented

The request was not completed because the server does not support the functionality required

502 Bad Gateway

The request was not completed because the server received an invalid response from the upstream server

503 Service Unavailable

The request was not completed because the server is temporarily overloaded or down

504 Gateway Timeout

The gateway has timed out

505 HTTP Version Not Supported

The server does not support the HTTP protocol version used in the request

XHTML Event Attributes

The ability to let HTML events trigger actions in the browser,
such as running a JavaScript when a user clicks on an HTML
element, is new in HTML 4.0. Below is a list of the attributes
that define event actions. These attributes can be inserted into
HTML tags.

Window Events
These are valid only in body and frameset elements
Attribute

Value

Description

onload

script

Script to be run when a document loads

onunload

script

Script to be run when a document unloads

Form Element Events


These are valid only in form elements.
Attribute

Value

Description

onchange

script

Script to be run when the element changes

onsubmit

script

Script to be run when the form is submitted

onreset

script

Script to be run when the form is reset

onselect

script

Script to be run when the element is selected

onblur

script

Script to be run when the element loses focus

onfocus

script

Script to be run when the element gets focus

Keyboard Events

Not valid in base, bdo, br, frame, frameset, head, html, iframe,
meta, param, script, style, and title elements.
Attribute

Value

Description

onkeydown

script

Script to be run when a key is pressed

onkeypress

script

Script to be run when a key is pressed and released

onkeyup

script

Script to be run when a key is released

Mouse Events
These are not valid in base, bdo, br, frame, frameset, head, html,
iframe, meta, param, script, style, and title elements.
Attribute

Value

Description

onclick

script

Script to be run on a mouse click

ondblclick

script

Script to be run on a mouse double-click

onmousedown

script

Script to be run when a mouse button is pressed

onmousemove

script

Script to be run when the mouse pointer moves

onmouseover

script

Script to be run when the mouse pointer moves over an element

onmouseout

script

Script to be run when the mouse pointer moves out of an element

onmouseup

script

Script to be run when a mouse button is released
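
The restriction on where keyboard and mouse event attributes may appear can be checked mechanically. The sketch below is one possible approach, assuming Python and its standard xml.etree.ElementTree parser; the function name misplaced_events, the hello() handler, and the sample page are all hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# Elements on which keyboard and mouse event attributes are not valid,
# as listed in this reference.
NO_EVENTS = {"base", "bdo", "br", "frame", "frameset", "head", "html",
             "iframe", "meta", "param", "script", "style", "title"}
KEY_MOUSE_EVENTS = {"onkeydown", "onkeypress", "onkeyup", "onclick",
                    "ondblclick", "onmousedown", "onmousemove",
                    "onmouseover", "onmouseout", "onmouseup"}

def misplaced_events(xhtml: str) -> list[tuple[str, str]]:
    """Return (element, attribute) pairs that break the rule above."""
    found = []
    for element in ET.fromstring(xhtml).iter():
        tag = element.tag.split("}")[-1]      # drop any XML namespace part
        for name in element.attrib:
            if tag in NO_EVENTS and name in KEY_MOUSE_EVENTS:
                found.append((tag, name))
    return found

# Invented sample page: the handlers on title and br are not allowed.
sample = """<html><head><title onclick="hello()">Demo</title></head>
<body><p onclick="hello()">Click me</p><br onmouseover="hello()" /></body></html>"""
print(misplaced_events(sample))
```

Because the parser is an XML parser, the page must already be well-formed XHTML for the check to run at all.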

XHTML Summary
In this tutorial we have learnt how to create cleaner and stricter
HTML pages.
We have learnt that all XHTML elements must be properly nested,
that XHTML documents must be well-formed, that all tag names
must be written in lowercase, and that all XHTML elements must
be closed.
We have also learnt that, as in HTML, the html, head, title, and body
elements must be present in XHTML, and that a DOCTYPE
declaration must be included.
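
Since XHTML is written as an XML application, the nesting and closing rules can be tried out with any XML parser. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module; is_well_formed is a hypothetical helper name. It tests well-formedness only, not the lowercase rule or the DOCTYPE.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment: str) -> bool:
    """Return True if an XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<p><b>bold</b><br /></p>"))      # nested and closed
print(is_well_formed("<p><b>bold</p></b>"))            # overlapping elements
print(is_well_formed("<p>line one<br>line two</p>"))   # <br> is never closed
```
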

What is an XHTML File?

XHTML stands for EXtensible HyperText Markup Language
XHTML is similar to HTML; the main difference is that XHTML is a
stricter and cleaner version of HTML
An XHTML file contains small markup tags
These markup tags describe how the page must be displayed in
the browser
An XHTML file normally has an .xhtml file extension
An XHTML file can be created with any simple text editor

The Most Important Differences:

Tag and attribute names must be written in lower-case
Elements must be nested properly, with no overlapping
Non-empty elements must be closed
Empty elements must be terminated
All attribute values must be quoted
Attribute values cannot be shortened
<script> and <style> elements

Standard Attributes

Attributes are specified only in the start tag of an element.
The end tag of an element contains just a forward slash and
the tag name.

All attribute names must be in lower case.
All attributes must have a value.
All attribute values must be enclosed in quotes, normally double
quotes. Note that a double quote character is not the same as two
single quote characters.
Attributes are written as a list separated by blank space, and
no other characters may appear between them.
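
The attribute rules above can be sketched in code as well. The example below again assumes Python with the standard xml.etree.ElementTree module; attribute_problems is a hypothetical helper written for this illustration. An XML parser rejects unquoted and shortened attributes on its own, while the lower-case rule has to be checked separately.

```python
import xml.etree.ElementTree as ET

def attribute_problems(tag_markup: str) -> list[str]:
    """List attribute rule violations found in a single element."""
    try:
        element = ET.fromstring(tag_markup)
    except ET.ParseError:
        # Unquoted or shortened attributes are rejected by the XML parser.
        return ["not well-formed (unquoted or shortened attribute?)"]
    return [f"attribute name not lower case: {name}"
            for name in element.attrib if name != name.lower()]

print(attribute_problems('<input type="checkbox" checked="checked" />'))
print(attribute_problems('<input type="checkbox" checked />'))
print(attribute_problems('<td width=100>cell</td>'))
print(attribute_problems('<td WIDTH="100">cell</td>'))
```
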

XHTML Tags

XHTML tags are used to mark up XHTML elements
Tags in XHTML are always surrounded by the characters
< and >
These characters are called angle brackets
Tags in XHTML normally come in pairs, for example <b>
and </b>
In such a pair, the first tag is the start tag and
the second tag is the end tag
The text that lies between the start tag and the end
tag is called the element content
XHTML tags are case sensitive and must be written in lower case
