Too Easy XHTML - XHTML Syntax [2]

In Part 1 we looked at what XHTML is and at the basic page structure. With that foundation in place, it is time to look more closely at XHTML itself. Here in Part 2 we examine specific elements of the page structure in more depth, along with the proper format of XHTML tags.

Page Structure Recap

Earlier we looked at the basic structure of an XHTML document:

Code: XHTML

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xml:lang="en" lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>First Web Page</title>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1"/>
</head>
<body>
Hello World
</body>
</html>

The above document contains the XML declaration, the DOCTYPE (DTD) declaration, the <html> element, the <head> element, the page title, the MIME type (set in the <meta> element), and the body. For a refresher on any of these, try rereading Part 1 of the XHTML guide.

HTML Root Element

Previously we briefly looked at the <html> area:

Code: XHTML

<html xml:lang="en" lang="en" xmlns="http://www.w3.org/1999/xhtml">

In all XHTML documents, <html> has to be the root element of the web page. This means that <html> cannot be nested inside another element. The following is an example of an invalid page in which <html> is not the root element:

Code: XHTML

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<somethingelse>
<html xml:lang="en" lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>First Web Page</title>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1"/>
</head>
<body>
Hello World
</body>
</html>
</somethingelse>

A simple check is to make sure that the <html> start tag is the first tag in your document (after the XML and DTD declarations) and that the </html> end tag is the last. If you look at the <html> tag you'll see it comes with a few attributes. The xmlns attribute declares the XHTML namespace. We won't go into XML namespaces in this lesson, but remember to include it in your document. You can also change the lang and xml:lang values to match the language of your web page.
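
As an illustration of the last point, a page written in French might open its root element as in the sketch below. The language code fr is only an assumed example; the xmlns value stays the same whatever the language:

Code: XHTML

<html xml:lang="fr" lang="fr" xmlns="http://www.w3.org/1999/xhtml">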

MIME Type

The MIME type is an important part of the XHTML page structure. Like the XHTML namespace declaration, it should be declared in every XHTML Strict document. Because web browsers differ so much in what they support, choosing the MIME type can become quite tricky.

According to W3C standards, the proper MIME type to use for XHTML is application/xhtml+xml. The problem is that support for this MIME type, and the way it is interpreted, varies between browsers and other clients. For example, Googlebot (Google.com) doesn't include application/xhtml+xml in its HTTP Accept header; it indexes the XHTML 1.0 Strict version of web pages using the text/html MIME type. With such a variety of traffic on your web page, MIME types become a hassle!

In this guide we will continue to use the text/html MIME type in order to keep the page structure simple. Later we will look at MIME types in more depth and come up with some creative workarounds for this web design dilemma.
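
For comparison, the content-type element used in this guide is shown below next to a hypothetical variant that declares the W3C-recommended type. Treat the second element as an illustration only: browsers generally decide how to handle a page from the Content-Type header sent by the web server, so editing this element alone may not change how the page is processed.

Code: XHTML

<!-- Used throughout this guide -->
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1"/>

<!-- Hypothetical variant, not used in this guide -->
<meta http-equiv="content-type" content="application/xhtml+xml; charset=iso-8859-1"/>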

What is a tag?

Tags are the foundation of XHTML, and you've already seen quite a few tags yourself. Tags are used as markup for the information within the document. XHTML tags are enclosed in angle brackets (< >). You've already been introduced to three tags: <html>, <head>, and <body>. There are many tags in XHTML, but you'll find that you regularly use only a handful.

A start tag and a closing tag, together with whatever sits between them, make up an element. The content between the two tags is referred to as the "element content":

Code: XHTML

<body>element content</body>

Tags normally come in pairs (exceptions are discussed later in this lesson), with one start tag and one closing tag. Closing tags always contain a forward slash (/) right after the opening angle bracket and follow immediately after the element content. Tags are also case sensitive, and all XHTML tags are lower case: <body> is not the same as <BODY> or <bOdY>. Make sure that all your tags are in lower case to avoid page errors.
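
To make the case rule concrete, here is a small sketch that reuses the <title> element from the earlier page. Only the first line is acceptable XHTML:

Code: XHTML

<title>First Web Page</title>
<TITLE>First Web Page</TITLE>
<title>First Web Page</TITLE>

The second line uses upper-case tag names that XHTML does not define, and in the third the closing tag does not match the start tag at all, so the element is never closed.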

XHTML must be "well-formed" too, meaning that elements must be properly nested: the tag opened last has to be closed first. This makes the following invalid:

Code: XHTML

<p><b>test</p></b>

The correct markup would read:

Code: XHTML

<p><b>test</b></p>
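
The same rule holds however many levels deep the nesting goes. As a rough sketch, using the <strong> and <em> elements (which have not been covered in this guide yet) purely for illustration, each tag is closed in the reverse of the order in which it was opened:

Code: XHTML

<p><strong><em>test</em></strong></p>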

Self Closing Tags

In XHTML, you'll use some tags that don't have a closing partner. Among these are the line break, horizontal rule, and image tags. This doesn't exempt them from the closing rules of XHTML. Instead, they are self-closing:

Code: XHTML

<br />
<img src="" />
<hr />

Each tag ends with /> and is considered closed. Make sure to leave a space before the slash (/), as older versions of Netscape often misinterpret self-closing tags without it.
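
Put together, a fragment using all three might look like the sketch below. The file name photo.jpg and the alt text are made-up placeholders; the alt attribute is included because XHTML 1.0 Strict requires it on every <img>:

Code: XHTML

<p>
First line<br />
Second line
</p>
<hr />
<p><img src="photo.jpg" alt="A sample photo" /></p>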

Tag Attributes

Attributes are extra information provided about a certain tag. Each tag has its own set of attributes that you can set or leave as default. An attribute is set within the XHTML tag:

Code: XHTML

<img src="" />

In the above <img> tag, the src attribute is where we set the location of the image (left empty here). As mentioned earlier, each tag has its own set of attributes. All attribute names are case sensitive and must be in lower case. The attribute value also has to be enclosed in quotes ("").
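
As a sketch of these rules, the first line below sets several attributes correctly, while the other two break the lower-case and quoting rules respectively. The file name, alt text, and dimensions are made-up placeholders:

Code: XHTML

<img src="logo.png" alt="Site logo" width="120" height="60" />
<img SRC="logo.png" ALT="Site logo" />
<img src="logo.png" alt="Site logo" width=120 height=60 />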

Avoid Attribute Minimization

If you are coming from HTML then you've probably used attribute minimization:

Code: XHTML

<input readonly />

This is not allowed in XHTML. Instead, you repeat the attribute name as the attribute value:

Code: XHTML

<input readonly="readonly" />

Avoid attribute minimization at all costs; it is a common pitfall when moving from HTML to XHTML.
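
The same pattern applies to other attributes that HTML lets you minimize. A few more examples in their XHTML form, shown as isolated fragments rather than a complete form:

Code: XHTML

<input type="checkbox" checked="checked" />
<input type="text" disabled="disabled" />
<option selected="selected">Example</option>

In HTML these are often written as just checked, disabled, and selected with no value.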

Summary

In Part 2 you've learned more about the XHTML document structure and the XHTML rules for tags and attributes. In Part 3 we will continue our XHTML lessons by learning how to incorporate scripting and styling, some basic XHTML tags, and validating our pages.