3.1 HTML syntax

HTML code consists of HTML elements.

An element consists of a start tag, followed by the element content, followed by an end tag. A start tag is of the form <elementName> and an end tag is of the form </elementName>. The example code below shows a title element; the start tag is <title>, the end tag is </title>, and the content is the text: Poles of Inaccessibility.

<title>Poles of Inaccessibility</title>

start tag: <title>
content: Poles of Inaccessibility
end tag: </title>

Some elements are empty, which means that they consist of only a start tag (no content and no end tag). The following code shows an hr element, which is an example of an empty element.

<hr>

An element may have one or more attributes. Attributes appear in the start tag and are of the form attributeName="attributeValue". The code below shows the start tag for an img element, with an attribute called src. The value of the attribute in this example is "poleplot.png".

<img src="poleplot.png">

HTML tag: <img src="poleplot.png">
element name: img
attribute: src="poleplot.png"
attribute name: src
attribute value: poleplot.png
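
When an element has more than one attribute, the attributes are all written in the start tag, separated by spaces. The following sketch extends the img example with an alt attribute (replacement text for the image) and a width attribute (the displayed width in pixels). Both are real attributes of the img element, but the values shown are assumptions made up for illustration.

```html
<!-- src is taken from the example above;
     the alt and width values are hypothetical -->
<img src="poleplot.png" alt="Plot of the poles of inaccessibility" width="400">
```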

There is a fixed set of valid HTML elements (Section 3.2.1 provides a list of some common elements) and each element has its own set of possible attributes.

Certain HTML elements are compulsory. An HTML document must include a DOCTYPE declaration and a single html element. Within the html element there must be a single head element and a single body element. Within the head element there must be a title element. Figure 3.1 shows a minimal piece of HTML code.

Figure 3.1: A minimal HTML document. This is the basic code that must appear in any HTML document. The main content of the web page is described by adding further HTML elements within the body element.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
    <head>
        <title></title>
    </head>
    <body>
    </body>
</html>

Section 3.2.1 describes each of the common elements in a little more detail, including any important attributes and which elements may be placed inside which other elements.
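
As an illustration of adding content to the minimal document in Figure 3.1, the following sketch gives the title element some content and places two further elements inside the body element. The p (paragraph) element and the paragraph text are assumptions chosen for this example; the title text and the hr element are reused from earlier in this section.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
    <head>
        <!-- the compulsory title element, now with content -->
        <title>Poles of Inaccessibility</title>
    </head>
    <body>
        <!-- hypothetical page content: a paragraph, then a horizontal rule -->
        <p>A pole of inaccessibility is a location that is hard to reach.</p>
        <hr>
    </body>
</html>
```

Each element is completely contained within its parent element: the title element starts and ends inside the head element, and the head and body elements both start and end inside the html element.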

3.1.1 HTML comments

A comment in HTML code is anything between <!-- and -->. All characters, including HTML tags, lose their special meaning within an HTML comment.
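
The following sketch shows two uses of comments: a short note about the code and a tag that has been "commented out". The wording of the comments is made up for illustration; the title and hr elements are reused from earlier in this section.

```html
<!-- A note for people reading the code; it is not shown on the web page. -->
<title>Poles of Inaccessibility</title>

<!-- The hr tag below is inside a comment, so it does not produce an element.
<hr>
-->
```

A comment may span several lines, but the text of a comment should not itself contain two adjacent hyphens.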

3.1.2 HTML entities

The less-than and greater-than characters used in HTML tags are special characters and must be escaped to obtain their literal meaning. These escape sequences in HTML are called entities. All entities start with an ampersand so the ampersand is also special and must be escaped. Entities provide a way to include some other special characters and symbols within HTML code as well. Table 3.1 shows some common HTML entities.

Table 3.1: Some common HTML entities.
Character   Description          Entity
<           less-than sign       &lt;
>           greater-than sign    &gt;
&           ampersand            &amp;
π           Greek letter pi      &pi;
μ           Greek letter mu      &mu;
€           Euro symbol          &euro;
£           British pounds       &pound;
©           copyright symbol     &copy;
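
The following sketch shows entities from Table 3.1 used within element content. The p (paragraph) element and the sentences themselves are assumptions made up for illustration; the comment above each line gives the text that a browser would be expected to display.

```html
<!-- Displayed text: The start tag <title> begins a title element. -->
<p>The start tag &lt;title&gt; begins a title element.</p>

<!-- Displayed text: circumference = 2πr, price £5 & up -->
<p>circumference = 2&pi;r, price &pound;5 &amp; up</p>
```

Without the entities in the first line, the browser would treat <title> as a start tag rather than as text to display.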

Paul Murrell

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 New Zealand License.