Subsections

3.2 HTML semantics

The primary purpose of HTML tags is to specify the structure of a web page.

Elements are either block-level or inline. A block-level element is like a paragraph; it is a container that can be filled with other elements. Most block-level elements can contain any other sort of element. An inline element is like a word within a paragraph; it is a small component that is arranged with other components inside a container. An inline element usually only contains text.

The content of an element may be other elements or plain text. There is a limit on which elements may be nested within other elements (see Section 3.2.1).


3.2.1 Common HTML elements

This section briefly describes the important behavior, attributes, and rules for each of the common HTML elements.

<html>
 
The html element must contain exactly one head element followed by exactly one body element.

<head>
 
The head element is only allowed within the html element. It must contain exactly one title element. It may also contain link elements to refer to external CSS files and/or style elements for inline CSS rules. It has no attributes of interest.

<title>
 
The title element must be within the head element and must only contain text. This element provides information for the computer to use to identify the web page rather than for display, though it is often displayed in the title bar of the browser window. It has no attributes.

<link>
 
This is an empty element that must reside in the head element. It can be used to specify an external CSS file, in which case the important attributes are: rel, which should have the value "stylesheet"; href, which specifies the location of a file containing CSS code (can be a URL); and type, which should have the value "text/css". The media attribute may also be used to distinguish between a style sheet for display on "screen" as opposed to display in "print".

An example of a link element is shown below.

<link rel="stylesheet" href="csscode.css" 
      type="text/css">

Other sorts of links are also possible, but are beyond the scope of this book.

<body>
 
The body element is only allowed within the html element. It should only contain one or more block-level elements, but most browsers will also allow inline elements. Various appearance-related attributes are possible, but CSS should be used instead.

<p>
 
This is a block-level element that can appear within most other block-level elements. It should only contain inline elements (words and images). The content is automatically typeset as a paragraph (i.e., the browser automatically decides where to break lines).

<img>
 
This is an empty, inline element (i.e., images are treated like words in a sentence). It can be used within almost any other element. Important attributes are src, to specify the file containing the image (this may be a URL, i.e., an image anywhere on the web), and alt to specify alternative text for non-graphical browsers.

<a>
 
The a element is known as an anchor. It is an inline element that can go inside any other element. It can contain any other inline element (except another anchor). Its important attributes are: href, which means that the anchor is a hypertext link and the value of the attribute specifies a destination (when the content of the anchor is clicked on, the browser navigates to this destination); and name, which means that the anchor is the destination for a hyperlink.

The value of an href attribute can be: a URL, which specifies a separate web page to navigate to; something of the form #target, which specifies an anchor within the same document that has an attribute name="target"; or a combination, which specifies an anchor within a separate document. For example, the following URL specifies the top of the W3C page for HTML 4.01.

href="http://www.w3.org/TR/html401/"

The URL below specifies the table of contents further down that web page.

href="http://www.w3.org/TR/html401/#minitoc"

<h1> ... <h6>
 
These are block-level elements that denote that the contents are a section heading. They can appear within almost any other block-level element, but can only contain inline elements. They have no attributes of interest.

These elements should be used to indicate the section structure of a document, not for their default display properties. CSS should be used to achieve the desired weight and size of the text in general.

<table>, <tr>, and <td>
 
A table element contains one or more tr elements, each of which contains one or more td elements (so td elements can only appear within tr elements, and tr elements can only appear within a table element). A table element may appear within almost any other block-level element. In particular, a table can be nested within the td element of another table.

The table element has a summary attribute to describe the table for non-graphical browsers. There are also attributes to control borders, background colors, and widths of columns, but CSS is the preferred way to control these features.

The tr element has attributes for the alignment of the contents of columns, including aligning numeric values on decimal points. The latter is important because it has no corresponding CSS property.

The td element also has alignment attributes for the contents of a column for one specific row, but these can be handled via CSS instead. However, there are several attributes specific to td elements, in particular, rowspan and colspan, which allow a single cell to spread across more than one row or column.

Unless explicit dimensions are given, the table rows and columns are automatically sized to fit their contents.

It is tempting to use tables to arrange content on a web page, but it is recommended to use CSS for this purpose instead. Unfortunately, the support for CSS in web browsers tends to be worse than the support for table elements, so it may not always be possible to use CSS for arranging content. This warning also applies to controlling borders and background colors via CSS.

The code below shows an example of a table with three rows and three columns and the image below the code shows what the result looks like in a web browser.

<table border="1">
    <tr>
        <td></td> <td>pacific</td> <td>eurasian</td>
    </tr>
    <tr>     
        <td>min</td> <td>276</td> <td>258</td>
    </tr>
    <tr> 
        <td>max</td> <td>283</td> <td>293</td>
    </tr>
</table>

Image tablegray

It is also possible to construct more complex tables with separate thead, tbody, and tfoot elements to group rows within the table (i.e., these three elements can go inside a table element, with tr elements inside them).

<hr>
 
This is an empty element that produces a horizontal line. It can appear within almost any block-level element. It has no attributes of interest.

This entire element can be replaced by CSS control of borders.

<br>
 
This is an empty element that forces a new line or line-break. It can be put anywhere. It has no attributes of interest.

This element should be used sparingly. In general, text should be broken into lines by the browser to fit the available space.

<ul>, <ol>, and <li>
 
These elements create lists. The li element generates a list item and appears within either a ul element, for a bullet-point list, or an ol element, for a numbered list.

Anything can go inside an li element (i.e., you can make a list of text descriptions, a list of tables, or even a list of lists).

These elements have no attributes of interest. CSS can be used to control the style of the bullets or numbering and the spacing between items in the list.

The code below shows an ordered list with two items and the image below the code shows what the result looks like in a web browser.

<ol>
    <li>
    <p>
    Large bodies of water tend to 
    absorb and release heat more slowly 
    compared to large land masses.
    </p>
    </li>

    <li>
    <p>
    Temperatures vary differently over time 
    depending on which hemisphere the pole 
    is located in. 
    </p>
    </li>
</ol>

Image listgray

It is also possible to produce “definition” lists, where each item has a heading. Use a dl element for the overall list with a dt element to give the heading and a dd element to give the definition for each item.

<pre>
 
This is a block-level element that displays any text content exactly as it appears in the source code. It is useful for displaying computer code or computer output. It has no attributes of interest.

It is possible to have other elements within a pre element. Like the hr element, this element can usually be replaced by CSS styling.

<div> and <span>
 
These are generic block-level and inline elements (respectively). They have no attributes of interest.

These can be used as “blank” elements with no predefined appearance properties. Their appearance can then be fully specified via CSS. In theory, any other HTML element can be emulated using one of these elements and appropriate CSS properties. In practice, the standard HTML elements are more convenient for their default behavior and these elements are used for more exotic situations.

3.2.2 Common HTML attributes

Almost all elements may have a class attribute, so that a CSS style specified in the head element can be associated with that element. Similarly, all elements may have an id attribute, which can be used to associate a CSS style. The value of all id attributes within a piece of HTML code must be unique.

All elements may also have a style attribute, which allows “inline” CSS rules to be specified within the element's start tag.

Paul Murrell

Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 New Zealand License.