2.5 Checking code

Knowing how to write the correct syntax for a computer language is not a guarantee that we will write the correct syntax for a particular piece of code. One way to check whether we have our syntax correct is to stare at it and try to see any errors. However, in this book, such a tedious, manual, and error-prone approach is discouraged because the computer is so much better at this sort of task.

In general, we will enlist the help of computer software to check that the syntax of our code is correct.

In the case of HTML code, there are many types of software that can check the syntax. Some web browsers provide the ability to check HTML syntax, but in general, just opening the HTML document in a browser is not a good way to check syntax.

We will use a program called HTML Tidy to demonstrate HTML syntax checking.

2.5.1 Checking HTML code

HTML Tidy is a program for checking the syntax of HTML code. It can be downloaded from SourceForge, or it can be used via one of the online services provided by the World Wide Web Consortium (W3C).

In order to demonstrate the use of HTML Tidy, we will check the syntax of the following HTML, which contains one deliberate mistake. This code has been saved in a file called broken.html.

Figure 2.6: An HTML document that contains one deliberate mistake on line 4 (missing </title> tag).
 

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
    <head>
        <title>A Minimal HTML Document
    </head>
    <body>
    </body>
</html>

For simple use of HTML Tidy, the only thing we need to know is the name of the HTML document and where that file is located. For example, the online services provide a button to select the location of an HTML file. To run HTML Tidy locally, the following command would be entered in a command window or terminal.

tidy broken.html

HTML Tidy checks the syntax of HTML code, reports any problems that it finds, and produces a suggestion of what the correct code should look like.
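
When many files need checking, the same command can be run from a script. The following Python code is a minimal sketch, not part of HTML Tidy itself: the function names tidy_command and check_with_tidy are made up for illustration, and the sketch assumes that a tidy program is installed and on the search path (it returns None when it is not). The -quiet and -errors options ask HTML Tidy to report problems only, without printing a suggested corrected document.

```python
import shutil
import subprocess


def tidy_command(path):
    """Build the command line for a syntax-only check of one HTML file."""
    # -quiet suppresses nonessential output; -errors shows only
    # errors and warnings instead of a corrected document.
    return ["tidy", "-quiet", "-errors", path]


def check_with_tidy(path):
    """Run HTML Tidy on path and return its messages as text.

    Returns None when no tidy program can be found.
    """
    if shutil.which("tidy") is None:
        return None
    result = subprocess.run(tidy_command(path), capture_output=True, text=True)
    # HTML Tidy writes its warnings and errors to standard error.
    return result.stderr
```

Calling check_with_tidy("broken.html") would then return the warning text as a string that can be printed or inspected further.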

Figure 2.7 shows part of the output from running HTML Tidy on the simple HTML code in Figure 2.6.

Figure 2.7: Part of the output from running HTML Tidy on the HTML code in Figure 2.6.
 

Parsing "broken.html"
line 5 column 5 - Warning: missing </title> before </head>

Info: Doctype given is "-//W3C//DTD HTML 4.01 Transitional//EN"
Info: Document content looks like HTML 4.01 Transitional
1 warning, 0 errors were found!

An important skill to develop for writing computer code is the ability to decipher warning and error messages that the computer displays. In this case, there is one message, which HTML Tidy classifies as a warning rather than an error.

2.5.2 Reading error information

The error (or warning) information provided by computer software is often very terse and technical. Reading error messages is a skill that improves with experience, and it is important to seek out any piece of useful information in a message. Even if the message as a whole does not make sense, if it can at least point us to the correct area within the code, our mistake may become obvious.

In general, when the software checking our code encounters a series of errors, it is possible for the software to become confused. This can lead to more errors being reported than actually exist. It is always a good idea to tackle the first error first, and it is usually a good idea to recheck code after fixing each error. Fixing the first error will sometimes eliminate or at least modify subsequent error messages.

The error from HTML Tidy in Figure 2.7 is this:

line 5 column 5 - Warning: missing </title> before </head>

To an experienced eye, the problem is clear, but this sort of message can be quite opaque for people who are new to writing computer code. A good first step is to make use of the information that is supplied about where the problem has occurred. In this case, we need to find the fifth character on line 5 of our code.

The line of HTML code in question is shown below.

    </head>

Column 5 on this line is the < at the start of the closing head tag.
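
Finding a reported position by counting characters is itself error-prone, so it can help to let the computer do it. The Python code below is a rough sketch under some assumptions: the function name locate is hypothetical, and the pattern only handles messages shaped like the one above ("line N column M - ..."). It returns the offending line together with a marker line that has a caret under the reported column.

```python
import re

BROKEN = """\
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
    <head>
        <title>A Minimal HTML Document
    </head>
    <body>
    </body>
</html>
"""

# Matches messages of the form "line 5 column 5 - Warning: ..."
MESSAGE = re.compile(r"line (\d+) column (\d+) - (\w+): (.*)")


def locate(message, source):
    """Return the line named in message and a caret marking the column."""
    match = MESSAGE.match(message)
    if match is None:
        raise ValueError("message is not in the expected format")
    line_number = int(match.group(1))
    column = int(match.group(2))
    # Line and column numbers in the message start at 1, not 0.
    line = source.splitlines()[line_number - 1]
    marker = " " * (column - 1) + "^"
    return line, marker


line, marker = locate(
    "line 5 column 5 - Warning: missing </title> before </head>", BROKEN
)
print(line)
print(marker)
```

Printing the two results places the caret directly beneath the < of the closing head tag.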

Taken in isolation, it is hard to see what is wrong with this code. However, error messages typically occur only once the computer is convinced that something has gone wrong. In many cases, the error will actually be in the code somewhere before the exact location reported in the error message. It is usually a good idea to look in the general area specified by the error message, particularly on the lines immediately preceding the error.

Here is the location of the error message in a slightly broader context.

        <title>A Minimal HTML Document
    </head>
    <body>

The error message mentions both title and head, so we might guess that we are dealing with these elements. In this case, this just confirms that we are looking at the right place in our code.
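
The broader view shown above can also be produced mechanically. The following Python code is a small sketch; the function name show_context and its before and after parameters are assumptions made for illustration. It returns the reported line plus a chosen number of neighbouring lines, each labelled with its line number.

```python
BROKEN = """\
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
    <head>
        <title>A Minimal HTML Document
    </head>
    <body>
    </body>
</html>
"""


def show_context(source, line_number, before=1, after=1):
    """Return numbered lines surrounding line_number (counting from 1)."""
    lines = source.splitlines()
    # Clamp the range so that it never runs off either end of the file.
    first = max(1, line_number - before)
    last = min(len(lines), line_number + after)
    return [f"{n}: {lines[n - 1]}" for n in range(first, last + 1)]


for numbered_line in show_context(BROKEN, 5):
    print(numbered_line)
```

Increasing the before argument is a simple way to widen the search when the real mistake lies several lines ahead of the reported location.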

The message is complaining that the </title> tag is missing and that the tag should appear before the </head> tag. From the code, we can see that we have started a title element with a <title> start tag, but we have failed to complete the element; there is no </title> end tag.

In this case, the solution is simple; we just need to add the missing tag and check the code with HTML Tidy again, and everything will be fine.
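
To see why the warning is attached to the </head> tag rather than to the <title> tag, it can help to imagine how a checker might work. The Python code below is a deliberately simplified sketch and is not how HTML Tidy is implemented: the name check_nesting and the message wording are assumptions, and the sketch ignores comments, elements that never take an end tag (such as br), and attribute values that contain a > character. It keeps a stack of elements that are still open and only notices a problem when an end tag arrives that does not match the most recently opened element.

```python
import re

BROKEN = """\
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
    <head>
        <title>A Minimal HTML Document
    </head>
    <body>
    </body>
</html>
"""

# The repaired document: only the missing </title> end tag is added.
FIXED = BROKEN.replace(
    "<title>A Minimal HTML Document",
    "<title>A Minimal HTML Document</title>",
)

# A start or end tag; "<!DOCTYPE ...>" does not match because of the "!".
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")


def check_nesting(html):
    """Return a list of messages about end tags that arrive too early."""
    messages = []
    open_elements = []  # names of elements still waiting for an end tag
    for line_number, line in enumerate(html.splitlines(), start=1):
        for match in TAG.finditer(line):
            name = match.group(2).lower()
            where = f"line {line_number} column {match.start() + 1}"
            if match.group(1) != "/":
                open_elements.append(name)  # start tag: remember it
                continue
            if name not in open_elements:
                messages.append(f"{where} - unexpected </{name}>")
                continue
            # Anything opened after this element was never closed.
            while open_elements[-1] != name:
                missing = open_elements.pop()
                messages.append(f"{where} - missing </{missing}> before </{name}>")
            open_elements.pop()
    return messages


print(check_nesting(BROKEN))
print(check_nesting(FIXED))
```

For the broken document, the sketch reports a single problem at line 5, column 5, because that is the first point at which the unclosed title element can be detected; for the repaired document, it reports nothing.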

Unfortunately, not all syntax errors can be resolved as easily as this. When the error is not as obvious, we may have to extract what information we can from the error message and then make use of another important skill: reading the documentation for computer code.


2.5.3 Reading documentation

The nice thing about learning a computer language is that the rules of grammar are usually quite simple, there are usually very few of them, and they are usually very consistent.

Unfortunately, computer languages are similar to natural languages in terms of vocabulary. The time-consuming part of learning a computer language involves learning all of the special words in the language and their meanings.

What makes this task worse is the fact that the reference material for computer languages, much like the error messages, can be terse and technical. As for reading error messages, practice and experience are the only known cures.

This book provides reference chapters for each of the computer languages that we encounter. Chapter 3 provides a reference for HTML.

These reference chapters are shorter and simpler than the official language documentation, so they should provide a good starting point for finding out a few more details about a language. When this still does not provide the answer, there are pointers to more thorough documentation at the end of each reference chapter.

Recap

Computer code should be checked for correctness of syntax before it can be expected to run.

Understanding computer error messages and understanding the documentation for a computer language are important skills in themselves that take practice and experience to master.

Paul Murrell

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 New Zealand License.