2.4 Writing code

Up to this point, we have only looked at HTML code that has already been written. We will now turn our attention to writing new HTML code.

There are three important steps: we must learn how to write the code in the first place; we must be able to check the syntax of the code; and we must learn how to run the code to produce the desired end result (in the case of HTML, a web page).

For each step, we will discuss what software we need to do the job as well as provide guidelines and advice on the right way to perform each task.

In this section, we look at the task of writing computer code.


2.4.1 Text editors

The act of writing code is itself dependent on computer tools. We use software to record and manage our keystrokes in an effective manner. This section discusses what sort of tool should be used to write computer code effectively.

An important feature of computer code is that it is just plain text. There are many software packages that allow us to enter text, but some are more appropriate than others.

For many people, the most obvious software program for entering text is a word processor, such as Microsoft Word or OpenOffice Writer. These programs are not a good choice for editing computer code. A word processor is a good program for making text look pretty with lots of fancy formatting and wonderful fonts. However, these are not things that we want to do with our raw computer code.

The programs that we use to run our code expect to encounter only plain text, so we must use software that creates only text documents, which means we must use a text editor.


2.4.2 Important features of a text editor

For many people, the most obvious program for creating a document that only contains text is Microsoft Notepad. This program has the nice feature that it saves a file as pure text, but its usefulness ends there.

When we write computer code, a good choice of text editor can make us much more accurate and efficient. The following facilities are particularly useful for writing computer code:

automatic indenting
    As we will see in Section 2.4.3, it is important to arrange code in a neat fashion. A text editor that helps to indent code (place empty space at the start of a line) makes this easier and faster.

parenthesis matching
    Many computer languages use special symbols, e.g., { and }, to mark the beginning and end of blocks of code. Some text editors provide feedback on such matching pairs, which makes it easier to write code correctly.

syntax highlighting
    All computer languages have special keywords that have a special meaning for the language. Many text editors automatically color such keywords, which makes it easier to read code and easier to spot simple mistakes.

line numbering
    Some text editors automatically number each line of computer code (and in some cases each column or character as well) and this makes navigation within the code much easier. This is particularly important when trying to find errors in the code (see Section 2.5).

In the absence of everything else, Notepad is better than using a word processor. However, many useful (and free) text editors exist that do a much better job. Some examples are Crimson Editor on Windows and Kate on Linux.

Figure 2.4 demonstrates some of these ideas by showing the same code in Notepad and Crimson Editor.

Figure 2.4: The HTML code from Figure 2.2 viewed in Crimson Editor (top) and Microsoft Notepad (bottom). Crimson Editor provides assistance for writing computer code by providing syntax highlighting (the HTML keywords are highlighted) and by providing information about which row and column the cursor is on (in the status bar at the bottom of the window).


2.4.3 Layout of code

There are two important audiences to consider when writing computer code. The obvious one is the computer; it is vitally important that the computer understands what we are trying to tell it to do. This is mostly a matter of getting the syntax of our code right.

The other audience for code consists of humans. While it is important that code works (that the computer understands it), it is also essential that the code is comprehensible to people. And this does not just apply to code that is shared with others, because the most important person who needs to understand a piece of code is the original author of the code! It is very easy to underestimate the probability of having to reuse a piece of code weeks, months, or even years after it was initially written, and in such cases it is common for the code to appear much less obvious on a second viewing, even to the original author.

Other people may also get to view a piece of code. For example, other researchers will want to see our code so that they know what we did to our data. All code should be treated as if it is for public consumption.

One simple but important way that code can be improved for a human audience is to format the code so that it is easy to read and easy to navigate.

For example, the following two code chunks are identical HTML code, as far as the computer is concerned. However, they are vastly different to a human reader. Try finding the “title” part of the code. Even without knowing anything about HTML, this is a ridiculously easy task in the second layout, and annoyingly difficult in the first.

<html><head><title>A Minimal HTML 
Document</title></head><body>
The content goes here!</body>

<html>
    <head>
        <title>A Minimal HTML Document</title>
    </head>
    <body>
        The content goes here!
    </body>

This demonstrates the basic idea behind laying out code. The changes are entirely cosmetic, but they are extremely effective. It also demonstrates one important layout technique: indenting.

2.4.4 Indenting code

The idea of indenting code is to expose the structure of the code. What this means will vary between computer languages, but in the case of HTML code, a simple rule is to indent the contents of an element.

The following code provides a simple example, where a title element is the content of a head element. The title element is indented (shifted to the right) with respect to the head element.

<head>
    <title>A Minimal HTML Document</title>
</head>

The amount of indenting is a personal choice. The examples here have used 4 spaces, but 2 spaces or even 8 spaces are also common. Whatever indentation is chosen, it is essential that the indenting rule is applied consistently, especially when more than one person might modify the same piece of code.
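
As a small sketch of what consistency means in practice, the following code uses 2 spaces for every level of nesting. The list elements (ul and li) and their contents are assumptions made up for this illustration; they do not come from the earlier examples.

```html
<body>
  <ul>
    <li>First item</li>
    <li>Second item</li>
  </ul>
</body>
```

Each element sits exactly one step (2 spaces) to the right of the element that contains it. Mixing 2-space and 4-space steps within the same file would hide this structure rather than expose it.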

Exposing structure of code by indenting is important because it makes it easy for someone reading the code to navigate within the code. It is easy to identify different parts of the code, which makes it easier to see what the code is doing.

Another useful result of indenting is that it provides a basic check on the correctness of code. Look again at the simple HTML code example. Does anything look wrong?

<html>
    <head>
        <title>A Minimal HTML Document</title>
    </head>
    <body>
        The content goes here!
    </body>

Even without knowing anything about HTML, the lack of symmetry in the layout suggests that there is something missing at the bottom of this piece of code. In this case, indenting has alerted us to the fact that there is no end </html> tag.
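
For comparison, here is a sketch of the same code with the missing end tag restored. The layout is now symmetric: the final </html> tag lines up with the opening <html> tag.

```html
<html>
    <head>
        <title>A Minimal HTML Document</title>
    </head>
    <body>
        The content goes here!
    </body>
</html>
```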

2.4.5 Long lines of code

Another situation where indenting should be applied is when a line of computer code becomes very long. It is a bad idea to have a single line of code that is wider than the screen on which the code is being viewed (so that we have to scroll across the window to see all of the code). When this happens, the code should be split across several lines (most computer languages do not notice the difference). Here is an example of a line of HTML code that is too long.

<img src="poleplot.png" alt="A plot of temperatures over time">

Here is the code again, split across several lines. It is important that the subsequent lines of code are indented so that they are visually grouped with the first line.

<img src="poleplot.png" 
     alt="A plot of temperatures over time">

In the case of a long HTML element, a reasonable approach is to left-align the start of all attributes within the same tag (as shown above).
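
The same approach extends to an element with more attributes. In the following sketch, the width and height attributes and their values are assumptions added only to make the element longer; they are not taken from the original example.

```html
<img src="poleplot.png"
     alt="A plot of temperatures over time"
     width="600"
     height="400">
```

With one attribute per line and every attribute starting in the same column, it is easy to scan down the list of attributes and to see where the tag ends.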

2.4.6 Whitespace

Whitespace refers to empty gaps in computer code. Like indenting, whitespace is useful for making code easy for humans to read, but it has no effect on the semantics of the code. Wouldyouwriteinyournativelanguagewithoutputtingspacesbetweenthewords?

Indenting is a form of whitespace that always appears at the start of a line, but whitespace is effective within and between lines of code as well. For example, the following code is too dense and therefore is difficult to read.

<table border="1"width="100%"bgcolor="#CCCCCC">

This modification of the code, with extra spaces, is much easier on the eye.

<table border="1" width="100%" bgcolor="#CCCCCC">

Figure 2.5 shows two code chunks that demonstrate the usefulness of blank lines between code blocks to help expose the structure, particularly in large pieces of code.

Again, exactly when to use spaces or blank lines depends on personal style.
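
As an illustration of blank lines between blocks, the following sketch separates the head of a document from its body, and separates the text in the body from an image. This is not the code from Figure 2.5; the h1 and p elements are assumptions introduced for the example.

```html
<html>
    <head>
        <title>A Minimal HTML Document</title>
    </head>

    <body>
        <h1>Temperatures</h1>
        <p>The content goes here!</p>

        <img src="poleplot.png"
             alt="A plot of temperatures over time">
    </body>
</html>
```

The blank lines group related lines of code together, much as paragraph breaks group related sentences in ordinary writing.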

Figure 2.5: These two code chunks contain exactly the same code; all that differs is the use of several blank lines (whitespace) in the code on the right, which help to expose the structure of the code for a human reader.


2.4.7 Documenting code

In Section 2.5.3, we discuss the importance of being able to read documentation about a computer language. In this section, we consider the task of writing documentation for our own code.

As with the layout of code, the purpose of documentation is to communicate. The obvious target of this communication is other people, so that they know what we did. A less obvious, but no less important, target is the code author. It is essential that when we return to a task days, weeks, or even months after we first performed the task, we are able to pick up the task again, and pick it up quickly.

Most of what we will have to say about documentation will apply to writing comments: messages written in plain language that are embedded in the code and ignored by the computer.


2.4.8 HTML comments

Here is how to include a comment within HTML code.

<!-- This is a comment -->

Anything between the start <!-- and end -->, including HTML tags, is completely ignored by the computer. It is only there to edify a human reader.
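
Because tags inside a comment are ignored, a comment can also be used to switch off part of the code temporarily. The following sketch reuses the image element from Section 2.4.5; the wording of the comment is only an illustration.

```html
<body>
    <!-- The image below is temporarily disabled:
    <img src="poleplot.png"
         alt="A plot of temperatures over time">
    -->
    The content goes here!
</body>
```

Note that comments cannot be nested: the first --> that appears ends the comment, so the code being disabled should not contain a comment of its own.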

Having no comments in code is generally a bad idea, and people usually do not add enough comments to their code. However, too many comments can also be a problem: if the code is ever modified, it becomes a burden to ensure that all of the comments are still correct. It can even be argued that too many comments make it hard to see the actual code!

Comments should not just be a repetition of the code. Good uses of comments include: providing a conceptual summary of a block of code; explaining a particularly complicated piece of code; and explaining arbitrary constant values.
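
The following sketch shows these uses in a small piece of HTML. The width attribute and its value are assumptions invented for the example, as is the wording of each comment.

```html
<!-- Main result: the plot of temperatures over time -->
<!-- width="600" is an assumed value, chosen here so that the
     image fits a narrow browser window without scrolling -->
<img src="poleplot.png"
     alt="A plot of temperatures over time"
     width="600">

<!-- A comment that merely repeats the code adds nothing,
     for example: "This is an img element" -->
```

The first comment summarizes what the code is for, and the second explains a value that would otherwise look arbitrary. The last comment is included only as an example of what to avoid.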


Recap

Computer code should be written using a text editor.

Code should be written tidily so that it is acceptable for a human audience. Code should be indented, lines should not be too long, and there should be plenty of whitespace.

Code should include comments, which are ignored by the computer but explain the purpose of the code to a human reader.

Paul Murrell

Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 New Zealand License.