The Structure of HTML

Logical Structure

One of the keys to creating really professional Web pages is understanding the idea of a logically structured document. A logically structured document is one where every element on the page is defined, and the definition corresponds to the functional role that element plays in the page. HTML was designed as a logically structured language, based on SGML, and began life pristinely following the logical structure. At its heart, it remains a logical language, though some things have happened over the years to cause it to lose the focus it originally had. But the direction the W3C is taking in the move to XHTML, and from there to XML, clearly points to an evolution back towards a strict logical structure. So it certainly pays to understand what this implies, and to prepare for it. The bonus is that following this approach leads to better Web pages all around.

The idea of logical structure in a document is not new. People have been doing it for several thousand years, actually. For example, if you pick up a newspaper or a magazine, you will see structure built into those publications. Textbooks, too, are highly structured. In a newspaper, for instance, each story has a headline, which tells you something about the story that follows, and makes you want to read it. If it is a long story, there may be subheadings for different sections of the story as well. We know that the headline is more important than those subheadings. Sometimes, we even know that one story is more important than another. How do we know these things?

Visual Cues

The answer is that we make use of visual cues that communicate relative importance. The most important of these cues is the size of the type. We know that the headline is most important because it has bigger type than anything else in the story. We know that one story on page one is more important than some other story on page one because its headline is set in bigger type. This is a signal that people have been used to for a long time, and we make use of it almost unconsciously. Put someone at a word processor to create a document, and it is almost certain that they will set the document title in the largest type they use. People also use these visual cues as readers. Imagine you were looking at a document where the title was in the smallest font! I think you would experience a moment of confusion, before deciding that something had gone seriously wrong. The one thing you would probably not accept is that the document was supposed to be that way, that someone had actually intended it to have this odd appearance.

When HTML was being put together, most of the attention went to how pages ought to be constructed, but at least some attention went to how user agents (browsers) ought to display elements. In the case of headings, for instance, the document on the global structure of HTML:

http://www.w3.org/TR/html401/struct/global.html

says the following:

"A heading element briefly describes the topic of the section it introduces. Heading information may be used by user agents, for example, to construct a table of contents for a document automatically.
There are six levels of headings in HTML with H1 as the most important and H6 as the least. Visual browsers usually render more important headings in larger fonts than less important ones."

Note that the definition of the element focuses purely on its function within the document. The h1 element is the most important, h2 is slightly less important, and so on, down to the least important, h6. But note also that there is some attention to the visual cues we are all used to using. Browsers are generally expected to use different type sizes to reinforce the functional definition. So far, so good.

Why visual cues are not enough

While most people are used to using visual cues, those cues are not in any sense sufficient in devising Web pages. While the Web is a visual medium, it is not only a visual medium. Any good Web designer has repeated the following statement many times, both to themselves and to others they talk to: The Web is not print. If Web pages were no different from printed pages, all we would need to do is select font sizes that conveyed our meaning. But they are significantly different.

Electronic information processing

If you look at that quote from the Global Structure document again, you might notice that it talks about creating a table of contents from your document. How is a user agent supposed to do that? The answer is that it makes use of the structure of your document to do it. It knows that the h1 elements are the most important, the h2 elements are slightly less important, and that the paragraphs do not appear at all in a table of contents. If you have structured your document properly, this is not difficult for a computer program to do. In fact, this is something you encounter beyond just the Web. In Microsoft Word, for instance, you can create a Table of Contents in a document automatically, provided you have created the document in a well-structured manner. Another example is in help files for Windows programs. If you have ever used them, you may have noticed that there is a tab that says "Index", and another that says "Find." The difference has to do with the structure of the Help System text. Index only searches through headings, while Find searches through paragraphs as well. That means that the results you get on the Index tab are likely to be higher quality than the ones you get from the Find tab, since the key words you searched for were found in section headings, rather than in what may be an offhand reference in a paragraph somewhere.
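To make this concrete, here is a minimal sketch of how a program might build a table of contents from headings alone. It is written in Python using the standard library's html.parser module. The class name HeadingCollector, the function table_of_contents, and the sample page are all invented for this illustration; this is not how any particular browser or word processor does it.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects (level, text) pairs for h1-h6 elements and ignores everything else."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []   # finished (level, text) pairs, in document order
        self._level = None   # level of the heading currently open, if any
        self._parts = []     # text fragments seen inside the open heading

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case, so <H1> and <h1> both match.
        if tag in self.HEADINGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        # Text is kept only while a heading is open; paragraph text is dropped.
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._level is not None:
            text = " ".join("".join(self._parts).split())
            self.headings.append((self._level, text))
            self._level = None


def table_of_contents(html):
    """Return an indented outline, one line per heading, two spaces per level."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return "\n".join("  " * (level - 1) + text for level, text in collector.headings)


# Hypothetical sample page, for illustration only.
SAMPLE_PAGE = """
<h1>The Structure of HTML</h1>
<p>Paragraph text never reaches the table of contents.</p>
<h2>Logical Structure</h2>
<p>More paragraph text.</p>
<h2>Visual Cues</h2>
<h3>Why visual cues are not enough</h3>
"""

print(table_of_contents(SAMPLE_PAGE))
```

Notice that the program never looks at font sizes. It relies entirely on the heading elements, which is why a page that fakes its headings with big bold paragraphs would produce an empty table of contents.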

Back to the Web. Searching for information is important here as well. We all have some experience of working with search engines. And we probably have some experience of not finding what we are looking for. How do they work? The answer is that they use various programming algorithms to classify the information on a Web page so that they know when to offer that page in a search. One important technique is to weight key terms by where they appear in a document. If you create a Web page for Beanie Babies, for instance, and the term "beanie babies" appears within an h1 element, the page is likely to rank higher in the search than the exact same page with the term in an h2 element instead. The reason is that using an h2 element says to the search engine software "This is less important". And it is not just the major search engines that are concerned here. Many sites have their own search engines incorporated into the site, so that you can search within the site (for example, Amazon.com). These search engines also use the logical structure of individual pages to help classify the information on them. Finally, there is a huge expansion in the use of purely internal Web sites, called Intranets, by major companies. These sites offer information to the employees of the company, and they use search engines as well to help retrieve the information. These Intranets too need well-structured documents in order to help employees find the right documents when they need them.
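The idea of weighting a term by the element it appears in can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the weights in TAG_WEIGHTS are invented (search engines do not publish theirs), and the names WeightedTermScorer and score_page are hypothetical. Real ranking algorithms take many more factors into account.

```python
from html.parser import HTMLParser

# Hypothetical weights, chosen only to illustrate the principle.
TAG_WEIGHTS = {"title": 10, "h1": 8, "h2": 6, "h3": 4, "p": 1}


class WeightedTermScorer(HTMLParser):
    """Adds up occurrences of a term, weighted by the element that encloses them."""

    def __init__(self, term):
        super().__init__()
        self.term = term.lower()
        self.score = 0
        self._open = []  # stack of currently open elements that carry a weight

    def handle_starttag(self, tag, attrs):
        if tag in TAG_WEIGHTS:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if self._open and self._open[-1] == tag:
            self._open.pop()

    def handle_data(self, data):
        # Text outside any weighted element contributes nothing in this sketch.
        if self._open:
            hits = data.lower().count(self.term)
            self.score += hits * TAG_WEIGHTS[self._open[-1]]


def score_page(html, term):
    scorer = WeightedTermScorer(term)
    scorer.feed(html)
    scorer.close()
    return scorer.score


# Two hypothetical pages that differ only in the heading level used.
PAGE_WITH_H1 = "<h1>Beanie Babies</h1><p>All about beanie babies.</p>"
PAGE_WITH_H2 = "<h2>Beanie Babies</h2><p>All about beanie babies.</p>"

print(score_page(PAGE_WITH_H1, "beanie babies"))  # 8 + 1 = 9
print(score_page(PAGE_WITH_H2, "beanie babies"))  # 6 + 1 = 7
```

Under these assumed weights, the two pages contain exactly the same words, yet the one that marks the term as an h1 scores higher.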

Accessibility

The other reason not to rely on purely visual cues in creating documents is that many people, for whatever reason, cannot make use of visual cues. They may be visually impaired, or they may be using some other user agent (e.g. a cell phone) to access your site. To this audience, your visual cues simply do not exist. And this is becoming very important for two reasons. First, as regards people who are visually impaired, the law is becoming stricter. Many organizations are now legally required to make their Web sites accessible. The Accessibility guidelines for HTML can be found at:

http://www.w3.org/TR/WAI-WEBCONTENT-TECHS/

In section 4.1.2, it says:

Sections should be introduced with the HTML header elements (H1-H6). Other markup may complement these elements to improve presentation (e.g., the HR element to create a horizontal dividing line), but visual presentation is not sufficient to identify document sections.
Since some users skim through a document by navigating its headings, it is important to use them appropriately to convey document structure. Users should order heading elements properly. For example, in HTML, H2 elements should follow H1 elements, H3 elements should follow H2 elements, etc. Content developers should not "skip" levels (e.g., H1 directly to H3). Do not use headings to create font effects; use style sheets to change font styles for example.
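The rule against skipping levels is easy to check mechanically. The following Python sketch, again using the standard html.parser module, reports every place where a heading jumps down more than one level. The names HeadingOrderChecker and skipped_heading_levels are invented for this example, and one detail is an assumption on my part: the start of the document is treated as level 0, so a page whose first heading is not an h1 is also reported.

```python
from html.parser import HTMLParser


class HeadingOrderChecker(HTMLParser):
    """Records (previous_level, level) pairs wherever a heading level is skipped."""

    def __init__(self):
        super().__init__()
        self.previous = 0  # assumption: document start counts as level 0
        self.skips = []

    def handle_starttag(self, tag, attrs):
        # Match h1..h6 only; "hr" and "html" are not headings.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            level = int(tag[1])
            if level > self.previous + 1:
                self.skips.append((self.previous, level))
            self.previous = level


def skipped_heading_levels(html):
    checker = HeadingOrderChecker()
    checker.feed(html)
    checker.close()
    return checker.skips


# Hypothetical pages, for illustration only.
WELL_ORDERED = "<h1>A</h1><h2>B</h2><h3>C</h3><h2>D</h2>"
SKIPS_A_LEVEL = "<h1>A</h1><hr><h3>C</h3>"

print(skipped_heading_levels(WELL_ORDERED))   # []
print(skipped_heading_levels(SKIPS_A_LEVEL))  # [(1, 3)]
```

Moving back up the hierarchy (from h3 to h2, say) is not reported, since starting a new section at a higher level is perfectly proper.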

Second, the rapid growth in the use of alternative methods to access Web content, such as cell phones, also militates against the use of purely visual cues to impart information. This is one of the most rapidly growing areas in Web access, and will have significant impact on how Web sites are designed in coming years. All of these alternative user agents make use of the logical information conveyed in your tags to decipher what is on the page, so you had better do a good job in structuring the information.

Structure vs. Presentation

What has happened over the last few years is that presentational elements (i.e. purely visual approaches) have taken the upper hand over logical structure. In large part, this is because so many designers have entered the field coming from a visual design (print) background, and do not have a background in information theory. This has led to some serious mistakes in Web design practice. The major thrust of the W3C in moving HTML back towards SGML (by way of XHTML, then XML) is to restore rigor to page design by insisting on good logical structure. At the same time, designers have legitimate needs in setting aspects of the visual appearance of their pages. The solution the W3C has adopted is to separate the two aspects of each element in HTML. Each element absolutely must have the proper logical, structural definition. But you can then, through the use of Cascading Style Sheets, give a precise appearance to each of those elements. You can choose the font, the size, etc. with complete freedom. You could even make your h1 elements smaller than your h2 elements if you wished.