The Structure of HTML

Logical Structure

An important part of creating really professional
Web pages is understanding the idea of a logically structured document. A logically
structured document is one where every element on the page is defined, and the
definition corresponds to the functional role that element plays in the page.
HTML was designed as a logically structured language, based on SGML, and began
life pristinely following the logical structure. At its heart, it remains a
logical language, though some things have happened over the years to cause it
to lose the focus it originally had. But the direction the W3C is taking in
the move to XHTML, and from there to XML, clearly points to an evolution back
towards a strict logical structure. So it certainly pays to understand what
this implies, and prepare for it. The bonus is that following this leads to
better Web pages all around.

The idea of logical structure in a document is not new. People
have been doing it for several thousand years, actually. For example, if you
pick up a newspaper or a magazine, you will see structure built in to those
publications. Textbooks, too, are highly structured. In a newspaper, for instance,
each story has a headline, which tells you something about the story that follows,
and makes you want to read it. If it is a long story, there may be subheadings
for different sections of the story as well. We know that the headline is more
important than those subheadings. Sometimes, we even know that one story is more
important than another. How do we know these things?

Visual Cues

The answer is that we make use of visual cues that communicate
relative importance. The most important of these cues is the size of the type.
We know that the headline is most important because it has bigger type than
anything else in the story. We know that one story on page one is more important
than some other story on page one because it has bigger type in the headline.
This is a signal that people have been used to for a long time, and we make
use of it almost unconsciously. Put someone at a word processor to create a
document, and they will almost certainly set the document title
in the largest type they use. People also use these visual cues as
readers. Imagine you were looking at a document where the title was in the smallest
font! I think you would experience a moment of confusion, before deciding that
something had gone seriously wrong. The one thing you would probably not accept
is that the document was supposed to be that way, that someone had actually
intended it to have this odd appearance.

When HTML was being put together, most of the attention went to
how pages ought to be constructed, but at least some attention went to how user
agents (browsers) ought to display elements. In the case of headings,
for instance, the document on the global structure of HTML:

http://www.w3.org/TR/html401/struct/global.html

says the following:

“A heading element briefly describes the topic of the section
it introduces. Heading information may be used by user agents, for example,
to construct a table of contents for a document automatically.

There are six levels of headings in HTML with H1 as the most important
and H6 as the least. Visual browsers usually render more important headings
in larger fonts than less important ones.”

Note the definition of the element focuses purely on its function
within the document. The h1 element is the most important, h2 is slightly less important,
and so on, down to the least important, h6. But note also that there is some
attention to the visual cues we are all used to using. Browsers are expected
to generally use different type sizes to reinforce the functional definition.
So far, so good.

Why visual cues are not enough

While most people are used to using visual cues, they are not
in any sense sufficient for devising Web pages. While the Web is a visual medium,
it is not only a visual medium. Any good Web designer
has repeated the following statement many times, both to themselves and to others
they talk to: The Web is not print. If Web pages were
no different from printed pages, all we would need to do is select font sizes
that conveyed our meaning. But they are significantly different.

Electronic information processing

If you look at that quote from the Global Structure document again,
you might notice that it talks about creating a table of contents from your
document. How is a user agent supposed to do that? The answer is that it makes
use of the structure of your document to do it. It knows that the h1 tags are
the most important, the h2 tags are slightly less important, and that the paragraphs
do not appear at all in a table of contents. If you have structured your document
properly, this is not difficult for a computer program to do. In fact, this
is something you encounter beyond just the Web. In Microsoft Word, for instance,
you can create a Table of Contents in a document automatically, provided you
have created the document in a well-structured manner. Another example is in
help files for Windows programs. If you have ever used them, you may have noticed
that there is a tab that says “Index”, and another that says “Find.”
The difference has to do with the structure of the Help System text. Index only
searches through headings, while Find searches through paragraphs as well. That
means that the results you get on the Index tab are likely to be higher quality
than the ones you get from the Find tab, since the key words you searched for
were found in section headings, instead of being a perhaps offhand reference
in a paragraph somewhere.
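
To make the table of contents idea concrete, here is a minimal sketch in Python of how a program might build one from heading tags alone. It uses the standard library's html.parser module. The names HeadingCollector, build_toc and SAMPLE_PAGE are invented for this illustration, and the sketch does not claim to show how any particular browser or word processor actually does the job.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every h1-h6 element, in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None   # level of the heading currently open, if any
        self._text = []      # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        # Text outside a heading (paragraphs, for example) is ignored.
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def build_toc(html):
    """Return an indented outline, two spaces per heading level below h1."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return "\n".join("  " * (level - 1) + text
                     for level, text in collector.headings)


# Hypothetical sample page, used only for this illustration.
SAMPLE_PAGE = """
<h1>The Structure of HTML</h1>
<p>Paragraph text never appears in the table of contents.</p>
<h2>Logical Structure</h2>
<p>More paragraph text.</p>
<h2>Visual Cues</h2>
"""

print(build_toc(SAMPLE_PAGE))
```

Because the outline is driven entirely by the heading tags, a page that fakes its headings with large fonts would give this program nothing to work with.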

Back to the Web. Searching for information is important here as
well. We all have some experience of working with search engines. And we probably
have some experience of not finding what we are looking for. How do they work?
The answer is that they use various algorithms to classify the
information on a Web page so that they know when to offer that page in a search.
An important technique is to weight key terms by where they appear in a document.
If you create a Web page for Beanie Babies, for instance, and the term “beanie
babies” appears within an h1 tag, the page will rank higher in the search than
the exact same page with the term in an h2 tag instead. The reason is that
using an h2 tag says to the search engine software “This is less important”.
And it is not just the major search engines that are concerned here. Many sites
have their own search engines incorporated into the site, so that you can search
within the site (for example, Amazon.com). These search engines also use the
logical structure of individual pages to help classify the information on them.
Finally, there is a huge expansion in the use of purely internal Web sites, called
intranets, by major companies. These sites offer information to the employees
of the company, and they use search engines as well to help retrieve the information.
These intranets, too, need well-structured documents in order to help employees
find the right documents when they need them.
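
The weighting idea described above can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the weights in TAG_WEIGHTS are invented, the names WeightedTermCounter and score_page are hypothetical, and real search engines do not publish their formulas. The sketch only shows the principle that the same term counts for more in an h1 than in an h2.

```python
from html.parser import HTMLParser

# Hypothetical weights chosen only for illustration; real search engines
# do not publish their ranking formulas.
TAG_WEIGHTS = {"title": 10, "h1": 8, "h2": 6, "h3": 4, "p": 1}


class WeightedTermCounter(HTMLParser):
    """Score a term by the weight of the element each occurrence sits in."""

    def __init__(self, term):
        super().__init__()
        self.term = term.lower()
        self.score = 0
        self._open = []  # stack of currently open weighted tags

    def handle_starttag(self, tag, attrs):
        if tag in TAG_WEIGHTS:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag in self._open:
            # Pop back to (and including) the matching open tag.
            while self._open.pop() != tag:
                pass

    def handle_data(self, data):
        # Text outside any weighted element contributes nothing.
        if self._open:
            weight = TAG_WEIGHTS[self._open[-1]]
            self.score += weight * data.lower().count(self.term)


def score_page(html, term):
    counter = WeightedTermCounter(term)
    counter.feed(html)
    counter.close()
    return counter.score


# Two hypothetical pages that differ only in the heading level used.
page_h1 = "<h1>Beanie Babies</h1><p>All about beanie babies.</p>"
page_h2 = "<h2>Beanie Babies</h2><p>All about beanie babies.</p>"

print(score_page(page_h1, "beanie babies"))  # 8 for the h1 + 1 for the p = 9
print(score_page(page_h2, "beanie babies"))  # 6 for the h2 + 1 for the p = 7
```

Under these made-up weights, the page that puts the term in an h1 outscores the otherwise identical page that puts it in an h2.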

Accessibility

The other reason not to rely on purely visual cues in creating documents
is that many people, for whatever reason, cannot make use of visual cues. They
may be visually impaired, or they may be using some other user agent (e.g. a cell
phone) to access your site. To this audience, your visual cues simply do not
exist. And this is becoming very important for two reasons. First, as regards
people who are visually impaired, the law is becoming stricter. Many organizations are
now legally required to make their Web sites accessible. The accessibility guidelines
for HTML can be found at:

http://www.w3.org/TR/WAI-WEBCONTENT-TECHS/

In section 4.1.2, it says:

Sections should be introduced with the HTML header elements (H1-H6).
Other markup may complement these elements to improve presentation (e.g.,
the HR element to create a horizontal dividing line), but visual presentation
is not sufficient to identify document sections.

Since some users skim through a document by navigating its headings,
it is important to use them appropriately to convey document structure. Users
should order heading elements properly. For example, in HTML, H2 elements
should follow H1 elements, H3 elements should follow H2 elements, etc. Content
developers should not “skip” levels (e.g., H1 directly to H3). Do not use
headings to create font effects; use style sheets to change font styles for
example.
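
The rule about not skipping levels is easy to check mechanically. The following is a minimal sketch in Python, again using the standard html.parser module. The names HeadingLevelChecker and find_skipped_levels are hypothetical, and the sketch checks only the "no skipped levels" rule from the passage quoted above, nothing else in the guidelines.

```python
from html.parser import HTMLParser


class HeadingLevelChecker(HTMLParser):
    """Record every place where a heading jumps down more than one level."""

    def __init__(self):
        super().__init__()
        self.previous = None  # level of the last heading seen, if any
        self.skips = []       # (previous_level, new_level) pairs

    def handle_starttag(self, tag, attrs):
        # Match h1..h6 only; "hr", "head" and "html" are not headings.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            level = int(tag[1])
            if self.previous is not None and level > self.previous + 1:
                self.skips.append((self.previous, level))
            self.previous = level


def find_skipped_levels(html):
    checker = HeadingLevelChecker()
    checker.feed(html)
    checker.close()
    return checker.skips


# Hypothetical pages used only for this illustration.
good_page = "<h1>A</h1><h2>B</h2><h3>C</h3><h2>D</h2>"
bad_page = "<h1>A</h1><h3>C</h3>"

print(find_skipped_levels(good_page))  # []
print(find_skipped_levels(bad_page))   # [(1, 3)]
```

Moving back up the hierarchy, as from h3 to h2 in the first page, is not flagged, since it simply starts a new section at a higher level.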

Second, the rapid growth in the use of alternative methods to access Web content, such
as cell phones, also militates against the use of purely visual cues to impart
information. This is one of the most rapidly growing areas in Web access, and
will have significant impact on how Web sites are designed in coming years.
All of these alternative user agents make use of the logical information conveyed
in your tags to decipher what is on the page, so you had better do a good job
in structuring the information.

Structure vs. Presentation

What has happened over the last few years is that presentational elements (i.e.
purely visual approaches) have taken the upper hand over logical structure.
In large part, this is because so many designers have entered the field coming
from a visual design (print) background, and do not have a background in information
theory. This has led to some serious mistakes in Web design practice. The major
thrust of the W3C in moving HTML back towards SGML (by way of XHTML, then XML)
is to restore rigor to page design by insisting on
good logical structure. At the same time, designers have legitimate needs in
setting aspects of the visual appearance of their pages. The solution the W3C
has adopted is to separate the two aspects of each element in HTML. Each element
absolutely must have the proper logical, structural definition. But you can
then, through the use of Cascading Style Sheets, give a precise appearance to
each of those elements. You can choose the font, the size, etc. with complete
freedom. You could even make your h1 elements smaller than your h2 elements
if you wished.