Separation of Structure and Presentation

The HTML 4.01 Standard

In HTML 4.01, section 2.4.1, the following admonition appears:

2.4.1 Separate structure and presentation
HTML has its roots in SGML which has always been a language for the specification
of structural markup. As HTML matures, more and more of its presentational elements
and attributes are being replaced by other mechanisms, in particular style sheets.
Experience has shown that separating the structure of a document from its presentational
aspects reduces the cost of serving a wide range of platforms, media, etc.,
and facilitates document revisions.

What does this mean?

For Web designers, this means that we need to be thinking about how we construct
a page, and use tags first and foremost for structural purposes, rather than
to create a visual effect. That is why I wrote the article on the Structure
of HTML, to show why the structure really does matter. But at the same time,
designers really do want to use visual effects. Visual cues are the “non-verbal”
language most of us are used to, and you cannot tell designers to stop using
those cues. So how do you achieve the kind of Web page you want while keeping
structure and presentation separate? There are several answers, depending on
the situation.

Style Sheets

You will note that the quote from the W3C standard for HTML 4.01 talks about
style sheets. For a beginning Web designer, it is not clear exactly what a style
sheet is or how it is used. A full explanation does not come until a later course,
but it wouldn’t hurt to give you a peek at how this process works. When I am
creating a Web site, the first thing I do is create a separate file, a style
sheet, usually named something like style.css. This is a pure text file, just
like HTML files, and goes in my web site. I link my pages to this style sheet,
and it contains most of the presentational information for my site. Here is
a short sample from one of my sites:

h1 { font-family: Arial, Helvetica, sans-serif; font-size: 24pt;
     font-style: italic; font-weight: bold; }
h2 { font-family: Arial, Helvetica, sans-serif; font-size: 18pt;
     font-style: italic; font-weight: bold; }
h3 { font-family: Arial, Helvetica, sans-serif; font-size: 16pt;
     font-style: italic; font-weight: bold; }
h4 { font-family: Arial, Helvetica, sans-serif; font-size: 14pt;
     font-style: italic; font-weight: bold; }
p { font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 12pt;
    font-style: normal; line-height: normal; font-weight: normal; }
ul { font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 12pt; }
li { }
blockquote { font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 10pt; }

I think with this simple example you can see how the process works. I declare
all of my presentational information in the style sheet. If I think that my
h1 tag is too big, I just change the declaration in the style sheet to a smaller
font size. I can make it italic, or bold, as I wish. I can do a lot of other
things as well, such as change the line height. The point is that if I concentrate
on building HTML pages that are logically correct, I can then use style sheets
to present those pages any way I like. And if I change my mind, I can update
the appearance of my entire site (and I manage sites as large as 1,000 pages)
with one little change to my style sheet.
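
To make the linking step concrete, here is a minimal sketch of a page that points at the style sheet. The file name style.css comes from the description above; the title, heading, and paragraph text are placeholders, and the sketch assumes the style sheet sits in the same folder as the page.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Sample page</title>
  <!-- Every page on the site carries this same link, so one edit
       to style.css changes the appearance of all of them -->
  <link rel="stylesheet" type="text/css" href="style.css">
</head>
<body>
  <!-- The markup says only what each piece of content is, not how it looks -->
  <h1>Main heading</h1>
  <p>Body text goes here.</p>
</body>
</html>
```

With this in place, changing the font-size in the h1 rule of the style sheet would resize the heading on every page that carries the link, without touching the HTML.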

So, when you are looking at formatting entire elements (like h1, h2, etc.),
the proper method is to use style sheets to control the formatting.

Formatting within elements

If style sheets are the way to format entire elements, what approach should
you take when formatting something within an element, as I just did in this
sentence with the word “within”? You might
think I used the b and the i tag here, but if you check the source code you
will see that I did not. There is a good reason for this: in HTML 4.01 we are
asked not to use them. Here is what it says in the section where these tags
are discussed, Section 15.2:

The following HTML elements specify font information. Although they are
not all deprecated, their use is discouraged in favor of style sheets.

Now, discouraged is not quite the same thing as forbidden, but you can see
where they are going here. The preferred tags to use are found in section 9.2.1:

EM and STRONG are used to indicate emphasis. …

The presentation of phrase elements depends on the user agent. Generally,
visual user agents present EM text in italics and STRONG text in bold font.
Speech synthesizer user agents may change the synthesis parameters, such
as volume, pitch and rate accordingly.

This gives us a clue as to what they are doing. Generally, when people use
a b or an i tag for a word or phrase, they are doing it to signal some type
of emphasis. But this is an indirect method, and only appropriate in a visual
browser at that. And it violates the principle of separating structure from
presentation. By using the EM and the STRONG tags, you directly communicate
what it is you are trying to do: you are trying to emphasize something. This
leaves it to the user agent (i.e. the software that someone is using to read
your page; the browser) to determine how it will be presented to the user. And
the user agent can be configured to handle the emphasis in a way that is most
appropriate.
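
The difference is easiest to see side by side. The sentence below is an invented example, not taken from any real page. In a typical visual browser, both paragraphs would probably look the same, but only the second states the intent.

```html
<!-- Presentational: asks only for bold and italic type -->
<p>Back up your files <b>before</b> you upgrade, <i>not</i> after.</p>

<!-- Structural: marks the words as emphasized and leaves
     the rendering to the user agent -->
<p>Back up your files <strong>before</strong> you upgrade, <em>not</em> after.</p>
```

A speech synthesizer has nothing useful to do with "bold", but it can act on "strong emphasis", which is the point the standard is making.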

In a well-constructed HTML document, therefore, the b and i tags should
almost never appear. The exceptional case where their use might be defended
is when they are used at the character level for an effect, or for readability,
in a way that does not communicate any information. Using the tags to communicate
information would get us back into the world of structure. But an interesting
example that appeared in one of the student pages was using the i tag to italicize
the initial letter of each word in a company name in the h1 header. This clearly
communicated no information at all, and was therefore a reasonable use of the
tag. Using the EM tag in that context would have been clearly wrong because
there was no way any reasonable person would want to emphasize the initial letter.
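
As a rough reconstruction of that kind of header, the markup might have looked something like the following. The company name is made up for illustration; the student's actual page is not reproduced here.

```html
<!-- "Acme Widget Works" is a hypothetical company name.
     The i tags are purely decorative: no letter is being emphasized. -->
<h1><i>A</i>cme <i>W</i>idget <i>W</i>orks</h1>
```

Replacing each i with em here would tell a user agent that the letters A, W, and W carry emphasis, which is not what the designer meant.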