Basics of HTML

Special Characters

Special character basics

Special characters are either characters that a typewriter keyboard can't normally produce (like £) or characters that are used in HTML commands.

The basic format for special characters is &xxx; where the xxx is a numeric or letter code for the character. These expressions are called entities. Some commonly used entities are these:
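As an illustration, here is roughly how a few widely used entities look in raw HTML. The selection is only a sample chosen for this sketch, and the character each one displays is noted in the comment beside it.

```html
<!-- A small sample of named entities; the choice of examples is an assumption -->
<p>&copy; 2014</p>        <!-- copyright sign: © 2014 -->
<p>&pound;5</p>           <!-- pound sign: £5 -->
<p>Tom &amp; Jerry</p>    <!-- ampersand: Tom & Jerry -->
<p>10&nbsp;km</p>         <!-- non-breaking space between 10 and km -->
```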


I've had to use entities frequently in writing these pages, since the brackets < > that go around HTML tags have to be specified in this format in the raw code. They are among the reserved characters that can't be typed directly into HTML code, as they are used within commands. If you are using any of these characters, you have to use the entities:
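A minimal sketch of the reserved characters and their entities follows. These four are the ones usually treated as reserved; the sample sentence at the end is made up for illustration.

```html
<!-- Reserved characters and the entities that stand in for them -->
<p>&lt;</p>     <!-- less-than sign: < -->
<p>&gt;</p>     <!-- greater-than sign: > -->
<p>&amp;</p>    <!-- ampersand: & -->
<p>&quot;</p>   <!-- double quotation mark: " -->

<!-- Showing a tag as visible text instead of having the browser obey it -->
<p>Use the &lt;b&gt; tag for bold text.</p>
```

The last line displays the tag name with its brackets on the page, rather than turning the following text bold.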

HTML editors will of course put these entities into your HTML for you. But it's a good idea to know what they look like should you have to add any on your own, or so that you won't wonder why that gibberish is in amongst your text when you look at raw code.

You can cause all kinds of symbols and characters to display in this way. See elsewhere on this page for more examples and ways to use them. Other methods, like scripting, can be used to insert symbols into your pages, but this method is the most basic and requires no special programming skills.

More character info

What makes this simple-sounding procedure more complicated is that different browser versions may display some characters differently. For example, I found a site (unfortunately no longer around) that generated Korean character codes pretty easily, and I used these in a set of web pages I made to show off photos from our trip to Seoul. You can see them in Firefox and Chrome, but not IE! (Take a look.)

There are also two sets of "codes" for each character; you'll notice this if you look at any of the code tables referenced on this page. The numeric code is an earlier effort, while the more mnemonic letter codes (like the ones I listed above) were an attempt to make things easier for the increasing numbers of non-programmers writing HTML. Most of the mnemonic entity codes that I have come across seem to work fine in all browsers. If testing shows that a character is not displaying (and you should always test your pages in different browsers), try using the numeric code instead.
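To make the two kinds of codes concrete, here is a short sketch pairing a mnemonic entity with its numeric equivalent. The three characters are picked only as examples; each pair should display the same character twice.

```html
<!-- Mnemonic code first, numeric code second -->
<p>&pound; and &#163;</p>    <!-- both display £ -->
<p>&eacute; and &#233;</p>   <!-- both display é -->
<p>&copy; and &#169;</p>     <!-- both display © -->
```

If one half of a pair fails to show up in a browser you are testing, the other half is the one to use.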

W3Schools gives more information on these issues as well as tables of particular types of characters (with links to complete tables). You can find other sources on the web. Just search for the ISO 8859-1 set, which is the technical name for this basic set of character codes. (Codes which can display other characters have other character sets. See some of the references in the Language characters box on the right.)

Language characters

Using the few special characters needed in European languages that use the Latin alphabet is easy. They are included in the basic character set. Here are a few of the most common (use capital letters when you want a capital to display with the diacritical mark):
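Here is a sketch of how some of these accented letters might be written in raw HTML. The words are sample text chosen for illustration, and the second line shows the capital-letter rule.

```html
<p>caf&eacute;</p>        <!-- café -->
<p>&Eacute;cole</p>       <!-- École: a capital E in the code gives the capital letter -->
<p>ma&ntilde;ana</p>      <!-- mañana -->
<p>&uuml;ber</p>          <!-- über -->
<p>fran&ccedil;ais</p>    <!-- français -->
```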


When you get to the realm of non-Latin alphabets, however, things get much more complicated. I was planning on giving you links to some nice code charts for Russian, Japanese, Thai, etc., characters, but got very bogged down in the minutiae of encoding, character sets, and other issues out of my comfort zone. So I'm just going to refer you to a couple of sites, should you want to pursue saying "hello" on the web in Chinese any time soon.

I found this discussion of the concept of character set encoding, which is technical but reasonably easy to follow. There's a lot of information and some conversion tools at Character Set & Unicode Tools and Conversions. And this page has a hopeful-sounding title: "How to Develop a Multi-Lingual Web Page."

The basis of displaying characters in all sorts of programs is found in these Unicode Charts. This website provides codes for an astounding array of languages, as well as resources that can help you understand how to use them.

Math Symbols

You might notice when you look at the entity tables referenced elsewhere on this page that some mathematical symbols are included, such as some fractions, ÷, and of course < and >.

Even more are available, should you need them for mathematical or other purposes, in the easy-to-remember entities format. Here's the full math list at W3Schools.

Some examples:
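A few math entities in raw HTML, as a sketch. The expressions are invented for illustration, and the comments show what each line should display.

```html
<p>&pi; r&sup2;</p>                      <!-- π r² -->
<p>6 &divide; 3 = 2</p>                  <!-- 6 ÷ 3 = 2 -->
<p>&frac12; + &frac14; = &frac34;</p>    <!-- ½ + ¼ = ¾ -->
<p>x &le; y and y &ne; z</p>             <!-- x ≤ y and y ≠ z -->
<p>&radic;2, &infin;, &sum;</p>          <!-- √2, ∞, ∑ -->
```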

Using Special Characters

One special character you should get in the habit of putting on all your pages is the © copyright symbol. Current copyright law protects content in any fixed medium (including web pages) without the symbol. But if you should ever need to charge someone with infringement, it helps to have the sign on your work.

The &nbsp; "character" is also very useful. It's the simplest way to add additional space to your text. If, for some reason, you wanted to set one word apart from the others in a line, just hitting the spacebar a number of times won't do the trick. But adding a bunch of &nbsp;'s will:

I want to set apart&nbsp;&nbsp;&nbsp;this&nbsp;&nbsp;&nbsp;word

displays as:

I want to set apart    this    word

Admittedly, there are other, cleaner, ways to do this (with styles, for example; see the tab). But &nbsp; will work. Some HTML editors also use it to "hold a space" in an otherwise empty table cell. In older browsers, empty cells would just disappear. (For more on tables see the Tables tab.)
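Both ideas can be sketched in a few lines. The first shows one possible style-based alternative to a run of non-breaking spaces; the inline style and the 1em margin are assumptions chosen for this example, not the only way to do it. The second shows a non-breaking space holding an otherwise empty table cell open.

```html
<!-- Spacing with a style instead of repeated non-breaking spaces -->
<p>I want to set apart <span style="margin: 0 1em;">this</span> word</p>

<!-- A non-breaking space as the only content of a table cell -->
<table border="1">
  <tr>
    <td>Filled cell</td>
    <td>&nbsp;</td>
  </tr>
</table>
```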

Dashes—you should use them more sparingly than I do—are handy ways to set off parenthetical expressions. And the only way I know to display them in HTML is with the &mdash; or &ndash; entities: — –
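In raw HTML, the two dash entities might be used like this. The sample sentences are made up for illustration; the last line notes the numeric equivalents.

```html
<!-- The longer dash sets off a parenthetical expression -->
<p>Dashes&mdash;use them sparingly&mdash;set off parenthetical expressions.</p>

<!-- The shorter dash is commonly used for ranges -->
<p>Open 9&ndash;5, pages 10&ndash;20</p>

<!-- Numeric equivalents: &#8212; for mdash and &#8211; for ndash -->
```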


Using special characters is not only useful—and even necessary in some cases—but can also spice up your text in unexpected and interesting ways.

I have become quite enamored of using &bull; to break up text. For example, in a simple table of contents line at the top of a page: 

Go to Spurs victory parade photos: 1999 • 2003 • 2005 • 2007 • 2014
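The raw HTML behind a line like that could look something like this sketch. The link markup around each year is left out to keep the example short.

```html
<p>Go to Spurs victory parade photos: 1999 &bull; 2003 &bull; 2005 &bull; 2007 &bull; 2014</p>
```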

And there's no law against using characters like mathematical symbols or punctuation or whatever comes to mind for situations they weren't designed for. As an example, you could also use the graceful shape of the integral sign to break up text if bullets are too plain for you:

Go to Spurs victory parade photos: 1999 ∫ 2003 ∫ 2005 ∫ 2007 ∫ 2014

Or to register greater than normal perplexity: What could this possibly mean?¿?¿?¿?¿
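For reference, the last two examples could be written in raw HTML roughly as follows, using the integral and inverted question mark entities.

```html
<!-- &int; is the integral sign; &iquest; is the inverted question mark -->
<p>Go to Spurs victory parade photos: 1999 &int; 2003 &int; 2005 &int; 2007 &int; 2014</p>
<p>What could this possibly mean?&iquest;?&iquest;?&iquest;?&iquest;</p>
```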

Graphic Credits

The handsome graphic lines on this page came from Realm Graphics, a nice free graphics site. The pi graphic came from the Florida Center for Educational Technology (when it was called something else). My daughter-in-law made the lovely Korean characters graphic for my personal web page.

