The Title/Encoding options in Page Properties let you specify the document encoding type that is specific to the language used to author your web pages. They also let you specify which Unicode Normalization Form to use with that encoding type.
From the Page Properties panel, select Title/Encoding. You can configure the following options:
- Title: Specifies the page title that appears in the title bar of the Document window and most browser windows.
- Document Type (DTD): Specifies a document type definition. For example, you can make an HTML document XHTML-compliant by selecting XHTML 1.0 Transitional or XHTML 1.0 Strict from the pop‑up menu.
- Encoding: Specifies the encoding used for characters in the document. If you select Unicode (UTF‑8) as the document encoding, entity encoding is not necessary because UTF‑8 can safely represent all characters. If you select another document encoding, entity encoding may be necessary to represent certain characters. For more information on character entities, see www.w3.org/TR/REC-html40/sgml/entities.html.
- Reload: Converts the existing document, or reopens it using the new encoding.
- Unicode Normalization Form: Enabled only if you select UTF‑8 as a document encoding. There are four Unicode Normalization Forms. The most important is Normalization Form C because it’s the most common form used in the Character Model for the World Wide Web. Adobe provides the other three Unicode Normalization Forms for completeness. In Unicode, some characters are visually similar but can be stored within the document in different ways. For example, “ë” (e‑umlaut) can be represented as a single character, “e‑umlaut,” or as two characters, “regular Latin e” + “combining umlaut.” A Unicode combining character is one that is applied to the preceding character, so the umlaut would appear above the “Latin e.” Both forms result in the same visual typography, but what is saved in the file is different for each form. Normalization is the process of making sure that all characters that can be saved in different forms are saved using the same form. That is, all “ë” characters in a document are saved either as a single “e‑umlaut” or as “e” + “combining umlaut,” and not as both forms in one document. For more information on Unicode Normalization and the specific forms that can be used, see the Unicode website at www.unicode.org/reports/tr15.
- Include Unicode Signature (BOM): Includes a Byte Order Mark (BOM) in the document. A BOM is 2 to 4 bytes at the beginning of a text file that identifies the file as Unicode and, if so, indicates the byte order of the bytes that follow. Because UTF‑8 has no byte order, adding a UTF‑8 BOM is optional. For UTF‑16 and UTF‑32, it is required.
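
The effect of the Unicode Normalization Form and Include Unicode Signature (BOM) options can be illustrated outside the application. The following is a minimal Python sketch, not the application's own implementation. It uses the standard unicodedata and codecs modules, and the helper names normalize_text, encode_with_bom, and has_utf8_bom are hypothetical, introduced only for this illustration.

```python
import codecs
import unicodedata

# "ë" stored two ways: one precomposed character, or "e" + combining umlaut.
composed = "\u00eb"      # LATIN SMALL LETTER E WITH DIAERESIS
decomposed = "e\u0308"   # LATIN SMALL LETTER E + COMBINING DIAERESIS


def normalize_text(text: str, form: str = "NFC") -> str:
    """Hypothetical helper: return text in the given normalization form.

    form is one of "NFC", "NFD", "NFKC", or "NFKD".
    """
    return unicodedata.normalize(form, text)


def encode_with_bom(text: str) -> bytes:
    """Hypothetical helper: encode text as UTF-8 with a leading BOM."""
    # Python's "utf-8-sig" codec prepends the UTF-8 signature when encoding.
    return text.encode("utf-8-sig")


def has_utf8_bom(data: bytes) -> bool:
    """Hypothetical helper: report whether data starts with the UTF-8 BOM."""
    return data.startswith(codecs.BOM_UTF8)


# The two spellings render the same but are stored differently.
print(composed == decomposed)          # False
print(composed.encode("utf-8"))        # b'\xc3\xab'
print(decomposed.encode("utf-8"))      # b'e\xcc\x88'

# After normalization, both spellings are stored in the same form.
print(normalize_text(decomposed, "NFC") == composed)    # True
print(normalize_text(composed, "NFD") == decomposed)    # True

# The UTF-8 signature is three bytes at the start of the file.
print(codecs.BOM_UTF8)                 # b'\xef\xbb\xbf'
print(has_utf8_bom(encode_with_bom("x")))       # True
print(has_utf8_bom("x".encode("utf-8")))        # False
```

Normalizing to Form C before saving means that a search for “ë” matches every occurrence, regardless of how the character was originally typed or pasted.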