Add structure to PDFs

When you export to Adobe PDF with the Create Tagged PDF option selected in the General area of the Export Adobe PDF dialog box, the exported pages are automatically tagged with a set of structure tags that describe the content, identifying page items such as headlines, stories, and figures. To add tags or fine-tune existing ones before you export, use the Tags panel in InDesign. The Structure pane (View > Structure > Show Structure) reflects the changes.

You can improve the accessibility and reuse of Adobe PDF documents by adding tags to the InDesign document before you export. If your PDF documents don’t contain tags, Adobe Reader or Acrobat may attempt to automatically tag the document when the user reads or reflows it, but the results may be disappointing. If you do not get the results you want in the exported PDF file, you can use tools in Acrobat 6.0 Professional and later to edit the structure of tagged PDF documents. For the most advanced tools, use Acrobat 9 Professional.

When you apply tags to a document for PDF export, the tags do not control which content is exported to PDF, as is the case with XML export. Instead, the tags give Acrobat more information about the document’s structural contents.

Advantages of using tags

By applying tags to your document before exporting to PDF, you can do the following:

  • Map InDesign paragraph style names to Acrobat tagged Adobe PDF paragraph styles to create a reflowable PDF file for viewing on handheld devices and other media.

  • Mark and hide printing artifacts, text, and images so that they won’t appear when reflowed in Acrobat. For example, if you tag a page item as Artifact, the page item will not be displayed when you reflow the contents of a tagged Adobe PDF document on a handheld device, a small display, or a monitor at a large magnification.

  • Add alternative text to figures so that the text can be read aloud to the visually impaired with screen-reading software.

  • Replace graphic letters, such as ornate drop-caps, with readable letters.

  • Provide a title for a set of articles, or group stories and figures into articles.

  • Order stories and figures to establish a reading order.

  • Recognize tables, formatted lists, and tables of contents. Recognize which content blocks belong to the different stories.

  • Include text formatting information such as Unicode values of characters, spacing between words, and the recognition of soft and hard hyphens.

How tags affect reuse and accessibility

The content of an Adobe PDF document can be reused for other purposes. For example, you might create an Adobe PDF file of a report with text, tables, and images, and then use various formats to distribute it: for printing or reading on a full-sized monitor, for viewing on a handheld device, for reading out loud by a screen reader, and for direct access through a web browser as HTML pages. The ease and reliability with which you can reuse the content depends on the underlying logical structure of the document.

To make sure that your Adobe PDF documents can be reused and accessed reliably, you must add tags to them. Tagging adds an underlying organizational structure, or logical structure tree, to the document. The logical structure tree refers to the organization of the document’s content, such as title page, chapters, sections, and subsections. It can indicate the precise reading order and improve navigation, particularly for longer, more complex documents, without changing the appearance of the PDF document.

For people who are not able to see or decode the visual appearance of documents, assistive technology can access the content of the document reliably by using the logical structure tree. Most assistive technology depends on this structure to convey the meaning of content and images in an alternative format, such as sound. In an untagged document, no such structure exists, and Acrobat must infer a structure based on the reading order choices in the preferences. This method is unreliable and often results in page items read in the wrong order or not read at all.
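The idea of a logical structure tree can be illustrated with a small sketch. The following Python model is not an Adobe API; the Element class, its fields, and the sample content are hypothetical, chosen only to show how a tree of tagged elements yields one unambiguous reading order when it is walked from top to bottom.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """One node of a hypothetical logical structure tree."""
    tag: str                       # for example "Article", "Story", "Figure"
    text: Optional[str] = None     # content carried by leaf nodes
    children: List["Element"] = field(default_factory=list)


def reading_order(node: Element) -> List[str]:
    """Return leaf content in depth-first, top-to-bottom tree order."""
    if not node.children:
        return [node.text] if node.text is not None else []
    ordered: List[str] = []
    for child in node.children:
        ordered.extend(reading_order(child))
    return ordered


# Sample tree (invented content): two articles, each with its own items.
report = Element("Document", children=[
    Element("Article", children=[
        Element("H1", "Annual report"),
        Element("Story", "Revenue grew in every region."),
        Element("Figure", "Chart of revenue by region"),
    ]),
    Element("Article", children=[
        Element("H1", "Outlook"),
        Element("Story", "We expect steady growth."),
    ]),
])

for line in reading_order(report):
    print(line)
```

Because the order comes from the tree rather than from positions on the page, moving a node in the tree changes what is read first without changing how the page looks.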

The tags appear on the Tags tab in Acrobat 6.0 and later, where they are nested according to the relationship definitions for the tagged elements. You cannot edit tags in Acrobat Standard. If your work requires you to work directly with tags, you should upgrade to Acrobat 9 Professional. For more information, see Acrobat Help.

Logical structure tree on the Tags tab in Acrobat 9

Note:

Tags used in Adobe PDF files can be compared to tags in HTML and XML files. To learn more about basic tagging concepts, see any of the many references and text books available in bookstores, in libraries, and on the Internet.

Understand and optimize reflow

You can reflow a PDF document to read it on handheld devices, smaller displays, or standard monitors at large magnifications, without having to scroll horizontally to read each line.

When you reflow an Adobe PDF document, some content carries into the reflowed document and some doesn’t. In most cases, only readable text carries over. Readable text includes articles, paragraphs, tables, images, and formatted lists. Content that doesn’t reflow includes forms, comments, digital signature fields, and page artifacts, such as page numbers, headers, and footers. Pages that contain both readable text and form or digital signature fields don’t reflow. Vertical text reflows horizontally.
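The reflow rules in the preceding paragraph can be sketched as a simple filter. This Python example is only an illustration under stated assumptions: the kind labels ("paragraph", "artifact", "form", and so on) and the reflow_page function are hypothetical names, and Acrobat's actual reflow logic is not documented here.

```python
from typing import Dict, List

# Content kinds that do not carry into the reflowed view (hypothetical labels).
NON_REFLOWING = {"form", "comment", "signature_field", "artifact"}
# Kinds whose presence keeps the whole page from reflowing.
PAGE_BLOCKERS = {"form", "signature_field"}


def reflow_page(items: List[Dict[str, str]]) -> List[str]:
    """Return the text that would carry into reflow for one page."""
    kinds = {item["kind"] for item in items}
    if kinds & PAGE_BLOCKERS:
        return []  # page contains form or signature fields, so nothing reflows
    return [item["text"] for item in items if item["kind"] not in NON_REFLOWING]


page = [
    {"kind": "artifact", "text": "Page 12"},
    {"kind": "paragraph", "text": "Tagging improves reflow."},
    {"kind": "comment", "text": "Check this figure."},
    {"kind": "list", "text": "First item"},
]
print(reflow_page(page))
```

In this model, the page number and the comment are dropped, and adding a single form field to the page would suppress reflow for the whole page.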

As an author, you can optimize your PDF documents for reflow by tagging them. Tagging ensures that text blocks reflow and that content follows the appropriate sequences, so readers can follow a story that spans different pages and columns without other stories interrupting the flow. The reading order is defined by the structure tree, which you can change in the Structure pane.

Headings and columns (top) reflow in a logical reading order (bottom).

Tag page items

You can tag text frames and graphics automatically or manually. After you tag page items, you can use the Structure pane to change the order of the page items by dragging elements to a new location within the hierarchy. If you change the order of the elements in the Structure pane, these changes are passed on to the Adobe PDF file. The order of the elements becomes useful when the PDF file is saved from Acrobat as an HTML or XML file. The order is also useful when you export an InDesign document for Dreamweaver (XHTML) or Digital Editions (EPUB) format.

Tag page items automatically

When you choose the Add Untagged Items command, InDesign adds tags to the Tags panel, and applies the Story and Figure tags to certain untagged page items. The Story tag is applied to any untagged text frames, and the Figure tag is applied to any untagged graphics. You can then manually apply other tags to sections of text. However, automatically tagging page items does not guarantee that the items will be structured accordingly in the exported PDF file.

  1. Choose Window > Utilities > Tags to display the Tags panel.
  2. Choose View > Structure > Show Structure to display the Structure pane, to the left of the Document window.
  3. Choose Add Untagged Items from the Structure pane menu.
    Tags in the Structure pane and Tags panel
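The rule that Add Untagged Items applies can be summarized in a short sketch. The Python below models the behavior described above, not InDesign's implementation; PageItem, its fields, and add_untagged_items are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PageItem:
    name: str
    kind: str                   # "text_frame" or "graphic"
    tag: Optional[str] = None   # None means the item is untagged


def add_untagged_items(items: List[PageItem]) -> None:
    """Tag untagged text frames as Story and untagged graphics as Figure."""
    defaults = {"text_frame": "Story", "graphic": "Figure"}
    for item in items:
        if item.tag is None and item.kind in defaults:
            item.tag = defaults[item.kind]  # existing tags are left alone


layout = [
    PageItem("headline frame", "text_frame", tag="H1"),
    PageItem("body frame", "text_frame"),
    PageItem("photo", "graphic"),
]
add_untagged_items(layout)
print([(item.name, item.tag) for item in layout])
```

Items that already carry a tag keep it, which is why manually applied tags survive a later Add Untagged Items pass in this model.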

Tag page items manually

  1. Choose Window > Utilities > Tags to display the Tags panel.
  2. Choose View > Structure > Show Structure to display the Structure pane, to the left of the Document window.
  3. Choose Add Untagged Items from the Structure pane menu.
  4. Select a page item in the document.
  5. Select a tag in the Tags panel. Note the following suggested uses for certain imported tags:

    Artifact

    The Artifact tag lets you hide page items, such as page numbers or unimportant objects, when viewing the exported PDF file in Reflow view, which displays only tagged items; see your Adobe Acrobat documentation. This is especially useful for viewing PDF files on a handheld device or in other PDF readers.

    Cell

    Use this tag for table cells.

    Figure

    Use this tag for placed graphics. The Figure tag is applied to all untagged graphics placed in your document when you choose Add Untagged Items.

    Paragraph tags (P, H, H1–H6)

    These tags have no effect on the exported PDF text when viewed in Reflow view. However, they can be useful in some situations when exporting a PDF file to HTML format.

    Story

    Use this tag for stories. The Story tag is applied to all untagged text frames when you choose Add Untagged Items. For example, suppose you have an InDesign document formatted with three paragraph styles: Head1, Head2, and Body. First, map these paragraph styles to the H1, H2, and P tags, respectively. Next, export to PDF. Finally, when you export the PDF document to HTML or XML in Acrobat, the paragraphs tagged as H1, H2, and P will display appropriately (such as with large bold letters in H1) in a web browser. For information on exporting the PDF document to HTML or XML, see your Adobe Acrobat documentation.
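The Head1, Head2, and Body example above can be sketched in code. This Python snippet is a rough illustration only: the STYLE_TO_TAG table mirrors the mapping named in the text, while the to_html function and the sample paragraphs are hypothetical and do not represent how Acrobat generates HTML.

```python
from html import escape
from typing import List, Tuple

# Paragraph style names mapped to tags, as in the example above.
STYLE_TO_TAG = {"Head1": "H1", "Head2": "H2", "Body": "P"}


def to_html(paragraphs: List[Tuple[str, str]]) -> str:
    """Render (style, text) pairs as HTML using the mapped tag names."""
    lines = []
    for style, text in paragraphs:
        tag = STYLE_TO_TAG[style].lower()
        lines.append(f"<{tag}>{escape(text)}</{tag}>")
    return "\n".join(lines)


document = [
    ("Head1", "Quarterly results"),
    ("Head2", "Sales & marketing"),
    ("Body", "Sales rose during the quarter."),
]
print(to_html(document))
```

A browser styles the resulting h1, h2, and p elements by their roles, which is the effect the mapping is meant to preserve.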

Label graphics for use with screen-reader software

If you want screen readers to describe graphical elements that illustrate important concepts in the document, you must provide the description. Figures and multimedia aren’t recognized or read by a screen reader unless you add alternate text to the tag properties.

The Alt text attribute lets you create alternate text that can be read in lieu of viewing an illustration. ActualText is similar to Alt text in that it appears in lieu of an image. The ActualText attribute lets you substitute text for an image that is part of a word, such as a decorative image used as a drop cap. In that case, the ActualText attribute allows the drop cap letter to be read as part of the word.

When you export to Adobe PDF, the Alt text and ActualText attribute values are stored in the PDF file and can be viewed in Acrobat 6.0 and later. This alternate text information can then be used when the PDF file is saved from Acrobat as an HTML or XML file. For more information, see your Adobe Acrobat documentation.

  1. If necessary, choose View > Structure > Show Structure to display the Structure pane, and choose Window > Utilities > Tags to display the Tags panel.
  2. Choose Add Untagged Items from the Structure pane menu.
  3. To make sure the image is tagged as Figure, select the image, and then select Figure in the Tags panel.
  4. Select the Figure element in the Structure pane, and then choose New Attribute from the Structure pane menu.
  5. For Name, type either Alt or ActualText (this feature is case-sensitive).
  6. For Value, type the text that will appear instead of the image.
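The difference between Alt and ActualText can be shown with a small model of what a screen reader might speak. This Python sketch is an assumption-laden illustration: the dictionary layout, the bracketed rendering of Alt text, and the spoken_text function are all hypothetical, and real screen readers behave differently in detail. Attribute names are matched exactly, which mirrors the case sensitivity noted in step 5.

```python
from typing import Dict, List


def spoken_text(elements: List[Dict[str, str]]) -> str:
    """Assemble the text a hypothetical screen reader would read aloud."""
    pieces = []
    for el in elements:
        if el.get("tag") == "Figure":
            if "ActualText" in el:
                # Replaces the image inline, so it can complete a word.
                pieces.append(el["ActualText"])
            elif "Alt" in el:
                # Read as a separate description of the image.
                pieces.append(f" [{el['Alt']}] ")
            # A Figure with neither attribute contributes nothing.
        else:
            pieces.append(el.get("text", ""))
    # Collapse runs of whitespace introduced by the Alt padding.
    return " ".join("".join(pieces).split())


story = [
    {"tag": "Figure", "ActualText": "O"},         # ornate drop cap image
    {"tag": "P", "text": "nce upon a time."},
    {"tag": "Figure", "Alt": "Castle on a hill"},
    {"tag": "Figure", "alt": "ignored"},          # wrong case, not recognized
]
print(spoken_text(story))
```

In this model the drop cap joins the following text to form the word "Once", while the castle image is announced as a standalone description.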

Group page items into an Article element

Use the Structure pane to logically group page items into an Article element. For example, if a set of stories spans multiple pages, you can create an umbrella element that will contain these stories in a single group. These umbrella elements are called structural elements. You can also name your grouped articles.

Note:

You cannot tag grouped page items.

  • To group page items, select New Element from the Structure pane menu, select the Article element in the Tags panel, and then drag page elements underneath it in the Structure pane.
  • To name grouped items, right-click the Article element in the Structure pane and choose New Attribute. For Name, type Title. For Value, type the name of the article you want to use.
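The resulting structure can be pictured as nested elements with a Title attribute on the Article. The following Python sketch builds such a tree with the standard library's xml.etree.ElementTree module; the group_into_article helper, the Root element name, and the sample content are hypothetical and are not produced by InDesign.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def group_into_article(
    root: ET.Element, title: str, items: List[Tuple[str, str]]
) -> ET.Element:
    """Create an Article element under root and place tagged items inside it."""
    article = ET.SubElement(root, "Article", {"Title": title})
    for tag, text in items:
        child = ET.SubElement(article, tag)
        child.text = text
    return article


root = ET.Element("Root")
article = group_into_article(
    root,
    "Spring feature",
    [("Story", "Part one of the feature."), ("Figure", "Opening photo")],
)
print(ET.tostring(root, encoding="unicode"))
```

The Article acts as the umbrella element: its children stay in the order given, and the Title attribute carries the name of the group.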

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.
