content-tag-invalid
Ensure meaningful content in PDFs is properly tagged, with relevant attributes, and mark non-meaningful content as artifacts.
Description
All meaningful content in PDFs needs to be properly tagged with relevant attributes. Similarly, decorative and non-meaningful content should be marked as artifacts.
Such clear tagging helps assistive technologies like screen readers to identify elements accurately.
You can follow these guidelines to comply with this rule:
- Real content must be in proper tags
  - Every piece of real content (text, images, tables, etc.) must be placed inside a meaningful structure element (tag).
  - These tags help define what the content is (e.g., a paragraph, table, figure).
  - The definitions of these tags come from the PDF standard (ISO 32000-2:2020).
- Some tags are just for grouping, not meaning
  - Some structure elements, like Div and Span, don’t have specific meanings by themselves.
  - Div groups multiple elements together.
  - Span is used to apply formatting (like bold or italic) to specific text.
- Attributes must match the context
  - Some attributes (extra details added to tags) are useful only in certain situations.
  - For example, the Placement attribute helps position images:
    - If an image is inside a paragraph, it should have an inline placement.
- Content should be tagged logically, not based on layout
  - The document’s meaning should be preserved even if it’s split across pages or columns.
  - Example: A long paragraph that flows across two pages should still be in a single P (paragraph) tag, not split into two separate paragraph tags.
- Tables and other elements should stay together
  - A table that spans multiple pages should be in one Table tag, not broken into separate table tags on each page.
- Not all images are figures
  - Just because something looks like an image doesn’t mean it should always be tagged as a figure.
  - Example: If an image only contains text (like a screenshot of a word), it can be inside a <span> tag with an ActualText attribute instead of a Figure tag.
- Artifacts and real content
  - Some parts of a PDF (like decorative images, footers, or page numbers) are not considered real content.
  - These elements should be marked as artifacts so they don’t interfere with accessibility tools like screen readers.
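
The split between real content and artifacts in the guidelines above can be sketched as a small check. This is a minimal illustration, not a PDF parser: the `ContentItem` record, the `KNOWN_TAGS` subset, and the function names are hypothetical, and the sketch assumes the content of a page has already been extracted into such records by some other tool.

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative subset of structure types, not the full list from ISO 32000-2.
KNOWN_TAGS = {"P", "Table", "Figure", "Div", "Span"}


@dataclass
class ContentItem:
    """Hypothetical record for one piece of page content."""
    description: str
    tag: Optional[str] = None   # enclosing structure element, if any
    is_artifact: bool = False   # True when the content is marked as an artifact


def check_item(item: ContentItem) -> Optional[str]:
    """Return a problem message for the item, or None if it looks fine."""
    if item.is_artifact and item.tag is not None:
        # Assumption of this sketch: content is either tagged or an artifact, never both.
        return f"{item.description}: marked as an artifact but also tagged as {item.tag}"
    if item.is_artifact:
        return None
    if item.tag is None:
        return f"{item.description}: real content is not inside any tag"
    if item.tag not in KNOWN_TAGS:
        return f"{item.description}: unrecognized tag {item.tag}"
    return None


def check_items(items: List[ContentItem]) -> List[str]:
    """Collect the problem messages for every item that fails a check."""
    return [msg for msg in map(check_item, items) if msg is not None]


if __name__ == "__main__":
    sample = [
        ContentItem("body text", tag="P"),
        ContentItem("page number", is_artifact=True),
        ContentItem("sales chart"),  # real content with no tag
    ]
    for problem in check_items(sample):
        print(problem)
```

The check only separates the two categories; deciding whether a given image or footer is real content still requires a human judgment about its meaning.
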
Examples
The following table illustrates a few examples of incorrect and correct implementation of the content-tag-invalid rule:
Example | Incorrect | Correct |
---|---|---|
Table spanning pages | Splitting a single table into multiple table elements across pages. | Keeping a table spanning multiple pages inside a single table element. |
Heading structure | Skipping heading levels (e.g., using h1 → h3 instead of h1 → h2 → h3). | Using a logical sequence for headings, maintaining a proper hierarchy. |
List tagging | Using paragraphs to represent lists (e.g., “- Item 1” in a <p> tag). | Using ul (unordered) or ol (ordered) elements for lists. |
Figure captioning | Placing a caption outside the figure element in a separate paragraph. | Using the figcaption element inside the figure for proper association. |
Quotations | Using <p> tags for quoted text instead of a dedicated blockquote element. | Wrapping quoted text inside a blockquote with proper citation. |
Emphasis | Using <b> and <i> tags instead of <strong> and <em>. | Using strong for strong emphasis and em for mild emphasis. |
Image tagging | Using the figure tag for an image that is purely textual (e.g., a company logo). | Using a <span> with actualText to indicate the correct text representation. |
Inline and block elements in links | Placing a block-level element (like <div>) inside an <a> tag. | Ensuring only inline elements (like text or <span>) are inside <a> tags. |
Paragraph spanning pages | Splitting a paragraph into multiple <p> tags across pages. | Keeping a paragraph inside a single <p> tag even if it spans multiple pages. |
Text attributes | Using normal text for superscripts and strikethrough without attributes. | Using textPosition="sup" for superscript and textDecorationType="lineThrough" for strikethrough. |
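
The heading-structure row in the table can be checked mechanically once the heading levels are known. The sketch below assumes the levels have already been read from the document's tags, in reading order, as plain integers; the function name `find_skipped_levels` is hypothetical.

```python
from typing import List, Tuple


def find_skipped_levels(levels: List[int]) -> List[Tuple[int, int]]:
    """Return (previous, current) pairs where a heading jumps down more than one level."""
    skips = []
    previous = None  # no heading seen yet, so the first heading is never flagged
    for level in levels:
        if previous is not None and level > previous + 1:
            skips.append((previous, level))
        previous = level
    return skips


if __name__ == "__main__":
    # h1 -> h3 skips h2; going back up from h3 to h2 is fine.
    print(find_skipped_levels([1, 3, 2, 3]))
```

Moving back to a higher-level heading (for example from h3 to h2) is not treated as a skip, since a new section can start at any shallower level.
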
How to fix
Follow these steps to fix any violations in the content-tag-invalid rule:
- Ensure all real content is tagged correctly based on its meaning, not just appearance.
- Use appropriate attributes depending on the context (e.g., textDecorationType for strikethrough text).
- Keep logically connected content together:
  - Use one <table> tag for a table spanning multiple pages.
  - Use one <p> tag for a paragraph across columns or pages.
- Mark non-real content as artifacts.
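
To find tables that may have been split at a page break, one rough heuristic is to look for two Table elements that sit next to each other in the structure, where the second starts on the page right after the first ends. The sketch below is only that heuristic: the `StructElem` record and `find_possible_table_splits` are hypothetical names, the page numbers are assumed to come from whatever tool reads the tags, and every hit still needs a manual review because two separate tables can legitimately follow each other.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StructElem:
    """Hypothetical record for one structure element, in reading order."""
    tag: str
    first_page: int
    last_page: int


def find_possible_table_splits(elems: List[StructElem]) -> List[int]:
    """Return indexes of Table elements directly followed by a Table on the next page."""
    candidates = []
    for index in range(len(elems) - 1):
        current, following = elems[index], elems[index + 1]
        if (
            current.tag == "Table"
            and following.tag == "Table"
            and following.first_page == current.last_page + 1
        ):
            candidates.append(index)
    return candidates


if __name__ == "__main__":
    structure = [
        StructElem("P", 1, 1),
        StructElem("Table", 1, 1),
        StructElem("Table", 2, 2),  # likely a continuation of the previous table
        StructElem("P", 2, 2),
    ]
    print(find_possible_table_splits(structure))
```

A correctly tagged multi-page table appears here as a single element whose `first_page` and `last_page` differ, so it is never reported.
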
Additional resources to fix issues
If you have created your PDFs using common tools like MS Word, Google Docs, or Adobe Acrobat, here are some resources to fix this issue in your tool of choice:
Adobe Acrobat
Adobe Acrobat Pro Accessibility Guide:
This guide explains how to tag elements (like headings, paragraphs, tables, and images), mark artifacts, and check accessibility.
Microsoft Word
Create Accessible PDFs from Word:
This guide explains how to use Word’s built-in accessibility checker and export a properly tagged PDF.
Add Alt Text to Images in Word:
Adding alt text ensures images are properly tagged when exported to PDF.
Google Docs
This guide explains how to use headings, alt text, and other features to create accessible documents.
References
- ISO 14289-2:2024, Section 8.2.2 (Real content)
- ISO 32000-2:2020, Section 14.8 (PDF structure elements)
- ISO/TS 32005 (PDF 1.7 and PDF 2.0 structure rules)
- ISO 32000-2:2020, Section 14.8.2.2.1 (Definition of real content)