At BookNet, metadata is at the core of what we do. With so much of it flowing in and out of our products and services, we’ve seen it all. In this blog series, we share each of the common issues addressed in the Tech Forum presentation Improving your metadata: Common issues and how to fix them, in which eight BookNetters walk you through some of the most common issues we see in publishers’ metadata, highlight which metadata standards are recommended for each case, and show you how to fix these problems.
Why is this important? Accurate, high-quality metadata ensures your books are seen and that they succeed in today’s competitive marketplace. Join us as we help you optimize your metadata and unlock its full potential!
The issue: Display issues caused by rogue <div> tags
Image source: gunshowcomic.com/648
When HTML isn't implemented according to ONIX standards, your carefully crafted book, contributor, and/or biographical descriptions might appear perfectly formatted in your own systems, but downstream partners may see broken layouts, missing text, or other display problems.
Why is this an issue?
ONIX allows only a specific subset of HTML tags to be used within certain metadata fields. Using tags outside this approved list can cause problems for systems that process your metadata and throughout the supply chain. This may lead to:
Display issues on platforms like CataList
Display issues among key data recipients, including retailers and libraries
Text not being indexed correctly for search
Content cut-off or formatting that makes it hard to read or inaccessible
What BookNet recommends
1. Stick to approved HTML/XHTML tags:
The ONIX 3.0 Implementation and Best Practice Guide lists recommended, allowed, and disallowed tags. Strongly preferred tags include:
<p> and <br/> for paragraphs and line breaks within paragraphs
<i> or <em> for italics
<b> or <strong> for bold
<cite> for book titles
<ul>, <ol>, and <li> for bulleted and numbered lists
<sub> and <sup> for subscript and superscript
<dl> <dt> and <dd> for definition lists
<ruby>, <rb>, <rp> and <rt> for simple glosses in Mandarin, Cantonese, Japanese, and other text.
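To spot tags like a rogue <div> before your data goes out, a quick automated check can help. The sketch below is a minimal example using only Python's standard library; the names PREFERRED_TAGS and find_unlisted_tags are our own, and the set contains only the strongly preferred tags listed above. A tag it reports is not necessarily disallowed, so check the Guide's full lists before removing anything.

```python
from html.parser import HTMLParser

# Assumption: only the strongly preferred tags named in this post.
# The ONIX Guide also lists further allowed tags that are not included here.
PREFERRED_TAGS = {
    "p", "br", "i", "em", "b", "strong", "cite",
    "ul", "ol", "li", "sub", "sup", "dl", "dt", "dd",
    "ruby", "rb", "rp", "rt",
}


class TagCollector(HTMLParser):
    """Records the name of every start tag seen in a fragment."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        self.tags.append(tag)


def find_unlisted_tags(fragment):
    """Return the sorted tag names that are not in PREFERRED_TAGS."""
    collector = TagCollector()
    collector.feed(fragment)
    collector.close()
    return sorted(set(collector.tags) - PREFERRED_TAGS)


description = '<div class="blurb"><p>A <span>gripping</span> debut.<br/></p></div>'
print(find_unlisted_tags(description))  # prints ['div', 'span']
```

Running a check like this over every description field in a feed gives you a short list of records to review by hand.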
2. Use markup only where necessary:
Not every ONIX element accepts XHTML, so it’s important to use markup thoughtfully. In some cases, you may want to avoid it altogether. For example, if retailers use the text in a field for indexing, removing the markup can help ensure the text is parsed correctly.
Alternatively, you can use display-oriented fields in your ONIX data, such as <ContributorStatement> or <TitleStatement>. These are useful if your book information requires specific formatting beyond standard conventions. Otherwise, leaving these fields free of HTML/XHTML will help retailer and library systems index and parse your data as intended.
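If you decide to remove markup from a field, doing it by hand is error-prone. Here is a minimal sketch of one way to strip tags while keeping the words separated; it assumes the input is a simple HTML/XHTML fragment, and the names TextExtractor, BREAK_TAGS, and strip_markup are hypothetical helpers of our own rather than part of any ONIX tooling.

```python
from html.parser import HTMLParser

# Assumption: tags that should leave a space behind so words do not run together.
BREAK_TAGS = {"p", "br", "li", "dt", "dd"}


class TextExtractor(HTMLParser):
    """Collects text content and drops all markup."""

    def __init__(self):
        # convert_charrefs=True turns references such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BREAK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in BREAK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(fragment):
    """Return the fragment's text with tags removed and whitespace collapsed."""
    extractor = TextExtractor()
    extractor.feed(fragment)
    extractor.close()
    return " ".join("".join(extractor.parts).split())


marked_up = "<p>First line.<br/>Second <em>line</em>.</p><p>Third &amp; last.</p>"
print(strip_markup(marked_up))  # prints: First line. Second line. Third & last.
```

Inline tags such as <em> are dropped without adding a space, so emphasis in the middle of a word does not split it.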
3. Follow these key rules:
Make sure text is contained within paragraph tags (<p> … </p>)
Avoid copying content directly from word processors, PDFs, or websites
Ensure your characters are in UTF-8 encoding
Keep tags lowercase (XHTML is case-sensitive)
Avoid HTML attributes (like style attributes)
Always match your tags correctly and maintain proper nesting
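Several of these rules can be checked automatically. The sketch below is one possible approach, not an official validator: it parses a fragment with Python's built-in XML parser, which rejects unmatched or badly nested tags, and then flags uppercase tags, attributes, and bare text at the top level. The function name check_fragment and the wording of its messages are our own. It does not check encoding or spot content pasted from a word processor.

```python
import xml.etree.ElementTree as ET


def check_fragment(fragment):
    """Return a list of problems found in an XHTML fragment (empty if none)."""
    try:
        # Wrap in a dummy root so a fragment with several <p> elements parses.
        root = ET.fromstring(f"<root>{fragment}</root>")
    except ET.ParseError as err:
        # Unclosed, mismatched, or wrongly nested tags end up here.
        return [f"not well-formed: {err}"]

    problems = []

    # Text directly under the dummy root is not inside <p> or any other tag.
    loose = [root.text] + [child.tail for child in root]
    if any(text and text.strip() for text in loose):
        problems.append("bare text outside any element")

    for element in root.iter():
        if element is root:
            continue
        if element.tag != element.tag.lower():
            problems.append(f"tag not lowercase: <{element.tag}>")
        if element.attrib:
            names = ", ".join(sorted(element.attrib))
            problems.append(f"attributes on <{element.tag}>: {names}")

    return problems


print(check_fragment("<p>Fine <em>text</em>.</p>"))   # prints []
print(check_fragment('<P style="color:red">Hi</P>'))
print(check_fragment("<p>Unclosed <b>bold</p>"))
```

The second call reports the uppercase tag and the style attribute; the third reports that the fragment is not well-formed, because <b> is never closed.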
4. Consider using XHTML instead of HTML:
XHTML is the most reliable option. It offers better structure than HTML because it requires properly closed tags, correct nesting, and lowercase tag names. This helps prevent parsing errors in the automated systems that process ONIX data.
Need help? Don't hesitate to reach out to the BookNet team with specific questions.
The complete slide deck and transcript from this session are available here.
To stay up to date, subscribe to our weekly newsletter, eNews, where we share news about upcoming events and webinars, infographics and new research, updates on industry standards, links round-ups, and more.