BookNet Canada

Tom Richardson
November 16, 2009
BiblioShare, ONIX, Standards & Metadata

Data Exchange Tip #5: Some Basics—Tools Before Validation

An XML file is simply text, nothing very special, except that in order for XML software to read and interpret it, everything needs to be just so. XML is, loosely, a computer formatting language, and as such is a low type of computer code: not quite as finicky as a proper programming language, but with much stricter rules than HTML.

Every part of the structure of the file, and aspects of the contents, must match two defining documents. An ONIX file is validated by using XML software to compare your file against the rules of the XML standard (www.W3.org) and the schema written by the ONIX developers at Editeur (www.editeur.org). So an ONIX validation applies both the rules common to all XML documents and the rules specific to the ONIX data exchange standard, and validation errors might come from either. You shouldn’t confuse the XML validation process with the Certification report generated by BiblioShare. Every file accepted into BiblioShare, after it passes the XML validation discussed here, gets a quality assessment that looks for data issues. This is a distinct and separate process from XML validation.
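To make the first of those two checks concrete, here is a minimal sketch in Python, using only the standard library, that tests whether a piece of XML is well-formed. It covers the general XML rules only and does not check anything against the ONIX schema; the function name and the sample records are made up for illustration. A real ONIX file that uses named character entities defined in the ONIX DTD may be reported as an error by this simple check, because this parser does not read the DTD.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses as XML, else (False, message).

    This checks well-formedness only (tags closed and nested properly);
    it does NOT validate against the ONIX schema or DTD.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The parser's message includes the line and column of the problem.
        return False, str(err)
    return True, None


# Hypothetical samples: the second one never closes <RecordReference>.
good = "<ONIXMessage><Product><RecordReference>1</RecordReference></Product></ONIXMessage>"
bad = "<ONIXMessage><Product><RecordReference>1</Product></ONIXMessage>"

print(check_well_formed(good))
print(check_well_formed(bad))
```

The line and column in the error message are what you would take to a text editor’s “Go To Line” feature to find the problem.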

You should probably research and try to understand as much about XML as time, energy and inclination allow; you’ll be happier producing ONIX if you do, and possibly more comfortable using EPUB too. There are good resources on the web: Wikipedia and www.w3schools.com/XML/ are recommended as a start.

What Gets Validated?

The ONIX file, the file you send to BiblioShare and your other trading partners, is what gets validated. In solving validation problems you might make corrections to your original dataset and re-output the ONIX file, or you might just manually modify the file itself, but it’s the file, whatever.xml, that we’re working with here. Validation is always the last step: before XML files are sent to anyone, they should be checked.

Taking Stock

First off: do you have any XML software, such as an XML editor or a development suite like XML Spy or oXygen? If you’ve inherited this job, look at your program list and ask! It can’t hurt, and you may as well use what you’ve got or paid for. I will be recommending some specific software, and one is free, but there’s nothing special about it. You should consider getting more than one validation tool (you can never have enough validation).

You can find software through a web search on “XML editor” or look at the “XML Resources” at O’Reilly’s www.XML.com.

Text Editor

As noted above, an XML file is just text, and XML files can be opened in a text editor. If you’ve got an XML editor you might use that, as it’s designed for the work, but you absolutely can view and edit an ONIX file in a text editor. Your only concern is ensuring the editor does not change the file. For example, you can use MS-Word to open an XML file, but don’t do it!! Word is set up to “help” you run XSLT transformation scripts and will make any number of assumptions and changes to the file content, none of which will be good for our purpose of using the XML standard to exchange data. (This warning about software changing files applies to a lot of XML software. Until you’re sure it doesn’t, assume any software might be making changes to an XML file.)

What you want to use is as simple a text editor as possible: Notepad or Wordpad on a PC, TextEdit or SimpleText on a Mac. Make gentle changes to the text you can see, and save it without rendering the document unreadable to XML software. Really, you should just use the keyboard or cut and paste text, and exit using the most straightforward options. If you are forced to choose a format option on saving, first try the one labeled “ASCII US” or “ASCII text”.

There are a number of text editors designed to be used by programmers. They tend to have better “Go To Line” features, are usually tag sensitive (you’ll understand that when you see it; it’s very handy in XML), and they don’t muck with the code. I’m fond of Notepad2, http://www.flos-freeware.ch/notepad2.html, but Notepad++, http://notepad-plus.sourceforge.net/uk/site.htm, might be worth checking out.

As always all work should be done on a copy of your ONIX file—experiment but don’t trash your work.
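One way to act on both pieces of advice (work on a copy, and don’t trust software not to change the file) is sketched below in Python. It makes a working copy and compares checksums to see whether a program has altered the file. The function names and the “_copy” naming are my own choices for illustration, and the demonstration runs on a throwaway file in a temporary folder rather than a real ONIX file.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_digest(path):
    """Return a SHA-256 checksum of the file's exact bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def make_working_copy(original):
    """Copy whatever.xml to whatever_copy.xml in the same folder."""
    original = Path(original)
    copy = original.with_name(original.stem + "_copy" + original.suffix)
    shutil.copy2(original, copy)
    return copy


# Demonstration on a temporary stand-in file.
with tempfile.TemporaryDirectory() as folder:
    original = Path(folder) / "whatever.xml"
    original.write_bytes(b"<ONIXMessage></ONIXMessage>\n")

    copy = make_working_copy(original)
    unchanged = file_digest(copy) == file_digest(original)

    # Simulate software that silently rewrites line endings on save.
    copy.write_bytes(copy.read_bytes().replace(b"\n", b"\r\n"))
    changed = file_digest(copy) != file_digest(original)

print("copy identical before editing:", unchanged)
print("copy differs after a silent change:", changed)
```

If you open the copy in a new piece of software, save it without editing anything, and the checksum no longer matches the original, the software changed something you did not ask it to.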

PC vs Mac

Macs are better for a lot of things, but you have more options (and more free options) for XML software on a PC. If your Mac has a Windows emulation or operating system boot area, any PC solution should work. Mac solutions are typically Java based, and there’s nothing wrong with that (PC software usually relies on the .NET Framework), but they are more likely to have fees associated with them.

I would really appreciate feedback from Mac users as I’m not very familiar with what’s out there. oXygen seems to be the clear favorite but I’m sure there’s some good freeware too.

File Size

XML software is typically processor intensive and requires a lot of RAM. Some software fails at large file sizes, and almost all of it will be more difficult to use when handling large files. You’ll find it faster and easier to understand if you do this on a smaller file (below 1000 records, and below 100 records would be even better), at least while doing your first validations. When you’re familiar with the software and its responses, try using larger sizes: most XML software has an upper limit at which it becomes unresponsive. How would you know if you haven’t done it successfully?
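If you are not sure how many records a file holds, a rough count is easy to get. The following is a minimal sketch in Python that counts opening product tags in the text of a file; the function name and sample data are invented for illustration, and it assumes product records use either the reference tag Product or the short tag product.

```python
import re

# Matches an opening <Product> or <product> tag, but not longer tag
# names that merely start the same way, such as <ProductIdentifier>.
PRODUCT_OPEN = re.compile(r"<(?:Product|product)\b")


def count_products(onix_text):
    """Return a rough count of product records in the ONIX text."""
    return len(PRODUCT_OPEN.findall(onix_text))


# Hypothetical three-record sample.
sample = (
    "<ONIXMessage>\n"
    "<Product><ProductIdentifier><IDValue>1</IDValue></ProductIdentifier></Product>\n"
    "<Product><ProductIdentifier><IDValue>2</IDValue></ProductIdentifier></Product>\n"
    "<Product><ProductIdentifier><IDValue>3</IDValue></ProductIdentifier></Product>\n"
    "</ONIXMessage>\n"
)

total = count_products(sample)
print(total, "product records")
```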

How do you cut a file down to size? Open the ONIX file in a text editor and remove individual product records: start with an opening <Product> tag (or <product> for short tags) and include everything through the corresponding closing </Product> tag (or </product>). So long as you remove whole product records (<Product> to </Product>) and leave the other tags alone, you can take out as many as you want.
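The same trimming can be scripted. Here is a minimal sketch in Python that keeps the first few product records and removes the rest, leaving every other tag untouched. It treats the file as plain text, just as you would in a text editor. The function name and sample message are hypothetical, and it assumes product records are never nested inside one another.

```python
import re

# One whole product record, from its opening tag to the matching closing
# tag, plus any whitespace after it. Works for <Product> or <product>.
PRODUCT_RE = re.compile(r"<(Product|product)\b[^>]*>.*?</\1>\s*", re.DOTALL)


def keep_first_products(onix_text, keep):
    """Return onix_text with all product records after the first `keep` removed."""
    seen = 0

    def drop_extra(match):
        nonlocal seen
        seen += 1
        # Keep the record unchanged while under the limit, else delete it.
        return match.group(0) if seen <= keep else ""

    return PRODUCT_RE.sub(drop_extra, onix_text)


# Hypothetical sample message with three records.
sample = (
    "<ONIXMessage>\n"
    "<Header><FromCompany>Example Press</FromCompany></Header>\n"
    "<Product><RecordReference>A</RecordReference></Product>\n"
    "<Product><RecordReference>B</RecordReference></Product>\n"
    "<Product><RecordReference>C</RecordReference></Product>\n"
    "</ONIXMessage>\n"
)

small = keep_first_products(sample, 2)
print(small)
```

Run something like this on a copy of your file, and validate the trimmed result as you would any other ONIX file.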

Internet Access

XML software usually needs internet access to work, so do this on a computer that is connected to the internet.

The ONIX Documentation

It’s big, it’s dull and you need it on your computer: www.editeur.org ONIX / ONIX for Books / Previous releases / Release 2.1 Downloads / Download Release 2.1 format specifications. You’ll need to get the current release, so I’ve not provided a direct link. Having a copy of the Product Manual and the Message Specifications is invaluable. The PDF is linked to the code lists and it’s the easiest way to look up something.

Tagged: xml, data exchange tips
