This is a companion piece (a temporary one in terms of need, I hope) to our post on producing "born accessible" books. That post provides all the information a publisher needs to actually supply metadata on accessibility. This post looks at the data available, provides a possible explanation for the source of problems, and tries to explain the design intent in the ONIX standard.
Accessibility values are my focus here. I'm starting out at such a high level because, when we started researching accessibility metadata use in BiblioShare, what we found was either wrong or absent. There's no use of the Accessibility Code List 196 values in either ONIX 2.1 or 3.0, and what data we had from either version was of exceptionally poor quality. None of it was worth referencing, at least for accessibility purposes.
Notes are the key
It's my belief that the problem is simple: Metadata creators (or users of systems that create ONIX output) don't reference the ONIX code lists' Note section for clarification on how to properly use the codes.
Software and service providers (and I include BookNet Canada's Webform in this) are aware of and program based on what's in the notes, but they don't provide easy access to those notes for their users. I think developers would argue that the UI involved in providing them would be complex and unwieldy, and that users only need to reference the notes when making initial decisions; for day-to-day use the short descriptions available in their drop-downs are enough.
I understand that argument but I think that users go to their software, find ambiguous data entry options, and don't know it's as simple as looking at the ONIX code list notes for explanations. For the record, in Webform's case, we provide a separate document with access to full code lists and highlight it in the help menu. I've had a chance to use other ONIX software and I've never seen any obvious "code list note" solution, but maybe other ONIX software sources have something similar? I've never actually looked, have I? Who does? I doubt many Webform users have.
The takeaway: Regardless of what the software folk have done, you've got to reference the full code list and its notes to decide what to do when using whatever system you're using. Here's some guidance on using EDItEUR's documentation and code lists.
Product Form Feature and how to use it for accessibility purposes
Wily EDItEUR is always clear, so something called Product Form Feature must be used to describe features specific to a product sold to consumers. Product Form Feature (allow me to call it PF-Feature hereafter) is where you find support for things like the coloured end papers featured in some bibles; forestry and tree-friendly "green product" certifications; and, for digital products, accessibility features and the EPUB version numbers that accessibility support depends on.
PF-Feature is intended to be a container (a.k.a., composite) that describes features. The Note column is important in any part of ONIX but it's particularly important for the PF-Feature. PF-Feature Type's source is ONIX Code List 79 and about half of those codes reference 10 (!) other code lists in the notes (not in the short description provided in software drop-downs).
An example: When you use PF-Feature Type (List 79) Code 10 to denote the EPUB version number, you're told to use full version numbers, and it's strongly recommended that you use Code 15 to provide that data instead. Code 15 will tell you to use ONIX Code List 220 as the value. To be clear: The PF-Feature Value, when using PF-Feature Type=15, is a code taken from ONIX Code List 220 "E-publication Version Number."
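To make the Type/Value pairing concrete, here's a minimal sketch of building that composite with Python's standard library. The element names are the ONIX 3.0 reference tags as I understand them; the helper function is my own invention, and the value "101B" is an illustrative stand-in that you should confirm against the current List 220 before using.

```python
import xml.etree.ElementTree as ET


def product_form_feature(feature_type: str, feature_value: str) -> ET.Element:
    """Build one <ProductFormFeature> composite (hypothetical helper)."""
    composite = ET.Element("ProductFormFeature")
    # The Type is a code from List 79.
    ET.SubElement(composite, "ProductFormFeatureType").text = feature_type
    # What the Value means depends entirely on the Type chosen above.
    ET.SubElement(composite, "ProductFormFeatureValue").text = feature_value
    return composite


# Type 15 = coded EPUB version, so the Value must be a List 220 code.
# "101B" is an assumed, illustrative value; look up the real one in List 220.
epub_version = product_form_feature("15", "101B")
xml_text = ET.tostring(epub_version, encoding="unicode")
print(xml_text)
```

The point of the sketch is only that the Value has no meaning by itself: the same string under a different Type would be read against a different code list, or as free text.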
Think about that in terms of the code lists for a second. Your drop-down only shows the description (minus the notes). When you choose PF-Feature Type=15, the Value displayed in your software becomes the description (minus the notes) from a code list you're probably not told anything more about. If the note provides clarity and you've dropped down two layers without them, how do you know what you're choosing?
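One way software could keep the notes attached is sketched below. This is an assumption about how a code list might be modelled, not how Webform or any other product actually does it. The labels and notes are my paraphrases of the three List 79 entries discussed here, not EDItEUR's official wording.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CodeEntry:
    code: str
    label: str                 # short description, i.e. what a drop-down shows
    note: str                  # paraphrased guidance, NOT the official note text
    value_list: Optional[str]  # code list the Value must come from, if any


# Hypothetical in-memory slice of List 79 covering only the codes in this post.
LIST_79 = {
    "09": CodeEntry("09", "E-publication accessibility detail",
                    "Value is a code from List 196.", "196"),
    "10": CodeEntry("10", "E-publication format version",
                    "Free-text version number; Code 15 is strongly recommended.", None),
    "15": CodeEntry("15", "E-publication format version code",
                    "Value is a code from List 220.", "220"),
}


def describe(code: str) -> str:
    """Return the drop-down label together with the note and value source."""
    entry = LIST_79[code]
    source = f"List {entry.value_list}" if entry.value_list else "free text"
    return f"{entry.code} {entry.label} | value: {source} | note: {entry.note}"


print(describe("15"))
```

A drop-down that shows only `label` throws away the two fields a user most needs when choosing a value.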
A step back again: Why is a code list recommended over the free text, EPUB-version numbering offered by PF-Feature Type=10? Accessibility functionality: The EPUB version affects e-reader functionality options so it needs accurate support. Free text entries, even when supplied accurately, will never have the consistency that a code offers. Consistency means that retailers can build support to help people with disabilities isolate products. Consistency means a librarian can search and select only records that meet certain criteria.
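A tiny sketch of the consistency problem, under stated assumptions: the ".00" variant is one we actually saw in the data, the other free-text spellings are hypothetical, and "101B" is again an illustrative List 220 code to be checked against the list itself.

```python
# Four records that all mean "EPUB 3", expressed as free text.
# "3.00" mirrors a variant reported in this post; "3.0" and "EPUB 3" are hypothetical.
free_text_values = ["3", "3.00", "3.0", "EPUB 3"]

# The same four records expressed as a code (illustrative value, verify in List 220).
coded_values = ["101B", "101B", "101B", "101B"]


def exact_matches(values, wanted):
    """Return the values a simple equality filter would select."""
    return [v for v in values if v == wanted]


free_hits = exact_matches(free_text_values, "3")
coded_hits = exact_matches(coded_values, "101B")
print(len(free_hits), "of", len(free_text_values), "free-text records found")
print(len(coded_hits), "of", len(coded_values), "coded records found")
```

A retailer can of course write fuzzier matching for free text, but every trading partner would have to guess at the same set of variants.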
Actual use in ONIX 2.1 and 3.0
BiblioShare has around 490,000 ONIX 2.1 digital book records and 80,000 ONIX 3.0 records. Most of these records are created by large data houses that handle major American publishers. The data is representative of what's available in the North American supply chain.
In terms of the transition from ONIX 2.1 to 3.0, the PF-Feature Type Codes 10 and 15 are intended for ONIX 3.0 only, as ONIX 2.1 uses the tag EpubTypeVersion. But 2.1's version data does map to 3.0's PF-Feature Type 10. Remember that Type 10 is available for use but carries the strong recommendation to supply the data in Code 15, referencing ONIX Code List 220.
Of the half million digital ONIX 2.1 records in BiblioShare's data (all types and formats), 80,000 entries are using 2.1's EpubTypeVersion. All of them (with three or four exceptions) carry the value "2" or "3." Looking at 3.0 data finds about 40,000 records supplying the PF-Feature Type Code 10 as "2" or "3." The exceptions? They add .00.
That data is no help for identifying accessibility functionality, so what data is using the recommended version Code 15 in ONIX 3.0? The strongly recommended code list–based data? Nothing. No use.
What's available as PF-Feature Type code 09 referencing Accessibility ONIX Code List 196? Nothing. No use. (Well, there's one instance of "09" being used but no value was supplied. My guess is the person doing the data entry didn't understand the secondary code list and their software gave no support.)
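If you want to run the same kind of check on your own ONIX 3.0 output, a rough sketch follows. It is not the query we ran against BiblioShare. It assumes reference tag names with no XML namespace (real files often declare one, or use short tags), and the sample message is invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented two-record sample: one free-text version, one Type 09 with no value.
SAMPLE = """<ONIXMessage>
  <Product>
    <DescriptiveDetail>
      <ProductFormFeature>
        <ProductFormFeatureType>10</ProductFormFeatureType>
        <ProductFormFeatureValue>3</ProductFormFeatureValue>
      </ProductFormFeature>
    </DescriptiveDetail>
  </Product>
  <Product>
    <DescriptiveDetail>
      <ProductFormFeature>
        <ProductFormFeatureType>09</ProductFormFeatureType>
      </ProductFormFeature>
    </DescriptiveDetail>
  </Product>
</ONIXMessage>"""

WATCHED_TYPES = {"09", "10", "15"}  # accessibility, free-text version, coded version


def audit(xml_text):
    """Count watched PF-Feature Types and those supplied without a Value."""
    root = ET.fromstring(xml_text)
    used = Counter()
    missing_value = Counter()
    for feature in root.iter("ProductFormFeature"):
        ftype = (feature.findtext("ProductFormFeatureType") or "").strip()
        if ftype not in WATCHED_TYPES:
            continue
        used[ftype] += 1
        value = (feature.findtext("ProductFormFeatureValue") or "").strip()
        if not value:
            missing_value[ftype] += 1
    return used, missing_value


used, missing_value = audit(SAMPLE)
for ftype in sorted(WATCHED_TYPES):
    print(ftype, "used:", used[ftype], "missing value:", missing_value[ftype])
```

On the invented sample this reports Type 10 once, Type 09 once with its value missing, and Type 15 not at all, which is roughly the shape of what we found at scale.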
Remember that I'm referencing a decent swath of metadata, enough to think that this is very representative, so allow me my moment to be bitter: The current industry "solution" to making the transition from ONIX 2.1 to ONIX 3.0 isn't to improve the data to reliably support important information, but to find a slot that accepts the weak, not-very-useful norm carried over from ONIX 2.1, which does nothing to implement improvements. Same as it ever was. Thank you, I feel better. But still: No one will find your accessible books this way.
A code and its associated value are usually serving a unique purpose and your database may provide a slot. Do you know which code from which list is being used in the ONIX record? Software developers and IT departments should be making that clear and available. If you don't know why you're doing something, follow the data by looking at the codes and their definitions. If something doesn't make sense, read the notes. If it still doesn't make sense: Ask us!
This is such a poorly understood problem that Tom will continue his thoughts on code lists and context with a more generalized example on our blog later this month.