ONIX 2.1 to 3.0: Helping you make the transition


This is the first in a series exploring the differences between ONIX 2.1 and 3.0, and how to navigate your metadata through the transition. (Not sure what any of that means? Learn about ONIX here, or get a refresher on the sunset of 2.1 here.) In this first installment, Data Guru Tom Richardson explains why and how data traders ought to adopt ONIX 3.0, and gives an overview of the topics he’ll be covering this summer.

ONIX 2.1 and ONIX 3.0 largely track the same data points: what you are communicating in the two standards is essentially the same. Where 3.0 makes substantive changes to the 2.1 standard, it's either because there were new types of information 2.1 wasn't well designed to accommodate (usually digital data points) or (and this is what I'd like to start to address here) because there were problems in the clarity and use of the data held in 2.1 across multiple markets.

Stage 1: Acceptance

If you accept all this (and you should, because your National Group worked with EDItEUR and agreed to it), then part of making the transition to 3.0 is accepting that:

  • there are problems in metadata exchange that the transition to 3.0 is trying to solve; 
  • the solution may require change in how you export data; and, 
  • change may require supporting subdivisions of data points (a.k.a. “granularity”).

Acceptance should also include knowing that fixing the problem is a necessary and achievable goal, because:

  • subdivision of data points allows flexibility in the supply chain; and, 
  • subdivision adds clarity and removes repetition.

Subdivision and granularity offer ways to make data simpler.

That last point can seem counterintuitive if you're not sitting in an ivory tower, so here's an example of what I mean by simplicity through subdivision. It's supported in both ONIX 2.1 and 3.0:

There's a supply chain need to be able to disambiguate authors; that is, retailers would love to keep all of the right books attached to a specific author. You might think I'm about to ramble on about ISNI, the persistent identifier that would make that possible, but not here. Using ISNI would qualify as supporting new information for most data suppliers, but ONIX supports a much simpler tool that's almost as powerful: contributors are supported in ONIX as people and as not-people, and that's the first step in disambiguation. All you have to do to support it is put “people” information in the clearly labeled elements for it, the PersonName- and KeyName-related elements, and use CorporateName when the “author” is an organization or group rather than a single person. The standard isn't asking you to add “new” information by distinguishing people from not-people; it's asking for a basic piece of information you already have that will help the supply chain solve a problem. This is typical of many of the changes in ONIX 3.0: they're new ways to present information you already have.
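To make that concrete, here's a minimal sketch of the two cases, using ONIX reference tags that exist in both 2.1 and 3.0. The names are illustrative, and A01 is simply the code-list value for “author”:

    <!-- A person: the name goes in the person-name elements -->
    <Contributor>
      <SequenceNumber>1</SequenceNumber>
      <ContributorRole>A01</ContributorRole>
      <NamesBeforeKey>Lucy Maud</NamesBeforeKey>
      <KeyNames>Montgomery</KeyNames>
    </Contributor>

    <!-- Not a person: the "author" is a group or organization -->
    <Contributor>
      <SequenceNumber>1</SequenceNumber>
      <ContributorRole>A01</ContributorRole>
      <CorporateName>The Example Press Cookbook Team</CorporateName>
    </Contributor>

A retailer receiving the first form knows it can safely index on the key name; a retailer receiving the second knows not to try.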

Okay, but what’s the ROI?

The above still means more fields (or it might), and this is where publishers turn the discussion to ROI. There's a cost to every field, they'll say, even for information they already have. Their database can't support everything! And what assurance do they have that retailers will make use of the data? This argument can get pretty thin: it relies on a limited, old-school definition of business data, and data senders add fields and data all the time anyway. If royalty statements need a second address, it gets added. Ebook metadata already supports all sorts of “extra” data around prices and institutions.

Your ONIX feed is a primary tool of business communication, and clarity in it is an important business need. It's used for more than supporting EDI and the distribution warehouse, and for more than supporting consumer display. These days, business analysts at large retailers are looking to make connections in the data to help increase sales. New companies building apps and marketing tools are looking for data to support consumer access. And none of them really care that in 1995 a comma-delimited “LASTNAME, FIRST” and a “30 CHARACTER TITLE GIVEN IN AL”, without subtitle or series support, was all you needed to support business.

Still, it's true that data aggregators will program to the data they get, simply out of necessity. But I'd like to offer that, in a transition like the one from ONIX 2.1 to 3.0, there's a simpler and cheaper approach they can take, one that, if openly implemented, would allay publisher fears about whether the data will be used:

  • As a first step, all retailers and aggregators should display fully complete and correctly delivered ONIX 3.0. Good ONIX data used well is the cornerstone of several Canadian ONIX projects, and it has worked. If data is displayed correctly based on the standard, then fixing a problem and matching the standard become the same thing.
  • If retailers find that, after doing that work, the actual quality of the ONIX 3.0 data is as poor as its ONIX 2.1 predecessor's, then, based on the value of the client to them and the needs of their customers, they can program whatever workarounds are required to display the poor data adequately.

How will that save them money? It creates a reward system for supplying good data, and it makes it easier to absorb data from smaller players. The economics of adding a field are different when you support fewer books, but no one will invest in a small company's data if it's poor, because there's no appreciable ROI for the retailer. A small publisher's choice is effectively binary: done right or loaded wrong. And there are a lot of small companies. For those companies with enough products to ask for “help” with their limited or poor data, every change they need has to be negotiated. The approach outlined above is one where retailers push everyone toward the standard: if you start at the standard, then the closer everyone comes to it, the easier the data is to maintain. If anyone wants to dispute that: comment, please!

What makes 3.0 different (and better)?

ONIX 3.0 is simpler, more robust, and easier to program and implement. How do I know that? Well, BNC developers have told me. The feedback EDItEUR receives tells them the same. My own work in it tells me. But ONIX 3.0 is also much more capable and accurate, so sloppy data stands out like a sore thumb; it spoils the simplicity.

This series of educational blog posts will focus on the differences between the two standards. So, where are these changes? Answering this is made easier by the just-released BISG ONIX Implementation Survey, which conveniently lists seven areas of “new functionality” in ONIX 3.0 and gives their current priority within the North American supply chain. One area involves new data points for digital products, many of which are already being used in the digital supply chain. I'm less concerned about that one, and instead want to look at the six others that affect the metadata of BOTH print and digital supply chains. What are they? Straight from the survey:

  • New functionality for collections (e.g., title, series information, master brand, title statement, subseries) 
  • New functionality for suppliers (e.g., multiple supplier data, price identifiers, price conditions, tiered prices) 
  • New functionality for content (e.g., primary and secondary content types) 
  • New functionality for publishers (e.g., multiple publisher and imprint identifiers, product contacts) 
  • New functionality for markets (e.g., market-specific publishing details, sales restrictions) 
  • New functionality for contributors (e.g., multiple place descriptors)
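To give a flavour of what the supplier and market functionality above looks like, here's a minimal sketch of ONIX 3.0's ProductSupply block, with an invented supplier name and a single Canadian market; the two-digit values are drawn from the standard ONIX code lists (supplier role, availability, and price type), and a record can repeat the block for each market:

    <ProductSupply>
      <Market>
        <Territory>
          <CountriesIncluded>CA</CountriesIncluded>
        </Territory>
      </Market>
      <SupplyDetail>
        <Supplier>
          <SupplierRole>01</SupplierRole>  <!-- publisher supplying to retailers -->
          <SupplierName>Example Distribution Services</SupplierName>
        </Supplier>
        <ProductAvailability>20</ProductAvailability>  <!-- available -->
        <Price>
          <PriceType>01</PriceType>  <!-- RRP excluding tax -->
          <PriceAmount>24.95</PriceAmount>
          <CurrencyCode>CAD</CurrencyCode>
        </Price>
      </SupplyDetail>
    </ProductSupply>

ONIX 2.1 had no Market grouping at all, so being able to repeat this block per market and per supplier is a large part of what the survey means by “multiple supplier data.”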

And maybe, if I haven't gone mad by then and it still makes sense, I'll review at the end:

  • New functionality for ebooks (e.g., DRM, licensing, usage constraints)
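And since that last bullet raises usage constraints: in 3.0 these are expressed with a small repeating composite rather than free text. A minimal sketch, again with the two-digit values taken from the relevant ONIX code lists (usage type, status, and limit unit):

    <EpubTechnicalProtection>01</EpubTechnicalProtection>  <!-- DRM applied -->
    <EpubUsageConstraint>
      <EpubUsageType>02</EpubUsageType>  <!-- printing -->
      <EpubUsageStatus>02</EpubUsageStatus>  <!-- permitted, subject to a limit -->
      <EpubUsageLimit>
        <Quantity>20</Quantity>
        <EpubUsageUnit>04</EpubUsageUnit>  <!-- pages -->
      </EpubUsageLimit>
    </EpubUsageConstraint>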

Why these posts now?

A major e-tailer was in touch because they can't make sense of the ONIX 3.0 data they see in the Collection composite. A cursory dip into the ONIX 3.0 we're starting to get in BiblioShare showed that it was easier to find poorly done data than properly done data. So the transition to ONIX 3.0 has begun, but what's being delivered is the same unchanged data that was identified as a problem in the first place, the very problem EDItEUR tried to solve by tweaking the standard.

So I'm going to start with Collections in my next post, review the available documentation, look at SIMPLE cases in the data, and show how they should have appeared.
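As a very rough preview, and only as a sketch of the pattern the EDItEUR documentation describes: where ONIX 2.1 carried a series name and number in the Series composite, ONIX 3.0 carries the same information in a Collection composite, with the number attached to a product-level title element:

    <!-- ONIX 2.1 -->
    <Series>
      <TitleOfSeries>Example Mystery Series</TitleOfSeries>
      <NumberWithinSeries>3</NumberWithinSeries>
    </Series>

    <!-- ONIX 3.0 -->
    <Collection>
      <CollectionType>10</CollectionType>  <!-- a publisher-defined collection -->
      <TitleDetail>
        <TitleType>01</TitleType>  <!-- distinctive title -->
        <TitleElement>
          <TitleElementLevel>02</TitleElementLevel>  <!-- the collection itself -->
          <TitleText>Example Mystery Series</TitleText>
        </TitleElement>
        <TitleElement>
          <TitleElementLevel>01</TitleElementLevel>  <!-- this product within the collection -->
          <PartNumber>3</PartNumber>
        </TitleElement>
      </TitleDetail>
    </Collection>

Same information, with the levels made explicit, which is exactly the sort of subdivision I talked about at the top of this post. I'll work through the details, and the common ways to get it wrong, next time.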

And it should surprise no one that I hope to demonstrate that conversion software is a great place to start, but that it isn't the whole solution, for all the reasons I opened this post with. These six areas are where you should be checking your data.