CEN WORKSHOP AGREEMENT

Draft CWA NNNNN

April 2012

ICS 03.100.10; 01.140.20

English version

Classification Mapping for open and standardized product classification usage in eBusiness (cMap)

This CEN Workshop Agreement has been drafted and approved by a Workshop of representatives of interested parties, the constitution of
which is indicated in the foreword of this Workshop Agreement.
The formal process followed by the Workshop in the development of this Workshop Agreement has been endorsed by the National
Members of CEN but neither the National Members of CEN nor the CEN Management Centre can be held accountable for the technical
content of this CEN Workshop Agreement or possible conflicts with standards or legislation.
This CEN Workshop Agreement can in no way be held as being an official standard developed by CEN and its Members.
This CEN Workshop Agreement is publicly available as a reference document from the CEN Members National Standard Bodies.
CEN members are the national standards bodies of Austria, Belgium, Bulgaria, Croatia, Cyprus, Czech Republic, Denmark, Estonia,
Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, Netherlands, Norway, Poland,
Portugal, Romania, Slovakia, Slovenia, Spain, Sweden, Switzerland, Turkey and United Kingdom.

EUROPEAN COMMITTEE FOR STANDARDIZATION
COMITÉ EUROPÉEN DE NORMALISATION
EUROPÄISCHES KOMITEE FÜR NORMUNG

Management Centre: Avenue Marnix 17, B-1000 Brussels

© 2012 CEN All rights of exploitation in any form and by any means reserved worldwide for CEN national Members.

Ref. No. CWA NNNNN:2012 E E

CEN WS eCAT
Date: 2012-04-23
Secretariat: AFNOR

Draft CEN Workshop Agreement on Classification Mapping for open and standardized product classification
usage in eBusiness (cMap)
Draft CWA version 4

ICS: 03.100.10; 01.140.20


Descriptors:

prCWA XXX-1:200X (E)

Contents
Foreword
Introduction
1. Scope
2. Normative References
3. Definitions and abbreviations
3.1 Definitions
3.2 Abbreviations
4. Methodologies for product classification system mapping
4.1 Ontologies
4.1.1 Semantic heterogeneity
4.1.2 Ontology matching problem
4.1.3 Areas of ontology mapping / matching
4.2 Product classification system mapping methodologies
4.2.1 A canonical process model for ontology mapping
4.2.2 CC3P
4.3 Elementary schema-based matching / mapping approaches
4.4 Architecture
4.4.1 Centralized architecture
4.4.2 Distributed architecture
4.5 Tools supporting ontology development and mapping
4.6 Exchange formats for ontologies
4.6.1 XML
4.6.2 RDF and RDFS
4.6.3 OWL
4.6.4 SKOS
4.6.5 BMEcat
4.7 Summary and Recommendations
5. The cMap Overall Mapping Methodology
5.1 Requirements
5.1.1 Product Classification System versions used in cMap
5.1.2 General requirements about the cMap mapping / matching methodology
5.1.3 Mapping challenges
5.1.4 Mapping relationships
5.2 Design of the cMap Mapping Methodology
5.2.1 Design of the mapping methodology
5.2.2 The cMap platform architecture
5.2.3 Selection of an appropriate tool for the mapping
5.2.4 Import and Export format
5.3 Usage of the cMap Mapping Methodology - Mapping results statistics
5.3.1 Interpretation for the CPV system (I. Vertical Section)
5.3.2 Interpretation for the eCl@ss system (II. Vertical Section)
5.3.3 Interpretation for the GPC system (III. Vertical Section)
5.3.4 Interpretation for the UNSPSC system (IV. Vertical Section)
5.4 Summary and Recommendations
6. Description of the classification systems
6.1 Introduction
6.2 Release policy and roadmap
6.3 Maintenance process
6.4 Version compatibility
6.4.1 CPV
6.4.2 UNSPSC
6.4.3 GPC
6.4.4 eCl@ss
6.5 Summary
6.5.1 Differences and Similarities
6.5.2 Identified problems for the common maintenance of the mapping
6.5.3 Recommendations
7. Definition of the architecture for an open standardized classification collaboration platform
7.1 Introduction
7.2 Business use cases
7.3 Actors who require the mapping
7.4 cMap platform roles
7.4.1 End-user
7.4.2 Platform administration authority
7.4.3 Platform Provider
7.4.4 Classification Authority
7.4.5 Mapping Proposer
7.4.6 Quality manager
7.4.7 Apply officially at the administration Release manager
7.5 Business objects
7.5.1 Representing classifications
7.5.2 Representing mappings
7.5.3 Bringing in line the cMap mapping cases
7.6 Use cases
7.6.1 Use Case 1: Query for mapping
7.6.2 Use Case 2: Manage mapping
7.6.3 Use Case 3: Load classification system
7.7 Requirement analysis
7.7.1 Architectural requirements
7.7.2 Platform requirements
7.7.3 Process-related requirements
7.7.4 End-user requirements
7.8 Data Quality
7.8.1 Unique Identification
7.8.2 Tracking and Tracing
7.8.3 Change Management
7.8.4 Intellectual property rights and conditions of use
8. Definition of a synchronization process
8.1 Introduction
8.2 The basis for a sustainable process: Gen-ePDC and its adaption for cMap
8.3 cMap processes
8.3.1 Process 1: Query for mapping result
8.3.2 Process 2: Apply Release Update Information
8.3.3 Process 3: Manage Mapping
8.4 Maintenance strategy
8.5 Governance models
8.5.1 Governance model 1: Community-driven
8.5.2 Governance model 2: Classification authority-driven
8.5.3 Governance model 3: Administration authority-driven
8.5.4 Pros and Cons
8.6 Business models (high level)
8.6.1 Proposal fee
8.6.2 Mapping result fee
8.6.3 Membership restriction
8.6.4 Classification authority financed
8.6.5 Third party financed
8.6.6 cMap as a service
8.6.7 Comparison
9. Conclusion and recommendation

Foreword
The production of this CEN Workshop Agreement on Classification Mapping for open and standardized
product classification usage in eBusiness (cMap) was decided at the CEN Workshop eCAT plenary meeting
on 2 March 2011.
The Project team started work in May 2011.
CEN Workshop eCAT was launched in 2002 and gathers experts in electronic catalogues and classification
systems used in eBusiness and eProcurement, from both the public and private sectors.
The Workshop has produced eight CWAs, which are available at:
http://www.cen.eu/cen/Sectors/Sectors/ISSS/Workshops/Pages/eCAT.aspx
The list of companies supporting this CWA will be added in the final document.

Introduction
The cMap project (Classification and Mapping for eBusiness and eProcurement) is a follow-up to the
CC3P project (Classification and catalogue systems for public and private procurement), which was
closed in 2010 with CWA 16138. The CC3P project analysed whether and how different product
classification systems can be mapped to or aligned with each other.
The basic requirement for such an alignment was that product data classified in one product classification
system can be classified manually, semi-automatically or even automatically in another product classification
system. Such a mapping or alignment would facilitate business processes, such as electronic procurement or
tendering, even if different classification systems are used enterprise-wide.
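The requirement above can be illustrated with a minimal sketch. It assumes a simple lookup table from a class in one system to candidate classes in another. The table layout, the function names and all class codes are hypothetical placeholders, not real CPV or eCl@ss codes and not the cMap data model. The rule that one candidate allows automatic reclassification, several candidates require a semi-automatic choice, and no candidate requires manual classification is likewise an assumption made for illustration.

```python
from typing import Dict, List, Tuple

# A class reference is (classification system, class code).
ClassRef = Tuple[str, str]

# Hypothetical mapping table; all codes are placeholders, not real class codes.
MAPPING_TABLE: Dict[ClassRef, List[ClassRef]] = {
    ("CPV", "cpv-A"): [("eCl@ss", "eclass-X")],
    ("CPV", "cpv-B"): [("eCl@ss", "eclass-Y"), ("eCl@ss", "eclass-Z")],
}


def transclassify(source: ClassRef, target_system: str,
                  table: Dict[ClassRef, List[ClassRef]] = MAPPING_TABLE) -> List[str]:
    """Return the candidate class codes in target_system for a source class."""
    return [code for system, code in table.get(source, []) if system == target_system]


def reclassification_mode(candidates: List[str]) -> str:
    """Assumed rule: 1 candidate -> automatic, several -> semi-automatic, none -> manual."""
    if len(candidates) == 1:
        return "automatic"
    if candidates:
        return "semi-automatic"
    return "manual"


for source in [("CPV", "cpv-A"), ("CPV", "cpv-B"), ("CPV", "cpv-C")]:
    candidates = transclassify(source, "eCl@ss")
    print(source, candidates, reclassification_mode(candidates))
```

In this sketch the product data itself is never touched: only the class reference attached to it is translated, which is why a complete and maintained mapping table is the central asset.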
To gain this knowledge, four main product classification systems were assessed, since they are widely
used within companies and by the public sector: CPV, eCl@ss, GPC and UNSPSC. A trial mapping was
undertaken for six domains to analyse differences and similarities between the product classification
systems in order to extract basic rules for alignment or mapping. Those six domains (namely: Cloths,
Food Beverage & Tobacco, Furniture, Electronics, Laboratory, Energy) are updated in the cMap project
and will be available together with 45 files.
In addition to the analysis of the structure of the four product classification systems, the maintenance
processes of the classification authorities responsible for each product classification system were
assessed. The goal was to understand the differences and similarities of the maintenance processes in
order to define an overall maintenance process for an authority that would be responsible for the
alignment or mapping of the four main product classification systems.
The main result of the CC3P project is a set of recommendations. They serve to reach a harmonization
between the product classification systems and to facilitate an alignment or mapping in the future.
In addition, recommendations for a high-level mapping platform have been defined, which can be used to
align or map the four product classification systems with each other and reach the goal of "classify
once, use in different product classification systems" for product data.
The cMap project follows the CC3P project and extends its results in two main areas:
- Finishing a full mapping of all domains of the four product classification systems
- Defining an architecture and a governance mechanism for a mapping platform in terms of building
  blocks and their requirements.
In addition, an analysis has been carried out to investigate the methods and methodologies for a
semi-automatic or even automatic mapping among the four main product classification systems used in CC3P.
This methodology can serve as the core of the classification platform and provides the glue between the
four product classification authorities to support mappings among these product classification systems.
Not only technical aspects but also organizational aspects are taken into account.
To reflect all these points, CC3P recommended the following overall harmonization strategy to reach the
goal of aligning or mapping product classification systems.

Figure 1 Overall harmonization strategy


The different parts of this high-level, overall harmonization strategy are described in this document.
The CWA addresses the following topics:
- Methodologies that are usable to map product classification systems
- Possible architectures for the mapping platform
- Exchange formats for import and export
- Design of the cMap methodology
- Recommended architecture for the cMap platform
- Statistics on the mappings
- Upgrade information
- Use cases, actors, roles and business requirements for the cMap platform
- Data quality recommendations
- Definition of a synchronisation process with a focus on governance models and an approach to
  business models.

Figure 1 Overall harmonization strategy
Figure 2: 40 sources of semantic heterogeneities [11]
Figure 3: Two simple ontologies and an alignment [8]
Figure 4: The ontology matching operation [8]
Figure 5: Overall methodology for product classification systems engineering
Figure 6: Canonical process model for mapping ontologies
Figure 7: Methodology of analysis of product classification systems [CWA 16138, p. 94]
Figure 8 Classification systems overview [CWA 16138, p. 95]
Figure 9: Differences in numbering schemas of CPV, eCl@ss, GPC and UNSPSC
Figure 10: Comparison of naming schemas for CPV, eCl@ss, GPC and UNSPSC
Figure 11: Gen-ePDC data model for product classification systems
Figure 12: Overall platform architecture for the product classification mapping
Figure 13: Example of an easy XML document
Figure 14: An RDF example
Figure 15: RDFS example
Figure 16: RDFS example abbreviated
Figure 17: RDF and Dublin Core example
Figure 18: Mapping tables
Figure 19: The cMap mapping methodology
Figure 20: The ontology matching operation [8]
Figure 21: The cMap Mapping architecture
Table 22: Mapping statistics
Table 23: Comparison of release policies
Table 24: New CPV codes in correspondence table
Table 25: Other CPV 2003 empty lines in correspondence table
Table 26: Machine-readable documentation of new CPV codes in correspondence table
Table 27: Modified CPV codes in correspondence table
Table 28: Machine-readable documentation of modified CPV codes in correspondence table
Table 29: Addition of product properties to CPV codes in correspondence table
Table 30: Addition of product properties to CPV codes in correspondence table
Table 31: Split CPV codes in correspondence table
Table 32: Machine-readable documentation of a split CPV codes in correspondence table
Table 33: Joined CPV classes in correspondence table - example one
Table 34: Joined CPV classes in correspondence table - example two
Table 35: New UNSPSC codes in audit trail
Table 36: Edited UNSPSC codes in audit trail
Table 37: Deleted UNSPSC codes in audit trail
Table 38: Moved UNSPSC codes in audit trail
Table 39: Example of additions as documented in GPC delta report
Table 40: Example of added classes as documented in GPC delta report
Table 41: Example of modified classes as documented in GPC delta report
Table 42: Example of deleted classes as documented in GPC delta report
Figure 43: Refuse bags in GPC version 01122010
Figure 44: Refuse bags in GPC version 01062011
Figure 45: Household Refuse bins (indoor) in GPC version 01122010
Figure 46: Household Refuse bins (indoor) renamed as refuse / waste bins in GPC version 01062011
Table 47: GPC changes as documented in the delta report
Table 48: eCl@ss changes as documented in the Release Update Files
Table 49: eCl@ss deliverables as documented in the Release Update Files (RUF)
Table 50: New eCl@ss classes as documented in the mapping tables
Table 51: Modified eCl@ss classes as documented in the mapping tables
Table 52: Closed eCl@ss classes as documented in the mapping tables
Table 53: Successor-relation of closed eCl@ss classes as documented in the mapping tables
Table 54: Example of moved eCl@ss classes as documented in the mapping tables
Table 55: Example of split eCl@ss classes as documented in the mapping tables
Table 56: Example of joined eCl@ss classes as documented in the mapping tables
Table 57: Possible Changes and compatibility: UNSPSC, eCl@ss, GPC, CPV
Figure 58: Business use case of mapping user: the problem of using different classification systems
Figure 59: Business use case mapping user: the solution for using different classification systems
Figure 60: Business use case mapping user: example for the usage of different classification systems
Figure 61: Business use case mapping user: example for the usage of different classification systems
Figure 62: Transclassification can be read as a portmanteau of translation and classification
Figure 63: cMap roles: Overview
Figure 64: Categorization class (source: ISO 13584-32)
Figure 65: Classification and product description
Figure 66: Example for a 121 mapping of classes from all four systems from the cMap mapping tables
Figure 67: Mapping of classes and additional mapping of properties
Figure 68: Representation of no mapping found
Figure 69: Representation of N21 and 12N mapping
Figure 70: Representation of N2M and M2N mapping
Figure 71: Representation of 121 mapping
Figure 72: Representation of 12M and M21 mapping
Figure 73: Representation of M2M mapping
Figure 74: cMap Mapping Status of Classification Classes
Figure 75: UC02: Manage mapping
Figure 76: UC02.04: Process mapping change request (based on ePDC maintenance)
Figure 77: UC02.07: Withdraw mapping (based on ePDC maintenance)
Figure 78: UC03.01: Load - New release
Figure 79: UC03.02: Load - New classification system
Figure 80: UC03.04: Apply release update information
Figure 81: UC03.04 Example: Apply release update information of mapped CPV classes to GPC
Figure 82: UC03.04 Example: Apply release update information of mapped GPC classes to CPV
Figure 83: cMap initial load
Figure 84: cMap Process 1: Query for Mapping Result
Figure 85: cMap Process 2: Apply Release Update Information
Figure 86: cMap Process 3: Manage Mapping
Figure 87: Maintenance strategy
Figure 88: Comparison of Governance Models
Figure 89: Comparison of Business Models
Figure 90: The OWLClasses view can be used to edit hierarchies of concepts
Figure 91: Seamless integration of Protégé-OWL with classification tool
Figure 92: Protg-OWL for editing RDF Schema models ........................................................................... 128
Figure 93: OWLViz for visualizing OWL ontologies graphically .................................................................... 128
Figure 94: Ontology comparison by PromptDiff ............................................................................................ 129
Figure 95: Ontology comparison by PromptDiff ............................................................................................ 129
Figure 96: Ontology merging Initial list of suggestions by Prompt ............................................................. 130
Figure 97: Prompt merging classes ............................................................................................................... 130
Figure 98: Prompt merged classes ................................................................................................................ 130
Figure 99: Ontology merging by Prompt........................................................................................................ 131

1. Scope

The present document studies four product classifications used in eBusiness in Europe (and beyond). To
reach the overall goals stated in the introduction, it follows the CC3P project in providing an initial mapping
and in researching methods, methodologies and platforms.
The versions of the product classification systems used here are:
- UNSPSC v11 English
- eCl@ss 6.0.1 English
- GPC 30062008 English (as at 31 August 2009)
- CPV 2008 English

2. Normative References

The following normative documents contain provisions which, through reference in this text, constitute
provisions of this CWA. For dated references, subsequent amendments to, or revisions of, any of these
publications do not apply. However, parties to agreements based on this CWA are encouraged to investigate
the possibility of applying the most recent editions of the normative documents indicated below. For undated
references, the latest edition of the normative document referred to applies.
CWA 15294:2005, Dictionary of Terminology for Product Classification and Description
CWA 15295:2005, Description of References and Data Models for Classification
CWA 15556-3:2006, Product Description and Classification - Part 3: Results of development in harmonization
of product classification and in multilingual electronic catalogues and their respective data modelling
CWA 16100:2010, Guidelines for the design, implementation and operation of a product property server
(ePPS)
CWA 16138:2010, Classification and catalogue systems used in electronic public and private procurement
DIN 4002-100, Properties and their scopes for product data exchange - Part 100: Properties on
www.DINsml.net
IEC 61360, Standard data element types with associated classification scheme for electric components
ISO/IEC 6523, Information technology - Structure for the identification of organizations and organization
parts
ISO/IEC 11179, Metadata registries (MDR)
ISO 13584, Industrial automation systems and integration - Parts library (PLIB)
ISO/DIS 22274:2011, Systems to manage terminology, knowledge and content - Concept-related aspects
for developing and internationalizing classification systems
ISO/TS 29002-5:2009, Industrial automation systems and integration - Exchange of characteristic data -
Part 5: Identification scheme
ISO 8000, Data Quality
ISO 22745, Industrial automation systems and integration - Open technical dictionaries and their application
to master data

3. Definitions and abbreviations

3.1 Definitions

For the purposes of the present document, the following terms and definitions apply:
3.1.1
Attribute
data element for the computer-sensible description of a (→) property, a relation or a (→) class
EXAMPLE: Creation date of a class object in a computer system
Source: ISO/DIS 22274
3.1.2
Backward Compatibility
the ability of software and hardware to use data produced by a previous generation of software and
hardware
Source: ISO Concept Database, 2011 (http://cdb.iso.org), ISO 12651:1999
Alternative definition:
a newer coding standard is backward compatible with an older coding standard if decoders designed to
operate with the older coding standard are able to continue to operate by decoding all or part of a bitstream
produced according to the newer coding standard
Source: ISO Concept Database, 2011 (http://cdb.iso.org), ISO/IEC 13818
3.1.3
Brick
Term used in the GPC for a product (→) class
3.1.4
Characteristic
distinguishing trait or quality
NOTE: Characteristics that apply to (→) concepts are called feature specifications (3.11), whereas
characteristics of (→) classes are called (→) properties
Source: ISO/DIS 22274
3.1.5
Class
description of a set of (→) objects that share the same (→) characteristics
NOTE: The characteristics may include (→) properties, operations, methods, relationships and semantics
Source: ISO/DIS 22274
NOTE: A class is usually described by a class name and a class code that identifies the hierarchical
position of the class within a classification system
3.1.6
Classification
process of assigning phenomena to (→) classes according to criteria
Source: ISO/DIS 22274
3.1.7
Classification System
systematic collection of (→) classes organized according to a known set of rules, and into which (→) objects
may be grouped
NOTE: This CWA considers both the classification system with properties and the classification system
without properties
EXAMPLE 1: UNSPSC is an example of a classification system without properties.
EXAMPLE 2: eCl@ss is an example of a classification system with properties.
Source: ISO/DIS 22274
3.1.8
Commodity class
Term used in the UNSPSC and eCl@ss for a product (→) class
3.1.9
Concept
unit of knowledge created by a unique combination of (→) characteristics
NOTE: Concepts are not necessarily bound to particular languages. They are, however, influenced by the
social or cultural background, which often leads to different (→) classifications
Source: ISO/DIS 22274
3.1.10
Data provenance
a record of the ultimate derivation and passage of a piece of data through its various owners or custodians
Source: ISO 8000-102
3.1.11
Four-eye-principle
a security precaution that requires at least two people to approve a particular activity
Source: http://en.wikipedia.org/wiki/Four_eyes
3.1.12
Identifier
a character or group of characters constituting a data element value used to identify or name an object and
possibly to indicate certain properties of that object
Source: ISO Concept Database, 2011 (http://cdb.iso.org), ISO/IEC 6523-1:1998, cited ISO 13584-26:2000
Alternative definition:
linguistically independent sequence of characters capable of uniquely and permanently identifying that with
which it is associated.
Source: ISO Concept Database, 2011 (http://cdb.iso.org), adapted from ISO/IEC 11179-3:2003
3.1.13
Object
anything perceivable or conceivable. NOTE: Objects may be material (e.g. an engine, a sheet of paper, a
diamond), immaterial (e.g. conversion ratio, a project plan) or imagined (e.g. a unicorn)
Source: ISO/DIS 22274
3.1.14
Ontologist
someone who professionally deals with shared formal conceptualizations
3.1.15
Predecessor
the class assigned to a classified product before upgrading to a new release. The product will be classified
with a (→) successor class code in the new release
EXAMPLE: A user has assigned the class 21150301 (predecessor) to their product according to eCl@ss
Release 6.2. Due to a change in the classification system, this class code is changed to 23170203
(successor) in eCl@ss Release 7.0. If exactly one class of the existing (source) release corresponds to one
class in the new (target) release, there is a 121-relation. If more than one class is joined into one class, there
is an M21-relation; if one class is split into more than one class, there is a 12M-relation
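The relation types named in this example can be derived mechanically from a list of (predecessor, successor) pairs. The following Python sketch is illustrative only and is not part of any classification authority's tooling: the function name `relation_types` is hypothetical, the label M2M (one class split while another is joined) is borrowed from the list of figures rather than from this definition, and all class codes except 21150301 and 23170203 are made up.

```python
from collections import defaultdict

def relation_types(pairs):
    """Label each (predecessor, successor) pair as 121, 12M, M21 or M2M."""
    successors = defaultdict(set)    # predecessor -> its successor classes
    predecessors = defaultdict(set)  # successor -> its predecessor classes
    for old, new in pairs:
        successors[old].add(new)
        predecessors[new].add(old)

    labels = {}
    for old, new in pairs:
        split = len(successors[old]) > 1     # old class split into several
        joined = len(predecessors[new]) > 1  # several classes joined into one
        if split and joined:
            labels[(old, new)] = "M2M"
        elif split:
            labels[(old, new)] = "12M"
        elif joined:
            labels[(old, new)] = "M21"
        else:
            labels[(old, new)] = "121"
    return labels

# Only the first pair comes from the example; the other codes are made up.
pairs = [
    ("21150301", "23170203"),      # one class replaced by one class
    ("A1", "B1"), ("A2", "B1"),    # two classes joined into one
    ("C1", "D1"), ("C1", "D2"),    # one class split into two
]
labels = relation_types(pairs)
for pair, label in labels.items():
    print(pair, label)
```

A release update file could be read into such a list of pairs before the labels are computed; how the file is parsed depends on the classification system and is not shown here.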
3.1.16
Property
defined characteristic suitable for the description and differentiation of the (→) objects in a (→) class
EXAMPLE: Ambient temperature may be a property of a class comprising geographical locations.
Source: ISO/DIS 22274
3.1.17
Release Policy
certain rules and principles that define the criteria for releasing new versions, e.g. the frequency, the content
scope, the validity etc.
3.1.18
Release Update File
Information on the changes made between two consecutive releases of a classification system,
usually published by the classification authority
EXAMPLE: For the GPC, a delta report is published; for the UNSPSC, an audit trail; for the CPV,
correspondence tables; for eCl@ss, Release Update Files (mapping tables) are available
3.1.19
Roadmap
a plan that applies to a new product or process, or to an emerging technology
Source: http://en.wikipedia.org/wiki/Technology_roadmap (2011-09-05)
3.1.20
Successor
the new class for a classified product when upgrading to a new release. The product was already classified
with a (→) predecessor class code in the last release
EXAMPLE: A user has assigned the class 21150301 (predecessor) to their product according to eCl@ss
Release 6.2. Due to a change in the classification system, this class code is changed to 23170203
(successor) in eCl@ss Release 7.0. If exactly one class of the existing (source) release corresponds to one
class in the new (target) release, there is a 121-relation. If more than one class is joined into one class, there
is an M21-relation; if one class is split into more than one class, there is a 12M-relation
3.1.21
Upward Compatibility
ability to move data from a more advanced version of a system or software package to a less advanced
version
Source: ISO Concept Database, 2011 (http://cdb.iso.org), ISO 12651:1999
3.1.22
Value
part of an attribute specification which specifies one possible content of an attribute compliant with the
domain of the attribute. Source: CWA 15294:2005
NOTE: The quoted term "attribute" is replaced here by the term "property" according to the ISO definition, i.e.
a value is one possible content of a property. The domain of the property is the set of allowed or valid
values for a property. EXAMPLE: For a class "traffic light", the property "colour" would have the allowed values
"red", "yellow" and "green", which form the property's domain

3.2 Abbreviations

CEN      European Committee for Standardization
CPV      Common Procurement Vocabulary
CWA      CEN Workshop Agreement
DIN      German Institute for Standardization
DTD      Document Type Definition
eCat     electronic product catalogue; in this document eCat
ePDC     electronic Product Description and Classification
GPC      Global Product Classification (GS1 Classification System)
GDSN     Global Data Synchronisation Network by GS1 (see www.gs1.org/gdsn)
GTIN     Global Trade Item Number (GS1 trade item unique identifier, cf. http://www.gs1.org/gdsn/gpc)
GUI      Graphical User Interface
IEC      International Electrotechnical Commission
ISO      International Organization for Standardization
PCS      Product Classification System
PLIB     Parts Library [ISO 13584]
UNSPSC   United Nations Standard Products and Services Code

4. Methodologies for product classification system mapping

4.1 Ontologies

An ontology typically provides a vocabulary that describes a domain of interest and a specification of the
meaning of terms used in the vocabulary [8]. Depending on the precision of the specification, the notion of
ontology encompasses several data and conceptual models, including sets of terms, classifications,
thesauri, database schemas, or fully axiomatised theories [7]. When several competing ontologies are used
in different applications, these applications usually cannot interoperate immediately.
Ontologies serve to structure and exchange data or information. They typically consist of:
Concepts or classes
Types
Instances
Relations
Inheritance and
Axioms
For further information on ontologies, see http://semanticweb.org/wiki/Ontology.

4.1.1 Semantic heterogeneity


Interoperability and integration of data sources are becoming ever more important issues as both the
amount of data and the number of data producers grow. Interoperability not only has to resolve the
differences in data structures, it also has to deal with semantic heterogeneity. Semantics refers to the
meaning of data, in contrast to syntax, which only defines the structure of the schema items (e.g., classes and
attributes).
Resolving semantic heterogeneities must address more than 40 discrete categories of potential mismatches
from units of measure, terminology, language, and many others. These sources may derive from structure,
domain, data or language [9].
Even if OWL (or similar) languages now provide the means to represent an ontology, there remains the vexing
challenge of how to resolve the differences between different views or perspectives, even within the same
domain. For example, the purchasing perspective differs from the sales perspective.
When independent parties develop database schemas for the same domain, they will almost always be quite
different from each other. These differences are referred to as semantic heterogeneity, which also appears in
the presence of multiple XML documents, Web services, and ontologies, or more broadly, whenever there
is more than one way to structure a body of data. The presence of semi-structured data exacerbates
semantic heterogeneity, because semi-structured schemas are much more flexible to start with. For multiple
data systems to cooperate with each other, they must understand each other's schemas. Without such
understanding, the multitude of data sources amounts to a digital version of the Tower of Babel [10].
There are many potential circumstances where semantic heterogeneity may arise (footnote 1):
- Enterprise information integration
- Querying and indexing the deep Web (which is a classic data federation problem in that there are
  literally tens to hundreds of thousands of separate Web databases)
- Merchant catalogue mapping
- Schema versus data heterogeneity
- Schema heterogeneity and semi-structured data

1 http://techwiki.openstructs.org/index.php/Classification_of_Semantic_Heterogeneity

Naturally, there will always be differences in how differing authors or sponsors create their own particular
world view, which, if transmitted in XML or expressed through an ontology language such as OWL, may
also result in differences based on expression or syntax. Indeed, the ease of conveying these schemas as
semi-structured XML, RDF or OWL is in and of itself a source of potential expression heterogeneities. There
are also other sources in simple schema use and versioning that can create mismatches [10]. Thus, semantic
mismatches can be driven by world view, perspective, syntax, structure, versioning and timing:
- One schema may express a similar world view with different syntax, grammar or structure
- One schema may be a new version of the other
- Two or more schemas may be evolutions of the same original schema
- There may be many sources modelling the same aspects of the underlying domain (horizontal
  resolution such as for competing trade associations or standards bodies), or
- There may be many sources that cover different domains but overlap at the seams (vertical
  resolution such as between pharmaceuticals and basic medicine)
Heterogeneities can be classified into three broad classes [11]:
- Structural conflicts arise when the schemas of the sources representing related or overlapping data
  exhibit discrepancies. Structural conflicts can be detected by comparing the underlying DTDs (footnote 2).
  The class of structural conflicts includes generalization conflicts, aggregation conflicts, internal path
  discrepancy, missing items, element ordering, constraint and type mismatch, and naming conflicts
  between the element types and attribute names.
- Domain conflicts arise when the semantics of the data sources that will be integrated exhibit
  discrepancies. Domain conflicts can be detected by looking at the information contained in the
  DTDs and using knowledge about the underlying data domains. The class of domain conflicts
  includes schematic discrepancy, scale or unit, precision, and data representation conflicts.
- Data conflicts refer to discrepancies among similar or related data values across multiple sources.
  Data conflicts can only be detected by comparing the underlying documents. The class of data
  conflicts includes ID-value, missing data, incorrect spelling, and naming conflicts between the
  element contents and the attribute values.
Moreover, mismatches or conflicts can occur between set elements (a population mismatch) or attributes (a
description mismatch).
The figure below shows about 40 distinct potential sources of semantic heterogeneities:

2 Document Type Definition

Figure 2: 40 sources of semantic heterogeneities [11]


Some of these items require explanation:
- Homonyms are cases where the same name refers to more than one concept, such as "Name" referring to a
  person versus "Name" referring to a book
- A generalization/specialization mismatch can occur when single items in one schema are related to
  multiple items in another schema, or vice versa. For example, one schema may refer to "phone" but
  the other schema has multiple elements such as "home phone", "work phone" and "cell phone"
- Intra-aggregation mismatches arise when the same population is divided differently by schema (Census
  versus Federal regions for states, or full person names versus first-middle-last, for example),
  whereas inter-aggregation mismatches can come from sums or counts as added values
- Internal path discrepancies can arise from different source-target retrieval paths in two different
  schemas (for example, hierarchical structures where the elements are at different levels of remove)
- The four sub-types of schematic discrepancy refer to cases where attribute and element names may be
  interchanged between schemas
- Under languages, encoding mismatches can occur when either the import or export of data to XML
  assumes the wrong encoding type. While XML is based on Unicode, it is important that source
  retrievals and issued queries be in the proper encoding of the source. For Web retrievals this is very
  important, because only about 4% of all documents are in Unicode
- Even if the correct encoding is detected, there are significant differences between language
  sources in parsing (white space, for example), syntax and semantics that can also lead to many
  error types.
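The encoding mismatch described in the list above can be reproduced in a few lines. The following Python sketch is only an illustration under simple assumptions: the sample string and the two encodings (UTF-8 and Latin-1) are chosen for demonstration and are not prescribed by this document.

```python
text = "Protégé"

# Case 1: the source writes UTF-8, but the reader assumes Latin-1.
# Decoding succeeds silently and yields garbled characters.
utf8_bytes = text.encode("utf-8")
garbled = utf8_bytes.decode("latin-1")
print(garbled)

# Case 2: the source writes Latin-1, but the reader assumes UTF-8.
# Here the mismatch is detected, because the bytes are not valid UTF-8.
latin1_bytes = text.encode("latin-1")
try:
    latin1_bytes.decode("utf-8")
    failed = False
except UnicodeDecodeError:
    failed = True
print("UTF-8 decoding failed:", failed)
```

The first case is the more dangerous one for catalogue data, since no error is raised and the corrupted labels can propagate into later matching steps.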

4.1.2 Ontology matching problem


Ontology matching (footnote 3) is a solution to the semantic heterogeneity problem. It finds correspondences between
semantically related entities of ontologies. These correspondences can be used for various tasks, such as
ontology merging, query answering, or data translation. Thus, matching ontologies enables the knowledge
and data expressed with respect to the matched ontologies to interoperate [8].
The following example, which explains the problem of ontology matching, is taken from [8].
Classes are shown in rectangles with rounded corners, e.g., in ontology O1, "Book" is a specialization
(subclass) of "Product". Relations are shown as arrows. Attributes, such as "price", are defined
on a basic data type such as integer, while properties are defined, for example, as a reference to another class.
"Albert Camus: La chute" is a shared instance. Correspondences are shown as thick arrows that link an
entity from O1 with an entity from ontology O2. They are annotated with the relation that is expressed by the
correspondence: for example, "Person" in O1 is less general than "Human" in O2.
Assume that an e-commerce company acquires another one. Technically, this acquisition requires the
integration of their information sources, and hence, of the ontologies of these companies. The documents or
instance data of both companies are stored according to ontologies O1 and O2, respectively. In the example
these ontologies contain subsumption (footnote 4) statements, property specifications and instance descriptions. The
first step in integrating ontologies is matching, which identifies correspondences, namely the candidate
entities to be merged or to have subsumption relationships under an integrated ontology.
Once the correspondences between two ontologies have been determined, they may be used, for instance,
for generating query expressions that automatically translate instances of these ontologies under an
integrated ontology. For example, the attributes with the label "title" in O1 and in O2 are candidates to be
merged, while the class with the label "Monograph" in O2 should be subsumed by the class "Product" in O1.

Figure 3: Two simple ontologies and an alignment [8]

3 Due to the different terminologies used by the reference authors, three different terms are used: matching, mapping and alignment but
they have the same meaning.
4 To be understood as aggregation

Although different formalisations of the matching operation are available, for this document the
matching operation determines an alignment A' for a pair of ontologies O1 and O2. Hence, given a pair of
ontologies (which can be very simple and contain one entity each), the matching task is that of finding an
alignment between these ontologies. Some other parameters can extend the definition of
matching [8]:
- the use of an input alignment A, which is to be extended
- the matching parameters, for instance weights or thresholds
- external resources, such as common knowledge and domain-specific thesauri

Figure 4: The ontology matching operation [8]
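
As a rough illustration of the matching operation and its optional parameters, the following Python sketch compares entity labels only. It is a deliberately naive stand-in, not the method of [8]: the function name `match`, the use of string similarity from the standard `difflib` module, the threshold value, the seed alignment and the thesaurus entry are all assumptions made for this example, and the sketch finds equivalence candidates only, not subsumption relations.

```python
from difflib import SequenceMatcher

def match(o1, o2, input_alignment=None, threshold=0.8, thesaurus=()):
    """Return an alignment {(e1, e2): confidence} for two lists of labels."""
    alignment = dict(input_alignment or {})        # extend the input alignment
    synonyms = {frozenset(pair) for pair in thesaurus}
    for e1 in o1:
        for e2 in o2:
            if (e1, e2) in alignment:
                continue                           # keep seeded correspondences
            a, b = e1.lower(), e2.lower()
            if frozenset((a, b)) in synonyms:
                score = 1.0                        # external resource says: related
            else:
                score = SequenceMatcher(None, a, b).ratio()
            if score >= threshold:                 # matching parameter
                alignment[(e1, e2)] = score
    return alignment

# Labels borrowed from the example; the composition of the lists is made up.
o1 = ["Product", "Book", "Person", "title"]
o2 = ["Monograph", "Human", "title"]
result = match(
    o1, o2,
    input_alignment={("Product", "Monograph"): 0.7},  # hypothetical seed
    threshold=0.8,
    thesaurus=[("person", "human")],                  # hypothetical entry
)
for pair, confidence in result.items():
    print(pair, confidence)
```

Real matchers combine many more signals (structure, instances, data types); the point here is only how an input alignment, a threshold and an external resource enter the operation.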


Classification can be defined as a process of grouping things into common categories (classes).
Classification is closely related to the term ontology, which is the science of determining and structuring the
essential properties and relationships between things. But the process of structuring and semantic
description requires some notion of the elements and the relationships among the elements that make
up a structure. Thus, ontology leads to taxonomy (classification framework): a set of elements or categories
and logical relationships among the classes, typically ordered hierarchically.
The current status in the area of mapping of product classification systems has shown that although the
mapping of product classification systems has a strong practical relevance, there is a lack of clarity in the
terms (words) used to describe the topic. The same concepts are named with different words and the same
words may be used for different meanings. For example, the following terms are used in the area of
product classification or ontology mapping: alignment, merging, articulation, fusion, integration,
morphism, mapping and so on [1]. In addition, part of the problem lies in the lack of a comprehensive survey
and of a standard terminology, in hidden assumptions or undisclosed technical details, and in the dearth of
evaluation metrics [1].
In general, mapping or matching is linking one classification system to another. That is, each individual class
in the source classification should be linked with the most appropriate corresponding class(es) in the other.
In addition, ontology mapping can be defined as follows: given two ontologies O1 and O2, mapping one ontology to
another means that for each entity (concept C, relation R, or instance I) in ontology O1, a corresponding
entity, which has the same intended meaning, shall be found in ontology O2 [1].
This allows for better management of classification systems in a coordinated way and facilitates the
usage of more than one system. Ontology mapping is also known as ontology alignment, semantic
integration, and ontology merging in some cases, depending upon the application and intended outcome [1].
Experience shows that there are two directions for approaching the problem of product classification
mapping:
- One direction is driven by the community of database systems.
  In the research field of database systems, the problem often arises of mapping or merging different
  database schemas to or into one database schema, e.g. in the field of federated database systems.
  The intention is partial and controlled data sharing between database systems. But Kalfoglou and
  Schorlemmer explain that the problem in the area of database schemas is the heterogeneity of
  different schemas. This heterogeneity occurs when there is a disagreement about the meaning,
  interpretation, or intended use of the same or related data [1]. It arises because database
  schemas typically provide little semantics for interpreting data. A database schema only consists of
  schema objects like class definitions (e.g. table definitions in a relational model), entity types and
  relationship types and their relationships. So, a potential user is responsible for understanding the
  semantics of the objects in the database schema.
- A second direction for approaching the problem of product classification mapping is ontologies. The
  ontology mapping problem is very closely related to the product classification mapping problem,
  since a product classification can be seen as a specialized ontology using a (mono-) hierarchical
  structure for the concepts, e.g. product classes, within the ontology. In computer
  science, ontologies are defined as formal representation systems of knowledge or of a domain of
  discourse, e.g. products.
  They are characterized as follows:
  o They typically provide a denoted, formalized and ordered representation of a set of terms and their
    relationships in a specific domain of discourse.
  o They contain inference and integrity rules.
  o They represent a network of information with logical relations.


Database schemas and ontologies share similarity since they both provide a vocabulary of terms and
somewhat constrain the meaning of terms used in the vocabulary. Hence, they often share similar matching
solutions [8].
Overcoming semantic heterogeneity is typically achieved in two steps [8]:
matching entities to determine an alignment, i.e., a set of correspondences
interpreting an alignment according to application needs, such as data translation or query
answering.
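The second of these steps can be sketched for the data translation case, assuming the first step has already produced an alignment between attribute labels. In the following Python example the function name `translate`, the target label "cost", the attribute "isbn" and all record values other than the title are hypothetical and serve only to show how an alignment is applied.

```python
def translate(record, alignment):
    """Rename the keys of `record` from O1 attribute labels to O2 labels."""
    translated, unmapped = {}, {}
    for label, value in record.items():
        if label in alignment:
            translated[alignment[label]] = value
        else:
            unmapped[label] = value    # no correspondence was found
    return translated, unmapped

# Step 1 result (assumed): correspondences between attribute labels.
alignment = {"title": "title", "price": "cost"}

# Step 2: interpret the alignment to translate one instance description.
record = {"title": "La chute", "price": 12, "isbn": "n/a"}
translated, unmapped = translate(record, alignment)
print(translated)
print(unmapped)
```

Keeping the unmapped attributes separate, rather than dropping them, makes gaps in the alignment visible to whoever maintains the mapping.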
Taking into account these formal parts of ontologies, a product classification system can be seen as a
lightweight ontology, containing terms, a taxonomy (a classification framework for all products and
services) and relations between terms and properties, which describe these terms, like classes.

4.1.3 Areas of ontology mapping / matching


The investigation of ontology mapping or matching is currently organized around three specific areas [3]:
Mapping discovery
Declarative formal representation of mappings
Reasoning with mapping

4.1.3.1 Describe the mapping challenges


The main question concerning mapping discovery is: given two ontologies, how is it possible to find
similarities between them, and how is it possible to determine which concepts and properties represent
similar notions? [1]
The mapping discovery is motivated by the following challenges [3]:
- The increasing volume of data suggests that it would be impractical in most cases to produce
  mappings between ontologies manually.
  o To overcome this, fully or semi-automated means of identifying correspondences (footnote 5) and
    consequent mappings should be developed to facilitate this process.
- The heterogeneity of information reflected by the ontologies will lead to mismatches between
  ontologies.
  o Different classes of ontology mismatches are identified:
    - Mismatch in the expressiveness of the representational languages for different
      ontologies. This occurs when
      - languages differ in their syntax
      - one language has constructs (syntactical items) which are not available in
        another
      - the semantics of the same language constructs vary in their
        implementation
    - Mismatch on the ontological level, which occurs if two languages are using
      - the same term to denote different concepts
      - different terms to denote the same concepts
      - different modelling paradigms, conventions or levels of granularity
      - constructs that cover different ranges of the domain.
  o Automated and semi-automated mapping systems must be able to
    identify the variations in concepts as represented by various
    ontologies and take appropriate steps to normalize the meaning of
    those concepts.

5 A correspondence links two concepts or items that have strong similarities. In product classification system mapping, a
correspondence identifies a candidate class for a mapping. It is similar to the class in the source system.

4.1.3.2 Declarative formal representation of mappings


The main question concerning declarative formal representation of mappings is: given two ontologies, how
can mappings be represented between them to enable reasoning with mappings? [1]
Mappings constitute a form of knowledge about the interrelationship between two ontologies and the domain
of discourse they model. So, if mappings themselves are formalized, they can be [1]:
- stored for future reference,
- queried to derive new information,
- combined with other mappings to produce new mappings, and
- evaluated against other mappings to compare the effectiveness of different mapping algorithms.
Ontology mapping may be represented in different ways, depending upon the intended use of the mapping:
The mapping operation produces a translation of a source ontology into a target ontology, or merges
two ontologies into a new, third ontology. By doing this, the mapping result is integrated in the
resultant outcome and cannot be reused.
The mappings may be represented as queries or bridging axioms describing how one entity can be
mapped or transformed into another, and stored separately from the ontologies they map.
To support interoperability and reuse of knowledge, mappings may be encoded by using
standardized mapping representation languages.
By standardising the syntax and semantics of the mappings, supporting tools can be developed to
facilitate the management of mappings.
A possible formalism to describe a mapping element 6 might be [5]:
o A mapping element is a 5-tuple (id, e, e', n, R), where
o id is a unique identifier of the given mapping element
o e and e' are the entities of the ontologies to be mapped
o n is a confidence measure in some mathematical structure, like the [0, 1] range, for the
correspondence between entities e and e'
o R is a relation (e.g. equivalence, more general, disjointness, overlapping) for the entities e
and e'.
According to this, an alignment between two ontologies is a set of mapping elements. The matching
operation determines the alignment for a pair of ontologies.
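The 5-tuple can be written down directly as a small data structure. The following Python sketch is only an illustration: the names MappingElement, Relation and alignment, the restriction of the confidence measure to [0, 1] and the class codes are assumptions made for this example and are not prescribed by [5].

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    """Relation R between two entities (the examples named in the text)."""
    EQUIVALENCE = "equivalence"
    MORE_GENERAL = "more general"
    DISJOINTNESS = "disjointness"
    OVERLAPPING = "overlapping"


@dataclass(frozen=True)
class MappingElement:
    """One mapping element (id, e, e', n, R)."""
    id: str              # unique identifier of the mapping element
    source: str          # entity e of the first ontology
    target: str          # entity e' of the second ontology
    confidence: float    # confidence measure n, assumed here to lie in [0, 1]
    relation: Relation   # relation R holding between e and e'

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")


# An alignment is a set of mapping elements. The class codes are placeholders,
# not identifiers of a real product classification system.
alignment = {
    MappingElement("m1", "A:1001", "B:2001", 0.95, Relation.EQUIVALENCE),
    MappingElement("m2", "A:1002", "B:2000", 0.70, Relation.MORE_GENERAL),
}

# Stored separately from the ontologies, the alignment can be queried.
confident = sorted(m.id for m in alignment if m.confidence >= 0.9)
print(confident)
```

Keeping the elements immutable and hashable allows alignments to be combined and compared with ordinary set operations.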
In general, mapping representations should provide [6]:
Clear semantics
o The semantics of mappings should be formally defined such that they can be used for
reasoning, to combine evidence to propose likely mappings, and to enhance the mapping rules
Accommodation of incompleteness
o Incompleteness may arise in a mapping when two ontologies cover different domains or
different portions of a domain, and, when a certain concept can be mapped between the
models, but the mapping is unknown or hard to specify.
6 A mapping element is a mapping rule.


In such cases, support should be provided for identifying the incompleteness that may be
present within the mapping.
Accommodation for heterogeneity
o Mappings between domains will likely entail the use of multiple representational languages.
Consequently, mapping representations should either support multiple languages, or a
common representation language should be employed and the mappings between models
described in this language.
This short comparison shows the problems of product classification mapping in the research field of
ontologies. It is more appropriate to investigate the possible solutions with ontologies than with database
schemas. For this reason, this work concentrates on mapping solutions from ontology research in order to
find a practical solution to the product classification mapping problems within a mapping methodology.

4.2 Product classification system mapping methodologies

Based upon the research on methodologies regarding product classification systems, or more generally
ontologies, a discipline called ontology engineering brings light to the topic of product classification system
mapping. It covers the systematic creation, development, maintenance and mapping of ontologies for
different application domains.
In ontology engineering the process of ontology development can be split into different phases which are:
Requirement engineering for the ontology development
Design of the ontology according to the requirements
Development of the ontology and the maintenance process
Usage of the ontology
Maintenance of the ontology
Retiring the ontology
The mapping of ontologies, or of product classification systems as lightweight ontologies, mainly relies on
the design phase (phase 2) and the development phase (phase 3), since these phases lay the ground
for the mapping methods applied between different ontologies or product classification
systems.

[Figure content: Phase 1: Requirement engineering; Phase 2: Design of the ontology; Phase 3: Development of the ontology; Phase 4: Usage of the ontology; Phase 5: Maintenance of the ontology; Phase 6: Placing out of order of the ontology]

Figure 5: Overall methodology for product classification systems engineering


Phase 1: Requirements engineering
In this first phase of the ontology and product classification system development (as lightweight ontology)
methodology the general requirements based on the application domain must be defined and documented
from a user perspective or the intended usage of the ontology.

In addition, the view on the ontology has to be decided, since the view, or possibly several different views, is used
to present the information that is structured within the ontology or product classification system to the
audience or users. Since product classification systems are typically modelled as lightweight ontologies
consisting of at least two different parts, a description of the concrete products as classes or concepts of the
ontology and a hierarchical tree-based taxonomy, it has to be determined which information has to be
integrated into the product description to reflect the content of the different views and which taxonomies are
suitable for grouping the classes of the product classification system according to a specific view.
This means, for example, that different pieces of information are needed to reflect a product classification system
from the purchasing view as opposed to reflecting a product classification system from the sales
perspective. In addition, different taxonomies are needed to facilitate the usage for the different application
areas like purchasing or sales. All information needed for the different views per product class or concept
has to be extracted from the requirements engineering process and the requirements of the correlating
taxonomy. Of the four horizontal product classification systems investigated within this
document, that is UNSPSC, eCl@ss, CPV and GPC, the UNSPSC product classification system consists
only of a taxonomy, whereas eCl@ss consists of a taxonomy and product descriptions given by classes as
concepts and their related properties. The requirements for these two product classification systems
accordingly differ greatly in the amount of information which shall be represented within the product
classification system. Besides this, UNSPSC can be seen as mainly driven by a sales view, whereas eCl@ss
can be seen as mainly driven by a purchasing view. Because of these different views, the taxonomies of the two
product classification systems are very different and not really comparable.
Phase 2: Design of the ontology
After defining the general requirements, the representation formalism for the description of the ontology or
product classification system shall be selected. In addition to this, the granularity of the description must be
defined. Today product classification systems are designed for a specific usage given by one taxonomy and
the description of classes or concepts in the direction of this intended usage. Product classification systems
as lightweight ontologies are in general defined for a specific intended usage that is the description of
products and services. But to cover different scopes different taxonomies shall be defined to meet the
requirements of the different views and also different product properties used by the different views shall be
defined.
Phase 3: Development of the ontology and maintenance process
In this phase the design principles given by the requirements have to be elaborated into a concrete ontology
or product classification system. That means, the concrete classes based on the meta model of the ontology
must be defined. Additionally it has to be decided which kind of tool shall be used to support the
development of the ontology or product classification system. Different tools are available on the market (e.g.
SKOS or Protégé). The selection of the tool also supports, in different ways, different philosophies of
maintenance, like centralized versus distributed development and maintenance of ontologies.
Phase 4: Usage of the ontology
Once the ontology or product classification system has been designed according to the requirements and
developed using a specific, problem-oriented representation formalism, possibly with the help of a specific
tool, it can be used by companies for describing products and services.
Phase 5: Maintenance of the ontology
Since the development of an ontology is not a static task and ontologies might change over time, the ontology
has to be adapted to the needs of the application domain. In the area of product classification systems,
thousands of products are developed every year, and these products must also be described according to the
specific product classification system used by companies. If some information introduced by new products
and services cannot be covered by the ontology, the ontology itself, that is the metamodel, has to be adapted
to meet these new requirements. This can include the adaptation of product representation capabilities or new
views, that is to say new taxonomies.
Phase 6: Retiring the ontology
If a product classification system or ontology is not used any more, it has to be retired. This happens when a
specific product classification system falls out of use or is replaced by a successor. In the second
case, a transition to the new product classification system should be supported, so that product data classified
according to the old product classification system can be mapped to its successor, avoiding breaks and
additional work in the electronic supply chain.

Based on the overall methodology for product classification system engineering, a huge number of
methodologies have been developed to satisfy different usages of ontologies in different, but mostly specific
application domains. Within the design phase of these methodologies, different approaches to support the
development of ontologies exist such as:
Architecture of a platform and tools supporting the development and mapping of ontologies or
product classification systems
Formal representation mechanisms and formats/notations for ontologies
Methods for representing the rules for the ontologies
Import and export formats for ontologies
Methods to support mapping between ontologies

4.2.1 A canonical process model for ontology mapping


The canonical process model for ontology mapping has been introduced by [4] and aggregates all the
major mapping approaches. It is based on the CRISP-DM 7 model. The procedure given by this model to
map two given ontologies to one another is shown in Figure 6 and consists of the following steps:
Feature Engineering: transforming the source ontologies into a format for doing similarity
calculations.
Selection of next steps: since the derivation of ontology mappings takes place in a search space of
candidate mappings, this step may choose to compute the similarity of a restricted subset of
candidate concept pairs {(e, f) | e ∈ O1, f ∈ O2} and to ignore others.
Similarity computation: determines similarity value for the candidate mappings.
Similarity aggregation: different similarity values reflecting different similarity measurements for two
candidate concept pairs have to be aggregated into one consolidated value.
Interpretation: within this step, the individual or aggregated similarity values have to be used to
derive the mapping between entities of the investigated ontologies. To support this, thresholds or
relaxation labelling or combination of structural and similarity criteria might be used as mechanism.
Iteration: iteration might be used to repeat one or more of the steps 1 to 5 by using different
algorithms in order to extract the amount of structural knowledge.
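The steps above can be sketched as a small pipeline. The following Python example is a minimal illustration under stated assumptions, not the procedure of [4]: the two toy ontologies, the choice of similarity measures, the equal weights and the threshold of 0.5 are all hypothetical, and the iteration step is omitted.

```python
from difflib import SequenceMatcher
from itertools import product

# Two toy "ontologies": class code -> class name (hypothetical data).
O1 = {"S1": "Office chair", "S2": "Laser printer"}
O2 = {"T1": "Chairs for offices", "T2": "Printer, laser", "T3": "Desk lamp"}


def features(name):
    """Step 1, feature engineering: lower-cased word tokens without punctuation."""
    cleaned = "".join(c if c.isalnum() else " " for c in name.lower())
    return cleaned.split()


def select_candidates(o1, o2):
    """Step 2, selection: here simply every pair (e, f) with e in O1 and f in O2."""
    return product(o1, o2)


def string_similarity(a, b):
    """Step 3a: character-level similarity of the sorted token strings."""
    return SequenceMatcher(None, " ".join(sorted(a)), " ".join(sorted(b))).ratio()


def token_overlap(a, b):
    """Step 3b: Jaccard overlap of 5-character token prefixes (crude stemming)."""
    sa, sb = {t[:5] for t in a}, {t[:5] for t in b}
    return len(sa & sb) / len(sa | sb)


def aggregate(values, weights=(0.5, 0.5)):
    """Step 4, aggregation: weighted average of the individual similarity values."""
    return sum(v * w for v, w in zip(values, weights))


def interpret(scores, threshold=0.5):
    """Step 5, interpretation: best target per source, if it reaches the threshold."""
    mapping = {}
    for (e, f), score in scores.items():
        if score >= threshold and score > mapping.get(e, (None, 0.0))[1]:
            mapping[e] = (f, score)
    return mapping


f1 = {code: features(name) for code, name in O1.items()}
f2 = {code: features(name) for code, name in O2.items()}
scores = {
    (e, f): aggregate((string_similarity(f1[e], f2[f]), token_overlap(f1[e], f2[f])))
    for e, f in select_candidates(O1, O2)
}
mapping = interpret(scores)
print({e: f for e, (f, _) in mapping.items()})
```

An iteration step would rerun the similarity computation with other algorithms or with the mappings found so far as additional input.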

[Figure content: cycle with the boxes Feature Engineering, Selection of next steps, Similarity computation, Similarity aggregation, Interpretation]

Figure 6: Canonical process model for mapping ontologies

7 Cross Industry Standard Process for Data Mining


4.2.2 CC3P
Within the CC3P project a first analysis of the four main product classification systems, CPV, eCl@ss, GPC
and UNSPSC, has been undertaken. This analysis followed a four-step approach given by four so-called
phases [CWA 16138, p.94 ff.]:
Phase 1: Numeric analysis
Phase 2: Syntactical analysis
Phase 3: Semantic analysis
Phase 4: Summary and recommendations

Figure 7: Methodology of analysis of product classification systems [CWA 16138, p. 94]


The first phase of the CWA 16138 methodology gives an overview of the four main product classification
systems by analyzing the number of classes or concepts on each level of the different product classification
systems (see Figure 8).

Figure 8 Classification systems overview [CWA 16138, p. 95]


The second phase of the CWA 16138 methodology includes a syntactical analysis, comprising two
aspects:
Numbering schemas of the classification systems;
Naming schemas of the classification systems.
This led to the following comparison of numbering schemas.


Figure 9: Differences in numbering schemas of CPV, eCl@ss, GPC and UNSPSC


Furthermore, the following differences within the naming schemas were identified.

Figure 10: Comparison of naming schemas for CPV, eCl@ss, GPC and UNSPSC
Within the third phase of the CWA 16138 methodology the semantic analysis has taken the numerical and
syntactical analysis phases as starting points. A comparison has been made to get a rough overview of the
overlapping of the classification systems in terms of commodity class names. Identical and similar
commodity class names between the different classification systems have been investigated. This analysis
has mainly been done on the class level without taking into account the properties of the different classes. As
a consequence it is not a very deep analysis of the semantics.


Within the CC3P project the description of product classification systems is based on a data model approach
taken from the Gen-ePDC project.

Figure 11: Gen-ePDC data model for product classification systems


The Gen-ePDC data model contains all entities which are necessary to describe product classification
systems from a database perspective, including properties and data types. Within this data model no
information is included to describe the mapping between different product classification systems.

4.3 Elementary schema-based matching / mapping approaches

According to the classification of elementary schema-based matching approaches, the following matchers 8
can be differentiated [5]. Many of these matching approaches are used within learning and heuristic
techniques:
Element level matchers
o This kind of matching approach includes matching techniques like:
String-based matching
8 A matcher is a function that fulfils the mapping operation


This technique is based on the assumption that the more similar two strings
are, the more likely it is that they denote the same concept.
The similarity is often explained by using distance functions or variations of
distance functions.
Language-based matching
This technique is based on morphological properties of words to identify
important concepts within a source and is widely used in natural language
processing.
o The first step is the tokenizing of an input stream to locate potential
words of relevance within the data source. In the application area of
product classification systems these tokens are concept or class
names and property names and values.
o In the next step, lemmatization, that is the process of grouping
together the different inflected forms of a word so they can be
analysed as a single item, looks at each candidate word and finds
all its inflected forms (e.g. dog, dogs).
o During the last step, parts of the investigated language resource,
like prepositions, conjunctions and so on, will be flagged for
elimination since they do not denote concepts.
Constraint-based matching
These techniques are making evaluations of entities based on internal
constraints that exist within an entity, like data types or cardinality of
attributes.
Linguistic resource matching
In this case, common knowledge resources such as thesauri maintain
information that can be used to ascertain whether two concepts are equal or
similar.
Alignment reuse
In this case, the intuition is used that many schemas or ontologies to be
matched are similar to already matched schemas or ontologies (especially if
they are in the same application domain).
The available schema or ontology matching can be used to facilitate the
mappings to new domains.
Upper level formal ontologies
The upper level ontologies are a form of external knowledge resource that
can be used to ground ontologies which are under investigation for mapping
in a shared semantic context.
Typically the formalization is done using logic-based systems.
Structural level matchers
o This kind of matching approach includes matching techniques like:
Graph-based techniques
These techniques treat a data source as a labelled graph and assume that if
nodes from two separate ontologies are related or similar, then the nodes
around them are likely to be similar as well.
This matching is typically computationally expensive and works with
approximation.
Taxonomy-based techniques
These techniques consider relations such as the is-a relationship and assume that if an is-a
relation exists between two nodes that are already similar, their neighbours are likely to be
similar too.
Repository of structures
For the storage of ontologies and their fragments, a repository is used.
The idea is that if new structures are to be matched, first similarities to the
structures already given by the repository have to be checked.
Model-based techniques
Matching of concepts is handled based on model-theoretic semantics of
these concepts, like description logic.
Combined matchers
o This kind of matching approach aggregates element level and structural level matchers.
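Two of the element level matchers listed above, string-based and language-based matching, can be illustrated with a short Python sketch. It is a simplified example under stated assumptions: the stop word list, the naive lemmatisation by stripping a trailing "s" and the combination by maximum are choices made for this illustration only, and a real implementation would use proper linguistic resources.

```python
STOP_WORDS = {"and", "or", "for", "of", "the", "with", "in"}


def levenshtein(a, b):
    """Edit distance between two strings (dynamic programming, row by row)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution
            ))
        previous = current
    return previous[-1]


def string_similarity(a, b):
    """String-based matcher: edit distance normalised to a similarity in [0, 1]."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a.lower(), b.lower()) / max(len(a), len(b))


def normalise(name):
    """Language-based preprocessing: tokenise, lemmatise naively, drop stop words."""
    tokens = "".join(c if c.isalnum() else " " for c in name.lower()).split()
    lemmas = [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens]
    return {t for t in lemmas if t not in STOP_WORDS}


def language_similarity(a, b):
    """Language-based matcher: Jaccard overlap of the normalised token sets."""
    ta, tb = normalise(a), normalise(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def combined_similarity(a, b):
    """Combination of both matchers, here simply their maximum."""
    return max(string_similarity(a, b), language_similarity(a, b))


print(string_similarity("dog", "dogs"))
print(language_similarity("Chairs for offices", "Office chair"))
```

The second call shows why the language-based step helps: the two class names differ strongly as strings, but reduce to the same set of lemmas.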


4.4 Architecture

The development of ontologies or even lightweight ontologies like thesauri, terminology systems or product
classification systems is a challenging task. Since the development process is distributed over a large
number of participants and/or organisations, the architecture of a development platform for
product classification systems and their mapping should be open for
the integration of the different parties involved.
In general, there are two overall architectures in the research of ontology matching to look for
correspondences between ontologies [3]:
Reuse of a shared ontology (upper ontology) as a general ontology which has to be extended with
concepts and properties specific to an application area.
o As long as the extensions e.g. concepts are defined consistently with the definitions of the
shared ontology, finding correspondence between concepts can be facilitated.
o This architecture is used in language dictionaries.
Using learning and heuristic techniques, which are applied in cases where no upper ontology is
available or none is intended to be used.
When focussing on a platform architecture for the mapping of product classification systems, there are
in general two meaningful approaches, which will be outlined in the next two subsections:
Centralized architecture and
Distributed architecture.
In both cases, the overall platform architecture is as shown in Figure 12.

Figure 12: Overall platform architecture for the product classification mapping

Users in public and private organisations work with one or more product classification systems. If a
mapping is to be established, these product classification systems either have to be imported into the platform or
have to be accessed online, that is, directly and without storing copies of them within the platform.
The platform has to provide a mapping engine which contains the rules needed to carry out the mapping between the
different product classification systems.
After mapping two or more product classification systems to each other, on the one hand a mapping file
containing the mapping results shall be generated and supported and on the other hand the concrete
mapping of different product classification systems has to be generated and delivered to the user and/or
product classification authority asking for it.

4.4.1 Centralized architecture


The centralized architectural approach is based on the fact that all product classification systems which shall
be mapped to each other first have to be imported into the database of the mapping platform. In addition,
upgrade information, describing possible changes to a specific product classification system shall be
supported too to facilitate the mapping of different versions of product classification systems. The import of
the different product classification systems shall be supported by using different, standard-based import
formats for product classification systems, such as OWL or BMEcat.
The mapping engine works on that internal database, which is designed according to a specific and standard-based data model, e.g. the ePDC data model.
After mapping two or more product classification systems to each other, the mapping result must be exported
to the user in different, possibly standards-based, export formats for product classification systems, e.g.
OWL or BMEcat.

4.4.2 Distributed architecture


The distributed architecture approach is based on the assumption that the different product classification
systems are developed totally independently from each other by the different classification authorities and
can only be accessed directly over a network, e.g. via the internet. Within this approach the internal database only
contains the mapping result and not the product classification systems themselves. Consequently, no
maintenance of the product classification systems is necessary. For the access to the different product
classification systems, an interface, like an appropriate web service interface, has to be specified and used
for the import and the export of the mapping result. Only the mapping rules are stored inside the mapping
platform.
This approach is the one used when developing distributed thesauri today.
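The difference between the two architectures can be made concrete with a small Python sketch. All class and method names below (ClassificationSource, ImportedSource, RemoteSource, MappingPlatform) are hypothetical and chosen for this illustration only; the web service of the distributed variant is replaced by a plain callable.

```python
from abc import ABC, abstractmethod


class ClassificationSource(ABC):
    """Access to one product classification system (hypothetical interface)."""

    @abstractmethod
    def class_name(self, code):
        """Return the name of the class with the given code."""


class ImportedSource(ClassificationSource):
    """Centralized variant: a full copy is imported into the platform database."""

    def __init__(self, classes):
        self._classes = dict(classes)  # local copy that has to be maintained

    def class_name(self, code):
        return self._classes[code]


class RemoteSource(ClassificationSource):
    """Distributed variant: nothing is copied, each lookup goes to the authority."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable standing in for a web service client

    def class_name(self, code):
        return self._fetch(code)


class MappingPlatform:
    """Stores only the mapping rules; class data comes from the sources."""

    def __init__(self, source, target):
        self.source, self.target = source, target
        self.rules = {}  # source class code -> target class code

    def add_rule(self, source_code, target_code):
        self.rules[source_code] = target_code

    def export(self):
        """Mapping result as (source code, source name, target code, target name)."""
        return [
            (s, self.source.class_name(s), t, self.target.class_name(t))
            for s, t in self.rules.items()
        ]


authority = {"T1": "Chairs for offices"}  # stands in for a remote system
platform = MappingPlatform(
    ImportedSource({"S1": "Office chair"}),
    RemoteSource(authority.__getitem__),
)
platform.add_rule("S1", "T1")
print(platform.export())
```

Because the mapping engine only depends on the common interface, the same rule base works with imported and with remotely accessed classification systems.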

4.5 Tools supporting ontology development and mapping

When talking about methodologies for product classification systems, one has to distinguish between the
development of a single product classification system and the mapping of two or more classification systems
to each other.
For the development of product classification systems the same distinction regarding the platform
architecture between centralized and distributed applies. In both cases the systematic development of at
least big product classification systems or the mapping of those is only possible with appropriate tools.
There are a lot of tools that support the development of ontologies or product classification systems on the
market and they are based on different representation mechanisms and architectures. These kinds of tools
have to be integrated or at least associated to the mapping platform to facilitate the development and
mapping of product classification systems.
Besides commercial tools, open-source tools are also available. A good overview of available tools is
accessible on the World Wide Web 9 and in the ontology wiki 10.
9 http://www.xml.com/2004/07/14/examples/Ontology_Editor_Survey_2004_Table_-_Michael_Denny.pdf


These tools are characterized and distinguished by the following criteria:


Modelling capabilities and limitations
Base language for modelling
Web support for the usage
Supported import and export formats
Graphical representation support
Support for consistency checking
Multi-user support
Merging capabilities for ontologies
Support for lexical analysis
Information extraction possibilities
Comment support.
The tools available on the market are mainly suited for the development of ontologies. There are only few
tools which support the mapping of ontologies and they are typically enhancements of the ontology
development tools supporting a graphical but manual mapping of product classification systems.

4.5.1.1 Categories of Mapping Tools


When talking about mapping in the area of product classification systems, one has to distinguish between
the mapping of different product classification systems themselves and the activity to map or assign concrete
products to one or more specific product classification system(s).
The focus of this project is the mapping process between product classification systems and not the
assignment of products to a product classification system. Nevertheless, the second use case is an
important action within enterprises and has a very strong practical usage.
As a consequence, both categories of mapping tools will be addressed. Category B is only mentioned for
the sake of completeness.

4.5.1.2 Category A: Mapping tools to map from one classification system to another classification system
The tools available for the mapping of product classification systems can be classified according to different
criteria:
Technology
o Stand-alone tool
These tools are totally stand-alone, which means that they have to be installed
separately and can read the product classification systems to provide a mapping.
Product classification systems to be mapped have to be available in some data format, e.g.
as an Excel sheet, in a database system or as CSV files.
The mapping result will be exported in some kind of export format.
o Online-platform
These tools are available online to support the mapping of product classification
systems, e.g. on the internet or intra-/extranet of some enterprises.
The product classification systems can be imported in some input format.
The mapping result can be exported in some export formats.
o Integrated tool
These tools are typically extensions of some business software like ERP systems
and support the mapping of master data to one or more product classification
systems.
The mapping result will be available within the master data of the enterprise inside
the business software and can be exported in some format.
Interface support
o Proprietary interfaces
10 http://techwiki.openstructs.org/index.php/Ontology_Tools


According to the technologies mentioned in the previous bullet points, most tools
support proprietary import and export formats for the mapping.
o Standard interfaces
Some tools support standard-based import and export interfaces, e.g. XML-based interfaces or even the product classification system formats.
Automation
o Manually
Some tools offer no functionality for automating the mapping between
product classification systems. Consequently, the mapping has to be done
manually by one or more users with domain knowledge. In some cases this process of
mapping is supported by a graphical user interface.
o Semi-automatically
Most of the tools available support some kind of semi-automatic mapping between
product classification systems.
The mapping mechanisms used inside these tools rely on rule-based systems
where these rules are built up and extended during the mapping process.
In a first step, these tools are able to support an automatic One-to-One mapping, if
related classes within the different product classification systems are present.
For other kinds of mapping relations, like One-to-Many or Many-to-One, the user has
to interact with the system manually to resolve mapping conflicts.
This manual conflict resolution leads to an extension of the rule base, so that later
mappings can be carried out automatically.
o Automatically
None of the tools on the market supports automatic mapping between product
classification systems.
In some specific application domains, ontology mapping tools exist which are able to
support an automatic mapping between ontologies. But to do this, a huge effort
is required to prepare the import formats and the mapping rules to support the
automatic mapping process.
User support
o Textual user interface
Some tools only support a script-based or textual user interface to support the
mapping of product classification systems. If an automatic mapping is supported this
kind of interface is acceptable.
o Graphical user interface
Some tools support a graphical user interface to facilitate the mapping of product
classification systems in drag-and-drop manner. After importing, the product
classification systems are shown in two different windows on the screen and the
user can drag a product class from one window to a product class in the other
window to define the mapping between these classes. This can also be done on a
property or value level.
The result of this mapping process will lead to a rule base for later classification
mapping, e.g. creating some kind of regular expression to describe the mapping
rule.
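The semi-automatic behaviour described above (automatic One-to-One mapping, manual conflict resolution, growth of the rule base) can be sketched as follows. This Python example is an illustration under simplifying assumptions: classes are matched by identical names only, the function semi_automatic_mapping and its parameters are hypothetical, and the user interaction is simulated by a callback.

```python
def semi_automatic_mapping(source, target, rules, ask_user):
    """Map source classes to target classes, asking the user only on conflicts.

    source, target: dict of class code -> class name
    rules:          dict of source code -> list of target codes (the rule base)
    ask_user:       callable(source_code, candidate_codes) -> list of target codes
    """
    by_name = {}
    for code, name in target.items():
        by_name.setdefault(name.strip().lower(), []).append(code)

    result = {}
    for code, name in source.items():
        if code in rules:                  # rule learned in an earlier run
            result[code] = rules[code]
            continue
        candidates = by_name.get(name.strip().lower(), [])
        if len(candidates) == 1:           # automatic one-to-one mapping
            result[code] = candidates
        else:                              # no or several candidates: conflict
            result[code] = ask_user(code, candidates)
            rules[code] = result[code]     # extend the rule base
    return result


source = {"S1": "Office chair", "S2": "Printer"}
target = {"T1": "office chair", "T2": "Laser printer", "T3": "Inkjet printer"}
rules = {}
asked = []


def ask_user(code, candidates):
    """Simulated one-to-many decision by a domain expert."""
    asked.append(code)
    return ["T2", "T3"]


first = semi_automatic_mapping(source, target, rules, ask_user)
second = semi_automatic_mapping(source, target, rules, ask_user)
print(first, asked)
```

In the second run the user is not asked again, because the decision taken for "S2" has become part of the rule base.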

4.5.1.3 Category B: Mapping tools to map product information to a classification system


In addition to the tools available for product classification system mapping, there are tools that support the
assignment of concrete products to a specific product class of a classification system. These tools can be
categorised with the same criteria as tool category A. Typically, these kinds of tools are used to
create electronic product catalogues if they are stand-alone or online tools. In addition, integrated tools
for business software are also available.
Tools for mapping product information to a specific product classification system are out of the scope of this
document.


4.6 Exchange formats for ontologies

In the area of exchanging ontologies, many different exchange formats are used by the different tools.
The most commonly used formats are:
XML
RDF and RDFS
OWL
BMEcat

4.6.1 XML
The acronym XML stands for eXtensible Markup Language and is an official recommendation of the W3C. It
is a markup language used to describe content in a platform-independent manner by using tags for different
content items like the hypertext markup language (HTML). Unlike HTML, XML is not designed to present the
content items to a human user but to transport and store data in a machine-readable form without any layout
information since this kind of information is not necessary for data exchange between machines or software
systems.
One further major difference between HTML and XML is the fact that XML can be used to define a specific
dialect for different application areas, like BMEcat for the exchange of product catalogues between software
systems, as will be explained later. In this sense, XML can be used as a meta language for the definition of e.g.
application-oriented exchange formats. Tags used in XML are not predefined but can be defined by the users
according to their specific needs. These tags are used to describe the content transported in a self-descriptive manner.
Since XML in general only provides the capability to define one's own languages for the exchange and storage of
data in a platform- and system-independent style, it is widely used as a basis for the definition of languages
for the description of resources and ontologies. For example, it is the basis of the Resource Description
Framework (RDF) and the Web Ontology Language (OWL). To describe resources and ontologies, new tags
are introduced for the semantic description of information items in RDF and OWL. XML defines basic
mechanisms, e.g. namespaces and basic data types, which can be used within these specific languages.
Each XML document defines a tree structure for the document starting with a root element. All tags
(elements) which can be used within this XML document are defined by using a document type
definition (DTD) or an XML schema definition (XSD). Each element or tag within an XML document can have
attributes and text content.

<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>
Figure 13: Example of an easy XML document
All XML documents shall be well formed and valid. To be well formed, an XML document has to be defined by
using a correct XML syntax. To be valid, an XML document must conform to a DTD or XSD.
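As an illustration, the bookstore document of Figure 13 can be read with the XML parser of the Python standard library. The sketch below checks well-formedness only; xml.etree.ElementTree does not validate against a DTD or XSD, so validity would need an additional validating parser. The helper name is_well_formed is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>"""


def is_well_formed(text):
    """Return True if the text parses as XML, i.e. is syntactically correct."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(DOCUMENT)  # root element of the document tree
titles = [
    (book.get("category"), book.findtext("title"))  # attribute and text content
    for book in root.findall("book")
]
print(root.tag, titles)
print(is_well_formed("<a><b></a>"))  # mismatched tags are not well formed
```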

4.6.2 RDF and RDFS


The Resource Description Framework (RDF) is a W3C standard for describing Web resources, for example the
title, author, modification date, content, and copyright information of a Web page.
Like XML, RDF in its RDF/XML form uses a tag-based description of resources to support machine-readable,
text-based documents. The XML language used by RDF is called RDF/XML. RDF is part of the W3C's Semantic
Web Activity. The W3C's "Semantic Web Vision" is a future Web in which:
Web information has exact meaning
Web information can be understood and processed by computers
Computers can integrate information from the web
In RDF, all things are identified by Web identifiers (URIs), and these resources are described
with properties and property values:
A Resource is anything that can have a URI, such as "http://www.w3schools.com/rdf"
A Property is a Resource that has a name, such as "author" or "homepage"
A Property value is the value of a Property, such as "Jan Egil Refsnes" or
"http://www.w3schools.com" (note that a property value can be another resource)
The combination of a Resource, a Property, and a Property value forms a Statement (known as the subject,
predicate and object of a Statement).

<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:cd="http://www.recshop.fake/cd#">
<rdf:Description
rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
<cd:artist>Bob Dylan</cd:artist>
<cd:country>USA</cd:country>
<cd:company>Columbia</cd:company>
<cd:price>10.90</cd:price>
<cd:year>1985</cd:year>
</rdf:Description>
<rdf:Description
rdf:about="http://www.recshop.fake/cd/Hide your heart">
<cd:artist>Bonnie Tyler</cd:artist>
</rdf:Description>
</rdf:RDF>

Figure 14: An RDF example¹¹


The first line of the RDF document is the XML declaration. The XML declaration is followed by the root
element of RDF documents: <rdf:RDF>. The xmlns:rdf namespace declaration specifies that elements with the
rdf prefix are from the namespace "http://www.w3.org/1999/02/22-rdf-syntax-ns#". The xmlns:cd namespace
declaration specifies that elements with the cd prefix are from the namespace "http://www.recshop.fake/cd#".
The <rdf:Description> element contains the description of the resource identified by the rdf:about attribute.
The elements <cd:artist>, <cd:country>, <cd:company>, etc. are properties of the resource.
The main elements of RDF are the root element, <rdf:RDF>, and the <rdf:Description> element, which
identifies a resource.

11 http://www.w3schools.com/rdf/rdf_example.asp
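To make the relation between the RDF/XML elements and the subject, predicate and object of a Statement more concrete, the following Python sketch reads a shortened version of the RDF example and lists its statements. It is a simplified illustration using only the standard library XML parser; it handles literal property values inside <rdf:Description> elements and is not a complete RDF parser. The function name extract_statements is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Shortened version of the RDF example (one resource, three properties).
RDF_XML = """\
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cd="http://www.recshop.fake/cd#">
  <rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
    <cd:artist>Bob Dylan</cd:artist>
    <cd:country>USA</cd:country>
    <cd:year>1985</cd:year>
  </rdf:Description>
</rdf:RDF>
"""


def extract_statements(rdf_xml):
    """Return (subject, predicate, object) tuples for literal-valued properties."""
    root = ET.fromstring(rdf_xml)
    statements = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes qualified names as "{namespace}local";
            # joining both parts gives the full property URI.
            namespace, local = prop.tag[1:].split("}")
            statements.append((subject, namespace + local, prop.text))
    return statements
```

Each child element of <rdf:Description> yields one statement: the rdf:about value is the subject, the element name expanded with its namespace is the predicate, and the element text is the object.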
RDF describes resources with classes, properties, and values. In addition, RDF needs a way to define
application-specific classes and properties. These must be defined using
extensions to RDF. One such extension is RDF Schema (RDFS). RDF Schema does not provide actual
application-specific classes and properties. Instead, RDF Schema provides the framework to describe
application-specific classes and properties. Classes in RDF Schema are much like classes in object-oriented
programming languages. This allows resources to be defined as instances of classes, and subclasses of
classes.
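The idea of classes, subclasses and instances can be sketched in a few lines of Python. The sketch below is not an RDFS implementation; it only shows how a subclass relation can be followed transitively. The class names animal and horse follow the RDFS example of this section, while the instance name black_beauty and both function names are hypothetical.

```python
# Subclass relation: class -> list of direct superclasses.
SUB_CLASS_OF = {"horse": ["animal"]}

# Type relation: resource -> class it is declared an instance of.
# "black_beauty" is a hypothetical instance used for illustration.
TYPES = {"black_beauty": "horse"}


def superclasses(class_name, sub_class_of):
    """Return all direct and inherited superclasses of class_name."""
    found = []
    pending = list(sub_class_of.get(class_name, []))
    while pending:
        parent = pending.pop(0)
        if parent not in found:
            found.append(parent)
            pending.extend(sub_class_of.get(parent, []))
    return found


def is_instance_of(resource, class_name, types, sub_class_of):
    """True if the resource belongs to class_name directly or via a subclass."""
    direct = types.get(resource)
    if direct is None:
        return False
    return direct == class_name or class_name in superclasses(direct, sub_class_of)
```

With these definitions, a resource declared as a horse is also recognised as an animal, which mirrors the subclass statement in the RDFS example.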
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xml:base="http://www.animals.fake/animals#">
<rdf:Description rdf:ID="animal">
<rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
</rdf:Description>
<rdf:Description rdf:ID="horse">
<rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
<rdfs:subClassOf rdf:resource="#animal"/>
</rdf:Description>
</rdf:RDF>
Figure 15: RDFS example¹²

<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xml:base="http://www.animals.fake/animals#">
<rdfs:Class rdf:ID="animal" />
<rdfs:Class rdf:ID="horse">
<rdfs:subClassOf rdf:resource="#animal"/>
</rdfs:Class>
</rdf:RDF>
Figure 16: RDFS example abbreviated¹³

RDF is metadata (data about data) and is used to describe information resources. The Dublin Core is a set
of predefined properties for describing documents.
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<rdf:Description rdf:about="http://www.w3schools.com">
<dc:description>W3Schools - Free tutorials</dc:description>
<dc:publisher>Refsnes Data as</dc:publisher>
<dc:date>2008-09-01</dc:date>
<dc:type>Web Development</dc:type>
<dc:format>text/html</dc:format>
<dc:language>en</dc:language>
</rdf:Description>
</rdf:RDF>
Figure 17: RDF and Dublin Core example¹⁴

12 http://www.w3schools.com/rdf/rdf_example.asp
13 http://www.w3schools.com/rdf/rdf_schema.asp

4.6.3 OWL
The Web Ontology Language (OWL) is a language for processing web information. Since an ontology is an
exact description of things and their relationships, an ontology for the Web is an exact description of web
information and of the relationships between pieces of web information.
OWL is designed on top of RDF to facilitate the processing of ontologies by software systems. In this sense it
is meant to be read by machines rather than by human users. OWL is usually written in XML, and the W3C
has defined OWL as a standard for the semantic description of resources on the web. It is widely used
within the web community. Three sublanguages are defined for OWL, which differ in their
expressiveness. These sublanguages are:
OWL Lite
OWL DL, which includes OWL Lite
OWL Full, which includes OWL DL
OWL and RDF are very similar, but OWL is a stronger language with greater machine interpretability than
RDF. It comes with a larger vocabulary and stronger syntax than RDF.
A totally different approach for describing the semantics of product classification systems and the mapping
between these systems comes from the field of ontology engineering. Many
representation formalisms of differing expressiveness have been invented and investigated for the
description of ontologies.
Starting with XML as a general notation to describe resources in a technology-independent way, it has become
obvious that XML in general is not expressive enough to describe the semantics of ontologies, including
product classification systems as lightweight ontologies. To reduce this gap,
RDF and RDFS have been introduced on the basis of XML. RDF and RDFS offer many more capabilities to
describe the semantic information necessary for ontologies.
14 http://www.w3schools.com/rdf/rdf_dublin.asp

Since the semantic web is the main research field driving ontologies, the W3C has developed
OWL as the standard language or representation mechanism to describe content items used in the World
Wide Web. In this sense, OWL can be seen as the next-generation technology for the semantic description
of content.
OWL is a widely used, standardized format to describe semantics.
Ontologies and product classification systems can be expressed using OWL as a formal representation
mechanism and description language.

4.6.4 SKOS
The Simple Knowledge Organization System (SKOS) is a data-sharing standard, bridging several different
fields of knowledge, technology and practice.
In the library and information sciences, a long and distinguished heritage is devoted to developing tools for
organizing large collections of objects such as books or museum artefacts. These tools are known generally
as "knowledge organization systems" (KOS) or sometimes as "controlled structured vocabularies"¹⁵. Different
families of knowledge organization systems, including thesauri, classification schemes, subject heading
systems, and taxonomies are widely recognized and applied in both modern and traditional information
systems. In practice it can be hard to draw an absolute distinction between thesauri and classification
schemes or taxonomies, although some properties can be used to broadly characterize these different
families¹⁶.
The Simple Knowledge Organization System is a common data model for knowledge organization systems
such as thesauri, classification schemes, subject heading systems and taxonomies. Using SKOS, a
knowledge organization system can be expressed as machine-readable data. It can then be exchanged
between computer applications and published in a machine-readable format on the Web. The SKOS data
model is formally defined as an OWL Full ontology, whereas SKOS data are expressed as RDF triples and
may be encoded using any concrete RDF syntax (such as RDF/XML)¹⁷.
The SKOS data model views a knowledge organization system as a concept scheme comprising a set of
concepts. These SKOS concept schemes and SKOS concepts are identified by URIs, enabling anyone to
refer to them unambiguously from any context, and making them a part of the World Wide Web.
SKOS concepts can be labelled with any number of lexical (UNICODE) strings, such as "romantic love" or
"れんあい", in any given natural language, such as English or Japanese (written here in hiragana). One of
these labels in any given language can be indicated as the preferred label for that language, and the others
as alternative labels. Labels may also be "hidden", which is useful where a knowledge organization system is
being queried via a text index.
SKOS concepts can be assigned one or more notations, which are lexical codes used to uniquely identify the
concept within the scope of a given concept scheme. While URIs are the preferred means of identifying
SKOS concepts within computer systems, notations provide a bridge to other systems of identification
already in use such as classification codes used in library catalogues.
SKOS concepts can be documented with notes of various types. The SKOS data model provides a basic set
of documentation properties, supporting scope notes, definitions and editorial notes, among others. This set
is not meant to be exhaustive, but rather to provide a framework that can be extended by third parties to
provide support for more specific types of notes.
SKOS concepts can be linked to other SKOS concepts via semantic relation properties. The SKOS data
model provides support for hierarchical and associative links between SKOS concepts. Again, as with any
part of the SKOS data model, these can be extended by third parties to provide support for more specific
needs.

15 http://www.w3.org/TR/2009/REC-skos-reference-20090818/
16 http://www.w3.org/TR/2009/REC-skos-reference-20090818/
17 http://www.w3.org/TR/2009/REC-skos-reference-20090818/

SKOS concepts can be grouped into collections, which can be labelled and/or ordered. This feature of the
SKOS data model is intended to provide support for node labels within thesauri, and for situations where the
ordering of a set of concepts is meaningful or provides some useful information.
SKOS concepts can be mapped to other SKOS concepts in different concept schemes. The SKOS data
model provides support for four basic types of mapping link: hierarchical, associative, close equivalent and
exact equivalent.
The SKOS Mapping Vocabulary contains a set of properties for specifying mapping relations among
concepts from different domain ontologies (broadMatch, narrowMatch, exactMatch, majorMatch,
minorMatch). Such a rich set of semantic relations for expressing mappings is useful in ranking search results
to reflect the weight of the mapping. Apart from the properties, the SKOS Mapping Vocabulary has three
classes for defining the intersection of concepts (the AND class), the union of concepts (OR), and negation (NOT).
A search system based on these relations takes two ontologies (a source and a target) and a concept from the
source ontology as initial input. The result of the search algorithm is a set of concepts in the target ontology.
For each of the input concepts, the system searches for a mapping to the target ontology. When a matching
target concept is found, it is added to one of five search result lists (clusters) depending on which SKOS
Mapping relation it belongs to. Then the children of each of the matched concepts are searched for similarly
matching results. The algorithm limits the search for matching concept children to a predefined depth. Again,
matching children are added to one of the five search result clusters based on their SKOS mapping relation.
Finally, search results are ranked using the SKOS Mapping relation properties (broadMatch, narrowMatch,
exactMatch, majorMatch, minorMatch) and their weights.
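The search procedure described above can be sketched as follows. This is an illustrative reading of the description, not the implementation of any particular system. The weight values, the concept identifiers and the function names are assumptions made for this example, and "children of the matched concepts" is interpreted here as the children of matched source concepts in the source hierarchy.

```python
# Hypothetical weights: the text names the five relations but gives no values.
RELATION_WEIGHTS = {
    "exactMatch": 1.0,
    "majorMatch": 0.8,
    "narrowMatch": 0.6,
    "broadMatch": 0.5,
    "minorMatch": 0.3,
}


def search(concept, mappings, children, max_depth):
    """Collect target concepts for a source concept and its descendants.

    mappings: dict source concept -> list of (relation, target concept)
    children: dict source concept -> list of child source concepts
    Returns a dict relation -> list of target concepts (the five clusters).
    """
    clusters = {relation: [] for relation in RELATION_WEIGHTS}
    frontier = [(concept, 0)]
    while frontier:
        current, depth = frontier.pop(0)
        matches = mappings.get(current, [])
        for relation, target in matches:
            if target not in clusters[relation]:
                clusters[relation].append(target)
        # Only children of matched concepts are followed, up to max_depth.
        if matches and depth < max_depth:
            frontier.extend((child, depth + 1) for child in children.get(current, []))
    return clusters


def rank(clusters):
    """Flatten the clusters into (target, relation, weight), best first."""
    ranked = [
        (target, relation, RELATION_WEIGHTS[relation])
        for relation, targets in clusters.items()
        for target in targets
    ]
    # sorted() is stable, so targets with equal weight keep cluster order.
    return sorted(ranked, key=lambda item: item[2], reverse=True)


# Hypothetical example data.
MAPPINGS = {
    "src:clothing": [("broadMatch", "tgt:apparel")],
    "src:shirts": [("exactMatch", "tgt:shirts")],
    "src:sport_shirts": [("narrowMatch", "tgt:shirts"), ("minorMatch", "tgt:sportswear")],
}
CHILDREN = {"src:clothing": ["src:shirts"], "src:shirts": ["src:sport_shirts"]}
```

Calling rank(search("src:clothing", MAPPINGS, CHILDREN, 1)) places the exactMatch result found for the child concept before the broadMatch result of the concept itself. A larger depth limit adds the results of deeper children to the clusters.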

4.6.5 BMEcat
BMEcat is an XML-based standard for the transfer of electronic product catalogues and is available as an XML
Schema (XSD). The German Association of Materials Management, Purchasing and Logistics (BME), the
umbrella organization of German purchasing and logistics agents, developed this standard in close
cooperation with industry and academia, based on the experience gathered from co-operation activities at
global level¹⁸.
In BMEcat, data are categorized into data areas which differ according to data content, entry,
management, and complexity:
Identification (article number, GTIN (formerly EAN number), ...)
Description (short and long description, ...)
Grouping (ERP product category number, ...)
Classification data
Properties (weight, colour, ...)
Order information (order unit, minimum order quantity, ...)
Prices (customer price, list price, ...)
Logistics information (delivery times, packaging information, ...)
Additional multimedia data (images, PDF files, ...)
References to other products
Qualifiers (special offer, discontinued model, ...)

4.7 Summary and Recommendations

Since interoperability and integration of data are becoming more and more important, the problem of semantic
heterogeneity has to be solved in the area of product classification systems. One way to address this
problem is to represent product classification systems and their taxonomies as lightweight ontologies, that is,
ontologies without reasoning capabilities.

18 http://www.bme.de/fileadmin/bilder/PDF/BMEcat_en.pdf

Different types of heterogeneity must be addressed to solve the problem of ontology mismatch. One general
point is that product classification systems, as lightweight ontologies, must be represented in a formal
language like RDF(S) or OWL. Once this is done, tools available on the market can be used to align or match
different product classification systems. If formal representations are given for product classification systems,
the ontology matching operation can also be formalized so that it can be reused for different versions of the
product classification systems. Once an alignment is available, it can be used to repeat the mapping operation
and can serve as the basis for an adapted alignment for the new product classification system
versions. Without this formalization, an ongoing mapping between different versions of product classification
systems can only be achieved with huge manual effort.
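The reuse of an existing alignment for new versions can be illustrated with a small Python sketch. It assumes that an alignment is stored as a list of (source code, relation, target code) entries and that the codes of the new releases are known. All codes and names below are hypothetical. The sketch only checks whether codes still exist; whether the meaning of a surviving class has changed still has to be reviewed by an expert.

```python
def carry_over_alignment(alignment, new_source_codes, new_target_codes):
    """Split an existing alignment into reusable and to-be-reviewed entries.

    alignment: list of (source_code, relation, target_code) tuples
    new_source_codes / new_target_codes: sets of codes in the new releases
    Returns (reusable, to_review, unmapped_new_source_codes).
    """
    reusable, to_review = [], []
    for source, relation, target in alignment:
        if source in new_source_codes and target in new_target_codes:
            reusable.append((source, relation, target))
        else:
            # One side no longer exists, so the entry needs manual review.
            to_review.append((source, relation, target))
    mapped_sources = {source for source, _, _ in reusable}
    unmapped_new = sorted(new_source_codes - mapped_sources)
    return reusable, to_review, unmapped_new


# Hypothetical codes, for illustration only.
OLD_ALIGNMENT = [
    ("S-100", "exactMatch", "T-10"),
    ("S-200", "narrowMatch", "T-20"),
    ("S-300", "exactMatch", "T-30"),
]
NEW_SOURCE = {"S-100", "S-200", "S-400"}  # S-300 withdrawn, S-400 added
NEW_TARGET = {"T-10", "T-30"}             # T-20 withdrawn
```

In this example one entry can be carried over unchanged, two entries need review because one of their codes was withdrawn, and two source codes of the new release have no reusable mapping yet.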
In any case, the process of product classification mapping shall follow a methodology that facilitates this
process and makes it transparent to the user. Different approaches and models have been developed in the
area of ontology development and matching.
The most promising model is the canonical process model for ontology mapping described in subsection
4.2.1. This canonical model should be included within the overall methodology for product classification
system engineering to reach a complete methodology for the ontology mapping process and consequently
for the product classification system mapping process.
To carry out the ontology matching or mapping process, different types of matchers, elementary and structural
matchers, shall be used inside the methodology. The combination of both types promises the best results for
the mapping process.
To facilitate the usage of this methodology, a platform supporting it shall be provided. There
are some tools on the market, commercial as well as open-source, dealing with different aspects of
ontology engineering. As a general architecture, both suggested architectures are suitable for the product
classification system mapping platform. The centralized platform architecture has some advantages over
the distributed architecture, since exchange formats (for both import and export) are not needed, which limits
the impact of the lack of an industry-standard format.
Additionally, data files for product classification systems might be very big. Limiting the traffic to
importing product classification systems into the mapping platform will therefore keep the network traffic at
a lower level compared to the online access of a distributed architecture. The processing capabilities of the
centralized platform are independent of the processing capabilities of the different development platforms
hosted by the different product classification system authorities.
Because of these criteria it is recommended to use a centralized platform architecture for the mapping
platform. In addition, a meaningful import and export format for the exchange of the product classification
systems has to be selected or developed, as well as a representation for the mapping rules and the mapping
file containing the mapping between concepts of the different product classification systems.
The overview of the exchange formats for ontologies and product classification systems as lightweight
ontologies has shown that OWL is the most promising candidate for the exchange of ontologies. Therefore
product classification systems as lightweight ontologies shall be represented using OWL. Some work has
been done, and is still ongoing, to represent product classification systems such as eCl@ss and UNSPSC
in the OWL notation. SKOS uses OWL Full to describe the data model for ontologies and product
classification systems. Because it introduces specific mapping elements for product classification
systems, SKOS shall be used as the basic system to map product classification systems on a semantic level. To
facilitate the usage of product classification systems in exchanging and mapping these systems, SKOS
provides the possibility to convert SKOS-based ontology representations to other formats, like RDF
and BMEcat.

5. The cMap Overall Mapping Methodology

5.1 Requirements

For the design of the cMap overall mapping methodology, some general requirements have been taken into
account. These requirements concern the information provided by the investigated product classification
systems, the usage or computation of these product classification systems, and some restrictions set by the
project in order to carry out the mapping on a useful platform architecture.

5.1.1 Product Classification System versions used in cMap


In the CC3P project, four different horizontal product classification systems have been selected for mapping,
in the following versions:
CPV 2008
eCl@ss 6.0.1
GPC as at 31 August 2009
UNSPSC v11.1201
The mapping process started with six horizontal domains. Mappings among the four classification systems
have been established. Each CPV domain is mapped in one comprehensive mapping table. The mapping
tables show how the independent systems correspond to each other.
Figure 18 below illustrates that there is one mapping table for each CPV domain and that each system
is mapped to all three other systems. The mappings are bidirectional. Since there are 45 CPV
domains, the total number of mapping tables is 45.

Figure 18: Mapping tables


For the cMap project, the same versions as those used in the CC3P project have been used.
These versions are not the latest versions of each classification system. However, they can serve as starting
points to create the initial mapping of all domains among these product classification systems and to make the
mapping results generally accessible.
A consistent mapping methodology is a key part of setting up a process that can be applied to any
future versions. The mapping principles and guidelines for a continuous mapping of new releases
are described in detail in the following sections.

5.1.2 General requirements about the cMap mapping / matching methodology


The requirements arising from the usage of the investigated product classification systems themselves are:
All product classification systems are given in Microsoft Excel format as input files.
The mapping results shall be available in a format suitable for machine computation as well as in a
human-readable format.
The starting point of the mapping between the different product classification systems shall be the 45
domains of the CPV system.
The mapping from one classification system to another and vice versa shall be reflected in a separate
file for each of the 45 domains.
When establishing mappings, find for each concept at the most detailed level of one classification the
equivalent¹⁹ concept at the most detailed appropriate level of the other system.
o This step then allows, when needed, the groups²⁰ of one classification to be subsequently
aggregated to most of the relevant aggregated groups of the other.
Information retrieval between product classification systems is semi-automatic.
The information search should be domain-independent, since a particular domain in the source
classification system may be equivalent to, or spread over, several domains in the target system.
o For example, the Clothing domain in CPV is spread over 6 eCl@ss domains, 9 GPC
domains and 6 UNSPSC domains due to significantly different domain scopes.
The search query takes the class name that describes the concept it represents.
Choosing the right phrase for searching is critical for the mapping success.
Extra semantic richness such as definitions, extended specifications and synonyms is very important
for identifying the required granularity to find the equivalent / similar classes in the target
system.
Class properties can also be considered as semantic richness, especially concept-specific
properties such as those in GPC and partly in eCl@ss (beyond the concept-specific attributes there are
also generic attributes, which can be redundant from a mapping perspective).
Once the equivalent / similar classes are found (or not found), the relationship cardinality should
be defined for each matching line in the table, for example One-to-One or One-to-Many. The
cardinalities are further described in section 5.1.4.

5.1.3 Mapping challenges


The mapping exercise conducted in CC3P and continued in the cMap project has highlighted the
following mapping challenges:
Classification system domain scopes are different: as an example, the Clothing domain in CPV is
spread over several domains in other systems: six domains in eCl@ss, nine domains in GPC and
six domains in UNSPSC.
Not all classification systems support the mutual exclusivity principle: classes sometimes belong to
more than one domain.
Domains are in some cases designed based on the technical description of the concepts and in
other cases based on the intended usage of the concept, which may make it difficult to
determine which domain hosts a certain class. As an example, in CPV "Shirts" and "Sport Shirts"
are classified in two separate domains, while they could be classified in one domain in another
classification system.
Class hierarchies are not always properly defined and some elements do not belong to the correct
hierarchy. In the "is a" hierarchies²¹ some sub-classes should be moved up to the next
hierarchy level. E.g.: in eCl@ss "accumulator, battery" has sub-classes such as "repair" or
"maintenance" that are not batteries.²²
The CPV hierarchy is not always clear: classes can be at different hierarchical levels.
Domains are not always coherent, in that the number of class-level nodes can be very different even
under the same maintenance body. E.g.: in UNSPSC there is a huge proliferation of commodity codes in the
Food Beverage and Tobacco domain, unlike in other domains.
The extra semantic richness that helps the mapping is not available for all the classes in all the
classification systems. Definitions are sometimes missing, and properties that help to understand the
class concept are only available in eCl@ss and in GPC.
Different terminologies are in use due to the different dialects of the English language, and the
language-based search has to cope with the differences: CPV and GPC are in UK English; UNSPSC
is in American English, while eCl@ss is translated from German to American English.
Property vocabularies: in product classification systems that use class properties, those properties have
to be analysed when searching for the correct equivalent class. The amount of information to be
investigated is much bigger and the analysis is more resource-consuming.
In the same way, for product classification systems that use keywords (eCl@ss), these keywords
have to be investigated, which requires extra resources.
Terms with identical names are often semantically different.
The differences between classification systems are not as simple as the equivalence or subclass
relations between named classes typically found by automated alignment tools.
Due to the complexity of the mapping challenges, fully automated alignment approaches seem unrealistic, and
none are available on the market.
For these reasons, the mapping was mainly a semi-automatic semantic mapping process that
involved looking at names and structures; when possible, definitions, synonyms and actual product
properties were also used.
The entire classification systems were used to find the equivalent codes for each mapping table, since the
domain scopes (see the clothing example above) were very different among the four classification systems.
The mapping exercise was conducted largely manually, with the help of some semi-automatic search and
browse features and automated data extraction from the MS Excel tables and associated documents.
Potential relationships that could reasonably be followed across multiple edges of the hierarchy to find the
best available match have been analysed. The original CPV hierarchical structure has been kept as much as
possible.

19 Equivalent means having the same semantics.
20 Group means a concept on a higher level within the taxonomy.
21 Inheritance relationship or child-parent relationship.
22 This has been changed in the latest eCl@ss release 7.0. Product-related services have been moved to the new segments Logistics
(service) and Maintenance (service) based on the recommendations made in the CC3P CWA.

5.1.4 Mapping relationships


In the area of product classification systems, the concepts (classes) are arranged in a hierarchical structure
and described by different terms, like class names, synonyms, properties etc. This hierarchical structure is
defined by relations between these concepts, e.g. "is-part-of", "is-a" or "refines". When only taking into
account the mapping of concepts or classes of product classification systems, the following mapping
relationships in terms of cardinality have to be considered:
One-to-One (abbreviated to 121): One class in the source system matches an equivalent
class of the target system.
One-to-Many (abbreviated to 12M): One class in the source system matches more than one
class of the target system.
Many-to-One (abbreviated to M21): Several classes in the source system match one
class of the target system.
Many-to-Many (abbreviated to M2M): Several classes in the source system match more
than one class of the target system.
None-to-One (abbreviated to N21): The class is missing in the source system but there is one class
available in the target system.
One-to-None (abbreviated to 12N): One class is available in the source system but there is no equivalent
class in the target system.
None-to-None (abbreviated to N2N): Classes are missing in both the source and the target
systems but there are classes available in one or both of the other two systems.
None-to-Many (abbreviated to N2M): A class is missing in the source system but there are several
classes available in one or both of the other two systems.
Many-to-None (abbreviated to M2N): Several classes are available in the source system but there
is no equivalent class in the target system.
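The cardinality abbreviations listed above follow a regular pattern, which the following Python sketch makes explicit. It is an illustration only: the row format (one source code and one target code per line, with None for a missing class), the codes and the function names are assumptions for this example and do not describe the layout of the actual mapping tables.

```python
from collections import defaultdict


def side_symbol(count):
    """Map a number of classes to N (none), 1 (one) or M (many)."""
    if count == 0:
        return "N"
    return "1" if count == 1 else "M"


def cardinality(source_count, target_count):
    """Build the abbreviation, e.g. cardinality(1, 3) gives '12M'."""
    return f"{side_symbol(source_count)}2{side_symbol(target_count)}"


def label_rows(rows):
    """Label each (source_code, target_code) row of a pairwise mapping.

    A row with None on one side records a class without counterpart.
    """
    targets_of = defaultdict(set)
    sources_of = defaultdict(set)
    for source, target in rows:
        if source is not None and target is not None:
            targets_of[source].add(target)
            sources_of[target].add(source)
    labelled = []
    for source, target in rows:
        if source is None or target is None:
            n_source = 0 if source is None else 1
            n_target = 0 if target is None else 1
        else:
            n_source = len(sources_of[target])  # sources sharing this target
            n_target = len(targets_of[source])  # targets sharing this source
        labelled.append((source, target, cardinality(n_source, n_target)))
    return labelled


# Hypothetical rows for one pair of systems.
ROWS = [
    ("S1", "T1"),
    ("S2", "T2"), ("S2", "T3"),
    ("S3", "T4"), ("S4", "T4"),
    ("S5", None),
    (None, "T9"),
]
```

For the hypothetical rows above, label_rows assigns 121 to the first row, 12M to the two rows of S2, M21 to the two rows pointing to T4, 12N to S5 and N21 to T9.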

5.2 Design of the cMap Mapping Methodology

5.2.1 Design of the mapping methodology


The design of the cMap overall mapping methodology is mainly driven by the canonical process model for
ontology mapping described in section 4. Within this canonical process model, the mapping methodology
given by the CC3P project has been used. In this way the cMap mapping methodology is a combination of
both methodologies.
Within the cMap mapping methodology, the steps "Similarity computation", "Similarity aggregation" and
"Interpretation" are replaced by the CC3P methodology steps "Numeric analysis", "Syntactical analysis"
and "Semantic analysis". This is because no numerical threshold values are used to determine similarities
between possible corresponding classes of the investigated product classification systems.

[Figure 19 elements: Feature engineering; Selection of next steps; Numeric analysis; Syntactical analysis; Semantic analysis]
Figure 19: The cMap mapping methodology


Numerical values are not used within the cMap mapping methodology since the differences in granularity,
scope, domain and architecture between the product classification systems are very large. As a
consequence, additional information given by the hierarchy context of each class to be mapped
has to be used.

Figure 20: The ontology matching operation [8]

According to the ontology matching operation described in section 4, no matching parameters and no
additional resources have been used within the cMap mapping methodology to carry out the mapping.
Additional resources are only implicitly given by the domain knowledge of the mapping experts, and not in a
formal way or provided by external thesauri. The input alignment A used for the mapping is given by the context
information derived from the hierarchy and the application domain of the specific concepts (e.g. product
classes) which have to be mapped, but again not in a formal way.
The general mapping approach taken by the cMap project is the combined matching approach described in
subsection 4.3. This combined matching contains the following matchers:
Element level matchers
o String based matching
o Language based matching
o Constraint based matching
o Alignment reuse (implicitly given by experts)
Structural level matchers
o Taxonomy based techniques
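How element level and structural level matchers can be combined is sketched below in Python. This is not the cMap procedure, which relied on expert judgement and did not use numerical thresholds. The scores here only serve to order candidate classes for review by an expert. The spelling table, the class names and all function names are assumptions for this example, and constraint based matching and alignment reuse are not covered.

```python
from difflib import SequenceMatcher

# Hypothetical spelling table for the language based step (UK -> US English).
SPELLING_VARIANTS = {"colour": "color", "jewellery": "jewelry", "tyres": "tires"}


def normalise(name):
    """Language based step: lowercase, split into tokens, unify spellings."""
    tokens = name.lower().replace(",", " ").split()
    return [SPELLING_VARIANTS.get(token, token) for token in tokens]


def string_similarity(name_a, name_b):
    """String based step: character-level similarity of normalised names."""
    a = " ".join(normalise(name_a))
    b = " ".join(normalise(name_b))
    return SequenceMatcher(None, a, b).ratio()


def token_overlap(name_a, name_b):
    """Share of tokens the two names have in common (0.0 to 1.0)."""
    a, b = set(normalise(name_a)), set(normalise(name_b))
    return len(a & b) / len(a | b) if a | b else 0.0


def candidates(source_class, target_classes):
    """Order target classes for expert review.

    Each class is a (name, parent_name) pair; the parent name stands in
    for the taxonomy based (structural) evidence.
    """
    source_name, source_parent = source_class
    scored = []
    for target_name, target_parent in target_classes:
        element_score = max(
            string_similarity(source_name, target_name),
            token_overlap(source_name, target_name),
        )
        structure_score = token_overlap(source_parent, target_parent)
        scored.append((target_name, round(element_score, 2), round(structure_score, 2)))
    # Best element level evidence first, structural evidence as tie-breaker.
    return sorted(scored, key=lambda row: (row[1], row[2]), reverse=True)
```

For a source class ("Colour televisions", "Televisions") and target classes ("Color televisions", "Televisions") and ("Radio receivers", "Audio equipment"), the first target is listed first because its normalised name and its parent both agree with the source class.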

5.2.2 The cMap platform architecture


Within the cMap project the centralized architecture approach has been chosen.
The different product classification systems of the project are available as Microsoft Excel files. This
approach has been selected since the project did not intend to access the product
classification systems online directly, but to map specific versions of the given product classification
systems. To facilitate the work within the cMap project, no restrictions or assumptions have been made
concerning a specific data model for the storage of the different product classification systems. Instead,
the mapping has been carried out in a semi-automatic way based on the different Excel files. The result of
the mapping is recorded in individual mapping files, also based on Excel, for easy usage and analysis by
the project team and the intended users. These resulting Excel files are attached as an annex to this document.

Figure 21: The cMap Mapping architecture

5.2.3 Selection of an appropriate tool for the mapping


For the development of the initial complete mapping of the selected releases of the four classification
systems, Microsoft Excel has been selected as a semi-automatic mapping tool. This decision has been made
since the mapping process and the mapping results should be easy to read and usable by any user, and should
not be proprietary to any tool available on the market.
Consequently, a semi-automatic semantic mapping methodology was selected. It includes search and browse
features and automated data extraction from the MS Excel tables and associated documents.
There are some commercial and also open-source mapping tools for ontologies on the market, which might
be used in the future to develop the mapping between the different product classification systems. The most
appropriate tool for this task seems to be the Protégé tool of Stanford University, since it is an
open-source, Java-based tool with many enhancements for the representation and mapping of ontologies as well
as input and output converters to support different exchange formats for product classification systems and
the mapping itself. A short overview of the most promising tools for the future development of the mappings
is given in annexes A to C of this document.
The selection of the tools is based on the fact that Protégé is an open-source platform for ontology
development which can be adopted and used also by SMEs to develop and maintain their product
classification systems as well as the product classification systems given by standard bodies, like CPV,
eCl@ss, GPC and UNSPSC. In addition to this platform, the Prompt tool might be used to fulfil the task of
ontology mapping and merging. Finally, the SKOS Editor plug-in for Protégé might be used for an
appropriate representation of product classification systems, since its modelling capabilities fit the
application area of product classification systems very well. Most of the overhead of typical ontology
representation languages like OWL is not included in the SKOS modelling language, which is still based
on OWL but better suited to the representation of product classification systems as lightweight ontologies.
This is because hierarchy relations can be reflected in different ways, not only by inheritance, and because
its reasoning capabilities are restricted, reasoning not being relevant in the application area of taxonomies.

5.2.4 Import and Export format


For import and export of ontologies and product classification systems a number of different formats are
available and used in the market, like XML, RDF(S), OWL and BMEcat. A short description of these different
formats is given in section 4, which is not exhaustive, but gives a good overview of the solutions available on
the market. An additional overview of exchange formats for product classification systems based on
ISO 13584-2, BMEcat and UN/CEFACT is given in the final report of the eCat-PPS project.
Within this project, only Microsoft Excel has been used as exchange format for the different product
classification systems as well as for the resulting mapping tables.


5.3 Usage of the cMap Mapping Methodology: Mapping results statistics

The mapping exercise of the 45 domains resulted in the following outcome:


Table 22: Mapping statistics

Table 22 has four vertical sections (I-IV):

I. Mapping from CPV to the other three classification systems
II. Mapping from eCl@ss to the other three classification systems
III. Mapping from GPC to the other three classification systems
IV. Mapping from UNSPSC to the other three classification systems

and six horizontal sections that detail the statistics by cardinality:

- Subtotal 1: total of 121 + 12M mappings: one class is available in the source system and there is
  one or more than one equivalent class in the target system
- Subtotal 2: total of 12N mappings: one class is available in the source system and there is no
  equivalent class in the target system
- Subtotal 3: total of M21 + M2M mappings: more than one class is available in the source system and
  there is one or more than one equivalent class in the target system
- Subtotal 4: total of M2N mappings: more than one class is available in the source system and there
  is no equivalent class in the target system
- Subtotal 5: total of N2N mappings: no class is available in either the source or the target system
- Subtotal 6: total of N21 + N2M mappings: no class is available in the source system and there is one
  or more than one equivalent class in the target system
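The cardinality labels and subtotals listed above can be expressed as a small lookup. The following is a
minimal sketch, assuming that each mapping row is described only by the number of classes on the source side
and on the target side; the function names and the dictionary are hypothetical and not part of the cMap
deliverables.

```python
def side_label(class_count: int) -> str:
    """Return 'N' for no class, '1' for one class and 'M' for more than one."""
    if class_count < 0:
        raise ValueError("class count must not be negative")
    if class_count == 0:
        return "N"
    return "1" if class_count == 1 else "M"


def cardinality(source_classes: int, target_classes: int) -> str:
    """Build a label such as '121', '12M' or 'N2N' (source, '2', target)."""
    return f"{side_label(source_classes)}2{side_label(target_classes)}"


# Subtotal number of Table 22 for each cardinality label.
SUBTOTAL_OF = {
    "121": 1, "12M": 1,
    "12N": 2,
    "M21": 3, "M2M": 3,
    "M2N": 4,
    "N2N": 5,
    "N21": 6, "N2M": 6,
}


def subtotal(source_classes: int, target_classes: int) -> int:
    """Return the subtotal (1 to 6) that a mapping row contributes to."""
    return SUBTOTAL_OF[cardinality(source_classes, target_classes)]
```

For example, a row with one source class and three target classes is labelled 12M and counts towards
Subtotal 1.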

5.3.1 Interpretation for the CPV system (I. Vertical Section)


5.3.1.1 Subtotal 1: 121 + 12M mappings
121 cardinality: one class in CPV matches one class in another system
To eCl@ss: 10%
To GPC: 14%
To UNSPSC: 17%
12M cardinality: one class in CPV has more than one matching class in another system
In eCl@ss: 8%
In GPC: 5%
In UNSPSC: 7%
121 + 12M cardinality: one class is available in CPV and there is one or more than one matching
class in the target system
In eCl@ss: 18%
In GPC: 19%
In UNSPSC: 24%
The highest share of 121 + 12M mappings is between CPV and UNSPSC, at 24%. In other words,
24% of the CPV classes have one or more equivalent classes in UNSPSC.
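Each subtotal is the sum of its component cardinalities. As a short worked check, the sketch below
recomputes Subtotal 1 for CPV from the 121 and 12M percentages given above; the variable names are chosen
for illustration only.

```python
# Percentages for CPV as source system, taken from the figures listed above.
one_to_one = {"eCl@ss": 10, "GPC": 14, "UNSPSC": 17}   # 121 cardinality
one_to_many = {"eCl@ss": 8, "GPC": 5, "UNSPSC": 7}     # 12M cardinality

# Subtotal 1 = 121 + 12M, computed per target system.
subtotal_1 = {target: one_to_one[target] + one_to_many[target] for target in one_to_one}

# Target system with the highest share of 121 + 12M mappings.
best_target = max(subtotal_1, key=subtotal_1.get)
```

The result reproduces the listed values of 18%, 19% and 24%, with UNSPSC as the best-matching target system.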

5.3.1.2 Subtotal 2: 12N mappings


12N cardinality: one class is available in the source system and there is no equivalent class in the target
system
In eCl@ss: 34%
In GPC: 35%
In UNSPSC: 28%
UNSPSC has the lowest share of CPV classes without an equivalent class, at 28%.

5.3.1.3 Subtotal 3: M21 + M2M mappings


M21 cardinality: more than one class in CPV matches one class in the target system
In eCl@ss: 0%
In GPC: 0%
In UNSPSC: 0%
M2M cardinality: more than one class is available in CPV and there is more than one equivalent class in
another system
In eCl@ss: 0%
In GPC: 0%
In UNSPSC: 0%
M21 + M2M cardinality: more than one class is available in CPV and there is one or more than one
equivalent class in the target system
In eCl@ss: 0%
In GPC: 0%
In UNSPSC: 0%
There are no cases (0%) in which more than one class is available in the CPV system and there is one or
more than one equivalent class in any of the three target systems.


5.3.1.4 Subtotal 4: M2N mappings


M2N cardinality: more than one class in CPV matches no class in the target system
In eCl@ss: 0%
In GPC: 0%
In UNSPSC: 0%
There are no cases (0%) in which more than one class is available in the CPV system and there is no
equivalent class in any of the three target systems.

5.3.1.5 Subtotal 5: N2N mappings


N2N cardinality: there is no class in CPV and there is no class in one of the other systems.
No class in CPV, no class in eCl@ss, class(es) in GPC and/or in UNSPSC: 25%
No class in CPV, no class in GPC, class(es) in eCl@ss and/or in UNSPSC: 33%
No class in CPV, no class in UNSPSC, class(es) in eCl@ss and/or in GPC: 15%
In only 15% of the cases are no classes available in both the CPV and UNSPSC systems while there are
classes in eCl@ss and/or in GPC.

5.3.1.6 Subtotal 6: N21 + N2M mappings


N21 cardinality: no class in CPV matches one class in the target system
In eCl@ss: 19%
In GPC: 10%
In UNSPSC: 27%
N2M cardinality: no class in CPV matches more than one class in the target system
In eCl@ss: 4%
In GPC: 3%
In UNSPSC: 6%
N21 + N2M cardinality: no class is available in CPV and there is one or more than one equivalent
class in the target system
In eCl@ss: 23%
In GPC: 13%
In UNSPSC: 33%
In only 13% of the cases is no class available in the CPV system while there is one or more than one
equivalent class in GPC.


5.3.2 Interpretation for the eCl@ss system (II. Vertical Section)


5.3.2.1 Subtotal 1: 121 + 12M mappings
121 cardinality: one class in eCl@ss matches one class in another system:
In CPV: 10%
In GPC: 7%
In UNSPSC: 8%
12M cardinality: one class in eCl@ss matches more than one class in another system:
In CPV: 0%
In GPC: 2%
In UNSPSC: 3%
121 + 12M cardinality: one class is available in eCl@ss and there is one or more than one equivalent
class in the target system:
In CPV: 10%
In GPC: 9%
In UNSPSC: 11%
The highest share of 121 + 12M mappings is between eCl@ss and UNSPSC, at 11%, i.e. in 11% of the
eCl@ss cases one or more equivalent classes can be found in UNSPSC.

5.3.2.2 Subtotal 2: 12N mappings


12N cardinality: one class is available in the source system and there is no equivalent class in the target
system:
In CPV: 19%
In GPC: 26%
In UNSPSC: 17%
UNSPSC has the lowest share of eCl@ss classes without an equivalent class, at 17%.

5.3.2.3 Subtotal 3: M21 + M2M mappings


M21 cardinality: more than one class in eCl@ss matches one class in the target system:
In CPV: 8%
In GPC: 4%
In UNSPSC: 4%
M2M cardinality: more than one class in eCl@ss matches more than one class in the target
system:
In CPV: 0%
In GPC: 4%
In UNSPSC: 5%
M21 + M2M cardinality: more than one class is available in eCl@ss and there is one or more than one
equivalent class in the target system:
In CPV: 8%
In GPC: 8%
In UNSPSC: 9%
In 9% of the cases more than one class is available in the eCl@ss system and there is one or more than
one equivalent class in UNSPSC.


5.3.2.4 Subtotal 4: M2N mappings

M2N cardinality: more than one class in eCl@ss matches no class in the target system:
In CPV: 4%
In GPC: 7%
In UNSPSC: 3%
In 7% of the cases more than one class is available in the eCl@ss system and there is no
equivalent class in GPC.

5.3.2.5 Subtotal 5: N2N mappings

N2N cardinality: there is no class in eCl@ss and there is no class in one of the other systems:
No class in eCl@ss, no class in CPV, class(es) in GPC and/or in UNSPSC: 25%
No class in eCl@ss, no class in GPC, class(es) in CPV and/or in UNSPSC: 36%
No class in eCl@ss, no class in UNSPSC, class(es) in CPV and/or in GPC: 25%
In 36% of the cases no classes are available in both the eCl@ss and GPC systems, but there
are classes in CPV and/or in UNSPSC.

5.3.2.6 Subtotal 6: N21 + N2M mappings

N21 cardinality: no class in eCl@ss matches one equivalent class in the target system:
In CPV: 34%
In GPC: 12%
In UNSPSC: 29%
N2M cardinality: no class in eCl@ss matches more than one equivalent class in the target
system:
In CPV: 0%
In GPC: 2%
In UNSPSC: 6%
N21 + N2M cardinality: no class is available in eCl@ss and there is one or more than one equivalent
class in the target system:
In CPV: 34%
In GPC: 14%
In UNSPSC: 35%
In only 14% of the cases is no class available in the eCl@ss system while there is one or more
than one equivalent class in GPC.

5.3.3 Interpretation for the GPC system (III. Vertical Section)


5.3.3.1 Subtotal 1: 121 + 12M mappings
121 cardinality: one class in GPC matches one class in another system:
In CPV: 4%
In eCl@ss: 7%
In UNSPSC: 13%
12M cardinality: one class in GPC matches more than one class in another system:
In CPV: 0%
In eCl@ss: 4%
In UNSPSC: 5%
121 + 12M cardinality: one class is available in GPC and there is one or more than one equivalent
class in the target system:
In CPV: 14%
In eCl@ss: 11%
In UNSPSC: 18%
The highest share of 121 + 12M mappings is between GPC and UNSPSC, at 18%, i.e. in 18% of the GPC
cases there are one or more equivalent classes in UNSPSC.

5.3.3.2 Subtotal 2: 12N mappings


12N cardinality: one class is available in GPC and there is no equivalent class in the target system:
In CPV: 10%
In eCl@ss: 12%
In UNSPSC: 7%
UNSPSC has the lowest share of GPC classes without an equivalent class, at 7%.

5.3.3.3 Subtotal 3: M21 + M2M mappings

M21 cardinality: more than one class in GPC matches one class in the target system:
In CPV: 5%
In eCl@ss: 2%
In UNSPSC: 1%
M2M cardinality: more than one class in GPC matches more than one class in the target
system:
In CPV: 0%
In eCl@ss: 4%
In UNSPSC: 4%
M21 + M2M cardinality: more than one class is available in GPC and there is one or more than one
equivalent class in the target system:
In CPV: 5%
In eCl@ss: 6%
In UNSPSC: 5%
In 6% of the cases more than one class is available in the GPC system and there is one or more than one
equivalent class in eCl@ss.

5.3.3.4 Subtotal 4: M2N mappings


M2N cardinality: more than one class in GPC matches no class in the target system:
In CPV: 3%
In eCl@ss: 2%
In UNSPSC: 2%
In 3% of the cases more than one class is available in the GPC system and there is no
equivalent class in CPV.

5.3.3.5 Subtotal 5: N2N mappings


N2N cardinality: there is no class in GPC and there is no class in one of the other systems:
No class in GPC, no class in CPV, class(es) in eCl@ss and/or in UNSPSC: 33%
No class in GPC, no class in eCl@ss, class(es) in CPV and/or in UNSPSC: 36%
No class in GPC, no class in UNSPSC, class(es) in CPV and/or in eCl@ss: 31%
In 36% of the cases no classes are available in both the eCl@ss and GPC systems, but there
are classes in CPV and/or in UNSPSC.

5.3.3.6 Subtotal 6: N21 + N2M mappings


N21 cardinality: no class in GPC matches one equivalent class in the target system:
In CPV: 35%
In eCl@ss: 26%
In UNSPSC: 28%
N2M cardinality: no class in GPC matches more than one equivalent class in the target system:
In CPV: 0%
In eCl@ss: 7%
In UNSPSC: 9%
N21 + N2M cardinality: no class is available in GPC and there is one or more than one equivalent
class in the target system:
In CPV: 35%
In eCl@ss: 33%
In UNSPSC: 37%
In only 33% of the cases is no class available in the GPC system while there is one or more
than one equivalent class in eCl@ss.

5.3.4 Interpretation for the UNSPSC system (IV. Vertical Section)


5.3.4.1 Subtotal 1: 121 + 12M mappings
121 cardinality: one class in UNSPSC matches one class in another system:
In CPV: 17%
In eCl@ss: 8%
In GPC: 13%
12M cardinality: one class in UNSPSC matches more than one class in another system:
In CPV: 0%
In eCl@ss: 4%
In GPC: 1%
121 + 12M cardinality: one class is available in UNSPSC and there is one or more than one
equivalent class in the target system:
In CPV: 17%
In eCl@ss: 12%
In GPC: 14%
The highest share of 121 + 12M mappings is between UNSPSC and CPV, at 17%. In other words, in 17% of
the UNSPSC cases there are one or more equivalent classes in CPV.

5.3.4.2 Subtotal 2: 12N mappings


12N cardinality: one class is available in UNSPSC and there is no equivalent class in the target system:
In CPV: 27%
In eCl@ss: 29%
In GPC: 28%
CPV has the lowest share of UNSPSC classes without an equivalent class, at 27%.

5.3.4.3 Subtotal 3: M21 + M2M mappings

M21 cardinality: more than one class in UNSPSC matches one class in the target system:
In CPV: 7%
In eCl@ss: 3%
In GPC: 5%
M2M cardinality: more than one class in UNSPSC matches more than one class in the target
system:
In CPV: 0%
In eCl@ss: 5%
In GPC: 4%
M21 + M2M cardinality: more than one class is available in UNSPSC and there is one or more than one
equivalent class in the target system:
In CPV: 7%
In eCl@ss: 8%
In GPC: 9%
In 9% of the cases more than one class is available in the UNSPSC system and there is one or
more than one equivalent class in GPC.

5.3.4.4 Subtotal 4: M2N mappings


M2N cardinality: more than one class in UNSPSC matches no class in the target system:
In CPV: 6%
In eCl@ss: 6%
In GPC: 9%
In 9% of the cases more than one class is available in the UNSPSC system and there is no
equivalent class in GPC.

5.3.4.5 Subtotal 5: N2N mappings


N2N cardinality: there is no class in UNSPSC and there is no class in one of the other systems:
No class in UNSPSC, no class in CPV, class(es) in eCl@ss and/or in GPC: 15%
No class in UNSPSC, no class in eCl@ss, class(es) in CPV and/or in GPC: 25%
No class in UNSPSC, no class in GPC, class(es) in CPV and/or in eCl@ss: 31%
In 31% of the cases no classes are available in both the UNSPSC and GPC systems, but there
are classes in CPV and/or in eCl@ss.

5.3.4.6 Subtotal 6: N21 + N2M mappings


N21 cardinality: no class in UNSPSC matches one equivalent class in the target system:
In CPV: 28%
In eCl@ss: 17%
In GPC: 7%
N2M cardinality: no class in UNSPSC matches more than one equivalent class in the target
system:
In CPV: 0%
In eCl@ss: 3%
In GPC: 2%
N21 + N2M cardinality: no class is available in UNSPSC and there is one or more than one
equivalent class in the target system:
In CPV: 28%
In eCl@ss: 20%
In GPC: 9%
In only 9% of the cases is no class available in the UNSPSC system while there is one or more
than one equivalent class in GPC.

5.4 Summary and Recommendations

The starting point for the mapping of the four product classification systems within this document are the
versions given in subsection 5.1.1 where the basic product classification system is the CPV as stated in
Figure 18.
For the development of the cMap mapping methodology, the investigations of section 4 have been taken into
account. The recommendations drawn from section 4 have been adapted to the general requirements given
by the project context for the product classification mapping and the product classification systems
themselves as stated as mapping challenges in subsection 5.1.3.
It was shown that several different types of mapping relations have to be taken into account during the
mapping process, as required by the different product classification systems.
Taking the results from section 4, a mapping methodology has been derived in subsection 5.2 as a
combination of the canonical ontology mapping methodology and the steps taken from the CC3P mapping
methodology. This methodology should be integrated into the overall methodology for product
classification systems development, which has not been described again here in order to focus on the
mapping process itself.
The mapping of the product classification systems has shown that a combined matcher, i.e. a combination of
element level and structural level matchers as stated in subsection 5.2.1, best fits the needs of product
classification mapping.
For the mapping between the product classification systems, the general requirements have shown that
a centralized architecture for the classification platform is the most appropriate architecture.
To facilitate usage by either machines or human users, the import and export format is Microsoft Excel.
This format best suits the needs of the project for two reasons: no product classification system is given
in a formal representation applicable for deriving similarities between the product classification systems,
and, because of the different scopes of the given product classification systems, hierarchical information
must be taken into account during the mapping process.
To facilitate this mapping process, formal representations of the product classification systems should be
available, enhanced by context information, so that tools can be used to support the mapping process. In
any case, a semi-automated mapping process has to be followed, since currently no tool is able to make the
final decision about the mapping between two concepts or classes of the given product classification
systems in every case.
The user has to interact with the systems to make the final decision. Once this final decision has been made
by the user, formal representations of the product classification systems support the reuse of this decision for
new versions of the product classification systems. In addition, these decisions should be represented
formally in a language, so that they can be adapted as the product classification systems change.
In Table 22 the overall results from the fulfilled mapping process are shown and described.
The results of the mapping process lead to the strong recommendation that the different product
classification systems be represented in a formal way so that a tool-based mapping can be
supported. Promising tools available on the market are Protégé with the enhancements of SKOS and
Prompt. A short overview of these tools is given in annexes A to C.
Some work is ongoing to represent eCl@ss and UNSPSC as OWL based ontologies. But the other
product classification systems must also be supported formally. Only on the basis of a formal description of
the product classification systems is a formalization of mapping rules possible and useful for reuse during
product classification system development.

As long as there is no commitment between the classification authorities to this formal representation, or at
least to representations which can be transformed into each other, the mapping between the product
classification systems will be a time-consuming and mainly manually-driven process, even when using the
cMap mapping methodology.


6. Description of the classification systems

6.1 Introduction

This chapter extends what was already described in the CC3P CWA 16138 Part 4. It starts with some
comments on the different release policies (section 6.2) and maintenance processes (section 6.3),
followed by an analysis of the version compatibility of the four classification systems (section
6.4). These are the crucial issues not only for upgrading to later versions of a classification system, but
also for maintaining a mapping that is based on recent versions. This chapter therefore goes into detail and
describes all available upgrade information delivered by the classification authorities, which is a main
building block for maintaining the mapping.

6.2 Release policy and roadmap

As described in the CC3P CWA 16138 Part 4, the four classification authorities have established different
release policies according to the specific needs of their users and customers, i.e. they have developed
certain rules and principles that define the criteria for releasing new versions, e.g. the frequency, the content
scope, the validity etc. These policies have grown over the years and were established by each one of the
different initiatives for different purposes.
Whereas the CPV is not published on a regular basis, eCl@ss, the UNSPSC and the GPC have
defined roadmaps that include at least one new release per year. Furthermore, eCl@ss distinguishes
between ServicePacks [23], MinorReleases [24] and MajorReleases [25].
A further issue is the different validity policies of the classification authorities. Whereas for the CPV only
the current release is published, which is then obligatory for all users and makes old releases invalid, the
UNSPSC and eCl@ss are published in all their versions without limitations, so that users can decide which
release best suits their needs. When GPC publishes a new release, it takes a couple of months to put it into
GDSN [26] production, i.e. normally there are two GPC releases available: the latest GPC publication
(currently 01 June 2011) and the one called GPC in production in GDSN (currently 01 December 2010).
The GDSN production version of GPC is mandatory according to the GDSN rules.
Therefore, the user requirements for a mapping could range from mapping just the latest release of
each standard to the complex task of covering all versions of these standards mapped with each other.
Aligning these different schedules would be challenging since the needs of the users of the classification
standards are diverse.
The following table compares the policies of the four standards:
Table 23: Comparison of release policies
| | CPV | UNSPSC | GPC | eCl@ss |
| Release frequency | Not defined | Two releases per year | At least one release per year | At least one release per year |
| Release types | No distinction | No distinction | No distinction | Three different types of releases defined, depending on the change types included |

23 eCl@ss ServicePack: contains only textual corrections and translations, therefore being downwards and upwards compatible
24 eCl@ss MinorRelease: contains the content of a ServicePack, add-ons and slight changes that only change an element's version, but
not the concept, therefore being downwards compatible
25 eCl@ss MajorRelease: contains any sort of change including deletions, re-structuring and general changes, therefore being
incompatible, but including detailed release update information
26 GDSN = Global Data Synchronisation Network by GS1 (see www.gs1.org/gdsn)


| | CPV | UNSPSC | GPC | eCl@ss |
| Release validity | Only the current release is valid; after 6 months the old version is invalid, i.e. only the current version is supported and has to be maintained | Any release is supported, published and in use, i.e. the mapping from/to any UNSPSC release will have interested parties and could be supported / maintained | The last two releases are published and in use [27], i.e. the mapping from/to the last GPC releases will have interested parties and could be supported / maintained | Any release is supported, published and in use, i.e. the mapping from/to any eCl@ss release will have interested parties and could be supported / maintained |

As shown, the product classification authorities have different ideas about the frequency of new versions, the
distinction of different types of new versions and whether a version is still published and supported after
releasing a later version.
For the scope of the cMap project, there is no need to go into further details.
For the development of the cMap platform architecture the following decisions have to be taken into account:
- Consider any new release?
- Consider any new release type (only relevant for eCl@ss)?
- Maintain only the latest release of each classification system or maintain a mapping of all versions
  with each other?

Recommendations on these decisions will be discussed later.

6.3 Maintenance process

As described in the CC3P CWA 16138 Part 4, the four classification authorities have established different
maintenance processes according to their specific requirements. For example, change requests for the
UNSPSC can only be created by members of the UNSPSC organization, whereas change requests for eCl@ss
can be submitted by anyone via a cost-free online portal. For the GPC, a GS1-wide process is
established that is used for the maintenance of all GS1 standards. Also, in each case different bodies are
involved.
A short description of the maintenance processes can be found in CWA 16138. Further descriptive details
are not given in the current CWA, as the focus of this project is not the maintenance processes of the
classification systems themselves, but the maintenance of the mapping between them. A deeper analysis of the
recommendations of CWA 16138 is conducted and leads to the definition of what can still be used and
further elaborated in section 6.5. In addition, for the maintenance of the mapping, the version compatibility
of the releases and the release update information delivered by the different classification systems are more
in focus, as they directly influence the maintenance process.

6.4 Version compatibility

In order to find a suitable way to maintain the mapping of the standards, the version compatibility of the
different standards plays a major role. If e.g. a full mapping was established today, this would be invalid
when a new UNSPSC/eCl@ss/GPC/CPV version is published.
In order to get a clearer view on the possibilities a short summary of the main issues of version compatibility
of the four classification standards is provided in the following sections.

6.4.1 CPV
6.4.1.1 General structure
There is no version compatibility policy applied today. From one version to the next version, codes can be
added, transferred, removed and even reused after being deleted. The structure can also be changed.
27 one of them is used by the GDSN user community, however outside the GDSN user communities several versions could be in use


As there is only one version of CPV valid at a time, any reused code should be placed in context to
understand its meaning. As CPV is meant to be used for public procurement notices and procedures, each
use of a code is by default linked to a specific date and time, and thus to a specific version. When moving
from one version to the following one, a transition procedure is set in place to avoid conflicting use of codes.
The CPV does not use any kind of code or identifier for its classification classes other than the
classification code. Therefore, there is no versioning and no change management. Any kind of change can
take place in any release.
The Commission provides release update files that are called correspondence tables. The structure of this
file is rather simple. It includes all codes of the previous release (here: CPV Code 2003) and all codes of the
following, new release (here: CPV Code 2007) including the class information (class code and class name).
There is no code that describes the type of change (e.g. NEW, EDIT, SPLIT, MOVE), although several
types of change occur (see below). Unfortunately, only a few changes are machine-readable, which is a
real disadvantage for the cMap goal of maintaining a mapping with other standards. This is
described in detail below. The correspondence tables (available at
http://ec.europa.eu/internal_market/publicprocurement/rules/current/index_en.htm) document the changes
described in the following sections.

6.4.1.2 New codes


New codes are simply added in the CPV 2007 column at the end of the correspondence table without any
entry in the 2003 code column. They are therefore theoretically machine-readable, as a user simply has to
import the new 2007 codes. The problem is that the user cannot simply filter by empty 2003 lines to find the
new codes, because the introduction of new items of the secondary vocabulary [28] is also marked
by empty 2003 codes (see Table 25), which makes it a bit more complicated. Nevertheless,
after some manual work to identify the valid lines in the correspondence table (a step which is not
documented anywhere), this is machine-readable.
Table 24: New CPV codes in correspondence table
| CPV code 2003 | Description 2003 | CPV code 2007 | Description 2007 |
| | | 09134230-8 | Biodiesel. |
| | | 09134231-5 | Biodiesel (B20). |
| | | 09134232-2 | Biodiesel (B100). |

The new classes Biodiesel, Biodiesel (B20) and Biodiesel (B100) can be found in the correspondence table.
They are added at the end of the table, i.e. the last lines of the CPV's correspondence table are filled with
new classes.
Table 25: Other CPV 2003 empty lines in correspondence table
| CPV code 2003 | Description 2003 | CPV code 2007 | Description 2007 |
| | | 44163000-0 | Pipes and fittings. + AA05-3 Iron |
| | | 44163000-0 | Pipes and fittings. + AA02-4 Aluminium |
| | | 44163000-0 | Pipes and fittings. + AA06-6 Lead |

The information on new items of the secondary vocabulary has to be separated out manually by the user,
who must filter not only on the empty CPV code 2003 lines but also on other, filled lines, which leads to
some filtering problems. This problem could simply be solved by adding a description of the type of change,
for which the following table is an example.

28 The CPV's secondary vocabulary contains additional information to describe classes. It is a mixture of properties and values.


Table 26: Machine-readable documentation of new CPV codes in correspondence table
| Type of change | CPV code 2003 | Description 2003 | CPV code 2007 | Description 2007 |
| NEW | | | 09134230-8 | Biodiesel. |
| NEW | | | 09134231-5 | Biodiesel (B20). |
| NEW | | | 09134232-2 | Biodiesel (B100). |
| ADD VALUE | | | 44163000-0 | Pipes and fittings. + AA05-3 Iron |
| ADD VALUE | | | 44163000-0 | Pipes and fittings. + AA02-4 Aluminium |
| ADD VALUE | | | 44163000-0 | Pipes and fittings. + AA06-6 Lead |
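With a type-of-change column as proposed in Table 26, new codes could be separated from secondary
vocabulary additions without manual work. The following is a minimal sketch, assuming the table were
delivered as semicolon-separated text; the delimiter, the column names and the function name are
hypothetical, as the Commission's correspondence tables do not contain such a column.

```python
import csv
import io

# Rows of Table 26 in an assumed semicolon-separated layout.
TABLE_26 = """\
type_of_change;code_2003;description_2003;code_2007;description_2007
NEW;;;09134230-8;Biodiesel.
NEW;;;09134231-5;Biodiesel (B20).
NEW;;;09134232-2;Biodiesel (B100).
ADD VALUE;;;44163000-0;Pipes and fittings. + AA05-3 Iron
ADD VALUE;;;44163000-0;Pipes and fittings. + AA02-4 Aluminium
ADD VALUE;;;44163000-0;Pipes and fittings. + AA06-6 Lead
"""


def rows_by_change_type(text: str) -> dict[str, list[dict[str, str]]]:
    """Group the rows of a correspondence table by their type of change."""
    grouped: dict[str, list[dict[str, str]]] = {}
    for row in csv.DictReader(io.StringIO(text), delimiter=";"):
        grouped.setdefault(row["type_of_change"], []).append(row)
    return grouped


changes = rows_by_change_type(TABLE_26)

# New classes can now be selected directly, without inspecting empty 2003 cells.
new_codes = [row["code_2007"] for row in changes.get("NEW", [])]
```

Filtering on the change type replaces the manual inspection of empty 2003 lines described above.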

6.4.1.3 Re-coding
Some codes are changed in a 121 relation of predecessor and successor. Table 27 below shows two
examples of the re-coding of a class within the hierarchy. In the first line a textual change is included as well
but is not documented. By comparing the two different names, it is nevertheless machine-readable in theory.
But due to the problems described for the split of classes (see below), the information is not
machine-readable at all.
Table 27: Modified CPV codes in correspondence table
| CPV code 2003 | Description 2003 | CPV code 2007 | Description 2007 |
| 93900000-7 | Miscellaneous services n.e.c.. | 98390000-3 | Other services. |
| 93910000-0 | Decommissioning services. | 98391000-0 | Decommissioning services. |
| 93920000-3 | Relocation services. | 98392000-7 | Relocation services. |
| 93930000-6 | Tailoring services. | 98393000-4 | Tailoring services. |
| 93940000-9 | Upholstering services. | 98394000-1 | Upholstering services. |
| 93950000-2 | Locksmith services. | 98395000-8 | Locksmith services. |

The first line of Table 27 includes two changes: first, the class name is changed and second, the class is
moved within the hierarchy and a new class code is assigned. The two changes are not easy to identify. This
problem could be solved simply by adding a description of the type of change and including one line for each
single change. The following table shows an example.
Table 28: Machine-readable documentation of modified CPV codes in correspondence table
| Type of change | CPV code 2003 | Description 2003 | CPV code 2007 | Description 2007 |
| MOVE | 93900000-7 | Miscellaneous services n.e.c.. | 98390000-3 | Other services. |
| MOVE | 93910000-0 | Decommissioning services. | 98391000-0 | Decommissioning services. |
| MOVE | 93920000-3 | Relocation services. | 98392000-7 | Relocation services. |
| MOVE | 93930000-6 | Tailoring services. | 98393000-4 | Tailoring services. |
| MOVE | 93940000-9 | Upholstering services. | 98394000-1 | Upholstering services. |
| MOVE | 93950000-2 | Locksmith services. | 98395000-8 | Locksmith services. |
| EDIT | 93900000-7 | Miscellaneous services n.e.c.. | 98390000-3 | Other services. |
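
The rule "one line for each single change" can be sketched in a few lines of code. The example below is only an illustration under the assumption that a re-coded line carries a MOVE whenever the code differs and an EDIT whenever the description differs; the function name describe_changes is hypothetical.

```python
def describe_changes(code_old, name_old, code_new, name_new):
    """Return one (type of change, old code, new code) tuple per single change."""
    changes = []
    if code_old != code_new:
        # Assumption: a different code means the class was moved in the hierarchy.
        changes.append(("MOVE", code_old, code_new))
    if name_old != name_new:
        # Assumption: a different description means a textual change.
        changes.append(("EDIT", code_old, code_new))
    return changes


rows = [
    ("93900000-7", "Miscellaneous services n.e.c..", "98390000-3", "Other services."),
    ("93910000-0", "Decommissioning services.", "98391000-0", "Decommissioning services."),
]
for row in rows:
    for change in describe_changes(*row):
        print(change)
```

The first row yields a MOVE line and an EDIT line, the second row only a MOVE line, which corresponds to the layout proposed in the table above.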

In some cases a class was re-coded and renamed, and product properties were added to carry the
information that was removed from the class name. Table 29 shows the addition of the properties "Fresh" and
"Chilled", which previously distinguished the renamed class "Fish, fresh or chilled" etc. Unfortunately this is not
machine-readable because the CPV does not create relationships between the CPV's classes and the
secondary vocabulary that includes properties and values.

Table 29: Addition of product properties to CPV codes in correspondence table
| CPV code 2003 | Description 2003 | CPV code 2007 | Description 2007 |
| 05120000-2 | Fish, fresh or chilled. | 03311000-2 | Fish. + BA04-1 Fresh + BA33-8 Chilled |
| 05121000-9 | Flat fish, fresh or chilled. | 03311100-3 | Flat fish. + BA04-1 Fresh + BA33-8 Chilled |
| 05121100-0 | Sole, fresh or chilled. | 03311110-6 | Sole. + BA04-1 Fresh + BA33-8 Chilled |

In some cases this is done in combination with more than one product property as shown in Table 30.
Table 30: Addition of product properties to CPV codes in correspondence table
| CPV code 2003 | Description 2003 | CPV code 2007 | Description 2007 |
| 25122400-6 | Vulcanised rubber floor coverings and mats. | 44112200-0 | Floor coverings. + AB12-5 Rubber + BA41-2 Vulcanised |
| | | 39532000-0 | Mats. + AB12-5 Rubber + BA41-2 Vulcanised |
| 25122410-9 | Vulcanised rubber floor coverings. | 44112200-0 | Floor coverings. + AB12-5 Rubber + BA41-2 Vulcanised |
| 25122420-2 | Vulcanised rubber mats. | 39532000-0 | Mats. + AB12-5 Rubber + BA41-2 Vulcanised |

6.4.1.4 Split of classes


It is also possible that a previous class is split into several new classes. Unfortunately, as the type of
change is not documented in the CPV correspondence table, this could only be found out by manually reading
the table, i.e. it is not machine-readable. The successor classes are named on several lines, while the line of
the previous class is simply merged in the spreadsheet, i.e. for three lines of successor classes as in the table
below, only one line of the predecessor exists. The split of a class can only be interpreted by the human eye
and only as long as the line order is unchanged, i.e. the user must not sort or filter the correspondence table in any
way, as this could change the meaning. This rather error-prone concept could simply be changed by 1) adding a
type of change code and 2) creating one line for each of the predecessor-successor relations (1:N) (see
Table 32).
Table 31: Split CPV codes in correspondence table
| CPV code 2003 | Description 2003 | CPV code 2007 | Description 2007 |
| 25122200-4 | Vulcanised rubber conveyor or transmission belts or belting. | 34312500-2 | Gaskets. |
| | | 34312600-3 | Rubber conveyor belts. |
| | | 34312700-4 | Rubber transmission belts. |


Table 32: Machine-readable documentation of a split CPV codes in correspondence table
| Type of change | CPV code 2003 | Description 2003 | CPV code 2007 | Description 2007 |
| SPLIT | 25122200-4 | Vulcanised rubber conveyor or transmission belts or belting. | 34312500-2 | Gaskets. |
| SPLIT | 25122200-4 | Vulcanised rubber conveyor or transmission belts or belting. | 34312600-3 | Rubber conveyor belts. |
| SPLIT | 25122200-4 | Vulcanised rubber conveyor or transmission belts or belting. | 34312700-4 | Rubber transmission belts. |

By not documenting the split in a proper machine-readable way, not even the addition of new classes as
explained in 4.3.1.2 can be interpreted by a machine. Both changes are only documented by naming the
new/successor class in the CPV 2007 code column without naming a predecessor or the type of change
(NEW vs. SPLIT).
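
The conversion from merged predecessor cells to one line per predecessor-successor relation can be sketched as a simple fill-down. This is a minimal illustration, not a tool provided with the CPV: the function expand_split and the fallback label MOVE for 1:1 relations are assumptions. It also inherits the weakness described above: an empty predecessor cell is always treated as a merged cell, so lines for genuinely new classes must be removed from the input beforehand.

```python
def expand_split(rows):
    """Repeat the predecessor on every successor line and tag the change type.

    rows: (code_2003, description_2003, code_2007, description_2007) tuples in
    the original line order; an empty code_2003 means "merged with line above".
    """
    expanded = []
    predecessor = (None, None)
    for code_old, name_old, code_new, name_new in rows:
        if code_old:  # a new predecessor starts on this line
            predecessor = (code_old, name_old)
        expanded.append((*predecessor, code_new, name_new))

    # Count the successors of each predecessor to tell 1:N from 1:1 relations.
    counts = {}
    for row in expanded:
        counts[row[0]] = counts.get(row[0], 0) + 1
    return [("SPLIT" if counts[row[0]] > 1 else "MOVE",) + row for row in expanded]


belts = "Vulcanised rubber conveyor or transmission belts or belting."
table_31 = [
    ("25122200-4", belts, "34312500-2", "Gaskets."),
    ("", "", "34312600-3", "Rubber conveyor belts."),
    ("", "", "34312700-4", "Rubber transmission belts."),
]
for line in expand_split(table_31):
    print(line)
```

After this step the lines can be sorted or filtered freely, because each line carries its own predecessor.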

6.4.1.5 Join of classes


It is also possible that more than one class is joined into one new class. Some of the information may
be transferred into product properties. Table 33 shows the join of the three classes "Fish", "Live fish" and
"Fish, fresh or chilled" into one new class "Fish" that can be further described by the properties BA01-2
Live, BA04-1 Fresh and BA33-8 Chilled.
Table 33: Joined CPV classes in correspondence table example one
| CPV code 2003 | Description 2003 | CPV code 2007 | Description 2007 |
| 05100000-6 | Fish. | 03311000-2 | Fish. |
| 05110000-9 | Live fish. | 03311000-2 | Fish. + BA01-2 Live |
| 05120000-2 | Fish, fresh or chilled. | 03311000-2 | Fish. + BA04-1 Fresh + BA33-8 Chilled |

Table 34 shows a similar example in which classes are joined and product properties are added to preserve
the distinctions that were formerly expressed by separate classes.
Unfortunately this kind of documentation is not machine-readable because the CPV does not create
relationships between the CPV's classes and the secondary vocabulary that includes properties and values.
Table 34: Joined CPV classes in correspondence table example two
| CPV code 2003 | Description 2003 | CPV code 2007 | Description 2007 |
| 35211000-6 | Electrically-powered rail locomotives. | 34611000-3 | Locomotives. + CB10-1 Electrically powered |
| 35212000-3 | Diesel-electric locomotives. | 34611000-3 | Locomotives. + CB41-4 Hybrid powered |
| 35213000-0 | Diesel locomotives. | 34611000-3 | Locomotives. + CB09-8 Diesel-powered |

6.4.2 UNSPSC
For all components of the UNSPSC, downward compatibility is always guaranteed. Upward compatibility is
guaranteed for the portions of a release that consist of added classes. For classes that are modified or
deleted, a remapping process is required if a member wishes to upgrade to that version, i.e. users have
to decide which codes they want to use when existing codes are modified or deleted.

The Release Update Files (the audit trail), provided in Excel format with each release, include version
parameters stating when an entry was originally added, when it was last changed and/or when it was deleted.
In addition to the class code, the UNSPSC uses unique 6-digit identifiers. This way, the move of a class
within the hierarchy can be documented without changing the class, as shown below.

6.4.2.1 ADD
Each class that is added to the UNSPSC is represented as a new class (ADD) in the audit trail. This way,
users can easily identify the need to update their product information with new commodity classes that
might be better suited to their products.
Table 35: New UNSPSC codes in audit trail
| Effective_version | change_type | effective_id | effective_code | effective_title | effective_date | effective_definition |
| UNv131201 | ADD | 174585 | 10121507 | Compound feed | 01.12.2010 | Feedstuff that is blended from various raw materials and additives. These blends are formulated according to the specific requirements of the target animal. |
| UNv131201 | ADD | 174586 | 10141505 | Saddle pad | 01.12.2010 | A pad that is placed under the saddle when riding a horse. |
| UNv131201 | ADD | 174587 | 10152004 | Latifoliate tree seedling | 01.12.2010 | A young tree that is grown in a nursery for cultivation of broad leaved trees. |
| UNv131201 | ADD | 174588 | 10152005 | Conifer tree seedling | 01.12.2010 | A young tree that is grown in a nursery for cultivation of acicular trees, e.g. trees with the shape of leaves looking like needles. |

6.4.2.2 EDIT
The textual modification of a class is marked in the audit trail (EDIT), but affects neither the
class code nor the unique 6-digit ID number. The fact that a change has happened is therefore documented
for the user, but the same object in two succeeding releases cannot be told apart by comparing
the ID, as no version number or similar mark exists.
Table 36: Edited UNSPSC codes in audit trail
| Effective_version | change_type | changed_version | changed_id | changed_code | changed_title | effective_id | effective_code | effective_title |
| UNv131201 | EDIT | UNv130601 | 170607 | 10316700 | Fresh cut snap dragons | 170607 | 10316700 | Fresh cut snapdragons |
| UNv131201 | EDIT | UNv130601 | 170608 | 10316701 | Fresh cut bi color snap dragon | 170608 | 10316701 | Fresh cut bi color snapdragon |
| UNv131201 | EDIT | UNv130601 | 170609 | 10316702 | Fresh cut burgundy snap dragon | 170609 | 10316702 | Fresh cut burgundy snapdragon |
| UNv131201 | EDIT | UNv130601 | 170610 | 10316703 | Fresh cut hot pink snap dragon | 170610 | 10316703 | Fresh cut hot pink snapdragon |

6.4.2.3 DELETE
A commodity class can be deleted, i.e. removed from the standard. It does, however, have a successor relation
to another class that shall be used instead. The deletion is marked in the audit trail (DELETE). The availability
of a successor is marked in column map_to; the successor itself is marked in the audit trail with its class
code (effective_code).


Table 37: Deleted UNSPSC codes in audit trail
| Effective_version | change_type | changed_version | changed_id | changed_code | changed_title | map_to | effective_id | effective_code |
| UNv131201 | DELETE | UNv130601 | 168281 | 10215930 | Live new zealand disa orchid | | | 10251800 |
| UNv131201 | DELETE | UNv130601 | 168290 | 10215939 | Live stem cream cymbidium orchid | | | 10252200 |
| UNv131201 | DELETE | UNv130601 | 168292 | 10215941 | Live stem green cymbidium orchid | | | 10252200 |
| UNv131201 | DELETE | UNv130601 | 170475 | 10315930 | Fresh cut new zealand disa orchid | | | 10361800 |
| UNv131201 | DELETE | UNv130601 | 170484 | 10315939 | Fresh cut stem cream cymbidium orchid | | | 10362200 |
| UNv131201 | DELETE | UNv130601 | 170486 | 10315941 | Fresh cut stem green cymbidium orchid | | | 10362200 |
| UNv131201 | DELETE | UNv130601 | 172686 | 10415930 | Dried cut new zealand disa orchid | | | 10451800 |
| UNv131201 | DELETE | UNv130601 | 172695 | 10415939 | Dried cut stem cream cymbidium orchid | | | 10452200 |
| UNv131201 | DELETE | UNv130601 | 172697 | 10415941 | Dried cut stem green cymbidium orchid | | | 10452200 |

6.4.2.4 MOVE
As the UNSPSC has a 6-digit unique ID number for all of its classes in addition to the class code, the
move of an existing class to another spot in the hierarchy is possible without changing the class. The class
code is changed as the class is relocated, but the unique ID does not change. The predecessor-successor relation
is included in the audit trail. This change is therefore documented and traceable by the
user (MOVE). The difference to the DELETE function is that the class is still the same, so no successor is
needed.
Table 38: Moved UNSPSC codes in audit trail
| Effective_version | change_type | changed_version | changed_id | changed_code | changed_title | effective_id | effective_code | effective_title |
| UNv131201 | MOVE | UNv130601 | 168337 | 10216304 | Live bouquet protea | 168337 | 10218101 | Live bouquet protea |
| UNv131201 | MOVE | UNv130601 | 168338 | 10216305 | Live bottle brush protea | 168338 | 10218102 | Live bottle brush protea |
| UNv131201 | MOVE | UNv130601 | 168339 | 10216306 | Live carnival protea | 168339 | 10218103 | Live carnival protea |
| UNv131201 | MOVE | UNv130601 | 168341 | 10216308 | Live cordata foliage protea | 168341 | 10218104 | Live cordata foliage protea |
| UNv131201 | MOVE | UNv130601 | 168348 | 10216315 | Live grandiceps protea | 168348 | 10218105 | Live grandiceps protea |
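
The way a user could apply the audit trail to codes already assigned to products can be sketched as follows. This is only an illustration under stated assumptions: the audit trail is reduced to three columns, the function upgrade_code is hypothetical, and flagging DELETE rows for review reflects the remapping decision mentioned above rather than any rule published by the UNSPSC.

```python
# Reduced audit trail: (change_type, changed_code, effective_code).
# Values are taken from the ADD, EDIT, DELETE and MOVE tables above.
AUDIT_TRAIL = [
    ("ADD", None, "10121507"),
    ("EDIT", "10316700", "10316700"),
    ("DELETE", "10215930", "10251800"),
    ("MOVE", "10216304", "10218101"),
]


def upgrade_code(code, audit_trail):
    """Return (code in the new release, needs_review) for a previously used code."""
    for change_type, changed_code, effective_code in audit_trail:
        if changed_code != code:
            continue
        if change_type == "MOVE":
            # Same class at a new position: the new code can be taken over directly.
            return effective_code, False
        if change_type == "DELETE":
            # A different class replaces the old one: the user should confirm it.
            return effective_code, True
    # Untouched or only textually edited: the code stays valid.
    return code, False


for old in ("10216304", "10215930", "10316700"):
    print(old, "->", upgrade_code(old, AUDIT_TRAIL))
```

ADD rows never match an existing product code in this sketch; they only signal that a better suited class may now exist.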

6.4.3 GPC
Release Update Files (Delta reports) between two consecutive releases are available for all updates.
Companies can use several versions; however, to achieve master data synchronisation, GDSN users
should in practice migrate to the GDSN version once or twice a year. Codes that were once used are
marked in the database as unavailable for future releases; this ensures that a code cannot be reused.
The GPC does not make use of an additional unique identifier apart from the classification code. But as a code
cannot be used again after it was marked as unavailable, it in fact serves as a unique identifier. Version
changes are not visible in the code 29.
The delta report displays all changes of all elements (classes, attributes, values) in their context. Attributes
and values are displayed in the context of the class that they are assigned to. Both the element and the text
field that was changed are named (see Table 39 below). Only the class changes are relevant for this
project.

29 See also: GPC_Development_Implementation.pdf


6.4.3.1 Addition
New elements, be they classes, attributes or values, are simply marked with an "A" for ADDITION.
Table 39: Example of additions as documented in GPC delta report

Table 40: Example of added classes as documented in GPC delta report

6.4.3.2 Minor Modification


A textual change like the correction of a spelling mistake does not imply a change of the code. This is
documented in the delta report without displaying what has changed, nor changing the code. The change is
marked as a modification with the code M.

Table 41: Example of modified classes as documented in GPC delta report

6.4.3.3 Deletion
A brick can be deleted without a successor relation. The code will not be used again and is marked as being
unavailable for future releases. In case of the deletion of a class on the 4th level (brick), all its attributes and
values are deleted as well, as shown in the example. The deletion is marked with the code "D".

Table 42: Example of deleted classes as documented in GPC delta report

6.4.3.4 Move
In the GPC, the move of a class within the hierarchy is possible and documented in a very detailed way. The
change codes explained above are extended by an "M" for move. A brick that has been moved (the
predecessor) is marked with "DM", which stands for DELETE/MOVE, i.e. the brick has been deleted in the
hierarchy from its previous position. The brick's new position in the hierarchy is marked with "AM", which
stands for ADD/MOVE, i.e. the brick has changed its place in the hierarchy, but no other change occurred. If,
in addition to the move, the brick itself was modified, it is marked as "AMM"
(ADD/MOVE/MODIFY). The following examples illustrate this.

73000000 Household Kitchen Merchandise
  73040000 Household Kitchen Merchandise
    73040100 Household Kitchen Storage
      10001761 Refuse bags

Figure 43: Refuse bags in GPC version 01122010


47000000 Cleaning/Hygiene Products
  47210000 Waste Management Products
    47210100 Waste Storage Products
      10001761 Refuse bags

Figure 44: Refuse bags in GPC version 01062011


Between GPC version 01122010 (Figure 43) and version 01062011 (Figure 44), refuse bags have changed
segment, family and class while keeping the same brick code. They are marked in segment 73000000 as DM and in
segment 47000000 as AM and additionally as AMM, as the definition was slightly modified.

73040100 Household Kitchen Storage
  10002125 Household Refuse Bins (Indoor)

Figure 45: Household Refuse bins (indoor) in GPC version 01122010


47210100 Waste Storage Products
  10002125 Refuse / Waste Bins

Figure 46: Household Refuse bins (indoor) renamed as refuse / waste bins in GPC version 01062011
In this example, "Household Refuse Bins (Indoor)" were also moved from 73040100 "Household Kitchen
Storage" (see Figure 45) to 47210100 "Waste Storage Products" (Figure 46) and renamed to "Refuse / Waste
Bins". The brick code stays the same and the changes are marked as DM in 73040100, AM in 47210100 and
additionally as AMM in 47210100, as the name has changed.
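
Because the brick code stays the same, the DM and AM/AMM entries of a moved brick can be paired by that code. The sketch below assumes a strongly simplified delta report with three fields per entry (change code, brick code, parent class code); this layout and the function collect_moves are illustrative assumptions and do not reproduce the actual file format of the GPC delta report.

```python
# Simplified delta entries: (change code, brick code, parent class code).
# Values follow the two examples above (refuse bags and refuse bins).
DELTA = [
    ("DM", "10001761", "73040100"),
    ("AM", "10001761", "47210100"),
    ("AMM", "10001761", "47210100"),
    ("DM", "10002125", "73040100"),
    ("AM", "10002125", "47210100"),
    ("AMM", "10002125", "47210100"),
]


def collect_moves(delta):
    """Pair DM and AM/AMM entries by brick code into one move record per brick."""
    moves = {}
    for change, brick, parent in delta:
        if change not in ("DM", "AM", "AMM"):
            continue  # plain A, M and D entries are not moves
        entry = moves.setdefault(brick, {"from": None, "to": None, "modified": False})
        if change == "DM":
            entry["from"] = parent      # predecessor position
        else:
            entry["to"] = parent        # successor position
            if change == "AMM":
                entry["modified"] = True
    return moves


for brick, move in collect_moves(DELTA).items():
    print(brick, move)
```

Each brick ends up with one record that names the old class, the new class and whether the brick was also modified.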

6.4.3.5 Major Modification


A major modification according to the GPC delta report is a change that requires human interaction to check
what class to choose for a certain product. An example could be that a Brick is redefined warranting the Brick
to be split.
If a level is split there are two options that could apply. The option selected depends on the severity and
impact of the change.
A more severe example would be new Bricks being built up from existing Bricks or parts of Bricks and would
require the reclassification of all of the products classified with the source Bricks. The impact for the mapping
is the same as for a user that classifies his/her products: the successor class has to be found manually. In
this example the rule applied would be, new levels are added as required, all products in the source level are
reclassified and moved leaving the source level empty, then the source level is deleted. Added and deleted
levels would follow the Addition and Deletion rules as stated above.

6.4.3.6 Summary
The following changes are documented in the delta report and were explained in the above chapter.
Table 47: GPC changes as documented in the delta report
| Change code | Change | Valid for element | Description |
| A | ADD | Brick, Attribute, Value | A new segment, family, class, brick, attribute or value was added into the hierarchy. |
| M | MODIFY | Brick, Attribute, Value | A segment, family, class, brick, attribute or value was changed (e.g. the title or definition). |
| D | DELETE | Attribute, Value | A segment, family, class, brick, attribute or value was deleted, the attribute code or attribute value code will not be used again. |
| AM | ADD MOVE | Brick | A brick has changed its place in the hierarchy, but no other change occurred. The successor position is marked. |
| AMM | ADD MOVE MODIFY | Brick | A brick has changed its place in the hierarchy and has been modified. This is documented additionally to AM. |
| DM | DELETE MOVE | Brick | A brick has been deleted in the hierarchy from its previous position. The predecessor position is marked. |

6.4.4 eCl@ss
Within the scope of the cMap project, eCl@ss is the only classification system that distinguishes between
different types of releases. The distinction criterion is the type of compatibility:
- eCl@ss ServicePacks are backward and upward compatible as only textual changes are allowed.
  ServicePacks are used for translations and other textual corrections.
- eCl@ss MinorReleases are compatible within the same MajorRelease cycle, i.e. all 6.x or 7.x
  versions are backward compatible as only new or changed elements are included. Changes can only
  be textual modifications that do not change the concept of an element.
- eCl@ss MajorReleases are not backward compatible, as changes like the restructuring of the class
  hierarchy and the deletion of properties and values are allowed. In some classes, a successor is
  obligatory, but in case of correction there is sometimes no successor at all.
These changes are documented with the help of two things:
- a combination of a unique identifier and a version number of an object;
- Release Update Files (called mapping tables)30 that have been published separately with every
  eCl@ss MajorRelease starting with eCl@ss version 4.0.
eCl@ss uses the class code to identify the location in the hierarchy, but every class (every object in general)
has a unique identification scheme called IRDI (International Registration Data Identifier) 31, which is based on
ISO standards 32. This IRDI includes both an object identifier and a version number. The version number
documents slight changes that do not change the concept of an object, only slightly its meaning,
e.g. if a class is moved in the hierarchy or a definition is added, the version number is changed, but the object keeps
its identifier. If an object's concept is changed, a new object with a unique identifier substitutes the old one.
This way a classification class can be moved in the hierarchy, changing the class code and the version
number, but not the object identifier. New classes are simply added; this is not documented separately.
The Release Update Files document changes in a MajorRelease. The following changes are documented in
the Release Update Files:
Table 48: eCl@ss changes as documented in the Release Update Files
| Change code | Valid for element | Description |
| NEW | All | New element in TargetRelease |
| VERSION NUMBER | All | The element was changed without changing the concept (e.g. textual correction). Identifiers do not change, only the Version Number is raised. |

30 Starting with eCl@ss MajorRelease 7.0, additionally to the Release Update Files (including predecessor-successor-relation of
classes), eCl@ss now publishes Transaction Update Files that enable the user to update automatically the evaluation, i.e. product
description of their products. For the cMap project only the Release Update Files are relevant.
31 Source: http://wiki.eclass.de/wiki/IRDI
32 ISO/IEC 11179, ISO 29002, ISO/IEC 6523


| Change code | Valid for element | Description |
| CLOSED | All | The element from the SourceRelease was removed in the TargetRelease |
| REPLACE | Property | The element was replaced by another identical element (a compatible replacement according to ISO Change Management) |
| SUBSTITUTE | Property | The element was substituted by another similar element (an incompatible substitution according to ISO Change Management) |
| MOVE | Class | The class was moved in the hierarchy (CodedName changed), this type of change applies only to 4th level |
| SPLIT | Class | The class was split into several other classes and deprecated, this type of change applies only to 4th level |
| JOIN | Class | The class was joined into another class and deprecated, this type of change applies only to 4th level |

eCl@ss delivers nine files in total that include release update information (shown in Table 49). The first and
the third are relevant for the class mapping as they document the class changes in a MajorRelease.
Table 49: eCl@ss deliverables as documented in the Release Update Files (RUF)
| Name of file | Type of release update information |
| eClass-RUF-TU-CC_6_x_to_7_x.csv | Transaction Upgrade Classes (Class-Update-Table) |
| eClass-RUF-TU-PR_6_x_to_7_x.csv | Transaction Upgrade Properties (Property-Update-Table) |
| eClass-RUF-CC_6_x_to_7_x_EN.csv | Table of Classification Classes |
| eClass-RUF-KWSY_6_x_to_7_x_EN.csv | Table of Keywords / Synonyms |
| eClass-RUF-PR_6_x_to_7_x_EN.csv | Table of Properties |
| eClass-RUF-VA_6_x_to_7_x_EN.csv | Table of Values |
| eClass-RUF-CC_PR_6_x_to_7_x.csv | Relations Classes-Properties |
| eClass-RUF-PR_VA_6_x_to_7_x.csv | Relations Properties-Values |
| eClass-RUF-UN_6_x_to_7_x_en_US.csv | Table of units |

In the RUF-Table of Classification Classes all new classes in the target release (NEW) and all changes of
existing classes are listed, whether they are closed, changed (VERSION NUMBER) or restructured
(MOVE, SPLIT, JOIN).
Additionally, the predecessor-successor-relation of the restructuring changes MOVE, SPLIT, JOIN is listed in
the Transaction Upgrade File of Classes (Class-Update-Table).

6.4.4.1 NEW
New classes are marked with the change code NEW in the Table of Classification Classes of the RUF as
follows. Their preferred names are not included in the Release Update Files, as these files are an additional
product to, and only applicable together with, the eCl@ss standard.


Table 50: New eCl@ss classes as documented in the mapping tables
| Command | IrdiCC | VersionDate | CodedName | Level | PreferredName | SourceRelease | TargetRelease |
| NEW | 0173-1#01-ABS666#001 | 11.02.2011 | 19059205 | 4 | content is available in target release | eCl@ss6.2.1 | eCl@ss7.0 |
| NEW | 0173-1#01-ABS733#001 | 11.02.2011 | 21179004 | 4 | content is available in target release | eCl@ss6.2.1 | eCl@ss7.0 |
| NEW | 0173-1#01-ABS736#001 | 11.02.2011 | 27010603 | 4 | content is available in target release | eCl@ss6.2.1 | eCl@ss7.0 |
| NEW | 0173-1#01-ABS739#001 | 11.02.2011 | 23039201 | 4 | content is available in target release | eCl@ss6.2.1 | eCl@ss7.0 |

6.4.4.2 VERSION NUMBER


Modifications like textual corrections that do not change the concept of a class are marked with the help of
the item's version number and documented as follows:
Table 51: Modified eCl@ss classes as documented in the mapping tables

IrdiCC
0173-1#01ABR658#003
0173-1#01ABR485#002
0173-1#01ABR918#003
0173-1#01ABS068#003
0173-1#01ABS203#003

Version
Date

Coded
Name

ISO
Country
Code
ISO
Language
Code

Command
VERSION
NUMBER
VERSION
NUMBER
VERSION
NUMBER
VERSION
NUMBER
VERSION
NUMBER

Level

11.02.2011 23380201

PreferredName

Source
Release

Target
Release

en

US

eCl@ss6.2.1

eCl@ss7.0

en

US

eCl@ss6.2.1

eCl@ss7.0

11.02.2011 23389205

4 Impeller
Rotor and flowguide
3 component (accessories)
Hub cap (pinion shaft,
4 accessories)

en

US

eCl@ss6.2.1

eCl@ss7.0

11.02.2011 23389204

4 Cover band (accessories)

en

US

eCl@ss6.2.1

eCl@ss7.0

11.02.2011 23389203

4 Blade piece (accessories)

en

US

eCl@ss6.2.1

eCl@ss7.0

11.02.2011 23389200

6.4.4.3 CLOSED
In eCl@ss, classes can be closed in a MajorRelease, i.e. they are still part of older releases, but are no
longer part of the target release. Elements that are no longer part of a new eCl@ss release are marked as
DEPRECATED=TRUE, see Table 52 below. 4th level classes always need a successor, which is
additionally listed in the Class Update Table (Table 53).
Table 52: Closed eCl@ss classes as documented in the mapping tables
| Command | IrdiCC | VersionDate | CodedName | Level | PreferredName | ISO Language Code | ISO Country Code | Deprecated | SourceRelease | TargetRelease |
| CLOSED | 0173-1#01-AAA359#009 | 11.02.2011 | 21040602 | 4 | Knife | en | US | TRUE | eCl@ss6.2.1 | eCl@ss7.0 |
| CLOSED | 0173-1#01-AAA361#009 | 11.02.2011 | 21040604 | 4 | Trowel | en | US | TRUE | eCl@ss6.2.1 | eCl@ss7.0 |
| CLOSED | 0173-1#01-AAA380#008 | 11.02.2011 | 21041005 | 4 | Fork (tool) | en | US | TRUE | eCl@ss6.2.1 | eCl@ss7.0 |
| CLOSED | 0173-1#01-AAB007#007 | 11.02.2011 | 23110603 | 4 | Ring bolt | en | US | TRUE | eCl@ss6.2.1 | eCl@ss7.0 |
| CLOSED | 0173-1#01-ACH692#005 | 11.02.2011 | 22390602 | 4 | Solar collector | en | US | TRUE | eCl@ss6.2.1 | eCl@ss7.0 |
| CLOSED | 0173-1#01-AAA377#008 | 11.02.2011 | 21041002 | 4 | Hoe | en | US | TRUE | eCl@ss6.2.1 | eCl@ss7.0 |


Table 53: Successor-relation of closed eCl@ss classes as documented in the mapping tables
| Command | IrdiSourceRelease | CodedName SourceRelease | IrdiTargetRelease | CodedName TargetRelease | SourceRelease | TargetRelease |
| SPLIT | 0173-1#01-AAA359#008 | 21040602 | 0173-1#01-ADS512#001 | 21044801 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AAA359#008 | 21040602 | 0173-1#01-ADS513#001 | 21044802 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AAA359#008 | 21040602 | 0173-1#01-ADS514#001 | 21044803 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AAA359#008 | 21040602 | 0173-1#01-ADS515#001 | 21044804 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AAA361#008 | 21040604 | 0173-1#01-ADS545#001 | 21044901 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AAA361#008 | 21040604 | 0173-1#01-ADS546#001 | 21044902 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AAA361#008 | 21040604 | 0173-1#01-ADS547#001 | 21044903 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AAA361#008 | 21040604 | 0173-1#01-ADS548#001 | 21044904 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AAA361#008 | 21040604 | 0173-1#01-ADS549#001 | 21044905 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AAA361#008 | 21040604 | 0173-1#01-ADS550#001 | 21044906 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AAA361#008 | 21040604 | 0173-1#01-ADS551#001 | 21044907 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AAA361#008 | 21040604 | 0173-1#01-ADS552#001 | 21044908 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AAA361#008 | 21040604 | 0173-1#01-ADS553#001 | 21044909 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AAA361#008 | 21040604 | 0173-1#01-ADS554#001 | 21044910 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AAA361#008 | 21040604 | 0173-1#01-ADS555#001 | 21044911 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AAA361#008 | 21040604 | 0173-1#01-ADS556#001 | 21044912 | eCl@ss6.2.1 | eCl@ss7.0 |

6.4.4.4 REPLACE
The REPLACE function is only valid for properties. An eCl@ss property can be replaced by an identical one
(compatible substitution). REPLACE is therefore irrelevant in the cMap project and will not be considered
here.

6.4.4.5 SUBSTITUTE
The SUBSTITUTE function is only valid for properties. An eCl@ss property can be substituted by a similar
one (incompatible substitution). SUBSTITUTE is therefore irrelevant in the cMap project and will not be
considered here.

6.4.4.6 MOVE
As in all the classification systems mentioned, classes can be moved from one position in the hierarchy to
another. The change is documented with the help of the identifier's version number (e.g. #006 to #007): the
class code changes, while the identifier stays the same.
Table 54: Example of moved eCl@ss classes as documented in the mapping tables
| Command | IrdiSourceRelease | CodedName SourceRelease | IrdiTargetRelease | CodedName TargetRelease | SourceRelease | TargetRelease |
| MOVE | 0173-1#01-AKH756#006 | 21170314 | 0173-1#01-AKH756#007 | 21170535 | eCl@ss6.2.1 | eCl@ss7.0 |
| MOVE | 0173-1#01-AKH764#007 | 21170323 | 0173-1#01-AKH764#008 | 21170536 | eCl@ss6.2.1 | eCl@ss7.0 |
| MOVE | 0173-1#01-AKG104#008 | 25020805 | 0173-1#01-AKG104#009 | 25021104 | eCl@ss6.2.1 | eCl@ss7.0 |
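
The separation of object identifier and version number inside the IRDI can be used to recognise a move mechanically. The following sketch assumes that an IRDI as printed in the tables consists of three parts separated by "#", the last one being the version number; the functions parse_irdi and classify and the labels "NEW OBJECT" and "UNCHANGED" are hypothetical, only MOVE and VERSION NUMBER are eCl@ss change codes.

```python
def parse_irdi(irdi):
    """Split an IRDI such as 0173-1#01-AKH756#006 into (object identifier, version)."""
    authority, item, version = irdi.split("#")
    return (authority, item), int(version)


def classify(irdi_source, code_source, irdi_target, code_target):
    """Classify the relation between a source and a target class entry."""
    id_source, version_source = parse_irdi(irdi_source)
    id_target, version_target = parse_irdi(irdi_target)
    if id_source != id_target:
        return "NEW OBJECT"      # different identifier: a different object
    if code_source != code_target:
        return "MOVE"            # same object, new position in the hierarchy
    if version_target > version_source:
        return "VERSION NUMBER"  # same object and position, version raised
    return "UNCHANGED"


print(classify("0173-1#01-AKH756#006", "21170314", "0173-1#01-AKH756#007", "21170535"))
```

For the first line of Table 54 the identifier 0173-1#01-AKH756 is unchanged while the coded name differs, so the sketch reports a move.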


6.4.4.7 SPLIT
An eCl@ss class can be split into more than one other class, i.e. one predecessor has more than one
successor. This is documented by listing each successor relation on a line of its own. This way, a user may
directly choose the right successor for his/her product, usually going from more general to more specific.
Table 55: Example of split eCl@ss classes as documented in the mapping tables
| Command | IrdiSourceRelease | CodedName SourceRelease | IrdiTargetRelease | CodedName TargetRelease | SourceRelease | TargetRelease |
| SPLIT | 0173-1#01-AKJ667#005 | 17019890 | 0173-1#01-ADV551#001 | 15320101 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AKJ667#005 | 17019890 | 0173-1#01-ADV552#001 | 15320102 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AKJ708#005 | 17029890 | 0173-1#01-ADV556#001 | 15320201 | eCl@ss6.2.1 | eCl@ss7.0 |
| SPLIT | 0173-1#01-AKJ708#005 | 17029890 | 0173-1#01-ADV557#001 | 15320202 | eCl@ss6.2.1 | eCl@ss7.0 |

6.4.4.8 JOIN
Several eCl@ss classes can be joined into one class, i.e. several predecessors have the same successor.
This is documented by listing all successor-relations in one line each. This way, a user can automatically
choose the right successor for his/her product, usually going from more specific to more general.
Table 56: Example of joined eCl@ss classes as documented in the mapping tables
| Command | IrdiSourceRelease | CodedName SourceRelease | IrdiTargetRelease | CodedName TargetRelease | SourceRelease | TargetRelease |
| JOIN | 0173-1#01-AKG133#006 | 25041402 | 0173-1#01-BAC097#005 | 25209090 | eCl@ss6.2.1 | eCl@ss7.0 |
| JOIN | 0173-1#01-BAC008#004 | 25041490 | 0173-1#01-ADT413#001 | 25299090 | eCl@ss6.2.1 | eCl@ss7.0 |
| JOIN | 0173-1#01-ACF464#004 | 25049890 | 0173-1#01-ADT413#001 | 25299090 | eCl@ss6.2.1 | eCl@ss7.0 |
| JOIN | 0173-1#01-ACF466#004 | 25049990 | 0173-1#01-ADT413#001 | 25299090 | eCl@ss6.2.1 | eCl@ss7.0 |

6.5 Summary

As shown in the previous sections, all classification authorities publish additional upgrade information when
publishing a new release. The following table lists the availability of upgrade information and the kind of
changes that are possible in each classification system.
Table 57: Possible Changes and compatibility: UNSPSC, eCl@ss, GPC, CPV
| Classification system | Release Update Files available | Supports automatic update | Codes can be re-used | Unique identifier used | Possible changes |
| UNSPSC | Yes | Yes | No | Yes | Any |
| eCl@ss | Yes | Yes | Yes (see footnote 33) | Yes | Any |
| GPC | Yes | Yes | No | No, but not reused | Any |
| CPV | Yes | Restricted | Yes | No | Any |

The following section summarizes and compares what was shown above and lists recommendations on how
to further proceed.

33 eCl@ss allows the reuse of classification codes, but uses additional class identifiers that include a version and revision number. This
way, a class with the same classification code (coded name) cannot be confused with the old one as the identifiers differ.


6.5.1 Differences and Similarities


The common approach of the classification authorities is to publish additional release update files
with each new release. The classification systems differ in how far a machine can read this information,
which determines how easily an upgrade to a new release can be made. Some cases can never be handled
fully automatically, such as the SPLIT of a class into several new classes, because a human has to decide
which new class is relevant for the user's products or for the mapping. Machine-readable upgrade information
is nevertheless very important, as it lets the user upgrade in a semi-automatic way: in the case of a SPLIT,
for example, it conveys that a certain predecessor class now has two or more identified successor classes
and lets the user choose the most relevant one.
The CPV release update file is less suitable to do so, as shown above, because the kind of change is not
well documented. This is a problem for the maintenance of mapping tables, too.
The differences of documentation could be implemented in the cMap platform by using different import
procedures for the files. But there should rather be a comparable or even aligned terminology of change
types and how they shall be documented so that they are all readable by the cMap platform.
Some changes are more relevant for the specific user group of one of the standards than for the mappings
as defined in cMap.
Rule-compliant minor changes for cMap are modifications such as textual changes and changes of the name and/or
the definition, because they do not change the mapping and the concepts described by the classes
remain comparable. The rule-compliant minor changes that can be imported automatically are:
Textual modifications.
Further, there are changes that are less relevant for the mapping itself, but could assist as they further
describe a class, namely:
Additions of keywords;
Additions of properties.
The relevant changes for the cMap mapping are:
New classes;
Rule-compliant major structural changes, i.e. changes of class codes when class codes are:
o deleted
o moved
o split/joined
Replacements/substitutions.
Some of these changes can be updated in the mapping tables automatically if they are well documented. If,
e.g. a class is simply moved in the hierarchy, the mapping could be updated automatically as the class is still
the same and identifiable. If several classes are joined in one class then the mapping can be updated
automatically to this new class, too, as they all have the same successor class. Otherwise the change
management rules of the classification authority would no longer be valid. As all classification authorities
except CPV are industry-driven, their distinct user communities have an interest that makes them comply
with certain change management rules.
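
The distinction drawn above between changes that can be applied automatically and changes that need a human decision can be illustrated with a minimal Python sketch. It assumes that the release update file has already been parsed into simple records; the record type `UpdateRecord`, the function `upgrade_mapping` and the partner codes `X1` to `X4` are hypothetical and only serve the illustration, while the class codes are taken from Tables 55 and 56.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateRecord:
    """One line of a release update file (hypothetical, simplified layout)."""
    command: str      # e.g. "JOIN" or "SPLIT", as in Tables 55 and 56
    source_code: str  # class code in the old release
    target_code: str  # class code in the new release


def upgrade_mapping(mapping, records):
    """Upgrade a mapping {old class code: partner code} to the new release.

    Returns (upgraded, needs_review):
    - upgraded maps each new class code to the set of partner codes it inherits
    - needs_review maps each old class code with several successors (SPLIT)
      to the list of candidate successors a human has to choose from
    """
    successors = {}
    for rec in records:
        successors.setdefault(rec.source_code, []).append(rec.target_code)

    upgraded = {}
    needs_review = {}
    for old_code, partner in mapping.items():
        # Classes without an update record are assumed to be unchanged.
        targets = successors.get(old_code, [old_code])
        if len(targets) == 1:
            # Unchanged, moved or joined: exactly one successor, update automatically.
            upgraded.setdefault(targets[0], set()).add(partner)
        else:
            # Split: several successors, the choice is left to the user.
            needs_review[old_code] = targets
    return upgraded, needs_review


records = [
    UpdateRecord("JOIN", "25041490", "25299090"),
    UpdateRecord("JOIN", "25049890", "25299090"),
    UpdateRecord("SPLIT", "17019890", "15320101"),
    UpdateRecord("SPLIT", "17019890", "15320102"),
]
# Partner codes X1..X4 are made-up placeholders for codes of another system.
mapping = {"25041490": "X1", "25049890": "X2", "17019890": "X3", "99999999": "X4"}
upgraded, needs_review = upgrade_mapping(mapping, records)
```

In this sketch the two joined classes end up under their common successor together with both partner codes, whereas the split class is only reported with its candidate successors and is not changed automatically.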

6.5.2 Identified problems for the common maintenance of the mapping


As documented above, several problems hinder an easy maintenance of the mappings.
First, the release policies differ widely. They range from two planned releases per year at fixed publication
dates (e.g. UNSPSC) to no release roadmap at all (CPV). Therefore, a synchronisation between the
classification systems while each standard is still in its development phase would hardly be
possible.


Second, the validity of the classification systems is very diverse. Whereas only the current release of the
CPV is valid due to its legal policy (with a transition period of six months), all of the versions of the UNSPSC
and eCl@ss are valid, published and used by their customers. In GPC the GDSN production version is used
for data synchronisation which is not necessarily the very latest version.
The following thoughts lead to the conclusion that only the latest release of each standard should be taken
into consideration as the basis for the mapping updates:
Users of any classification system classify their product according to their own scheme and would
like to find a corresponding code in another classification system.
Each class once created in a classification system can have a successor class code, i.e. the user of
an older version should be able to find an appropriate new class code for his/her product in the latest
release of a particular classification system.
Therefore, a mapping needs only be maintained for the latest release of any standard.
Therefore, only the latest releases have to be supported and displayed.
This would make the platform a lot simpler and easier to maintain.
Apart from that the identified problems lead to the conclusion that the release update information has to be
comparable and machine-readable so that updates can be imported a lot more easily.


6.5.3 Recommendations
As shown above, the goal cannot be to synchronize the maintenance processes of individual classification
authorities, as they will all stay self-governed due to the diversity of objectives, use cases, processes,
timelines and applications. The goal is simply to maintain the mapping between them (see chapter 8), i.e. to
lay the foundation for interoperability between the standards and thereby help users of the various
classification systems to reduce processing costs.
To achieve this, the quality of the classification systems is a crucial success factor. The better the quality
of the classification systems, the easier the mapping between them can be created and maintained.
Therefore, the following recommendations are addressed to the classification authorities to help improve the
quality of their classification systems.

6.5.3.1 Recommendations on quality management


CWA 16138 already listed recommendations on the quality management of classification systems. They are
briefly summarized below:
Use internationally standardized data model
Use unique and internationally standardized identification scheme
Use internationally standardized maintenance process
Define, document and publish classificatory principles
Define, document and publish naming principles and business rules
Deliver a definition for each class
Add keywords to classes
Add images to classes
Separate application area from class description
Add information in brackets for translational purposes
Use translation process to improve language versions
If necessary, clearly distinguish between classes, properties and values
Use a common library of terms and definitions
Follow the rule of unique placement
Rely on existing international standards
Use a consistent class hierarchy
Do not make use of special characters that are interpreted by machines as control characters
More rules and a guideline on how to create a classification system were recently delivered in ISO/DIS
22274:2011. In general, the better the quality of all classification systems, the higher their
comparability, i.e. their interoperability and the possibility to map them.

6.5.3.2 Recommendations on maintenance process of classification systems


CWA 16138 recommended that all classification authorities adopt a standardized maintenance
process, e.g. the ePDC process as described in CWA 15295:2005. The objective would be to align the
maintenance processes and make them comparable and operable for interaction between the classification
authorities. This should be the basis for implementing synchronization or alignment points to enable one
standardization authority to inform the others about their changes and enable them to discuss the propose
[sic!] changes and take it over into their own classification system or at least update the mapping with the
changes (CWA 16138, p.108).
This would imply that new content in one classification system would be checked by another classification
authority, which is surely not what the classification authorities plan to do, as it would involve much higher
costs and much more time until they can finally publish their new release. In addition, the classification
systems have different objectives and structures, so that e.g. an eCl@ss body cannot check UNSPSC classes,
as they would not comply with eCl@ss rules. Therefore, there is no need to let the classification authorities
check the content of other classification systems before it is even published.
Also, the approach of CWA15556-3:2006 (E), p. 12 to ensure a harmonized development of the different
classifications for the result of comparable development of product classifications in different fields by
different organizations but according to agreed compatibility rules cannot be followed from today's
perspective any more. As described on page 14 a joint working committee could only be found for two out of
three domains chosen by the ePDC project. The problems for only these three chosen domains are
described very well. The number of problems would surely be multiplied if enhanced to all relevant domains.
In fact, each of the four classification systems named here is driven by different users for different
purposes. Therefore, their distinct maintenance is not comparable. Plus, as the cMap methodology is meant
as a basis to facilitate the integration of other product classification systems, this would produce an even
higher organizational effort. At this point, the whole operation would be endangered by an organization that
could not be managed any longer. Even within one single classification organization, the maintenance is
hard organizational work and the final result can only be seen on the day of publication, but not at some
defined synchronization point before even publishing.
From today's point of view, each standard will stay self-governed, and the organizations already have a lot to
do to get their change requests processed through the whole internal workflow.
Therefore, to compare the maintenance processes again in order to establish synchronization points as
defined above is not a fruitful task. The major goal is not a synchronization of the maintenance of different
classification systems but rather the maintained interoperability between releases published by self-governed
authorities.
The mapping itself cannot be a deliverable of the classification authorities themselves, but mainly of the
users of the standards. In fact, most companies use various standards in their IT systems, mostly in addition
to their company-internal classification system. Therefore, they are used to mapping different classification
systems and can already deliver useful input. Moreover, the specialists work in the companies that drive the
classification systems themselves. A mapping authority could only do the administrative work and
double-check the quality, but not do the whole mapping. The mapping delivered as a main output of the cMap
project is done by two experts but can only be meant as a solid basis for further changes that will be
requested by users. It is this process that shall be defined here, rather than a synchronized process between
the classification authorities that might not lead to a maintained mapping. This will be elaborated in the
following chapters 7 and 8.


7. Definition of the architecture for an open standardized classification collaboration platform

7.1 Introduction

The scope of this chapter is to describe the technical level of the cMap platform. The strategic level including
process descriptions will follow in chapter 8. To start with, the building blocks of the cMap architecture like
defined user roles, business objects, use cases, a requirement analysis and thoughts on data quality will be
documented.
To be able to define an appropriate cMap architecture, sections 7.2 and 7.3 describe the possible business
use cases of the mapping results and the interested actors. Based on this, section 7.4 defines the necessary
roles of the cMap platform, kept simple and basic. Section 7.5 describes the business objects of the cMap
platform, followed by the use cases of the platform (section 7.6), the requirements of the architecture
(section 7.7) and some thoughts on data quality (section 7.8).

7.2 Business use cases

Wherever electronic product data is exchanged, classification systems play a major role. On the one hand,
companies might have developed their own internal classification systems in order to get an overview on
similar products or use an existing classification system. On the other hand, when exchanging product data
with business partners, they have to be sure to talk about the same product classes with their partners who
themselves might use a different classification system or another classification of their own. Companies
already have the need to map the classification system they use internally to the one they use to exchange
data.
The usage of mapped classification systems is very diverse, but can be illustrated in a rather simple way.
The following figure sums up the problem of using different classification systems.

Figure 58: Business use case of mapping user: the problem of using different classification systems
The solution for business partner 2 is to provide their product data classified according to the classification
system used by business partner 1 who requests the data. cMap will deliver the solution shown in the
following figure.


Figure 59: Business use case mapping user: the solution for using different classification systems
The following figure shows an example in the context of public procurement in Europe: for the bidding
process a public procurement agency needs product data classified according to the European mandatory
CPV. A supplier already uses GPC in his/her system to classify his/her data and needs to translate the GPC
codes into CPV codes to take part in the bidding process.

Figure 60: Business use case mapping user: example for the usage of different classification
systems

7.3 Actors who require the mapping

Many different business users could be identified. As mentioned above, any user exchanging classified
product data is a potential user of both the cMap mapping results and the cMap platform to maintain the
mapping. Among them are representatives along the whole supply chain and throughout all business
processes. It is therefore relevant for manufacturers, suppliers, public and private procurement and
everybody who participates in the supply chain.
Suppliers in particular, who are forced to deliver their product data in the form requested by their
customers, are very likely users. They might have to deliver data classified with CPV to public procurers
and data classified with UNSPSC, GPC and eCl@ss to three different customers.
The classification system end-users do not care about databases, systems and content. They have
information needs that they expect to be fulfilled helpfully, rapidly and potentially in larger quantities.

Figure 61: Business use case mapping user: example for the usage of different classification
systems


Figure 62: Transclassification can be read as a portmanteau of translation and classification


7.4 cMap platform roles

The business users in the market who actually use the cMap mapping results might be diverse. The roles
they could play to maintain the cMap mapping in the cMap online platform should be kept simple and will be
explained in the next chapter.
The minimum set of roles that need to be involved with the cMap platform has been identified:
1. the end-user of the cMap mapping (i.e. somebody who queries a mapping result)
2. the cMap platform administration authority (i.e. somebody who governs the platform)
3. the cMap platform provider (i.e. somebody who runs the platform)
4. the classification authorities (i.e. somebody who delivers the input for the mapping)
5. the mapping proposers (i.e. somebody who creates a mapping change request)
6. the quality managers (i.e. somebody who approves a mapping change request)
7. the cMap release manager (i.e. somebody who manages the mapping tables and, depending on strategy,
the cMap classification)
For cMap, 2. and 3. provide the technical basis, 4. provides the content basis. Roles 5. and 6. actually do the
mapping work, i.e. they provide the maintained mapping ensuring a four-eye-principle, whereas 7. is in
charge of publishing the mapping results. More roles can be imagined, but to keep the process lean only the
basic requirements are described.
The following figure gives an overview of these roles:

Figure 63: cMap roles: Overview


7.4.1 End-user
The cMap platform is open for users who require a mapping result for their classification code. They submit
queries to the platform sending a specific classification code in one classification system to receive a
mapping result, i.e. the relevant classification code in another classification system. The end-user could be
anybody.

7.4.2 Platform administration authority


The cMap administration authority is the organization that owns the cMap platform. This could be the CEN,
the European Commission, some other organization or even a private owner. It is highly recommended to
find a neutral owner. The administration authority:
governs and defines the cMap platform
defines the underlying business rules
defines the business model
defines and controls the processes and interfaces
controls that the platform is run according to the rules, processes and interfaces
authorizes the platform administrator

7.4.3 Platform Provider


The platform provider technically hosts and maintains the cMap platform, i.e. is responsible for the operation
of the cMap platform. It could be the administration authority itself (e.g. the EC or the CEN) or a contractor of
the administration authority. The role has the following tasks:
Has to integrate any new release supplied by the classification authorities (see below) per
standardized import (including any available language version)
Has to integrate the release update files and upgrade from the last to the current release of any
classification system in the most automated way possible
Has to inform quality managers and interested mapping proposers about upgrade changes that need
manual check (e.g. SPLIT, could be made automatically)
Has to export the mapping result from the platform into (a) format(s) of publication

7.4.4 Classification Authority


Of course, the classification authorities that are responsible for the maintenance of the classification systems
have to be involved as their cooperation is needed 34. The following basic rights and duties that are a
precondition for a collaborative mapping of the classification systems should be put in place. Beyond that,
further cooperation issues might be possible.
The crucial information that the product classification systems authorities have to provide are:
their release policies
the exact dates of new releases in advance
release update information including documentation as explained in chapter 6.
If the major milestones of their release project plans were published, then interested parties could plan
resources to update the mapping and conduct checks on, for example, BETA versions of the classifications.
This would be an easy-to-solve organizational precondition for the classification authorities.
Further, with the help of the release update information, the mapping maintenance can be assisted in a semi-automatic way.
Duties
o Make their release roadmaps and publication dates visible and transparent
o Supply each new release within a certain time period after publication to the cMap Platform
Administrator
incl. release update information
in all languages available
in the same documented format each time
34 In order to formally agree on the contribution of the classification authorities, a memorandum of understanding (MoU) or a similar
agreement for joint communications would be useful between the classification authorities and the administration authority. Many
examples can be found.


Rights
o Should be closely integrated into the information workflow of the cMap platform
o Should be involved in a kind of advisory or supervising board
o They should receive release and update information by the other classification authorities
right after publication to be informed about new classes in the other classification systems
(by the cMap Platform Administrator or the classification authorities themselves)

7.4.5 Mapping Proposer


After defining owners and hosts of the platform, the most essential roles are those who do the actual work. As
more than one role is involved, the work will not be done by making hard changes directly in the database, but
rather with the help of change requests. In the mapping maintenance workflow, one role requests a
mapping to be established (i.e. submits a mapping change request) and another role evaluates and approves
the request. Thereby, the four-eye-principle is guaranteed, which is also a requirement of the Gen-ePDC
process. The mapping proposer doing the actual work is generally described by the following
attributes:
Registers as an end-user of the cMap platform
o Marks areas (i.e. domains) where he/she would like to receive proposal notifications
o Will automatically be registered as interested in the areas of his/her own proposals
o Every other registered user in this area receives his/her proposal information
o The decision whether a registration is bound to certain business rules is up to the platform
administrator
Can submit mapping proposals for every domain
Shall be an industry expert, i.e. a specialist in some fields of knowledge
Ideally he/she shall combine expertise on certain fields with principal knowledge about the
classification systems in question and their technical maintenance
Will be the working ontologist ensuring the success of the whole system on the content side
Shall have access to the discussion forum
Will be informed about status changes of his/her change requests automatically by the cMap online
platform
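
The change request workflow described in this section can be sketched in a few lines of Python. This is only an illustration of the four-eye-principle, not a specification of the cMap platform: the class `MappingChangeRequest`, the status names and the user names are assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    # Status names are illustrative assumptions, not defined by this CWA.
    SUBMITTED = "submitted"
    APPROVED = "approved"
    REJECTED = "rejected"
    REWORK = "rework"


@dataclass
class MappingChangeRequest:
    proposer: str
    source_code: str   # class code in one classification system
    target_code: str   # proposed class code in another classification system
    status: Status = Status.SUBMITTED
    history: list = field(default_factory=list)

    def review(self, reviewer: str, decision: Status) -> None:
        """Record the decision of a second person on a submitted request."""
        if reviewer == self.proposer:
            # Four-eye-principle: nobody approves his/her own proposal.
            raise PermissionError("proposer and reviewer must be different persons")
        if self.status is not Status.SUBMITTED:
            raise ValueError("only submitted requests can be reviewed")
        if decision is Status.SUBMITTED:
            raise ValueError("a review must approve, reject or request re-work")
        self.status = decision
        self.history.append((reviewer, decision))

    def resubmit(self) -> None:
        """Return a request to the reviewer after the proposer has reworked it."""
        if self.status is not Status.REWORK:
            raise ValueError("only requests sent back for re-work can be resubmitted")
        self.status = Status.SUBMITTED


# Placeholder codes and user names, chosen for the example only.
request = MappingChangeRequest("proposer-a", "code-in-system-1", "code-in-system-2")
request.review("quality-manager-b", Status.REWORK)
request.resubmit()
request.review("quality-manager-b", Status.APPROVED)
```

The recorded history could also feed the automatic status notifications to the proposer mentioned in the list above.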

7.4.6 Quality manager


In order to check the quality and correctness of a mapping requested by the mapping proposer, somebody
has to act as the quality management body of cMap. The following attributes describe the cMap quality
manager:
Is the cMap quality management body that double-checks mapping proposals (i.e. guarantees the four-eye-principle), i.e. either approves, rejects or sends back the mapping proposal for re-work
Should be an industry and classification expert, i.e. a senior mapping maintainer
Could be in charge of only a certain specific domain (as it will be hard to find somebody being an
expert in all domains)
Users, i.e. industry experts, can apply for the position of quality manager for a specific domain
o Apply officially at the administration
Shall have access to discussion forum

7.4.7 Release manager


The cMap release manager is the role that publishes the mapping at a certain point in time. He/she takes the
decision when and how an update will be created (i.e. processed in the cMap platform) and made available
by publication. The scope depends on the mapping strategy, e.g. whether only the mapping tables or
additionally the cMap classification (compare 7.4.2) are released. The release manager:
manages the deliverables of the cMap platform
tracks the underlying change management rules, i.e. how changes of updates are documented
ensures that the mapping tables are easily accessible for the users by using a format that is relevant
and immediately usable


7.5 Business objects

7.5.1 Representing classifications


In an ePPS compliant architecture classification systems can be represented by means of hierarchically
structured categorization classes. In terms of ISO 13584-32 this means there is a graph of classes of
CATEGORIZATION_CLASS_Type which are in a categorization_class_superclasses relationship.
Note: As all classification systems covered in this CWA have a strictly hierarchical structure, the cardinality of
the categorization_class_superclasses collection depicted in Figure 64 will actually be limited to 0..1 for the
classes within each release of a classification system, i.e. the graph will be a tree.
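
As an illustration of this note, a strictly hierarchical classification can be held in a structure where each class refers to at most one superclass. The following Python sketch is a simplification and not the ISO 13584-32 data model; the class `CategorizationClass`, its field names and the class names in the example are assumptions, and the intermediate codes are derived from the eCl@ss code 25299090 only for the sake of the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CategorizationClass:
    code: str
    name: str
    # Cardinality 0..1: at most one superclass, so the graph is a tree.
    superclass: Optional["CategorizationClass"] = None

    def path(self) -> list:
        """Return the codes from the root of the tree down to this class."""
        codes = []
        node = self
        while node is not None:
            codes.append(node.code)
            node = node.superclass
        return list(reversed(codes))


# Hypothetical class names; only the leaf code appears in Table 56.
segment = CategorizationClass("25000000", "example segment")
main_group = CategorizationClass("25290000", "example main group", segment)
group = CategorizationClass("25299000", "example group", main_group)
leaf = CategorizationClass("25299090", "example class", group)
hierarchy_path = leaf.path()
```

Because every class has at most one superclass, the path from any class to the root is unique, which is what makes the hierarchy of a single release a tree.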

Figure 64: Categorization class (source: ISO 13584-32)


As a simplified big picture the classification systems in scope of this CWA each provide classifying elements
that form a hierarchy. The classifying elements can (and in the cases of GPC and eCl@ss actively are) be
further characterized by abstract descriptions of commodity classes as depicted in Figure 65:

Figure 65: Classification and product description


An additional and major challenge appears in classifications with properties, especially when a certain depth
of the classification is enforced. Such systems actually have two principal means of distinction: the
classification and the properties. For obvious or practical reasons this can lead to different modelling
approaches, such that in some areas of a standard classes may contain huge amounts of sub-types of
products distinguished by certain special properties while other areas extend to have a more granular
classification.
NOTE: Involving properties in the mapping tables was not in scope of the cMap project. However, it
seems obvious that involving them in the future could improve both the mapping results and their applicability,
namely when properties fully distinguish between, or at least narrow down, the available options in One-to-Many
and Many-to-Many mapping cases. Thus it would seem negligent not to consider this in the proposed system
architecture.

7.5.2 Representing mappings


7.5.2.1 Types of mappings
There are two principal variables for mappings:

scope, i.e. whether the mapping is intra or inter classification systems and whether it is within or
between different releases of a classification system
type of element, i.e. whether classes, characteristics or property values are mapped

There are additional criteria to describe mappings:

direction, i.e. whether the mapping is injective, surjective or bijective


completeness, i.e. whether the mapping is defined for all elements of a model or only for some
uniqueness, i.e. whether the mapping is lossy, ambiguous or neither.
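
The criteria listed above can be checked mechanically once a mapping is available as a set of (source, target) pairs. The following Python sketch shows one possible reading of these criteria; the function `describe_mapping` and its exact definitions (for example, treating "lossy" as several sources sharing one target) are assumptions made for illustration. The example data is the SPLIT of class 17019890 from Table 55.

```python
def describe_mapping(pairs, source_elements, target_elements):
    """Describe a mapping given as (source, target) pairs.

    source_elements and target_elements are the full sets of elements of the
    two models, so that completeness and surjectivity can be judged.
    """
    targets_of = {}
    sources_of = {}
    for source, target in set(pairs):
        targets_of.setdefault(source, set()).add(target)
        sources_of.setdefault(target, set()).add(source)

    complete = all(s in targets_of for s in source_elements)
    # Ambiguous: one source has several targets (as in a SPLIT).
    ambiguous = any(len(ts) > 1 for ts in targets_of.values())
    # Lossy: several sources share one target (as in a JOIN).
    lossy = any(len(ss) > 1 for ss in sources_of.values())
    injective = not lossy
    surjective = all(t in sources_of for t in target_elements)
    return {
        "complete": complete,
        "ambiguous": ambiguous,
        "lossy": lossy,
        "injective": injective,
        "surjective": surjective,
        "bijective": complete and injective and surjective and not ambiguous,
    }


split_pairs = [("17019890", "15320101"), ("17019890", "15320102")]
split_result = describe_mapping(
    split_pairs,
    source_elements={"17019890"},
    target_elements={"15320101", "15320102"},
)
```

For the split example the mapping is complete and not lossy, but ambiguous, so it is not bijective under the definitions assumed here.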

Real world examples:

1. eCl@ss introduced with Release 7.0 two models for products, a basic and an advanced description.
For each property in the basic model there is at least one equivalent property in the advanced
model. This is an example of intra-system mapping of properties in scope of the same release. The
mapping is complete and injective from basic to advanced and incomplete as well as in some
exceptions not unique from advanced to basic.
2. eCl@ss introduced with Release 7.0 the concept of classification update. For each change
performed at classification level a predecessor-successor relation is given and exported in machine
readable format so that an automatic update of classifications can be performed. This is an example
of intra-system, cross-release mapping of classes. The mapping is injective (from old to new
release), complete (but exported only for changed elements) and not unique (because of lossy join
and ambiguous split operations)
3. PROLIST introduced with Release NE100 3.2 the concept of transaction update. For each change
performed at level of the (complex) list of properties a predecessor-successor relation is given and
exported in machine readable format so that an automatic update of transactions based on the data
dictionary can be performed. eCl@ss adapted the concept with Release 7.0. This is an example of
intra-system, cross-release mapping of characteristics and property values. The mapping is injective
(from old to new release), complete (but exported only for those elements where needed) and
unique (except for some lossy cases of error corrections)
4. eCl@ss has absorbed in Release 7.0 the PROLIST NE100 data dictionary, which contains
elaborate lists of properties of devices used in process engineering and plant
automation (mostly in the chemical industry). So far only an inter-system mapping of classes has
been released; a mapping of characteristics and property values is in progress. The mapping is injective
and is intended to become complete and unique.


Scope of the current cMap project is to enable semantic interoperability (cf. CWA 16100, p. 45) by producing
an inter-release mapping at level of categorization classes of selected releases of the four classification
systems CPV, eCl@ss, GPC and UNSPSC. The scope of the cMap platform architecture is broader,
though, as it has to consider not only the variables and criteria described above but also:

evolution of the standards


extension with other standards
openness towards an extension of the cMap scope from classes only to classes, properties and
property values (where applicable)

7.5.2.2 Evolution and extension of mappings


The initial mapping produced in scope of this CWA is static, but it is in scope of the cMap platform
architecture to support the extension of this initial mapping by further releases of CPV, eCl@ss, GPC and
UNSPSC as well as by other classification systems.
Two principal approaches are possible to achieve this:

1. for each new release or classification system new tables are produced that contain the mapping
between the previously mapped systems and/or releases and the new one.
2. a set of intermediary elements is introduced (referred to in the rest of this text as the cMap dictionary for
simplicity's sake) which help abstract the mapping and serve as a medium of exchange between the
classification systems and releases. For each new classification system or release then only the
mappings to and from the cMap dictionary are created, mappings to the other system get generated
automatically.
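
How a mapping between two systems could be generated from their mappings to and from the cMap dictionary (approach 2) is sketched below in Python. The function `compose_via_dictionary` and all identifiers in the example are hypothetical; the sketch only shows the composition step and ignores releases and quality checks.

```python
def compose_via_dictionary(to_dictionary, from_dictionary):
    """Derive a source-to-target mapping through the cMap dictionary.

    to_dictionary:   {source class code: set of cMap dictionary entries}
    from_dictionary: {cMap dictionary entry: set of target class codes}
    """
    derived = {}
    for source, entries in to_dictionary.items():
        targets = set()
        for entry in entries:
            # Entries without a counterpart in the target system add nothing.
            targets |= from_dictionary.get(entry, set())
        derived[source] = targets
    return derived


# Made-up identifiers for two systems S and T and for dictionary entries d1, d2.
system_s_to_dictionary = {"S1": {"d1"}, "S2": {"d1", "d2"}}
dictionary_to_system_t = {"d1": {"T1"}, "d2": {"T2", "T3"}}
derived_mapping = compose_via_dictionary(system_s_to_dictionary, dictionary_to_system_t)
```

With this design a new classification system only has to supply its own two mappings (to and from the dictionary); the mappings to every other system follow from the composition.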

Both approaches have pros and cons as follows:

Approach 1
pro
o no cMap dictionary to maintain
con
o combinatoric explosion of possible mapping pairs when adding more classification systems / releases (see example below)
o usability of many-to-many mappings limited: restricted by the available classification systems

Approach 2
pro
o adding a classification system or release always adds exactly two mapping pairs
o many-to-many mappings can be resolved in regard to cMap dictionary and made more usable
con
o additional effort required for evolution of cMap dictionary
o cMap dictionary may be expensive to maintain because of enforced formal mediation process

Nota bene: Assuming that every release of every classification standard gets mapped to all other releases
very quickly leads to a kind of combinatorial explosion; the initial cMap mapping already has to consider 12
mapping tables for only four classification systems. Adding further systems one by one increases this number
each time by twice the new number of systems minus 2, i.e. to 20, 30, 42, ... for 5, 6, 7, ... classification systems.
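
The numbers in this note follow from counting one mapping table per ordered pair of systems, i.e. n(n-1) tables for n systems, whereas approach 2 needs two mapping pairs per system. A small Python check, with function names chosen only for this illustration:

```python
def direct_tables(n: int) -> int:
    """Approach 1: one table per ordered pair of classification systems."""
    return n * (n - 1)


def dictionary_tables(n: int) -> int:
    """Approach 2: one mapping to and one from the cMap dictionary per system."""
    return 2 * n


direct_counts = [direct_tables(n) for n in (4, 5, 6, 7)]
dictionary_counts = [dictionary_tables(n) for n in (4, 5, 6, 7)]
# Increase when moving from n - 1 to n systems: n(n-1) - (n-1)(n-2) = 2n - 2
increments = [direct_tables(n) - direct_tables(n - 1) for n in (5, 6, 7)]
```

The direct approach grows quadratically with the number of systems, while the dictionary approach grows linearly; the same reasoning applies if releases rather than systems are counted.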

7.5.2.3 Representation of mappings


In this section two approaches for the representation of mappings are described: a first one that is human
readable and well suited to capture the status quo, but not easily machine tractable and not considering
evolution and extension, and a second one that aims to represent the data in a machine-tractable way and
is open to the evolution and extension required in the cMap platform architecture, but is less accessible for
immediate consumption by the layman.


Approach 1
Possible goals:

1. do not introduce additional elements


2. focus on direct transclassification between a pair of (classification system, release) tuples
Representation of mappings in this approach would follow the spirit of the Excel sheets used to capture the
initial mappings.
Approach 2
Possible goals:

1. bring classification systems in line


2. allow for interoperability when using classification systems
There are different visions of the result of approach 2:

1. establish a cMap classification as kind of harmonized and universal classification system


2. create an interoperability layer of is_case_of relations between the leaves of the classifications
including intermediate cMap nodes as a tool
A mapping example for the rare case that a 121 mapping exists for all four classification systems covered in
cMap is given in Figure 66 below. It contains the intermediate class ("cMap underpants") expressing the
121 mapping, the classification classes from the four systems above it, as well as the characterization classes
from the two systems that have properties below it. A more systematic approach of representing the cMap
mapping results according to approach 2 in the system of the proposed architecture is detailed in the
following subsection 7.5.3.


Figure 66: Example for a 121 mapping of classes from all four systems from the cMap mapping
tables

7.5.2.4 Outlook: Mapping more than just classification systems


Achieving interoperability between classification codes is a big step that precedes a possibly even bigger
leap: an inter-standard equivalence of properties as well. For this to become possible, a harmonized superset
of the data types supported in all classification and characterization systems is needed, as
well as a harmonization of the units of measurement used by these systems.
Furthermore, the mapping is more likely to succeed eventually if those standards that
describe their classified products with properties evolve towards models that are comparable in the
granularity of their descriptions.
Then, for the common parts of the classifications and characterization systems, properties can be mapped in
an equivalence relation that subsequently allows a user to automatically adjust the classification code from
one system to the other, but also have the values describing individual products taken over from one
referenced model to the other.


Figure 67: Mapping of classes and additional mapping of properties

7.5.2.5 Representing no mapping found


Besides the actual mappings, the knowledge that no mapping was found in one or many systems also has to
be recorded. This can be modelled by a free "no mapping found" relation between a cMap class and an
identifier representing the release of a certain system. The relation is established in all cases where an
intermediate class has no correspondence in a certain release of a classification system.
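
One possible way to record this knowledge is sketched below. The class and field names (Release, IntermediateClass, state_for) and the system names S1 to S3 are hypothetical placeholders and not part of the cMap data model; the sketch only illustrates the difference between "examined without a hit" and "not examined yet".

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Release:
    system: str   # placeholder name such as "S1", not a real classification system
    version: str


@dataclass
class IntermediateClass:
    name: str
    # is_case_of targets, grouped by the release they belong to
    is_case_of: dict = field(default_factory=dict)
    # releases that were examined without finding a correspondence
    no_mapping_found: set = field(default_factory=set)

    def state_for(self, release: Release) -> str:
        if self.is_case_of.get(release):
            return "mapped"
        if release in self.no_mapping_found:
            return "no mapping found"
        return "not examined"


s1, s2, s3 = Release("S1", "1.0"), Release("S2", "2.0"), Release("S3", "3.0")
ci = IntermediateClass("Ci", is_case_of={s3: {"C3_1"}}, no_mapping_found={s1})
print(ci.state_for(s3), "/", ci.state_for(s1), "/", ci.state_for(s2))
```

Keeping the relation explicit per release, rather than inferring it from a missing entry, is what allows the two situations to be told apart.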

7.5.3 Bringing in line the cMap mapping cases


The following subsections describe how the different mapping relationships of cMap can be expressed in a
system that makes use of cMap classes and the is_case_of relation. The patterns of the relationships are
independent of particular classification systems; therefore, in the representation examples below the
systems are named Sn and not by any proper name. The platform which maintains the intermediate
classes stands for the cMap platform.

NOTE: in an actual implementation it is likely that there will be a second level of intermediate classes to
make the handling of sets and combinations of is_case_of relationships between multiple systems more
manageable and easier to maintain.

7.5.3.1 Representation of none to none (N2N) mapping


An N2N mapping between no class from S1 and no class from S2 is expressed as follows: there is an intermediate class Ci
which is case of a non-empty set of classes {C3_1..n} from a third system S3. Ci is not case of any class from S1 or S2.
A "no mapping found" relation is recorded between Ci and S1 and between Ci and S2.

Figure 68: Representation of no mapping found

7.5.3.2 Representation of None-to-One (N21) and One-to-None(12N) mapping


An N21 mapping between no class from S1 and a class C2 from S2, or a 12N mapping between a class C2 from
S2 and no class from S1, is expressed as follows: there is exactly one intermediate class Ci which is case of a
non-empty set of classes {C3_1..n} from a third system S3. Ci is not case of any class from S1, but it is case of only C2 (in
regard to S2). A "no mapping found" relation is recorded between Ci and S1.

Figure 69: Representation of N21 and 12N mapping

7.5.3.3 Representation of None-to-Many (N2M) and Many-to-None (M2N) mapping


An N2M mapping between no class from S1 and a set of classes {C2_1..m} from S2 (with m ≥ 2), or an M2N
mapping between a set of classes {C2_1..m} from S2 (with m ≥ 2) and no class from S1, is expressed as follows:
there is an intermediate class Ci which is case of a non-empty set of classes {C3_1..n} from a third system S3. Ci is not case
of any class from S1, but it is case of every member of the set {C2_1..m} (but only these in regard to S2).
A "no mapping found" relation is recorded between Ci and S1.

Figure 70: Representation of N2M and M2N mapping

7.5.3.4 Representation of One to One (121) mapping


A 121 mapping between C1 from S1 and C2 from S2 is expressed as follows: there is exactly one class Ci
which is case of only C1 (in regard to S1) and only C2 (in regard to S2).


Figure 71: Representation of 121 mapping

7.5.3.5 Representation of One to Many (12M) and Many to One (M21) mapping
A 12M mapping between C1 of S1 and a set of classes {C2_1..m} from S2 (with m ≥ 2), or an M21 mapping
between a set of classes {C2_1..m} from S2 (with m ≥ 2) and C1 from S1, is expressed as follows: there is exactly
one intermediate class Ci which is case of only C1 (in regard to S1) and of every member of the set {C2_1..m} (but
only these in regard to S2).


Figure 72: Representation of 12M and M21 mapping

7.5.3.6 Representation of Many to Many (M2M) mapping


An M2M mapping between a set of classes {C1_1..n} from S1 (with n ≥ 2) and a set of classes {C2_1..m} from S2
(with m ≥ 2) is expressed as follows: there is an intermediate class Ci which is case of every member of the set
{C1_1..n} (but only these in regard to S1) and of every member of the set {C2_1..m} (but only these in regard to S2).


Figure 73: Representation of M2M mapping
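
The patterns of subsections 7.5.3.1 to 7.5.3.6 can be summarized in a short sketch that derives the name of the mapping case from the classes an intermediate class Ci is case of. The function names and the dictionary layout are assumptions made for illustration only; they are not prescribed by the cMap architecture.

```python
def cardinality(classes: set) -> str:
    """Return 'N' for no class, '1' for exactly one class, 'M' for two or more."""
    if not classes:
        return "N"
    return "1" if len(classes) == 1 else "M"


def mapping_case(case_of_s1: set, case_of_s2: set) -> str:
    """Name the mapping case between S1 and S2 as seen from one intermediate class Ci."""
    return f"{cardinality(case_of_s1)}2{cardinality(case_of_s2)}"


def no_mapping_found(case_of: dict) -> list:
    """List the systems for which Ci needs a 'no mapping found' relation."""
    return sorted(system for system, classes in case_of.items() if not classes)


# Hypothetical intermediate class: case of one S1 class, two S2 classes, none in S3.
ci = {"S1": {"C1"}, "S2": {"C2_1", "C2_2"}, "S3": set()}
print(mapping_case(ci["S1"], ci["S2"]))
print(no_mapping_found(ci))
```

In this reading the mapping case is not stored separately but follows from the is_case_of relations of the intermediate class.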

7.6 Use cases

There are several possible use cases for a change in the cMap platform. The ePPS documentation already
describes the use cases of a classification process; it is the basis for this section, which focuses only on this
CWA's objective: the description of the mapping.
First, the mapping status has to be defined. It applies to each class in each classification system and in relation
to any target classification system, as shown in the following figure:

Figure 74: cMap Mapping Status of Classification Classes


7.6.1 Use Case 1: Query for mapping


End-users (AC01) have a need for information: they have data on a certain product in some system or even
on paper and may even have rudimentary classification information attached. What they would like to obtain
is a certain piece of classification information, typically in order to include this information in a response to a
certain request for information on their product. The cMap system shall enable them to retrieve the desired
piece of classification information via a web GUI or web service, depending on the access method. The
system will display whether a mapping exists (mapping status MAPPED), whether a mapping might have changed
(TO BE CHECKED), whether no mapping exists (NO MAP) or whether no mapping attempt has been made yet (BLANK).
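
A query of this kind could behave as in the following sketch. The in-memory store, the class codes and the function name query_mapping are hypothetical; the sketch assumes that a missing record is reported as BLANK.

```python
from enum import Enum


class MappingStatus(Enum):
    MAPPED = "MAPPED"
    TO_BE_CHECKED = "TO BE CHECKED"
    NO_MAP = "NO MAP"
    BLANK = "BLANK"


# Hypothetical store: (source class, target system) -> (status, target classes)
MAPPINGS = {
    ("S1:0001", "S2"): (MappingStatus.MAPPED, ["S2:A"]),
    ("S1:0002", "S2"): (MappingStatus.TO_BE_CHECKED, ["S2:B", "S2:C"]),
    ("S1:0003", "S2"): (MappingStatus.NO_MAP, []),
}


def query_mapping(source_class: str, target_system: str):
    """Return (status, target classes); a missing record means no attempt was made yet."""
    return MAPPINGS.get((source_class, target_system), (MappingStatus.BLANK, []))


status, targets = query_mapping("S1:0002", "S2")
print(status.value, targets)
```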

7.6.2 Use Case 2: Manage mapping


There are several possible ways of managing the cMap mapping, represented by various use cases. Some
of these use cases apply to change requests, whereas others apply to the workflow actions that result from the
change requests:
UC02.01: change request - new mapping
UC02.02: change request - edit mapping
UC02.03: change request - delete mapping
UC02.04: workflow action - process mapping change requests
UC02.05: workflow action - verify mapping
UC02.06: workflow action - release mapping
UC02.07: workflow action - withdraw mapping

7.6.2.1 UC02.01: Change request - New mapping


It may seem to be a rather easy issue to add a mapping where no mapping has existed yet. But two different
situations may apply:
mappings that were already dealt with, but simply no mapping hit was found (NO MAP)
mappings that were not yet handled at all (BLANK)
After an initial mapping upload was done on the basis of the cMap project mapping results, there is in fact
already an existing mapping in the database for every classification class. Even if there is a None-to-One,
One-to-None, None-to-Many, Many-to-None or None-to-None relationship, there is still a relationship defined
in the platform to show that the mapping action was executed without any hits (NO MAP).
This relationship has to be marked so that a user knows that the mapping action was already finished, but
without any result. If a user checks the mapping result (NO MAP), but knows a proper mapping hit, then this
change of information is to be considered as a correction, i.e. an edit of an existing mapping (see UC02.02
below).
The case of additional mapping only applies for the mapping of a new item in a new release of a
classification system that was not yet handled at all (BLANK). This has to be marked completely blank so
that a user can identify it as missing a mapping action.
In both cases, the process will be similar to use case UC02.02 which is described below. No matter if a
mapping already exists (MAPPED) or a mapping without results shall be edited (NO MAP) or an additional
mapping (BLANK) shall be created: it is always the process of editing a mapping.

7.6.2.2 UC02.02: Change request - Edit mapping


If a mapping action was already executed on a certain classification item the result will be marked so that
everybody can see and evaluate this mapping result. As errors can always occur and mappings might not
always be accurate or exhaustive, a user must have the possibility to request a change of this mapping
result.
The user therefore submits a change request including the current mapping result, the new proposed
mapping result and a reason to explain why it needs to be changed.

One person requests a change and another person has to evaluate it. This is how a mapping is
corrected and completed.

Figure 75: UC02: Manage mapping

7.6.2.3 UC02.03: Change request - Delete mapping


Of course, a mapping might have been a mistake. Therefore, a functionality to delete a mapping has to be
implemented as well. The use case does not differ from the edit mapping use case and can therefore be
described as shown in Figure 75 ("Manage mapping") above.

7.6.2.4 UC02.04: Workflow action - Process mapping change requests


In order to guarantee the four-eyes principle, more than one role is involved in the process. Therefore, the
mapping change request has to be processed through a workflow. In order to do so, a change request has to
have at least the following characteristics:
A unique identification code
A status that documents the progress of processing the change request
A requestor, i.e. the mapping proposer's user identification
The type and the content of the change, i.e. the proposed mapping relation between two
classification codes
The creation date of the change request
The reason for the request
At least the following statuses in the workflow have to be considered:
NEW: A mapping proposer creates a new proposal for a mapping. The request was created, but not yet
submitted, as the mapping proposer might still want to edit it.
SUBMITTED: The mapping proposer submits the request to the platform. An automatic notification
is sent to the quality manager.
REWORK: The quality manager evaluates the change request as not 100 % correct and
sends back the mapping change request to be reworked by the mapping proposer.
REJECTED: The quality manager evaluates the change request as incorrect and rejects it.
ACCEPTED: The quality manager evaluates the change request to be correct and accepts it.
It can now be processed as a mapping relation.
RELEASED: After being accepted by the quality manager as a correct mapping proposal, the
result of the change request is published by the release manager. The change
request is set to the status RELEASED. The mapping relation is then published in the cMap
platform to the public.
WITHDRAWN: A released mapping relation is requested to be withdrawn by a mapping proposer as
it is incorrect or outdated. If the quality manager accepts this request, the relation is
set to status WITHDRAWN.
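
The characteristics and statuses listed above can be sketched as a small state machine. The transition table is an assumption read from the status descriptions (the workflow figure itself is not reproduced here), and all names (ChangeRequest, ALLOWED, move_to, the identifiers in the example) are hypothetical. For simplicity the sketch keeps WITHDRAWN on the same object, whereas the text applies it to the released mapping relation.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class Status(Enum):
    NEW = "NEW"
    SUBMITTED = "SUBMITTED"
    REWORK = "REWORK"
    REJECTED = "REJECTED"
    ACCEPTED = "ACCEPTED"
    RELEASED = "RELEASED"
    WITHDRAWN = "WITHDRAWN"


# Assumed transitions, read from the status descriptions in the text.
ALLOWED = {
    Status.NEW: {Status.SUBMITTED},
    Status.SUBMITTED: {Status.REWORK, Status.REJECTED, Status.ACCEPTED},
    Status.REWORK: {Status.SUBMITTED},
    Status.REJECTED: set(),
    Status.ACCEPTED: {Status.RELEASED},
    Status.RELEASED: {Status.WITHDRAWN},
    Status.WITHDRAWN: set(),
}


@dataclass
class ChangeRequest:
    request_id: str   # unique identification code
    requestor: str    # user identification of the mapping proposer
    change_type: str  # e.g. "new", "edit" or "delete"
    content: tuple    # proposed relation between two classification codes
    reason: str
    created: date = field(default_factory=date.today)
    status: Status = Status.NEW

    def move_to(self, new_status: Status) -> None:
        """Change the status, refusing transitions the workflow does not foresee."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(f"{self.status.value} -> {new_status.value} is not allowed")
        self.status = new_status


cr = ChangeRequest("CR-0001", "proposer-1", "edit", ("S1:0001", "S2:A"), "better target class")
for step in (Status.SUBMITTED, Status.REWORK, Status.SUBMITTED, Status.ACCEPTED, Status.RELEASED):
    cr.move_to(step)
print(cr.status.value)
```

A real platform would additionally check that the role performing a transition differs from the requestor, in line with the four-eyes principle.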

The workflow is based on the ePDC Maintenance Procedure as described in CWA 15295:2005, p. 35:

Figure 76: UC02.04: Process mapping change request (based on ePDC maintenance)

7.6.2.5 UC02.05: Workflow action - Verify mapping


A mapping change request is submitted by the mapping proposer. It is then the quality manager's task to
verify the mapping proposal. As shown in the figure above, the quality manager checks the proposal and the
following three results are possible:
The quality manager does not accept the request and sets its status to REJECTED including a
reason. The process ends here. This is only possible with requests that are totally unacceptable.
The quality manager does not accept all features of the request and sets its status to REWORK
including a reason and a proposal for improvement, e.g. he/she suggests a better alternative as a
target mapping class. The process starts again at the beginning as the requestor can now edit and
improve the request. The status REWORK is similar to status NEW.
The quality manager accepts the request and sets its status to ACCEPTED. The next process step
is already the last one: to release the mapping.

7.6.2.6 UC02.06: Workflow action - Release mapping


After verifying and accepting a mapping change request, the result shall be published. The release manager
role will release the mapping relation. The status of the change request changes from ACCEPTED to
RELEASED to document its publication.

7.6.2.7 UC02.07: Workflow action - Withdraw mapping


A mapping can always be wrong or become outdated. The ePDC maintenance procedure does also consider
this status change. The following figure shows the adapted process for cMap based on what is described in
CWA 15295:2005, p. 35.


Figure 77: UC02.07: Withdraw mapping (based on ePDC maintenance)

7.6.3 Use case 3: Load classification system


There are several possible use cases involved in the load of a classification system:
UC03.01: Load a new release of an already integrated classification system
UC03.02: Load a release of a new classification system
UC03.03: Upload release update information
UC03.04: Apply release update information on classification systems and/or mapping
UC03.05: Delete a release of an integrated classification system

7.6.3.1 UC03.01: Load - New release


Probably the highest effort in the work of the cMap platform is the upgrade to a newer release of one or more
classification systems. As shown in section 6, new releases are published by all classification systems on a more or
less regular basis. As shown in section 6.4, releases are backward compatible at least in parts. The release update
information described in detail in section 6.4 is a crucial tool to maintain the mapping after uploading a new
release version of a classification system.
The use case is shown in the following figure and explained afterwards:


Figure 78: UC03.01: Load - New release

The use case can be described as follows:


After publishing a new release of a classification system, the responsible classification authority
provides the new version including the release update information (see also: section 6.4).
The platform provider uploads the new release and the release update information to receive the
following result:
x mappings could be updated automatically due to the release update information and are marked
as <UPDATE>
y mappings are marked as <TO BE CHECKED> as they could not be mapped 100%, if e.g. a
mapping is One-to-Many on the basis of the release update information and has to be manually
checked
z mappings cannot be found automatically and will be marked as <BLANK>
Both the registered mapping proposers and the quality manager(s) receive the information about the
exact classes and their status. On this basis, they can maintain the mapping on the new release of
the classification system

7.6.3.2 UC03.02: Load - New classification system


Apart from the four main classification systems mentioned in this project (CPV, GPC, UNSPSC, eCl@ss) the
cMap platform shall be able to integrate other classification systems if needed. The addition of a new
classification system is probably the highest effort in the work of the cMap platform after the upload of a new
release (incl. test application of new release).
The upload of a brand new classification system might be connected to the upload of an initial mapping.
The use case is shown in the following figure and explained afterwards:


Figure 79: UC03.02: Load - New classification system

The use case can be described as follows:


The responsible classification authority provides the classification system which is integrated into the
cMap platform for the first time
Additionally, if available, an initial mapping is uploaded into the cMap platform
No further process is needed at this stage
Both the registered mapping proposers and the mapping administrator(s) receive the information
about the added classes and their status. On this basis, they can start or maintain the
mapping on the new classification system

7.6.3.3 UC03.03: Load - release update information


The cMap platform shall be able to import all relevant release update information that is described in detail
in section 6.4. The files are all readable in Excel, even though the information is structured differently. The
cMap platform has to be able to import all information delivered by the classification authorities.

7.6.3.4 UC03.04: Apply - release update information


To import the release update information is one thing, but the most important part is to apply the information
on existing data. As shown in section 6.4, the relevant change types for classes are the same in every
classification system. The following table shows the type of change, a brief explanation and the
consequence for the existing cMap mapping. It also displays the mapping status after applying the release
update information to existing mappings.


Figure 80: UC03.04: Apply release update information


Examples of the possible changes are described in the following two figures. The first three columns show
the existing mapping (in release n). Column 4 displays the change applied with release n+1. Columns 5
to 7 indicate the mapping result and status after applying release n+1.
Explanation of the example: CPV codes were mapped to GPC codes (except for the new classes). Then a new
CPV release is uploaded and the release update information is uploaded and applied. Certain MAPPED classes
remain MAPPED in their mapping status; others have to be checked (TO BE CHECKED) or are BLANK. The cMap platform
shall send automatic notifications of TO BE CHECKED and BLANK classes to interested mapping proposers
and quality managers. As there is a mapping from the CPV to the GPC and from the GPC to the CPV, there
are two tables, one for each source classification system. Tables like these would be created for every
other target classification system (i.e. from CPV to eCl@ss, from CPV to UNSPSC).


Figure 81: UC03.04 Example: Apply release update information of mapped CPV classes to GPC

Figure 82: UC03.04 Example: Apply release update information of mapped GPC classes to CPV


As the table clearly shows, EDIT and MOVE are changes that keep the MAPPED status. A JOIN will keep
the MAPPED status, but might increase the number of target classes. A SPLIT requires a check, as
a correct successor relation has to be identified. A NEW class requires an initial mapping, whereas a
DELETED class results in a TO BE CHECKED status, as the target class was deleted and a new mapping
target has to be checked.
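
The rules of the preceding paragraph can be sketched as a simple lookup. The dictionary and function names are hypothetical, the class codes are placeholders, and the status BLANK for NEW classes is an assumption based on the description of unmapped items in UC02.01.

```python
# Assumed mapping status per change type, following the paragraph above.
STATUS_AFTER_CHANGE = {
    "EDIT": "MAPPED",
    "MOVE": "MAPPED",
    "JOIN": "MAPPED",          # the number of target classes may grow
    "SPLIT": "TO BE CHECKED",  # a successor relation has to be identified
    "NEW": "BLANK",            # an initial mapping is still required
    "DELETED": "TO BE CHECKED",
}


def apply_release_update(changes: dict) -> dict:
    """Map each changed class (code -> change type) to its resulting mapping status."""
    return {code: STATUS_AFTER_CHANGE[change] for code, change in changes.items()}


def classes_to_notify(statuses: dict) -> list:
    """Classes whose status calls for a notification to proposers and quality managers."""
    return sorted(code for code, status in statuses.items()
                  if status in ("TO BE CHECKED", "BLANK"))


changes = {"A": "EDIT", "B": "SPLIT", "C": "NEW", "D": "JOIN", "E": "DELETED"}
statuses = apply_release_update(changes)
print(classes_to_notify(statuses))
```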

7.6.3.5 UC03.05: Delete - a release of an integrated classification system


If a whole release of an integrated classification system is deleted for whatever reason, then all mapping
results are lost. The results might be exported first to be saved for later, as an exported result could easily be
imported again. For all involved parties this is the use case that produces the least work as no action is
required at all. All mapping relations are deleted and no mapping status is set for the classes in the target
classification systems. If a release is deleted from the cMap platform, this would not result in any changes
concerning the mapping results between other classification systems.
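
The effect of deleting a release can be illustrated as follows. The sketch assumes, for simplicity, that mapping relations are stored as direct pairs between releases; the type MappingRelation, the function delete_release and the release names are hypothetical.

```python
from typing import NamedTuple


class MappingRelation(NamedTuple):
    source_release: str
    source_class: str
    target_release: str
    target_class: str


def delete_release(relations: list, release: str) -> list:
    """Drop every relation that starts or ends in the deleted release."""
    return [r for r in relations
            if release not in (r.source_release, r.target_release)]


relations = [
    MappingRelation("S1-r1", "C1", "S2-r1", "C2"),
    MappingRelation("S2-r1", "C2", "S3-r1", "C3"),
    MappingRelation("S1-r1", "C1", "S3-r1", "C3"),
]
remaining = delete_release(relations, "S2-r1")
print(remaining)
```

Only the relation between S1-r1 and S3-r1 survives; it is returned unchanged, which mirrors the statement that mappings between other classification systems are not affected.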

7.7 Requirement analysis

7.7.1 Architectural requirements


7.7.1.1 AR01: Common format for import and export
It is a common best practice to have a data interchange format (or family thereof) that is used as reference
format for export of data as well as for data import.
Additional use cases may bring up the need to support specific (legacy) formats that can only be read in or
for conversion programs to translate between legacy formats and the preferred formats.
The platform has to be able to import the classification systems and their updates as well as their release
update information, and to export the mapping of classification classes so that users can process them
electronically. This can be guaranteed either by standardizing the import/export interface or by standardizing
the deliverables to import. Similar to the recommendation of CWA 15295:2005 by the ePDC project to
reduce the number and variety of exchange formats (p. 23) of classification standards, the goal could be to
standardize the deliverable of each classification authority to the cMap platform. That means that both the
content and the release update information could be delivered to cMap in a standardized way. This would
enable cMap to design a platform to import all classification systems and maintain the mapping between
them when upgrading to new releases. An open question would be whether and how the classification authorities
would be willing and able to assist in standardizing their data formats and to migrate to the standardized
format (see footnote 35).
Therefore, the most practicable solution is to simply adjust the import/export interface to cope with the
different formats delivered by the classification authorities.

7.7.1.2 AR02: Web or cloud based platform


The platform should be web- or cloud-based and provide two main interaction channels:
a graphical user interface directed at human users using a web browser
a web service interface directed at automated queries by machines
Note that a decision what type(s) of web service (SOAP, REST) and what type(s) of answers (XML, JSON,
RDF, CSV, ...) to support depends mostly on the requirements of the expected market.

7.7.1.3 AR03: Multilinguality of platform and its content


The platform needs to support multilinguality. On the one hand the platform GUI must allow for localization,
on the other hand the content maintained in the platform may be multilingual. The objective must be that
users whose native language is not English can still participate effectively (see footnote 36).

35 For import and export requirements see CWA 15295:2005, p. 61


36 See also CWA 15295:2005, p. 60.


For each collection of entities kept or maintained on the platform, requirements regarding the content
language have to be expressible separately.
See also: Section 4 of CWA 16100 (ePPS).

7.7.1.4 AR04: Integrate all available content


A mapping can only be done on the class level, i.e. a mapping of properties and property values is out of
scope of this project. Nevertheless, if they exist, the properties and property values as well as keywords and
synonyms have to be displayed in the platform so that the mapping proposer can take them into account when
doing the mapping. This way, all available information about a classification class (its name, definition,
keywords, assigned properties etc.) combined with its location in the class hierarchy gives the best available
hint (see footnote 37) on how to map it to another classification class.

7.7.2 Platform requirements


To take the cMap platform into production, an initial load of the mapping of several classification systems has
to be imported. In this case it should be the mapping delivered within this cMap project, i.e. the mapping of
the following classification systems in the named versions:
CPV 2008
eCl@ss 6.0.1
GPC As at 31 August 2009
UNSPSC v11.1201
The mapping was done on the basis of the particular English versions. The initial mapping will be described
in detail in section 5.1 with the result of several mapping tables that will be made available by the cMap
project.
By the time this document will be published, the following new releases of the four classification systems will
be available:
Classification System | Initial Mapping on Version (n) | Following Release n+1 | Following Release n+2 | Following Release n+3 | Following Release n+4 | Following Release n+5
CPV | 2008 | - | - | - | - | -
GPC | 2009-08-31 | 2010-06-01 | 2010-12-01 | 2011-06-01 | 2011-12-01 (planned) | -
UNSPSC | v11 | v12 | v13 | v14 | - | -
eCl@ss | 6.0.1 | 6.1 | 6.2 | 7.0 | 7.1 | 8.0 (planned)

This means that since the initial mapping 11 new releases of the classification systems have been published.
How to handle this was described in use case 3 (section 7.6.3).
The initial mapping will simply be an upload of the information recorded in the Excel spreadsheets that
were produced in the framework of this project. No process has to be defined for this initial upload.
Further, the mapping architecture has to be designed in a way that enables the addition of classification
systems apart from the four chosen ones. The fact that a 4-level class hierarchy cannot be taken for granted
has to be taken into account for the class mapping process.

37 Note: consider SuperClass, SubClass, IdenticalClass and SimilarClass CWA15295:2005, p.27


Figure 83: cMap initial load


The following are minimal requirements of the cMap platform:

7.7.2.1 PR01: User registration


The platform includes a registration process; the registration type depends on the platform strategy (e.g. if it
enables any user to register).

7.7.2.2 PR02: Change requests


The platform includes a change request functionality to create, submit, evaluate, edit, approve or reject
mapping requests.

7.7.2.3 PR03: Workflow


The platform includes a processing workflow for the change requests with changing status throughout the
consecutive stages of the workflow.

7.7.2.4 PR04: Comfortable GUI


The platform provides a drag & drop graphical user interface (GUI) so that users can manage mapping
proposals from one classification system to another easily and rapidly. The language being used shall be as
non-technical as possible to encourage users. Further, a clear guidance through the platform (like a help
function) has to be provided and a mechanism that allows change requests to be added quickly.
In addition, as all user activities take place here, proactive support with the necessary tools that facilitate the access,
use and governance of the mapping table should be provided directly in the GUI, such as links to:
a. Business case studies and literature
b. Migration tools
c. Translations
d. Training and education programmes
e. Access to experienced services to help companies get started
f. Contact the administration authority


7.7.2.5 PR05: E-Mail notification


The platform supports automated information exchanges such as optional email notifications about
registration, submitted change requests, status changes of change requests etc.

7.7.2.6 PR06: Individual setting of platform and content language


The platform gives the user the possibility to individually configure the language setting of both the displayed
content language and the displayed platform language, if available in more than one language
(recommended).

7.7.2.7 PR07: Discussion forum


The platform includes a discussion forum to facilitate the evaluation of change requests.

7.7.2.8 PR08: Search function


The platform provides an elaborate search function, including a semantic-based similarity search function that
proposes candidates for mapping in the classification systems and takes into account all the information
provided by the different classification systems to describe a class (e.g. definitions, keywords, attributes,
value lists etc.).
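
How candidate proposals might be ranked is sketched below. Note that this is a deliberately simple lexical stand-in (Jaccard overlap of words), not a semantic similarity search; the function names, class codes and texts are hypothetical, and a production platform would presumably use a more capable method.

```python
import re


def tokens(text: str) -> set:
    """Lower-cased words and numbers of a class description."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set, b: set) -> float:
    """Share of common tokens among all tokens of both descriptions."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def propose_candidates(source_text: str, targets: dict, limit: int = 3) -> list:
    """Rank target classes (code -> descriptive text) by overlap with the source text."""
    src = tokens(source_text)
    scored = [(jaccard(src, tokens(text)), code) for code, text in targets.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(code, round(score, 2)) for score, code in scored[:limit] if score > 0]


targets = {
    "T1": "cotton underpants for men",
    "T2": "steel screws",
    "T3": "underpants",
}
print(propose_candidates("men underpants cotton", targets))
```

The descriptive text per class could concatenate name, definition, keywords and value lists, so that all available information contributes to the ranking.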

7.7.2.9 PR09: Support for mapping and its representation


The platform needs to support the data model and relations used to represent the mappings (i.e. the data
model proposed in Gen-ePDC / ISO 13584 / IEC 61360) as well as offer a transparent layer between the
actions of users and their result as a translation into relations between the classification system classes and
the intermediate classes as defined in subsection 7.5.3. Releases also have to be identified.

7.7.3 Process-related requirements


All use cases have to be supported by the cMap platform. That includes all changes of both the change
request status and the mapping status, combined with the automatic E-Mail notification function. I.e. the
system has to automatically change all statuses as described in the preceding chapters. That also includes
the upload of a new classification system, the upload and application of release update information and any
change made by the defined roles of the cMap platform.

7.7.4 End-user requirements


As mentioned in use case 1 the query for mapping has to be possible for an end-user. The requirements are
already mentioned in the platform requirements section above. An end-user needs a comfortable GUI that
provides an elaborate search function and lets the user set both platform language and content language.
Users might need the E-Mail notification function to be informed about updates, mapping status changes etc.
They should take part in the discussion forum, create change requests where mapping is still BLANK and
become a mapping proposer when browsing through the cMap platform.
Additionally, depending on the business model, the end-user could have access to the export function
mentioned in the architectural requirements.

7.8 Data Quality

See chapter 7 of CWA 16100 (ePPS) for an introduction to ISO 8000, a standard about data quality that is
currently under development. This short chapter repeats some fundamental requirements that
underlie a meaningful maintenance of classification systems, ontologies or master data in general, as well
as of the mappings between them.


7.8.1 Unique Identification


The elements used in the to-be-mapped systems need to be identified. Identification performs the action of
authoritatively sharing a statement on the identity of an entity. Systematic identification adds statements on
the condition or status of the entity. As it is in scope here to provide the identification not only to humans but
also to a computer system, the statement needs to be machine tractable.
Thus a systematic identification of the elements of a classification system should include in machine
readable form:

• unique references to the authorities providing the identification and defining the entity
• a unique reference to the entity that is assigned according to the facets supplying the entity with
identity
• a possibility to express predecessor-successor relationships as well as the degrees of compatibility
between the evolution states of an entity

As already mentioned in CWA 15295:2005 by the ePDC project, the mapping of product [sic!] to classes
(classification) should use the class identifier only (not the classification code!), in order to prevent
reclassification when the hierarchy changes (p.24). This would imply that either a unique identifier is used by
any of the authorities or the cMap platform has to assign a cMap identifier itself according to international
standards (e.g. ISO 29002, already in use by systems like eCl@ss, DIN, PROLIST and companies such as
SIEMENS). The recommendation would be to register for an IRDI code and to assign distinct cMap identifiers,
to be used at least in the background as a basis for the cMap database.
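
The three elements of a systematic identification listed above could be held in a small record. The following is a minimal sketch only: the class name ClassIdentifier, its field names and the "#"-separated key layout are assumptions made for illustration and do not reproduce the IRDI format of ISO 29002. The degree of compatibility is reduced to a single yes/no flag.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ClassIdentifier:
    """Hypothetical machine-readable identifier of one classification class."""

    authority: str  # unique reference to the identifying authority
    entity: str  # unique reference to the entity, independent of the hierarchy code
    version: int = 1  # evolution state of the entity
    predecessor: Optional["ClassIdentifier"] = None  # previous evolution state
    compatible_with_predecessor: bool = True  # simplified degree of compatibility

    def key(self) -> str:
        """Return a stable key; the layout is an assumption, not an IRDI."""
        return f"{self.authority}#{self.entity}#{self.version}"

    def lineage(self) -> List[str]:
        """Return the keys from the oldest known predecessor to this state."""
        chain: List[str] = []
        node: Optional[ClassIdentifier] = self
        while node is not None:
            chain.append(node.key())
            node = node.predecessor
        return list(reversed(chain))
```

Because the key does not contain the classification code, a class that moves within the hierarchy keeps its identifier, which is the point made in CWA 15295:2005.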

7.8.2 Tracking and Tracing


Transparency of actions requires keeping track of who did what, when and why to an entity during the
process of its maintenance, e.g. during the processing workflow and in case changes to and evolution of
entities are foreseen within a standard. This is what ISO 8000-102 defines as data provenance: a record of
the ultimate derivation and passage of a piece of data through its various owners or custodians.
To quote some recommendations of ISO/DIS 22274, p.47 concerning this issue:
"When changes are implemented, they shall be clearly documented complete with historical and
administrative information indicating the nature of the change, who made it, and when."
"Additionally, all classification system content shall, once released for public use, never be deleted to
allow users to refer also to content which was issued in earlier versions. Content that is no longer
valid shall be marked as obsolete."
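
A minimal sketch of how these two recommendations might be reflected in a data model is given below. The names ChangeRecord and TrackedEntity and their fields are hypothetical. The sketch only illustrates an append-only history and an obsolete flag that replaces deletion.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class ChangeRecord:
    """One provenance entry: who did what, when and why."""

    who: str
    what: str
    why: str
    when: datetime


@dataclass
class TrackedEntity:
    """Hypothetical released entity that is never deleted, only marked obsolete."""

    identifier: str
    obsolete: bool = False
    history: List[ChangeRecord] = field(default_factory=list)

    def record(self, who: str, what: str, why: str) -> None:
        """Append a change record; existing records are never modified."""
        self.history.append(ChangeRecord(who, what, why, datetime.now(timezone.utc)))

    def retire(self, who: str, why: str) -> None:
        """Mark the entity as obsolete instead of deleting it."""
        self.obsolete = True
        self.record(who, "marked obsolete", why)
```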

7.8.3 Change Management


If entities are entitled by their defining and identifying bodies to undergo an evolution, the
representations of entities will differ between different releases of a system. This requires a
management of change requests, their resolution and their consequences in order to
institutionalize this evolution in a process which is comprehensible, transparent, maintainable by a human
and applicable by a machine.
The change process to be installed requires a formalization of change requests that states the originator,
scope and reason of each request, so as to provide transparency and to supply enough knowledge for the
processing body and system to express the correct predecessor-successor relationships along the evolution
of an entity. In addition, the following two building blocks are essential:
first, there has to be a shared set of formal rules that translate between the formal requests and the
representations of the evolution steps of entities. Second, a formal language is needed that
expresses the evolution of entities along relations other than equivalence.

7.8.4 Intellectual property rights and conditions of use


A solution has to be found that violates neither the different conditions of use of the classification systems
nor the intellectual property rights (IPR). The IPR of the mapping itself stays with the administration
authority - i.e. at this point: the CEN - as the mapping is a deliverable of a CEN project. If the maintenance
of the mapping was done on a cMap platform, anyone who contributed to the mapping would leave their IPR in
the hands of the administration authority. Another very important task is to respect the IPR of the classification authorities
themselves, i.e. the IPR of their content and their release update information. Also, the origin of the inputs
has to be mentioned to the user when reused (content coming from CPV, eCl@ss, GPC, UNSPSC).
The CEN will not charge for the published mapping tables of the cMap project, but refers to the necessity to
respect the conditions of use of the classification authorities [38]. If a cMap platform is implemented, further
discussions will be needed depending on the underlying business model.

38 The mapping results of the CEN WS eCAT/cMap that refer to the eCl@ss classification system can only be used by registered
eCl@ss users that have accepted the eCl@ss conditions of use. A registration is provided by a registered download via the eCl@ss
website, a cooperation agreement or the membership in the eCl@ss association. Further information is available at www.eclass.eu. The
eCl@ss conditions of use will be added to the annex of the final CWA.

8. Definition of a synchronization process

8.1 Introduction

Based on the results of part 7, namely the identified actors and their business use cases as well as the
technical level including roles, business objects and use cases of the platform, the strategic level can now
be addressed and an applicable synchronization process be proposed. The current CWA intends to develop a
process that is manageable and practicable, in other words realistically suitable for all involved parties and
not just a highly elaborated (academic) process that will never be used due to its complexity or the lack of
existing actors. Therefore, the proposed process is designed with the possible business models (see 8.6) in
mind, so that it is as practicable and simple as possible.
The objective is to develop a long-term strategy that embraces the need for interoperability: one that is
sustainable, clearly communicated and supported with education. Associated technical, business and
marketing plans will need to be developed in detail when beginning to implement the processes that can only
be hinted at on a high level here.

8.2 The basis for a sustainable process: GEN-ePDC and its adaptation for cMap

GEN-ePDC is already described in detail in CWA 15295:2005 and does not have to be described again.
The proposed ePDC maintenance procedure is focused on interoperability and is very similar to ISO
directive 1 (CWA 15295:2005, p.35).
It deals with rules for the operation of a joint committee as well and refers to factors like members, voting
process, quality assurance, referencing mechanism, copyrights and translations.
All processes described in this CWA are based on the GEN-ePDC maintenance procedure. The different
processes are defined in the relevant parts of chapter 7 where different use cases were described. The basic
requirements and recommendations of GEN-ePDC are taken into account as they are still state of the art.

8.3 cMap processes

The following chapter will wrap up the processes of the cMap platform that were already partly mentioned in
the use cases of the platform described in chapter 7.2. The three basic processes identified are:
Process 1: Query for Mapping Result
Process 2: Apply Release Update Information
Process 3: Manage Mappings

8.3.1 Process 1: Query for mapping result


The first requirement of the cMap platform is to fulfil the functionality for an end-user to send a query for a
mapping and receive a result. Business use cases were presented in chapter 7.2, the use case of a query
was described in 7.6.1.
The following figure describes the process of the cMap platform for this procedure.

Figure 84: cMap Process 1: Query for Mapping Result

• The end-user has classified his/her product data, visits the cMap platform and sends a query on a
specific classification code of one classification system to receive the result in another specific
classification system.
• The cMap platform checks the mapping status and sends back the result (see 7.6):
o NO MAP: the source class code could not be mapped in the target classification system. The process ends.
o BLANK: no attempt has yet been made to map the source class code in the target classification system.
The process ends here, but the user could be automatically given the possibility to request a new
mapping for the relevant class, i.e. he/she could be directly forwarded in the role of mapping proposer to
Process 3: Manage Mapping.
o MAPPED: the source class code has been mapped in the target classification system. The user can decide
whether he/she accepts the mapping result. If yes, the query result is accepted and the process
ends. If not, the process ends here, but the user could be automatically given the possibility to
request a new mapping for the relevant class, i.e. he/she could be directly forwarded in the role of
mapping proposer to Process 3: Manage Mapping.
o TO BE CHECKED: the source class code was mapped in the target classification system before an
update and something has changed in the last update, i.e. no guarantee is given for the correctness, but
the mapping has not been updated yet. The user can decide whether he/she accepts the mapping
result nonetheless. If yes, the query result is accepted although it has to be checked and the process
ends. If not, the process ends here, but the user could be automatically given the possibility to
request a new mapping for the relevant class, i.e. he/she could be directly forwarded in the role of
mapping proposer to Process 3: Manage Mapping.
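
The branching of Process 1 can be sketched as follows. This is an illustration under stated assumptions, not a specification of the platform: the in-memory dictionary, the function names query_mapping and next_step, and the sample codes are hypothetical, and treating an unknown source code as BLANK is an assumption of this sketch.

```python
from enum import Enum
from typing import Dict, Optional, Tuple


class MappingStatus(Enum):
    NO_MAP = "NO MAP"
    BLANK = "BLANK"
    MAPPED = "MAPPED"
    TO_BE_CHECKED = "TO BE CHECKED"


# Hypothetical store: (source system, source code, target system) -> (status, target code)
Store = Dict[Tuple[str, str, str], Tuple[MappingStatus, Optional[str]]]


def query_mapping(
    store: Store, source_system: str, source_code: str, target_system: str
) -> Tuple[MappingStatus, Optional[str]]:
    """Look up the mapping status; unknown entries count as BLANK (assumption)."""
    return store.get((source_system, source_code, target_system), (MappingStatus.BLANK, None))


def next_step(status: MappingStatus, user_accepts: bool = False) -> str:
    """Return what happens after the result has been shown to the end-user."""
    if status is MappingStatus.NO_MAP:
        return "end"
    if status is MappingStatus.BLANK:
        return "offer Process 3"
    # MAPPED and TO BE CHECKED: the user decides whether to accept the result.
    return "end" if user_accepts else "offer Process 3"
```

In each branch the query itself ends. "offer Process 3" only means that the user may be forwarded to Process 3 in the role of mapping proposer.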

8.3.2 Process 2: Apply Release Update Information


The second requirement of the cMap platform is to fulfil the functionality to update to a new release of a
classification system that already exists in the cMap platform and was already mapped to other classification
systems. The key function here is the ability to apply the second deliverable of the classification authorities:
the release update information that documents the changes and helps to upgrade semi-automatically.
Different use cases are described in 7.5.4.
The following figure describes the process of the cMap platform for this procedure.

Figure 85: cMap Process 2: Apply Release Update Information

• A classification authority delivers a new release version including their release update information.
• The cMap Platform Provider uploads both the new release and the release update information to
automatically update the mapping to other classification systems.
• The cMap platform checks whether the current mapping status = MAPPED:
o If mapping status <> MAPPED: no update can be done, the mapping status remains unchanged
(BLANK; NO MAP; TO BE CHECKED) and E-Mail notifications are sent to the platform provider, the
responsible quality manager and interested mapping proposers. The process ends here. The
notification of all relevant bodies could be directly linked to Process 3 (Manage Mapping) so that they
will be in the role of the mapping proposer and can directly react to the status changes.
o If mapping status = MAPPED: source class code has been mapped in target classification system.
The system checks whether the update of the mapping can be done automatically (see 7.6.2.5).
o If the update can be done automatically: set mapping status = MAPPED and send E-Mail notification
to the platform provider, the responsible quality manager and interested mapping proposers.
o If the update cannot be done automatically: set mapping status = TO BE CHECKED and send E-Mail
notification to the platform provider, the responsible quality manager and interested mapping
proposers. The mapping proposer can then start Process 3: Manage Mapping.
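
The status decision of Process 2 can be condensed into one function. The sketch below assumes that the automatic-update check of 7.6.2.5 is available as a boolean input. The function name apply_release_update and the constant RECIPIENTS are hypothetical.

```python
from enum import Enum
from typing import Tuple


class MappingStatus(Enum):
    NO_MAP = "NO MAP"
    BLANK = "BLANK"
    MAPPED = "MAPPED"
    TO_BE_CHECKED = "TO BE CHECKED"


# Recipients of the E-Mail notification named in the process description.
RECIPIENTS = ("platform provider", "quality manager", "interested mapping proposers")


def apply_release_update(
    status: MappingStatus, can_update_automatically: bool
) -> Tuple[MappingStatus, Tuple[str, ...]]:
    """Return the mapping status after a release update and who is notified."""
    if status is not MappingStatus.MAPPED:
        # BLANK, NO MAP and TO BE CHECKED remain unchanged.
        return status, RECIPIENTS
    if can_update_automatically:
        return MappingStatus.MAPPED, RECIPIENTS
    # A manual check is needed; a mapping proposer may start Process 3.
    return MappingStatus.TO_BE_CHECKED, RECIPIENTS
```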

8.3.3 Process 3: Manage Mapping


The third requirement of the cMap platform is to fulfil the functionality to manage the mapping manually, i.e.
to request new or update existing mappings. Different use cases are described in 7.6.2.
The following figure describes the process of the cMap platform for this procedure.

Figure 86: cMap Process 3: Manage Mapping

• A mapping proposer creates a mapping request.
• The mapping proposer edits and submits the mapping request.
• The cMap quality manager double-checks the correctness and the quality of the requested mapping
(regardless of whether it is a new mapping, a changed mapping or a deleted mapping, see also
7.5.3 for the description of the use cases) and decides on the mapping request status:
o ACCEPTED: The mapping request is considered to be acceptable, the mapping request status is set
to ACCEPTED and the mapping proposer is informed about the status change automatically. The
mapping relation will be published in the cMap platform and can be queried.
o REWORK: The mapping request has to be edited, the mapping request status is set to REWORK
and the mapping proposer is informed about the status change automatically. The mapping
proposer can edit the mapping request and start again at process step 2.
o REJECT: The mapping request is considered to be incorrect and is rejected, the mapping request
status is set to REJECT and the mapping proposer is informed about the status change
automatically.
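
The request workflow of Process 3 can be read as a small state machine. In the sketch below, the statuses ACCEPTED, REWORK and REJECT come from the process description. DRAFT and SUBMITTED are hypothetical helper statuses added for illustration, and the class and method names are assumptions as well.

```python
from enum import Enum
from typing import List


class RequestStatus(Enum):
    DRAFT = "DRAFT"  # hypothetical: created, not yet submitted
    SUBMITTED = "SUBMITTED"  # hypothetical: waiting for the quality manager
    ACCEPTED = "ACCEPTED"
    REWORK = "REWORK"
    REJECT = "REJECT"


class MappingRequest:
    """Hypothetical mapping request moving through Process 3."""

    def __init__(self, proposer: str, source_code: str, target_code: str) -> None:
        self.proposer = proposer
        self.source_code = source_code
        self.target_code = target_code
        self.status = RequestStatus.DRAFT
        self.published = False
        self.notifications: List[str] = []

    def submit(self) -> None:
        """Step 2: the proposer submits a new or reworked request."""
        if self.status not in (RequestStatus.DRAFT, RequestStatus.REWORK):
            raise ValueError(f"cannot submit a request in status {self.status.value}")
        self.status = RequestStatus.SUBMITTED

    def decide(self, decision: RequestStatus) -> None:
        """Step 3: the quality manager sets ACCEPTED, REWORK or REJECT."""
        allowed = (RequestStatus.ACCEPTED, RequestStatus.REWORK, RequestStatus.REJECT)
        if self.status is not RequestStatus.SUBMITTED or decision not in allowed:
            raise ValueError("decision not possible in the current status")
        self.status = decision
        self.published = decision is RequestStatus.ACCEPTED
        # The proposer is informed about every status change automatically.
        self.notifications.append(f"{self.proposer}: status changed to {decision.value}")
```

A request in status REWORK can be submitted again, which corresponds to restarting at process step 2. ACCEPTED and REJECT are final in this sketch.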

8.4 Maintenance strategy

On a strategic level the following are the specific goals:
• continue to maintain the distinct classification systems (by each classification authority)
• create a widely-available mapping among the four classification systems
• align the governance model of the four classification systems to the cMap mapping tables
• maintain and update the mapping tables regularly considering release changes of the classification
systems and/or the integration of further classification systems
The following figure describes the overall process suitable to reach these goals:

Figure 87: Maintenance strategy

As mentioned above, all classification authorities will independently maintain their classification systems
according to the requirements of their specific user groups, their established processes, release roadmaps
and based on their specific business models.
The mapping will be maintained in the separate database of the cMap platform. By establishing import
interfaces for the deliverables of the classification authorities the mapping between CPV and the three
commercial systems can be maintained. Chapter 7 delivered a detailed description of the cMap architecture,
user roles, their use cases and the processes.
The benefits of this synchronization process are:
• it acknowledges different business needs being met by different classification systems
• it avoids governance and change management conflicts between the classification
authorities
• it enables translations between classifications to meet the business need
• it provides cost effective access to the codes of other classification systems
• it provides interoperability, integration and migration advantages
• it satisfies user communities
• it accelerates the return on investment
• it improves scheme quality, ease of search, code assignment and coverage
The following risks are identified:
• the setting up and ongoing maintenance of the cMap platform needs funding
• as it is another distinct platform, users of classification systems do not have a single source
of information and yet another process is created for users to implement
• the mapping maintenance is complex and difficult; it cannot guarantee high correctness, only
a high workload
• there will always be a time lag between new releases of the classifications and the
maintained mapping to the new release
When implementing the cMap platform the following issues need to be defined in detail:
• Business and governance rules and principles for the mapping and the overall process.
o The owner of these rules should be the administration authority
o The maintenance of the rules does not necessarily have to be in the hands of the
administration authority, but could as well be delegated to either the platform provider, the
release manager or the quality manager
• Conflict resolution processes: to answer questions like e.g.
o If more than one mapping is requested for an item what will happen?
o Does the mapping proposer have the right and possibility to appeal?
o Does anyone else have the right to appeal?
o Who takes the final decision after an appeal or is the owner of the mapping?
o In which timeline are decisions to be taken?
o How is quality defined?
• Change Management Processes: to answer questions like e.g.
o Who will manage the change request, i.e. govern and control that a certain change request
is being assessed?
o How will change requests be assessed?
o What criteria will be used to assess change requests?
o Who will be responsible for change request communication, internally and externally?
In order to fulfil the synchronization process, there are several possible governance models, which
are described in the following section.

8.5 Governance models

Taking into account the different roles that were identified (see section 7.4) and the resources that will be
needed to maintain the mapping, three different manageable governance models are described hereafter.
Different actors will be in focus, and different business models would have to be developed. These three
governance models are distinguished by the respective driving actor:
Governance model 1: Community-driven
Governance model 2: Classification authority-driven
Governance model 3: Administration authority-driven
These governance models will be described in detail in the following section. For each model, the CWA will
answer the following questions: Who does? Who manages? Who pays?

8.5.1 Governance model 1: Community-driven


The basic idea of governance model 1 is that the maintenance of the mapping shall be driven by those who
need it: the users. Therefore, the maintenance shall be open to the public, i.e. to registered, interested users.
These are not only the ones who would benefit from the platform, but probably also product experts and
classification experts, so that the maintenance would be of a high quality. In addition, any
community-driven approach lives from the collective knowledge and is therefore based, in theory, on the
whole knowledge available on products and product classification.

8.5.1.1 Who does?


Based on the roles identified in section 7.4, for a community-driven governance model the roles might be
filled by a lot of different actors:
Administration authority
o Task: Owns and provides cMap platform
o Actor: to be defined
Platform administrator
o Task: Hosts cMap platform
o Actor: authorized and to be defined by administration authority
Classification Authority
o Task: Provides classification system in any release and any available language version and
release update information
o Actors: currently the contact persons for the four classification systems CPV, UNSPSC,
GPC, eCl@ss. Possibly: contact persons of other classification systems
Mapping Proposer
o Task: Creates and submits mapping requests, i.e. maps classification classes from one
classification system to another
o Actors: all interested users, the user community of the mapping results
Quality Manager
o Task: Evaluates mapping requests and approves/dismisses
o Actors: authorized by administration authority, e.g. designated experts in user community

8.5.1.2 Who manages?


A community-driven approach cannot be managed in the sense that somebody ensures that a certain amount
of work is done within a certain period of time with limited resources. Therefore, in this governance
model there is no such thing as a mapping maintenance management. The administration authority together
with the platform provider and the classification authorities provide the basis in terms of technique and
content to a community.
The governance model lives only by the initiative of the community and is comparable to the Wikipedia
approach. If the requirement for a mapping is urgent enough for a user then this user will provide resources
to maintain the mapping. The motivation to do a mapping in a public platform such as cMap, in contrast to
mapping in one's own system, is that everybody can contribute to and benefit from a collective knowledge
base. Of course, some areas might be more elaborated than others, which depends on the requirements of
the community that does voluntary work.
Nonetheless, the administration authority can always authorize experts to do additional work, depending on
the business model.

8.5.1.3 Who pays?


When thinking about who will do the maintenance work, the question of financing the work is always present.
This CWA cannot solve this crucial question but can contribute with some ideas about what a business
model could be built on. Even if the mapping maintenance was completely done by users who finance their
resources on their own (including the cooperating classification authorities), still the platform administration
needs to be financed. Business model ideas are listed in section 8.6, financial advantages or disadvantages
will be summarized at the end of this chapter.

8.5.2 Governance model 2: Classification authority-driven


The basic idea of governance model 2 is similar to that of governance model 1: the mapping shall be driven
by users. The difference in this model is that, as the users' requirements are already represented by the
respective classification authorities, it shall be driven by the classification authorities of the included
classification systems themselves. The classification authorities - apart maybe from the European
Commission - are user-driven, i.e. all of these organizations represent the requirements of large user groups
in the global industry. Among these users are those who really need the mapping tables in case they have to
use different classification systems. It is for those users that a cMap platform would facilitate interoperability
and improve the ability to exchange product data classified in different ways.
In addition, the community of those who contribute to the maintenance of the different classification systems
comprises both market and classification experts, a crucial factor to improve the quality of the mapping
result.
The classification authorities would then provide all input: the classification systems, the updates and release
update information and the mapping result. It is still theoretically based on the collective knowledge of
different, but representative user groups.

8.5.2.1 Who does?


Based on the roles identified in section 7.4, for a classification authority-driven governance model the roles
would be filled by a limited number of actors:
Administration authority
o Task: Owns and provides cMap platform
o Actor: to be defined
Platform administrator
o Task: Hosts cMap platform
o Actor: authorized and to be defined by administration authority
Classification Authority
o Task: Provides classification system in any release and any available language version and
release update information
o Actors: the contact persons for the four classification systems CPV, UNSPSC, GPC,
eCl@ss. Possibly: contact persons of other classification systems
Mapping Proposer
o Task: Creates and submits mapping requests, i.e. maps classification classes from one
classification system to another
o Actors: only registered members of the classification authorities
Quality Manager
o Task: Evaluates mapping requests and approves/dismisses
o Actors: authorized by administration authority, e.g. a 3rd party or, much easier, a member from
the other involved classification system. If e.g. a GPC member submits a request on a
GPC-CPV-mapping, then a member of CPV should evaluate the correctness of the mapping. This
way, both classification authorities are involved and check the mapping from both sides.

8.5.2.2 Who manages?


A classification authority-driven approach would get the input, i.e. the mapping result, from the classification
authorities. It could not be managed by one of these authorities, but rather by either a 3rd party or a
consortium of members of the classification authorities. cMap has already established an advisory group that
consists of representatives of the classification authorities and other interested key players from the market.
An easy task for this body could be the management of the maintenance process. They could supervise that
a certain amount of work will be done in a certain period of time with limited resources.
An additional chance for the classification authorities is the fact that they could adjust their maintenance
processes in order to give their requestors the possibility to deliver a mapping proposal already when
requesting new items in their classification systems. If e.g. a GPC user requests a new class for the GPC,
GS1 could give him the functionality in their workflow to already deliver the corresponding class code(s) (if
already existing) in the UNSPSC, the CPV and/or eCl@ss. After releasing the new GPC class, the result
could then be directly transferred to the cMap platform.
This way, the contribution of users can be enhanced at a very early stage of their own maintenance
processes.

8.5.2.3 Who pays?


The additional work of the classification authorities doing the maintenance work is still a financial question.
The CWA cannot finally solve this crucial question, but only contributes some ideas what a business model
could be built on. Even if the resources for the mapping maintenance were completely financed by the
classification authorities themselves (which cannot be guaranteed), still at least the platform administration
needs to be financed. Business model ideas are listed in section 8.6, financial advantages or disadvantages
will be summarized at the end of this chapter.

8.5.3 Governance model 3: Administration authority-driven


The third governance model is to let the maintenance of the mapping be driven by the administration
authority, i.e. the owner itself or somebody authorized by the owner, e.g. the platform provider.
Depending on the business model, this could be done by a service provider. ISO/DIS 22274, p. 47,
recommends for the maintenance of classification systems: "[...] the responsibility for managing a
classification system shall be assigned to an organization that will be dedicated to this task". Such a
responsible central organization should be found for such a huge enterprise as cMap, too.

8.5.3.1 Who does?


Based on the roles identified in section 7.4, for an administration authority-driven governance model the
roles would be filled by a limited number of actors:
Administration authority
o Task: Owns and provides cMap platform
o Actor: to be defined
Platform administrator
o Task: Hosts cMap platform
o Actor: authorized and to be defined by administration authority
Classification Authority
o Task: Provides classification system in any release and any available language version and
release update information
o Actors: the contact persons for the four classification systems CPV, UNSPSC, GPC,
eCl@ss. Possibly: contact persons of other classification systems
Mapping Proposer
o Task: Creates and submits mapping requests, i.e. maps classification classes from one
classification system to another
o Actors: administration authority or service provider authorized by administration authority
Quality Manager
o Task: Evaluates mapping requests and approves/dismisses
o Actors: authorized by administration authority as 3rd party to guarantee the four-eye principle.

8.5.3.2 Who manages?


An administration authority-driven approach would get the content basis, i.e. the content and release update
information, from the classification authorities. It would then be centrally managed by the administration
authority itself or some authorized 3rd party. As in governance model 2, a consortium of members of
the classification authorities, like the already established advisory group, could be established as this 3rd
party as well. The management of the maintenance process would be the responsibility of the administration
authority, which could supervise that a certain amount of work will be done in a certain period of time with
limited resources. Thereby, the administration authority would have full responsibility for both the
maintenance of the platform and the maintenance of the mapping.

8.5.3.3 Who pays?


The work of the administration authority or some authorized 3rd party (e.g. a contractor) doing the
maintenance work is still a financial question.
The CWA cannot finally solve this crucial question, but only contributes some ideas what a business model
could be built on. Both the resources for the mapping maintenance and the platform administration are
completely to be financed by the administration authority. Business model ideas are listed in section 8.6,
financial advantages or disadvantages will be summarized at the end of this chapter.

8.5.4 Pros and Cons


Governance model 1: Community-driven
Pro:
• Driven by the users/beneficiaries themselves
• Users are also experts
• Contribution is open to the whole market
• Experts discuss to find the correct results
• Based on collective knowledge (Wiki approach)
• Manpower is less expensive
Con:
• Mapping will never be driven equally across all segments
• Mapping will never be complete on a certain point in time
• Takes a lot of time
• Is not manageable by a managing authority
• Relies on voluntary work which is not guaranteed

Governance model 2: Classification authority-driven
Pro:
• The communication ways are kept short as the input to be mapped is delivered and maintained by the
same actors
• The classification authorities might be able to access the know-how of their members and partners
• It is manageable by e.g. the advisory board and can therefore be driven equally across all segments
and the mapped result would be completely up-to-date on a certain point in time
• In the long run, the classification authorities could host their own classification systems in the same
cMap platform and thereby use synergies or use the same platform
Con:
• The business model would have to provide more resources to finance the workload
• The classification authorities partly exist as competitors in the market, the mapping could therefore
have a political dimension that might lead to conflicts
• In the case of the UNSPSC, GPC and eCl@ss there might be resources that might be available if
financed. For the CPV, resources have to be identified, they might have to be authorized
• The mapping is only partly driven by the users themselves (represented as members of the
classification authorities)
• The contribution is not open to the whole market
• The market experts are only indirectly involved to find the correct results

Governance model 3: Administration authority-driven
Pro:
• Easy to manage: the cMap platform owner would be in charge of both the platform maintenance and the
mapping maintenance.
• Equally driven segments possible
• Mapped result would be complete on a certain point in time
• The communication ways are kept short as all maintenance work is managed by the same body. Also,
less friction costs could be expected.
• The administration authority might rely on experts in the field of classification (and maybe technical
fields, too)
Con:
• High costs expected or a business model that would need to generate a lot of revenues
• The mapping is not driven by the users themselves
• The contribution is not open to the market
• The market experts are not involved to find the correct results
• Experts in the field of both classification and technique might be hard to find

Figure 88: Comparison of Governance Models


No governance model is perfect. Combinations of all three of them could be imagined, but the driving force of
the mappings is still the most crucial factor. The quality and quantity are not influenced by the roles and
processes described in chapter 7, as all roles and processes are prerequisites and suitable for a good quality
and high quantity, but they do not influence them directly. Neither should the underlying business model
have an influence on the quality or quantity of the mapping.
What influences both the quality and the quantity is the governance model. The governance models might be
combined or changed over time. The starting point is delivered with this CWA, which includes a complete
mapping of the CPV, GPC, UNSPSC and eCl@ss. The implementation of a cMap platform could take this as
the content basis. A wiki approach (governance model 1) might deliver a lot of input in the beginning to establish
a solid basis. For example, many companies have already done a lot of mapping work in their own
systems, not only with their own company-internal classification systems but also with the relevant ones
dealt with in this CWA (see the related business model in 8.6.6). Once a solid basis is established, an
administration authority could take over and drive the equally distributed maintenance of the mapping and a
timely publication.
As all classification authorities have already experienced, it is not easy to find people who are both
technical experts and experts in the field of classification. In addition, they need to have
sound knowledge of the language of the classification system. In order to be able to verify change requests,
a quality manager needs to have even better knowledge in all three respects.
Finally, the whole endeavour needs to be financed, which influences both quality and quantity directly.
Recommending how to achieve a mapping of high quantity and quality at lower cost is a task
that is out of scope of this CWA.

8.6 Business models (high level)

The CWA only contributes some ideas on what kind of business models the operation of the cMap
platform and the maintenance work of the mapping could be built on.
There is only one way to find out whether and how the cMap platform could be financed: analyse the market
requirement. First of all, a survey should be financed to ask users of classification systems, and especially
the service providers involved, whether they need the mapping, to what extent, and what they
would be willing to invest to contribute and/or have access to the results. This will lead to an understanding of
how the cMap platform could be financed, who should participate and what the exact requirements are. Only
the benefits of the mapping for the users justify the costs involved.
In order to draw more users' attention to cMap, an understanding of the role, contribution and value of the
mapping tables has to be raised first. This could be done by:
- showing the big picture and how classification mapping is used across different functions. This could
  be done in conjunction with a training programme that can be financially driven
- finding and promoting practical case studies that demonstrate the value of using standards
Money to finance the platform could be raised in different ways. Some ideas are described below.


8.6.1 Proposal fee


One possible business model could be to let proposers pay for their proposal. Some national standardization
authorities charge money for contributing to the publication of a standard by participating in an expert body,
so this would be a well-proven strategy. On the other hand, the motivation to participate could be limited due
to the involved costs.

8.6.2 Mapping result fee


One could let the users pay for the mapping result. Users of the mapping will possibly save enormous
amounts of resources. Therefore, they could be willing to pay for it, as otherwise they would have to do it
themselves.

8.6.3 Membership restriction


One could tie access to the mapping results, the mapping platform or both to a
membership. Both the UNSPSC and eCl@ss provide their classification systems for free to their members in
any language and release version, while non-members are charged. A membership would continuously finance
the mapping work and platform. Some users might only pay once to have one specific mapping version;
others might be interested in continuous work and need the updated results.

8.6.4 Classification authority financed


An alternative could be to let all involved classification authorities pay for the platform. The classification
authorities, apart from the CPV, already represent the interested users. It might be a convenient way to
raise funds for cMap. A disadvantage could be that not all users will need the mapping to other classification
systems, because they might only use one system. In addition, the classification authorities might need to
raise more money themselves, so that the financing problem would merely be delegated to them.

8.6.5 Third party financed


The cMap platform could be financed with the help of, for example, public funding. A project might be outlined
to request funding from the EC, CEN, industry associations or other organizations. The motivation
would be to support European industries and help facilitate international data exchange for the European
market. The costs and resources needed to implement a classification system in a company's IT infrastructure are
high. The mapping to other classification systems is even more costly, and this could be supported by
public funding.

8.6.6 cMap as a service


Apart from maintaining the mapping between the included classification systems, cMap can be a platform
for mapping a company's own classification system. A lot of companies today rely on standardized classification
systems. However, they tend to use them in addition to their own company-specific classification system, as a
standardized classification will probably not cover all company-specific products. Some products may not be
suitable for standardization. Therefore, those companies already map their own company-specific
classification to the standardized one they additionally use. Some companies already have to deliver their
product data in different classification systems depending on their business partners and do the mapping
work in their own systems.
The cMap platform, with all its defined processes, business rules and an established architecture, could be the
appropriate tool for companies to rely on. It could be provided as a cloud-based service financed by
companies that would rather pay a fee for an established platform than design and host a system of
their own.
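
The idea of reusing a shared mapping can be illustrated with a minimal Python sketch. It assumes a company keeps its own table from internal codes to one standardized system and obtains the table between two standardized systems from a platform such as cMap. All codes, table names and the function `translate` are hypothetical and are not taken from the CWA or from any real classification system.

```python
from typing import Dict, List, Optional

# Hypothetical company-internal codes mapped to a first standardized
# system ("system A"); all codes are invented for illustration.
company_to_std_a: Dict[str, str] = {
    "INT-0001": "A-100",
    "INT-0002": "A-200",
    "INT-0003": "A-300",
}

# Hypothetical excerpt of a shared mapping table between two standardized
# systems, as a mapping platform might publish it.
std_a_to_std_b: Dict[str, List[str]] = {
    "A-100": ["B-11"],
    "A-200": ["B-21", "B-22"],  # one source class may map to several targets
}


def translate(internal_code: str) -> Optional[List[str]]:
    """Return the system-B codes for an internal code.

    Returns None if either mapping step has no entry, so that missing
    mappings can be reported instead of being guessed.
    """
    std_a_code = company_to_std_a.get(internal_code)
    if std_a_code is None:
        return None
    return std_a_to_std_b.get(std_a_code)


if __name__ == "__main__":
    for code in ("INT-0001", "INT-0002", "INT-0003"):
        print(code, "->", translate(code))
```

In this sketch the company maintains only the first table; the second table is the part that a shared platform would maintain on behalf of all users.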

8.6.7 Comparison
The business models identified above are compared in the following figure:

Business Model | Explanation | Pro | Con
Proposal Fee | Charge a fee for a mapping proposal (like e.g. DIN) | + Well-proven strategy | - Less participation could be expected
Mapping Query Fee | Charge a fee for the mapping query | + Saves resources, fee is legitimate |
Membership | Connect access to membership fee | + Continuous financing secured | - Access restricted to certain user groups; - Memberships always difficult to finance for companies; - Interest in continuous mapping limited
Classification Authority Financed | Let all involved classification authorities finance cMap | + classification authorities represent the end-users | - Not all members of classification authorities might be interested; - Financing problem delegated
3rd Party Financed | Public funding | + rely on public funding, already available in organisations; + investment legitimate to support the European market | - Public funding has to be driven by somebody (i.e. invest work that is not necessarily paid for)
cMap as a Service | Let companies map their own internal classification system | + business need is identified; + beneficiaries pay for an extra service; + saves company resources |

Figure 89: Comparison of Business Models


9. Conclusion and recommendation

This part comprises the conclusion and recommendations provided in each section. It will be consolidated at
the time of publication.


Annex A
(informative)
The SKOS platform
SKOS - Simple Knowledge Organization System - provides a model for expressing the basic structure and
content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies,
folksonomies, and other similar types of controlled vocabulary. As an application of the Resource Description
Framework (RDF), SKOS allows concepts to be composed and published on the World Wide Web, linked
with data on the Web and integrated into other concept schemes.
In basic SKOS, conceptual resources (concepts) are identified with URIs, labelled with strings in one or more
natural languages, documented with various types of note, semantically related to each other in informal
hierarchies and association networks and aggregated into concept schemes.
In advanced SKOS, conceptual resources can be mapped across concept schemes and grouped into
labelled or ordered collections. Relationships can be specified between concept labels. Finally, the SKOS
vocabulary itself can be extended to suit the needs of particular communities of practice or combined with
other modelling vocabularies.
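
As an illustration of how concepts and cross-scheme mappings can be written down in SKOS, the following Python sketch emits a few statements in Turtle syntax. The property names (`skos:Concept`, `skos:inScheme`, `skos:prefLabel`, `skos:exactMatch` and the other mapping properties) belong to the SKOS vocabulary; the `example.org` URIs, the labels and the helper functions are assumptions made for this example only.

```python
from typing import List

SKOS_NS = "http://www.w3.org/2004/02/skos/core#"

# SKOS mapping properties for linking concepts across concept schemes.
MAPPING_RELATIONS = {
    "exactMatch", "closeMatch", "broadMatch", "narrowMatch", "relatedMatch",
}


def concept_triples(uri: str, scheme_uri: str, label: str, lang: str = "en") -> List[str]:
    """Describe one concept: its type, its scheme and a preferred label.

    The label is assumed to contain no double quotes or backslashes.
    """
    return [
        f"<{uri}> a skos:Concept .",
        f"<{uri}> skos:inScheme <{scheme_uri}> .",
        f'<{uri}> skos:prefLabel "{label}"@{lang} .',
    ]


def mapping_triple(source_uri: str, target_uri: str, relation: str = "exactMatch") -> str:
    """Link two concepts from different schemes with a SKOS mapping property."""
    if relation not in MAPPING_RELATIONS:
        raise ValueError(f"not a SKOS mapping property: {relation}")
    return f"<{source_uri}> skos:{relation} <{target_uri}> ."


def to_turtle(statements: List[str]) -> str:
    """Prefix the statements with the SKOS namespace declaration."""
    return "\n".join([f"@prefix skos: <{SKOS_NS}> .", ""] + statements)


if __name__ == "__main__":
    # Hypothetical URIs for two classes in two different schemes.
    a = "http://example.org/schemeA/100"
    b = "http://example.org/schemeB/11"
    statements = (
        concept_triples(a, "http://example.org/schemeA", "Hand tools")
        + concept_triples(b, "http://example.org/schemeB", "Hand tools")
        + [mapping_triple(a, b, "exactMatch")]
    )
    print(to_turtle(statements))
```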


Annex B
(informative)
The Protégé Platform
Protégé is a free, open-source platform that provides a growing user community with a suite of tools to
construct domain models and knowledge-based applications with ontologies. At its core, Protégé implements
a rich set of knowledge-modelling structures and actions that support the creation, visualization and
manipulation of ontologies in various representation formats. Protégé can be customized to provide
domain-friendly support for creating knowledge models and entering data. Further, Protégé can be extended by way
of a plug-in architecture and a Java-based Application Programming Interface (API) for building
knowledge-based tools and applications.
An ontology describes the concepts and relationships that are important in a particular domain, providing a
vocabulary for that domain as well as a computerized specification of the meaning of terms used in the
vocabulary. Ontologies range from taxonomies and classifications, database schemas, to fully axiomatized
theories. In recent years, ontologies have been adopted in many business and scientific communities as a
way to share, reuse and process domain knowledge. Ontologies are now central to many applications such
as scientific knowledge portals, information management and integration systems, electronic commerce and
semantic web services.
The Protégé platform supports two main ways of modelling ontologies:
- The Protégé-Frames editor enables users to build and populate ontologies that are frame-based, in
  accordance with the Open Knowledge Base Connectivity protocol (OKBC). In this model, an
  ontology consists of a set of classes organized in a subsumption hierarchy to represent a domain's
  salient concepts, a set of slots associated to classes to describe their properties and relationships
  and a set of instances of those classes - individual exemplars of the concepts that hold specific
  values for their properties.
- The Protégé-OWL editor enables users to build ontologies for the Semantic Web, in particular in the
  W3C's Web Ontology Language (OWL). "An OWL ontology may include descriptions of classes,
  properties and their instances. Given such an ontology, the OWL formal semantics specifies how to
  derive its logical consequences, i.e. facts not literally present in the ontology, but entailed by the
  semantics. These entailments may be based on a single document or multiple distributed documents
  that have been combined using defined OWL mechanisms" (see the OWL Web Ontology Language
  Guide).
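
The frame-based model described above (classes in a subsumption hierarchy, slots attached to classes, instances holding slot values) can be sketched in a few lines of Python. This is an illustration of the concepts only; it does not use the Protégé API or OKBC, and the class, slot and instance names are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FrameClass:
    """A class in a subsumption hierarchy with its own slots."""
    name: str
    parent: Optional["FrameClass"] = None
    slots: List[str] = field(default_factory=list)

    def all_slots(self) -> List[str]:
        """Own slots plus those inherited from all ancestors."""
        inherited = self.parent.all_slots() if self.parent else []
        return inherited + [s for s in self.slots if s not in inherited]

    def is_subclass_of(self, other: "FrameClass") -> bool:
        """True if `other` is this class or one of its ancestors."""
        node: Optional[FrameClass] = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


@dataclass
class Instance:
    """An individual exemplar of a class, holding values for its slots."""
    of: FrameClass
    values: Dict[str, object]

    def __post_init__(self) -> None:
        # Reject values for slots the class (or its ancestors) does not define.
        unknown = set(self.values) - set(self.of.all_slots())
        if unknown:
            raise ValueError(f"unknown slots: {sorted(unknown)}")


if __name__ == "__main__":
    product = FrameClass("Product", slots=["name"])
    tool = FrameClass("Tool", parent=product, slots=["weight_kg"])
    hammer = Instance(tool, {"name": "Claw hammer", "weight_kg": 0.5})
    print(tool.all_slots(), tool.is_subclass_of(product), hammer.values)
```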
Protégé-OWL
The Protégé-OWL editor is an extension of Protégé that supports the Web Ontology Language (OWL). OWL
is the most recent development in standard ontology languages, endorsed by the World Wide Web
Consortium (W3C) to promote the Semantic Web vision. "An OWL ontology may include descriptions of
classes, properties and their instances. Given such an ontology, the OWL formal semantics specifies how to
derive its logical consequences, i.e. facts not literally present in the ontology, but entailed by the semantics.
These entailments may be based on a single document or multiple distributed documents that have been
combined using defined OWL mechanisms" (see the OWL Web Ontology Language Guide).
The Protégé-OWL editor enables users to:
- Load and save OWL and RDF ontologies.
- Edit and visualize classes, properties, and SWRL rules.
- Define logical class characteristics as OWL expressions.
- Execute reasoners such as description logic classifiers.
- Edit OWL individuals for Semantic Web markup.


Protégé-OWL's flexible architecture makes it easy to configure and extend the tool. Protégé-OWL is tightly
integrated with Jena and has an open-source Java API for the development of custom-tailored user interface
components or arbitrary Semantic Web services.

Figure 90: The OWLClasses view can be used to edit hierarchies of concepts

Figure 91: Seamless integration of Protégé-OWL with classification tool


Figure 92: Protégé-OWL for editing RDF Schema models

Figure 93: OWLViz for visualizing OWL ontologies graphically


Annex C
(informative)
The Prompt Tool
PROMPT implements an algorithm that provides a semi-automatic approach to ontology merging and
alignment. PROMPT performs some tasks automatically and guides the user in performing other tasks for
which his or her intervention is required. It also determines possible inconsistencies in the state of the
ontology that result from the user's actions, and suggests ways to remedy these inconsistencies [2]. It is
based on a general knowledge model and can therefore be applied across various platforms.
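
The semi-automatic workflow, in which a tool proposes candidate merges and the user decides, can be sketched as follows. This is not PROMPT's algorithm. It is a simple stand-in that ranks pairs of class names by string similarity using Python's standard `difflib` module. The function name `suggest_merges`, the threshold value and the class names are assumptions made for this example.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def suggest_merges(
    classes_a: List[str],
    classes_b: List[str],
    threshold: float = 0.8,
) -> List[Tuple[str, str, float]]:
    """Propose class pairs whose names are similar enough to review.

    Returns (name_a, name_b, score) tuples, best score first. The result
    is only a list of suggestions; accepting or rejecting each pair is
    left to the user.
    """
    suggestions = []
    for a in classes_a:
        for b in classes_b:
            # Case-insensitive similarity in the range 0.0 to 1.0.
            score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
            if score >= threshold:
                suggestions.append((a, b, score))
    return sorted(suggestions, key=lambda s: s[2], reverse=True)


if __name__ == "__main__":
    ontology_a = ["Hammer", "Screwdriver"]
    ontology_b = ["hammer", "Drill"]
    for name_a, name_b, score in suggest_merges(ontology_a, ontology_b):
        print(f"merge candidate: {name_a} / {name_b} (score {score:.2f})")
```

A real tool would also use the structure of the ontologies (for example superclasses and slots) and would re-check consistency after each accepted operation; this sketch covers only the initial suggestion step.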

Figure 94: Ontology comparison by PromptDiff39

Figure 95: Ontology comparison by PromptDiff40

39 http://protege.stanford.edu/plugins/prompt/PromptDiff.html
40 http://protege.stanford.edu/plugins/prompt/PromptDiff.html


Figure 96: Ontology merging - initial list of suggestions by Prompt41

Figure 97: Prompt merging classes42

Figure 98: Prompt merged classes43

41 http://protege.stanford.edu/plugins/prompt/Suggestions.html
42 http://protege.stanford.edu/plugins/prompt/merging.html
43 http://protege.stanford.edu/plugins/prompt/merging.html


Figure 99: Ontology merging by Prompt44

44 http://protege.stanford.edu/plugins/prompt/operations.html


Annex D
(informative)
Mapping Tables
Mapping tables can be accessed at
ftp://ftp.cen.eu/PUBLIC/CWAs/eCAT-CC3P/CWA/cmap.zip


Bibliography
[1] D. Marques. A Survey of Recent Research in Ontology Mapping.
[2] N. F. Noy and M. A. Musen. PROMPT: Algorithm and Tool for Automated Ontology Merging and
Alignment. http://www.cs.uga.edu/~kochut/Teaching/8350/Papers/Ontologies/PROMPT.pdf
[3] Noy, Natalya F. Semantic Integration: A Survey of Ontology-Based Approaches. Stanford Medical
Informatics, Stanford University. Downloaded from
http://smiweb.stanford.edu/people/noy/papers/SigmodRecordReview.pdf, October 14, 2005.
[4] Ehrig, Marc and Staab, Steffen. QOM - Quick Ontology Mapping. In S. A. McIlraith et al. (Eds.): ISWC
2004, LNCS 3298, pp. 683-697, 2004.
[5] Shvaiko, Pavel and Euzenat, Jerome. A Survey of Schema-based Matching Approaches. Technical
Report DIT-04-087, Informatica e Telecomunicazioni, University of Trento, 2004.
[6] J. Madhavan, P. A. Bernstein, P. Domingos, and A. Halevy. Representing and reasoning about mappings
between domain models. In Eighteenth National Conference on Artificial Intelligence (AAAI 2002),
Edmonton, Canada, 2002.
[7] J. Euzenat and P. Shvaiko. Ontology matching. Springer, 2007.
[8] P. Shvaiko and J. Euzenat. Ontology matching: state of the art and future challenges. IEEE Transactions on
Knowledge and Data Engineering, 2012.
[9] M. K. Bergman. "Sources and Classification of Semantic Heterogeneity," from AI3:::Adaptive Information
blog, June 6, 2006.
[10] Alon Halevy. Why Your Data Won't Mix. ACM Queue, vol. 3, no. 8, October 2005.
[11] Charnyote Pluempitiwiriyawej and Joachim Hammer. A Classification Scheme for Semantic and
Schematic Heterogeneities in XML Data Sources. Technical Report TR00-004, University of Florida,
Gainesville, FL, 36 pp., September 2000.
