Comprehensive Approach to
Data Management and the
Data Management Book of
Knowledge (DMBOK)
Alan McSweeney
Objectives
March 8, 2010 2
Agenda
Preamble
Management Wisdom
• There is nothing more difficult to take in hand, more perilous to conduct or more
uncertain in its success than to take the lead in the introduction of a new order of
things.
− The Prince
• Never be in the same room as a decision. I'll illustrate my point with a puppet
show that I call "Journey to Blameville" starring "Suggestion Sam" and "Manager
Meg."
• You will often be asked to comment on things you don't understand. These
handouts contain nonsense phrases that can be used in any situation so, let's
dominate our industry with quality implementation of methodologies.
• Our executives have started their annual strategic planning sessions. This involves
sitting in a room with inadequate data until an illusion of knowledge is attained.
Then we'll reorganise, because that's all we know how to do.
− Dilbert
Information
Data, Information and Knowledge
Data, Information, Knowledge and Action
Knowledge Action
Information
Data
Information is an Organisation Asset
Data Management and Project Success
Generalised Information Management Lifecycle
Expanded Generalised Information Management Lifecycle
• Lifecycle phases
− Plan, Design and Specify
− Implement Underlying Infrastructure
− Enter, Create, Acquire, Derive, Update, Capture
− Store, Manage, Replicate and Distribute
− Protect and Recover
− Archive and Recall
− Delete/Remove
• Across the lifecycle: Design, Implement, Manage, Control and Administer
• Include phases for information management lifecycle design and implementation of
appropriate hardware and software to actualise lifecycle
Data and Information Management
Data Management Goals
• Primary goals
− To understand the information needs of the enterprise and all its
stakeholders
− To capture, store, protect, and ensure the integrity of data assets
− To continually improve the quality of data and information,
including accuracy, integrity, integration, relevance and
usefulness of data
− To ensure privacy and confidentiality, and to prevent
unauthorised or inappropriate use of data and information
− To maximise the effective use and value of data and information
assets
• Secondary goals
− To control the cost of data management
− To promote a wider and deeper understanding of the value of
data assets
− To manage information consistently across the enterprise
− To align data management efforts and technology with business
needs
Triggers for Data Management Initiative
Data Management Principles
Organisation Data Management Function
Data Management
Shared Role Between Business and IT
Why Develop and Implement a Data Management
Framework?
• Improve organisation data management efficiency
• Deliver better service to business
• Improve cost-effectiveness of data management
• Match the requirements of the business to the management of the
data
• Embed handling of compliance and regulatory rules into data
management framework
• Achieve consistency in data management across systems and
applications
• Enable growth and change more easily
• Reduce data management and administration effort and cost
• Assist in the selection and implementation of appropriate data
management solutions
• Implement a technology-independent data architecture
Data Management Issues
Data Management Problems – User View
Data Quality
State of Information and Data Governance
Your Organisation Recognises and Values Information as a
Strategic Asset and Manages it Accordingly
Disagree 21.5%
Neutral 17.1%
Agree 39.5%
Direction of Change in the Results and Effectiveness of the
Organisation's Formal or Informal Information/Data
Governance Processes Over the Past Two Years
Perceived Effectiveness of the Organisation's Current
Formal or Informal Information/Data Governance Processes
Actual Information/Data Governance Effectiveness
vs. Organisation's Perception
Current Status of Organisation's Information/Data
Governance Initiatives
Started an Information/Data Governance Initiative, but Discontinued the Effort 1.5%
Considered a Focused Information/Data Governance Effort but Abandoned the Idea 0.5%
Expected Changes in Organisation's Information/Data
Governance Efforts Over the Next Two Years
Other 5.2%
Change In Organisation's Information / Data Quality
Over the Past Two Years
Information / Data Quality Has Significantly Improved 10.5%
Maturity Of Information / Data Governance Goal
Setting And Measurement In Your Organisation
5 - Optimised 3.7%
4 - Managed 11.8%
3 - Defined 26.7%
2 - Repeatable 28.9%
1 - Ad-hoc 28.9%
Maturity Of Information / Data Governance
Processes And Policies In Your Organisation
5 - Optimised 1.6%
4 - Managed 4.8%
3 - Defined 24.5%
2 - Repeatable 46.3%
1 - Ad-hoc 22.9%
Maturity Of Responsibility And Accountability For
Information / Data Governance Among Employees In Your
Organisation
5 - Optimised 6.9%
4 - Managed 3.2%
3 - Defined 31.7%
2 - Repeatable 25.4%
1 - Ad-hoc 32.8%
Other Data Management Frameworks
Other Data Management-Related Frameworks
DMBOK, TOGAF and COBIT
• DMBOK Is a Specific and Comprehensive Data Oriented Framework
• DMBOK Provides Detail for Definition, Implementation and Operation of Data
Management and Utilisation
• TOGAF Defines the Process for Creating a Data Architecture as Part of an Overall
Enterprise Architecture
• Can be a Precursor to Implementing Data Management
• Can Provide a Maturity Model for Assessing Data Management
DMBOK, TOGAF and COBIT – Scope and Overlap
• Frameworks compared: DMBOK, TOGAF, COBIT
• DMBOK functions shown
− Data Governance
− Data Development
− Data Operations Management
− Data Security Management
− Reference and Master Data Management
− Data Warehousing and Business Intelligence Management
− Document and Content Management
− Metadata Management
− Data Quality Management
TOGAF and Data Management
• Phase C1 (subset of Phase C) relates to defining a data architecture
• TOGAF phases
− Phase A: Architecture Vision
− Phase B: Business Architecture
− Phase C: Information Systems Architecture
• Phase C1: Data Architecture
• Phase C2: Solutions and Application Architecture
− Phase D: Technology Architecture
− Phase E: Opportunities and Solutions
− Phase F: Migration Planning
− Phase G: Implementation Governance
− Phase H: Architecture Change Management
− Requirements Management
TOGAF Phase C1: Information Systems Architectures
- Data Architecture - Objectives
• Purpose is to define the major types and sources of data
necessary to support the business, in a way that is:
− Understandable by stakeholders
− Complete and consistent
− Stable
• Define the data entities relevant to the enterprise
• Not concerned with design of logical or physical storage
systems or databases
TOGAF Phase C1: Information Systems Architectures
- Data Architecture - Overview
Phase C1: Information Systems Architectures - Data Architecture
• Key Considerations for Data Architecture
• Reference Materials External to the Enterprise
• Select Reference Models, Viewpoints, and Tools
TOGAF Phase C1: Information Systems Architectures - Data
Architecture - Approach - Key Considerations for Data
Architecture
• Data Migration
− Identify data migration requirements and also provide indicators
as to the level of transformation for new/changed applications
− Ensure target application has quality data when it is populated
− Ensure enterprise-wide common data definition is established to
support the transformation
TOGAF Phase C1: Information Systems Architectures - Data
Architecture - Approach - Key Considerations for Data
Architecture
• Data Governance
− Ensures that the organisation has the necessary dimensions in
place to enable the data transformation
− Structure – ensures the organisation has the necessary structure
and the standards bodies to manage data entity aspects of the
transformation
− Management System - ensures the organisation has the
necessary management system and data-related programs to
manage the governance aspects of data entities throughout its
lifecycle
− People - addresses what data-related skills and roles the
organisation requires for the transformation
TOGAF Phase C1: Information Systems Architectures
- Data Architecture - Outputs
• Refined and updated versions of the Architecture Vision phase deliverables
− Statement of Architecture Work
− Validated data principles, business goals, and business drivers
• Draft Architecture Definition Document
− Baseline Data Architecture
− Target Data Architecture: business data model, logical data model, data management
process models, Data Entity/Business Function matrix, and views corresponding to the
selected viewpoints addressing key stakeholder concerns
• Draft Architecture Requirements Specification
− Gap analysis results
− Data interoperability requirements
− Relevant technical requirements
− Constraints on the Technology Architecture about to be designed
− Updated business requirements
− Updated application requirements
• Data Architecture components of an Architecture Roadmap
COBIT Structure
COBIT
• Plan and Organise (PO)
− PO2 Define the information architecture
− PO3 Determine technological direction
− PO5 Manage the IT investment
• Acquire and Implement (AI)
− AI2 Acquire and maintain application software
− AI3 Acquire and maintain technology infrastructure
− AI5 Procure IT resources
• Deliver and Support (DS)
− DS2 Manage third-party services
− DS3 Manage performance and capacity
− DS5 Ensure systems security
• Monitor and Evaluate (ME)
− ME2 Monitor and evaluate internal control
− ME3 Ensure regulatory compliance
COBIT and Data Management
COBIT Process DS11 Manage Data
• DS11.1 Business Requirements for Data Management
− Establish arrangements to ensure that source documents expected from the business are received, all data received from the
business are processed, all output required by the business is prepared and delivered, and restart and reprocessing needs are
supported
• DS11.2 Storage and Retention Arrangements
− Define and implement procedures for data storage and archival, so data remain accessible and usable
− Procedures should consider retrieval requirements, cost-effectiveness, continued integrity and security requirements
− Establish storage and retention arrangements to satisfy legal, regulatory and business requirements for documents, data, archives,
programmes, reports and messages (incoming and outgoing) as well as the data (keys, certificates) used for their encryption and
authentication
• DS11.3 Media Library Management System
− Define and implement procedures to maintain an inventory of onsite media and ensure their usability and integrity
− Procedures should provide for timely review and follow-up on any discrepancies noted
• DS11.4 Disposal
− Define and implement procedures to prevent access to sensitive data and software from equipment or media when they are
disposed of or transferred to another use
− Procedures should ensure that data marked as deleted or to be disposed cannot be retrieved.
• DS11.5 Backup and Restoration
− Define and implement procedures for backup and restoration of systems, data and documentation in line with business
requirements and the continuity plan
− Verify compliance with the backup procedures, and verify the ability to and time required for successful and complete restoration
− Test backup media and the restoration process
• DS11.6 Security Requirements for Data Management
− Establish arrangements to identify and apply security requirements applicable to the receipt, processing, physical storage and
output of data and sensitive messages
− Includes physical records, data transmissions and any data stored offsite
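The DS11.5 requirement to verify that a backup can actually be restored can be illustrated with a small check. The following is only a minimal sketch: a plain file copy stands in for a real backup tool, and the function names `sha256_of` and `backup_and_verify` are hypothetical, not part of COBIT or any product.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source, backup_dir):
    """Back up a file, restore it to a scratch area, and compare checksums.

    Assumption: copying the file stands in for the real backup and
    restore procedures, which the slides do not specify.
    """
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    backup_path = backup_dir / Path(source).name
    shutil.copy2(source, backup_path)  # "backup"
    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / backup_path.name
        shutil.copy2(backup_path, restored)  # "restore"
        # The restore counts as successful only if the content is identical
        return sha256_of(source) == sha256_of(restored)
```

A scheduled job could call `backup_and_verify` for each protected file and record the result and elapsed time, which addresses both the "ability to" and "time required for" parts of the control.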
COBIT Data Management Goals and Metrics
• Activity Goals
− Backing up data and testing restoration
− Managing onsite and offsite storage of data
− Securely disposing of data and equipment
• Process Goals
− Maintain the completeness, accuracy, validity and accessibility of stored data
− Secure data during disposal of media
− Effectively manage storage media
Data Management Book of Knowledge (DMBOK)
Scope and Structure of Data Management Book of
Knowledge (DMBOK)
Data Management
Environmental Elements
Data
Management
Functions
DMBOK Data Management Functions
• Data Governance - planning, supervision and control over data management and
use
• Data Architecture Management - defining the blueprint for managing data assets
• Data Development - analysis, design, implementation, testing, deployment,
maintenance
• Data Operations Management - providing support from data acquisition to
purging
• Data Security Management - Ensuring privacy, confidentiality and appropriate
access
• Data Quality Management - defining, monitoring and improving data quality
• Reference and Master Data Management - managing master versions and
replicas
• Data Warehousing and Business Intelligence Management - enabling reporting
and analysis
• Document and Content Management - managing data found outside of databases
• Metadata Management - integrating, controlling and providing metadata
DMBOK Data Management Environmental Elements
• Goals and Principles - directional business goals of each function and the fundamental
principles that guide performance of each function
• Activities - each function is composed of lower level activities, sub-activities, tasks and
steps
• Primary Deliverables - information and physical databases and documents created as
interim and final outputs of each function. Some deliverables are essential, some are
generally recommended, and others are optional depending on circumstances
• Roles and Responsibilities - business and IT roles involved in performing and supervising
the function, and the specific responsibilities of each role in that function. Many roles will
participate in multiple functions
• Practices and Techniques - common and popular methods and procedures used to perform
the processes and produce the deliverables and may also include common conventions,
best practice recommendations, and alternative approaches without elaboration
• Technology - categories of supporting technology such as software tools, standards and
protocols, product selection criteria and learning curves
• Organisation and Culture – this can include issues such as management metrics, critical
success factors, reporting structures, budgeting, resource allocation issues, expectations
and attitudes, style, culture and approach to change management
DMBOK Data Management Functions and
Environmental Elements
• Environmental elements (columns): Goals and Principles, Activities, Primary Deliverables,
Roles and Responsibilities, Practices and Techniques, Technology, Organisation and Culture
• Data management functions (rows): Data Governance, Data Architecture Management, Data
Development, Data Operations Management, Data Security Management, Data Quality
Management, Reference and Master Data Management, Data Warehousing and Business
Intelligence Management, Document and Content Management, Metadata Management
• Matrix content: Scope of Each Data Management Function
Scope of Data Management Book of Knowledge
(DMBOK) Data Management Framework
• Hierarchy
− Function
• Activity
− Sub-Activity (not in all cases)
• Each activity is classified as one (or more) of:
− Planning Activities (P)
• Activities that set the strategic and tactical course for other data management
activities
• May be performed on a recurring basis
− Development Activities (D)
• Activities undertaken within implementation projects and recognised as part of the
systems development lifecycle (SDLC), creating data deliverables through analysis,
design, building, testing, preparation, and deployment
− Control Activities (C)
• Supervisory activities performed on an on-going basis
− Operational Activities (O)
• Service and support activities performed on an on-going basis
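The hierarchy and the four activity groups above can be pictured as a small data structure. This is a minimal sketch, not a DMBOK artefact: the class names are hypothetical, and the group assigned to each example activity is an illustrative assumption, since the slides do not state which group each activity belongs to.

```python
from dataclasses import dataclass, field
from enum import Enum


class ActivityGroup(Enum):
    PLANNING = "P"
    DEVELOPMENT = "D"
    CONTROL = "C"
    OPERATIONAL = "O"


@dataclass
class Activity:
    name: str
    groups: set[ActivityGroup]  # each activity is classified as one or more groups
    sub_activities: list["Activity"] = field(default_factory=list)  # not in all cases

    def __post_init__(self):
        if not self.groups:
            raise ValueError(f"{self.name!r} must belong to at least one group")


@dataclass
class Function:
    name: str
    activities: list[Activity] = field(default_factory=list)

    def in_group(self, group):
        """Names of this function's activities classified under the given group."""
        return [a.name for a in self.activities if group in a.groups]


# Activity names come from the Data Governance slide; the groups are assumed.
governance = Function(
    "Data Governance",
    [
        Activity("Develop and Maintain the Data Strategy", {ActivityGroup.PLANNING}),
        Activity("Monitor and Ensure Regulatory Compliance", {ActivityGroup.CONTROL}),
    ],
)
print(governance.in_group(ActivityGroup.PLANNING))
```

Holding the groups as a set reflects the "one (or more)" wording: an activity can be listed under several groups without being duplicated in the hierarchy.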
Activity Groups Within Functions
DMBOK Function and Activity Structure
Data Management
• Data Governance
− Data Management Planning
• Data Architecture Management
− Understand Enterprise Information Needs
− Analyse and Align With Other Business Models
− Define and Maintain the Database Architecture
− Define and Maintain Enterprise Taxonomies and Namespaces
• Data Development
− Data Modeling, Analysis, and Solution Design
− Data Model and Design Quality Management
− Data Implementation
• Data Operations Management
− Database Support
• Data Security Management
− Understand Data Security Needs and Regulatory Requirements
− Define Data Security Standards
− Define Data Security Controls and Procedures
− Monitor User Authentication and Access Behaviour
− Audit Data Security
• Data Quality Management
− Develop and Promote Data Quality Awareness
− Profile, Analyse, and Assess Data Quality
− Define Data Quality Metrics
− Set and Evaluate Data Quality Service Levels
− Manage Data Quality Issues
− Monitor Operational DQM Procedures and Performance
• Reference and Master Data Management
− Understand Reference and Master Data Integration Needs
− Define and Maintain the Data Integration Architecture
− Implement Reference and Master Data Management Solutions
− Define and Maintain Hierarchies and Affiliations
− Replicate and Distribute Reference and Master Data
• Data Warehousing and Business Intelligence Management
− Understand Business Intelligence Information Needs
− Implement Data Warehouses and Data Marts
− Implement BI Tools and User Interfaces
− Monitor and Tune BI Activity and Performance
• Document and Content Management
− Documents / Records Management
• Metadata Management
− Understand Metadata Requirements
− Develop and Maintain Metadata Standards
− Implement a Managed Metadata Environment
− Manage Metadata Repositories
− Query, Report, and Analyse Metadata
DMBOK Function and Activity - Planning Activities
Data Management
• Data Architecture Management
− Analyse and Align With Other Business Models
− Define and Maintain the Database Architecture
− Define and Maintain the DW / BI Architecture
• Data Development
− Data Model and Design Quality Management
− Data Implementation
• Data Security Management
− Define Data Security Standards
− Define Data Security Controls and Procedures
− Manage Data Access Views and Permissions
− Audit Data Security
• Data Quality Management
− Profile, Analyse, and Assess Data Quality
− Define Data Quality Metrics
− Test and Validate Data Quality Requirements
− Manage Data Quality Issues
− Design and Implement Operational DQM Procedures
− Monitor Operational DQM Procedures and Performance
• Reference and Master Data Management
− Define and Maintain the Data Integration Architecture
− Implement Reference and Master Data Management Solutions
− Establish "Golden" Records
− Replicate and Distribute Reference and Master Data
• Data Warehousing and Business Intelligence Management
− Implement Data Warehouses and Data Marts
− Implement BI Tools and User Interfaces
− Monitor and Tune Data Warehousing Processes
• Metadata Management
− Develop and Maintain Metadata Standards
− Implement a Managed Metadata Environment
− Integrate Metadata
− Query, Report, and Analyse Metadata
DMBOK Function and Activity - Control Activities
DMBOK Function and Activity - Development Activities
DMBOK Function and Activity - Operational Activities
DMBOK Environmental Elements Structure
Data Management Environmental Elements
• Guiding Principles
• Trigger Events
• Other Resources
• Attitudes, Styles, Preferences
• Teamwork, Group Dynamics, Authority, Empowerment
• Contracting Strategies
• Change Management Approach
DMBOK Environmental Elements
Data Governance
Data Governance – Definition and Goals
• Definition
− The exercise of authority and control (planning, monitoring, and
enforcement) over the management of data assets
• Goals
− To define, approve, and communicate data strategies, policies,
standards, architecture, procedures, and metrics
− To track and enforce regulatory compliance and conformance to
data policies, standards, architecture, and procedures
− To sponsor, track, and oversee the delivery of data management
projects and services
− To manage and resolve data related issues
− To understand and promote the value of data assets
Data Governance - Overview
Inputs Primary Deliverables
Data Governance Function, Activities and Sub-
Activities
Data Governance
• Develop and Maintain the Data Strategy
• Identify and Appoint Data Stewards
• Establish Data Governance and Stewardship Organisations
• Develop and Approve Data Policies, Standards, and Procedures
• Coordinate Data Governance Activities
• Monitor and Ensure Regulatory Compliance
• Monitor and Enforce Conformance with Data Policies, Standards and Architecture
• Oversee Data Management Projects and Services
Data Governance - Possible Organisation Structure
Data Governance Shared Decision Making
• Business Decisions
− Business Operating Model
− IT Leadership
− Capital Investments
− Data Governance Model
• Shared Decision Making
− Enterprise Information Model
− Information Needs
− Information Specifications
− Issue Resolution
− Enterprise Information Management Strategy
− Enterprise Information Management Policies
− Enterprise Information Management Standards
− Enterprise Information Management Services
• IT Decisions
− Database Architecture
− Data Integration Architecture
− Data Warehousing and Business Intelligence Architecture
− Technical Metadata
Data Stewardship
Data Stewardship - Roles
Data Stewardship Roles Across Data Management
Functions - 1
Steward roles (columns): All Data Stewards, Executive Data Stewards, Coordinating Data
Stewards, Business Data Stewards
• Data Architecture Management
− All Data Stewards: Review, validate, approve, maintain and refine data architecture
− Executive Data Stewards: Review and approve the enterprise data architecture
− Coordinating Data Stewards: Integrate specifications, resolving differences
− Business Data Stewards: Define data requirements specifications
• Data Development
− Validate physical data models and database designs, participate in database testing
and conversion
− Define data requirements and specifications
• Data Operations Management
− Define requirements for data recovery, retention and performance
− Help identify, acquire, and control externally sourced data
• Data Security Management
− Provide security, privacy and confidentiality requirements, identify and resolve data
security issues, assist in data security audits, and classify information confidentiality
• Reference and Master Data Management
− Control the creation, update, and retirement of code values and other reference data,
define master data management requirements, identify and help resolve issues
Data Stewardship Roles Across Data Management
Functions - 2
Steward roles (columns): All Data Stewards, Executive Data Stewards, Coordinating Data
Stewards, Business Data Stewards
• Data Warehousing and Business Intelligence Management
− Provide business intelligence requirements and management metrics, and identify and
help resolve business intelligence issues
• Document and Content Management
− Define enterprise taxonomies and resolve content management issues
• Metadata Management
− Create and maintain business metadata (names, meanings, business rules), define
metadata access and integration needs and use metadata to make effective data
stewardship and governance decisions
• Data Quality Management
− Define data quality requirements and business rules, test application edits and
validations, assist in the analysis, certification, and auditing of data quality, lead
clean-up efforts, identify ways to solve causes of poor data quality, promote data
quality awareness
Data Strategy
Elements of Data Strategy
• Data Management Programme Charter
− Overall vision, business case, goals, guiding principles, measures of success, critical
success factors, recognised risks
• Data Management Scope Statement
− Goals and objectives for a defined planning horizon and the roles, organisations, and
individual leaders accountable for achieving these objectives
• Data Management Implementation Roadmap
− Identifying specific programs, projects, task assignments, and delivery milestones
Data Policies
Data Architecture
Data Standards and Procedures
Regulatory Compliance
Issue Management, Control and Escalation
Data Management Projects
Data Asset Valuation
Data Architecture Management
Enterprise Data Model
• Subject Area Model
• Conceptual Data Model
• Enterprise Logical Data Models
• Other Enterprise Data Model Components
− Data Steward Responsibility Assignments
− Valid Reference Data Values
− Data Quality Specifications
− Entity Life Cycles
• Definition
− Designing, implementing, and maintaining solutions to meet the
data needs of the enterprise
• Goals
− Identify and define data requirements
− Design data structures and other solutions to these requirements
− Implement and maintain solution components that meet these
requirements
− Ensure solution conformance to data architecture and standards
as appropriate
− Ensure the integrity, security, usability, and maintainability of
structured data assets
• Data Modelling, Analysis and Solution Design
− Analyse Information Requirements
• Detailed Data Design
− Design Physical Databases
• Data Model and Design Quality Management
− Develop Data Modeling and Design Standards
• Data Implementation
− Implement Development / Test Database Changes
Keys
• Data development activities are an integral part of the software development lifecycle
• Data modeling is an essential technique for effective data management and system design
• Conceptual and logical data modeling express business and application requirements while
physical data modeling represents solution design
• Data modeling and database design define detailed solution component specifications
• Data modeling and database design balance tradeoffs and needs
• Data professionals should collaborate with other project team members to design
information products and data access and integration interfaces
• Data modeling and database design should follow documented standards
• Design reviews should review all data models and designs, in order to ensure they meet
business requirements and follow design standards
• Data models represent valuable knowledge resources and so should be carefully managed
and controlled through library, configuration, and change management to ensure
data model quality and availability
• Database administrators and other data professionals play important roles in the
construction, testing, and deployment of databases and related application systems
• Entities
− A data entity is a collection of data about something that the
business deems important and worthy of capture
− Entities appear in conceptual or logical data models
• Relationships
− Business rules define constraints on what can and cannot be done
• Data Rules – define constraints on how data relates to other data
• Action Rules - instructions on what to do when data elements contain
certain values
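The difference between data rules and action rules can be shown with a minimal Python sketch. The Order record, its field names and the 10,000 approval threshold are hypothetical and are not taken from the DMBOK; they only illustrate the two kinds of rule.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Order:
    order_date: date
    ship_date: Optional[date]
    amount: float


def check_data_rules(order: Order) -> List[str]:
    """Data rules: constraints on how data relates to other data."""
    violations = []
    if order.ship_date is not None and order.ship_date < order.order_date:
        violations.append("ship_date must not precede order_date")
    if order.amount < 0:
        violations.append("amount must not be negative")
    return violations


def apply_action_rules(order: Order) -> List[str]:
    """Action rules: what to do when data elements contain certain values."""
    actions = []
    if order.amount >= 10_000:  # hypothetical approval threshold
        actions.append("route to manager approval")
    return actions
```

A data rule reports a violation of a constraint, while an action rule returns an instruction to carry out; keeping them in separate functions makes that distinction visible in the design.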
• Performance and Ease of Use - Ensure quick and easy access to data
by approved users in a usable and business-relevant form
• Reusability - The database structure should ensure that, where
appropriate, multiple applications would be able to use the data
• Integrity - The data should always have a valid business meaning and
value, regardless of context, and should always reflect a valid state
of the business
• Security - True and accurate data should always be immediately
available to authorised users, but only to authorised users
• Maintainability - Perform all data work at a cost that yields value by
ensuring that the cost of creating, storing, maintaining, using, and
disposing of data does not exceed its value to the organisation
• What are the performance requirements? What is the maximum permissible time for a
query to return results, or for a critical set of updates to occur?
• What are the availability requirements for the database? What are the window(s) of time
for performing database operations? How often should database backups and transaction
log backups be done (i.e., what is the longest period of time we can risk non-recoverability
of the data)?
• What is the expected size of the database? What is the expected rate of growth of the
data? At what point can old or unused data be archived or deleted? How many concurrent
users are anticipated?
• What sorts of data virtualisation are needed to support application requirements in a way
that does not tightly couple the application to the database schema?
• Will other applications need the data? If so, what data and how?
• Will users expect to be able to do ad-hoc querying and reporting of the data? If so, how and
with which tools?
• What, if any, business or application processes does the database need to implement?
(e.g., trigger code that does cross-database integrity checking or updating, application
classes encapsulated in database procedures or functions, database views that provide
table recombination for ease of use or security purposes, etc.).
• Are there application or developer concerns regarding the database, or the database
development process, that need to be addressed?
• Is the application code efficient? Can a code change relieve a performance issue?
Performance Modifications
Participants
• Database Administrators
• Software Developers
• Project Managers
• Data Stewards
• Data Architects and Analysts
• DM Executives and Other IT Management
• IT Operators
Tools
• Database Management Systems
• Data Development Tools
• Database Administration Tools
• Office Productivity Tools
Metrics
• Availability
• Performance
Activities
• Set Database Performance Service Levels
• Monitor and Tune Database Performance
• Inventory and Track Data Technology Licenses
• Support Data Technology Usage and Issues
• Definition
− Planning, development, and execution of security policies and
procedures to provide proper authentication, authorisation,
access, and auditing of data and information.
• Goals
− Enable appropriate, and prevent inappropriate, access and
change to data assets
− Meet regulatory requirements for privacy and confidentiality
− Ensure the privacy and confidentiality needs of all stakeholders
are met
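As an illustration of authorisation combined with auditing, the following minimal Python sketch checks a request against a permission table and records every decision. The roles, the asset name and the in-memory log are hypothetical, and authentication is assumed to have happened upstream.

```python
from datetime import datetime, timezone
from typing import Dict, List, Set, Tuple

# Hypothetical permission table: role -> set of (data asset, operation) pairs.
PERMISSIONS: Dict[str, Set[Tuple[str, str]]] = {
    "analyst": {("customer", "read")},
    "steward": {("customer", "read"), ("customer", "update")},
}

# In-memory audit trail; a real system would write to durable storage.
audit_log: List[dict] = []


def is_authorised(role: str, asset: str, operation: str) -> bool:
    """Authorise a request and record the decision for later audit."""
    allowed = (asset, operation) in PERMISSIONS.get(role, set())
    audit_log.append({
        "when": datetime.now(timezone.utc).isoformat(),
        "role": role,
        "asset": asset,
        "operation": operation,
        "allowed": allowed,
    })
    return allowed
```

Both permitted and refused requests are logged, so the same record supports the goal of enabling appropriate access and the goal of detecting inappropriate attempts.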
Data Security Management
Inputs
• Business Goals
• Business Strategy
• Business Rules
• Business Process
• Data Strategy
• Data Privacy Issues
• Related IT Policies and Standards
Suppliers
• Data Stewards
• IT Steering Committee
• Data Stewardship Council
• Government
• Customers
Primary Deliverables
• Data Security Policies
• Data Privacy and Confidentiality Standards
• User Profiles, Passwords and Memberships
• Data Security Permissions
• Data Security Controls
• Data Access Views
• Document Classifications
• Authentication and Access History
• Data Security Audits
Data Security Management activities
• Understand Data Security Needs and Regulatory Requirements
− Business Requirements
− Regulatory Requirements
• Define Data Security Policy
• Define Data Security Standards
• Define Data Security Controls and Procedures
• Manage Users, Passwords, and Group Membership
− Password Standards and Procedures
• Manage Data Access Views and Permissions
• Monitor User Authentication and Access Behaviour
• Classify Information Confidentiality
• Audit Data Security
Reference and Master Data Management
Inputs
• Business Drivers
• Data Requirements
• Policy and Regulations
• Standards
• Code Sets
• Master Data
• Transactional Data
Primary Deliverables
• Master and Reference Data Requirements
• Data Models and Documentation
• Reliable Reference and Master Data
• Golden Record Data Lineage
• Data Quality Metrics and Reports
• Data Cleansing Services
Topics
• Vocabulary Management and Reference Data
• Defining Golden Master Data Values
• Party Master Data
• Financial Master Data
• Product Master Data
• Location Master Data
• What are the important roles, organisations, places, and things referenced
repeatedly?
• What data is describing the same person, organisation, place, or thing?
• Where is this data stored? What is the source for the data?
• Which data is more accurate? Which data source is more reliable and credible?
Which data is most current?
• What data is relevant for specific needs? How do these needs overlap or conflict?
• What data from multiple sources can be integrated to create a more complete
view and provide a more comprehensive understanding of the person,
organisation, place or thing?
• What business rules can be established to automate master data quality
improvement by accurately matching and merging data about the same person,
organisation, place, or thing?
• How do we identify and restore data that was inappropriately matched and
merged?
• How do we provide our golden data values to other systems across the
enterprise?
• How do we identify where and when data other than the golden values is used?
Party Master Data
Staging
Metadata Management: Business Metadata, Integration Metadata, Job Flow and Statistics
• Golden data values are the data values thought to be the most
accurate, current, and relevant for shared, consistent use across
applications
• Determine golden values by analysing data quality, applying data
quality rules and matching rules, and incorporating data quality
controls into the applications that acquire, create, and update data
• Establish data quality measurements to set expectations, measure
improvements, and help identify root causes of data quality
problems
• Assess data quality through a combination of data profiling activities
and verification against adherence to business rules
• Once the data is standardised and cleansed, the next step is to
attempt reconciliation of redundant data through application of
matching rules
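The matching and merging steps described above can be sketched in a few lines of Python. The matching rule (same normalised name and postcode) and the survivorship rule (take each attribute from the most recently updated record that has a value) are assumptions chosen for illustration; real matching rules are set by the business and are usually more tolerant of variation.

```python
from collections import defaultdict
from typing import Dict, List


def match_key(record: dict) -> tuple:
    """Hypothetical matching rule: same normalised name and postcode."""
    name = " ".join(record["name"].lower().split())
    postcode = record["postcode"].replace(" ", "").upper()
    return (name, postcode)


def golden_records(records: List[dict]) -> List[dict]:
    """Group matching records, then take each attribute from the most
    recently updated record that has a non-empty value for it."""
    groups: Dict[tuple, List[dict]] = defaultdict(list)
    for rec in records:
        groups[match_key(rec)].append(rec)

    result = []
    for group in groups.values():
        # ISO-format date strings sort correctly as plain text.
        newest_first = sorted(group, key=lambda r: r["updated"], reverse=True)
        golden = {}
        for rec in newest_first:
            for field, value in rec.items():
                if field not in golden and value not in (None, ""):
                    golden[field] = value
        result.append(golden)
    return result
```

This sketch discards the source records once they are merged. To be able to restore data that was inappropriately matched, an implementation would also need to keep the lineage from each golden value back to its source record.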
Data Warehousing and Business Intelligence Management activities
• Understand Business Intelligence Information Needs
• Define and Maintain the DW-BI Architecture
• Implement Data Warehouses and Data Marts
• Implement Business Intelligence Tools and User Interfaces
• Process Data for Business Intelligence
• Monitor and Tune Data Warehousing Processes
• Monitor and Tune BI Activity and Performance
Sub-activities
• Mapping Sources and Targets
• On Line Analytical Processing (OLAP) Tools
• Implementing Management Dashboards and Scorecards
• Performance Management Tools
• Predictive Analytics and Data Mining Tools
• Advanced Visualisation and Discovery Tools
Data Warehousing and Business Intelligence
Management Principles
• Obtain executive commitment and support as these projects are labour intensive
• Secure business SMEs, as their support and high availability are necessary for getting the correct data
and a useful BI solution
• Be business focused and driven. Make sure DW / BI work is serving real priority business needs and
solving burning business problems. Let the business drive the prioritisation
• Demonstrable data quality is essential
• Provide incremental value. Ideally deliver in continual 2-3 month segments
• Transparency and self service. The more context (metadata of all kinds) provided, the more value
customers derive. Wisely exposing information about the process reduces calls and increases
satisfaction.
• One size does not fit all. Make sure you find the right tools and products for each of your customer
segments
• Think and architect globally, act and build locally. Let the big picture and end vision guide the
architecture, but build and deliver incrementally, with a much shorter-term and more project-based
focus
• Collaborate with and integrate all other data initiatives, especially those for data governance, data
quality, and metadata
• Start with the end in mind. Let the business priority and scope of end-data delivery in the BI space
drive the creation of the DW content. The main purpose for the existence of the DW is to serve up data
to the end business customers via the BI capabilities
• Summarise and optimise last, not first. Build on the atomic data and add aggregates or summaries as
needed for performance, but not to replace the detail.
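The last principle can be illustrated with a minimal Python sketch in which a summary is derived from atomic fact rows and kept alongside them. The sales rows and the (day, product) grain are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Atomic fact rows (hypothetical): one row per individual sale.
sales: List[dict] = [
    {"day": "2010-03-01", "product": "A", "amount": 10.0},
    {"day": "2010-03-01", "product": "A", "amount": 5.0},
    {"day": "2010-03-02", "product": "B", "amount": 7.5},
]


def build_summary(rows: List[dict]) -> Dict[Tuple[str, str], float]:
    """Derive a (day, product) aggregate from the atomic rows.
    The summary is an addition for performance; the detail is kept."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for row in rows:
        totals[(row["day"], row["product"])] += row["amount"]
    return dict(totals)
```

Because the summary is computed from the detail rather than stored in its place, it can be rebuilt at a different grain later without going back to the source systems.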
Diagram: BI tools by audience
• Audiences: Customers, Suppliers and Regulators; Frontline Workers
• Tools: Published Reports, BI Spreadsheets, Business Query, Production Reporting Tools, Statistics
• Tool categories: Commonly Used Tools, Specialist Tools
On Line Analytical Processing (OLAP) Tools
Participants
• All Employees
• Data Stewards
• DM Professionals
• Records Management Staff
• Other IT Professionals
• Data Management Executive
• Other IT Managers
• Chief Information Officer
• Chief Knowledge Officer
Tools
• Stored Documents
• Office Productivity Tools
• Image and Workflow Management Tools
• Records Management Tools
• XML Development Tools
• Collaboration Tools
• Internet
• Email Systems
Metrics
• Return on investment
• Key Performance Indicators
• Balanced Scorecards
Document and Content Management Function,
Activities and Sub-Activities
Document and Content Management
• Definition
− Planning, implementation, and control activities to enable easy
access to high quality, integrated metadata
• Goals
− Provide organisational understanding of terms, and usage
− Integrate metadata from diverse sources
− Provide easy, integrated access to metadata
− Ensure metadata quality and security
• Metadata is information about the physical data, technical and business processes, data rules and
constraints, and logical and physical structures of the data, as used by an organisation
• Descriptive tags describe data, concepts and the connections between the data and concepts
− Business Analytics: Data definitions, reports, users, usage, performance
− Business Architecture: Roles and organisations, goals and objectives
− Business Definitions: The business terms and explanations for a particular concept, fact, or other item
found in an organisation
− Business Rules: Standard calculations and derivation methods
− Data Governance: Policies, standards, procedures, programs, roles, organisations, stewardship
assignments
− Data Integration: Sources, targets, transformations, lineage, ETL workflows, EAI, EII, migration /
conversion
− Data Quality: Defects, metrics, ratings
− Document Content Management: Unstructured data, documents, taxonomies, name sets, legal
discovery, search engine indexes
− Information Technology Infrastructure: Platforms, networks, configurations, licenses
− Logical Data Models: Entities, attributes, relationships and rules, business names and definitions
− Physical Data Models: Files, tables, columns, views, business definitions, indexes, usage, performance,
change management
− Process Models: Functions, activities, roles, inputs / outputs, workflow, business rules, timing, stores
− Systems Portfolio and IT Governance: Databases, applications, projects and programs, integration
roadmap, change management
− Service-Oriented Architecture (SOA) Information: Components, services, messages, master data
− System Design and Development: Requirements, designs and test plans, impact
− Systems Management: Data security, licenses, configuration, reliability, service levels
• Business User Requirements
• Technical User Requirements
• Centralised Metadata Architecture
• Distributed Metadata Architecture
• Hybrid Metadata Architecture
• Industry / Consensus Metadata Standards
• International Metadata Standards
• Standard Metadata Metrics
• Metadata Repositories
• Directories, Glossaries and Other Metadata Stores
• Hybrid architecture where metadata still moves directly from the source systems
into a repository but the repository design only accounts for the user-added
metadata, the critical standardised items and the additions from manual sources
• Advantages
− Near-real-time retrieval of metadata from its source and enhanced metadata to meet
user needs most effectively, when needed
− Lowers the effort for manual IT intervention and custom-coded access functionality to
proprietary systems.
• Disadvantages
− Source systems must be available because the distributed nature of the back-end
systems handles processing of queries
− Additional overhead is required to link those initial results with metadata augmentation
in the central repository before presenting the result set to the end user
− Design forces the metadata repository to contain the latest version of the metadata
source and forces it to manage changes to the source, as well
− Sets of program / process interfaces to tie the repository back to the metadata
source(s) must be built and maintained
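The hybrid retrieval path can be sketched as follows: technical metadata is read from the source system at query time and then augmented with the user-added and standardised items held in the central repository. The reader functions, the dictionary-based repository and all names are hypothetical stand-ins, not the interface of any metadata product.

```python
from typing import Callable, Dict

# Hypothetical source-system readers: each returns live technical metadata.
SourceReader = Callable[[str], dict]


def hybrid_lookup(
    table: str,
    source: str,
    readers: Dict[str, SourceReader],
    repository: Dict[str, dict],
) -> dict:
    """Fetch metadata from the source at query time, then augment it
    with user-added and standardised items held in the repository."""
    if source not in readers:
        # Disadvantage noted above: the source system must be available.
        raise LookupError(f"source system {source!r} is not available")
    live = readers[source](table)
    extra = repository.get(table, {})
    # Live source values win if both sides define the same key.
    return {**extra, **live}
```

The merge step is the additional overhead listed among the disadvantages: every query pays for a live read from the source and a repository lookup before the result reaches the end user.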
• Understanding the various standards for the implementation and management of metadata
in industry is essential to the appropriate selection and use of a metadata solution for
an enterprise
− OMG (Object Management Group) specifications
• Common Warehouse Metamodel (CWM)
• Information Management Metamodel (IMM)
• MDC Open Information Model (OIM)
• Extensible Markup Language (XML)
• Unified Modeling Language (UML)
• XML Metadata Interchange (XMI)
• Ontology Definition Metamodel (ODM)
− World Wide Web Consortium (W3C) RDF (Resource Description Framework) for describing and
interchanging metadata using XML
− Dublin Core Metadata Initiative (DCMI) interoperable online metadata standard using RDF
− Distributed Management Task Force (DMTF) Web-Based Enterprise Management (WBEM)
Common Information Model (CIM) standards-based management tools facilitating the exchange of
data across otherwise disparate technologies and platforms
− Metadata standards for unstructured data
• ISO 5964 - Guidelines for the establishment and development of multilingual thesauri
• ISO 2788 - Guidelines for the establishment and development of monolingual thesauri
• ANSI/NISO Z39.1 - American Standard Reference Data and Arrangement of Periodicals
• ISO 704 - Terminology work - Principles and methods
• Definition
− Planning, implementation, and control activities that apply quality
management techniques to measure, assess, improve, and ensure
the fitness of data for use
• Goals
− To measurably improve the quality of data in relation to defined
business expectations
− To define requirements and specifications for integrating data
quality control into the system development lifecycle
− To provide defined processes for measuring, monitoring, and
reporting conformance to acceptable levels of data quality
Data Quality Management
Inputs
• Business Requirements
• Data Requirements
• Data Quality Expectations
• Data Policies and Standards
• Business metadata
• Technical metadata
• Data Sources and Data Stores
Primary Deliverables
• Improved Quality Data
• Data Management
• Operational Analysis
• Data Profiles
• Data Quality Certification Reports
• Data Quality Service Level Agreements
Suppliers
Data Quality Management activities
• Develop and Promote Data Quality Awareness
• Define Data Quality Requirements
• Profile, Analyse and Assess Data Quality
• Define Data Quality Metrics
• Define Data Quality Business Rules
• Test and Validate Data Quality Requirements
• Set and Evaluate Data Quality Service Levels
• Continuously Measure and Monitor Data Quality
• Manage Data Quality Issues
• Clean and Correct Data Quality Defects
• Design and Implement Operational DQM Procedures
• Monitor Operational DQM Procedures and Performance
• Applications are dependent on the use of data that meets specific needs associated with
the successful completion of a business process
• Data quality requirements are often hidden within defined business policies
− Identify key data components associated with business policies
− Determine how identified data assertions affect the business
− Evaluate how data errors are categorised within a set of data quality dimensions
− Specify the business rules that measure the occurrence of data errors
− Provide a means for implementing measurement processes that assess conformance to those
business rules
• Dimensions of data quality
− Accuracy
− Completeness
− Consistency
− Currency
− Precision
− Privacy
− Reasonableness
− Referential Integrity
− Timeliness
− Uniqueness
− Validity
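Several of these dimensions can be expressed as simple ratios over a column of values. The following Python sketch covers three of them; the definitions are one common reading of each dimension rather than a formula given in the DMBOK, and any validity pattern would come from a business rule.

```python
import re
from typing import List, Optional


def completeness(values: List[Optional[str]]) -> float:
    """Share of values that are present (not None and not blank)."""
    present = [v for v in values if v is not None and v.strip() != ""]
    return len(present) / len(values) if values else 1.0


def uniqueness(values: List[str]) -> float:
    """Share of values that are distinct."""
    return len(set(values)) / len(values) if values else 1.0


def validity(values: List[str], pattern: str) -> float:
    """Share of values that fully match a business-defined pattern."""
    rule = re.compile(pattern)
    valid = [v for v in values if rule.fullmatch(v)]
    return len(valid) / len(values) if values else 1.0
```

Each function returns a score between 0.0 and 1.0, which makes the results directly comparable with acceptability thresholds. An empty column is scored 1.0 here by convention, since it contains no failing values.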
• Data quality SLAs specify the organisation’s expectations for response and
remediation
• Having data quality inspection and monitoring in place increases the likelihood of
detection and remediation of a data quality issue before a significant business
impact can occur
• Operational data quality control defined in a data quality SLA includes
− The data elements covered by the agreement
− The business impacts associated with data flaws
− The data quality dimensions associated with each data element
− The expectations for quality for each data element for each of the identified dimensions
in each application or system in the value chain
− The methods for measuring against those expectations
− The acceptability threshold for each measurement
− The individual(s) to be notified in case the acceptability threshold is not met
− The timelines and deadlines for expected resolution or remediation of the issue
− The escalation strategy and possible rewards and penalties when the resolution times
are met or missed
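The SLA elements listed above map naturally onto a small data structure and a threshold check. This is a minimal Python sketch; the SlaTerm fields, element names, contacts and thresholds are hypothetical, and the escalation strategy is left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SlaTerm:
    data_element: str
    dimension: str
    threshold: float           # minimum acceptable score, 0.0 to 1.0
    notify: str                # who to notify when the threshold is not met
    resolve_within_hours: int  # deadline for resolution or remediation


def evaluate(terms: List[SlaTerm], scores: Dict[tuple, float]) -> List[str]:
    """Compare measured scores with SLA thresholds and list notifications."""
    alerts = []
    for term in terms:
        score = scores.get((term.data_element, term.dimension))
        if score is None:
            alerts.append(f"{term.notify}: no measurement for "
                          f"{term.data_element}/{term.dimension}")
        elif score < term.threshold:
            alerts.append(f"{term.notify}: {term.data_element}/{term.dimension} "
                          f"at {score:.2f}, below {term.threshold:.2f}; "
                          f"resolve within {term.resolve_within_hours}h")
    return alerts
```

A missing measurement is reported as an alert rather than ignored, on the assumption that an element covered by the agreement but never measured is itself a breach of the monitoring process.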
Scope of Project (matrix of type of project against the data management functions within scope of the project)
Data Management Functions
• Data Governance
• Data Architecture Management
• Data Development
• Data Operations Management
• Data Security Management
• Reference and Master Data Management
• Data Warehousing and Business Intelligence Management
• Document and Content Management
• Metadata Management
• Data Quality Management
Type of Project
• Architecture
• Analysis and Design
• Implementation
• Operational Improvement
• Management and Administration
• Organisational
• Separate business units within the organisation generally implement their own
solutions
• Each business unit will have different IT systems, data warehouses/data marts and
business intelligence tools
• Organisation-wide coordination of data resources requires a centralised dedicated
structure like the DMCOE providing data services
• Leads an organisation to business benefits through continuous improvement of
data management
• DMCOE functions need to focus on leveraging organisational knowledge and skills
to maximise the value of data to the organisation
• Maximise technology investment while decreasing costs and increasing efficiency,
centralise best practices and standards and empower knowledge workers with
information and provide thought leadership to the entire company
• DMCOE does not exist in isolation from other operations and service management
functions
• Personnel Management
• Data Management
• Portfolio Management
• Data Management Specific Functions
− Data Management Strategy
− Data Governance
− Data Architecture Management
− Data Development
− Data Operations Management
− Data Security Management
− Reference and Master Data Management
− Data Warehousing and Business Intelligence Management
− Document and Content Management
− Metadata Management
− Data Quality Management
• Environment and Infrastructure Management
• Co-ordination of Data Management Systems and Initiatives
• Resource Management and Allocation
• Management of Data Processes
• Analysis and Design
• Creation and Enforcement of Data Principles and Standards
• Vendor Management
• Performance Management
• Development Standards
• Data Usage Strategy
• Data Quality
• Solution Development and Deployment
• Version Management and Control
• Service Level Management
• Test Management – System, Integration, UAT, UAT Support
• Application and Tools Architecture
• Data Migration
• Performance Monitoring and Management
• Security Management
• Data and Content Architecture
• Reporting
• System Maintenance
• Integration Architecture
• Business analysts begin to control the data management process with IT playing a
supporting role
• Data is recognised as a business enabler and moves from an undervalued commodity to an
enterprise asset but there are still limited controls in place
• Executive management appreciates and understands the role of data governance and
commits resources to its management
• Data administrative function exists as a complement to the database administration
function and data is present for both business and IT related development discussions
• Some core data has a defined policy that is documented as part of the applications
development lifecycle; the policies are enforced to a limited extent, and testing is
performed to ensure that data quality requirements are being achieved
• Data quality is not fully defined and there are multiple views of what quality means
• Metadata repository exists and a data group maintains corporate data definitions and
business rules
• A centralised platform for managing data is available at the group level and feeds analytical
data marts
• Data is available to business users and can be audited
• Data is treated as a critical corporate asset and viewed as equivalent to other enterprise wide assets
• Unified data governance strategy exists throughout the enterprise with executive level and CEO
support
• Data management objectives are reviewed by senior management
• Business process interaction is completely documented and planning is centralised
• Data quality control, integration and synchronisation are integral parts of all business processes
• Content is monitored and corrected in real time to manage the reliability of the data manufacturing
process and is based on the needs of customers, end users and the organisation as a whole
• Data quality is understood in statistical terms and managed throughout the transactions lifecycle
• Root cause analysis is well established and proactive steps are taken to prevent and not just correct
data inconsistencies
• A centralised metadata repository exists and all changes are synchronised
• Data consistency is expected and achieved
• Data platform is managed at the enterprise level and feeds all reference data repositories
• Advanced platform tools are used to manage the metadata repository and all data transformation
processes
• Data quality and integration tools are standardised across the enterprise.
For each data management function (Data Governance, Data Architecture Management, Data Development, Metadata Management):
< Description of capability associated with maturity level >
Alan McSweeney
alan@alanmcsweeney.com