Master Data Management IBM InfoSphere Rapid Deployment Package Front cover


IBM® Information Management Software
Front cover
Master Data Management
IBM InfoSphere
Rapid Deployment Package
Implementing faster to see the
benefits faster
Seeing benefits with a financial
services scenario
Getting control of your
data environment
Chuck Ballard
Priyanka Deswal
Paul Flores
Philippe Guitard
Charles Jia
Marty Pittman
Neeraj Singh
Lena Woolf
ibm.com/redbooks
International Technical Support Organization
Master Data Management: IBM InfoSphere Rapid
Deployment Package
April 2011
SG24-7704-01
Note: Before using this information and the product it supports, read the information in
“Notices” on page vii.
Second Edition (April 2011)
This edition applies to the Rapid Deployment Package (RDP) solution for IBM® InfoSphere™
Master Data Management (MDM) Server Version 9.0.1 and Versions 8.0.1, 8.1.0.1 and 8.5 of
IBM InfoSphere Information Server.
© Copyright International Business Machines Corporation 2009, 2011. All rights reserved.
Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP
Schedule Contract with IBM Corp.
Contents
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
The team who wrote this book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
Now you can become a published author, too! . . . . . . . . . . . . . . . . . . . . . . . . xiv
Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv
Stay connected to IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Summary of changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii
Chapter 1. Overview of the Rapid Deployment Package for MDM . . . . . . . 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 The case for the RDP for MDM Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Determining whether RDP is the right choice for you . . . . . . . . . . . . . . . . . 5
Chapter 2. Rapid Deployment Package details. . . . . . . . . . . . . . . . . . . . . . . 7
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 MDMIS Parameter Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.1 DL_000_AutoStart_PS_DELTA_LOAD Job Sequence . . . . . . . . . . . 8
2.2.2 DL_000_DELTA_LOAD Job Sequence . . . . . . . . . . . . . . . . . . . . . . 15
2.3 Standard Interface File (SIF) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4 Suspect Duplicate Processing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.5 Configuration screens in the MDM Server UI . . . . . . . . . . . . . . . . . . . . . . 19
2.5.1 Enabling and disabling Suspect Duplicate Processing . . . . . . . . . . . 19
2.5.2 Selecting the set of Party Match criteria . . . . . . . . . . . . . . . . . . . . . . 20
2.6 Database Load options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Chapter 3. RDP MDM: Direct Load. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.1 Direct Load process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.2 Job Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.3 Import . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.4 Data Quality Assurance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.5 Data Quality Error Consolidation / Reporting . . . . . . . . . . . . . . . . . . . . . . 45
3.6 Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.7 ID assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.8 Data insert and update . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Chapter 4. RDP for MDM: Delta Load . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.2 MDM Party Maintenance Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.2.1 The instance resolution problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.2.2 MDM Party Maintenance Services behavior . . . . . . . . . . . . . . . . . . . 68
4.2.3 MDM Party Maintenance Services Transaction List . . . . . . . . . . . . . 74
4.2.4 MDM Party Maintenance Services Profile. . . . . . . . . . . . . . . . . . . . . 87
4.2.5 MDM Party Maintenance Services installation . . . . . . . . . . . . . . . . . 89
4.3 MDM RDP Runtime Assets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.3.1 SIF Parser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.3.2 Data extension and SIF Parser configuration . . . . . . . . . . . . . . . . . . 99
4.3.3 SIF sequencer. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.3.4 QualityStage runtime standardization and matching jobs . . . . . . . . 105
4.3.5 Search Suspect Candidates rule. . . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.3.6 Disable phonetic keys generation in MDM Server . . . . . . . . . . . . . 106
4.3.7 MDM RDP Runtime Assets installation . . . . . . . . . . . . . . . . . . . . . . 107
4.3.8 MDM Matching Critical Data Rules Console user interface . . . . . . 111
4.4 Performance tuning for MDM Delta Load using RDP . . . . . . . . . . . . . . . 114
4.4.1 MDM BatchProcessor configuration . . . . . . . . . . . . . . . . . . . . . . . . 115
4.4.2 MDM Server configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
4.4.3 WebSphere Application Server configuration . . . . . . . . . . . . . . . . . 119
4.4.4 Database tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.4.5 Information Services Director job configuration . . . . . . . . . . . . . . . 122
4.5 Run Delta Load for MDM using RDP . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
4.5.1 Create source SIF files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
4.5.2 Run SIF Sequencer Job . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
4.5.3 Run MDM BatchProcessor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
4.5.4 Check Delta Load result and error messages . . . . . . . . . . . . . . . . . 128
Chapter 5. Financial services business scenario . . . . . . . . . . . . . . . . . . 131
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
5.2 Business requirement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
5.3 Environment configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.4 An approach to implementation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
5.5 Initial load . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.5.1 FBankCoT checking, savings, and loans systems . . . . . . . . . . . . . 139
5.5.2 Data quality assessment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.5.3 Create canonical form from the data sources . . . . . . . . . . . . . . . . . 151
5.5.4 Validate and modify efficacy of the RDP MDM rule sets. . . . . . . . . 169
5.5.5 Create SIF. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
5.5.6 Execute RDP for MDM jobs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
5.5.7 Verify successful load . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
5.6 Suspect resolution. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
5.7 Hierarchies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
5.7.1 Hierarchy overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
5.7.2 Hierarchy scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
5.8 MDM consumption application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
5.9 Operational processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
Appendix A. Configuration parameter file . . . . . . . . . . . . . . . . . . . . . . . . 275
Appendix B. Standard Interface File details . . . . . . . . . . . . . . . . . . . . . . . 295
Appendix C. MDM customization considerations . . . . . . . . . . . . . . . . . . 309
C.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
C.2 Data extensions and additions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
C.3 Behavior extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
C.4 Impact of data/behavior extensions on RDP for MDM . . . . . . . . . . . . . . 312
C.5 Extending RDP for MDM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
C.6 Runtime column propagation (RCP). . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
C.7 Adding new elements (columns). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
C.8 Modifying existing elements (columns). . . . . . . . . . . . . . . . . . . . . . . . . . 316
Appendix D. Error processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
D.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
D.2 Pipe character (|) in the data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
D.3 Validation error with the code table . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
D.4 RT/ST/ADMIN_SYS_TP_CD error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
D.5 End of record missing error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
D.6 Start date after end date error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
D.7 Date format error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
Appendix E. Additional material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
Locating the web material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
Using the web material. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
System requirements for downloading the web material . . . . . . . . . . . . . 338
How to use the web material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area.
Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM
product, program, or service may be used. Any functionally equivalent product, program, or service that
does not infringe any IBM intellectual property right may be used instead. However, it is the user's
responsibility to evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document.
The furnishing of this document does not give you any license to these patents. You can send license
inquiries, in writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION
PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer
of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may
make improvements and/or changes in the product(s) and/or the program(s) described in this publication at
any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any
manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the
materials for this IBM product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without
incurring any obligation to you.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm
the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on
the capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the
sample programs are written. These examples have not been thoroughly tested under all conditions. IBM,
therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business
Machines Corporation in the United States, other countries, or both. These and other IBM trademarked
terms are marked on their first occurrence in this information with the appropriate symbol (® or ™),
indicating US registered or common law trademarks owned by IBM at the time this information was
published. Such trademarks may also be registered or common law trademarks in other countries. A current
list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml
The following terms are trademarks of the International Business Machines Corporation in the United States,
other countries, or both:
DataStage®
DB2®
developerWorks®
IBM®
Information Agenda™
InfoSphere™
POWER5™
pSeries®
QualityStage™
Rational®
Redbooks®
Redbooks (logo)®
Tivoli®
WebSphere®
The following terms are trademarks of other companies:
Java, and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other
countries, or both.
Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States, other
countries, or both.
Intel, Intel logo, Intel Inside logo, and Intel Centrino logo are trademarks or registered trademarks of Intel
Corporation or its subsidiaries in the United States and other countries.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Other company, product, or service names may be trademarks or service marks of others.
The following company names appearing in this publication are fictitious:
Fictional Bank Company T
FBankCoT
These names are used for instructional purposes only.
Preface
This IBM® Redbooks® publication documents the procedures for implementing
an IBM InfoSphere™ Master Data Management (MDM) solution using the Rapid
Deployment Package (RDP) for Master Data Management offering involving a
typical financial services business scenario.
It is aimed at IT architects, Information Management specialists, and Information
Integration specialists responsible for implementing an IBM InfoSphere Master
Data Management solution on a Red Hat Enterprise Linux® 4.0 platform.
This book is organized as follows:
• Chapter 1, “Overview of the Rapid Deployment Package for MDM” on page 1
provides an outline of the fundamentals of implementing an enterprise MDM
solution with IBM InfoSphere Master Data Management Server, taking
advantage of the RDP for MDM.
• Chapter 2, “Rapid Deployment Package details” on page 7 provides an
overview of the RDP component of the IBM InfoSphere RDP for MDM
solution. It includes the MDM Information Server (MDMIS) parameter set
configuration details, a brief description of the Standard Interface File (SIF),
the Duplicate Suspect Processing (DSP) configuration and database loading
options.
• Chapter 3, “RDP MDM: Direct Load” on page 33 provides a high-level
description of the Direct Load process and the RDP components that are
used in that process. The Direct Load process categories help to organize the
presentation of the high-level descriptions of the related IBM DataStage® and
IBM QualityStage™ assets.
• Chapter 4, “RDP for MDM: Delta Load” on page 65 provides an overview of a
Delta Load solution using IBM InfoSphere MDM Server (MDM) RDP Runtime
Assets and MDM Party Maintenance Services. A Delta Load in RDP is the
process of synchronizing changes in source system data with MDM Server.
Because data is processed by MDM services during load, this solution
provides the best level of business data validation, ease of implementation
and maintenance, and highest MDM Server sustainability. The chapter
provides implementation, configuration, and installation details about MDM
RDP Runtime Assets and MDM Party Maintenance Services.
• Chapter 5, “Financial services business scenario” on page 131 describes an
approach to implementing an IBM InfoSphere MDM Server using the
InfoSphere RDP on a Linux platform. The scenario uses a fictitious financial
services business as an example to explain the approach. The initial load of
the IBM InfoSphere MDM Server is performed with RDP for MDM DataStage
and QualityStage (QS) jobs, and subsequent operational loads are performed
using MDM Server RDP runtime assets.
• Appendix A, “Configuration parameter file” on page 275 classifies the various
parameters into broad categories and sub-categories based on their function.
It identifies the parameters in these categories that you must modify before
the RDP for MDM jobs can be executed, and those that you should consider
modifying.
• Appendix B, “Standard Interface File details” on page 295 provides an
overview of the Record Type/Sub Type (RT/ST) mapping of the Standard
Interface File (SIF).
• Appendix C, “MDM customization considerations” on page 309 describes the
extensions supported by MDM Server and the impact of such extensions on
the RDP for MDM jobs.
• Appendix D, “Error processing” on page 317 describes the most commonly
encountered data-related problems in the SIF, and how they are highlighted in
the RDP for MDM error log.
The team who wrote this book
This book was produced by a team of specialists from around the world working
at the International Technical Support Organization (ITSO), San Jose Center.
Chuck Ballard is a Project Manager at the International
Technical Support Organization, in San Jose, California. He
has over 35 years of experience, holding positions in the
areas of product engineering, sales, marketing, technical
support, and management. His expertise is in the areas of
database, data management, data warehousing, business
intelligence, and process re-engineering. He has written
extensively on these subjects, taught classes, and presented
at conferences and seminars worldwide. Chuck has both a
Bachelors degree and a Masters degree in Industrial Engineering from Purdue
University.
Priyanka Deswal is an Information Agenda™ Architect with
IBM Software group, focused on designing solutions for
customers in Asia Pacific. She has more than eleven years of
experience in information management. Her areas of
expertise include database technologies, information
integration, master data management, content management
and business analytics. She holds a bachelors degree in
Computer Science and Engineering.
Paul Flores is a Senior Software Developer and Application
Architect, and an IBM Certified Solution Developer for
InfoSphere DataStage v8.5, located in Phoenix, AZ. His
career spans more than 20 years in various Information
System disciplines, working with companies such as Sandia
National Laboratories, Intel® and Acxiom. The majority of his
career has centered around the development of software in
support of diverse areas that span research and
manufacturing. Paul joined IBM in 2008, and has been a part
of the MDM RDP Development team since its inception. He holds a Bachelors
degree in Mathematics and a Masters degree in Computer Information Sciences.
Philippe Guitard is a Senior Software Developer with the
MDM Server development team at IBM Canada. He has over
15 years of experience with software application
development and data integration projects, and designed
and developed several DataStage jobs for the RDP solution.
Philippe has previously been a Senior Consultant,
specialized in Enterprise Application Integration (EAI) with
IBM WebSphere® Transformation Extender, and developed
credit scoring applications for Experian. Philippe holds a
Masters degree in Computer Science from the Galilée Institute, Paris, France.
Charles Jia is a Senior Technical Specialist in the IBM
Software Group with specialization in IBM InfoSphere MDM
Server. His past experience has included leading roles in
RDP for MDM Maintenance Services and MDM
Development, and having primary responsibility for MDM
features. He has over seven years of experience in
developing MDM Server and four years of experience in IBM
client services and consulting. Charles holds a Bachelors
degree in Computer Science from Brock University in St.
Catharines, ON Canada.
Marty Pittman is an Information Management Architect with
the IBM Software Group, located in Charlotte, NC. He has
more than 17 years of experience as an Information
Management Solution Architect, and a Master Data
Management and Data Warehousing / Business Intelligence
Technical Specialist. Marty has broad experience in using
his Financial Services background for discovering business
problems and architecting information management
solutions throughout all aspects of the Information
Architecture, with a focus on master data management, data warehousing, data
integration, and business analytics solutions in the banking and financial services
industries. Marty holds a Bachelors degree in Finance.
Neeraj Singh is currently a Senior Performance Engineer,
and has been working on InfoSphere Master Data
Management Server performance since June 2007. He has
prior experience leading the Java™ technologies test team
for functional, system, and performance tests as technical
lead and test project leader. Neeraj joined IBM in 2000 and
holds a Bachelors degree in Electronics and
Communications Engineering.
Lena Woolf is a Senior Product Architect for the InfoSphere
Master Data Management Server at the IBM Toronto Lab.
She has over 12 years of experience in designing and
developing enterprise applications for a wide range of
industries, including banking, insurance, retail, and health
care. Lena joined IBM in 2005 as part of the DWL
acquisition, and since that time has been involved in the
product architecture of MDM Server, playing a key role in six
major releases of the product. She holds a Masters degree
in Computer Science from the National Technical University of Ukraine.
Other contributors
Thanks to the following people for their contributions to this project.
From IBM locations worldwide
Tim Davis: Executive Director, Information Agenda Architecture Group, IBM
Software Group, Information Management, Littleton, MA.
Dickson Fu: Software Developer, IBM Software Group, Information Management,
Markham, ON Canada.
Christopher Grote: Technical Solution Architect, InfoSphere Centre of
Excellence, IBM Software Group, London, UK.
Clive Hannah: Information Agenda Architect, IBM Software Group, Information
Management, Markham, ON Canada.
Susan Laime: IM Analytics and Optimization Software Services, IBM Software
Group, Information Management, Littleton, MA.
Barry Rosen: Global Executive Architect, InfoSphere Information Agenda Group,
Westford, MA.
From the ITSO, San Jose, CA
Mary Comianos: Publication Management
Emma Jacobs: Graphics
Ann Lund: Residency Administration
Diane Sherman: Editor
Authors of the first edition of this book
The following list of authors wrote the first edition of this book. We thank them for
their excellent work, much of which is still contained in this second edition.
• Nagraj Alur
• Alex Baryudin
• Mike Carney
• Priyanka Deswal
• Tim Davis
• Elizabeth Dial
• Norbert Eschle
• Clive Hannah
• Patrick Owen
• Barry Rosen
• Torben Skov
Now you can become a published author, too!
Here's an opportunity to spotlight your skills, grow your career, and become a
published author—all at the same time! Join an ITSO residency project and help
write a book in your area of expertise, while honing your experience using
leading-edge technologies. Your efforts will help to increase product acceptance
and customer satisfaction, as you expand your network of technical contacts and
relationships. Residencies run from two to six weeks in length, and you can
participate either in person or as a remote resident working from your home
base.
Find out more about the residency program, browse the residency index, and
apply online at:
ibm.com/redbooks/residencies.html
Comments welcome
Your comments are important to us!
We want our books to be as helpful as possible. Send us your comments about
this book or other IBM Redbooks publications in one of the following ways:
• Use the online Contact us review Redbooks form found at:
ibm.com/redbooks
• Send your comments in an email to:
[email protected]
• Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400
Stay connected to IBM Redbooks
• Find us on Facebook:
http://www.facebook.com/IBMRedbooks
• Follow us on Twitter:
http://twitter.com/ibmredbooks
• Look for us on LinkedIn:
http://www.linkedin.com/groups?home=&gid=2130806
• Explore new Redbooks publications, residencies, and workshops with the
IBM Redbooks weekly newsletter:
https://www.redbooks.ibm.com/Redbooks.nsf/subscribe?OpenForm
• Stay current on recent Redbooks publications with RSS Feeds:
http://www.redbooks.ibm.com/rss.html
Summary of changes
Summary of changes as created or updated on April 27, 2011.
In this section, we provide a high-level summary of changes made to Master
Data Management: Rapid Deployment Package for MDM, SG24-7704-00 to
produce this second (updated) edition, Master Data Management: IBM
InfoSphere Rapid Deployment Package, SG24-7704-01.
This edition reflects the addition, deletion, or modification of new and changed
information described below. However, it may also include minor corrections and
editorial changes that are not identified.
New information
New information is as follows:
• Chapter 1, “Overview of the Rapid Deployment Package for MDM” on page 1:
This new chapter provides an overview of the MDM solution and benefits to
help position an MDM solution for IBM clients. It includes some information
from the now-deleted Appendix A of the first edition.
• Chapter 3, “RDP MDM: Direct Load” on page 33: This new chapter describes
the Direct Load process and the Rapid Deployment Package (RDP)
components that are utilized in that process.
• Chapter 4, “RDP for MDM: Delta Load” on page 65: This new chapter
describes the capability for Delta Loads for MDM. The first edition contained
information about only the MDM Initial Load.
• Appendix A, “Configuration parameter file” on page 275: This new appendix
provides additional information about customizing an MDM solution.
Changed information
Changed information is as follows:
• Chapter 2, “RDP Detail” from the first edition, has been updated and
expanded, but the configuration parameters file information has been
extracted and is now contained in the new Appendix A, “Configuration
parameter file” on page 275.
• Chapter 3, “Financial services business scenario,” from the first edition, has
been changed to Chapter 5, “Financial services business scenario” on
page 131, and is updated in scope.
Chapter 1. Overview of the Rapid Deployment Package for MDM
In this book, we outline the fundamentals of implementing an enterprise Master
Data Management (MDM) solution with IBM InfoSphere Master Data
Management (MDM) Server, using the Rapid Deployment Package (RDP) for
MDM.
In this chapter, we briefly describe the RDP for MDM Server and discuss
considerations for helping you determine whether RDP is the right choice for you.
1.1 Introduction
The Rapid Deployment Package for MDM is a services offering that combines
the pre-integration of IBM InfoSphere software with a prescriptive MDM
implementation approach to significantly reduce the cost of MDM
implementations, and reduce the overall risk. RDP can be rapidly deployed as
the initial stage of an MDM Server deployment.
The RDP MDM solution is fully integrated and provides a Single View of the
customer to your enterprise, whether the customer is defined as a customer,
client, member, or citizen, as examples. Fully integrated means that the
solution is prepackaged with the IBM InfoSphere Information Server, IBM
InfoSphere MDM Server Foundation, customizable integration assets (such as a
pre-built set of Information Server jobs for data load and pre-built QualityStage
Data Quality Rule Sets), and a fully articulated set of suggested and
repeatable deployment practices.
The content we provide in this book details the RDP MDM solution, to help you
better understand the technical underpinnings, operational metrics, and
deployment methods for the solution.
1.2 The case for the RDP for MDM Server
The Rapid Deployment Package for MDM (RDP) offering is designed for a first
phase of MDM projects. At this stage, clients typically deploy MDM Server, as
depicted in Figure 1-1 on page 3, in consolidation or coexistence styles of master
data management, when data is loaded into an MDM repository, but most data
changes are still coming from existing systems. RDP provides a prebuilt set of
InfoSphere Information Server DataStage jobs for performing initial and delta
loads directly to the MDM database:
• An initial load is an original movement of data from source systems into the
MDM repository when the repository is empty.
• A delta load is a periodic (typically daily) data update from source systems
into MDM.
[Figure 1-1 components: Source Systems (Source #1 through Source #N);
Information Server (Information Analyzer, FastTrack, DataStage, QualityStage);
SIF; DataStage Load Process (DS Jobs); MDM Server (MDM Business Services,
Duplicate Suspect Processing, MDM Database, User Interface and Reporting,
History)]
Figure 1-1 MDM Server implementation
MDM Server Business Services are used for data inquiries, and the MDM Server
Data Steward User Interface (DSUI) is used by Data Stewards to collapse the
duplicate data that was not automatically collapsed during the load. After initial
deployment, RDP projects can easily be expanded to become full centralized
hubs.
Deployment of the RDP MDM solution essentially involves the following steps:
1. Install and configure the RDP MDM solution in your UNIX® or Linux
environment.
2. Analyze and profile your source customer data.
3. Map source systems to the RDP MDM Standard Interface File (SIF).
4. Export source data as SIF.
5. If required, extend the model. The RDP price includes extending the model for
up to 10 additional attributes and attribute lengths.
6. Configure RDP to feed source systems data as part of initial load and delta
load processes.
7. Test and tune the system. Tune standardization and matching rules.
8. Deploy the solution into your production environment.
The Rapid Deployment Package for MDM provides a prescriptive
implementation approach with a pre-determined scope, which results in a
predictable implementation timeline. The requirements for Master Data
Management projects vary by client and therefore the implementation solutions
vary to some extent. RDP is a pre-packaged solution that addresses the most
typical set of client requirements and has a limited set of functionality, but still
ensures an upgrade path to a full MDM solution.
Next, we outline several key business requirements fulfilled by the RDP offering.
Supported entities
The Rapid Deployment Package for MDM provides support for the following
Party Domain MDM Server entities and their child objects:
• Party, Person, Organization, PersonName, OrganizationName,
PartyAddress, PartyContactMethod, PartyPrivPref, PartyIdentification,
PartyValue, PartyLobRelationship, PartyAlert, PartyRelationship,
AdminContEquiv
• Contract, ContractAlert, ContractValue, AdminNativeKey
• ContractComponent, ContractComponentValue
• ContractPartyRole, ContractRoleLocation
• Party Hierarchy
Solution to the Instance Resolution Problem
When data flows into InfoSphere MDM Server directly from external applications,
such as established systems, the internal key is not known and often the nature
of the data change is also not known. This issue, which is referred to as an
Instance Resolution Problem, requires that the following information be
determined:
• Which party or contract are you working with?
• Is data being added or updated?
• If you are trying to update, which instance do you want to update when multiple
names or addresses, multiple contact methods or identifiers, multiple contract
components, multiple party roles, and so on, exist?
RDP solves the Instance Resolution Problem in the following way:
• The Unique Party Contact Equivalency key is used to identify the party you
are working with. The contact equivalent data for a party cannot be changed.
However, a party can have multiple rows in the contact equivalent table.
• To identify a contract, RDP uses the source system key stored in either the
contract table or the native key table. The determination of which to use is an
implementation decision and applies to the entire implementation. That is, all
contracts are identified through one or the other but not a combination of
both.
• For child objects, the record instance is identified by the implied business key,
which includes a type. Using person names as an example, MDM Server
supports multiple legal names, multiple alias names, multiple preferred
names, and so forth. With RDP, a party can have only one legal name, one
alias name, one preferred name, and so forth. When you provide name
information, you must also provide the party cross reference in the form of a
Contact Equivalency key and the type of name (such as legal and alias). With
this information, determining whether a name must be added, or which
specific name must be updated, is then possible.
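The add-or-update decision for child objects can be sketched as follows. This is a minimal illustration in Python, not RDP code: the Party class, the resolve_name function, and the sample key SRC1:0001 are hypothetical names, and the only rule taken from the text is that a party holds at most one name per name type.

```python
from dataclasses import dataclass, field


@dataclass
class Party:
    """A party as seen by the load: at most one name per name type."""
    names: dict = field(default_factory=dict)  # name type -> name value


def resolve_name(parties, equivalency_key, name_type, name_value):
    """Apply a name record and report whether it was an add or an update."""
    party = parties.get(equivalency_key)
    if party is None:
        raise KeyError(f"Unknown Contact Equivalency key: {equivalency_key}")
    # The implied business key is (party, name type), so the presence of the
    # type decides between adding a new name and updating the existing one.
    action = "update" if name_type in party.names else "add"
    party.names[name_type] = name_value
    return action


parties = {"SRC1:0001": Party()}  # hypothetical Contact Equivalency key
print(resolve_name(parties, "SRC1:0001", "legal", "Jane Doe"))     # add
print(resolve_name(parties, "SRC1:0001", "legal", "Jane Q. Doe"))  # update
print(resolve_name(parties, "SRC1:0001", "alias", "JD"))           # add
```

Because the name type is part of the key, a second legal name for the same party can only overwrite the first one, which is the restriction described above.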
Suspect Duplicate Processing
Suspect Duplicate Processing (SDP) searches for, matches, and creates
associations (suspects) between existing parties in the system. In RDP, SDP has
the following characteristics:
• The matching rule for parties is implemented as a QualityStage job and
supports configurable matching attributes. Matching weights are calculated
by QualityStage.
• The suspect duplicate candidate selection algorithm differs from the algorithm
that is provided with the default candidate selection rule in MDM Server.
• Auto-collapsing of exact duplicates can be turned on or off.
• Auto-collapsing as part of direct initial load has implications in terms of data
lineage within the MDM solution.
1.3 Determining whether RDP is the right choice for you
There are three basic rules, listed here in the form of questions, that can help you
determine whether the Rapid Deployment Package for MDM is the right choice
for your first MDM Server implementation.
RDP might be the right choice for you if you can answer yes to all three of the
following questions (rules):
1. Do you have a large data volume to be loaded in a short time?
2. Does RDP meet your customer requirements for Party domain?
3. Is RDP available on your platform of choice for the MDM Server solution?
Rule 1, the first question, says to consider using the RDP solution for data load
only if the batch load using services cannot meet your load time requirements.
The typical suggested approach for loading data into MDM Server is to use
maintenance runtime services and the MDM Server Batch processor for both
initial data loads and delta loads. Be sure to turn off SDP during initial load. This
initial load approach provides both good performance, and ease of
implementation and maintenance. This approach requires Evergreening, which is
performing bulk processing in batch mode, for SDP after the initial load and prior
to delta load.
Even if you have selected to use the RDP solution because of the large data
volume for initial load, consult Rule 1 again when choosing a solution for delta
load. Delta load typically contains significantly less data and therefore the load
through services is a better choice. We strongly suggest using maintenance
runtime services and MDM Server Batch processor for delta load.
Rule 2, the second question, says that RDP is a suitable solution if the customer
requirements for Party domain fit within the requirements outlined in 1.2, “The
case for the RDP for MDM Server” on page 2.
The business logic for the instance resolution problem and duplicate suspect
processing is embedded within numerous InfoSphere DataStage jobs. Significant
modifications to the business logic can take time and negate an important benefit
of the RDP solution, which is a predictable implementation timeline. The following
are several examples:
• Adding support for new MDM Server entities implemented as additions would
require development of several new DataStage jobs.
• Changing business keys (to support multiple legal names, for example) would
require modifications to multiple DataStage jobs.
• Although QualityStage matching and standardization rule sets are shared
between RDP and MDM Server run time, other changes in suspect duplicate
processing logic would have to be implemented in both DataStage jobs and in
Java rules for MDM Server run time. For example, changes to the blocking
logic in QualityStage would have to be implemented as a new Java candidate
selection rule to be invoked from MDM Server services as part of the MDM
Server DSUI. The DSUI is used by Data Stewards to collapse the duplicate
data that was not auto-collapsed during the load.
Also note that RDP is a services offering with pre-built assets. After the
assets are customized and become part of the solution, they are no longer
supported, just like any other assets written during services engagements.
Rule 3, the third question, says that you should check with the RDP Assets
availability matrix to ensure that RDP is supported on your platform of choice for
an MDM Server solution. Not all MDM Server supported platforms are available
with RDP.
Chapter 2. Rapid Deployment Package details
This chapter provides an overview of the Rapid Deployment Package (RDP)
component of the IBM InfoSphere RDP for Master Data Management (MDM)
solution. It includes the MDM Information Server (MDMIS) parameter set
configuration details, a brief description of the Standard Interface File (SIF), the
Duplicate Suspect Processing (DSP) configuration, and Database loading
options.
2.1 Introduction
The RDP for MDM solution consists of various components that either require
specific configuration or offer customization options. This chapter provides a
detailed review of the various configuration tasks that are required when setting
up the following components:
• MDMIS Parameter Set
• Standard Interface File (SIF)
• Suspect Duplicate Processing
• Database load options
2.2 MDMIS Parameter Set
The execution of the RDP for MDM jobs is driven by configuration parameters
that you can customize to suit the unique requirements of your organization.
The RDP jobs can be run in two ways, which differ in how the MDMIS parameter
set is configured:
• DL_000_AutoStart_PS_DELTA_LOAD Job Sequence
• DL_000_DELTA_LOAD Job Sequence
2.2.1 DL_000_AutoStart_PS_DELTA_LOAD Job Sequence
If you execute either the DL_000_AutoStart_PS_DELTA_LOAD or the
DL_000_AutoStart_PS_HIERARCHY job sequence, the sequence performs the
following tasks:
1. Runs the IL_000_PS__Prestart sequence job to extract the following items:
– Configuration parameters from the CONFIGELEMENT table in MDM
– Database settings from the MDM_CONNECTIONS parameter set
– The default configuration parameter file (named STATIC_MDMIS)
2. Creates a temporary configuration file (named VOLATILE_MDMIS) where the
parameter values coming from the CONFIGELEMENT table override the
corresponding parameter values in the STATIC_MDMIS configuration file. The
BATCH_ID and DS_PROCESSING_DATE parameters are automatically set.
The newly generated VOLATILE_MDMIS parameter value file is then passed to
the DL_000_AutoStart_PS_DELTA_LOAD or DL_000_AutoStart_PS_HIERARCHY
sequence jobs. The default configuration parameter file STATIC_MDMIS is
unchanged and a copy of the VOLATILE_MDMIS parameter set value file is
created as another value file named VOLATILE_MDMIS.<Batch number> to keep a
history of all the parameter set value files used in RDP runs. Figure 2-1 shows
the parameter set value files that are used in RDP.
Note: A temporary TEMP_MDMIS parameter set value file is automatically
created during the IL_000_PS__Prestart job sequence in preparation for the
creation of the VOLATILE_MDMIS parameter set value file. This parameter set
value file does not need to be maintained.
Figure 2-1 Parameter value files used in RDP
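The override order described above can be sketched in a few lines of Python. This is an illustration of the precedence rule only, not the logic of the IL_000_PS__Prestart sequence: the function names, the sample batch number, and the ISO date format are assumptions, and the parameter values are represented as plain dictionaries.

```python
from datetime import date


def build_volatile_params(static_params, configelement_params,
                          batch_id, processing_date):
    """Merge parameter values the way the text describes the override order."""
    volatile = dict(static_params)         # STATIC_MDMIS itself is not modified
    volatile.update(configelement_params)  # CONFIGELEMENT values take precedence
    # These two parameters are set automatically for every run.
    volatile["BATCH_ID"] = batch_id
    volatile["DS_PROCESSING_DATE"] = processing_date.isoformat()  # assumed format
    return volatile


def history_file_name(batch_id):
    """Name of the per-run copy kept as history."""
    return f"VOLATILE_MDMIS.{batch_id}"


static = {"FS_DATA_SET_HEADER_DIR": "/tmp/headers/",
          "QS_A1_MATCH_CUTOFF_PERSON": "205"}
from_table = {"FS_DATA_SET_HEADER_DIR": "/mdmisdata03/Projects/MDMDLINT2/DATA/"}
volatile = build_volatile_params(static, from_table, 1001, date(2011, 4, 1))
print(volatile["FS_DATA_SET_HEADER_DIR"])  # value from CONFIGELEMENT wins
print(history_file_name(1001))             # VOLATILE_MDMIS.1001
```

Copying the static values before applying the overrides mirrors the statement that the STATIC_MDMIS file remains unchanged between runs.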
MDMIS parameter set configuration
To get the MDMIS parameter set ready for runtime, the following steps are
necessary:
1. Set the MDM_CONNECTIONS parameters with the correct information to
connect to the MDM database. You can create separate parameter value files
to be able to run RDP against various databases. Figure 2-2 shows an
example.
Figure 2-2 MDM_CONNECTIONS parameter set value file names
Note: Be sure that the DS_VALUE_FILE_NAME parameter matches
the parameter set Value File Name (DEV, QA and PROD in Figure 2-2)
because the IL_000_PS__Prestart job sequence uses the
DS_VALUE_FILE_NAME parameter to retrieve the correct
MDM_CONNECTIONS parameter set value file to merge with the
STATIC_MDMIS parameter set at runtime.
2. Prepare the STATIC_MDMIS parameter set value file in the MDM parameter
set. The best approach is to update the default values as shown in Figure 2-3,
and then delete the existing STATIC_MDMIS parameter set value file, if any,
and re-create it to ensure the newly created parameter set contains the
correct values taken from the Default Value column.
Figure 2-3 MDMIS parameter set default values
Note: All the parameters that are linked to the MDM CONFIGELEMENT
table are retrieved from that table at runtime; therefore, setting a
Default Value for them is not necessary. To determine which parameters
are linked to the MDM CONFIGELEMENT table, look at the Help
Text column to see if the CONFIGELEMENT= parameter is present. See
Figure 2-4 for an example.
Figure 2-4 MDMIS parameter set help text
3. Populate the parameter values stored in the CONFIGELEMENT table in
MDM. There are two different ways to do so:
– Using the CFG_Config job sequence.
– Using the MDM Management Console.
Using the CFG_Config job sequence
In the RDP DataStage jobs, there is a job category (folder) called Jobs
Configuration, containing two jobs as shown in Figure 2-5:
 A job sequencer called CFG_Config
 A job called CFG_Update_CM_Params
Figure 2-5 Configuration Jobs
To use the CFG_Config job sequence, all the parameters in the MDMIS
parameter set that will be assigned values from the CONFIGELEMENT table
must be updated. To know which parameters to update, look for the existence of
the string CONFIGELEMENT= in the Help Text column in the MDMIS
parameter set, as shown in Figure 2-4.
The format of the Help Text column in the MDMIS parameter is as follows:
<Help text>. CONFIGELEMENT=<Parameter Name in the CONFIGELEMENT
table>=<Parameter Value>
Example 2-1 shows a sample of the FS_DATA_SET_HEADER_DIR parameter
help text.
Example 2-1 Help Text column for the FS_DATA_SET_HEADER_DIR parameter
Dataset headers directory.
CONFIGELEMENT=/IBM/ELMDM/IIS/Install/ISDataSetHeaders/path=/mdmisdata03/Projects/MDMDLINT2/DATA/
In some cases, the Help Text column contains multiple CONFIGELEMENT
parameter names separated by a tilde (~) symbol, as shown in Example 2-2.
These CONFIGELEMENT parameters are exclusively QualityStage parameters
and should be set through the MDM Server UI. See 2.4, “Suspect Duplicate
Processing” on page 19 for more details.
Example 2-2 Help Text column for the QS_MATCH_PERSON_1 parameter
Specify Variable Match String type and TpCd for person - C1 (default).
CONFIGELEMENT=/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString1/type=C~/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString1/TpCd=1
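To make the Help Text format concrete, the following Python sketch splits a Help Text value into its description and its CONFIGELEMENT name/value pairs, including the tilde-separated case. The function name parse_help_text is hypothetical and this is not how the CFG_Config job sequence is implemented; it only follows the format stated above and assumes that the help text is held on a single line.

```python
MARKER = "CONFIGELEMENT="


def parse_help_text(help_text):
    """Return (description, {CONFIGELEMENT name: value}) for one Help Text."""
    description, found, spec = help_text.partition(MARKER)
    elements = {}
    if found:
        # Multiple CONFIGELEMENT entries are separated by a tilde (~).
        for item in spec.split("~"):
            name, _, value = item.strip().partition("=")
            elements[name] = value
    return description.strip(), elements


single = ("Dataset headers directory. "
          "CONFIGELEMENT=/IBM/ELMDM/IIS/Install/ISDataSetHeaders/path="
          "/mdmisdata03/Projects/MDMDLINT2/DATA/")
multi = ("Specify Variable Match String type and TpCd for person - C1 (default). "
         "CONFIGELEMENT="
         "/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString1/type=C~"
         "/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString1/TpCd=1")

print(parse_help_text(single))
print(parse_help_text(multi)[1])
```

For the second sample, the two values C and 1 correspond to the default C1 that the help text mentions for the QS_MATCH_PERSON_1 parameter.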
Important: When all the parameters’ Help Text CONFIGELEMENT values
have been set in the MDMIS parameter set, the CFG_Config job sequence
must be recompiled to be sure the MDMIS parameter set metadata (Help Text
column contents in particular) are available to the job sequence at runtime.
To run the CFG_Config job sequence, the MDM deployment name of the MDM
instance to update and valid database connection settings from the
MDM_CONNECTIONS parameter set must be provided, as shown in Figure 2-6.
Figure 2-6 CFG_Config job sequence run options
Using the MDM Management Console
Another way of preparing the MDM CONFIGELEMENT table for RDP is to use
the Management Console command line tool provided by MDM Server. Run the
Management Console from the MDM Server instance that points to
the CONFIGELEMENT table to be prepared.
To set the parameter from Example 2-1 on page 12 with the Management Console,
use the syntax shown in Example 2-3.
Example 2-3 Updating CONFIGELEMENT using the MDM Management Console
LOG_FILE="setparam.log"
MANAGEMENT_CONSOLE_PATH=/usr/IBM/MDM80/HD_MDM80_01292008_0230_DB2_BE01/ManagementConsole
APPLICATION_NAME='WebSphere Customer Center'
APPLICATION_VERSION=8.0.0.0
DEPLOYMENT_NAME='WebSphere Customer Center'
PARAM_PATH='/IBM/ELMDM/IIS/Install/ISDataSetHeaders/path'
PARAM_VALUE='/mdmisdata03/Projects/MDMDLINT2/DATA/'
# Update one CONFIGELEMENT item; output and errors are appended to the log file.
"$MANAGEMENT_CONSOLE_PATH/console.sh" -file \
  "$MANAGEMENT_CONSOLE_PATH/scripts/jacl/modifyConfigItem.jacl" \
  "$APPLICATION_NAME" "$APPLICATION_VERSION" "$DEPLOYMENT_NAME" \
  "$PARAM_PATH" "$PARAM_VALUE" >> "$LOG_FILE" 2>&1
The script in Example 2-3 must be customized to go through all the parameters
needing to be updated in the CONFIGELEMENT table.
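One possible way to go through several parameters is to generate one command per CONFIGELEMENT item. The Python sketch below only builds the argument lists, reusing the console.sh and modifyConfigItem.jacl invocation from Example 2-3; the function name and the dictionary of parameters are illustrative assumptions, and nothing is executed.

```python
def build_console_commands(console_path, application_name, application_version,
                           deployment_name, params):
    """Build one Management Console argument list per CONFIGELEMENT item."""
    commands = []
    for param_path, param_value in params.items():
        commands.append([
            f"{console_path}/console.sh",
            "-file",
            f"{console_path}/scripts/jacl/modifyConfigItem.jacl",
            application_name,
            application_version,
            deployment_name,
            param_path,
            param_value,
        ])
    return commands


params = {
    "/IBM/ELMDM/IIS/Install/ISDataSetHeaders/path":
        "/mdmisdata03/Projects/MDMDLINT2/DATA/",
    "/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/MatchScores/a1": "205",
}
commands = build_console_commands(
    "/usr/IBM/MDM80/HD_MDM80_01292008_0230_DB2_BE01/ManagementConsole",
    "WebSphere Customer Center", "8.0.0.0", "WebSphere Customer Center", params)
for command in commands:
    print(" ".join(command))
```

Each argument list could then be passed to subprocess.run on the MDM Server host. Keeping the arguments as a list avoids shell quoting problems with values that contain spaces, such as the application name.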
Summary
After either running the CFG_Config job sequence or updating the
CONFIGELEMENT table using the Management Console, the
DL_000_AutoStart_PS_DELTA_LOAD or DL_000_AutoStart_PS_HIERARCHY job
sequences are ready to be used.
2.2.2 DL_000_DELTA_LOAD Job Sequence
If you execute the job sequence DL_000_DELTA_LOAD or DL_200_Hierarchy,
you are responsible for creating your own MDMIS parameter value file and
providing it as input to the job sequence. You can give it a name of your choosing,
as shown in Figure 2-7.
Figure 2-7 User defined parameter set value files
Attention: In a production environment, we do not recommend the use of the
DL_000_DELTA_LOAD or DL_200_Hierarchy sequences because they
require manual modifications that might result in runtime errors or data quality
deterioration. The use of the DL_000_AutoStart_PS_DELTA_LOAD and
DL_000_AutoStart_PS_HIERARCHY (as described in 2.2.1,
“DL_000_AutoStart_PS_DELTA_LOAD Job Sequence” on page 8) ensures a
proper BATCH_ID naming convention and the use of parameters from the
CONFIGELEMENT table to stay in sync with the parameters that are used by
MDM itself.
Note: The DL_000_DELTA_LOAD or DL_200_Hierarchy sequences are
suitable for testing various options using specific configuration parameter sets
involving certain combinations of options, especially during a quality
assurance (QA) process. A more cumbersome approach is to use the
DL_000_AutoStart_PS_DELTA_LOAD and
DL_000_AutoStart_PS_HIERARCHY sequences in such cases, because you
would have to constantly modify the CONFIGELEMENT table for each
test/development run.
Unlike the DL_000_AutoStart_PS_DELTA_LOAD or
DL_000_AutoStart_PS_HIERARCHY job sequences, the following parameters
are not automatically filled in and therefore must be manually entered to execute
the RDP jobs:
• BATCH_ID
• Database parameters (starting with “DB_”)
• DS_PROCESSING_DATE
• Parameters retrieved from the CONFIGELEMENT table
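Because these values are entered by hand, a simple completeness check before a run can catch omissions. The following Python sketch is one way to do that under stated assumptions: the parameter values are available as a dictionary, DB_NAME and DB_USER are hypothetical parameter names used only for illustration, and the list of CONFIGELEMENT-linked parameters is supplied by the caller.

```python
def is_blank(value):
    """True when a parameter has no usable value."""
    return value is None or not str(value).strip()


def missing_manual_params(values, configelement_linked):
    """List the manually maintained parameters that still have no value."""
    required = {"BATCH_ID", "DS_PROCESSING_DATE"}
    required |= set(configelement_linked)
    # Database parameters are recognized by their DB_ prefix.
    required |= {name for name in values if name.startswith("DB_")}
    return sorted(name for name in required if is_blank(values.get(name)))


values = {
    "BATCH_ID": "",
    "DS_PROCESSING_DATE": "2011-04-01",
    "DB_NAME": "MDMDB",              # hypothetical parameter name
    "DB_USER": " ",                  # hypothetical parameter name
    "FS_DATA_SET_HEADER_DIR": None,
}
print(missing_manual_params(values, ["FS_DATA_SET_HEADER_DIR"]))
```

An empty result means that every parameter the AutoStart sequences would otherwise fill in has been given a value.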
When creating a new MDMIS parameter set value file, all the parameter values
are taken from the Default Value column, as shown in Figure 2-3 on page 10,
which are provided with the RDP package. The preferred approach is to update
the default values according to your needs, then create a new parameter set
value file because editing the default values in a tabular form is easier than
editing the parameter value files, which are presented on a single line, as shown
in Figure 2-7 on page 15.
See Appendix A, “Configuration parameter file” on page 275 for more details
about the parameters to be filled in.
When an MDMIS parameter set value file is ready, the DL_000_DELTA_LOAD or
DL_200_Hierarchy job sequences are ready to be used.
2.3 Standard Interface File (SIF)
The SIF is the file interface where you provide data to be loaded to the MDM
Server through RDP.
The SIF is a delimited ASCII file that contains data input to the load process. The
default delimiter is the pipe character (|). The file is a multi-record format flat file
with a record type code in the first field and a sub-record type code in the second
field following the separator.
Important: The SIF files must be in DOS format (records delimited by
<CR><LF> characters) regardless of the platform RDP is running on.
Each record type/sub-record type (also referred to as RT/ST) combination has a
unique layout (metadata). The record type identifies the primary subject areas,
which are Contact (P) and Contract (C). The Contact and Contract RT/ST
combinations are listed in Table 2-1 on page 17.
Table 2-1 Contact (P) and Contract (C) RT/ST combinations

RT/ST   Content
Contact record type (P) and sub-record type
PP      Person Contact
PO      Organization Contact
PG      Organization Name
PH      Person Name
PE      External Match
PA      Address
PC      Contact Method
PI      Identifier
PB      Line of Business Relationship
PR      Contact Relationship
PM      Person Miscellaneous Value
PS      Privacy Preference
PT      Person Alert
Contract record type (C) and sub-record type
CH      Contract
CK      Native Key
CC      Contract Component
CR      Contract Component Role
CL      Role Location
CV      Contract Component Value
CM      Contract Misc Value
CT      Contract Alert
The record layout of the SIF is as follows:
<RECORD_TYPE> | <SUBRECORD_TYPE> | <DATA><CR><LF>
In the record layout, <CR><LF> is the mandatory DOS line terminator (a
carriage return followed by a line feed).
The following considerations apply to the content of the SIF:
• The columns within the <DATA> section should be separated by a pipe
character (|), with a pipe character following the last data element. All pipe
separators must be present even if there is no data for a particular data
element.
Important: Currently, no escape character is provided if the input data
itself contains the | character. Configuring RDP to use a different delimiter
is not possible. A pipe character in the data can therefore cause errors to
be flagged by the parser in the Import SIF step. Ensure that pipe characters
in the input data are suitably managed before populating the SIF.
• The domain values of key columns in the SIF must contain the values defined
by the MDM Server. This requires transformation of domain values in the
source system to the ones used in MDM Server. For example, the domain
values for Gender in the MDM Server are M and F; the source system might
have 0 and 1. The process that is creating the SIF is responsible for mapping
the domain values appropriately.
• When a column is identified as being not nullable, a value must be provided
for it and that value cannot be null.
• The Timestamp format is configurable using a format string such as
YYYY-MM-DD.HH.MM.SS. See the IBM WebSphere DataStage and QualityStage
Version 8 Parallel Job Developer Guide, SC18-9891 for details about format
strings.
• The order of rows does not matter, because the rows will be sorted in the
proper order by the DataStage jobs.
For more details, read Appendix B, “Standard Interface File details” on page 295.
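The considerations above can be illustrated with a small SIF writer sketch in Python. It is not part of RDP: the function names, the sample key SRC1:0001, the data columns, and the 0/1 to M/F gender mapping are assumptions for illustration, and the real column layout for each RT/ST combination is defined in Appendix B. The sketch also assumes that no spaces surround the delimiter in the actual file.

```python
DELIMITER = "|"      # default SIF delimiter
LINE_END = "\r\n"    # DOS record terminator, required on every platform

# Hypothetical source coding for Gender; MDM Server expects M and F.
GENDER_MAP = {"0": "M", "1": "F"}


def format_sif_record(record_type, sub_record_type, data):
    """Build one SIF record; None becomes an empty column."""
    fields = [record_type, sub_record_type]
    fields += ["" if value is None else str(value) for value in data]
    for value in fields:
        # No escape character exists, so reject values that would break parsing.
        if DELIMITER in value or "\r" in value or "\n" in value:
            raise ValueError(f"Unsafe character in SIF field: {value!r}")
    # A delimiter also follows the last data element.
    return DELIMITER.join(fields) + DELIMITER + LINE_END


def write_sif(path, records):
    """Write records without newline translation so CRLF is preserved."""
    with open(path, "w", encoding="ascii", newline="") as sif:
        sif.writelines(records)


record = format_sif_record("P", "P", ["SRC1:0001", None, GENDER_MAP["0"]])
print(repr(record))  # 'P|P|SRC1:0001||M|\r\n'
```

Rejecting a value that contains the delimiter is only one way to manage pipe characters; replacing or removing them in the source extract is another. Opening the file with newline="" keeps the explicit carriage return and line feed intact when the file is produced on UNIX or Linux.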
2.4 Suspect Duplicate Processing
Suspect Duplicate Processing in the RDP for MDM solution can be configured
using the configuration screens in the Customer Matching Critical Data Rule user
interface, or by setting values in the MDM Server configuration and management
tables. Modifying these settings affects both the RDP for MDM Direct Database
Load and the MDM Server Business Services at the same time. The settings are
stored in the CONFIGELEMENT table in MDM, which is read by RDP at run time to
populate the QS_ parameters from the MDMIS parameter set, as explained in 2.2,
“MDMIS Parameter Set” on page 8.
2.5 Configuration screens in the MDM Server UI
The configuration screens in the MDM Server user interface (UI) permit the RDP
for MDM solution customizations, as described in the following sections.
2.5.1 Enabling and disabling Suspect Duplicate Processing
You can enable (On) or disable (Off) the Suspect Duplicate Processing option
with the buttons shown in Figure 2-8. To set the button to On or Off, from the MDM
Server UI, click Matching Critical Data Rules > Configuration Options in the
navigation pane. This setting affects the processing of both the RDP for MDM
Direct Database Load and the MDM Server Business Services.
Figure 2-8 Enabling or disabling Suspect Duplicate Processing in MDM Server UI
A change in this configuration option will enable or disable Suspect Duplicate
Processing in RDP through the MDMIS parameters listed in Table 2-2.
Table 2-2 Suspect Duplicate Processing parameters to enable matching

Parameter: QS_PERFORM_ORG_MATCH
Default: 0
Description: Perform Organization Match - 1 (true) / 0 (false)
Name in CONFIGELEMENT table: /IBM/Party/SuspectProcessing/enabled

Parameter: QS_PERFORM_PERSON_MATCH
Default: 0
Description: Perform Person Match - 1 (true) / 0 (false)
Name in CONFIGELEMENT table: /IBM/Party/SuspectProcessing/enabled
2.5.2 Selecting the set of Party Match criteria
You can configure the critical match fields to use for party matching, and set the
threshold scores to be used to categorize the action to take for a given score.
Because the critical fields for person matching and organization matching can be
configured independently, the user interface has a separate screen for each.
Configure critical matching fields for Person matching
From the MDM Server UI, click Matching Critical Data Rules > Person in the
navigation pane to view the Matching Critical Data for Person in the content pane
as shown in Figure 2-9 on page 21. You may perform the following tasks:
• Override the Minimum Match Score for each Suspect Match Category to set
the threshold scores for A1, A2, and B matches.
• Select the matching critical data fields for a person by moving the appropriate
fields from the left panes to the right panes under Matching Critical Data
Fields/Select National Identifier/Select additional Matching Fields sections.
The fields selected are Name, Address, City, State/Province, Country,
Zip/Postal Code, Gender, Birth Date, Social Security Number, Home
Telephone, Business Email, Passport Number and Mother’s Maiden Name.
Figure 2-9 Configuring the critical matching fields for Person matching
A change in these configuration options affects RDP through the MDMIS
parameters listed in Table 2-3.
Table 2-3 Suspect Duplicate Processing parameters for Person

Parameter: QS_A1_MATCH_CUTOFF_PERSON
Default: 205
Description: Specify Person A1 Minimum Match Score.
Name in CONFIGELEMENT table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/MatchScores/a1

Parameter: QS_A2_MATCH_CUTOFF_PERSON
Default: 175
Description: Specify Person A2 Minimum Match Score.
Name in CONFIGELEMENT table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/MatchScores/a2

Parameter: QS_B_MATCH_CUTOFF_PERSON
Default: 150
Description: Specify Person B Minimum Match Score.
Name in CONFIGELEMENT table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/MatchScores/b

Parameter: QS_EXCLUDE_FIELDS_FROM_MATCH_PERSON
Default: (blank)
Description: Select Critical Data Fields for Individual Match.
Name in CONFIGELEMENT table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonAddress/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonBirthDate/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonCity/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonCountry/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonGender/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString1/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString2/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString3/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString4/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonNationalID/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonPostCode/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonState/enabled=true

Parameter: QS_MATCH_PERSON_1
Default: C1
Description: Specify Variable Match String type and TpCd for person.
Name in CONFIGELEMENT table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString1/type
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString1/TpCd

Parameter: QS_MATCH_PERSON_2
Default: C3
Description: (blank)
Name in CONFIGELEMENT table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString2/type
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString2/TpCd

Parameter: QS_MATCH_PERSON_3
Default: C5
Description: (blank)
Name in CONFIGELEMENT table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString3/type
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString3/TpCd

Parameter: QS_MATCH_PERSON_4
Default: C7
Description: (blank)
Name in CONFIGELEMENT table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString4/type
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString4/TpCd

Parameter: QS_MATCH_PERSON_NATID
Default: C2
Description: Specify Variable Match NationalId for person.
Name in CONFIGELEMENT table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonNationalID/type
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonNationalID/TpCd
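As an illustration of how the three cutoff parameters might be applied, the Python sketch below assigns a match score to a Suspect Match Category. The function is hypothetical and rests on two assumptions that the text does not state explicitly: a score equal to or above a minimum qualifies for that category, and the highest qualifying category wins. The defaults are the person cutoffs from Table 2-3.

```python
def categorize_match(score, a1_cutoff=205, a2_cutoff=175, b_cutoff=150):
    """Return the Suspect Match Category for a score, or None if below B."""
    if not a1_cutoff >= a2_cutoff >= b_cutoff:
        raise ValueError("Cutoffs are expected in descending order: A1, A2, B")
    if score >= a1_cutoff:
        return "A1"
    if score >= a2_cutoff:
        return "A2"
    if score >= b_cutoff:
        return "B"
    return None  # assumed: not treated as a suspect


for score in (210, 180, 150, 149):
    print(score, categorize_match(score))
```

Raising a cutoff in the MDM Server UI therefore narrows the corresponding category for both the RDP load and the MDM Server Business Services, because both read the same CONFIGELEMENT values.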
Configure critical matching fields for Organization matching
From the MDM Server UI, click Matching Critical Data Rules > Organization
in the navigation pane to view the Matching Critical Data for Organization in the
content pane as shown in Figure 2-10. You may perform the following tasks:
• Override the Minimum Match Score for each Suspect Match Category to set
the threshold scores for A1, A2 and B matches.
• Select the matching critical data fields for an organization by moving the
appropriate fields from the left pane to the right pane under Selected
Matching Critical Data Fields/Selected National Identifier/Select additional
Matching Data fields sections. The fields selected in this case are different
from those selected for Person matching. Figure 2-10 shows the selected
fields Name, Address, City, State/Province, Country, Zip/Postal Code,
Established Date, Corporate Tax Identification, Business Telephone,
Business Email, Tax Registration Number and Tax Identification Number.
Figure 2-10 Configuring the critical matching fields for organization matching
A change in these configuration options affects RDP through the MDMIS
parameters listed in Table 2-4.
Table 2-4 Suspect Duplicate Processing parameters for Organization

Parameter: QS_A1_MATCH_CUTOFF_ORGANIZATION
Default: 205
Description: Specify Org A1 Minimum Match Score - 205 (default).
Name in CONFIGELEMENT Table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/MatchScores/a1

Parameter: QS_A2_MATCH_CUTOFF_ORGANIZATION
Default: 175
Description: Specify Org A2 Minimum Match Score - 175 (default).
Name in CONFIGELEMENT Table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/MatchScores/a2

Parameter: QS_B_MATCH_CUTOFF_ORGANIZATION
Default: 150
Description: Specify Org B Minimum Match Score - 150 (default).
Name in CONFIGELEMENT Table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/MatchScores/b

Parameter: QS_EXCLUDE_FIELDS_FROM_MATCH_ORGANIZATION
Default: (blank)
Description: Select Critical Data Fields for Organization Match.
Name in CONFIGELEMENT Table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgAddress/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgCity/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgCountry/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgState/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgPostCode/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgEstablishedDate/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString1/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString2/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString3/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString4/enabled
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgNationalID/enabled

Parameter: QS_MATCH_ORG_1
Default: I1
Description: Specify Variable Match String type and TpCd for organization.
Name in CONFIGELEMENT Table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString1/type
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString1/TpCd

Parameter: QS_MATCH_ORG_2
Default: I2
Description: (blank)
Name in CONFIGELEMENT Table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString2/type
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString2/TpCd

Parameter: QS_MATCH_ORG_3
Default: (blank)
Description: (blank)
Name in CONFIGELEMENT Table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString3/type
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString3/TpCd

Parameter: QS_MATCH_ORG_4
Default: I3
Description: (blank)
Name in CONFIGELEMENT Table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString4/type
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString4/TpCd

Parameter: QS_MATCH_ORG_NATID
Default: C2
Description: Specify Variable Match NationalId for organization.
Name in CONFIGELEMENT Table:
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgNationalID/type
/IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgNationalID/TpCd
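As a rough illustration of how the three Organization cutoff values in Table 2-4 partition match scores, the following Python sketch classifies a score against the default thresholds. The function name, the NO_MATCH label, and the rule that a score at or above a cutoff qualifies for that category are assumptions made for this example; they are not taken from the RDP jobs.

```python
# Default Organization cutoffs from Table 2-4 (QS_*_MATCH_CUTOFF_ORGANIZATION).
ORG_CUTOFFS = {"A1": 205, "A2": 175, "B": 150}


def classify_match(score, cutoffs=ORG_CUTOFFS):
    """Return the suspect match category for a score (hypothetical helper).

    Assumption: a score belongs to the highest category whose minimum
    match score it meets or exceeds.
    """
    for category in ("A1", "A2", "B"):
        if score >= cutoffs[category]:
            return category
    return "NO_MATCH"  # hypothetical label for scores below the B cutoff


if __name__ == "__main__":
    for sample in (210, 205, 180, 150, 149):
        print(sample, classify_match(sample))
```

Overriding a Minimum Match Score in the MDM Server UI would correspond to passing a different cutoffs dictionary in this sketch.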
2.6 Database Load options
RDP offers various options to load data into the MDM database. By default, the
RDP jobs use ODBC connectivity. IBM DB2® Enterprise, Oracle Enterprise,
and DB2 Bulk options are also available.
The RDP jobs support a plug-in architecture for all the database accesses.
Database stages are encapsulated in a shared container that can be easily
replaced as shown in Figure 2-11.
Figure 2-11 Shared containers containing database stages
The database shared containers are located in the DBContainers category in the
RDP DataStage project, as shown in Figure 2-12.
Figure 2-12 Database shared containers in the RDP DataStage Project
To switch from the default set of ODBC shared containers to the DB2 Enterprise
or Oracle Enterprise shared container, you must import the DataStage DSX file
containing the shared containers of your choice (supplied in the RDP package)
into the RDP DataStage project. After the new shared containers are imported,
you must recompile all the jobs that use the database shared containers.
To get a list of the jobs to recompile, you can use the Find where used option
in the DataStage Designer, as shown in Figure 2-13. Alternatively, you can
recompile all the RDP jobs with the Multiple Job Compile option.
Figure 2-13 Finding jobs to recompile
Important: When importing a separate set of database shared containers, the
existing shared containers are overwritten. If you have made customizations to
the existing database shared container, we suggest that you make a copy
(with a new name) before importing the new shared containers, then replicate
the same customizations to the newly imported shared containers where
applicable.
The MDM_COMPILATION_OPTIONS parameter set has been added to the
RDP DataStage project to make sure that the various database shared containers
provided in the RDP package work with the different databases. This parameter
set currently supports DB2 and Oracle. It is included in the DB2 Enterprise and
Oracle Enterprise database shared container packages, where it is already set
with the correct parameter values. If you plan to use the ODBC shared
containers with Oracle, however, you must update it accordingly.
The MDM_COMPILATION_OPTIONS parameter set is set by default to support
DB2, as shown in Figure 2-14.
Figure 2-14 MDM_COMPILATION_OPTIONS parameter set for DB2
To support ODBC Oracle, the MDM_COMPILATION_OPTIONS must be updated
as follows:
 Switch the SQL_TP_INT64 parameter Default Value from bigint to
number(19,0), selecting the value from the proposed list.
 Switch all the MOD_ parameters Default Value from the semi-colon (;)
character to the other value from the proposed list.
Note: The MOD_U_SIF_FILE_NAME parameter is not specific to
database support and should be changed only if your DataStage
installation does not have National Language Support (NLS)
enabled.
The resulting MDM_COMPILATION_OPTIONS parameter set is similar to the
one in Figure 2-15.
Figure 2-15 MDM_COMPILATION_OPTIONS parameter set for Oracle
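The switch of SQL_TP_INT64 from bigint to number(19,0) can be checked with a little arithmetic: the largest signed 64-bit integer has 19 decimal digits, so a 19-digit number with zero scale can hold every bigint value. The Python sketch below performs that check. The dictionary and function names are illustrative only and are not part of the RDP parameter set.

```python
# SQL_TP_INT64 values named in the text for each database.
SQL_TP_INT64 = {"DB2": "bigint", "ORACLE": "number(19,0)"}

INT64_MAX = 2**63 - 1   # largest signed 64-bit value
INT64_MIN = -(2**63)    # smallest signed 64-bit value


def fits_in_number(value, precision):
    """Return True if an integer needs no more than `precision` decimal digits."""
    return len(str(abs(value))) <= precision


if __name__ == "__main__":
    print(len(str(INT64_MAX)))            # number of digits in the largest value
    print(fits_in_number(INT64_MAX, 19))
    print(fits_in_number(INT64_MIN, 19))
    print(fits_in_number(INT64_MAX, 18))
```

A precision of 18 would be too small, which is why 19 is the minimum precision that covers the full 64-bit range.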
When the MDM_COMPILATION_OPTIONS parameter set is updated, all the
jobs using it must be recompiled to make the new values available at run time.
You can use the Find where used option shown in Figure 2-13 on page 27 to
find the jobs to compile or you can simply recompile all the RDP jobs.
The RDP database loading jobs support two modes:
 Regular database inserts and updates
 Bulk Load
Use the LOAD_METHOD parameter in the MDMIS parameter set, as shown in
Figure 2-16 on page 30, to control which type of database loading will be used at
run time. The two available options are:
 BULK
 ODBC_INSERT
Figure 2-16 LOAD_METHOD parameter in the MDMIS parameter set
Note: Despite being called ODBC_INSERT, the second option also applies to
DB2 Enterprise and Oracle Enterprise database stages in RDP. In that case,
the upsert method is used (the database stage tries to update a record first if it
already exists or insert a new record if there is no existing record to update).
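The upsert behavior described in this note can be sketched outside of DataStage. The following Python example uses the standard sqlite3 module and a hypothetical demo_contact table to show the update-first, insert-if-missing logic. It illustrates the general technique only and does not show how the RDP database stages are implemented.

```python
import sqlite3


def upsert(conn, cont_id, name):
    """Try to update the record first; insert it if no record was updated."""
    cur = conn.execute(
        "UPDATE demo_contact SET name = ? WHERE cont_id = ?", (name, cont_id)
    )
    if cur.rowcount == 0:  # no existing record to update, so insert a new one
        conn.execute(
            "INSERT INTO demo_contact (cont_id, name) VALUES (?, ?)",
            (cont_id, name),
        )
        return "inserted"
    return "updated"


def demo():
    """Run three upserts against an in-memory table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE demo_contact (cont_id INTEGER PRIMARY KEY, name TEXT)"
    )
    results = [
        upsert(conn, 1, "Ann"),
        upsert(conn, 1, "Anne"),
        upsert(conn, 2, "Bob"),
    ]
    rows = conn.execute(
        "SELECT cont_id, name FROM demo_contact ORDER BY cont_id"
    ).fetchall()
    conn.close()
    return results, rows


if __name__ == "__main__":
    print(demo())
```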
When the ODBC_INSERT load method is selected, RDP will use the following
jobs to load the records in the MDM database:
 IL_090_LD_Insert_*
 DL_090_LD_Update_*
 DL_091_LD_Update*
These jobs use the upsert method, except for the IL_090_LD_Insert_* jobs, which
use an insert+update method because they are used to insert new records and
records with the same primary key are not expected to be present in the
database.
When the BULK load method is selected, RDP will use the following generic jobs
instead of the specific jobs (stated in the previous list) for all the database inserts:
 DL_091_LD_Bulk_Common
 DL_091_LD_Bulk_ContEquiv
 DL_091_LD_Bulk_NativeKey
 DL_091_LD_Bulk_Suspect
The database updates are still made using the DL_090_LD_Update_* and
DL_091_LD_Update* jobs. The main difference between the two methods is that
rejected records are not captured from a bulk load as they are for the insert and
update loads, as shown in Figure 2-17 and Figure 2-18.
Figure 2-17 ODBC_INSERT load method
Figure 2-18 BULK load method
Important: Bulk load is available only for DB2 Enterprise and Oracle
Enterprise. When setting up the RDP jobs with the default ODBC database
stages, the DB2 Enterprise stages are used for bulk load. If you plan to use
the bulk load option with ODBC for Oracle, you must import the
DLDBBLKTABLE, DLDBBLKHISTORY, and DLDBBLKTRUNCTABLE shared
containers from the Oracle Enterprise RDP package.
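The relationship between the LOAD_METHOD value and the load jobs can be summarized in a small lookup. The Python sketch below restates the job name patterns given in this section. The function and dictionary names are hypothetical, and the sketch is a reading aid rather than part of the RDP job control logic.

```python
# Job name patterns as listed in the text for each LOAD_METHOD value.
UPDATE_JOBS = ["DL_090_LD_Update_*", "DL_091_LD_Update*"]
INSERT_JOBS = {
    "ODBC_INSERT": ["IL_090_LD_Insert_*"],
    "BULK": [
        "DL_091_LD_Bulk_Common",
        "DL_091_LD_Bulk_ContEquiv",
        "DL_091_LD_Bulk_NativeKey",
        "DL_091_LD_Bulk_Suspect",
    ],
}


def load_jobs(load_method):
    """Return the insert and update job patterns for a LOAD_METHOD value."""
    if load_method not in INSERT_JOBS:
        raise ValueError(f"unknown LOAD_METHOD: {load_method!r}")
    return {
        "insert": INSERT_JOBS[load_method],
        "update": UPDATE_JOBS,  # updates use the same jobs in both modes
        # Per the text, rejected records are not captured out of a bulk load.
        "captures_insert_rejects": load_method == "ODBC_INSERT",
    }


if __name__ == "__main__":
    for method in ("ODBC_INSERT", "BULK"):
        print(method, load_jobs(method))
```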
Chapter 3. RDP MDM: Direct Load
This chapter presents a high-level description of the Direct Load process and the
Rapid Deployment Package (RDP) components that are used in that process.
These components consist of various DataStage and QualityStage assets.
The Direct Load process categories help to organize the presentation of the
high-level descriptions of the related DataStage and QualityStage assets. For a
more detailed description of these components, see the InfoSphere
MDM RDP HVL Operations Guide, which is included in the distribution of the
MDM Server RDP assets.
3.1 Direct Load process
The Direct Load process provides the means to process incoming Standard
Interface File (SIF) data into the MDM Server entities quickly, using the
parallel processing capabilities of DataStage. The Direct Load process can be used
to perform initial, incremental, and operational (delta) data loading processes.
The Direct Load process consists of seven distinct processing categories, as
depicted in Figure 3-1. The incoming SIF data is organized into two sets: one that
is related to the regular MDM Server database entities, and another that is related
to the hierarchies. The loading of the hierarchy SIF data is similar and therefore is not
presented here.
Figure 3-1 Direct Load process
[Figure labels: SIF File(s); Job Control; Import SIF; Data Quality Assurance; Data Quality Error Consolidation & Reporting; Matching; ID Assignment; Data Insert and Update; MDM data repository]
The seven processing categories are as follows:
 Job Control
Includes processes that compile and update processing parameters from the
MDM Server CONFIGELEMENT table, ensure that processing structures
exist, and invoke parameter-specific jobs.
 Import
Consists of the DataStage assets and elements that are used to import the
SIF data for processing.
 Data Quality Assurance
Consists of the DataStage assets that perform the following functions:
– Validate code column values.
– Perform parameter-driven standardization of specific column values.
– Validate the referential integrity (RI) of incoming data records.
– Format related error messages.
 Data Quality Error Consolidation / Reporting
Consists of the DataStage assets that consolidate the error message files
created by the Data Quality Assurance processes and drop incoming data
records with associated errors.
 Matching
Consists of the parameter-driven DataStage and QualityStage assets that identify
specific match suspects for subsequent MDM Services processing.
 ID Assignment
Consists of the DataStage assets that perform the following tasks:
– Derive internal record identifiers to assist in processing.
– Use surrogate keys to aid in the creation of database record identifiers.
– Determine whether an incoming record represents a record insertion or an
update to an existing record.
 Data Insert and Update
This category consists of DataStage assets that either insert or update data
records into the MDM Server database. It includes the parameter-driven
DataStage assets that provide bulk loading of records into the MDM Server
database.
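The insert-or-update decision made in the ID Assignment category can be pictured as a lookup of each incoming record against the identifiers that already exist in the database. The Python sketch below is a simplified illustration under assumed data shapes: the source_key and record_id field names and the dictionary of existing identifiers are hypothetical and do not reflect the actual RDP data sets.

```python
def split_inserts_updates(incoming, existing_ids):
    """Partition incoming records into inserts and updates.

    incoming:     list of dicts, each with a hypothetical "source_key" field
    existing_ids: dict mapping source keys already in the database to
                  their database record identifiers
    """
    inserts, updates = [], []
    for rec in incoming:
        record_id = existing_ids.get(rec["source_key"])
        if record_id is None:
            inserts.append(rec)  # no existing record: treat as an insertion
        else:
            # existing record found: carry its identifier into the update
            updates.append({**rec, "record_id": record_id})
    return inserts, updates


if __name__ == "__main__":
    existing = {"CRM-1": 1001}
    batch = [{"source_key": "CRM-1"}, {"source_key": "CRM-2"}]
    print(split_inserts_updates(batch, existing))
```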
3.2 Job Control
This processing category consists of the MDM RDP Direct Load components
that perform the following tasks:
 Establish values in the MDM Server CONFIGELEMENT table.
 Compile parameter values from the CONFIGELEMENT table at run time.
 Ensure that files exist prior to processing.
 Invoke jobs based on existence of data or parameter values.
The components in this category generally represent pre-processing and
post-processing activities. They represent the jobs that are run to control the
overall processing of SIF data.
Table 3-1 identifies the related Job Control DataStage assets that are involved in
the processing (pre or post) of SIF formatted data.
Table 3-1 SIF Job Control DataStage assets
DataStage asset
Pre- or
Post-
Description
CFG_Config
Pre-
This job sequence invokes the CFG_Update_CM_Params
job. Limit the use of this job sequence; running it on a
regular basis is not recommended.
CFG_Update_CM_Params
Pre-
This job uses the comment portion of the MDMIS parameter
value set to load values into the CONFIGELEMENT table.
DL_000_AutoStart_PS_DELTA_LOAD
Pre-
This job sequence is used in production to invoke the
IL_000_PS__Prestart job sequence, which builds a new
VOLATILE_MDMIS parameter value set, and then to invoke
the DL_000_DELTA_LOAD job sequence with the new
VOLATILE_MDMIS parameter value set.
IL_000_PS__Prestart
Pre-
This job sequence is invoked by
DL_000_AutoStart_PS_DELTA_LOAD,
DL_000_AutoStart_PS_HIERARCH, and the
IL_000_AutoStart_EX sequences. The sequence invokes
the IL_000_PS_SF_Create, IL_000_PS_Set_BatchID,
IL_000_PS_Gen_Volatile_Par_Set and
IL_000_PS_Stage_ErrReasonTbl jobs, based on specific
conditions described for each job.
IL_000_PS_SF_Create
Pre-
This job is invoked if the BATCH_ID.sf file exists in the
directory identified in the MDMIS parameter value set
FS_SK_FILE_DIR parameter.
IL_000_PS_Set_BatchID
Pre-
This job compiles the TEMP_MDMIS file from two input files.
The first input is the file identified by the MDMIS parameter
FS_PARAM_SET_DIR concatenated with MDM_CONNECTIONS/ and
the MDM_CONNECTIONS_VALUE_FILE_NAME parameter. The
second input is the file identified by FS_PARAM_SET_DIR
concatenated with MDMIS/STATIC_MDMIS. The output is the
file identified by FS_PARAM_SET_DIR concatenated with
MDMIS/TEMP_MDMIS.
IL_000_PS_Gen_Volatile_Par_Set
Pre-
This job uses parameter values from the
CONFIGELEMENT table, the TEMP_MDMIS file compiled
in the IL_000_PS_Set_BatchID job, and the file identified in
the PARAMS_FILE job parameter to compile a new
VOLATILE_MDMIS parameter value set and to archive the
previous VOLATILE_MDMIS parameter value set.
IL_000_PS_Stage_ErrReasonTbl
Pre-
This job extracts values from the ERRREASON table to derive
parameter values for the DROP_ON_PRVBY_ERR,
DROP_ON_FROM_ERR,
DROP_ON_ASSIGNEDBY_ERR,
DROP_ON_REPLBY_ERR and
RESET_ON_ASSIGNEDBY_ERR parameters. It adds these
values, along with the contents of the file identified by the
MDMIS parameter set FS_PARAM_SET_DIR parameter value
concatenated with /MDM_EC/DEFAULT, to the
VOLATILE_MDMIS parameter value set.
DL_000_DELTA_LOAD
This sequence controls the order of processing SIF data,
invoking job sequences and jobs based on the existence of
data sets and parameter values. The jobs invoked in this job
sequence are described in their related processing
category descriptions. This sequence invokes the
IL_061__AI_SK_State_File_Control and the
IL_062_AI_CM_SK_Control job sequences based on the
value of the MANUAL_STATE_FILE_CONTROL
parameter.
IL_061__AI_SK_State_File_Control
Pre-
This job sequence invokes the IL_061_AI_SF_Delete and
IL_061_AI_SF_Create jobs before the ID Assignment
category of jobs is invoked.
IL_061_AI_SF_Delete
Pre-
This job deletes the surrogate key files.
IL_061_AI_SF_Create
Pre-
This job creates surrogate key files utilizing next key values
extracted from the CONFIGELEMENT table.
IL_062_AI_CM_SK_Control
Post-
This job updates the next key values in the
CONFIGELEMENT table after the ID Assignment category
jobs successfully complete.
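The surrogate key jobs in Table 3-1 follow a create, assign, and write-back cycle: key state is created from the next key value, keys are handed out during ID Assignment, and the next key value is stored again afterward. The Python sketch below mimics that cycle with in-memory stand-ins. The class, the function, and the next_contact_id entry are hypothetical, and a plain dictionary takes the place of the CONFIGELEMENT table.

```python
class SurrogateKeyState:
    """In-memory stand-in for a surrogate key state file (illustrative only)."""

    def __init__(self, next_value):
        self.next_value = next_value

    def next_key(self):
        """Hand out the current key and advance the state by one."""
        key = self.next_value
        self.next_value += 1
        return key


def run_id_assignment(config, key_name, record_count):
    """Mimic the create / assign / write-back cycle described in Table 3-1.

    config stands in for the CONFIGELEMENT table as a plain dict;
    key_name is a hypothetical entry that holds the next key value.
    """
    state = SurrogateKeyState(config[key_name])              # cf. IL_061_AI_SF_Create
    keys = [state.next_key() for _ in range(record_count)]   # ID Assignment jobs
    config[key_name] = state.next_value                      # cf. IL_062_AI_CM_SK_Control
    return keys


if __name__ == "__main__":
    config = {"next_contact_id": 1000}
    print(run_id_assignment(config, "next_contact_id", 3), config)
```

Writing the next key value back only after the assignment step finishes mirrors the table's statement that IL_062_AI_CM_SK_Control runs after the ID Assignment jobs complete successfully.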
3.3 Import
This processing category consists of those MDM RDP Direct Load components
that are used to import records from SIF formatted text files and output related
error logs. The DataStage assets provide specific functionality to perform the
following tasks:
 Extract existing record identifiers from the MDM server database.
 Initialize temporary data structures for downstream processes.
 Parse incoming SIF data into specific data sets.
 Capture and record specific data or file format errors.
The set of DataStage assets that imports the SIF formatted data consists of one
DataStage job (DL_010_IS_Import_SIF) and a number of DataStage shared
containers. These shared containers are much like shared libraries. They are
used to provide database specific functionality and practical structures for the
handling of incoming data structures. The DataStage import shared containers
described below each contain functionality specific to a Record Type/Sub Type
(RT/ST) defined in the SIF Mapping Specification spreadsheet.
The DataStage job (DL_010_IS_Import_SIF) requires an MDMIS parameter
value set as input. The job uses the parameters for the following reasons:
 To connect to the MDM Server database
 To determine behavior for handling of specific error conditions related to
parsing (creation of records and handling of column values)
 For directory path locations to find incoming SIF files
 For directory path locations to output intermediate data sets
 For directory path locations to output any related error logs
 For the related batch identifier for data set and error log file naming
Table 3-2 has a brief description of the associated DataStage assets. The
description of each DataStage import shared container (DLIS) identifies the input
data format that it handles and the naming convention of the resulting data set.
Table 3-2 SIF Import DataStage assets
DataStage asset
Description
DL_010_IS_Import_SIF
This DataStage job uses the assets listed here to import
related SIF data, create data sets and capture and record
data file / format errors.
DLISAddress
Import Shared Container
 SIF Data: LocationGroup_AddressGroup_Address
 RT/ST: PA
 Resulting Data Set: Parse_Address
DLISAlert
Import Shared Container
 SIF Data: Alert
 RT/ST: PT and CT
 Resulting Data Sets: Parse_Alert_Party and
Parse_Alert_Contract
DLISContact
Import Shared Container
 SIF Data: Contact
 RT/ST: PP and PO
 Resulting Data Set: Parse_Contact
DLISContactMethod
Import Shared Container
 SIF Data:
LocationGroup_ContactMethodGroup_ContactMethod
 RT/ST: PC
 Resulting Data Set: Parse_ContactMethod
DLISContactRel
Import Shared Container
 SIF Data: ContactRel
 RT/ST: PR
 Resulting Data Set: Parse_ContactRel
DLISContract
Import Shared Container
 SIF Data: Contract
 RT/ST: CH
 Resulting Data Set: Parse_Contract
DLISContractComponent
Import Shared Container
 SIF Data: ContractComponent
 RT/ST: CC
 Resulting Data Set: Parse_ContractComponent
DLISContractCompVal
Import Shared Container
 SIF Data: ContractCompVal
 RT/ST: CV
 Resulting Data Set: Parse_ContractCompVal
DLISContractRole
Import Shared Container
 SIF Data: ContractRole
 RT/ST: CR
 Resulting Data Set: Parse_ContractComponentRole.
DLISExternalMatch
Import Shared Container
 SIF Data: ExternalMatch
 RT/ST: PE
 Resulting Data Set: Parse_ExternalMatch
DLISIdentifier
Import Shared Container
 SIF Data: Identifier
 RT/ST: PI
 Resulting Data Set: Parse_Identifier
DLISLOBRel
Import Shared Container
 SIF Data: LobRel
 RT/ST: PB
 Resulting Data Set: Parse_LOBRel
DLISMiscValue
Import Shared Container
 SIF Data: MiscValue
 RT/ST: PM and CM
 Resulting Data Sets: Parse_MiscValue_Party and
Parse_MiscValue_Contract
DLISNativeKey
Import Shared Container
 SIF Data: NativeKey
 RT/ST: CK
 Resulting Data Set: Parse_NativeKey
DLISOrgName
Import Shared Container
 SIF Data: OrgName
 RT/ST: PG
 Resulting Data Set: Parse_OrgName
DLISPersonName
Import Shared Container
 SIF Data: Person Name, Person Search
 RT/ST: PH
 Resulting Data Set: Parse_PersonName
DLISPrivPref
Import Shared Container
 SIF Data: PPrefEntity_PrivPref
 RT/ST: PS
 Resulting Data Set: Parse_PrivPref
DLISRoleLocation
Import Shared Container
 SIF Data: Role Location
 RT/ST: CL
 Resulting Data Set: Parse_RoleLocation
DLDBSELCONTEQUIV
Database Shared Container.
Retrieves record identifiers from CONTEQUIV and
CONTACT database tables.
DLDBINSADMINCLIENT
Database Shared Container
Records record identifiers to the IS_ADMINCLIENT
database table.
DLDBSELNATIVEKEY
Database Shared Container
Retrieves record identifiers from CONTRACT and
NATIVEKEY database tables.
DLDBINSADMINCONTRACT
Database Shared Container
Records record identifiers to the IS_ADMINCONTRACT
database table.
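The routing of SIF records to data sets by RT/ST code, as listed in Table 3-2, can be sketched as a dictionary lookup. The Python example below uses the codes and data set names from the table. The record layout is an assumption made for illustration: it treats each line as pipe-delimited with the two-character RT/ST code in the first field, which the text does not specify. The pairing of PT and PM with the Party data sets and CT and CM with the Contract data sets is also inferred from the names.

```python
from collections import defaultdict

# RT/ST codes and resulting data sets from Table 3-2.
RT_ST_TO_DATASET = {
    "PA": "Parse_Address",
    "PT": "Parse_Alert_Party",
    "CT": "Parse_Alert_Contract",
    "PP": "Parse_Contact",
    "PO": "Parse_Contact",
    "PC": "Parse_ContactMethod",
    "PR": "Parse_ContactRel",
    "CH": "Parse_Contract",
    "CC": "Parse_ContractComponent",
    "CV": "Parse_ContractCompVal",
    "CR": "Parse_ContractComponentRole",
    "PE": "Parse_ExternalMatch",
    "PI": "Parse_Identifier",
    "PB": "Parse_LOBRel",
    "PM": "Parse_MiscValue_Party",
    "CM": "Parse_MiscValue_Contract",
    "CK": "Parse_NativeKey",
    "PG": "Parse_OrgName",
    "PH": "Parse_PersonName",
    "PS": "Parse_PrivPref",
    "CL": "Parse_RoleLocation",
}


def route_sif_lines(lines, delimiter="|"):
    """Group lines by target data set; collect unknown codes as errors.

    Assumed layout (hypothetical): delimiter-separated fields with the
    RT/ST code in the first field.
    """
    datasets = defaultdict(list)
    errors = []
    for line_no, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(delimiter)
        target = RT_ST_TO_DATASET.get(fields[0])
        if target is None:
            errors.append((line_no, f"unknown RT/ST code {fields[0]!r}"))
        else:
            datasets[target].append(fields[1:])
    return dict(datasets), errors


if __name__ == "__main__":
    sample = ["PA|1|Main St", "PH|1|Smith", "ZZ|x"]
    print(route_sif_lines(sample))
```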
3.4 Data Quality Assurance
This processing category consists of those MDM RDP Direct Load components
that perform the following tasks:
 Validate code column values against MDM Server database tables.
 Provide parameter-driven standardization of specific column values.
Standardization processes use the same QualityStage rule sets that are used by
the MDM Server run time as part of transaction processing.
 Provide parameter-driven phonetic generation of specific column values.
Phonetic generation processes use the same QualityStage rule sets that are used by
the MDM Server run time as part of transaction processing.
 Validate the referential integrity of internal record references.
 Generate related error message log files.
 Load column values from existing database records to related incoming
records.
The execution of these particular DataStage assets is preceded by the
execution of the DL_015_II__InternalID_Party and
DL_016_II__InternalID_Contract job sequences to establish internal identifiers.
See 3.7, “ID assignment” on page 50.
The DataStage assets used for SIF Data Quality Assurance processing consist
of two job sequences that invoke the 19 related DataStage jobs. Within these
jobs, DataStage and, in some instances, QualityStage assets are employed.
Table 3-3 lists the related job sequences and jobs. Each job description identifies
the related QualityStage assets along with the associated input and output data
sets.
Each DataStage job described in Table 3-3 provides Edit Point shared containers
so that the user can customize the data quality assurance processing.
The naming convention for these shared containers is Extension Point for Custom
Validation or Standardization (EPCVS) followed by the incoming data type, such
as Address, Alert, and so on.
Note: Referential integrity validation for some of the Contract data sets can
only occur after Party (Contact) ID assignment processing has completed
successfully.
Table 3-3 SIF Data Quality Assurance assets
DataStage asset
Description
DL_020__VS_RI_EC_PARTY
This job sequence invokes the 12 DataStage
jobs that follow, not necessarily in the order
presented.
DL_020_VS_Address
 Input data sets: Parse_Address and RI_Contact_SUBSET
 QualityStage Standardization Rule Sets: COUNTRY, USPREP, CAPREP, MNADKEY,
MNSPOST, MDMUSADDR, MDMUSAREA, MDMCAADDR, MDMCAAREA
 Output data sets: Address_Validated and AddressExisting
DL_020_VS_Alert_Party
 Input data sets: Parse_Alert_Party and RI_Contact_SUBSET
 Output data sets: PARTY_Alert_Validated and PARTY_AlertExisting
DL_020_VS_Contact
 Input data set: Parse_Contact
 Output data sets: Contact_Validated, RI_Contact_SUBSET and Existing_Contacts
DL_020_VS_ContactMethod
 Input data sets: Parse_ContactMethod and RI_Contact_SUBSET
 QualityStage Standardization Rule Sets: MNPHONE
 Output data sets: ContactMethod_Validated and ExistingContactMethod
DL_020_VS_ContactRel
 Input data sets: Parse_ContactRel and RI_Contact_SUBSET
 Output data sets: ErrCon_ContactRel_0 and ContactRelExisting
DL_020_VS_Identifier
 Input data sets: Parse_Identifier and RI_Contact_SUBSET
 Output data sets: ErrCon_Identifier_0 and IdentifierExisting
DL_020_VS_LOBRel
 Input data sets: Parse_LOBRel and RI_Contact_SUBSET
 Output data sets: LOBRel_Validated and LOBRelExisting
DL_020_VS_MiscValue_Party
 Input data sets: Parse_MiscValue_Party and RI_Contact_SUBSET
 Output data sets: PARTY_MiscValue_Validated and PARTY_MiscValueExisting
DL_020_VS_Orgname
 Input data sets: Parse_OrgName and RI_Contact_SUBSET
 QualityStage Standardization Rule Sets: MNNAME
 Output data sets: OrgName_RIValidate and ExistingOrgName
DL_020_VS_PersonName
 Input data sets: Parse_PersonName and RI_Contact_SUBSET
 QualityStage Standardization Rule Sets: MNNAME and MNNMKEYS
 Output data sets: PersonName_RIValidated and PersonNameExisting
DL_020_VS_PrivPref
 Input data sets: Parse_PrivPref and RI_Contact_SUBSET
 Output data sets: PrivPref_Validated and PrivPrefExisting
DL_030_RI_Contact_Person_Org
 Input data sets: Contact_Validated, PersonName_RIValidated,
OrgName_RIValidated and RI_Contact_SUBSET
 Output data sets: Contact_Reference and ErrCon_Contact_0
DL_021__VS_EC_CONTRACT
This job sequence invokes the seven
DataStage jobs that follow, not necessarily in
the order presented.
DL_021_VS_Alert_Contract
 Input data sets: Parse_Alert_Contract and RI_Contract_SUBSET
 Output data sets: CONTRACT_Alert_Validated and CONTRACT_AlertExisting
DL_021_VS_Contract
 Input data sets: Parse_Contract and INPUT_CONTRACT_MASTER
 Output data sets: Contract_Reference, ErrCon_Contract_0 and
RI_Contract_SUBSET
DL_021_VS_ContractComponent
 Input data sets: Parse_ContractComponent and RI_Contract_SUBSET
 Output data sets: ContractComponent_Validated,
ContractComponentExisting and ContractComponent_For_RI_Validation
DL_021_VS_ContractCompVal
 Input data sets: Parse_ContractCompVal,
ContractComponent_For_RI_Validation and RI_Contract_SUBSET
 Output data sets: ContractCompValExisting and ContractCompVal_Validated
DL_021_VS_ContractRole
 Input data sets: Parse_ContractComponentRole,
ContractComponent_For_RI_Validation, RI_Contract_SUBSET,
Insert_CONTEQUIV and RI_Contact_SUBSET
 Output data sets: ContractComponentRoleExisting,
ContractComponentRole_For_RI_Validation and ContractRole_Validated
DL_021_VS_MiscValue_Contract
 Input data sets: Parse_MiscValue_Contract and RI_Contract_SUBSET
 Output data sets: CONTRACT_MiscValueExisting and
CONTRACT_MiscValue_Validated
DL_021_VS_RoleLocation
 Input data sets: Parse_RoleLocation, RI_Contract_SUBSET,
ContractComponent_For_RI_Validation,
ContractComponentRole_For_RI_Validation, RI_Contact_SUBSET and
RISUBSET_ADDRESSGROUP
 Output data sets: RoleLocationExisting and RoleLocation_Validated
3.5 Data Quality Error Consolidation / Reporting
This processing category consists of those MDM RDP Direct Load components
that consolidate error logs generated from both the Import and Data Quality
Assurance processes and drop records associated with records in error.
The DataStage assets used for SIF data quality error consolidation and reporting
processing consist of similar job sequences for Party and Contract data. These
job sequences (DL_040_EC_Party and DL_040_EC_Contract) each perform the
following steps:
 Set up data sets to run in a loop (IL_040_EC_Party_Initial and
IL_040_EC_Contract_Initial).
 Execute iterative activities (IL_040_EC_Party_Iterative_Drop and
IL_040_EC_Contract_Iterative_Drop) that identify and drop records associated
with records that were identified earlier as containing errors. A processing
parameter (DS_DROP_MAX_ITERATIONS) determines how many times to loop.
 Perform a last step after exiting the iterative loop
(IL_040_EC_Party_Last_Drop and IL_040_EC_Contract_Last_Drop).
 Examine the error report results (DL_041_EC_Error_Check), using processing
option parameters to determine whether to abort
(DS_SIF_ERROR_THRESHOLD) or email the error report
(DS_EMAIL_ERROR_CHECK_REPORT and
DS_EMAIL_ERROR_CHECK_DISTRIBUTION).
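The iterative drop of records that are associated with records in error behaves like a bounded transitive closure. The Python sketch below illustrates the idea with a hypothetical data shape (a dictionary that maps each record identifier to the identifiers it depends on). The max_iterations argument plays the role of DS_DROP_MAX_ITERATIONS. This is a conceptual illustration, not a reproduction of the RDP jobs.

```python
def iterative_drop(records, initial_errors, max_iterations):
    """Drop records in error and, iteratively, the records that depend on them.

    records:        dict mapping record id -> set of ids it depends on
                    (hypothetical shape)
    initial_errors: ids that are already known to be in error
    max_iterations: upper bound on the number of loop passes
    """
    to_drop = set(initial_errors)
    for _ in range(max_iterations):
        newly_dropped = {
            rid
            for rid, deps in records.items()
            if rid not in to_drop and deps & to_drop
        }
        if not newly_dropped:  # nothing new was found, so stop early
            break
        to_drop |= newly_dropped
    kept = {rid for rid in records if rid not in to_drop}
    return to_drop, kept


if __name__ == "__main__":
    recs = {"A": set(), "B": {"A"}, "C": {"B"}, "D": set()}
    print(iterative_drop(recs, {"A"}, 5))
    print(iterative_drop(recs, {"A"}, 1))
```

With a low iteration limit, records that are only indirectly associated with an error (C in this example) are not reached by the loop, which suggests why a final step after the loop is useful.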
Table 3-4 presents the related assets with a description consisting of the
associated input and output data sets.
Table 3-4 Data Quality Error Consolidation / Reporting DataStage assets
DataStage asset
Description
DL_040_EC_Party
This job sequence invokes the Party-related
jobs as described earlier.
Parameter Input: MDMIS
IL_040_EC_Party_Initial
 Input data sets: PARTY*_VS_ERR_MSGS,
PARTY_SIF_Import_IID_ERR_MSGS and PARTY*_RI_ERR_MSGS
 Output data sets: PartiesToDrop_0 and PARTY_ErrCon
IL_040_EC_Party_Iterative_Drop
 Parameter Input: Nth and Mth
 Input data sets: PartiesToDrop_#Nth, ErrCon_Contact_#Nth,
ErrCon_ContactRel_#Nth and ErrCon_Identifier_#Nth
 Output data sets: PartiesToDropFinal, ErrCon_Contact_M,
ErrCon_ContactRel_M, ErrCon_Identifier_M, PARTY_ErrCon,
PartiesToDrop_#Mth and PartyDropCount_#Nth
IL_040_EC_Party_Last_Drop
 Input data set: PartiesToDropFinal
 DataStage Shared Container: ILECDropByAssociation
 Output data set: ErrCon_<SIF Record Type>
ILECDropByAssociation
Shared container used to drop associated
records for identified input data sets.
 Parameter Input:
VALID_INPUT_RECS_DS,
VALID_OUTPUT_RECS_DS,
SSK_FIELD_ONE, SSK_FIELD_TWO
 Input data set: PartiesToDropFinal
 Output: VALID_OUTPUT_RECS_DS
(ErrCon_<SIF Record Type>) and
PARTY_ErrCon.
DL_040_EC_Contract
This job sequence invokes the Contract-related
jobs as described earlier.
Parameter Input: MDMIS
IL_040_EC_Contract_Initial
 Input data sets: CONTRACT*_VS_ERR_MSGS,
CONTRACT_SIF_Import_IID_ERR_MSGS and CONTRACT*_RI_ERR_MSGS
 Output data sets: ContractsToDrop_0 and CONTRACT_ErrCon
IL_040_EC_Contract_Iterative_Drop
 Parameter Input: Nth and Mth
 Input data sets: ContractsToDrop_#Nth and ErrCon_Contract_#Nth
 Output data sets: ContractsToDropFinal, CONTRACT_ErrCon,
ContractsToDrop_#Mth and ContractDropCount_#Nth
IL_040_EC_Contract_Last_Drop
 Input data set: ContractsToDropFinal
 DataStage Shared Container: ILECDropByAssociation (described above)
 Output data set: ErrCon_<SIF Record Type>
DL_041_EC_Error_Check
This job is invoked by both the Party and Contract
job sequences.
 Parameter Input: ERROR_TYPE (PARTY or CONTRACT)
 Input data sets: SIF_Import_ERR_MSGS and #ERROR_TYPE#_ErrCon
 Output: ERROR_TYPE#ErrorThresholdReport
(MDMIS FS_LOG_DIR parameter)
3.6 Matching
This processing category consists of parameter-driven MDM RDP Direct Load
components that perform match processing, identifying Suspect records for
subsequent MDM Service activities. Match processing is invoked in the job
sequence when the MDMIS parameter set QS_PERFORM_ORG_MATCH and
QS_PERFORM_PERSON_MATCH parameter values are set to 1. The
QualityStage rule sets used in these matching jobs are also used by the MDM
Server run time as part of transaction processing.
Table 3-5 lists the related assets with a description that consists of the
associated input and output data sets.
Table 3-5 Matching DataStage assets
DataStage asset
Description
DL_050_MA__Match
This job sequence invokes the related DataStage jobs passing
the MDMIS parameter value set to each.
DL_051_MA_Prep
 Input: ErrCon_Contact, ErrCon_PersonName,
ErrCon_OrgName, ErrCon_Address,
ErrCon_ContactMethod, ErrCon_Identifier,
INPUT_PARTY_MASTER, PersonNameExisting,
OrgNameExisting, AddressExisting,
ContactMethodExisting and IdentifierExisting
 Output: Direct_Update, Ids_For_Remove_Suspects,
Contact_To_Match, PersonName_InternalId_Errors,
PersonName_To_Match, OrgName_To_Match,
OrgName_InternalId_Errors, Address_To_Match,
ContactMethods_To_Match and Identifier_To_Match
DL_052_MA_Prep_Candidates
 Input: Contact_To_Match, PersonName_To_Match,
OrgName_To_Match, Address_To_Match,
ContactMethods_To_Match and Identifier_To_Match
 Output: IncomingPersonRecords_for_Match,
IncomingOrgRecords_for_Match and
Match045_ERR_MSGS
 Database Shared Containers: DLDBINTEMPPERSON and
DLDBINTEMPORG
DLDBINTEMPPERSON
Write records to the IS_PERSONTEMPTABLE table used
when LOAD MODE is set to DELTA.
DLDBINTEMPORG
Write records to the IS_ORGTEMPTABLE table used when
LOAD MODE is set to DELTA.
DL_053_MA_Org_Candidate_Sel
 Input: IncomingOrgRecords_for_Match
 Output: OrganizationCandidates_for_Match
 Database Shared Container: DLDBSELCANDIDATES
DLDBSELCANDIDATES
DataStage shared container that returns records from the
database using SQL queries that are built dynamically.
DL_053_MA_Person_Candidate_Sel
 Input: IncomingPersonRecords_for_Match
 Output: PersonCandidates_for_Match
 Database Shared Container: DLDBSELCANDIDATES (see
DL_053_MA_Org_Candidate_Sel)
DL_054_MA_Org
 Input: IncomingOrgRecords_for_Match and
OrganizationCandidates_for_Match
 Output: MatchOrgA1 and MatchOrgNonA1
 Matching Shared Container: DLMAOrganization
DLMAOrganization
Matching shared container consisting of various QualityStage
assets used in performing Organizational matching returning
A1 and NON-A1 matches to the invoking process.
 Input: IncomingOrgRecords_for_Match,
OrganizationCandidates_for_Match,
RefOrgTransMatchFreq, RefOrgCandidateMatchFreq and
OrgMatchFreqForRollUp,
 Output: MatchOrgA1, MatchOrgNonA1,
OrgRefMatchDebugFile and OrgDedupMatchDebugFile.
DL_054_MA_Person
 Input: IncomingPersonRecords_for_Match and
PersonCandidates_for_Match
 Output: MatchPersonA1 and MatchPersonNonA1
 Matching Shared Container: DLMAPerson
DLMAPerson
Matching shared container consisting of various QualityStage
assets used in performing Person matching returning A1 and
NON-A1 matches to the invoking process.
 Input: IncomingPersonRecords_for_Match,
PersonCandidates_for_Match, RefPersonTransMatchFreq,
RefPersonCandidateMatchFreq and PersonMatchFreq
 Output: MatchPersonA1, MatchPersonNonA1,
PersonRefMatchDebugFile, PersonDedupMatchDebugFile
and PersonMatchFreqForRollUp.
DL_055_MA_LOB
 Input: MatchPersonA1, MatchPersonNonA1,
ErrCon_LOBRel, MatchOrgA1, MatchOrgNonA1,
ErrCon_ContactRel and INPUT_PARTY_MASTER
 Output: LOB084_ERR_MSGS, LOBProcessing,
SuspectProcessing and ToImpliedMatches
 Matching Shared Containers: DLMALOBRel, DLMALOBa
and DLMALOBb
 Database Shared Container: DLDBINNEWMATCHCONTID
DLMALOBRel
This matching shared container is invoked twice (once for
Person and once for ORG) and returns LOB matches to the
invoking process.
 Input: A1Match and LOBMatch
 Output: LnkAllLOBs
DLMALOBa
Matching shared container consisting of various QualityStage
assets used in performing Person LOB matching returning
linked LOB matches to the invoking process.
 Input: LinkedDBInput and LOBMatchFreq
 Output: LnkXfmOut (LOBPersMatches)
DLMALOBb
Matching shared container consisting of various QualityStage
assets used in performing Org LOB matching returning linked
LOB matches to the invoking process.
 Input: LinkedDBInput and LOBMatchFreq
 Output: LnkXfmOut (LOBOrgMatches)
DLDBINNEWMATCHCONTID
This Database Shared container writes records to the
IS_NEWMATCHCONTID database table.
DL_056_MA_Gen_Implied
This job identifies implied suspects from the DL_055_MA_LOB
job, producing a data set that is used in subsequent processing.
 Input: ToImpliedMatches
 Output: GEND_IMPLIED_SUSPECTS.
3.7 ID assignment
This processing category consists of those MDM RDP Direct Load components
that perform the following tasks:
 Create an internal identifier (described below) for each new incoming record.
 Identify internal identifiers for incoming records that correspond to an
existing database record.
 Determine if an incoming record represents an update to an existing record.
 Generate record identifiers for each database table record to be inserted.
For each incoming SIF data record, a surrogate key is generated and used as an
internal identifier during processing, called an internal ID. If a SIF data record
identifier is found in the MDM Server database the value for the CONT_ID field is
populated, otherwise it is left null. The internal ID is integral in the processing of
the SIF data record up until specific table identifiers are generated for the related
records. The creation and assignment of a record identifier differ only slightly for
each record type.
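The internal ID assignment described above can be pictured with a minimal Python sketch. It is an illustration only, not the DataStage implementation: the record shape, the counter used as a surrogate key generator, and the lookup dictionary standing in for the MDM Server database are all assumptions.

```python
from itertools import count


def assign_internal_ids(sif_keys, existing_cont_ids):
    """Attach a surrogate internal ID to each incoming record key.

    sif_keys: source record identifiers (hypothetical shape).
    existing_cont_ids: dict standing in for a database lookup that maps
    an identifier already known to MDM Server to its CONT_ID.
    """
    next_id = count(1)  # stand-in for the real surrogate key generator
    assigned = []
    for key in sif_keys:
        assigned.append({
            "internal_id": next(next_id),
            "sif_key": key,
            # CONT_ID is populated only when the identifier already exists;
            # otherwise it is left null (None here).
            "cont_id": existing_cont_ids.get(key),
        })
    return assigned


records = assign_internal_ids(["A-100", "B-200"], {"A-100": 9001})
```

In this sketch the first record carries an existing CONT_ID and the second leaves it empty, which mirrors the populated-or-null behavior described in the text.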
The assignment of a record identifier usually begins with a query of the MDM
Server database to extract existing records (unless they have been previously
extracted into a data set). For every entity participating in the load, MDM Server
defines a business key, which is a combination of column values that uniquely
identifies the record. Based upon a comparison of business key column values, a
determination can be made as to whether an incoming record represents an insert
or an update. If a record is determined to represent an update, existing column
values and incoming column values are compared. If there is a difference, the
record is marked as an update. If there are no differences, the record is dropped
to avoid unnecessary or redundant updates.
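The insert, update, or drop decision can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: records are plain dictionaries, the existing records are indexed by a tuple of business key values, and the column names are hypothetical.

```python
def classify(incoming, existing_by_key, key_columns):
    """Classify an incoming record as 'insert', 'update', or 'drop'."""
    business_key = tuple(incoming[c] for c in key_columns)
    existing = existing_by_key.get(business_key)
    if existing is None:
        return "insert"  # no record with this business key exists yet
    # Compare the non-key columns to see whether anything actually changed.
    changed = any(
        incoming[c] != existing.get(c)
        for c in incoming
        if c not in key_columns
    )
    return "update" if changed else "drop"


# Hypothetical business key: (cont_id, name_usage_type)
existing = {(1, "LEGAL"): {"cont_id": 1, "name_usage_type": "LEGAL", "name": "Acme"}}
keys = ("cont_id", "name_usage_type")

new_row = {"cont_id": 2, "name_usage_type": "LEGAL", "name": "Bolt"}
changed_row = {"cont_id": 1, "name_usage_type": "LEGAL", "name": "Acme Corp"}
same_row = {"cont_id": 1, "name_usage_type": "LEGAL", "name": "Acme"}
```

Dropping unchanged records at this point keeps redundant rows out of the later load steps.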
The generation of internal identifiers for Party (Contact) data differs from
that for Contract data because the user has the option to use a native key
construct to store contract identifiers. Differences in the naming convention of
the DataStage assets used to generate Contract internal identifiers indicate
which assets take into consideration the user's decision regarding the use of
“native keys.” Another factor in the differences between the generation of
internal identifiers for Party and Contract is the use of EXTERNALMATCH SIF data
records; see Appendix B, “Standard Interface File details” on page 295 for further
details.
The DataStage jobs that need MDM Server database record identifiers
use generic DataStage assets to generate the appropriate identifier. These
assets consist of two shared containers (AINextKeyPrefix and DLAINextKey).
The DataStage assets that derive database record identifiers produce
data sets of records that are either inserts into or updates to their respective
database tables.
The assignment of identifiers uses four DataStage job sequences, 21 DataStage
jobs, and 20 DataStage shared containers. Table 3-6 describes these DataStage
assets.
Table 3-6 DataStage assets used in SIF identifier assignment
DataStage asset
Description
DL_015_II__InternalID_Party
This DataStage job sequence ensures that the data
constructs are in place to support the generation of
internal identifiers before invoking the DataStage job
DL_015_II_InternalID_Party.
DL_015_II_InternalID_Party
 Inputs: Unique_Party_SSKs_DS,
ExternalMatch_DS and
SIF_Import_ERR_MSGS_Party
 Outputs: INPUT_PARTY_MASTER,
ExternalMatchWithoutSIFContact and
PARTY_SIF_Import_IID_ERR_MSG
 Database Shared Containers:
DLDBSELCONTEQUIV,
DLDBSELADMINEXTERNALMATCH,
DLDBSELSSKTMP and
DLDBSELCONTEQUIVEXIST
DLDBSELCONTEQUIV
This database shared container returns active
CONTEQUIV records utilizing the record identifiers to
identify existing database records.
DLDBSELADMINEXTERNALMATCH
This database shared container returns existing
database record identifiers to be used with incoming
external match data set records.
DLDBSELCONTEQUIVEXIST
This database shared container returns active
CONTEQUIV records utilizing the record identifiers to
be used with incoming external match data set
records to identify externally matched identifiers.
DL_016_II__InternalID_Contract
This DataStage job sequence ensures that the data
constructs are in place to support the generation of
internal identifiers. It then checks whether the MDMIS
parameter set value for DS_USE_NATIVE_KEY is set and
invokes one of the following DataStage jobs accordingly:
 DL_016_II_InternalID_Contract_NativeKey
 DL_016_II_InternalID_Contract_NoNativeKeys
DL_016_II_InternalID_Contract_NativeKey
This DataStage job is invoked when the MDMIS
parameter set value for DS_USE_NATIVE_KEY is
set.
 Inputs: Parse_ContractSSKs, Parse_NativeKey
and CONTRACT_SIF_Import_ERR_MSGS
 Outputs: INPUT_CONTRACT_MASTER and
CONTRACT_SIF_Import_IID_ERR_MSGS
 DataBase Shared Containers:
DLDBSELSSKTMP2 and
DLDBSELADMINNATIVEKEY
DLDBSELSSKTMP2
This database shared container retrieves records
from a join between the CONTRACT, NATIVEKEY
and IS_ADMINCONTRACT database tables.
DLDBSELADMINNATIVEKEY
This database shared container retrieves records
from a join between the CONTRACT, NATIVEKEY
and IS_ADMINNATIVEKEY database tables.
DL_016_II_InternalID_Contract_NoNativeKeys
This DataStage job is invoked when the MDMIS
parameter set value for DS_USE_NATIVE_KEY is not
set.
 Inputs: Parse_ContractSSKs and
CONTRACT_SIF_Import_ERR_MSGS
 Outputs: INPUT_CONTRACT_MASTER and
CONTRACT_SIF_Import_IID_ERR_MSGS
 Database Shared Container:
DLDBSELSSKTMP3
DLDBSELSSKTMP3
This database shared container retrieves records
from a join between the CONTRACT and
IS_ADMINCONTRACT database tables.
DL_060__AI_ASSIGN_IDS_PARTY
This DataStage job sequence checks the MDMIS
parameter values for QS_PERFORM_ORG_MATCH
and QS_PERFORM_PERSON_MATCH before
invoking the related DataStage jobs. It controls the
order in which the related DataStage jobs are
invoked.
DL_060_AI_Contact_Match
This DataStage job is invoked when the MDMIS
parameter set values for
QS_PERFORM_ORG_MATCH and
QS_PERFORM_PERSON_MATCH are set.
 Inputs: ErrCon_Contact and LOBProcessing
 Shared Container: DLAIContact
DLAIContact
 Inputs: CONTID_NULL, CONTID_PRESENT,
ExternalMatchWithoutSIFContact and
Existing_Contacts
 Outputs: COLSURVIVE_DROPS,
Final_Existing_Contacts, Insert_CONTEQUIV,
Insert_CONTACT, Update_CONTACT,
Update_PERSON and Update_ORG
 Database Shared Container:
DLDBSELCONTACT
 Assign Shared Container (Key): AINextKeyPrefix
(CONT_ID), DLAINextKey (CONT_ID and
Cont_Equiv)
DLDBSELCONTACT
This database shared container retrieves records
from a join between the IS_NEWMATCHCONTID,
CONTACT, PERSON and ORG database tables.
AINextKeyPrefix
This shared container generates a key prefix based
on input data values.
DLAINextKey
This shared container generates a key based on input
data values.
DL_060_AI_Suspect
This DataStage job is invoked only when the MDMIS
parameter set values for
QS_PERFORM_ORG_MATCH and
QS_PERFORM_PERSON_MATCH are set and is
used to generate record identifiers for Suspect
database records.
 Inputs: GEND_IMPLIED_SUSPECTS,
SuspectProcessing, Insert_CONTEQUIV,
Ids_For_Remove_Suspect
 Outputs: Insert_SUSPECT
 Database Shared Container:
DLDBSELEXISTINGSUSPECT
 Assign Shared Container (Key): AINextKey
(SUSPECT_ID)
DLDBSELEXISTINGSUSPECT
This database shared container retrieves records
from the Suspect database table.
DL_060_AI_Contact_NoMatch
This DataStage job is invoked only when the MDMIS
parameter set values for
QS_PERFORM_ORG_MATCH and
QS_PERFORM_PERSON_MATCH are not set.
 Inputs: ErrCon_Contact
 Shared Container: DLAIContact (described
above)
DL_060_AI_OrgName
This DataStage job generates database record
identifiers for OrgName records and identifies
updates to the OrgName table.
 Inputs: ErrCon_OrgName, Insert_CONTEQUIV
and OrgNameExisting
 Outputs: Insert_ORGNAME, Update_ORGNAME
and DuplicateBeforeImage_OrgName
 Database Shared Container:
DLDBSELEXISTINGORGNAME
 Assign Shared Container (Key): DLAINextKey
(ORG_NAME_ID)
DLDBSELEXISTINGORGNAME
This database shared container retrieves records
from the OrgName database table.
DL_060_AI_PersonName
This DataStage job generates database record
identifiers for both PersonName and PersonSearch
records and identifies updates to the PersonName
and PersonSearch database tables.
 Inputs: ErrCon_PersonName,
Insert_CONTEQUIV and PersonNameExisting
 Outputs: Insert_PERSONNAME,
Insert_PERSONSEARCH,
Update_PERSONSEARCH,
Update_PERSONNAME and
DuplicateBeforeImage_PersonName
 Database Shared Container:
DLDBSELEXISTINGPERSONNAME
 Assign Shared Container (Key): DLAINextKey
(PERSON_NAME_ID), DLAINextKey
(PERSON_SEARCH_ID)
DLDBSELEXISTINGPERSONNAME
This database shared container retrieves related
records from both PersonName and PersonSearch
database tables.
DL_060_AI_LOBRel
This DataStage job generates database record
identifiers for LOBREL records and identifies updates
to the LOBREL database tables.
 Inputs: ErrCon_LOBRel, Insert_CONTEQUIV
and LOBRelExisting
 Outputs: Insert_LOBREL, Update_LOBREL and
DuplicateBeforeImage_LOBRel
 Database Shared Container:
DLDBSELEXISTINGLOBREL
 Assign Shared Container (Key): DLAINextKey
(LOB_REL_ID)
DLDBSELEXISTINGLOBREL
This database shared container retrieves records
from the LOBREL database table.
DL_060_AI_Alert
This DataStage job generates database record
identifiers for ALERT records and identifies updates to
the ALERT database tables.
 Inputs: PARTY_ErrCon_Alert,
Insert_CONTEQUIV and PARTY_AlertExisting
 Outputs: Insert_ALERT, Update_ALERT and
DuplicateBeforeImage_Alert
 Database Shared Container:
DLDBSELEXISTINGALERT
 Assign Shared Container (Key): DLAINextKey
(ALERT_ID)
DLDBSELEXISTINGALERT
This database shared container retrieves records
from the ALERT database table.
DL_060_AI_Identifier
This DataStage job generates database record
identifiers for IDENTIFIER records and identifies
updates to the IDENTIFIER database tables.
 Inputs: ErrCon_Identifier, Insert_CONTEQUIV
and IdentifierExisting
 Outputs: Insert_IDENTIFIER,
Update_IDENTIFIER and
DuplicateBeforeImage_Identifier
 Database Shared Container:
DLDBSELEXISTINGIDENTIFIER
 Assign Shared Container (Key): DLAINextKey
(IDENTIFIER_ID)
DLDBSELEXISTINGIDENTIFIER
This database shared container retrieves records
from the IDENTIFIER database table.
DL_060_AI_ContactRel
This DataStage job generates database record
identifiers for CONTACTREL records and identifies
updates to the CONTACTREL database tables.
 Inputs: ErrCon_ContactRel, Insert_CONTEQUIV
and ContactRelExisting
 Outputs: Insert_CONTACTREL,
Update_CONTACTREL,
SameCONTID_ContactRel and
DuplicateBeforeImage_ContactRel
 Database Shared Container:
DLDBSELEXISTINGCONTACTREL
 Assign Shared Container (Key): DLAINextKey
(CONT_REL_ID)
DLDBSELEXISTINGCONTACTREL
This database shared container retrieves records
from the CONTACTREL database table.
DL_060_AI_MiscValue
This DataStage job generates database record
identifiers for MISCVALUE records and identifies
updates to the MISCVALUE database tables.
 Inputs: PARTY_ErrCon_MiscValue,
Insert_CONTEQUIV and
PARTY_MiscValueExisting
 Outputs: Insert_MISCVALUE,
Update_MISCVALUE and
DuplicateBeforeImage_MiscValue
 Database Shared Container:
DLDBSELEXISTINGMISCVALUE
 Assign Shared Container (Key): DLAINextKey
(MISCVALUE_ID)
DLDBSELEXISTINGMISCVALUE
This database shared container retrieves PARTY
related records from the MISCVALUE database table.
DL_060_AI_PrivPref
This DataStage job generates database record
identifiers for both PPREFENTITY and PRIVPREF
records and identifies updates to the PPREFENTITY
and PRIVPREF database tables.
 Inputs: ErrCon_PrivPref, Insert_CONTEQUIV
and PrivPrefExisting
 Outputs: Insert_PPREFENTITY,
Insert_PRIVPREF, Update_PPREFENTITY,
Update_PRIVPREF and
DuplicateBeforeImage_PrivPref
 Database Shared Container:
DLDBSELEXISTINGPRIVPREF
 Assign Shared Container (Key): DLAINextKey
(PPREF_ID)
DLDBSELEXISTINGPRIVPREF
This database shared container retrieves records
from a join between the PPREFENTITY and
PRIVPREF database tables.
DL_060_AI_Address_ContactMethod
This DataStage job generates database record
identifiers for Address, Address Group, Contact
Method Group, Location Group and Phone Number
related records and identifies updates to the Address
Group, Contact Method Group, Location Group and
Phone Number database tables.
 Inputs: ErrCon_Address, Insert_CONTEQUIV,
AddressExisting, ErrCon_ContactMethod and
ContactMethodExisting
 Outputs: RISUBSET_ADDRESSGROUP,
Insert_ADDRESS, Insert_ADDRESSGROUP,
Update_ADDRESSGROUP,
Insert_LOCATIONGROUP,
Update_LOCATIONGROUP,
Insert_PHONENUMBER,
Update_PHONENUMBER,
Insert_CONTACTMETHODGROUP,
Update_CONTACTMETHODGROUP,
DuplicateBeforeImage_Address and
DuplicateBeforeImage_Contact
 Database Shared Containers:
DLDBSELADDRESS,
DLDBSELCONTACTMETHOD and
DLDBSELMD5ADDRESS
 Assign Shared Container (Key): DLAINextKey
(ADDRESS_ID), DLAINextKey_Addr
(LOCATION_GROUP_ID), DLAINextKey
(ContactMethod:LOCATION_GROUP_ID) and
DLAINextKey (CONTACT_METHOD_ID)
DLDBSELADDRESS
This database shared container retrieves records from
a join between the ADDRESS, ADDRESSGROUP
and LOCATIONGROUP database tables.
DLDBSELCONTACTMETHOD
This database shared container retrieves records from
a join between the CONTACTMETHOD,
CONTACTMETHODGROUP, PHONENUMBER and
LOCATIONGROUP database tables.
DLDBSELMD5ADDRESS
This database shared container retrieves
MD5_ADDRESS values from the ADDRESS
database table.
DL_061__AI_ASSIGN_IDS_CONTRACT
This DataStage job sequence controls the order in
which the related DataStage jobs are invoked.
DL_061_AI_Contract_NativeKey
This DataStage job generates database record
identifiers for both Contract and NativeKey records
and identifies updates to the Contract database
tables.
 Inputs: ErrCon_Contract and
INPUT_CONTRACT_MASTER
 Outputs: Insert_NATIVEKEY, Insert_CONTRACT,
Update_CONTRACT,
DuplicateBeforeImage_Contract,
Contract_to_Component_Join and
Contract_INSTANCE_PK
 Assign Shared Container (Key): DLAINextKey
(NATIVE_KEY_ID), AINextKeyPrefix
(CONTRACT_ID), DLAINextKey
(CONTRACT_ID)
DL_061_AI_Alert_Contract
This DataStage job generates database record
identifiers for ALERT records and identifies updates to
the ALERT database tables.
 Inputs: CONTRACT_ErrCon_Alert,
Contract_INSTANCE_PK and
CONTRACT_AlertExisting
 Outputs: CONTRACT_Insert_ALERT,
CONTRACT_Update_ALERT and
DuplicateBeforeImage_Alert
 Assign Shared Container (Key): DLAINextKey
(ALERT_ID)
DL_061_AI_MiscValue_Contract
This DataStage job generates database record
identifiers for MiscValue records and identifies
updates to the MiscValue database tables.
 Inputs: CONTRACT_ErrCon_MiscValue,
Contract_INSTANCE_PK and
CONTRACT_MiscValueExisting
 Outputs: CONTRACT_Insert_MISCVALUE,
CONTRACT_Update_MISCVALUE and
DuplicateBeforeImage_MiscValue
 Assign Shared Container (Key): DLAINextKey
(MISCVALUE_ID)
DL_061_AI_Contract_Comp
This DataStage job generates database record
identifiers for Contract Component and Contract
Component Value records and identifies updates to
the Contract Component and Contract Component
Value database tables.
 Inputs: Contract_to_Component_Join,
ErrCon_ContractComponent,
ContractComponentExisting,
ErrCon_ContractCompVal and
ContractCompValExisting
 Outputs: ContrComp_to_ContrRole,
Insert_CONTRACTCOMPONENT,
Update_CONTRACTCOMPONENT,
DuplicateBeforeImage_ContractComp,
Insert_CONTRACTCOMPVAL,
Update_CONTRACTCOMPVAL and
DuplicateBeforeImage_ContractComponentValue
 Assign Shared Container (Key): DLAINextKey
(CONTR_COMPONENT_ID) and DLAINextKey
(CONTR_COMP_VAL_ID)
DL_061_AI_Contract_Role
This DataStage job generates database record
identifiers for ContractRole records and identifies
updates to the ContractRole database tables.
 Inputs: ContrComp_to_ContrRole,
ErrCon_ContractRole and
ContractComponentRoleExisting
 Outputs: Insert_CONTRACTROLE,
Update_CONTRACTROLE, ContrRole_RoleLoc
and DuplicateBeforeImage_ContractRole
 Assign Shared Container (Key): DLAINextKey
(CONTRACT_ROLE_ID)
DL_061_AI_Role_Location
This DataStage job generates database record
identifiers for Role Location records and identifies
updates to the Role Location database tables.
 Inputs: ContrRole_RoleLoc,
ErrCon_RoleLocation and RoleLocationExisting
 Outputs: Insert_ROLELOCATION,
Update_ROLELOCATION and
DuplicateBeforeImage_ContractRoleLocation
 Assign Shared Container (Key): DLAINextKey
(ROLE_LOCATION_ID)
3.8 Data insert and update
This processing category consists of those MDM RDP Direct Load components
that are used to either update records or insert records into the MDM Server
database. These assets share the following structure:
 Reading in a data set
 Constructing a history record as determined by the value of the MDMIS
DS_LOAD_HISTORY parameter
 Inputting records to a database shared container.
Note the following information about the database shared containers:
– It is database-specific:
• ODBC
• DB2
• ORACLE
– It connects to the MDM Server database.
– It properly formats the data record.
– It either inserts a new record, updates an existing record, or based on
parameter settings, inserts a related history record.
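The three-step structure above (read a record, optionally build a history record, hand the rows to a database writer) can be sketched in Python as follows. This is a sketch under assumptions, not the package's logic: DS_LOAD_HISTORY is treated as a simple on/off flag, and the history column names h_action_code and h_created are hypothetical.

```python
from datetime import datetime, timezone


def build_rows(record, action, load_history):
    """Return the (target, row) pairs to write for one record.

    action: "insert" or "update".
    load_history: stand-in for the MDMIS DS_LOAD_HISTORY parameter,
    assumed here to behave as a boolean.
    """
    rows = [("base", dict(record))]
    if load_history:
        history = dict(record)
        # Hypothetical audit columns for the history row.
        history["h_action_code"] = "I" if action == "insert" else "U"
        history["h_created"] = datetime.now(timezone.utc)
        rows.append(("history", history))
    return rows


without_history = build_rows({"cont_id": 1}, "update", load_history=False)
with_history = build_rows({"cont_id": 1}, "update", load_history=True)
```

Keeping the history decision in one place means the database-specific writer only has to format and write whatever rows it receives.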
The value of the MDMIS DS_LOAD_MODE parameter is used to invoke the
DataStage job sequences that control the loading of data, either through SQL
or through bulk loading. The bulk loading of data requires specific
configuration activities and considerations that are not presented here. For more
information, see related articles by searching the IBM developerWorks® site:
http://www.ibm.com/developerworks/data/library/
Table 3-7 presents the DataStage jobs involved in inserting and updating records
in the MDM Server database. It does not include those assets that are used to
connect to the MDM Server database for each specific table or to bulk load data.
Because a large number of DataStage assets are involved, the table is organized
by the associated database table. The DataStage job sequences that invoke the
jobs and the assets used to bulk load data are described after Table 3-7.
Table 3-7 DataStage assets that insert or update by database table
Database table
DataStage asset
Address
IL_090_LD_Insert_Address
AddressGroup
IL_090_LD_Insert_AddressGroup
DL_090_LD_Update_AddressGroup
AddressGroup History
DL_091_LD_Update_AddressGroup_History
Alert
DL_090_LD_Insert_Alert_Contract
DL_090_LD_Insert_Alert_Party
DL_090_LD_Update_Alert_Contact
DL_090_LD_Update_Alert_Contract
Alert History
DL_091_LD_Update_Alert_History
Contact
IL_090_LD_Insert_Contact
DL_090_LD_Update_Contact
Contact History
DL_091_LD_Update_Contact_History
ContactMethod
IL_090_LD_Insert_ContactMethod
DL_090_LD_Update_ContactMethod
ContactMethod History
DL_091_LD_Update_ContactMethod_History
ContactMethodGroup
IL_090_LD_Insert_ContactMethodGroup
DL_090_LD_Update_ContactMethodGroup
ContactMethodGroup History
DL_091_LD_Update_ContactMethodGroup_History
ContactRel
IL_090_LD_Insert_ContactRel
DL_090_LD_Update_ContactRel
ContactRel History
DL_091_LD_Update_ContactRel_History
CONTEQUIV
IL_090_LD_Insert_ContEquiv
Contract
IL_090_LD_Insert_Contract
DL_090_LD_Update_Contract
Contract History
DL_091_LD_Update_Contract_History
ContractComponent
IL_090_LD_Insert_ContractComponent
DL_090_LD_Update_ContractComponent
ContractComponent History
DL_091_LD_Update_ContractComponent_History
ContractComponent Value
IL_090_LD_Insert_ContractCompVal
DL_090_LD_Update_ContractCompVal
ContractComponent Value History
DL_091_LD_Update_ContractCompVal_History
ContractRole
IL_090_LD_Insert_ContractRole
DL_090_LD_Update_ContractRole
ContractRole History
DL_091_LD_Update_ContractRole_History
Identifier
IL_090_LD_Insert_Identifier
DL_090_LD_Update_Identifier
Identifier History
DL_091_LD_Update_Identifier_History
LOBREL
IL_090_LD_Insert_LobRel
DL_090_LD_Update_LOBRel
LOBREL History
DL_091_LD_Update_LOBRel_History
LocationGroup
IL_090_LD_Insert_LocationGroup
DL_090_LD_Update_LocationGroup
LocationGroup History
DL_091_LD_Update_LocationGroup_History
MiscValue
DL_090_LD_Insert_MiscValue_Contract
DL_090_LD_Insert_MiscValue_Party
DL_090_LD_Update_MiscValue_Contact
DL_090_LD_Update_MiscValue_Contract
MiscValue History
DL_091_LD_Update_MiscValue_History
NativeKey
IL_090_LD_Insert_NativeKey
Org
IL_090_LD_Insert_Org
DL_090_LD_Update_Org
Org History
DL_091_LD_Update_Org_History
OrgName
IL_090_LD_Insert_OrgName
DL_090_LD_Update_OrgName
OrgName History
DL_091_LD_Update_OrgName_History
Person
IL_090_LD_Insert_Person
DL_090_LD_Update_Person
Person History
DL_091_LD_Update_Person_History
PersonName
IL_090_LD_Insert_PersonName
DL_090_LD_Update_PersonName
PersonName History
DL_091_LD_Update_PersonName_History
PersonSearch
IL_090_LD_Insert_PersonSearch
DL_090_LD_Update_PersonSearch
PersonSearch History
DL_091_LD_Update_PersonSearch_History
PhoneNumber
IL_090_LD_Insert_PhoneNumber
DL_090_LD_Update_PhoneNumber
PhoneNumber History
DL_091_LD_Update_PhoneNumber_History
PPrefEntity
IL_090_LD_Insert_PPrefEntity
DL_090_LD_Update_PPrefEntity
PPrefEntity History
DL_091_LD_Update_PPrefEntity_History
PrivPref
IL_090_LD_Insert_PrivPref
DL_090_LD_Update_PrivPref
PrivPref History
DL_091_LD_Update_PrivPref_History
RoleLocation
IL_090_LD_Insert_RoleLocation
DL_090_LD_Update_RoleLocation
RoleLocation History
DL_091_LD_Update_RoleLocation_History
Suspect
IL_090_LD_Insert_Suspect
The DataStage assets presented in Table 3-7 on page 61 are invoked by type of
data (Party or Contract), with updates preceding inserts. The update job
sequences are as follows:
 DL_090_LD__Update_Party_SQL, which invokes the update jobs specific to
Party database tables
 DL_090_LD__Update_Contract_SQL, which invokes the update jobs that are
related to Contract data
The insert job sequences of DL_090_LD__Insert_Party_SQL and
DL_090_LD__Insert_Contract_SQL are a bit different. In these job sequences,
the value of the MDMIS LOAD_METHOD parameter influences whether data
loading is performed utilizing bulk loading or through the conventional approach
as represented by the DataStage jobs in Table 3-7 on page 61.
The bulk loading job sequences, DL_091__LD_Bulk_Party and
DL_091__LD_Bulk_Contract, use a DataStage shared container,
DL_091_LD_Bulk_Common, that allows multiple instances to run at the same
time. The names of a table and its related history table are passed into this
shared container, which performs the bulk loading process on those tables. For
the CONTEQUIV, NativeKey and Suspect tables, there are specific jobs
(DL_091_LD_Bulk_ContEquiv, DL_091_LD_Bulk_Suspect and
DL_091_LD_Bulk_NativeKey) to bulk load related data.
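As a rough illustration of the ordering described in this section, the following Python sketch returns the update and insert job sequences for one data type and shows which path the insert step would take. The job sequence names come from the text; the LOAD_METHOD values "BULK" and "SQL" are hypothetical placeholders, because the actual parameter values are not given here.

```python
def plan_load(data_type, load_method):
    """Describe the load plan for one data type ("Party" or "Contract").

    load_method: hypothetical stand-in values "BULK" or "SQL" for the
    MDMIS LOAD_METHOD parameter.
    """
    if data_type not in ("Party", "Contract"):
        raise ValueError(f"unknown data type: {data_type}")
    # Updates precede inserts.
    order = [
        f"DL_090_LD__Update_{data_type}_SQL",
        f"DL_090_LD__Insert_{data_type}_SQL",
    ]
    # The insert job sequence chooses between bulk and conventional loading.
    if load_method == "BULK":
        insert_path = f"DL_091__LD_Bulk_{data_type}"
    else:
        insert_path = "conventional insert jobs (Table 3-7)"
    return {"order": order, "insert_path": insert_path}


party_plan = plan_load("Party", "BULK")
contract_plan = plan_load("Contract", "SQL")
```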
Chapter 4. RDP for MDM: Delta Load
This chapter provides an overview of a Delta Load solution using IBM InfoSphere
MDM Server (MDM) Rapid Deployment Package (RDP) Runtime Assets and
MDM Party Maintenance Services. A Delta Load in RDP is the process of
synchronizing changes in source system data with MDM Server. Because data is
processed by MDM services during loading, this solution provides the best level
of business data validation, ease of implementation and maintenance, and
highest MDM Server sustainability.
We also include detailed implementation, configuration, and installation
information about MDM RDP Runtime Assets and MDM Party Maintenance
Services.
4.1 Overview
The RDP for MDM Delta Load solution assumes that MDM Server has
already been installed and that the direct data load has been completed using
Information Server DataStage and QualityStage jobs.
The two main components in this solution are as follows:
 MDM RDP Runtime Assets
Must be installed on top of MDM Party Maintenance Services.
 MDM Party Maintenance Services
Must be installed on top of MDM Server.
This solution preserves the source-to-SIF work that was done in the direct load.
The QualityStage sequencer job is executed to process SIF records in the order
required by MDM Server and generates sequenced SIF files.
The MDM BatchProcessor, or other batch framework, is used to read sequenced
SIF files and feed the SIF records into MDM Server. MDM Server invokes the SIF
Parser to transfer input SIF text messages into MDM business objects. MDM
Server then picks up configured MDM Party Maintenance Services as a business
proxy and runs it as a Java composite transaction. It resolves MDM business
object identities and eventually starts an MDM Server add or update transaction.
MDM Server invokes a Duplicated Suspect Candidate Search Rule extension in
MDM RDP Runtime Assets to emulate QualityStage blocking for duplicated
candidate selection. MDM Server sends a request to QualityStage runtime jobs
for standardization and party matching. The Delta Load solution provides the
same functionality as the Direct Load performed by RDP for MDM Direct Load.
Figure 4-1 depicts how the components are grouped together and form the RDP
for MDM Delta Load solution.
Figure 4-1 RDP for MDM Delta Load solution
(Diagram not reproduced. Components labeled in the figure: Source Input File;
SIF Sequencer (DataStage Job); Sequenced SIF Files; MDM Batch Processor;
MDM Server; SIF Parser; MDM Party Maintenance Services; MDM Business
Services; MDM Database; Information Server DataStage; QS/DS Runtime Jobs.)
4.2 MDM Party Maintenance Services
MDM Party Maintenance Services was created to provide a rapid way to
implement MDM Server and to load data into it.
MDM Party Maintenance Services support a subset of the MDM Party Domain (see
details in 4.2.4, “MDM Party Maintenance Services Profile” on page 87). MDM
Party Maintenance Services can be used in both initial data load and delta data
load processes.
MDM Party Maintenance Services is packaged and distributed as part of the
InfoSphere MDM Server samples.
4.2.1 The instance resolution problem
InfoSphere MDM Server creates a unique internal identifier for each record or
business entity, which serves as its internal key. The internal key is not
typically intended to be published to other
applications outside of InfoSphere MDM Server, established source systems, or
downstream consuming applications. However, it is available as part of the
service response message.
With InfoSphere MDM Server, you may configure the Business Key for each
business entity. This key serves as the unique identifier of the business entity in
external applications.
InfoSphere MDM Server services expect the internal identifier to be provided as
part of the update service request to ensure that services can identify the correct
business entity in the database. However, when data flows into InfoSphere MDM
Server directly from external applications, such as existing systems, the internal
key is not known and often the nature of the data change is also not known. This
issue, which is referred to as an Instance Resolution Problem, requires that the
following information be determined:
 Which instance of the business entity is being worked with: party A, party B,
or another?
 Is data being added or updated?
 If data is being updated, which instance must be updated when there
are multiple names or addresses, multiple contact methods or identifiers,
multiple contract components, multiple party roles, and so on?
This problem is addressed by MDM Party Maintenance Services.
4.2.2 MDM Party Maintenance Services behavior
MDM Party Maintenance Services do not require the internal key as part of the
input. They also do not require the external system to specify if this entity must
be added or updated in InfoSphere MDM Server.
MDM Party Maintenance Services use the Business Key that is provided in the
load operation to locate the correct instance of the business entity in the
database. If an existing entity is found, it is updated by using the appropriate
transaction, such as updateParty. If no existing entity is found, a new entity is
created in InfoSphere MDM Server using the appropriate transaction, such as
addParty.
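The add-or-update decision described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not MDM Server code: the UpsertSketch class, its in-memory map, and the key format CRM:1001 are hypothetical stand-ins for the database lookup and for the addParty and updateParty transactions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the MDM database; not an MDM Server API.
class UpsertSketch {
    // Business Key value -> stored attributes (including a generated internal key).
    private final Map<String, Map<String, String>> store = new HashMap<>();
    private long nextInternalKey = 1;

    // Returns the name of the core transaction that would be fired.
    String maintainParty(String businessKey, Map<String, String> attributes) {
        Map<String, String> existing = store.get(businessKey);
        if (existing != null) {
            existing.putAll(attributes);            // entity found: update it
            return "updateParty";
        }
        Map<String, String> created = new HashMap<>(attributes);
        created.put("internalKey", Long.toString(nextInternalKey++));
        store.put(businessKey, created);            // entity not found: add it
        return "addParty";
    }

    // The caller never supplies the internal key; it is assigned on add.
    String internalKeyOf(String businessKey) {
        Map<String, String> entity = store.get(businessKey);
        return entity == null ? null : entity.get("internalKey");
    }

    public static void main(String[] args) {
        UpsertSketch mdm = new UpsertSketch();
        System.out.println(mdm.maintainParty("CRM:1001", Map.of("lastName", "Smith")));
        System.out.println(mdm.maintainParty("CRM:1001", Map.of("lastName", "Smyth")));
    }
}
```

The caller supplies only the Business Key and the data. Whether the record is added or updated is decided by the lookup, which is the behavior that the paragraph above describes.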
Party Maintenance Services support automatic expiry for deleted client records.
This feature can be enabled by setting the following value to true in the
DWLCommon_extension.properties file:
syncSourceSystemEndedDataWithMDM
When the feature is enabled, existing child object records that are not provided in
the input are expired in the database. This functionality exists to
accommodate source systems with limited or no change data capture
capabilities.
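The effect of the auto-expiry flag can be illustrated with a small Java sketch. Only the property name comes from the text above. The AutoExpirySketch class, the use of usage types such as HOME and WORK as child Business Keys, and the treatment of a missing property as false are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.Set;

// Hypothetical helper; only the property name comes from the text.
class AutoExpirySketch {
    static final String FLAG = "syncSourceSystemEndedDataWithMDM";

    // Returns the Business Keys of stored child records that would be expired.
    static List<String> childrenToExpire(Properties config,
                                         List<String> storedKeys,
                                         Set<String> inputKeys) {
        List<String> toExpire = new ArrayList<>();
        // Assumption: a missing property is treated as "false".
        if (!Boolean.parseBoolean(config.getProperty(FLAG, "false"))) {
            return toExpire;                 // feature disabled: nothing is expired
        }
        for (String key : storedKeys) {
            if (!inputKeys.contains(key)) {
                toExpire.add(key);           // absent from the input: ended at the source
            }
        }
        return toExpire;
    }

    public static void main(String[] args) throws IOException {
        Properties config = new Properties();
        config.load(new StringReader("syncSourceSystemEndedDataWithMDM = true\n"));
        // Stored addresses HOME and WORK; the input carries only HOME.
        System.out.println(childrenToExpire(config, List.of("HOME", "WORK"), Set.of("HOME")));
    }
}
```

With the flag enabled, a source system that can send only full snapshots still ends records in MDM Server simply by omitting them from the input.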
Table 4-1 summarizes the supported objects in MDM Party Maintenance
Services.
Table 4-1 MDM Party Maintenance Services supported objects
Entity              Child objects
Party               PersonName, OrganizationName, PartyAddress,
                    PartyContactMethod, PartyPrivPref,
                    PartyIdentification, TCRMPartyValue,
                    PartyLobRelationship, PartyAlert
Contract            ContractAlert, ContractValue, ContractComponent
ContractComponent   ContractComponentValue, ContractPartyRole
ContractPartyRole   ContractPartyRoleLocation
MDM Party Maintenance Services provide external behavior extensions to
disallow the creation of duplicate entities based on Business Keys. They also
disallow the updating of Business Keys. The new extensions are configured on
the transactions for the following entities:
 PartyAddress
 PartyContactMethod
 ContractComponent
 ContractRoleLocation
 ContractComponentValue
The group validations for the Party Role validation function are expired when the
Party Maintenance Services sample is deployed.
Business Keys
MDM Party Maintenance Services use Business Keys to identify the correct
instance of the business entity in the database. MDM Party Maintenance
Services redefine the Business Keys that are provided by default as part of
InfoSphere MDM Server.
Table 4-2 lists Business Keys in MDM Party Maintenance Services.
Table 4-2 MDM Party Maintenance Services Business Keys
Entity                               Business Keys
PersonName                           NameUsageType
OrganizationName                     NameUsageType
PartyAddress/Address                 AddressUsageType
PartyContactMethod/ContactMethod     ContactMethodUsageType
PartyRelationship                    RelationshipFromValue, RelationshipToValue,
                                     RelationshipType
PartyPrivPref                        PrivPrefEntity, PrivPrefType
PartyIdentification                  IdentificationType
                                     Note: MDM Server already includes an internal
                                     validation to disallow duplicates of
                                     IdentificationType and IdentificationNumber
                                     combinations.
AdminContEquiv                       AdminPartyId, AdminSystemType
TCRMPartyValue                       PartyValueType
ContractComponent                    ContractId, ContractComponentType, ProductType
ContractPartyRole                    ContractComponentId, RoleType
ContractRoleLocation                 LocationGroupId, ContractRoleId
Contract                             AdminFldNmTp or AdminSystemType, AdminContractId
                                     Note: Transactions search for active contracts
                                     based on the Business Key from either the
                                     TCRMAdminNativeKeyBObj child object or the
                                     AdminContractId element on TCRMContractBObj.
ContractComponentValue               ContractComponentId, DomainValueType
Alert (for Contract and Party only)  EntityName, InstancePK, AlertType
PartyLobRelationship                 RelatedLobType, LobRelationshipType
Customizing Business Keys
MDM Party Maintenance Services use the Business Keys configured in the
V_ELEMENTATTRIBUTE metadata table, which makes it easier to redefine
the keys for a particular client implementation.
You can customize the Business Keys in one of the following ways:
 Redefine the Business Keys in an SQL script. Execute the SQL script to
update the Business Keys in the V_ELEMENTATTRIBUTE table. Restart the server
to refresh the MDM Server cache.
 Change the source code of the appropriate business proxy. For example,
change the MaintainPartyAlertBP.resolveIdentity() method to implement
custom business logic.
 Write an alternative implementation of the business proxy. For example, write
a new MaintainPartyAlertCustomBP class that overrides the resolveIdentity()
method and configure this class to be invoked for the maintainPartyAlert
transaction.
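The third option, an alternative proxy that overrides resolveIdentity(), might look like the following Java sketch. The classes here are simplified, hypothetical stand-ins: they do not extend the real business proxy classes, and the simplified method signature, the map-based business objects, and the extra AlertCategory key field are assumptions made for illustration. The default key fields are the Alert Business Keys from Table 4-2.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Simplified stand-in for a proxy such as MaintainPartyAlertBP (hypothetical).
class AlertIdentitySketch {
    // Returns the stored alert that matches the input, or null if the alert is new.
    Map<String, String> resolveIdentity(Map<String, String> input,
                                        List<Map<String, String>> activeAlerts) {
        for (Map<String, String> stored : activeAlerts) {
            if (matches(input, stored, "EntityName", "InstancePK", "AlertType")) {
                return stored;
            }
        }
        return null;
    }

    // True when both objects agree on every listed key field.
    static boolean matches(Map<String, String> a, Map<String, String> b, String... keyFields) {
        for (String field : keyFields) {
            if (!Objects.equals(a.get(field), b.get(field))) {
                return false;
            }
        }
        return true;
    }
}

// Custom variant: the override adds a hypothetical AlertCategory field to the key.
class CustomAlertIdentitySketch extends AlertIdentitySketch {
    @Override
    Map<String, String> resolveIdentity(Map<String, String> input,
                                        List<Map<String, String>> activeAlerts) {
        for (Map<String, String> stored : activeAlerts) {
            if (matches(input, stored, "EntityName", "InstancePK", "AlertType", "AlertCategory")) {
                return stored;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> stored = Map.of("EntityName", "CONTACT", "InstancePK", "42",
                "AlertType", "1", "AlertCategory", "A");
        Map<String, String> input = Map.of("EntityName", "CONTACT", "InstancePK", "42",
                "AlertType", "1", "AlertCategory", "B");
        System.out.println(new AlertIdentitySketch().resolveIdentity(input, List.of(stored)) != null);
        System.out.println(new CustomAlertIdentitySketch().resolveIdentity(input, List.of(stored)) != null);
    }
}
```

Subclassing keeps the shipped sample untouched, which can make it easier to pick up later versions of the sample than editing its source directly.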
Request and response message
The data for the Business Key for a particular object can come from one or more
objects provided as part of the request message. For example, the Role Location
Business Key includes the Contract Role primary key, which can be determined
only by knowing the Contract Role type.
As a result, the maintainContractRoleLocation service includes the Contract
object hierarchy, which is Contract, Contract Component, Contract Party Role,
Person or Organization, Party Address or Party Contact Method, Contact
Equivalency and Role Location. The Party maintenance service itself ultimately
executes either an addContractRoleLocation transaction or an
updateContractRoleLocation transaction, but it requires some data from objects
in the hierarchy to resolve the instance of the role location.
Responses from the fine-grained Party Maintenance Services are constructed by
capturing the response from the core transaction after it is executed, and
replacing it in the response object hierarchy. For example, the response from an
addContractRoleLocation is inserted into a Contract response object for
maintainContractRoleLocation.
Only the Business Key data is included in the response for objects that contribute
Business Key data. For example, the maintainContractRoleLocation transaction
response only includes Business Key data for the entities Contract, Contract
Component, Contract Party Role, Person or Organization, Party Address or Party
Contact Method. It also contains the fully populated Contract Role Location
entity.
If non-Business Key data is provided in the request, it is ignored and excluded
from the constructed responses.
If objects that are not required for the execution of the services are included in
the request, the service fails. For example, if a contract component value
business object is provided in a maintainContractRoleLocation request, the
transaction fails because none of the information from the contract component
value business object is required to resolve the identity of a role location instance.
Transaction message format
Party Maintenance Services support the following message formats:
 Standard <TCRMService> XML transaction format as provided by InfoSphere
MDM Server.
 Standard Interface File (SIF) format as defined in the RDP for MDM Direct
Load solution.
The SIF parser is provided as part of the MDM RDP Runtime Assets. The SIF
parser needs to be deployed on InfoSphere MDM Server before the SIF format
can be used with MDM Party Maintenance Services.
Party Maintenance Services do not support web services.
Implementation details
MDM Party Maintenance Services are implemented as Java composites that use
the existing InfoSphere MDM Server business components and services.
MDM Party Maintenance Services complement existing business services by
providing a delta-sensing capability.
Each of these services can be invoked individually to handle an individual
business object, or as part of other composite transactions. This design facilitates
the reuse of business logic between the various Party Maintenance Services.
Each Party maintenance business proxy class implements the IMaintainService
interface and provides the resolveIdentity() method. The resolveIdentity() method
is responsible for resolving the identity of the business object managed by the
proxy. For example, MaintainPersonNameBP resolves the identity of the
PersonName business object.
Figure 4-2 shows a sample class diagram in MDM Party Maintenance Services.
Figure 4-2 Party maintenance business proxies class diagram
(Diagram elements: the IMaintainService interface, which declares resolveIdentity(DWLCommon, DWLCommon, boolean); DWLTxnBP; MaintainBaseBP; and the Java classes MaintainPartyBP and MaintainPersonNameBP, each with the methods execute(), resolveIdentity(), and fireTransaction().)
For example, the MaintainPartyBP.resolveIdentity() method is responsible for
identity resolution of the Party entity. The MaintainPartyBP class delegates
the job of resolving the identity of the PersonName entity to the
MaintainPersonNameBP.resolveIdentity() method.
The MaintainPersonNameBP.resolveIdentity() method searches for active
PersonName entities. It compares them with the PersonName entities in the input
object to identify the PersonName entities that already exist in the database.
The MaintainPersonNameBP.resolveIdentity() method takes the Party object as
an input. In this case, the Party object carries a list of PersonName entities. The
Party object also contains the PartyId that was populated after the identity of the
party was resolved by the MaintainPartyBP.resolveIdentity() method.
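The delegation just described can be sketched as follows. The Java classes below are hypothetical simplifications and do not reproduce the real proxies or their resolveIdentity(DWLCommon, DWLCommon, boolean) signature. The lookup maps stand in for database searches, and the PersonNameIdPK field name is an assumption used only to mark a name as already existing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified Party object.
class PartySketch {
    String adminPartyId;   // Business Key data carried in AdminContEquiv
    String partyId;        // internal key, filled in by identity resolution
    List<Map<String, String>> personNames = new ArrayList<>();
}

// Stand-in for MaintainPersonNameBP: resolves the names of an already-resolved party.
class PersonNameResolverSketch {
    // activeNames maps NameUsageType to the internal id of the stored name.
    void resolveIdentity(PartySketch party, Map<String, String> activeNames) {
        for (Map<String, String> name : party.personNames) {
            String existingId = activeNames.get(name.get("NameUsageType"));
            if (existingId != null) {
                name.put("PersonNameIdPK", existingId); // existing name: will be updated
            }                                           // otherwise: will be added
        }
    }
}

// Stand-in for MaintainPartyBP: resolves the party, then delegates child resolution.
class PartyResolverSketch {
    private final PersonNameResolverSketch nameResolver = new PersonNameResolverSketch();

    // partyIndex maps AdminPartyId to PartyId; namesByParty maps PartyId to its active names.
    void resolveIdentity(PartySketch party,
                         Map<String, String> partyIndex,
                         Map<String, Map<String, String>> namesByParty) {
        party.partyId = partyIndex.get(party.adminPartyId);
        if (party.partyId == null) {
            return; // new party: every child object is new as well
        }
        nameResolver.resolveIdentity(party, namesByParty.getOrDefault(party.partyId, Map.of()));
    }

    public static void main(String[] args) {
        PartySketch party = new PartySketch();
        party.adminPartyId = "CRM-1001";
        Map<String, String> legalName = new HashMap<>(Map.of("NameUsageType", "1"));
        Map<String, String> alias = new HashMap<>(Map.of("NameUsageType", "2"));
        party.personNames.add(legalName);
        party.personNames.add(alias);

        new PartyResolverSketch().resolveIdentity(party,
                Map.of("CRM-1001", "P1"),
                Map.of("P1", Map.of("1", "N9")));
        System.out.println(party.partyId + " " + legalName + " " + alias);
    }
}
```

The child resolver depends on the PartyId that the parent resolver has already populated, which is why the party is resolved first and the children afterward.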
Figure 4-3 shows a maintainParty composite transaction sequence diagram in
MDM Party Maintenance Services.
Figure 4-3 Party Maintenance Business Proxies sequence diagram
(Participants: BatchProcessor, MaintainPartyBP, and MaintainPersonNameBP. MaintainPartyBP resolves the identity of the Party and delegates identity resolution of child objects to other BPs. MaintainPersonNameBP resolves the identity of the list of Person Names. An addParty or updateParty transaction is then issued to persist the Party with its children.)
4.2.3 MDM Party Maintenance Services Transaction List
This section describes the 17 composite transactions provided in MDM Party
Maintenance Services. Each transaction can handle only a predefined list of
business objects. If any additional objects are provided as part of the transaction
input, an error is returned. The error occurs because MDM Party Maintenance
Services validate and apply identity resolution logic only on supported entities; all
unsupported entities are rejected.
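The rejection of unsupported objects can be pictured with a short Java sketch. The InputValidationSketch class and its use of plain strings for business object names are hypothetical. The allowed set shown is the input that this section lists for the maintainPersonName transaction, and the real services may report the error differently.

```java
import java.util.List;
import java.util.Set;

// Hypothetical validation helper; not part of MDM Server.
class InputValidationSketch {
    // Objects that maintainPersonName accepts, according to its input description.
    static final Set<String> MAINTAIN_PERSON_NAME_OBJECTS =
            Set.of("TCRMPersonBObj", "TCRMAdminContEquivBObj", "TCRMPersonNameBObj");

    // Throws if the request carries any object outside the predefined list.
    static void validate(Set<String> allowed, List<String> provided) {
        for (String objectName : provided) {
            if (!allowed.contains(objectName)) {
                throw new IllegalArgumentException("Unsupported business object: " + objectName);
            }
        }
    }

    public static void main(String[] args) {
        validate(MAINTAIN_PERSON_NAME_OBJECTS,
                List.of("TCRMPersonBObj", "TCRMAdminContEquivBObj", "TCRMPersonNameBObj"));
        try {
            validate(MAINTAIN_PERSON_NAME_OBJECTS,
                    List.of("TCRMPersonBObj", "TCRMPartyAddressBObj"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```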
MaintainParty
This transaction handles the Person entity or the Organization entity and some of
their child objects, as follows:
Note: No accommodation exists within maintainParty to translate a contact
equivalent into a party ID for CONTACT.provided_by_cont.
 Input object
TCRMPersonBObj or TCRMOrganizationBObj, which contains one
mandatory TCRMAdminContEquivBObj object and an optional list of child
objects:
– TCRMOrganizationNameBObj or TCRMPersonNameBObj.
TCRMOrganizationNameBObj is mandatory for TCRMOrganizationBObj
and TCRMPersonNameBObj is mandatory for TCRMPersonBObj
– TCRMPartyAddressBObj with one TCRMAddressBObj
– TCRMPartyContactMethodBObj with one TCRMContactMethodBObj
– TCRMPartyLobRelationshipBObj
– TCRMAlertBObj
– TCRMPartyPrivPrefBObj
– TCRMPartyValueBObj
– TCRMPartyIdentificationBObj
 Details
The maintainParty transaction searches for the active parties based on the
Business Key from the TCRMAdminContEquivBObj object. If an active party
is found, it is updated using the updateParty transaction. If no active party is
found, a new party is added using the addParty transaction.
This transaction delegates the task of resolving the identity of the party child
object to the following transactions:
– MaintainOrganizationName or MaintainPersonName
– MaintainPartyAddress
– MaintainPartyContactMethod
– MaintainPartyLobRelationship
– MaintainPartyAlert
– MaintainPartyPrivPref
– MaintainPartyValue
– MaintainPartyIdentification
If the auto-expiry for deleted client records is enabled, any existing party’s child
object records in the database which are not provided in the input will be expired
by setting their EndDate values to the server’s current time.
MaintainOrganizationName
This transaction handles the OrganizationName entity as follows:
 Input object
The TCRMOrganizationBObj object, which contains one mandatory
TCRMAdminContEquivBObj object and one mandatory
TCRMOrganizationNameBObj object.
 Details
The maintainOrganizationName transaction searches for active parties based
on the Business Key from the TCRMAdminContEquivBObj object:
– If an active party is found, the maintainOrganizationName transaction
performs identity resolution for the OrganizationName entity using
Business Keys defined for OrganizationName.
– If an existing active OrganizationName is found, it is updated using the
updateOrganizationName transaction; if none is found, a new entity is
added using the addOrganizationName transaction.
MaintainPersonName
This transaction handles the PersonName entity, as follows:
 Input object
TCRMPersonBObj object, which contains one mandatory
TCRMAdminContEquivBObj object and one mandatory
TCRMPersonNameBObj object.
 Details
The maintainPersonName transaction searches for active parties based on
the Business Key from the TCRMAdminContEquivBObj object:
– If an active party is found, the maintainPersonName performs identity
resolution for the PersonName entity using the Business Keys defined for
PersonName.
– If an existing active PersonName is found, it is updated using the
updatePersonName transaction; if none is found, a new entity is added
using the addPersonName transaction.
MaintainPartyAddress
This transaction handles the PartyAddress entity, as follows:
 Input object
TCRMPersonBObj or TCRMOrganizationBObj or TCRMPartyBObj object,
which contain one mandatory TCRMAdminContEquivBObj object and one
mandatory TCRMPartyAddressBObj object. TCRMPartyAddressBObj must
include one TCRMAddressBObj object.
 Details
The maintainPartyAddress transaction searches for active parties based on
the Business Key from the TCRMAdminContEquivBObj object:
– If an active party is found, the maintainPartyAddress transaction performs
identity resolution for the PartyAddress entity using the Business Keys
defined for PartyAddress.
– If an existing active PartyAddress is found, it is updated using the
updatePartyAddress transaction; if none is found, a new entity is added
using the addPartyAddress transaction.
MaintainPartyContactMethod
This transaction handles the PartyContactMethod entity, as follows:
 Input object
TCRMPersonBObj or TCRMOrganizationBObj or TCRMPartyBObj object,
which contains one mandatory TCRMAdminContEquivBObj object and one
mandatory TCRMPartyContactMethodBObj object.
TCRMPartyContactMethodBObj must include one TCRMContactMethodBObj
object.
 Details
The maintainPartyContactMethod transaction searches for active parties
based on the Business Key from the TCRMAdminContEquivBObj object:
– If an active party is found, it performs identity resolution for the
PartyContactMethod entity using the Business Keys defined for
PartyContactMethod.
– If an existing active PartyContactMethod is found, it is updated using the
updatePartyContactMethod transaction; if none is found, a new entity is
added using the addPartyContactMethod transaction.
MaintainPartyLobRelationship
This transaction handles the PartyLobRelationship entity, as follows:
 Input object
TCRMPersonBObj or TCRMOrganizationBObj object, which contain one
mandatory TCRMAdminContEquivBObj object and one mandatory
TCRMPartyLobRelationshipBObj object.
 Details
The maintainPartyLobRelationship transaction searches for active parties
based on the Business Key from the TCRMAdminContEquivBObj object:
– If an active party is found, the maintainPartyLobRelationship performs
identity resolution for the PartyLobRelationship entity using the Business
Keys defined for PartyLobRelationship.
– If an existing active PartyLobRelationship is found, it is updated using the
updatePartyLobRelationship transaction; if none is found, a new entity is
added using the addPartyLobRelationship transaction.
MaintainPartyAlert
This transaction handles the PartyAlert entity, as follows:
 Input object
TCRMPersonBObj or TCRMOrganizationBObj or TCRMPartyBObj object,
which contain one mandatory TCRMAdminContEquivBObj object and one
mandatory TCRMAlertBObj object.
 Details
The maintainPartyAlert transaction searches for active parties based on the
Business Key from the TCRMAdminContEquivBObj object:
– If an active party is found, the maintainPartyAlert transaction performs
identity resolution for the PartyAlert entity using the Business Keys defined
for PartyAlert.
– If an existing active PartyAlert entity is found, it is updated using the
updatePartyAlert transaction; if none is found, a new entity is added
using the addPartyAlert transaction.
MaintainPartyPrivPref
This transaction handles the PartyPrivacyPreference entity, as follows:
 Input object
TCRMPersonBObj or TCRMOrganizationBObj or TCRMPartyBObj object,
which contains one mandatory TCRMAdminContEquivBObj object and one
mandatory TCRMPartyPrivPrefBObj object.
 Details
The maintainPartyPrivPref transaction searches for active parties based on
the Business Key from the TCRMAdminContEquivBObj object:
– If an active party is found, the maintainPartyPrivPref transaction performs
identity resolution for the PartyPrivacyPreference entity using the Business
Keys defined for PartyPrivacyPreference.
– If an existing active PartyPrivacyPreference entity is found, it is updated
using the updatePartyPrivacyPreference transaction; if none is found, a
new entity is added using the addPartyPrivacyPreference transaction.
MaintainPartyValue
This transaction handles the PartyValue entity, as follows:
 Input object
TCRMPersonBObj or TCRMOrganizationBObj or TCRMPartyBObj object,
which contain one mandatory TCRMAdminContEquivBObj object and one
mandatory TCRMPartyValueBObj object.
 Details
The maintainPartyValue transaction searches for active parties based on the
Business Key from the TCRMAdminContEquivBObj object:
– If an active party is found, the maintainPartyValue transaction performs
identity resolution for the PartyValue entity using the Business Keys
defined for PartyValue.
– If an existing active PartyValue is found, it is updated using the
updatePartyValue transaction; if none is found, a new entity is added using
the addPartyValue transaction.
MaintainPartyIdentification
This transaction handles the PartyIdentification entity, as follows:
Note: No accommodation exists within maintainPartyIdentification to translate
a contact equivalent into a party ID for IDENTIFIER.assigned_by.
 Input object
TCRMPersonBObj or TCRMOrganizationBObj or TCRMPartyBObj object,
which contain one mandatory TCRMAdminContEquivBObj object and one
mandatory TCRMPartyIdentificationBObj object.
 Details
The maintainPartyIdentification transaction searches for active parties based
on the Business Key from the TCRMAdminContEquivBObj object. If an active
party is found, the maintainPartyIdentification transaction performs identity
resolution for the PartyIdentification entity using the Business Keys defined for
PartyIdentification. If an existing active PartyIdentification is found, it is
updated using the updatePartyIdentification transaction; if none is found, a
new entity is added using the addPartyIdentification transaction.
Note: Although Party Maintenance Services redefine the Business Key for the
PartyIdentification entity to be IdentificationType, MDM Server already
includes an internal validation to disallow duplicates of IdentificationType and
IdentificationNumber combinations. Therefore, the Business Keys for
PartyIdentification in effect consist of IdentificationType and IdentificationNumber.
MaintainPartyRelationships
This transaction handles the PartyRelationship entity, as follows:
 Input object
TCRMPartyListBObj object, which contains multiple parties, each
represented by a TCRMPartyBObj, TCRMPersonBObj, or
TCRMOrganizationBObj object. Each party must contain one mandatory
TCRMAdminContEquivBObj object.
 Details
The ObjectReferenceId elements must be used to properly identify the source
and target parties within the relationship. Each PartyRelationship object must
have its parent party’s ObjectReferenceId as either its source or target party.
Every PartyRelationship object must define a relationship between a primary
party and one of the other parties in the PartyList. If only two parties are
provided, the first in the PartyList is considered to be the primary party.
If the feature to automatically set the EndDate for deleted records is enabled,
it is possible to provide only one party without any PartyRelationship object in
the PartyList. The purpose of providing this input is to end all the existing
active relationships for that party. If multiple parties are provided, the primary
party’s relationships that are not in the input are ended.
The maintainPartyRelationships transaction searches for all parties based on
the Business Key from the TCRMAdminContEquivBObj objects. The
maintainPartyRelationships transaction searches for existing active
relationships for the primary party using the getAllPartyRelationships
transaction, performs identity resolution for the PartyRelationship entity based
on the PartyRelationship Business Keys, and uses the
updatePartyRelationship or addPartyRelationship transactions to update or
add relationship objects.
This transaction is not called as part of maintainParty because the
assumption is that party relationships will be loaded separately after all the
parties have been added to the system.
MaintainContractPlus
This transaction handles the Contract entity, including child entities
ContractValue, ContractComponent, and ContractAlert, as follows:
Note: No accommodation exists within maintainContractPlus to translate a
contract cross reference into contract Id for CONTRACT.repl_by_contract.
 Input object
TCRMContractBObj, which contains extension TCRMContractPlusBObjExt.
In addition, at least one of the following items is mandatory:
– TCRMAdminNativeKeyBObj child object
– AdminContractId and AdminSysTp elements on TCRMContractBObj
This transaction also has the following optional child objects:
– TCRMContractComponentBObj, along with the following optional child
objects:
• TCRMContractComponentValueBObj
• TCRMContractPartyRoleBObj
• TCRMContractRoleLocationBObj
• TCRMContractAlertBObj
– TCRMContractValueBObj, which is provided as an extended element of
TCRMContractPlusBObjExt
 Details
In the response message, the TCRMContractValueBObj is included as part of
the TCRMContractBObj object, not as an extended element of the
TCRMContractPlusBObjExt.
The maintainContractPlus transaction searches for active contracts based on
the Business Key from either the TCRMAdminNativeKeyBObj child object or
the AdminContractId and AdminSysTp elements on TCRMContractBObj. If
an active contract is found, it is updated using the updateContract transaction;
if none is found, a new contract is added using the addContract transaction.
The maintainContractPlus transaction adds multiple
TCRMAdminNativeKeyBObj entities if multiple TCRMAdminNativeKeyBObj
objects are provided as part of the request. However, only the first
TCRMAdminNativeKeyBObj object is used as the Business Key to search for
an existing Contract entity.
This transaction delegates the task of child object identity resolution to the
following transactions:
– MaintainContractComponent
– MaintainContractAlert
The maintainContractPlus transaction performs identity resolution for
ContractValue objects and maintains them using either the
updateContractValue transaction or the addContractValue transaction.
If the auto-expiry for deleted client records is enabled, any existing
ContractAlert, ContractValue, ContractPartyRole and
ContractPartyRoleLocation records in the database which are not provided in
the input will be expired by setting their EndDate values to the server’s current
time.
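The choice of the contract search key described above can be sketched in Java as follows. The ContractKeySketch class and the colon-joined key string are hypothetical. The text says that either source can supply the Business Key and that only the first TCRMAdminNativeKeyBObj is used for the search. Preferring the native key when both sources are present is an assumption made here, because the text does not state a precedence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that picks the Business Key used to search for a contract.
class ContractKeySketch {
    // Each native key is a two-element array: {adminSystemType, adminContractId}.
    static String searchKey(List<String[]> nativeKeys, String adminSysTp, String adminContractId) {
        if (nativeKeys != null && !nativeKeys.isEmpty()) {
            // Only the first native key is used for the search;
            // any further native keys are stored but not searched on.
            String[] first = nativeKeys.get(0);
            return first[0] + ":" + first[1];
        }
        if (adminSysTp != null && adminContractId != null) {
            return adminSysTp + ":" + adminContractId;
        }
        throw new IllegalArgumentException("No contract Business Key supplied");
    }

    public static void main(String[] args) {
        List<String[]> nativeKeys = new ArrayList<>();
        nativeKeys.add(new String[] {"SYS1", "C-100"});
        nativeKeys.add(new String[] {"SYS2", "C-900"});
        System.out.println(searchKey(nativeKeys, null, null));
        System.out.println(searchKey(new ArrayList<>(), "SYS3", "C-300"));
    }
}
```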
MaintainContractComponent
This transaction handles the ContractComponent entity, including the child
entities ContractComponentValue and ContractPartyRole, as follows:
 Input object
TCRMContractBObj, which must contain the following items:
– At least one of the following items:
• TCRMAdminNativeKeyBObj child object
• AdminContractId and AdminSysTp elements on TCRMContractBObj
– One mandatory TCRMContractComponentBObj object, which can contain
either of the following optional child objects:
• TCRMContractComponentValueBObj
• TCRMContractPartyRoleBObj
 Details
The maintainContractComponent transaction searches for active contracts
based on the Business Key from either the TCRMAdminNativeKeyBObj child
object or the AdminContractId and AdminSysTp elements on the
TCRMContractBObj:
– If an active contract is found, the maintainContractComponent transaction
performs identity resolution for the ContractComponent entity using the
Business Keys defined for ContractComponent.
– If an existing ContractComponent entity is found, it is updated using the
updateContractComponent transaction; if none is found, a new one is
added using the addContractComponent transaction.
This transaction delegates the task of resolving the identity of the
ContractComponent child object to the following transactions:
– MaintainContractComponentValue
– MaintainContractPartyRole
If auto-expiry for deleted client records is enabled, any existing
ContractComponentValue, ContractPartyRole, and
ContractPartyRoleLocation records in the database that are not provided in
the input will be expired by setting their EndDate values to the server’s current
time.
MaintainContractAlert
This transaction handles the ContractAlert entity, as follows:
 Input object
TCRMContractBObj, which contains the following items:
– At least one of the following items:
• TCRMAdminNativeKeyBObj child object, or
• AdminContractId and AdminSysTp elements on TCRMContractBObj
– One mandatory TCRMContractAlertBObj object
 Details
The maintainContractAlert transaction searches for active contracts based on
the Business Key from either the TCRMAdminNativeKeyBObj child object or
the AdminContractId and AdminSysTp elements on TCRMContractBObj:
– If an active contract is found, the maintainContractAlert transaction
performs identity resolution for the ContractAlert entity using the Business
Keys defined for ContractAlert.
– If an existing active ContractAlert is found, it is updated using the
updateContractAlert transaction; if none is found, a new one is added
using the addContractAlert transaction.
MaintainContractComponentValue
This transaction handles the ContractComponentValue entity, as follows:
 Input object
TCRMContractBObj, which contains the following items:
– At least one of the following items:
• TCRMAdminNativeKeyBObj child object, or
• AdminContractId and AdminSysTp elements on TCRMContractBObj
– One mandatory TCRMContractComponentBObj object, which contains its
Business Keys and one mandatory TCRMContractComponentValueBObj
 Details
The maintainContractComponentValue transaction searches for active
contracts based on the Business Key from either the
TCRMAdminNativeKeyBObj child object or the AdminContractId and
AdminSysTp elements on TCRMContractBObj:
– If an active existing Contract is found, it retrieves the list of
ContractComponent objects for the existing contract and performs identity
resolution for the ContractComponent entity using the Business Keys
defined for ContractComponent.
– If an existing ContractComponent is found, the
maintainContractComponentValue transaction performs identity resolution
for the ContractComponentValue entity using the Business Keys defined
for the ContractComponentValue.
– If an existing active ContractComponentValue is found, it is updated using
the updateContractComponentValue transaction; if none is found, a new
one is added using the addContractComponentValue transaction.
MaintainContractPartyRole
This transaction handles the ContractPartyRole entity, including the child entity
ContractRoleLocation, as follows:
 Input object
TCRMContractBObj, which contains the following items:
– At least one of the following items:
• TCRMAdminNativeKeyBObj child object
• AdminContractId and AdminSysTp elements on TCRMContractBObj
– One mandatory TCRMContractComponentBObj object, which contains its
Business Keys and which must contain one TCRMContractPartyRoleBObj
with one TCRMPersonBObj, or one TCRMOrganizationBObj object which
has one mandatory TCRMAdminContEquivBObj object to identify the
party.
 Details
The maintainContractPartyRole transaction searches for active contracts based
on the Business Key from either the TCRMAdminNativeKeyBObj child object or
the AdminContractId and AdminSysTp elements on TCRMContractBObj:
– If an active existing Contract is found, the maintainContractPartyRole
transaction retrieves the list of ContractComponent objects for the existing
contract and performs identity resolution for the ContractComponent entity
using the Business Keys defined for ContractComponent.
– If an existing ContractComponent is found, the maintainContractPartyRole
transaction searches for existing Party Roles based on the party’s
TCRMAdminContEquivBObj and it performs identity resolution for the
ContractPartyRole entity using the Business Keys defined for
ContractPartyRole.
– If an existing active ContractPartyRole is found, it is updated using the
updateContractPartyRole transaction; if none is found, a new one is added
using the addContractPartyRole transaction.
This transaction delegates the task of resolving the identity of the
ContractRoleLocation child object to the MaintainContractRoleLocation
transaction.
If the auto-expiry for deleted client records is enabled, any existing
ContractPartyRole records in the database which are not provided in the input
will be expired by setting their EndDate values to the server’s current time.
MaintainContractRoleLocation
This transaction handles the ContractRoleLocation entity, as follows:
 Input object
TCRMContractBObj, which must contain the following items:
– One of the following items:
• TCRMAdminNativeKeyBObj child object
• AdminContractId and AdminSysTp elements on TCRMContractBObj
– One mandatory TCRMContractComponentBObj object, which contains
the Business Keys for ContractComponent
– One mandatory TCRMContractPartyRoleBObj object, which contains the
following items:
• One Party object (TCRMPersonBObj or TCRMOrganizationBObj) with:
  - One mandatory TCRMAdminContEquivBObj object to identify the
    party
  - One mandatory TCRMPartyAddressBObj or
    TCRMPartyContactMethodBObj object. The Address and
    ContactMethod must exist in the system before you execute this
    transaction
• One mandatory TCRMContractRoleLocationBObj object, which
contains the ObjectReferenceId to either the TCRMPartyAddressBObj
or the TCRMPartyContactMethodBObj object from Party
 Details
The maintainContractRoleLocation transaction searches for active contracts
based on the Business Key from either the TCRMAdminNativeKeyBObj child
object or the AdminContractId and AdminSysTp elements on the
TCRMContractBObj object.
– If an active existing Contract is found, the maintainContractRoleLocation
transaction retrieves the list of ContractComponent objects for the existing
contract and performs identity resolution for the ContractComponent entity
using the Business Keys defined for ContractComponent.
– If an existing ContractComponent is found, the
maintainContractRoleLocation transaction searches for the existing Party
Roles based on the party’s TCRMAdminContEquivBObj and performs
identity resolution for the ContractPartyRole entity using Business Keys
defined for ContractPartyRole.
– If an existing ContractPartyRole is found, the
maintainContractRoleLocation transaction validates the existence of the
Address or Contact Method objects and, if the ContractRoleLocation entity
exists, performs identity resolution for it using the Business Keys defined
for ContractRoleLocation. If an existing active ContractRoleLocation is
found, it is updated using the updateContractRoleLocation transaction; if
none is found, a new one is added using the addContractRoleLocation
transaction.
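The update-or-add decision at the end of this flow can be illustrated with a minimal Python sketch. It is not MDM Server code: the dictionary-based records, the resolve_and_maintain helper, and the sample key fields are hypothetical, and the two callbacks only stand in for the updateContractRoleLocation and addContractRoleLocation transactions.

```python
from typing import Callable, Dict, List, Optional, Sequence

Record = Dict[str, str]


def business_key(record: Record, key_fields: Sequence[str]) -> tuple:
    """Build a comparable key from the configured Business Key fields."""
    return tuple(record.get(field, "") for field in key_fields)


def resolve_and_maintain(
    incoming: Record,
    existing: List[Record],
    key_fields: Sequence[str],
    update: Callable[[Record, Record], None],
    add: Callable[[Record], None],
) -> str:
    """Update the active record that has the same Business Key, or add a new one."""
    wanted = business_key(incoming, key_fields)
    match: Optional[Record] = next(
        (r for r in existing
         if r.get("active") == "Y" and business_key(r, key_fields) == wanted),
        None,
    )
    if match is not None:
        update(match, incoming)  # stands in for updateContractRoleLocation
        return "updated"
    add(incoming)                # stands in for addContractRoleLocation
    return "added"


# Hypothetical key fields and data, for illustration only.
KEY_FIELDS = ("RoleType", "AddressUsageType")
stored = [{"RoleType": "2", "AddressUsageType": "1", "active": "Y", "note": "old"}]


def do_update(target: Record, source: Record) -> None:
    target.update(source)


def do_add(source: Record) -> None:
    stored.append({**source, "active": "Y"})


first = resolve_and_maintain(
    {"RoleType": "2", "AddressUsageType": "1", "note": "new"},
    stored, KEY_FIELDS, do_update, do_add)
second = resolve_and_maintain(
    {"RoleType": "3", "AddressUsageType": "1"},
    stored, KEY_FIELDS, do_update, do_add)
```

The first call finds an active record with the same key and updates it; the second call finds none and adds a record. The same pattern applies at each level of the hierarchy (Contract, ContractComponent, ContractPartyRole) before the ContractRoleLocation level is reached.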
4.2.4 MDM Party Maintenance Services Profile
MDM Party Maintenance Services support a subset of MDM Party domain
business objects that are children of TCRMParty and TCRMContract.
The following child business objects are not supported:
 TCRMPartyAddressPrivPrefBObj
 TCRMAddressValueBObj
 TCRMAddressNoteBObj
 TCRMPartyContactMethodPrivPrefBObj
 TCRMPartyLocationPrivPrefBObj
 TCRMFinancialProfileBObj
 TCRMPartyBankAccountBObj
 TCRMPartyChargeCardBObj
 TCRMPartyPayrollDeductionBObj
 TCRMIncomeSourceBObj
 TCRMVehicleHoldingBObj
 TCRMPropertyHoldingBObj
 TCRMAlertBObj for TCRMContractPartyRoleBObj
 TCRMContractPartyRoleSituationBObj
 TCRMContractPartyRoleIdentifierBObj
 TCRMContractPartyRoleRelationshipBObj
 TCRMContractRelationshipBObj
 TCRMContractRoleLocationPurposeBObj
The MDM Party Maintenance Services profile uses an existing InfoSphere MDM
Server feature called Smart Inquiries to tune the InfoSphere MDM Server
database so that unused tables for unsupported business objects are not accessed.
Smart Inquiries let you reconfigure your server implementation to turn off
parts of the data model related to unused transactions and tables. When these
parts of the model are turned off, no database inquiry is issued against the
unused tables, which improves processing efficiency when the getParty and
getContract coarse-grained transactions are invoked.
Run the ELMDM_Smart_Inquiry.sql script to enable this feature. Access is turned
off for every database table that is included in this script. For more
information about Smart Inquiries, see the IBM InfoSphere Master Data
Management Server Developers Guide, Version 9.0, which is licensed material
that is available with the product.
Table 4-3 lists the operations and function areas that are turned off (not
performed).
Table 4-3 Disabled MDM Server operations
Operational actions                                 Function area not performed
getFinancialProfile                                 Financial Profile
getAllContractAdminSysKeys                          Contract Admin System Key
getAllContractPartyRoleAlerts                       Contract Party Role Alert
getAllContractPartyRoleSituations                   Contract Party Role Situation
getAllContractPartyRoleIdentifierbyContractRoleId   Contract Party Role Identifier
getAllContractRelationships                         Contract Relationship
getAllIncomeSources                                 Income Source
getAllPartyBankAccounts                             Bank Account
getAllPartyChargeCards                              Charge Card
getAllPartyPayrolldeductions                        Payroll Deduction
getAllPartyAddressPrivacyPreferences                Party Address Privacy Preference
getAllPartyContactMethodPrivacyPreferences          Party Contact Method Privacy Preference
getAllContractPartyRoleRelationships                Contract Party Role Relationship
getAllAddressValues                                 Address Values
getAllAddressNotes                                  Address Note
getAllPartyLocationPrivacyPreferences               Location Group Privacy Preference
getHolding                                          Holding
getAllContractRoleLocationPurposes                  Contract Role Location Purposes
4.2.5 MDM Party Maintenance Services installation
MDM Party Maintenance Services must be installed on top of MDM Server.
Installing the Party Maintenance Services modifies the configuration of
InfoSphere MDM Server as follows:
 The Business Keys for entities supported by Party Maintenance Services are
redefined.
 The MDM Party Maintenance Services behavior extensions are deployed to
prevent duplicate entities based on Business Keys.
Installation process overview
Installing the MDM Party Maintenance Services consists of the following tasks:
1. Expand the MDM901_Samples.tar.gz from the InfoSphere MDM Server
distribution to the server directory: <MDM_Sample_Home>.
2. To install Party Maintenance Services, run one of the following installation
shell scripts that is appropriate for your server environment:
– WebSphere and DB2 Environment
i. Edit:
<MDM_Sample_Home>/PartyMaintenanceServices/install/WebSphere/DB2/setVariables.sh
ii. Run:
<MDM_Sample_Home>/PartyMaintenanceServices/install/WebSphere/DB2/install_MDM_PartyMaintenanceServices.sh
– WebSphere and zOS Environment
i. Edit:
<MDM_Sample_Home>/PartyMaintenanceServices/install/WebSphere/zOS/setVariables.sh
ii. Run:
<MDM_Sample_Home>/PartyMaintenanceServices/install/WebSphere/zOS/install_MDM_PartyMaintenanceServices.sh
– WebSphere and Oracle Environment
i. Edit:
<MDM_Sample_Home>/PartyMaintenanceServices/install/WebSphere/Oracle/setVariables.sh
ii. Run:
<MDM_Sample_Home>/PartyMaintenanceServices/install/WebSphere/Oracle/install_MDM_PartyMaintenanceServices.sh
– WebLogic and Oracle Environment
i. Edit:
<MDM_Sample_Home>/PartyMaintenanceServices/install/WebLogic/Oracle/setVariables.sh
ii. Run:
<MDM_Sample_Home>/PartyMaintenanceServices/install/WebLogic/Oracle/install_MDM_PartyMaintenanceServices.sh
– Cluster and WebSphere and DB2 Environment
i. Edit:
<MDM_Sample_Home>/PartyMaintenanceServices/install/Cluster/WebSphere/DB2/setVariables.sh
ii. Run:
<MDM_Sample_Home>/PartyMaintenanceServices/install/Cluster/WebSphere/DB2/install_MDM_PartyMaintenanceServices.sh
– Cluster and WebSphere and zOS Environment
i. Edit:
<MDM_Sample_Home>/PartyMaintenanceServices/install/Cluster/WebSphere/zOS/setVariables.sh
ii. Run:
<MDM_Sample_Home>/PartyMaintenanceServices/install/Cluster/WebSphere/zOS/install_MDM_PartyMaintenanceServices.sh
– Cluster and WebSphere and Oracle Environment
i. Edit:
<MDM_Sample_Home>/PartyMaintenanceServices/install/Cluster/WebSphere/Oracle/setVariables.sh
ii. Run:
<MDM_Sample_Home>/PartyMaintenanceServices/install/Cluster/WebSphere/Oracle/install_MDM_PartyMaintenanceServices.sh
– Cluster and WebLogic and Oracle Environment
i. Edit:
<MDM_Sample_Home>/PartyMaintenanceServices/install/Cluster/WebLogic/Oracle/setVariables.sh
ii. Run:
<MDM_Sample_Home>/PartyMaintenanceServices/install/Cluster/WebLogic/Oracle/install_MDM_PartyMaintenanceServices.sh
Installation steps
Use the following steps to install MDM Party Maintenance Services:
1. Download the InfoSphere MDM Server Samples archive from InfoSphere
MDM Server Samples Packaging (for example, MDM901_Samples.tar.gz) to a
temp directory on the server, <MDM_Sample_Home>, and extract the
content using commands in Example 4-1.
Example 4-1 Extract MDM Samples
gzip -d MDM901_Samples.tar.gz
tar -xvf MDM901_Samples.tar
The TAR file is expanded into several directories. The
install_MDM_PartyMaintenanceServices.sh script is located in the following
subdirectory:
<MDM_Sample_Home>/PartyMaintenanceServices/install/
This script runs several scripts and uses resource files under the following
folders:
<MDM_Sample_Home>/PartyMaintenanceServices/Jars
<MDM_Sample_Home>/PartyMaintenanceServices/properties
<MDM_Sample_Home>/PartyMaintenanceServices/DB
The script refers to each of these resources by a relative path.
2. Edit the setVariables.sh file for your environment, as described in
“Installation process overview” on page 89, and set the variables to the
proper values, including DB_NAME, DB_USER, and DB_PASSWORD.
Example 4-2 shows sample values of setVariables.sh file.
Example 4-2 Sample values of setVariables.sh
export JAVA_HOME=/usr/IBM/WebSphere/AppServer/java
export WAS_HOME=/usr/IBM/WebSphere/AppServer
export CELL_NAME=celebornCell01
export NODE_NAME=Node01
export APP_NAME=CAM_MDM900_10182009_2200_DB2_BE02
export INSTALL_HOME=/usr/IBM/MDM/CAM_MDM900_10182009_2200_DB2_BE02
export DB_NAME=MDM9QA2
export DB_USER=celcam02
export DB_PASSWORD=Schema90
export APPLICATION_NAME='InfoSphere Master Data Management'
export APPLICATION_VERSION=9.0.0
export DEPLOY_NAME=CAM_MDM900_10182009_2200_DB2_BE02
export ADMIN_USER=input-user
export ADMIN_PASSWORD=input-password
These values identify the server on which the InfoSphere MDM Server
application is installed and the location of the MDM.ear file and other
files.
3. To prevent a file permission error when installing the Party Maintenance
Services, in the subdirectory with shell scripts that exists under the
<MDM_Sample_Home>/PartyMaintenanceServices/install/ directory, execute
the command in Example 4-3.
Example 4-3 The chmod command
chmod -R 755 *.sh
4. Run the shell script shown in Example 4-4 to install MDM Party Maintenance Services.
Example 4-4 Install MDMPartyMaintenanceServices shell script
./install_MDM_PartyMaintenanceServices.sh
5. Ignore the message WARNING: Duplicate name in Manifest if it appears in
the server command window while the script is running.
While the script is running, it prompts you to confirm that the application
server is running before it deploys the Party Maintenance Services
InfoSphere MDM Server configuration.
After the Party Maintenance Services InfoSphere MDM Server configuration is
deployed, you can either let the script restart the server automatically or
skip the restart.
This script modifies and updates the MDM.ear file and the JAR files in the
location where the InfoSphere MDM Server instance is installed. The original
EAR and JAR files are copied and saved under a different name as backups.
Each backup file is in the same folder as the original file and has the
.beforeELM extension. Examples are as follows:
MDM.ear.beforeELM
DWLCommonServicesEJB.jar.beforeELM
The log files are in the logs folder under the install folder. After the
install script has run, check these log files for any errors that occurred
during the installation.
6. If InfoSphere MDM Server is installed on a cluster with several nodes, you
must run the script install_MDM_PartyMaintenanceServices.sh in the
following subfolder:
<MDM_Sample_Home>/PartyMaintenanceServices/install/Cluster/
The script updates each InfoSphere MDM Server folder of the cluster. Before
running the script, modify the following variables in the setVariables.sh
script for each environment:
– WAS_HOME
– CELL_NAME
– NODE_NAME
– APP_NAME
Run the script as many times as needed to update every InfoSphere MDM Server
instance in the cluster.
If the script must be run again, first replace the EAR and JAR files with the
original files to avoid errors while the script is running. The logs folder is
backed up with a current time stamp.
The install_MDM_PartyMaintenanceServices.sh script
The install_MDM_PartyMaintenanceServices.sh script performs the following
modifications to the InfoSphere MDM Server instance:
 Runs the following scripts to populate MDM Party Maintenance Services
configuration data into the MDM Server database:
– clearELMDM.sql
– ELMDM_Business_Keys.sql
– ELMDM_Misc_Inserts.sql
– ELMDM_Txn_Names.sql
 Extracts the META-INF/MANIFEST.MF file from DWLCommonServicesEJB.jar and
edits Class-Path to include EntryLevelMDM.jar by appending the string
EntryLevelMDM.jar, preceded by a space, to the end of Class-Path.
 Extracts the DWLCommon_extension.properties file from properties.jar. Copies
the content of DWLCommon_extension_ELMDM.properties and pastes it under
the following line in DWLCommon_extension.properties:
BusinessProxy.Default= com.dwl.base.requestHandler.DWLTxnBP
 Extracts the tcrm_extension.properties file from properties.jar. Copies the
content of tcrm_extension_ELMDM.properties and pastes it at the end of
tcrm_extension.properties.
 Modifies tcrmRequest_extension.xsd by deleting the lines indicated by
DELETE THE CODE BELOW AT DEPLOYMENT text in the files. Replaces
these files in DWLSchemas.jar with the modified ones.
 Backs up the following files:
– MDM.ear
– DWLCommonServicesEJB.jar
– DWLSchemas.jar
– properties.jar
 Adds EntryLevelMDM.jar in the MDM.ear file and WebSphere runtime folder.
 Puts the modified JAR files back in the MDM.ear file and WebSphere runtime
folder.
 Restarts the server, if requested.
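The Class-Path edit described above can be sketched in Python. This is an illustration of the manifest change, not the logic of the install script itself: the function names are hypothetical, the sketch assumes an ASCII manifest, and the guard against adding the JAR twice is an addition that the source does not describe. It accounts for the JAR manifest rule that long lines are wrapped at 72 bytes and that a continuation line starts with a single space.

```python
def fold(line: str, width: int = 72) -> list:
    """Re-wrap one logical manifest line; continuation lines begin with a space."""
    if len(line) <= width:
        return [line]
    parts = [line[:width]]
    rest = line[width:]
    while rest:
        parts.append(" " + rest[: width - 1])
        rest = rest[width - 1:]
    return parts


def append_to_class_path(manifest_text: str, jar_name: str) -> str:
    """Append jar_name, preceded by a space, to the Class-Path attribute."""
    # Unfold continuation lines so that Class-Path becomes one logical line.
    logical = []
    for line in manifest_text.splitlines():
        if line.startswith(" ") and logical:
            logical[-1] += line[1:]
        else:
            logical.append(line)

    out = []
    for line in logical:
        # Hypothetical safety check: skip the append if the JAR is already listed.
        if line.startswith("Class-Path:") and jar_name not in line.split():
            line = line + " " + jar_name
        out.extend(fold(line))
    return "\n".join(out) + "\n"


# Hypothetical manifest content, for illustration only.
sample = "Manifest-Version: 1.0\nClass-Path: Existing1.jar Existing2.jar\n\n"
updated = append_to_class_path(sample, "EntryLevelMDM.jar")
```

Because the sketch skips the append when the JAR is already listed, running it twice leaves the manifest unchanged. The install script does not offer this protection, which is why the original EAR and JAR files must be restored before the script is run again.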
4.3 MDM RDP Runtime Assets
MDM RDP Runtime Assets contain the following components:
 QualityStage (QS) SIF Sequencer Job
 SIF Parser
 RDP Suspect Duplicated Candidate Search Rule
 QS runtime standardization and matching jobs
 QS runtime client for Remote Method Invocation (RMI) or web services
access
 Java classes that act as adapter and converter to support QS runtime jobs
 Java wrapper classes and a converter class that disable MDM Server
Soundex phonetic key generation and pick up Nysiis phonetic keys from QS
runtime jobs
MDM RDP Runtime Assets are installed on top of MDM Party Maintenance
Services and MDM Server.
In the RDP for MDM - Delta Load solution, MDM RDP Runtime Assets preserve the
source-to-SIF work and ensure that suspect duplicate candidate searching,
standardization, and party matching produce the same data loading results as
in direct load.
The SIF Sequencer Job arranges source-to-SIF records in the order required by
MDM Server. SIF Parser parses the sequenced SIF records and converts SIF text
request messages into MDM Server business objects. The RDP Suspect Duplicated
Candidate Search Rule overrides the MDM Server default suspect candidate
search rule to emulate the blocking used in DataStage jobs during direct load.
QS runtime standardization and matching jobs are invoked by MDM services
during the delta load. The QS runtime client is deployed on MDM Server so that
MDM services can access, through RMI or the web services protocol, the IBM
Information Server where the QS jobs are installed.
4.3.1 SIF Parser
SIF Parser supports the source-to-SIF format that is used in RDP for MDM Direct
Load solution. SIF Parser converts data from an SIF input text message into a
persistent transaction Java object before sending the request to the MDM Party
Maintenance Services transaction. An appropriate transaction name is chosen,
based on a combination of the Record_Type and SubRecord_Type. For example,
for Record_Type=P and SubRecord_Type=I, the SIF Parser issues a
maintainPartyIdentification transaction, because the SIF record of this type of
transaction contains the data for the PartyIdentification entity.
SIF Parser is enabled by setting the entry sif_compatibility_mode = on in the
DWLCommon_extension.properties file. The MDM RDP Runtime Assets installation
script makes this change automatically.
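The way a transaction name is chosen from the record type and subtype can be sketched as a simple lookup. This Python sketch is illustrative only and is not the SIF Parser implementation: the function name and the sample record are hypothetical, and the table holds only the P|I mapping that this section documents.

```python
# Only the P|I entry is documented in this section; further entries would come
# from the MDM Party Maintenance Services transaction list.
TRANSACTION_BY_TYPE = {
    ("P", "I"): "maintainPartyIdentification",
}


def select_transaction(sif_record: str) -> str:
    """Return the transaction name for a pipe-delimited SIF record."""
    fields = sif_record.split("|")
    if len(fields) < 2:
        raise ValueError("SIF record needs Record_Type and SubRecord_Type")
    key = (fields[0], fields[1])
    try:
        return TRANSACTION_BY_TYPE[key]
    except KeyError:
        raise ValueError(
            "no transaction configured for %s|%s" % key) from None


# Hypothetical record content after the first two fields.
name = select_transaction("P|I|1|DSSA1010|")
```

Keeping the mapping in a table rather than in branching code means that supporting an additional record type is a data change only.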
SIF records hierarchy
SIF Parser supports SIF records listed in Table 4-4.
Table 4-4 SIF Parser supported SIF records
Item  Rectype|Subtype  InfoSphere MDM Server Business Objects
1     P|P              Party/Person
2     P|O              Party/Organization
3     P|H              PersonName
4     P|G              OrganizationName
5     P|A              Address/LocationGroup/AddressGroup
6     P|C              ContactMethod/LocationGroup/ContactMethodGroup
7     P|I              Identification
8     P|B              LOBRelationship
9     P|R              PartyRelationship
10    P|M              PartyValue
11    P|S              PartyPrivacyPreference
12    P|T              PartyAlert
13    C|H              ContractPlus/Contract
14    C|C              ContractComponent
15    C|R              ContractPartyRole
16    C|L              ContractRoleLocation
17    C|V              ContractComponentValue
18    C|M              ContractPlus/ContractValue
19    C|A              ContractAlert
SIF Parser supports party SIF records hierarchy and contract SIF records
hierarchy.
Party SIF hierarchy
Figure 4-4 shows party SIF records hierarchy.
P|P and P|O (root records)
  Children of P|P or P|O: P|A, P|C, P|I, P|B, P|M, P|S, P|T
  Child of P|P: P|H
  Child of P|O: P|G
P|R
Note: P|R does not have any parent or child SIF record.
Figure 4-4 Party SIF records hierarchy
Contract SIF Hierarchy
Figure 4-5 shows contract SIF records hierarchy.
Records shown: C|H, C|M, C|C, C|R, C|T, C|V, C|L
Figure 4-5 Contract SIF records hierarchy
SIF metadata
MDM RDP Runtime Assets have 17 SIF metadata files in SIF.jar, which is located
in the <RDP_Assets_Home>/MDMRDPRuntime/jar directory. The SIF metadata files
for the supported SIF records are in the following list. SIF Parser converts
a SIF input message to MDM business objects based on these SIF metadata
specifications. If you customize the SIF format, you must update the SIF
metadata accordingly.
 Alert.sif
 Contact.sif
 Contract.sif
 ContractComponent.sif
 ContractComponentValue.sif
 ContractPartyRole.sif
 ContractRoleLocation.sif
 ContractValue.sif
 OrganizationName.sif
 PartyAddress.sif
 PartyContactMethod.sif
 PartyIdentification.sif
 PartyLOBRelationship.sif
 PartyPrivPref.sif
 PartyRelationship.sif
 PartyValue.sif
 PersonName.sif
SIF Business Keys
SIF Business Keys are used to determine SIF records hierarchy relationship. SIF
Business Keys are only necessary for MDM Server business object hierarchies
that have three or more levels, such as C|R, C|V, and C|L records.
The SIF Business Keys are defined in SIF.properties as shown in Example 4-5
on page 99.
Example 4-5 SIF Business Key
#############################################################
#SIF Business Key
###############################################################
BusinessKey.C.C=ContractComponentType,ProductType
BusinessKey.C.V=ContractComponentType,ProductType,DomainValueType
BusinessKey.C.R=ContractComponentType,ProductType,RoleType
BusinessKey.C.L=ContractComponentType,ProductType,RoleType,AddressUsageType
####################################
# SIF SPEC for Business Key
####################################
ContractComponentType=CONTR_COMP_TP_CD
ProductType=PROD_TP_CD
RoleType=CONTR_ROLE_TP_CD
AddressUsageType=ADDR_USAGE_TP_CD
DomainValueType=DOMAIN_VALUE_TP_CD
If there are customized InfoSphere MDM Server Business Keys defined in the
V_ELEMENTATTRIBUTE table, the related SIF Business Key and SIF SPEC for
Business Key must be changed accordingly.
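The two-step lookup in Example 4-5 (record type to Business Key element names, then element name to SIF field name) can be sketched as follows. The property text is copied from Example 4-5; the helper names and the dictionary form of the SIF record are hypothetical and chosen only for illustration.

```python
SIF_PROPERTIES = """\
BusinessKey.C.C=ContractComponentType,ProductType
BusinessKey.C.V=ContractComponentType,ProductType,DomainValueType
BusinessKey.C.R=ContractComponentType,ProductType,RoleType
BusinessKey.C.L=ContractComponentType,ProductType,RoleType,AddressUsageType
ContractComponentType=CONTR_COMP_TP_CD
ProductType=PROD_TP_CD
RoleType=CONTR_ROLE_TP_CD
AddressUsageType=ADDR_USAGE_TP_CD
DomainValueType=DOMAIN_VALUE_TP_CD
"""


def load_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def key_columns(props: dict, rectype: str, subtype: str) -> list:
    """Resolve a record type to the SIF field names that form its Business Key."""
    names = props["BusinessKey.%s.%s" % (rectype, subtype)].split(",")
    return [props[name.strip()] for name in names]


def business_key(props: dict, record: dict) -> tuple:
    """Extract the Business Key values from a SIF record held as a dict."""
    columns = key_columns(props, record["RECTYPE"], record["SUBTYPE"])
    return tuple(record.get(column, "") for column in columns)


props = load_properties(SIF_PROPERTIES)
columns = key_columns(props, "C", "L")
key = business_key(props, {"RECTYPE": "C", "SUBTYPE": "R",
                           "CONTR_COMP_TP_CD": "1", "PROD_TP_CD": "1",
                           "CONTR_ROLE_TP_CD": "2"})
```

Two records of the same type with equal key tuples would be treated as the same entity, which is why a change to the MDM Server Business Keys must be mirrored in both sections of SIF.properties.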
4.3.2 Data extension and SIF Parser configuration
MDM Server implementations and customizations often have a business
requirement to extend the MDM Server default data model. SIF Parser supports
MDM Server data extension. This section provides information about the SIF
metadata, SIF.properties, and SIF input format configuration that allows SIF
Parser to work with an MDM Server data extension.
Assume that you have a business requirement to extend the MDM Server
CONTRACTROLE table by adding two new columns, with the following names:
 Base table name: CONTRACTROLE
 Base MDM business object name: TCRMContractPartyRoleBObj
 Data extension table name: XCONTRACTROLE
 Extension column names: xsold_indicator, xsold_store_number
 Extension business object name: XContractPartyRoleBObjExt
SIF metadata configuration
If CONTRACTROLE table is extended, the ContractPartyRole.sif metadata file
must be modified accordingly.
The three updates in the metadata file are as follows:
1. Define the extension business object to replace the base business object,
such as using XContractPartyRoleBObjExt to replace
TCRMContractPartyRoleBObj in Example 4-6.
2. Define data extension fields in metadata file. The primary key field of data
extension table should also be defined in metadata file. See the
ContractPartyRole Extension Fields section in Example 4-6. The extension
fields must be defined before the NULL_XXX fields.
3. Define a NULL field for each data extension field and for the primary key
field at the end of the metadata file. See the ContractPartyRole Extension
Null Fields section in Example 4-6.
Example 4-6 ContractPartyRole SIF metadata with extension
#######################################################################
# SIF metadata: ContractPartyRole
# Version 1.0.0
## Definition:
# column 1: SIF_FIELD_NAME
# column 2: IS_NULLABLE
# column 3: BOBJ_FIELD_NAME (optional)
# Applies to data fields only, first character must be upper case
# column 4: BOBJ_CLASS (optional)
# Applies when the data field is not defined in the subType mapped BObj
# Fully qualified class name if present
######################################################################
RECTYPE                 N
SUBTYPE                 N
ADMIN_SYS_TP_CD         N
ADMIN_CONTRACT_ID       N
LOAD_TYPE               N
ADMIN_CLIENT_SYS_TP_CD  N  AdminSystemType         com.dwl.tcrm.coreParty.component.TCRMAdminContEquivBObj
ADMIN_CLIENT_ID         N  AdminPartyId            com.dwl.tcrm.coreParty.component.TCRMAdminContEquivBObj
CONTR_COMP_TP_CD        Y  ContractComponentType   com.dwl.tcrm.financial.component.TCRMContractComponentBObj
PROD_TP_CD              N  ProductType             com.dwl.tcrm.financial.component.TCRMContractComponentBObj
CONTR_ROLE_TP_CD        N  RoleType                com.ibm.cmdm.mdm.extension.component.XContractPartyRoleBObjExt
REGISTERED_NAME         Y  RegisteredName
DISTRIB_PCT             Y  DistributionPercentage
IRREVOC_IND             Y  IrrevokableIndicator
START_DT                Y  StartDate
END_DT                  Y  EndDate
RECORDED_START_DT       Y  RecordedStartDate
RECORDED_END_DT         Y  RecordedEndDate
SHARE_DIST_TP_CD        Y  ShareDistributionType
ARRANGEMENT_TP_CD       Y  ArrangementType
ARRANGEMENT_DESC        Y  ArrangementDescription
END_REASON_TP_CD        Y  RoleEndReasonType
## ContractPartyRole Extension Fields
xcontract_role_id       N  XContractRoleId
xsold_indicator         Y  XSoldIndicator
xsold_store_number      Y  XSoldInStoreNumber
NULL_REGISTERED_NAME    N
NULL_DISTRIB_PCT        N
NULL_IRREVOC_IND        N
NULL_END_DT             N
NULL_RECORDED_START_DT  N
NULL_RECORDED_END_DT    N
NULL_SHARE_DIST_TP_CD   N
NULL_ARRANGEMENT_TP_CD  N
NULL_ARRANGEMENT_DESC   N
NULL_END_REASON_TP_CD   N
## ContractPartyRole Extension NULL fields
NULL_xcontract_role_id  N
NULL_xsold_in_field_ind N
NULL_xsold_in_store_nbr N
SIF properties file configuration
The SIF.properties file might need several updates for data extension. Note
the following steps:
1. Map the SIF record/sub_record type to its top-level business object
hierarchy. The top business object is one of the following objects:
– TCRMPersonBObj
– TCRMOrganizationBObj
– TCRMPartyBObj
– TCRMPartyListBObj
– TCRMContractBObj
– TCRMContractPlusBObjExt
If the top business object is not extended, no configuration change is needed
in SIF.properties part 1.
2. Define the SIF record/sub_record type to business object mapping. If a
business object is extended because the corresponding data model table is
extended, the SIF record/sub_record type must be mapped to the extended
business object. For Example 4-6 on page 100, SIF.properties file part 2 must
be changed as shown in Example 4-7.
Example 4-7 SIF.properties updated for sub_type.C.R
sub_type.C.R=com.ibm.cmdm.mdm.extension.component.XContractPartyRoleBObjExt
3. Define a child-parent relationship navigator between business object classes.
Each non-root business object has one definition. TCRMContractBObj is the
root business object in the MDM Server application party domain. Based on the
SIF metadata in Example 4-6 on page 100, SIF.properties part 4 must be
changed as shown in Example 4-8.
Example 4-8 SIF.properties updated for parent child relationship navigator
navigator.com.dwl.tcrm.coreParty.component.TCRMPartyBObj=com.ibm.cmdm.mdm.extension.component.XContractPartyRoleBObjExt
navigator.com.ibm.cmdm.mdm.extension.component.XContractPartyRoleBObjExt=com.dwl.tcrm.financial.component.TCRMContractComponentBObj
navigator.com.dwl.tcrm.financial.component.TCRMContractRoleLocationBObj=com.ibm.cmdm.mdm.extension.component.XContractPartyRoleBObjExt
4. Define the SIF Business Key. If customization or data extension changes the
MDM Server default Business Keys, the SIF Business Key and the SIF metadata
for the Business Key must be redefined in the SIF Business Key and SIF SPEC
for Business Key sections of the SIF.properties file.
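The navigator entries from step 3 form a child-to-parent map that can be followed upward from any business object. The following Python sketch shows that traversal using the three entries from Example 4-8. The helper names are hypothetical, and a complete SIF.properties file would contain further navigator entries that are not shown in this section.

```python
NAVIGATOR_PROPERTIES = """\
navigator.com.dwl.tcrm.coreParty.component.TCRMPartyBObj=com.ibm.cmdm.mdm.extension.component.XContractPartyRoleBObjExt
navigator.com.ibm.cmdm.mdm.extension.component.XContractPartyRoleBObjExt=com.dwl.tcrm.financial.component.TCRMContractComponentBObj
navigator.com.dwl.tcrm.financial.component.TCRMContractRoleLocationBObj=com.ibm.cmdm.mdm.extension.component.XContractPartyRoleBObjExt
"""

PREFIX = "navigator."


def load_navigators(text: str) -> dict:
    """Return a child-class to parent-class map from navigator.* entries."""
    navigators = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key.startswith(PREFIX):
            navigators[key[len(PREFIX):]] = value.strip()
    return navigators


def ancestors(child: str, navigators: dict) -> list:
    """Follow parent links upward; raise if the configuration loops."""
    chain = []
    seen = {child}
    while child in navigators:
        parent = navigators[child]
        if parent in seen:
            raise ValueError("navigator cycle at " + parent)
        chain.append(parent)
        seen.add(parent)
        child = parent
    return chain


navigators = load_navigators(NAVIGATOR_PROPERTIES)
chain = ancestors(
    "com.dwl.tcrm.financial.component.TCRMContractRoleLocationBObj", navigators)
```

With only these three entries, the chain for TCRMContractRoleLocationBObj ends at TCRMContractComponentBObj. The traversal also shows why every navigator entry that names the base class on either side must be switched to the extension class: an entry that still points to TCRMContractPartyRoleBObj would break the chain.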
SIF input format configuration
The SIF input format must match the corresponding extended SIF metadata. For
the metadata in Example 4-6 on page 100, a ContractPartyRole SIF input record
has the content shown in Example 4-9. The Y is the value of the
xsold_indicator field, and 101 is the value of xsold_store_number.
Example 4-9 SIF input record created based on SIF metadata in Example 4-6
C|R|1|DSSA1011||1|DSSA1010|1|1|2|||N||||||||||Y|101|0|1|0|1|0|0|0|0|0|0|0|0|0|
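The correspondence between the metadata field order and the positions in the input record can be checked with a short Python sketch. It is an illustration rather than the SIF Parser logic: the function name is hypothetical, the field list is taken from the ContractPartyRole metadata in Example 4-6, and the record reproduces Example 4-9 with the run of nine empty fields written as a repeated delimiter so that the count is visible.

```python
FIELDS = """
RECTYPE SUBTYPE ADMIN_SYS_TP_CD ADMIN_CONTRACT_ID LOAD_TYPE
ADMIN_CLIENT_SYS_TP_CD ADMIN_CLIENT_ID CONTR_COMP_TP_CD PROD_TP_CD
CONTR_ROLE_TP_CD REGISTERED_NAME DISTRIB_PCT IRREVOC_IND START_DT END_DT
RECORDED_START_DT RECORDED_END_DT SHARE_DIST_TP_CD ARRANGEMENT_TP_CD
ARRANGEMENT_DESC END_REASON_TP_CD
xcontract_role_id xsold_indicator xsold_store_number
NULL_REGISTERED_NAME NULL_DISTRIB_PCT NULL_IRREVOC_IND NULL_END_DT
NULL_RECORDED_START_DT NULL_RECORDED_END_DT NULL_SHARE_DIST_TP_CD
NULL_ARRANGEMENT_TP_CD NULL_ARRANGEMENT_DESC NULL_END_REASON_TP_CD
NULL_xcontract_role_id NULL_xsold_in_field_ind NULL_xsold_in_store_nbr
""".split()

# Same content as Example 4-9: START_DT through END_REASON_TP_CD and
# xcontract_role_id are empty, followed by the two extension values and the
# thirteen NULL flags.
RECORD = (
    "C|R|1|DSSA1011||1|DSSA1010|1|1|2|||N|"
    + "|" * 9
    + "Y|101|"
    + "0|1|0|1|0|0|0|0|0|0|0|0|0|"
)


def parse_sif_record(line: str, fields: list) -> dict:
    """Pair each pipe-delimited value with the metadata field at that position."""
    values = line.rstrip("\n").split("|")
    if values and values[-1] == "":
        values.pop()  # drop the empty token after the trailing delimiter
    if len(values) != len(fields):
        raise ValueError(
            "expected %d fields, found %d" % (len(fields), len(values)))
    return dict(zip(fields, values))


parsed = parse_sif_record(RECORD, FIELDS)
```

The length check is the useful part of the sketch: if an extension field is added to the metadata but not to the input records, or the reverse, every later value shifts by one position, and a count mismatch is the earliest sign of that.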
4.3.3 SIF sequencer
The SIF format in direct load allows records of various types and subtypes to
appear in the same input load file. DataStage and QualityStage jobs can process
these records because referential integrity on the MDM Server database is
dropped during direct load. Also, every line in a SIF input file has one and
only one SIF record. No SIF records hierarchy relationship is present.
MDM Party Maintenance Services transactions invoke MDM Server core
transactions that check referential integrity. Therefore, a child record
cannot be loaded before its parent record when maintenance transactions are
used.
SIF Sequencer is built to sort and merge SIF records according to the SIF
records hierarchy. Sequencer determines the related SIF records and appends
them to the end of their parent SIF record.
SIF Sequencer has three main objectives:
 Determine related SIF records and append them to the end of their parent SIF
record.
 Reduce the number of transactions sent to MDM Server by letting a
maintainParty or maintainContractPlus coarse-grained transaction load the
data instead of invoking several granular transactions.
 Improve performance.
Sequenced SIF records order
Sequencer reads the source-to-SIF input file, sorts and appends the input SIF
records by source system cross references, and generates one or more
sequenced SIF output files, in the following order:
1. It appends all the child SIF records to the end of root SIF records (P|P, P|O,
and C|H) first.
2. It appends all the grandchild SIF records to their parent second, and so on.
If a non-root SIF record does not have a matching parent SIF record, it may still
have child and grandchild SIF records. Sequencer appends the related
parent-child SIF records together.
A sequenced SIF file (or files) has the following layout:
 Party SIF records are in the top section of the SIF file.
 Party child SIF records that have no matching cross references come after the
party SIF records. The order among party child SIF records does not matter.
 P|R SIF records come after the party child SIF records.
 Contract SIF records come after all party child SIF records. Contract child
and grandchild SIF records come after the contract SIF records.
Example 4-10 shows a SIF Sequencer generated SIF file.
Example 4-10 SIF Sequencer-generated SIF file
P|P…P|H…P|H…P|A…P|C…P|C…P|I…P|I…P|M…P|T
P|O…P|G…P|H…P|A…P|C…P|I…P|M…P|T
P|P…P|H…P|H…P|A…P|A…P|C…P|I…P|I…P|M…P|T
P|P…P|H
P|A…
P|C…
P|M…
P|I….
C|H…C|M…C|C…C|C…C|R…C|R…C|R…C|L…C|L…
C|H…C|C... C|T…C|V…C|R…C|V…C|L…C|L…
C|C…C|R…C|V…C|L
C|T…
C|V…
C|R…C|L
C|R…
C|L…
C|M…
SIF Sequencer can have up to 19 SIF input files, depending on the source
system and the tool used to extract the source data. SIF Sequencer can
generate one output SIF file with all SIF records in order.
Alternatively, it can generate three sequenced SIF output files:
 Party.SIF
The Party.SIF file contains all party SIF records and their child SIF records.
 Contract.SIF
The Contract.SIF file includes all contract SIF records and their child and
grandchild SIF records, plus C|M SIF records.
 PartyRelationship.SIF
The PartyRelationship.SIF file has all party relationship SIF records.
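The sort-and-append behavior of the sequencer can be approximated with a small Python sketch. It is a simplification under stated assumptions and not the SIF Sequencer Job itself, which is a DataStage and QualityStage job: the sketch assumes that the third and fourth fields of every record hold the source system cross reference, it joins a parent and its children by plain concatenation, and it handles only one level of children below the root records.

```python
from collections import defaultdict

ROOT_TYPES = {("P", "P"), ("P", "O"), ("C", "H")}


def xref(record: str) -> tuple:
    """Record type plus the assumed cross-reference fields (positions 3 and 4)."""
    fields = record.split("|")
    return (fields[0], fields[2], fields[3])


def sequence(records: list) -> list:
    """Merge child records onto their root and order the output sections."""
    roots = {}                    # cross reference -> root record
    children = defaultdict(list)  # cross reference -> child records
    relationships = []            # P|R records have no parent or child
    for record in records:
        rectype, subtype = record.split("|")[:2]
        if (rectype, subtype) == ("P", "R"):
            relationships.append(record)
        elif (rectype, subtype) in ROOT_TYPES:
            roots[xref(record)] = record
        else:
            children[xref(record)].append(record)

    def section(rectype: str) -> list:
        merged = ["".join([root] + children.get(key, []))
                  for key, root in roots.items() if key[0] == rectype]
        orphans = [rec for key, group in children.items()
                   if key[0] == rectype and key not in roots
                   for rec in group]
        return merged + orphans

    # Party records, then unmatched party children, then P|R, then contracts.
    return section("P") + relationships + section("C")


# Hypothetical input records, for illustration only.
unsorted_records = [
    "C|C|1|K100|x|",
    "P|H|1|A1|Smith|",
    "C|H|1|K100|",
    "P|P|1|A1|",
    "P|R|1|A1|1|A2|",
    "P|A|1|A9|Main St|",
]
sequenced = sequence(unsorted_records)
```

In this input, the P|H record is appended to its P|P parent, the P|A record has no matching party and therefore follows the party records on its own line, and the contract section comes last.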
Run SIF Sequencer Job
See section 4.5.2, “Run SIF Sequencer Job” on page 125 for details about how
to run SIF Sequencer Job.
4.3.4 QualityStage runtime standardization and matching jobs
MDM RDP Runtime Assets contain four QualityStage runtime jobs for person
name, organization name, address, and phone number standardization. These
jobs allow delta load transactions that go through MDM Server to use the same
standardization rules as RDP for MDM Direct Load.
MDM RDP Runtime Assets also contain two QualityStage runtime jobs for
person and organization suspect duplicate matching. The matching algorithm
differs from the RDP for MDM Direct Load jobs. However, these runtime jobs
provide a similar level of functionality to the direct load jobs. The MDM
Server Matching Critical Data Rules Console UI can be used to dynamically
manage the critical data match strings for person and organization. The
QualityStage runtime jobs take these selections into account.
MDM RDP Runtime Assets also have an adapter and a converter to support the
QualityStage runtime jobs.
4.3.5 Search Suspect Candidates rule
The suspect duplicate candidate selection algorithm for RDP for MDM Direct
Load jobs differs from the algorithm provided with the default candidate selection
rule in InfoSphere MDM Server. To provide a similar level of functionality between
RDP for MDM Direct Load and RDP for MDM Delta Load, MDM RDP Runtime
Assets provide a search suspect candidates rule that overrides the MDM Server
default implementation and implements the same blocking algorithm as used in
the direct load.
Blocking fields for each candidate selection pass are described in Table 4-5.
Table 4-5 Blockings in search suspect candidate rule
Pass number  Blocking fields for person match                                Blocking fields for organization match
Pass 1       Last Name Phonetics, Street Name Phonetics, Postal Code         Word 1 Phonetics, Street Name Phonetics, Postal Code
Pass 2       Last Name Phonetics, Box Number, Postal Code                    Word 1 Phonetics, Box Number, Postal Code
Pass 3       Last Name Phonetics, Rural Route, Postal Code                   Word 1 Phonetics, Rural Route, Postal Code
Pass 4       Last Name Phonetics, Street Name Phonetics, CityName Phonetics  Word 1 Phonetics, Street Name Phonetics, CityName Phonetics
Pass 5       Last Name Phonetics, Box Number, CityName Phonetics             Word 1 Phonetics, Box Number, CityName Phonetics
Pass 6       Last Name Phonetics, Rural Route, CityName Phonetics            Word 1 Phonetics, Rural Route, CityName Phonetics
Pass 7       Last Name Phonetics, National Identifier                        Word 1 Phonetics, National Identifier
The search suspect candidates rule takes into account the matching critical data
fields selection that can be configured using MDM Server Matching Critical Data
Rules Console UI. If a critical field, such as street name, is not configured to
participate in matching, then any pass that includes this field (such as passes 1
and 4) is not used in candidate selection.
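The effect of the critical data selection on the passes can be sketched in Python. The pass definitions are the person match column of Table 4-5; the function name is hypothetical, and the sketch assumes that a critical data field is identified by the same label as the blocking field it feeds.

```python
PERSON_PASSES = {
    1: ("Last Name Phonetics", "Street Name Phonetics", "Postal Code"),
    2: ("Last Name Phonetics", "Box Number", "Postal Code"),
    3: ("Last Name Phonetics", "Rural Route", "Postal Code"),
    4: ("Last Name Phonetics", "Street Name Phonetics", "CityName Phonetics"),
    5: ("Last Name Phonetics", "Box Number", "CityName Phonetics"),
    6: ("Last Name Phonetics", "Rural Route", "CityName Phonetics"),
    7: ("Last Name Phonetics", "National Identifier"),
}


def active_passes(passes: dict, disabled_fields: set) -> list:
    """Keep only the passes whose blocking fields are all enabled."""
    return [number for number, fields in passes.items()
            if not set(fields) & set(disabled_fields)]


# Street name is not configured to participate in matching.
remaining = active_passes(PERSON_PASSES, {"Street Name Phonetics"})
```

With street name excluded, passes 1 and 4 drop out and the remaining passes still run, which matches the behavior described above. Excluding a field that appears in every pass, such as the last name phonetics, would leave no pass for candidate selection.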
4.3.6 Disable phonetic keys generation in MDM Server
MDM Server generates Soundex phonetic keys by default, but RDP for MDM
Direct Load uses Nysiis phonetic keys. To use the same phonetic keys in delta
load, MDM RDP Runtime Assets disable MDM Server Soundex phonetic key
generation, pick up the Nysiis phonetic keys passed in from QualityStage
runtime jobs, forward them to MDM Server, and eventually persist them to the
database.
MDM RDP Runtime Assets provide dummy phonetic key generator wrapper
classes that override the MDM Server default Soundex phonetic key generator
classes to disable Soundex phonetic key generation in MDM, and a converter
class that picks up the Nysiis phonetic keys for person name, organization
name, and address that are generated by and passed in from QualityStage
runtime jobs.
The MDM RDP Runtime Assets installation script modifies the MDM Server
tcrm_extension.properties file and CONFIGELEMENT table to replace the
Soundex phonetic key generator configuration in MDM Server so that the Nysiis
phonetic keys generated by QualityStage runtime jobs are used.
4.3.7 MDM RDP Runtime Assets installation
MDM Server and MDM Party Maintenance Services must be installed before
installing MDM RDP Runtime Assets.
The SIF Sequencer Job must already be installed on the IBM InfoSphere
Information Server server. It is installed with the DataStage and QualityStage
RDP project when RDP for MDM Direct Load is set up.
The QualityStage components that are required for MDM and QualityStage
integration are listed in Table 4-6. These components can be found in the
<RDP_Assets_Home>/MDMRDPRuntime/QualityStage directory.
Table 4-6 QualityStage runtime components
Component name          Description
ELMDMQS.dsx             DataStage/QualityStage job export. Contains source code
                        to be imported into your environment through the
                        DataStage/QualityStage Designer Client.
ELMDMQS_ISDProject.xml  WebSphere Information Services Director (ISD) project
                        export. Contains service definitions to be imported
                        into your environment through the InfoSphere
                        Information Server Console.
You must install the QualityStage runtime jobs in ELMDMQS.dsx and use WebSphere
ISD to deploy the services for the RMI interface that are defined in the
ELMDMQS_ISDProject.xml file. The installation steps are similar to those in the
“Installing DataStage and QualityStage jobs” section in Chapter 54 of the
InfoSphere Master Data Management Server Version 9.0 Developers Guide (licensed
material available with the product). The difference is that you install
ELMDMQS.dsx and ELMDMQS_ISDProject.xml instead of MDMQS.dsx and
MDMQS_ISDProject.xml.
When deploying the ELMDMQS_ISDProject.xml file, you must also edit the
operations, as shown in Table 4-7. In the table, ISD is Information Services
Director, and DS is DataStage.
Table 4-7 MDM RDP ISD Operations
ISD operation name       DS job name                                      Inputs accept array  Input data type     Outputs return array  Output data type
elOrgMatch               RDP_MDMISD_Party_Suspect_Reference_Match_Org     Yes                  ELOrgMatchInput     Yes                   ELOrgMatchOutput
elPersonMatch            RDP_MDMISD_Party_Suspect_Reference_Match_Person  Yes                  ELPersonMatchInput  Yes                   ELPersonMatchOutput
standardizePersonName    RDP_MDMISD_Person_Standardization                                     PersonInput                               PersonOutput
standardizeAddress       RDP_MDMISD_Address_Standardization                                    AddressInput                              AddressOutput
standardizeOrganization  RDP_MDMISD_Organization_Standardization                               OrganizationInput                         OrganizationOutput
standardizePhone         RDP_MDMISD_Phone_Standardization                                      PhoneInput                                PhoneOutput
This installation process does not cover the QualityStage runtime jobs
installation. You can follow similar installation steps as in the “Installing
DataStage and QualityStage jobs” section in Chapter 54 of the InfoSphere
Master Data Management Server Version 9.0 Developers Guide (licensed
material available with the product).
MDM RDP Runtime Assets installation scripts are grouped by application server
and database type. To install the assets, use the following basic steps:
1. Choose the installation scripts under the server and database combination
that matches your server environment.
2. Edit the setVariables.sh script as appropriate.
3. Execute the install_RDP_Assets.sh script to install the InfoSphere MDM
Server Rapid Deployment Package assets.
The install_RDP_Assets.sh script
The install_RDP_Assets.sh script performs the following modifications to the
InfoSphere MDM Server application:
• Executes the following SQL scripts to create and modify tables in the MDM
Server database:
– AlterTables.sql
– Create_IS_Tables.sql
– Create_Seq_Objects.sql
• Executes one of the following combinations of SQL scripts, depending on the
variables defined in the setVariables.sh script:
– Altered_Compound_Triggers.sql and
Altered_Delete_Compound_Triggers.sql
– Altered_Simple_Triggers.sql and
Altered_Delete_Simple_Triggers.sql
• Executes the ELMDM_Configelement.sql script.
• Extracts the following JAR files into a temporary folder for modification:
– DWLCommonServicesEJB.jar
– properties.jar
• Extracts the META-INF/MANIFEST.MF file from DWLCommonServicesEJB.jar and
edits the class path to include the following JAR files:
– MDMRDPRuntime.jar
– SIF.jar
– ELMDMQS_client.jar
• For WebLogic Application Server, extracts the MANIFEST.MF file from
PartyEJB.jar and adds ELMDMQSWS_client.jar to the class path, and updates the
ejb-jar.xml file with a reference to the IBM InfoSphere Information
Server web services.
• Extracts the tcrm_extension.properties and
DWLCommon_extension.properties files from properties.jar, adds the contents of
tcrm_extension_ELMDM.properties and
DWLCommon_extension_ELMDM.properties to the two properties files, and then
reinserts the files into the properties.jar file.
• Adds SIF.properties to the properties.jar file.
• Adds the following files to the MDM.ear file:
– MDMRDPRuntime.jar
– SIF.jar
– ELMDMQSWS_client.jar
– ELMDMQS_client.jar
• Adds the following modified JAR files to the MDM.ear file:
– DWLCommonServicesEJB.jar
– properties.jar
• Deploys the changes to the InfoSphere MDM Server application.
Installation steps
Perform the following steps to install MDM RDP Runtime Assets:
1. From the RDP FTP distribution site, download the InfoSphere MDM Server
RDP assets archive, MDM90_RDPRuntime.tar.gz, to a temporary directory on
the server.
2. Extract the archive using the following commands:
gzip -d MDM900_RDPRuntime.tar.gz
tar -xvf MDM900_RDPRuntime.tar
The TAR file is extracted into several directories.
3. Navigate to the directory for the server and database type that matches your
server environment. For example, if your server environment is IBM
WebSphere Application Server with IBM DB2, navigate to:
<RDP_Assets_Home>/MDMRDPRuntime/install/WebSphere/DB2/
4. Edit setVariables.sh to provide the variables with appropriate values for your
environment. These values are parameters that are used in installation
scripts, as shown in Example 4-11.
Example 4-11 Sample values for setVariables.sh
export JAVA_HOME=/usr/IBM/WebSphere/AppServer/java
export NODE_NAME=Node01
export WSADMIN_BIN=/usr/IBM/WebSphere/AppServer/bin
export SERVER_NAME=CAM_MDM900_12032009_1455_DB2_BE10
export APP_NAME=CAM_MDM900_12032009_1455_DB2_BE10
export INSTALL_HOME=/usr/IBM/MDM/CAM_MDM900_12032009_1455_DB2_BE10
export DB_NAME=MDM9QA2
export DB_USER=sarcam10
export DB_PASSWORD=Schema90
export TABLE_SPACE=TABLESPAC
export INDEX_SPACE=INDEXSPAC
export LONG_SPACE=LONGSPACE1
export TRIG=Compound
export DEL_TRIG=TRUE
export ADMIN_USER=cusadmin
export ADMIN_PASSWORD=cusadmin
export IIS_SRV_VERSION=81
MESSAGING_TYPE=WMQ
export ISP_URL='iiop:\/\/IISserver.ibm.com:2809'
Note: For WebSphere Application Server, the ISP_URL supports IIOP
URLs only. For WebLogic Application Server, the ISP_URL only supports
the web services URL, as follows:
export ISP_URL='http:\/\/IISserver.ibm.com:9080'
5. To prevent a file permission error when installing the RDP runtime assets, run
the following command in the /MDMRDPRuntime/install folder:
chmod -R 755 *.sh
6. Run the following script to install the RDP assets:
./install_RDP_Assets.sh
Note: Ignore the following warning, which appears in the console while the
script is running:
WARNING: Duplicate name in Manifest.
This script modifies the existing MDM.ear file and saves a backup copy of the
original EAR file. The backup version can be found in the same folder as the
modified file, and is renamed with the .beforeRDP file extension, for example:
MDM.ear.beforeRDP
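Before running the installation script, it can help to confirm that every
variable in setVariables.sh has a value. The following is a minimal sketch and
is not part of the RDP package: the missing_vars function and the REQUIRED_VARS
list are hypothetical, and the list simply mirrors the names in Example 4-11.

```shell
# Hypothetical pre-installation check (not shipped with the RDP assets).
# REQUIRED_VARS mirrors the variable names shown in Example 4-11.
REQUIRED_VARS="JAVA_HOME NODE_NAME WSADMIN_BIN SERVER_NAME APP_NAME INSTALL_HOME
DB_NAME DB_USER DB_PASSWORD TABLE_SPACE INDEX_SPACE LONG_SPACE TRIG DEL_TRIG
ADMIN_USER ADMIN_PASSWORD IIS_SRV_VERSION MESSAGING_TYPE ISP_URL"

# Print the names of all required variables that are unset or empty,
# separated by spaces. Prints an empty line when nothing is missing.
missing_vars() {
    missing=""
    for name in $REQUIRED_VARS; do
        # Indirect lookup: read the variable whose name is stored in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    # Strip the leading space before printing
    echo "${missing# }"
}
```

One possible use is to source the file with `. ./setVariables.sh`, call
missing_vars, and start install_RDP_Assets.sh only if nothing is printed.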
Redeploying RDP assets
To redeploy MDM RDP Runtime Assets, use the following steps:
1. Restore the original MDM.ear file using the following commands, where
<INSTALL_HOME> is the INSTALL_HOME value set in setVariables.sh:
rm <INSTALL_HOME>/MDM.ear
mv <INSTALL_HOME>/MDM.ear.beforeRDP <INSTALL_HOME>/MDM.ear
2. Run the install_RDP_Assets.sh script.
3. Check the log files for errors.
You can ignore errors in the database logs that are caused by duplicate
INSERT statements.
4.3.8 MDM Matching Critical Data Rules Console user interface
The MDM Matching Critical Data Rules Console user interface (UI) is not part of
MDM RDP Runtime Assets; it is one of the MDM samples that are included
with the purchase of MDM Server. After it is installed on the server, it can be
used to manage party matching critical data fields dynamically, which means
that there is no need to restart MDM Server to refresh the cache after party
matching critical data fields are modified.
MDM matching critical data fields can also be changed by updating the content
of the MDM Server CONFIGELEMENT table directly. In that case, however, you must
restart MDM Server to refresh the server cache.
MDM RDP Runtime Assets provides the following document, which describes
how to use the Matching Critical Data Rules Console to manage critical data:
MatchingCriticalDataRulesConsole.pdf
Installing Matching Critical Data Rules Console UI
Perform the following steps:
1. Locate the MDM901_Samples.tar.gz archive on your InfoSphere MDM Server
distribution media.
2. Open MDM901_Samples.tar.gz and extract the following installable EAR file:
UI/runtime/CustomerMatchingCriticalDataRules.ear
3. Open CustomerMatchingCriticalDataRules.ear and locate
propertiesUI.jar file, which contains three relevant properties files:
– mdmUIConfiguration.properties
– matchingCriticalDataRules.properties
– ClientAuthentication.properties
4. Open mdmUIConfiguration.properties and edit the entries shown in
Example 4-12.
Example 4-12 Edit the entries in mdmUIConfiguration.properties
# The iiop location for the server and port to connect to the MDM Server jndi
#Ex: java.naming.provider.url=corbaloc:iiop:hasufel.torolab.ibm.com:9811
java.naming.provider.url=
# The fully qualified name of the User Group Implementation.
# Currently there are 2 values (user group implementation classes) for this property.
# The 2 possible values can be:
# Ex:
# when deployed on a IBM WebSphere use UserGroupImpl=com.ibm.mdm.ui.registry.WASUserGroupImpl
# when deployed on a BEA WebLogic use UserGroupImpl=com.ibm.mdm.ui.registry.BEAUserGroupImpl
UserGroupImpl=
5. Open matchingCriticalDataRules.properties and edit the entries shown in
Example 4-13.
Example 4-13 Edit the entries in matchingCriticalDataRules.properties
##############################################
# The host where the Config Manager is running
# Ex: cm.host=localhost
##############################################
cm.host=
################################################
# The port where the Config Manager is listening
# Ex: cm.port=9902
################################################
cm.port=
############################
# Inspect the values from MDM Server database:
# [TABLE].[FIELD]
# APPSOFTWARE.NAME
# APPSOFTWARE.VERSION
############################
#Ex:
#mdmServer.appName=InfoSphere Master Data Management
#mdmServer.appVersion=9.0.0
mdmServer.appName=
mdmServer.appVersion=
############################
# Inspect the values from MDM Server database:
# [TABLE].[FIELD]
# APPDEPLOYMENT.NAME
# if there is no value in the database then
# leave the value of this property empty.
# Ex: mdmServer.deployName=
############################
#Ex:
#mdmServer.deployName=CAM_MDM900_12032009_1455_DB2_BE02
mdmServer.deployName=
############################
# [TABLE].[FIELD]
# APPINSTANCE.NAME
# if there is no value in the database then
# leave the value of this property empty.
# Ex: mdmServer.instanceName=
############################
mdmServer.instanceName=
6. Open ClientAuthentication.properties and edit the entries shown in
Example 4-14.
Example 4-14 Edit the entries in ClientAuthentication.properties
# ID and password used for MDM client applications
#Ex:
#client.id=mdmClientUser
#client.password= mdmClientPassword
client.id=
client.password=
# The type of application server used.
# Valid values are: WAS, WL
#Ex:
#applicationServerType=WAS
applicationServerType=
7. Add the modified files back into the CustomerMatchingCriticalDataRules.ear
file.
8. Use the application server’s Administrative Console to install the
CustomerMatchingCriticalDataRules.ear file.
No custom settings are required during the installation process.
9. After the installation is complete, start the Customer Matching Critical Data
Rules user interface.
10. To access the newly deployed user interface, use a web browser to navigate
to a URL with the following structure:
http://<host>:<port>/CustomerMatchingCriticalDataRulesWeb/faces/index.jsp
Replace <host> and <port> in the URL with the appropriate host name and
port number.
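The edits to the three properties files in steps 4 to 6 can also be scripted.
The following is a minimal sketch under stated assumptions: set_prop is a
hypothetical helper, not an MDM Server tool, and it assumes that each property
is written as key=value at the start of a line with no spaces around the equal
sign, as in Examples 4-12 to 4-14. Commented sample lines are left untouched.

```shell
# Hypothetical helper: set key=value in a Java properties file.
# An existing uncommented "key=" line is replaced in place; if the key
# is not present, the entry is appended at the end of the file.
set_prop() {
    file=$1
    key=$2
    value=$3
    tmp="$file.tmp.$$"
    awk -v k="$key" -v v="$value" '
        # Only lines that begin with "key=" are rewritten, so commented
        # examples such as "#Ex: key=..." are preserved.
        index($0, k "=") == 1 { print k "=" v; found = 1; next }
        { print }
        END { if (!found) print k "=" v }
    ' "$file" > "$tmp" && mv "$tmp" "$file"
}
```

For example, `set_prop matchingCriticalDataRules.properties cm.port 9902` uses
the sample port from the comments in Example 4-13. One possible way to get the
files out of and back into the archive is the JDK jar tool (`jar xf` to
extract propertiesUI.jar and `jar uf` to update it), but any archive tool that
handles EAR and JAR files should work.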
4.4 Performance tuning for MDM Delta Load using RDP
The overall performance of loading data into the MDM database using MDM RDP
Runtime Assets and MDM Party Maintenance Services depends on the performance
of many layers. This section describes several performance tuning tips for MDM
BatchProcessor, QualityStage runtime jobs, MDM Server, WebSphere Server,
the database layer, and the InfoSphere Information Server layer.
4.4.1 MDM BatchProcessor configuration
MDM BatchProcessor is used to load data. To configure the batch processor to
use the SIF format input file and the SIF parser, edit the following files:
• The batch_extension.properties file, as shown in Example 4-15.
• The Batch.properties file, as shown in Example 4-16.
Example 4-15 Sample update in batch_extension.properties
ParserAndExecConfiguration.Parser = SIF
Example 4-16 Sample update in Batch.properties
ServerConfiguration.provider_url = corbaloc:iiop:<ServerName:portNumber>
ServerConfiguration.context_factory = <CTX_FACTORY>
#Sample values:
ServerConfiguration.provider_url=corbaloc:iiop:gandalf.torolab.ibm.com:9825
ServerConfiguration.context_factory = com.ibm.websphere.naming.WsnInitialContextFactory
Concurrency level
The batch processor can submit concurrent requests to MDM Server. The level of
concurrency for the batch processor client is controlled by the number of
submitters. If the total number of logical CPUs available to MDM Server
(across all servers in the case of a cluster) is N, and MDM Server is serving
only the requests that come from the batch processor, choose a submitter
number in the range from N to 2N. In internal tests of an MDM Server running
on an IBM pSeries® POWER5™ system with eight physical processors (hence 16
logical processors), 24 submitters were observed to be optimal.
The default number of submitters is 5. To set the number of submitters, edit the
following file:
<MDM_INSTALL_HOME>/BatchProcessor/properties/Batch.properties
As an example, set the number to 24:
Submitter.number = 24
Note: The default number of readers and writers is 1. You do not have to
change these values.
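The N to 2N guideline for the number of submitters can be written as a small
calculation. The following is a sketch only: submitter_range is a hypothetical
function name, and how you obtain the logical CPU count depends on your
operating system.

```shell
# Print the suggested lower and upper bounds (N and 2N) for
# Submitter.number, given the number of logical CPUs available
# to MDM Server.
submitter_range() {
    logical_cpus=$1
    echo "$logical_cpus $((logical_cpus * 2))"
}
```

For the 16 logical processors in the internal test described in this section,
`submitter_range 16` prints the bounds 16 and 32, and the observed optimum of
24 submitters lies within that range.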
Suspend/Resume threshold
These values indicate the thresholds of percentage heap usage at which the
reader suspends or resumes reading. Default values for these thresholds are
20% and 30% respectively. The default values are good if your input requests are
in XML format. However, if you are using SIF input, you may increase these
thresholds to 30% and 40% respectively. These values are specified in the
Batch.properties file.
Suspend duration
When the reader does not see enough free memory, it remains suspended and
does not read more inputs. However, it calls garbage collection (GC) to ensure
that memory is free of garbage. The time that the reader waits before calling
GC is controlled by the suspend duration, which has a default value of 200
milliseconds. To reduce the GC overhead, you may want to set this to a higher
value, such as 2000 milliseconds, in the Batch.properties file.
Handling successful responses
If you do not need successful responses, you can set the following property in
Batch.properties to false:
setMDMSuccessResponseToQueue=false
This setting prevents the responses of successful transactions from being
stored in memory, which reduces memory usage.
JVM heap size
Set the heap size for JVM running the batch processor by editing the following
file:
<MDM_INSTALL_HOME>/BatchProcessor/bin/runbatch.sh
For most cases, 512 MB of heap is sufficient. Ensure that the java command in
this script called to run batchController actually uses this heap setting.
Logging thresholds
To reduce the overhead because of logging in batch processor, set the logging
threshold to Warning or Error level. You can do this by editing the following file:
<MDM_INSTALL_HOME>/BatchProcessor/Log4J.properties
In the file, set the logging threshold to WARN or ERROR, if it is not already
set to one of these values. For example:
log4j.appender.file.Threshold=ERROR
4.4.2 MDM Server configuration
The MDM Server configuration can be modified through properties files,
CONFIGELEMENT table, and Matching Critical Data Rules Console UI.
Logging thresholds
Set the MDM Server logging threshold in Log4J.properties to WARN or ERROR to
reduce logging overhead. If WebSphere is the server platform, Log4J.properties
is included in the following JAR file:
<WebSphere_Home>/profiles/<NodeName>/installedApps/<CellName>/<InstanceName>/properties.jar
An example of the setting is as follows:
log4j.rootLogger=ERROR, file, stdout
Disable performance monitor
The performance monitor is disabled by default. However, if you enabled it
(for example, to debug a performance problem), ensure that you turn it off for
normal operations to avoid the overhead of the performance monitor. To
disable performance monitoring, make the following changes to the
CONFIGELEMENT table VALUE column:
• Set the following item to 0:
/IBM/DWLCommonServices/PerformanceTracking/level
• Set all entries that match the following pattern to false:
/IBM/DWLCommonServices/PerformanceTracking/%enabled
Disable TAIL
The MDM Server Transaction Audit Information Log (TAIL) is turned off by
default. Unless TAIL is required, make sure that it remains off.
To turn TAIL logging off, make the following change to the CONFIGELEMENT table
VALUE column:
Set /IBM/DWLCommonServices/TAIL/enabled to false
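The CONFIGELEMENT changes for the performance monitor and TAIL can be prepared
as SQL statements and reviewed before they are run. The following is a sketch
under stated assumptions: config_update_sql is a hypothetical helper, and the
NAME column is an assumption (the text names only the VALUE column), so check
the table definition in your MDM Server database first.

```shell
# Print an UPDATE statement for one CONFIGELEMENT entry.
# Assumption: the configuration path is stored in a column called NAME.
config_update_sql() {
    name=$1
    value=$2
    case $name in
        *%*) op="LIKE" ;;  # a % in the name is treated as an SQL wildcard
        *)   op="=" ;;
    esac
    printf "UPDATE CONFIGELEMENT SET VALUE = '%s' WHERE NAME %s '%s';\n" \
        "$value" "$op" "$name"
}
```

For example, `config_update_sql /IBM/DWLCommonServices/TAIL/enabled false`
prints a single statement that can be saved to a file and run with your
database client. The same pattern applies to the other CONFIGELEMENT entries
mentioned in this section.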
Data standardization
MDM runtime standardization ensures that names, addresses, and phone
numbers are stored in MDM Server in the same format, which increases data
accuracy.
If the input data is already standardized, turning off runtime data standardization
can avoid performance overhead.
Name Standardization
Name standardization can be switched on or off by setting the following item in
the CONFIGELEMENT table to false or true, respectively:
/IBM/Party/ExcludePartyNameStandardization/enabled
Address and PhoneNumber Standardization
Address Standardization can be turned on or off by setting the following indicator
to N or Y respectively, in the transaction input requests:
StandardFormatingIndicator
Also set the following items to true in the CONFIGELEMENT table to avoid
performing data standardization multiple times in MDM Server and to improve
performance:
• /IBM/ThirdPartyAdapters/IIS/StandardizeAddress/StandardFormattingIndicator/enabled
• /IBM/ThirdPartyAdapters/IIS/StandardizePhoneNumber/StandardFormattingIndicator/enabled
Suspect Duplicate Processing (SDP)
If the data contains no duplicates, you can switch off SDP to avoid performance
overhead, because searching for suspects in clean data is time consuming even
though no suspects are found. To switch off SDP, set the value to false for
the following entries in the CONFIGELEMENT table:
• /IBM/Party/SuspectProcessing/enabled
• /IBM/Party/SuspectProcessing/AddParty/returnSuspect
History triggers
If history triggers are enabled, the I/O requirement on DB server is almost
doubled. However, if enough I/O bandwidth is provided, the overhead of using
history triggers is less than 5%.
4.4.3 WebSphere Application Server configuration
This section provides tips for WebSphere Application Server configuration.
Size of ORB thread pool
Ensure that the ORB thread pool is large enough for the maximum expected
concurrency (the maximum number of concurrent RMI calls to MDM Server
transactions). You can set it to a value larger than the number of concurrent
users (or submitters in the batch processor). The default maximum size for the
ORB.thread.pool is 50, which is sufficiently large for most cases. However, if
you need to change it, you can do so using the WebSphere Administration
Console. Go to Application servers → MDMServerName → Thread pools →
ORB.thread.pool.
EJB cache size
This value should be set to the maximum number of active enterprise bean
instances expected during a typical workload. For MDM Server, set the
Enterprise JavaBeans (EJB) cache size to 4000. Do this by using the
WebSphere Administration Console. Go to Servers → Application servers →
[MDMServerName] → EJB Container Settings → EJB cache settings.
JDBC connection pool size
Ensure JDBC connection pool size is large enough to support the concurrency.
The default value of Maximum Connections is 20. You may leave it at the default
and use the Tivoli® Performance Viewer to determine whether the pool size must
be increased. If the number of concurrent waiters is greater than 0 (zero) and the
CPU usage is not close to 100%, you can increase the connection pool size.
To change this setting using the WebSphere Administration Console, go to
Resources → JDBC → Data sources → DWLCustomer → Connection pool
properties.
Prepared statement cache size
This value specifies the number of statements that can be cached per
connection. The WebSphere Application Server data source optimizes the
processing of prepared statements and callable statements by caching those
statements that are not being used in an active connection. For InfoSphere MDM
Server, set it to 300 and monitor it using Tivoli Performance Viewer to see
whether it needs to be increased.
To change this setting using the WebSphere Administration Console, go to
Resources → JDBC → Data sources → DWLCustomer → WebSphere Application Server
data source properties.
JVM settings
Change the JVM heap size and GC policy as follows:
1. From the WebSphere Administration Console, go to Servers → Application
servers → [MDMServerName] → Java and Process Management →
Process Definition → Java Virtual Machine.
2. Set the initial heap size to 512 MB and the maximum heap size to 1024 MB.
3. Specify -Xgcpolicy:gencon under Generic JVM arguments to use the
generational concurrent (gencon) GC policy.
4. On the same page of the WebSphere Administration Console, use the check box
to enable verbose GC logs. These logs help you understand heap memory usage
and garbage collection. The overhead caused by these logs is small.
4.4.4 Database tuning
MDM Server performance is strongly dependent on the performance of the
underlying database layer. The following sections describe basic, but important,
areas on which to focus.
Disk configuration
To avoid I/O bottlenecks, make a large number of physical disks or
Storage Area Network (SAN) disks (typically configured as a RAID system)
available to the DB server machine.
Use a set of dedicated disks for transaction logs and another set of dedicated
disks for table spaces. If possible, use different disk controllers for these two sets
of disks, because this gives the flexibility to configure the disk controllers
independently to favor various I/O patterns seen on these sets of disks. Ensure
read and write cache is enabled on the storage system.
Table spaces
Plan the table spaces to ensure that the I/O is balanced across all available
disks. If I/O is not balanced, the busiest disk becomes the bottleneck and part
of the overall bandwidth remains unused.
DB2 statistics
While loading data into an empty MDM database, avoid running concurrent users
(multithreaded) in the beginning. Load a few records (say 10000) into the
database, using a single thread, and execute DB2 statistics (runstats) on your
critical data tables before running multiple concurrent users. Execute runstats
periodically after a large volume of data is loaded into the database since the last
execution of runstats.
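The runstats step can be scripted once you know which tables are critical in
your environment. The following is a sketch only: runstats_commands is a
hypothetical helper, the schema and table names are placeholders that you
supply, and the statistics options shown are one common choice rather than a
recommendation from this document.

```shell
# Print one DB2 RUNSTATS command per table for the given schema.
# Usage: runstats_commands <schema> <table> [<table> ...]
runstats_commands() {
    schema=$1
    shift
    for table in "$@"; do
        echo "RUNSTATS ON TABLE $schema.$table WITH DISTRIBUTION AND DETAILED INDEXES ALL;"
    done
}
```

For example, `runstats_commands MYSCHEMA CONTACT PERSON` prints two commands
(MYSCHEMA and the table names are illustrative). The output can be saved to a
file and run through the DB2 command line processor after the initial
single-threaded load.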
SQL access plan
Because MDM supports extensions and customizations, you must ensure that
the database has the correct indexes in place for all queries, including the
customized ones. You can identify the most time-consuming SQL statements from
the database snapshot and ensure that their access plans are efficient. For
any customizations, use parameterized SQL statements to take advantage of
prepared statement caching and to reduce the number of SQL statement
compilations.
Buffer pools
You must monitor the database performance using DB2 snapshots or other tools
to measure buffer pool usage, SQL response times, and general I/O rates. The
buffer pool hit ratio is an indicator of how often the physical disks are accessed to
get the data. Try to use large buffer pools such that buffer pool hit ratio is more
than 80% for data, and more than 90% for indexes.
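The hit ratio can be computed from the logical and physical read counters that
a database snapshot reports for a buffer pool. The following is a minimal
sketch: hit_ratio is a hypothetical function, and it assumes the usual
definition of the ratio as (1 - physical reads / logical reads) x 100, rounded
down to a whole percent.

```shell
# Print the buffer pool hit ratio as a whole percentage.
# Usage: hit_ratio <logical_reads> <physical_reads>
hit_ratio() {
    logical=$1
    physical=$2
    # Avoid division by zero when the pool has not been read yet
    if [ "$logical" -eq 0 ]; then
        echo 0
        return
    fi
    echo $(( (logical - physical) * 100 / logical ))
}
```

Run the calculation separately for the data counters and the index counters,
and compare the results with the 80% and 90% targets given above.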
4.4.5 Information Services Director job configuration
IBM InfoSphere Information Services Director (ISD) allows QualityStage jobs to
be deployed as web services or EJBs. You can deploy and configure these jobs
using the IBM Information Server console.
Log on to the IBM Information Server console and expand ELMDMQSService →
Operations. See Figure 4-6 for details.
Figure 4-6 Configure QS runtime jobs
Select each operation and configure as described in the following sections.
Default Settings tab
Select the Default Settings tab and configure the following items:
1. Load balancer
When there are multiple Application Service Backbone (ASB) Agents, the
load balancer determines the ASB Agent to which a request is routed. You can
choose between round-robin and average response time. For ELMDMQS jobs,
response time is selected by default, and you can keep this setting.
2. Max Queue wait
This item defines the maximum time that a request can wait in the queue. The
default is 1000 milliseconds. Increase it to a higher value, such as
5000 milliseconds. See the highlighted item 1a in Figure 4-6 on page 122.
You can monitor the SystemErr.log file of MDM Server. If you notice a message
such as the following one, increase this wait time:
Queue Wait of 1000 Exceeded
3. Max Queue requests
If a request cannot be served immediately, it waits in an operation queue.
This number is the size of that queue. See highlighted item 1b in Figure 4-6
on page 122. By default, the queue size is set to 3 in ELMDMQS jobs. Change
this value according to the following formula:
Queue Size >= Maximum level of concurrency / Minimum number of job instances
As an example, if the number of submitters is 30 and the minimum number of job
instances is set to 2, set the queue size to at least 15. If you notice errors
such as the following one in the SystemErr.log file of MDM Server, increase the
queue size:
Queue Limit of 3 Exceeded
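The queue size formula amounts to a division that is rounded up to the next
whole number. The following sketch shows the calculation; min_queue_size is a
hypothetical function name.

```shell
# Print the smallest queue size that satisfies:
#   queue size >= maximum concurrency / minimum number of job instances
# Usage: min_queue_size <max_concurrency> <min_job_instances>
min_queue_size() {
    concurrency=$1
    min_instances=$2
    # Integer ceiling division
    echo $(( (concurrency + min_instances - 1) / min_instances ))
}
```

With 30 submitters and a minimum of 2 job instances, the function prints 15,
which matches the example above. When the division is not exact, the result is
rounded up so that the inequality still holds.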
Information provider tab
Select the Information Provider tab, select the Provider Properties tab, and
then review or change the following information:
• Active job instances or JDBC connections
The active job instances or connectors are the minimum and maximum number of
jobs that are active at a time. The ASB Agent attempts to always keep the
number of active instances between these limits. By default, these limits are
set to 1 and 5 respectively for all ELMDMQS jobs. See highlighted item 2a in
Figure 4-7 on page 124 as an example.
You can check the number of active job instances using the IBM Information
Server Director Client. You can increase the maximum limit if you see that the
number of running instances is at the maximum limit and CPU usage on the
IBM Information Server is not near 100%.
• Activation threshold
The activation threshold is used to determine whether a new job instance
needs to be activated. This decision is based on two parameters: Service
Requests and Delay. The default values of these parameters for ELMDMQS
jobs are 3 and 1000 respectively, as depicted in Figure 4-7.
When the number of service requests in the operation queue exceeds
“Service Requests” and remains above it for at least the duration
indicated by “Delay,” a new job instance is created. The default values should
suffice for most cases.
Figure 4-7 Configure QS runtime jobs
4.5 Run Delta Load for MDM using RDP
After the MDM Server application, MDM Party Maintenance Services, and MDM
RDP Runtime Assets are installed, deployed, and configured, you are ready to
run Delta Load for MDM. IBM Information Server and the DataStage and
QualityStage RDP project should already be up and running, because they were
set up for RDP for MDM Direct Load.
To run Delta Load, perform the following tasks:
1. Create source SIF files.
2. Run SIF Sequencer to generate sequenced SIF files.
3. Run MDM BatchProcessor to load data to MDM.
4.5.1 Create source SIF files
The RDP for MDM Delta Load solution takes SIF files as input data files. IBM
Information Server is used to extract and transform client source data to create
the source SIF file.
See 2.3, “Standard Interface File (SIF)” on page 16 for details about creating the
SIF file.
4.5.2 Run SIF Sequencer Job
The SIF Sequencer Job is included in the DataStage RDP Package and is
deployed with the RDP project in the direct load phase. The SIF Sequencer Job
is invoked in the same way as any direct load job, by running only the
IL_000_AutoStart_EX job.
Figure 4-8 shows the SIF Sequencer Job in DataStage MDMIS R5.3 project.
Figure 4-8 SIF Sequencer in an MDMIS R5.3 project
SIF Sequencer can be run using either the DataStage Director Client or the
dsjob command.
The dsjob command
To use the dsjob command, the user environment must be correctly set up so
that the required libraries and executables are in their proper paths. The
InfoSphere Information Server environment should already have been set up
during the RDP for MDM Direct Load phase.
The following steps describe how to construct the dsjob command that runs the
IL_000_AutoStart_EX SIF Sequencer job on InfoSphere Information Server:
1. Go to the <IIS_Install_Home>/DSEngine/bin directory.
2. Log in as the dsadm user.
3. Put the sequenced SIF input file (or files) on the server at the location
defined in the MDM Server CONFIGELEMENT table entry with the following name:
name=/IBM/ELMDM/IIS/Install/SIFInputFiles/path
4. Run the following command:
dsjob -run -mode NORMAL -wait -warn 0 -param MDM_CONNECTIONS=Default RDP IL_000_AutoStart_EX
In the command, Default is the parameter value set name, RDP is the
DataStage project name, and IL_000_AutoStart_EX is the SIF Sequencer job
name.
5. Check the output file on the server at the location defined in the MDM
Server CONFIGELEMENT table entry with the following name:
name=/IBM/ELMDM/IIS/Install/ISDataSetHeaders/path
6. Check the error log file on the server at the location defined in the MDM
Server CONFIGELEMENT table entry with the following name:
name=/IBM/ELMDM/IIS/Install/ErrorFiles/path=/home/dsadm/Project/RDP/ERROR
4.5.3 Run MDM BatchProcessor
MDM BatchProcessor or other batch framework can be used to read sequenced
SIF files and feed the SIF records into MDM Server. To use the MDM
BatchProcessor framework to load data, see 4.4.1, “MDM BatchProcessor
configuration” on page 115 for details.
To load a single SIF input data file
In the <MDM_INSTALL_HOME>/BatchProcessor/bin/ directory, execute the following
command:
runbatch.sh <SIF_FILE_FULLNAME> <LOG_FILE_PATH> <BATCH_EXTENSION_PROPERTYFILE_NAME>
An example is as follows:
runbatch.sh /opt/IBM/MDM/BatchProcessor/seed/person.sif /opt/IBM/MDM/BatchProcessor/seed/logs batch_extension
To load multiple SIF input data files
In the <MDM_INSTALL_HOME>/BatchProcessor/properties/ directory, edit the
Batch.properties file by defining the SIF input data location, SIF input data file
names, and log file location. Next, execute runbatch.sh without any argument.
See Example 4-17 on page 128.
Example 4-17 Batch.properties sample value for loading multiple SIF input files
SIF_INPUT_PATH=/usr/IBM/MDM/BAR_MDM850_12032008_1210_DB2_BE01/BatchProcessor/
SIF_INPUT_FILE_NAMES=Party.sif,Contract.sif,PartyRelationship.sif
SIF_OUTPUT_PATH=/usr/IBM/MDM/BAR_MDM850_12032008_1210_DB2_BE01/BatchProcessor/logs
An example of the runbatch command is as follows:
runbatch.sh
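Before running runbatch.sh without arguments, it can be useful to confirm which files the Batch.properties entries point to. The following Python sketch is a convenience under stated assumptions, not part of the BatchProcessor: the helper names read_properties and sif_input_files are hypothetical, and the sketch assumes that each name in SIF_INPUT_FILE_NAMES is resolved relative to SIF_INPUT_PATH, as the layout of Example 4-17 suggests.

```python
import posixpath


def read_properties(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator:
            props[key.strip()] = value.strip()
    return props


def sif_input_files(props):
    """Return the full path of every SIF file named in the properties.

    Assumption: file names are comma-separated and relative to SIF_INPUT_PATH.
    """
    base = props["SIF_INPUT_PATH"]
    names = [n.strip() for n in props["SIF_INPUT_FILE_NAMES"].split(",")]
    return [posixpath.join(base, name) for name in names if name]
```

The returned paths can then be checked for existence before the batch run is started.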
4.5.4 Check Delta Load result and error messages
The <LOG_FILE_PATH> directory contains three log files:
• batchLoadSuccess.out
• batchLoadFail.out
• batchLoadSuspect.out
Failed records in Delta Load can be found in the batchLoadFail.out file. This
log file records three types of information:
• Failed record index number in the original SIF file
• MDM error message
• Original SIF record
Example 4-18 shows a sample batchLoadFail.out file. It reports that the first SIF
record (index 0) failed, followed by the MDM response XML and, at the end, the
original SIF input record.
Example 4-18 The batchLoadFail.out file
0,<?xml version="1.0" encoding="UTF-8"?>
<TCRMService xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="tCRMResponse.xsd">
<ResponseControl>
<ResultCode>FATAL</ResultCode>
<ServiceTime>35344</ServiceTime>
<DWLControl>
<requesterLanguage>100</requesterLanguage>
<requesterLocale>en</requesterLocale>
<requestID>559123429080642132</requestID>
</DWLControl>
</ResponseControl>
<TxResponse>
<RequestType>processTx</RequestType>
<TxResult>
<ResultCode>FATAL</ResultCode>
<DWLError>
<ComponentType>106</ComponentType>
<ErrorMessage>Parser DWLTransaction failed. The format of the
message is not correct or an application error occurred.</ErrorMessage>
<ErrorType>READERR</ErrorType>
<LanguageCode>100</LanguageCode>
<ReasonCode>4928</ReasonCode>
<Severity>0</Severity>
<Throwable>com.dwl.base.requestHandler.exception.RequestParserException: The
following is not correct: Type, SubType or their combination</Throwable>
</DWLError>
</TxResult>
</TxResponse>
</TCRMService>,P|P|100001|E243883150001||N||||||||100000||||||||100001|||1986-0
3-05 00:00:00||||||||||||||1969-05-30
00:00:00||||||100|100000|108|||||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|P|A|100001|E243883150001|||
|||||||2008-12-23 16:46:50||||100001||100002|100000||108|702? TEST
COURT|||THORNHILL|L4J9K1||||||||||||||||||||||||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0
|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|P|C|100001|E243883
150001|0|||||||||2008-12-23
16:46:53||||100001|100000||||||08152258043||||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0
|0|0|0|0|0|0|0|0|0|P|H|100001|E243883150001||||1||TestFN||||LastNm||
|2008-12-23 16:46:51|||||100001|||||||||||0|0|0|0|0|0|0|0|0|0|0|0|0|
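When many records fail, the error details can be extracted with a small script instead of being read by eye. The following Python sketch assumes that an entry has the layout seen in Example 4-18, that is, the record index, a comma, the MDM response XML ending with </TCRMService>, a comma, and the original SIF record. The function name parse_fail_entry and the keys of the returned dictionary are hypothetical, and how multiple entries are delimited in a real file is not covered here.

```python
import xml.etree.ElementTree as ET

CLOSING_TAG = "</TCRMService>"


def _clean_text(parent, tag):
    """Return the text of a child element with whitespace runs collapsed."""
    node = parent.find(tag) if parent is not None else None
    if node is None or not node.text:
        return None
    return " ".join(node.text.split())


def parse_fail_entry(entry):
    """Split one '<index>,<response XML>,<SIF record>' entry into its parts."""
    index_text, _, rest = entry.partition(",")
    xml_end = rest.index(CLOSING_TAG) + len(CLOSING_TAG)
    response_xml = rest[:xml_end]
    remainder = rest[xml_end:]
    if remainder.startswith(","):
        remainder = remainder[1:]

    # Parse as bytes so that the UTF-8 encoding declaration is honored.
    root = ET.fromstring(response_xml.encode("utf-8"))
    error = root.find(".//DWLError")
    return {
        "index": int(index_text),
        "result_code": root.findtext("ResponseControl/ResultCode"),
        "reason_code": _clean_text(error, "ReasonCode"),
        "error_message": _clean_text(error, "ErrorMessage"),
        "sif_record": remainder.strip(),
    }
```

For the entry in Example 4-18, such a parser would be expected to report index 0, result code FATAL, and reason code 4928, provided that the SIF record is not wrapped across lines in the actual file as it is in the printed example.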
The successfully loaded SIF records can be found in the batchLoadSuccess.out
file. By default, it records only the message ID (the index of the record in the
SIF input file). If you want to see the MDM response for each SIF input request,
set the following entry in the Batch.properties file:
setMDMSuccessResponseToQueue=true
The batchLoadSuspect.out file is empty for MDM Server version 9.0.1.
RDP for MDM Delta Load is the solution for moderate-volume data loads.
It can be used for an initial high-volume data load. Delta Load eliminates the need to
write custom composite transactions for every client implementation, thereby
reducing the time and cost of implementation. Delta Load makes it possible to
change the Suspect Data Processing rule for party matching dynamically,
without restarting MDM Server.
Chapter 5. Financial services business scenario
This chapter describes an approach to implement IBM InfoSphere MDM Server
using InfoSphere Rapid Deployment Package (RDP) on a Linux platform. The
scenario takes a financial services business as an example to explain the
approach.
In this example, the initial load of the IBM InfoSphere MDM Server is performed
with RDP for MDM DataStage and QualityStage jobs and subsequent
operational loads are performed by using MDM Server RDP runtime assets.
This chapter includes the following topics:
• Introduction
• Business requirement
• Environment configuration
• An approach to implementation
• Initial load
• Suspect resolution
• Hierarchies
• MDM consumption application
• Operational processing
5.1 Introduction
Fictional Bank Company T, hereafter referred to as FBankCoT, is a fictitious bank
that provides services such as savings, checking, and loans in the North
American continent. These services were either developed independently or
obtained through acquisitions. This resulted in the same customer information
potentially being represented inconsistently in each system, thereby leading to
increased costs (such as mailing) and poor customer service.
To overcome these problems, FBankCoT decided to implement a coexistence¹
model of an MDM solution.
Some overlap of customer information is expected between the checking,
savings, and loans systems. However, a more likely situation is that a single
customer has an account in only one or two of the systems.
5.2 Business requirement
The objective is to consolidate master data of a customer in an MDM repository
to deliver improved customer service and reduce operational costs. The MDM
repository was also required to have defined regional hierarchies, which can
establish the association of customers to marketing organizations of the bank.
End-of-day data latency was considered acceptable, given the general
infrequency of changes to master data in the customer environment, which
translates to changes to master data in the operational systems being processed
at the end of every business day.
Note: Because the objective of this book is to focus on using the RDP for
MDM solution for building the MDM repository, we assume a simple data
model (single table) for each of the three (savings, checking, and loans)
systems.
¹ With a coexistence model, key master data from one or more data sources is consolidated in the
MDM repository. Changes occurring in the data sources are applied to the MDM repository.
Synchronization is bidirectional: Existing systems provide new or updated data into MDM Server
through delta load and MDM Server feeds accurate master data back into existing systems. The
latency of the master data in the MDM repository varies by organization and frequency of
delta/operational load. Typically, some new applications obtain master data from the MDM
repository; legacy applications continue to access the master data from the existing data sources.
5.3 Environment configuration
The configuration of the FBankCoT environment with the MDM repository is
shown in Figure 5-1.
[Figure 5-1: Users and administrators connect to four Linux servers:
virgo.itsosj.sanjose.ibm.com (WebSphere Application Server (Domain), XMETA, and IADB
of IBM InfoSphere Information Server), phoenix.itsosj.sanjose.ibm.com (DataStage Engine
of IBM InfoSphere Information Server), tarus.itsosj.sanjose.ibm.com (WebSphere
Application Server of IBM InfoSphere MDM Server), and orion.itsosj.sanjose.ibm.com
(existing Checking, Savings, and Loan systems and the MDM repository).]
Figure 5-1 FBankCoT environment configuration
Figure 5-1 shows the following information:
• A Linux server, orion.itsosj.sanjose.ibm.com, has the following systems:
– FBankCoT core services of checking, savings, and loans on a DB2 for
LUW V9 database (FBANKCOT) that has one table (CHECKING,
SAVINGS, and LOAN) for each of the three systems
– MDM repository database (FBANKCOT) containing the MDM data
• Two Linux servers (phoenix.itsosj.sanjose.ibm.com and
virgo.itsosj.sanjose.ibm.com) have InfoSphere Information Server 8.0.1
split as follows:
– WebSphere Application Server (Domain), XMETA, and Information
Analyzer IADB on virgo.itsosj.sanjose.ibm.com
– DataStage Engine on phoenix.itsosj.sanjose.ibm.com
• A Linux server, tarus.itsosj.sanjose.ibm.com, has the WebSphere
Application Server of the IBM MDM Server.
• Information Server version 8.0.1 is used in this environment.
Important: Our objective is to showcase the RDP for MDM implementation on
a Linux platform. For convenience, we chose our data sources and target
MDM repository to be hosted on a single Linux platform, even though we
recognize that in a real production environment these systems are likely to be
hosted on an eclectic mix of operating systems, servers, and database
management systems. The configuration we use is meant only to showcase
the functionality of the RDP for MDM solution, and should in no way be seen
as delivering the scalability and performance requirements of your business
solution.
5.4 An approach to implementation
In a real production environment, you are likely to perform the following tasks when
implementing a coexistence MDM solution using RDP for MDM:
1. Perform a Data Quality Assessment (DQA) of the data sources in your
environment to identify Master Data, assess data quality and determine the
system of record (SOR) for your Master Data. Information Analyzer,
InfoSphere Discovery and QualityStage would figure prominently in such an
effort.
Note: DQA is assumed to have occurred and only the results of this task are
presented here.
2. Review the MDM data model and potentially customize it to the specific
requirements of your organization.
In case of customization, RDP for MDM jobs, MDM RDP runtime assets and
MDM party maintenance services will need to be modified also.
Note: Customization of the MDM data model is not covered in this book.
3. Create the code mapping tables from source to SIF and update the MDM
code tables with domain values if appropriate.
4. Create a canonical form from the three data sources in our scenario.
Note: The canonical form is a concept we invented for this scenario and is
not defined in the RDP for MDM solution.
5. Validate the RDP for MDM rule sets with the canonical form created in the
previous step and modify as needed. If the rule sets are modified, incorporate
them into the RDP for MDM jobs.
6. Create the mapping templates (from canonical form columns to the SIF
RT/STs).
7. Create the SIF using the mapping templates and the code mapping tables
created earlier.
8. Execute the RDP for MDM jobs with Standardization and Matching enabled in
the configuration parameter file. If errors occur, correct² them and reprocess.
Note: In our case, we thoroughly cleaned the data prior to creating the SIF
so that no errors occurred.
However, to show the error messages generated by the RDP for MDM
jobs, we created other SIFs containing the most frequently occurring errors
and ran them through the RDP for MDM jobs. The purpose was to show the
correspondence between a particular error and the error messages
generated for it by the RDP for MDM jobs. This information is described in
Appendix D, “Error processing” on page 317.
9. Verify the successful loading of the MDM repository, using the MDM Server
Reporting facility.
10. Resolve any suspect parties that were not automatically collapsed in the load
jobs, but are suspected to be duplicates, using the MDM Server Data
Stewardship UI.
11. Establish hierarchies that associate customers to marketing organizations
within the bank.
12. Integrate the real-time services of MDM Server in your master
data-consuming applications. These are typically applications that you already
have in your environment such as sales and marketing, CRM, and operational
systems such as savings, checking, and loans.
We wrote a sample application that provides a 360-degree view of a person
by fetching Master Data (such as address) from the MDM Server through
a call to MDM web services, and non-Master Data (such as balances) from the
corresponding source systems.
13. Perform operational processing with updates occurring in the source systems
after the initial loading.
² You can either correct the errors in the SIF and reprocess the entire SIF, or correct the data
in the data sources and re-create the SIF for processing by the RDP for MDM jobs. Re-creating the
SIF can be time-consuming, but it is the best approach when the number of errors is high, or when the
errors represent problems with creating the SIF from the source. It is also desirable
when the total data volume is relatively low, say less than 10 million rows. Correcting the data
directly in the SIF can be tedious and error-prone, and should be used only where the number of
errors is small. In general, a better way is to fix the data in the source system, or when creating the
SIF from the source. Correcting just the records in error and reprocessing them later in delta mode
is another option, depending on the number and nature of the errors reported; a detailed
discussion of the considerations involved is beyond the scope of this book. However, this approach
is best if the data volume is high and the number of error records is low.
In the next sections, we describe the approach we used to perform the following
tasks:
• Initial loading
• Suspect resolution
• Hierarchies
• MDM consumption application
• Operational processing
We also ran the RDP for MDM jobs with SIFs containing commonly encountered
problems to show the correspondence between a specific error condition in the
SIF and the error messages generated for it by the RDP for MDM jobs. This
information is described in Appendix D, “Error processing” on page 317.
5.5 Initial load
Figure 5-2 shows a high-level overview of the processing flow of the initial load of
the MDM repository.
[Figure 5-2: Flowchart of the initial load. A Data Quality Assessment (DQA) of the data
sources identifies the key data columns and the domain values in those columns, which are
compared with the key data and domain values of MDM Server. If the MDM data model must be
customized, that path is not covered in this IBM Redbook. Otherwise, data from all the
sources is merged into a canonical form, and representative sample data is created for
standardization ruleset validation and potential modification. The adequacy of the RDP
rulesets is verified; if they are not adequate, the rulesets are modified and the modified
rulesets are used in the RDP jobs. Mapping tables are then created for each domain value,
key columns are mapped to SIF records, the SIF file is created, and the MDM repository is
populated using the RDP jobs.]
Figure 5-2 Rapid MDM approach used in the scenario for the initial load
Briefly, a DQA is performed on the data sources to identify the Master Data
columns and the domain values in these Master Data columns for inclusion in the
MDM repository. The MDM data model’s Master Data columns and
corresponding domain values are reviewed against those of the three data
sources. Based on this review, the MDM Code Reference tables may need to be
updated with additional values, and source-to-SIF code mapping tables are
generated between the source Master Data columns and the corresponding MDM
Master Data columns.
Customization: If the MDM data model needs to be extended to support your
organization’s Master Data, then the MDM data model and behavior, and the
RDP for MDM jobs must be customized to address your Master Data
requirements. In our scenario, we did not include customization in the scope;
rather, we only discuss considerations that are involved when customizing the
MDM data model in Appendix C, “MDM customization considerations” on
page 309.
The Master Data from the data sources is loaded into a canonical form that
closely mirrors the format of the SIF records consumed by the RDP for MDM
jobs. During this process, you need to ensure that all MDM required columns (as
described in Appendix B, “Standard Interface File details” on page 295) have
valid data in them to avoid rejection by the RDP for MDM jobs. It is more efficient
to detect and fix these errors early in the cycle (potentially in the source system
itself) than much later, after the RDP for MDM jobs have flagged them.
The purpose of creating a canonical form is to have a single format for validating
the efficacy of the RDP for MDM rule sets, and for simplifying the DataStage jobs
for creating the SIF, regardless of the number of data sources involved. Typically,
the data used for validating the efficacy of the RDP for MDM rule sets would be a
representative sample of all the data. If the RDP for MDM rule sets are modified
to address your organization’s data, then these modified rule sets must replace
the corresponding default ones in the RDP for MDM jobs.
Ambiguities: In creating this canonical form, an important step is to resolve
potential domain value semantic inconsistencies for a given column between
multiple data sources. For example, a Gender value of ‘0’ might mean female in
one source system, while a value of ‘0’ in the corresponding Sex column means
male in another source system. After resolving such ambiguities in the canonical
form, you should ensure that any user querying the canonical form data is
aware of the revised semantics so as not to misinterpret the information
retrieved.
The data in the canonical form is then loaded into the SIF using the
source-to-SIF column mapping templates you will have created, and the
source-to-SIF code mapping tables generated earlier.
Important: Before the RDP for MDM jobs can be run, you must drop all
referential integrity (RI) constraints and triggers defined in the MDM repository.
The script for dropping and creating triggers and constraints can be created by
querying metadata in the MDM database catalog tables.
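As a sketch of how such a script could be generated, the following Python fragment turns rows read from the database catalog into DROP statements. It assumes that the MDM repository is a DB2 for LUW database, so that foreign keys are listed in SYSCAT.REFERENCES and triggers in SYSCAT.TRIGGERS. The helper names are hypothetical, the rows are expected to come from whatever database client you use, and the existing definitions must be captured before anything is dropped so that they can be re-created after the load.

```python
# Catalog queries (assumption: DB2 for LUW catalog views); the '?' is the
# parameter marker for the MDM schema name.
FOREIGN_KEY_QUERY = (
    "SELECT TABSCHEMA, TABNAME, CONSTNAME FROM SYSCAT.REFERENCES "
    "WHERE TABSCHEMA = ?"
)
TRIGGER_QUERY = (
    "SELECT TRIGSCHEMA, TRIGNAME FROM SYSCAT.TRIGGERS WHERE TABSCHEMA = ?"
)


def quote(identifier):
    """Return a delimited SQL identifier, trimming catalog padding."""
    return '"' + identifier.strip().replace('"', '""') + '"'


def drop_foreign_key_statements(rows):
    """Build one ALTER TABLE ... DROP FOREIGN KEY per (schema, table, constraint)."""
    return [
        "ALTER TABLE {}.{} DROP FOREIGN KEY {};".format(
            quote(schema), quote(table), quote(constraint))
        for schema, table, constraint in rows
    ]


def drop_trigger_statements(rows):
    """Build one DROP TRIGGER per (schema, trigger) row."""
    return [
        "DROP TRIGGER {}.{};".format(quote(schema), quote(trigger))
        for schema, trigger in rows
    ]
```

Generating the statements as text, rather than executing them directly, lets a database administrator review the script before it is run against the MDM repository.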
The RDP for MDM configuration parameters are set to perform standardization
and matching in the RDP for MDM jobs. The created SIF is processed by the
RDP for MDM jobs. After all errors have been resolved and the jobs have
completed, the MDM repository is loaded. You should verify the load by searching
for known records in the MDM repository using the MDM Server UI.
The referential constraints and triggers must be re-created before the MDM
repository can be considered operational and consumable by business
applications.
Important: The best approach is to use the Standardization and Matching
functionality of RDP for MDM jobs as much as possible, so as to rapidly deploy
your MDM implementation. However, generalized Standardization and
Matching functionality might not suit the particular requirements of your
organization’s data and could therefore require modification. For maximum
efficiency, the recommended approach is to validate the efficacy of
the RDP for MDM rule sets against your data and customize them if required.
The overall flow is covered in more detail for our particular scenario:
• FBankCoT checking, savings, and loans systems
• Data Quality Assessment (DQA)
• Create canonical form from the data sources
• Validate efficacy of the RDP for MDM rule sets and modify to suit
• Create SIF
• Execute RDP for MDM jobs
• Verify successful load
5.5.1 FBankCoT checking, savings, and loans systems
The FBankCoT checking, savings, and loans systems are hosted on a DB2 for
Linux, UNIX, and Windows® (LUW) V9 database.
As mentioned earlier, because we were only interested in Master Data that
needed to be included in the MDM Server solution, we created one table for each
system containing all the required Master Data.
The DDL of the three tables is shown in Example 5-1 on page 140; the data
content in each of these tables is available in Appendix A, “Configuration
parameter file” on page 275. The Master Data columns in each table are
highlighted in bold in Example 5-1 on page 140. However, note that all the
columns are defined as being nullable with no Primary Key defined. In a real
production environment, you most likely would have a Primary Key defined for
each table.
Note: Some overlap of customers and Master Data columns exists among the
data in the three systems. In a few cases, such as address, the data
is held in a single column in one system but is split over multiple
columns in another system.
Example 5-1 DDL of the Checking, Savings, and Loan tables
CREATE TABLE DB2INST1.CHECKING (
BALANCE DECIMAL(10 , 0),
RATE DECIMAL(10 , 0),
OVERDRAF_RATE DECIMAL(10 , 0),
OVERDRAF_FEE INTEGER,
CHECKINGID INTEGER,
CUSTOMERID INTEGER,
NAME VARCHAR(255),
ADDRESS VARCHAR(255),
COUNTRY VARCHAR(255),
PHONE VARCHAR(255),
SSN VARCHAR(255),
DOB VARCHAR(10),
DOD VARCHAR(10),
GENDER VARCHAR(255),
WORK_STATUS VARCHAR(255),
PREF_LANGUAGE VARCHAR(255),
AGEVERIFICATIONDOCUMENT VARCHAR(255),
AGEVERIFICATIONNB VARCHAR(255),
NATIONALITY VARCHAR(255),
CUSTOMER_STATUS VARCHAR(255)
)
DATA CAPTURE NONE
IN USERSPACE1;
CREATE TABLE DB2INST1.SAVINGS (
SAVINGSID INTEGER,
SALUTATION VARCHAR(255),
NAME VARCHAR(255),
STREET VARCHAR(255),
CITY VARCHAR(255),
COUNTRY VARCHAR(255),
SSN VARCHAR(255),
DOB DATE,
PHONE VARCHAR(255),
CELLPHONE VARCHAR(255),
GENDER INTEGER,
BALANCE DOUBLE,
RATE DOUBLE,
OVERDRAF_RATE DOUBLE,
CO_OWNER VARCHAR(10),
EFFECTIVE_CUSTOMERDATE DATE,
SOLICITATIONALLOW VARCHAR(255),
DRIVERLICENSEID VARCHAR(255),
CUSTOMER_PERFORMANCE VARCHAR(255)
)
DATA CAPTURE NONE
IN USERSPACE1;
CREATE TABLE DB2INST1.LOAN (
LOANID INTEGER,
CUSTOMERID INTEGER,
PASSPORTNB INTEGER,
TITLE VARCHAR(10),
FIRSTNAME VARCHAR(255),
LASTNAME VARCHAR(255),
INITIALS VARCHAR(255),
STREET VARCHAR(255),
CITY VARCHAR(255),
COUNTRY VARCHAR(255),
EMAIL VARCHAR(255),
GENDER VARCHAR(255),
DOB DATE,
DOD DATE,
PAYMENT_SCHEDULE INTEGER,
RATE DOUBLE,
INITIAL_VALUE INTEGER,
CREATION_DATE DATE,
LATE_FEE DOUBLE,
LATE_RATE DOUBLE,
BALANCE DOUBLE,
AUTOMAT_DEBT_IND VARCHAR(255),
GUARANTOR_ID VARCHAR(255),
MARRIED_STATUS VARCHAR(255),
CUSTOMER_STATUS VARCHAR(255)
)
DATA CAPTURE NONE
IN USERSPACE1;
5.5.2 Data quality assessment
Data quality assessment (DQA) is the process of exposing technical and
business data issues to plan the data integration effort most likely to succeed
within budget and time constraints.
• Technical quality issues based on target technical standards are generally
easy to discover and correct. Examples are as follows:
– Different or inconsistent standards in structure, format, or values
– Missing data, default values
– Spelling errors, data in wrong fields
– Buried information in free-form fields
• Business quality issues, however, are more subjective, and are associated with
business processes such as generating accurate reports, ensuring that
data-driven processes are working correctly, and ensuring that shipments are
going out on time.
Because accuracy, timeliness, and correctness are subjective measures,
assessing the business quality of the data requires the involvement of the
business community.
Note: For enterprise-level initiatives, such as ERP implementations or system
consolidation, integration challenges at both the business and technical levels
generally revolve around the semantic reconciliation of master data objects
such as customer, product, and vendor.
Because the business is the ultimate recipient and user of the data resulting from
the integration effort, the success of a DQA is greatly dependent upon the ability
and commitment of the business community to participate in the process, and
more importantly, resolve semantic and business rule differences at the
functional level.
Figure 5-3 shows a high-level overview of the primary steps of a DQA process.
• Prepare the data for assessment.
Select the data sources to be investigated and analyzed.
• Conduct data discovery.
The data analyst and subject matter expert (SME) perform the investigation
and analyses by using tools such as IBM InfoSphere Information Analyzer
and IBM InfoSphere Discovery. This task involves checking metadata
integrity, structural integrity, entity integrity, relational integrity, and domain
integrity.
• Document data quality issues and decisions.
After all information about data quality is known, the appropriate data
alignment and cleansing decisions can be made and implemented.
Note: A typical DQA lasts between four and eight weeks. In short, focused
development efforts, assessments are kept tight, although assessment can be
ongoing and iterative. In longer development efforts (six or more months),
assessments can typically run six or more weeks and are a key part of
requirements definition.
[Figure 5-3: An IT data analyst uses Information Analyzer and InfoSphere Discovery to run
full-volume profiling and automated data analysis on the staged source. The resulting
information and reports feed the data alignment decisions. The analysis covers the
following areas:
• Metadata/domain integrity: column analysis, completeness, consistency, pattern
consistency, translation table creation
• Structural integrity: table analysis, key analysis
• Entity integrity: duplicate analysis, targeted data accuracy
• Relational integrity: cross-table analysis, redundancy analysis
• Domain integrity: business rule identification and validation]
Figure 5-3 DQA approach: data assessment
The IBM InfoSphere Information Server product provides three tools for data
assessment:
• IBM InfoSphere Information Analyzer
With this product, you can assess the condition of large volumes of data
in a fraction of the time that manual analysis would take. Through its
Column Analysis, Primary Key Analysis and Cross-Table Analysis functions,
IBM InfoSphere Information Analyzer enables systematic analysis and
reporting of results, thereby allowing the data analyst and subject matter
expert to focus on the real problem of data quality issues. It enables you to
apply professional quality control methods to manage the accuracy,
consistency, completeness, and integrity of information stored in databases.
By employing technology that integrates total quality management (TQM)
principles with data modeling and relational database concepts, IBM
InfoSphere Information Analyzer diagnoses data quality problems and
facilitates data cleanup efforts.
• IBM InfoSphere QualityStage
This product complements IBM InfoSphere Information Analyzer by
investigating free-form text fields such as names, addresses, and
descriptions. With IBM InfoSphere QualityStage, you can define rules for
standardizing free-form text domains, which is essential for effective
probabilistic matching of potentially duplicate master data records. This level
of sophisticated data assessment is critical to understanding the total
cleansing effort required for a data integration project. IBM InfoSphere
QualityStage is covered in IBM WebSphere QualityStage Methodologies,
Standardization, and Matching, SG24-7546.
• IBM InfoSphere Discovery
This product accelerates information-centric project deployment and reduces
risk by creating a 360 degree view of data relationships across
heterogeneous sources. Using patented capabilities, InfoSphere Discovery
identifies and documents what data you have, where it is located and how it is
linked across systems by intelligently capturing relationships and determining
applied transformations and business rules.
Figure 5-4 summarizes the functions provided by IBM InfoSphere Information
Analyzer and IBM InfoSphere QualityStage.
[Figure 5-4: Maps the data assessment tools (Information Analyzer, InfoSphere Discovery,
and QualityStage) to the checks they support on the data sources: metadata integrity,
domain integrity, structural integrity, relational integrity, entity integrity, and ongoing
metrics. The functions shown are metadata access and enrichment, column analysis and
domain assessment, primary key analysis, foreign key analysis, cross-domain analysis,
automated analysis, data rule validation, duplicate analysis, text pattern analysis,
metrics and reporting, and baseline analysis.]
Figure 5-4 Data assessment tools functionality
A discussion of all the steps and the benefits of Data Quality Assessment (DQA)
is beyond the scope of this book. For details about these steps, see IBM
WebSphere Information Analyzer and Data Quality Assessment, SG24-7508.
In this scenario, we focus on determining the domain values in the columns in the
source systems that need to be mapped to the corresponding columns in the
MDM repository. The determination of the domain values in the source systems
might necessitate adding³ new domain values to the MDM repository to
accommodate values that exist in the source systems. For example, if the source
system rates a customer into five categories (1 - 5) and the MDM repository only
allows four categories, you must add another category to the MDM repository
code reference table for customer rating. Also, because the SIF must be loaded
with domain values expected by the MDM repository, the process creating the
SIF must map the values in the source systems to the values in the MDM
repository. Mapping tables are required for each code reference table in the MDM
repository. For example, gender might be stored as 0 (female) and 1 (male) in the
source systems, while the MDM repository expects M (male) and F (female). This
requires a mapping table for gender that maps 0 to F, and 1 to M.
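The gender example above can be sketched as a lookup keyed by source system. The following Python fragment is illustrative only: in the scenario the code mapping is held in tables used by the jobs that create the SIF, and the names GENDER_MAP and map_code are hypothetical. The values are taken from the source systems described in this chapter (Savings stores 0 and 1; Checking and Loan already store M and F).

```python
# Source-to-SIF code mapping for gender, one mapping per source system.
GENDER_MAP = {
    "CHECKING": {"M": "M", "F": "F"},
    "SAVINGS": {"0": "F", "1": "M"},
    "LOAN": {"M": "M", "F": "F"},
}


def map_code(mapping, source_system, source_value):
    """Translate a source domain value into the value expected by MDM.

    Nulls and blank values are passed through as None. An unmapped value
    raises ValueError so that it is corrected instead of being loaded.
    """
    if source_value is None:
        return None
    key = str(source_value).strip()
    if key == "":
        return None
    try:
        return mapping[source_system][key]
    except KeyError:
        raise ValueError(
            "No mapping for value %r from %s" % (key, source_system)
        ) from None
```

Keeping one mapping per source system also covers the case in which the same source value has different meanings in different systems.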
³ As part of the implementation preparation, the MDM code tables must be populated with
appropriate values. The MDM implementation process, the steps to determine what these values
should be, and how they are loaded are not within the scope of this book.
The Column Analysis Frequency Distribution Data report of IBM InfoSphere
Information Analyzer is used to determine the valid domain values⁴ in the various
code reference columns in the source systems. Figure 5-5 through Figure 5-7 on
page 147 show the report for the GENDER column in the Checking, Savings, and
Loan tables, respectively. Figure 5-5 for the Checking table shows one
invalid value, X, which is assumed to be corrected in the source system. While the
Checking and Loan tables have M and F as the domain values, the Savings table
has 0 (female) and 1 (male) as domain values.
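To make the idea of a frequency distribution concrete, the following Python sketch counts the distinct values of a column and lists those that fall outside the expected domain. It is not how IBM InfoSphere Information Analyzer is implemented or invoked; it only mimics what the report conveys. The function names and the sample values are hypothetical.

```python
from collections import Counter


def frequency_distribution(values):
    """Count each distinct column value, reporting nulls as '(null)'."""
    return Counter("(null)" if v is None else str(v).strip() for v in values)


def invalid_values(distribution, valid_values):
    """Return the values, with their counts, that are outside the domain."""
    allowed = set(valid_values) | {"(null)"}
    return {value: count for value, count in distribution.items()
            if value not in allowed}


# Hypothetical sample of a GENDER column with one invalid value.
sample = ["M", "F", "M", None, "X", "F", "M"]
distribution = frequency_distribution(sample)
unexpected = invalid_values(distribution, {"M", "F"})
```

With this sample, the value X is the only entry reported as unexpected, which corresponds to the kind of finding described for the Checking table.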
Figure 5-5 Frequency Distribution Data report for GENDER column in Checking table
Figure 5-6 Frequency Distribution Data report for GENDER column in Savings table
Figure 5-7 Frequency Distribution Data report for GENDER column in Loan table
⁴ We assume that the DQA process on the organization’s Master Data has identified invalid domain
values in the source table columns that correspond to columns in the MDM, and that these invalid
values have been corrected in the source systems before mapping tables are created.
Table 5-1 shows the columns that need to be mapped between the sources and
the target MDM repository. This list was arrived at after an analysis of the code
reference tables in the MDM repository and those in the source systems.
Table 5-1 Code table mapping between the sources and the MDM repository
Common columns | Source systems: column, source, and domain values | MDM Server: domain values and column
Country | COUNTRY in CHECKING: US, (null); COUNTRY in SAVINGS: US, (null); COUNTRY in LOAN: US, (null) | 185 and other country codes; COUNTRY_TP_CD in CDCOUNTRYTP
Customer performance | CUSTOMER_PERFORMANCE in SAVINGS: LOW, MID, HIGH, (null) | 1, 2, 3; CLIENT_IMP_TP_CD in CDCLIENTIMPTP
Customer status | CUSTOMER_STATUS in LOAN: GOLD, SILVER, BRONZE, (null); CUSTOMER_STATUS in CHECKING: A, B, C, D, (null) | 1, 2, 3, 4; CLIENT_ST_TP_CD in CDCLIENTSTTP
Gender | GENDER in CHECKING: M, F, (null); GENDER in SAVINGS: 1, 0, (null); GENDER in LOAN: M, F, (null) | M, F; not validated in MDM
Marital status | MARRIED_STATUS in LOAN: Married, Single, Divorced, (null) | 1, 2, 3; MARITAL_ST_TP_CD in CDMARITALSTTP
Nationality | NATIONALITY in CHECKING: US, (null) | 185 and other country codes; COUNTRY_TP_CD in CDCOUNTRYTP
Preferred language | PREF_LANGUAGE in CHECKING: English, (null) | 100 and other language codes; LANG_TP_CD in CDLANGTP
Salutation | SALUTATION in SAVINGS: Mr., Mrs., (spaces), (null); TITLE in LOAN: Mr., Mrs., (spaces), (null) | 14, 15, and other salutation codes; PREFIX_NAME_TP_CD in CDPREFIXNAMETP
Important: If the MDM repository is populated from the same column (such
as gender) in multiple data sources, overlapping values that have
different semantic meanings are possible. For example, in one system the
value ‘0’ might represent a female; in another system the value ‘0’ might
represent a male. When you create the canonical form, such semantic conflicts
must be resolved before the column is populated. This situation did not exist in our
scenario.
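One way to keep such codes from being misread is to key every lookup on the source system as well as on the value. The following Python sketch is only an illustration of that idea and is not part of the RDP for MDM; the names GENDER_MAPS and to_target_gender are hypothetical. The code values themselves are the ones reported by Information Analyzer for our three sources.

```python
# Hypothetical lookup: source system -> {source code: target code}.
GENDER_MAPS = {
    "CHECKING": {"M": "M", "F": "F"},
    "SAVINGS": {"0": "F", "1": "M"},  # Savings uses 0 = female, 1 = male
    "LOAN": {"M": "M", "F": "F"},
}


def to_target_gender(source, value):
    """Translate a source-specific GENDER code to the M/F target code.

    Nulls pass through as None. A value outside the source's valid
    domain raises ValueError instead of being guessed at.
    """
    if value is None:
        return None
    codes = GENDER_MAPS[source]
    if value not in codes:
        raise ValueError(f"invalid GENDER {value!r} for source {source}")
    return codes[value]


if __name__ == "__main__":
    print(to_target_gender("SAVINGS", "0"))
    print(to_target_gender("LOAN", "M"))
```

Because the table is keyed by source, a second source that used ‘0’ for male would simply get its own entry, and the two meanings could not be confused.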
Figure 5-8 on page 151 shows the mapping from the Master Data columns in
the source systems to the corresponding columns in the canonical form table
shown in Example 5-2 on page 150.
In the canonical form, note the following information:
• The CUSTOMERID column is mapped to the ADMIN_CLIENT_ID column
in the SIF, which becomes part of the SSK in the MDM data repository. It is, for
all practical purposes, the primary key for access in the source system.
Important: There is no CUSTOMERID equivalent column in the Savings
system. We therefore artificially generated a value by concatenating the
SAVINGSID column (an implicit primary key) with an additional character,
and populated the CUSTOMERID column with it. The MDM consumption
application extracts the SAVINGSID component from the CUSTOMERID
column when it needs to retrieve non-Master Data from the Savings system,
as shown in the JSP application code in Example 5-6 on page 269.
• Two columns (SRCSYSTEMID and ZIPCODE) do not have corresponding
columns in the sources. The SRCSYSTEMID column is generated based on
the source system being mapped (1 for Checking, 2 for Savings, and
3 for Loan); the ZIPCODE is embedded in other columns in the source
systems and is therefore not explicitly mapped.
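The generated key has to be reversible, because the consumption application needs the original SAVINGSID back. The following Python sketch illustrates the round trip under the assumption that the additional character is the single digit 1 appended at the end, which is consistent with the Savings rows in our canonical data. The function names are hypothetical, and this is not the JSP code of the consumption application.

```python
# SRCSYSTEMID values assigned per source system.
SOURCE_SYSTEM_IDS = {"CHECKING": 1, "SAVINGS": 2, "LOAN": 3}

# Assumption: the additional character is a trailing "1".
SAVINGS_SUFFIX = "1"


def savings_customer_id(savings_id):
    """Build the artificial CUSTOMERID for a Savings row."""
    return f"{savings_id}{SAVINGS_SUFFIX}"


def savings_id_from_customer_id(customer_id):
    """Recover SAVINGSID from an artificial CUSTOMERID."""
    if len(customer_id) <= len(SAVINGS_SUFFIX) or not customer_id.endswith(SAVINGS_SUFFIX):
        raise ValueError(f"not a generated Savings CUSTOMERID: {customer_id!r}")
    return customer_id[: -len(SAVINGS_SUFFIX)]


if __name__ == "__main__":
    generated = savings_customer_id("20000016")
    print(generated, savings_id_from_customer_id(generated))
```

The check in savings_id_from_customer_id matters only for Savings rows (SRCSYSTEMID 2); CUSTOMERID values from Checking and Loan are real source keys and must not be stripped.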
Example 5-2 DDL of the canonical form table
CREATE TABLE STAGING.CANONICAL_TBL (
SRCSYSTEMID INTEGER NOT NULL,
CUSTOMERID VARCHAR(255) NOT NULL,
ACCOUNTID VARCHAR(255) NOT NULL,
WORKSTATUS VARCHAR(255),
CELLNB VARCHAR(255),
PHONENB VARCHAR(255),
EMAIL VARCHAR(255),
PASSPORTNB VARCHAR(255),
DRIVERLICNB VARCHAR(255),
SSN VARCHAR(255),
FIRSTNAME VARCHAR(255),
LASTNAME VARCHAR(255),
INITIALS VARCHAR(255),
STREETADDRESS VARCHAR(255),
CITY VARCHAR(255),
COUNTRY VARCHAR(255),
ZIPCODE VARCHAR(255),
DOD DATE,
DOB DATE,
MARITALSTATUS VARCHAR(255),
GENDER CHAR(1),
NATIONALITY VARCHAR(255),
CUSTOMERSTATUS VARCHAR(255),
CUSTOMERPERF VARCHAR(255),
STARTDATE DATE,
SOLICITATIONALLOW VARCHAR(255),
AGEVERIFICATIONDOC VARCHAR(255),
SALUTATION VARCHAR(255),
PREF_LANGUAGE VARCHAR(255),
FREEFORMNAME VARCHAR(255),
FREEFORMADDRESS VARCHAR(255)
)
DATA CAPTURE NONE
IN STAGINGSPACE;
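The constraints in this DDL are few: three key columns are NOT NULL, and GENDER holds a single character. As a rough aid when preparing test rows, the following Python sketch checks a row against just those constraints before it is loaded. It is an illustration only; the function name check_row is hypothetical, and the database remains the authority on what it accepts.

```python
# Column order as declared in STAGING.CANONICAL_TBL.
CANONICAL_COLUMNS = [
    "SRCSYSTEMID", "CUSTOMERID", "ACCOUNTID", "WORKSTATUS", "CELLNB",
    "PHONENB", "EMAIL", "PASSPORTNB", "DRIVERLICNB", "SSN", "FIRSTNAME",
    "LASTNAME", "INITIALS", "STREETADDRESS", "CITY", "COUNTRY", "ZIPCODE",
    "DOD", "DOB", "MARITALSTATUS", "GENDER", "NATIONALITY",
    "CUSTOMERSTATUS", "CUSTOMERPERF", "STARTDATE", "SOLICITATIONALLOW",
    "AGEVERIFICATIONDOC", "SALUTATION", "PREF_LANGUAGE", "FREEFORMNAME",
    "FREEFORMADDRESS",
]
NOT_NULL_COLUMNS = ("SRCSYSTEMID", "CUSTOMERID", "ACCOUNTID")


def check_row(row):
    """Return a list of problems found in a {column: value} row."""
    problems = []
    for name in sorted(set(row) - set(CANONICAL_COLUMNS)):
        problems.append(f"unknown column {name}")
    for name in NOT_NULL_COLUMNS:
        if row.get(name) is None:
            problems.append(f"{name} must not be null")
    gender = row.get("GENDER")
    if gender is not None and len(gender) > 1:
        problems.append("GENDER must fit in CHAR(1)")
    return problems


if __name__ == "__main__":
    print(check_row({"SRCSYSTEMID": 2, "CUSTOMERID": "200000161",
                     "ACCOUNTID": "20000016", "GENDER": "0"}))
```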
[Figure 5-8 diagram: the columns of the CHECKING, SAVINGS, and LOANS source tables are connected to the corresponding columns of the canonical form table listed in Example 5-2.]
SRCSYSTEMID is assigned a value of 1 (checking), 2 (savings) or 3 (loans) depending upon the source
ZIPCODE has no assignment from any of the input sources
Figure 5-8 Mapping from source(s) to canonical form
5.5.3 Create canonical form from the data sources
This section describes the mapping of data from the three source systems to a
single canonical form table.
The primary steps involved are as follows:
1. Define the sources to canonical form table target mapping.
2. Populate the canonical form table.
Define the sources to canonical form table target mapping
We used the IBM InfoSphere Information Server FastTrack component, Version
8.0.1, to perform the mapping and generate the DataStage jobs (see footnote 5).
FastTrack provides Data Architects and Business Analysts with a drag-and-drop
user interface to InfoSphere Information Server, which allows them to define
source-to-target mapping specifications and to define and track additional
requirements for data transformations. From these mapping specifications,
DataStage jobs (see footnote 6) and job templates are generated. The DataStage
developer can test and verify the generated job, modify it according to
DataStage development best practices, and then
execute it to move source data to the target.
Note: We assume that the DQA has taken place previously, and therefore the
required ODBC data sources (see Figure 5-9, which includes the definition of
the FBANKCOT and IADB data sources) have been defined for both the
source and target systems. All the data sources that were imported using the
InfoSphere Information Server console are also available to FastTrack users.
The metadata acquired from these data sources is used to identify the target
columns and tables in FastTrack, and to configure ODBC connectivity in the
generated DataStage jobs.
Footnote 5: Template jobs for more complex requirements.
Footnote 6: In our scenario, the mappings are relatively simple. Therefore, the mapping
specifications are used to generate runnable DataStage jobs.
[FBANKCOT]
QEWSD=39715
Driver=/opt/IBM/InformationServer/Server/branded_odbc/lib/VMdb222.so
Description=FICTIONAL BANKING COMPANY T DATA SOURCE
AddStringToCreateTable=
AlternateID=
Database=FBANKCOT
DynamicSections=100
GrantAuthid=PUBLIC
GrantExecute=1
IpAddress=9.43.86.101
IsolationLevel=CURSOR_STABILITY
LogonID=db2inst1
Password=itso13sj
PackageOwner=
TcpPort=60000
WithHold=1
[iadb]
QEWSD=39715
Driver=/opt/IBM/InformationServer/Server/branded_odbc/lib/VMdb222.so
Description=IADB connection
AddStringToCreateTable=
AlternateID=
Database=iadb
DynamicSections=100
GrantAuthid=PUBLIC
GrantExecute=1
IpAddress=9.43.86.104
IsolationLevel=CURSOR_STABILITY
LogonID=iauser
Password=itso13sj
PackageOwner=
TcpPort=50000
WithHold=1
Figure 5-9 ODBC data sources on odbc.ini file
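Before opening FastTrack, it can be useful to confirm that the expected data source names are present and point to the intended hosts. The following Python sketch reads odbc.ini-style text with the standard configparser module and summarizes each data source. It is an assumption-laden illustration: the function name summarize_data_sources is hypothetical, the sample text is an abbreviated copy of Figure 5-9, and credentials are deliberately left out of both the sample and the summary.

```python
import configparser

# Abbreviated copy of the two entries in Figure 5-9 (credentials omitted).
SAMPLE_ODBC_INI = """\
[FBANKCOT]
Driver=/opt/IBM/InformationServer/Server/branded_odbc/lib/VMdb222.so
Database=FBANKCOT
IpAddress=9.43.86.101
TcpPort=60000
PackageOwner=

[iadb]
Driver=/opt/IBM/InformationServer/Server/branded_odbc/lib/VMdb222.so
Database=iadb
IpAddress=9.43.86.104
TcpPort=50000
"""


def summarize_data_sources(ini_text):
    """Return {data source name: (database, host, port)}."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(ini_text)
    summary = {}
    for name in parser.sections():
        section = parser[name]
        if "Database" not in section:
            continue  # skip sections that are not data source entries
        summary[name] = (
            section["Database"],
            section["IpAddress"],
            int(section["TcpPort"]),
        )
    return summary


if __name__ == "__main__":
    for name, details in summarize_data_sources(SAMPLE_ODBC_INI).items():
        print(name, details)
```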
Figure 5-10 on page 154 through Figure 5-22 on page 167 show windows for
creating a specification that maps the SAVINGS source columns to the
corresponding CANONICAL target columns, using FastTrack, and the generation
and configuration of the DataStage job for that specification.
Note: The same mapping procedure applies to the CHECKING and LOAN sources,
but it is not shown here.
The steps are as follows:
1. Figure 5-10 shows the FastTrack login to the appropriate server (virgo) with
the user ID isadmin; we assume that this user has the required permissions
to access InfoSphere Information Server.
Figure 5-10 Define the sources to canonical form table target mapping (1 of 13)
2. FastTrack source-to-target mapping specifications are contained in projects.
We opened a previously created project, named SourceToSif_Canonical, for
our mapping specification as shown in Figure 5-11.
Figure 5-11 Define the sources to canonical form table target mapping (2 of 13)
3. Figure 5-12 shows a list of source-to-target mapping specifications defined in
this project. Click New Mapping in the Tasks list to create a new mapping
specification.
Figure 5-12 Define the sources to canonical form table target mapping (3 of 13)
4. Provide details of the new mapping specification in the Mapping Editor as
shown in Figure 5-13, such as Name (SAVINGS_TO_CANONICAL). Click
Column Mappings in the Basic section of the tab list to map the columns.
Figure 5-13 Define the sources to canonical form table target mapping (4 of 13)
5. As shown in Figure 5-14, open the Database metadata tab and expand the
metadata tree under the target host (ORION.ITSOSJ.SANJOSE.IBM.COM),
and navigate to the target canonical table (STAGING.CANONICAL_TBL in
the database FBANKCOT). Drag this target table to the mapping canvas in
the Target Columns field. This move causes this area to be populated with the
columns from the STAGING.CANONICAL_TBL as shown.
Figure 5-14 Define the sources to canonical form table target mapping (5 of 13)
6. As shown in Figure 5-15, open the Database metadata tab, expand the metadata
tree, and navigate to the required source table (STAGING.SAVINGS in
the FBANKCOT database on the source host
ORION.ITSOSJ.SANJOSE.IBM.COM).
Select a required source column, such as CITY, and drag it onto the
mapping canvas in the Source Columns field that corresponds to the target
column (CITY), as shown.
Figure 5-15 Define the sources to canonical form table target mapping (6 of 13)
7. Repeat the process for the remaining columns. However, for the
target columns CUSTOMERID and SRCSYSTEMID, we must define a
transformation function as follows:
– As mentioned previously, the CUSTOMERID column is generally
mapped to the ADMIN_CLIENT_ID column in the SIF, which becomes part
of the SSK in the MDM data repository. It is, for all practical purposes, the
primary key for access in the source system.
However, because there is no CUSTOMERID equivalent column in the
Savings system, we artificially generated a value by concatenating the
SAVINGSID column (an implicit primary key) with an additional character
(1), and populated the CUSTOMERID column with it. This is achieved by placing
a value of SAVINGSID: 1 in the Transformation Function field for the
target CUSTOMERID column, as shown in Figure 5-16 on page 161.
The MDM consumption application must take this concatenation into account
when it retrieves non-Master Data from the Savings system, as shown in the
JSP application code in Example 5-6 on page 269.
Figure 5-16 Define the sources to canonical form table target mapping (7 of 13)
– For the target column SRCSYSTEMID, we need a
constant value (2) to indicate that SAVINGS is the source
system. This is achieved by placing a value of 2 in the Transformation
Function field for the target SRCSYSTEMID column, as shown in
Figure 5-17 on page 162.
Figure 5-17 also shows the Transformation Function field with a value of SetNull() for
some of the columns, such as CUSTOMERSTATUS and STARTDATE.
Note: The mapping specification is complete when all the columns have
been mapped correctly.
Click Save and then Close.
Figure 5-17 Define the sources to canonical form table target mapping (8 of 13)
8. Select the newly created mapping specification SAVINGS_TO_CANONICAL
and click Generate Job in the Tasks list as shown in Figure 5-18.
Figure 5-18 Define the sources to canonical form table target mapping (9 of 13)
9. For the Composition Type, select No Composition, as shown in Figure 5-19,
because we are only generating a job from a single mapping specification.
Click Next.
Figure 5-19 Define the sources to canonical form table target mapping (10 of 13)
10. Select the project (SourceToSIF) and folder (SourceToCanonical). The
generated job (with the Name of New Job SAVINGS_TO_CANONICAL) is saved
in the selected project and folder, as shown in
Figure 5-20. Click Next.
Figure 5-20 Define the sources to canonical form table target mapping (11 of 13)
11. Connection details for the generated job need to be defined in the job
parameters, as shown in Figure 5-21. Job parameters are used to pass
database connection data such as user name and password. These
database connections generally consist of the source
database, the target database, and the lookup data source (if lookups are
implemented). Navigate to the data sources and enter the appropriate job
parameter names for each data source. Supply the required user name and
password parameters, and click Finish to generate the DataStage job.
Figure 5-21 Define the sources to canonical form table target mapping (12 of 13)
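The mapping specification built in the preceding steps can be thought of as a table of target columns, each with a rule that derives its value from a source row. The following Python sketch models a small part of the SAVINGS_TO_CANONICAL specification in that way. It is not FastTrack or DataStage code: the names set_null, SAVINGS_TO_CANONICAL, and apply_mapping are hypothetical, we read the expression SAVINGSID: 1 as a concatenation, and the ACCOUNTID rule is an assumption inferred from our canonical data rather than taken from the figures.

```python
def set_null(_row):
    """Stand-in for the SetNull() transformation function."""
    return None


# Target column -> rule that derives its value from a Savings source row.
SAVINGS_TO_CANONICAL = {
    "SRCSYSTEMID": lambda row: 2,                       # constant for Savings
    "CUSTOMERID": lambda row: row["SAVINGSID"] + "1",   # SAVINGSID: 1
    "ACCOUNTID": lambda row: row["SAVINGSID"],          # assumption
    "CITY": lambda row: row["CITY"],                    # direct column mapping
    "CUSTOMERSTATUS": set_null,
    "STARTDATE": set_null,
}


def apply_mapping(spec, source_row):
    """Produce a canonical row by applying every rule in the specification."""
    return {target: rule(source_row) for target, rule in spec.items()}


if __name__ == "__main__":
    print(apply_mapping(SAVINGS_TO_CANONICAL,
                        {"SAVINGSID": "20000016", "CITY": "San Jose"}))
```

Keeping the rules in one structure per source mirrors the way one mapping specification is created per source table, so the CHECKING and LOAN specifications would be separate dictionaries with SRCSYSTEMID values of 1 and 3.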
The generated job that is shown in Figure 5-22 on page 167 is the job
corresponding to CHECKING_TO_CANONICAL instead of
SAVINGS_TO_CANONICAL. This was an error on our part while capturing the
screens.
Figure 5-22 Populate the canonical form table (13 of 13)
Populate the canonical form table
After the DataStage jobs have been generated (and modified if necessary to
meet additional requirements), the jobs need to be compiled and executed. The
DataStage jobs can either be run one by one from Director, or a job sequence
can be built that controls the jobs.
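A third option, which we did not use, is to start the jobs from a script. The following Python sketch only assembles command lines for the dsjob command-line client that is supplied with DataStage; it does not run them. Treat every detail as an assumption to verify against your installation: the job name LOAN_TO_CANONICAL and the parameter names are hypothetical, and the options shown (-run, -jobstatus, -param) should be checked against the dsjob documentation for your version.

```python
import shlex

PROJECT = "SourceToSIF"
JOBS = [
    "CHECKING_TO_CANONICAL",
    "SAVINGS_TO_CANONICAL",
    "LOAN_TO_CANONICAL",  # hypothetical name for the third job
]


def dsjob_command(project, job, params):
    """Build the argument list for one dsjob run request."""
    command = ["dsjob", "-run", "-jobstatus"]
    for name, value in params.items():
        command += ["-param", f"{name}={value}"]
    command += [project, job]
    return command


if __name__ == "__main__":
    # Hypothetical job parameter names; use the ones defined for your jobs.
    params = {"SourceUser": "db2inst1", "TargetUser": "db2inst1"}
    for job in JOBS:
        print(shlex.join(dsjob_command(PROJECT, job, params)))
```

Passwords are intentionally absent from the example; supplying them on a command line exposes them in process listings, so a protected parameter mechanism is preferable.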
Notes:
• As mentioned earlier, we mistakenly captured the figure of the
CHECKING_TO_CANONICAL job instead of SAVINGS_TO_CANONICAL.
The execution of the CHECKING_TO_CANONICAL job is described in
Figure 5-23 and Figure 5-24 on page 168.
• Figure 5-22 on page 167 shows the generated job
CHECKING_TO_CANONICAL in Designer. If you change the job design,
the Transformer stage of the generated job is of particular
interest, because this stage implements the derivations and source-to-target
mappings. We did not make any modifications.
Figure 5-23 and Figure 5-24 show the windows for the execution
of the generated job. After all the sources were processed, the partial
contents of the canonical form table are as shown in Example 5-3 on page 169. The
steps are as follows:
1. Open the Director Client.
2. Navigate through the job folders to the newly generated job
CHECKING_TO_CANONICAL, as shown in Figure 5-23.
Figure 5-23 Populate the canonical form
3. In the Job Run Options, set the
appropriate values for the job parameters, and click Run, as shown in
Figure 5-24.
Figure 5-24 Populate the canonical form table 3/4
4. Review the job log in Director to check the job’s execution.
5. The data from all the sources must be loaded into the CANONICAL_TBL. The
partial contents of this table are shown in Example 5-3.
Example 5-3 Partial contents of CANONICAL_TBL
3,"8000037","30000004",,,,"[email protected]","608813863",,,"Denise ","Farrel","DF","1735 Saratoga Ave","San Jose","US",,,19860902,"Divorced","F",,,"GOLD ",,,,"Mr.",,,
3,"8000885","30000035",,,,"[email protected]","743118324",,,"Kelly ","Hopkins","KH","1482 Rhode Island St","San Francisco","US",,,20010317,"Married","F",,,"BRONZE ",,,,"Mrs.",,,
3,"8000637","30000029",,,,"[email protected]","111695965",,,"Burr ","Preston","BP","3726 Broderick St","San Francisco","US",,,19890725,"Divorced","M",,,"BRONZE ",,,,,,,
3,"8000002","30000016",,,,"[email protected]","921319004",,,"Allan ","Jensen","AJ","PO Box 7424","San Francisco","US",,,19570329,"Married","M",,,"BRONZE ",,,,"Mrs.",,,
................
2,"200000161","20000016",,"(415) 923-1998","(408) 269-0922",,,"S99887766","111-345-2312",,,,"2584 Junction Ave","San Jose","US",,,19971102,,"0",,,"HIGH ",,"Y",," ",,"A Fanelli",
2,"200000071","20000007",,"(408) 919-1500","(999) 999-9999",,,"S12312311",,,,,"1603 Bel Air Ave","San Jose","US",,,19890823,,"0",,,"MID ",,"Y",," ",,"Anna Fanelli",
2,"200000001","20000000",,"(408) 236-2527","(999) 999-9999",,,"S45118674","232-22-4444",,,,"6177 Purple Sage Ct","San Jose","US",,,19370802,,"1",,,"MID ",,"Y",," ",,"Bruce H Anderson",
2,"200000121","20000012",,"(415) 561-8511","(408) 782-7100",,,"S22334455","112-99-1212",,,,"321 Curie Drivee","San Jose","US",,,19750902,,"1",,,"MID ",,"Y",,"Mrs.",,"Torben Andersom",
2,"200000171","20000017",,"(415) 673-4598","(408) 919-1500",,,"E123456789","456-34-4563",,,,"5528 Muir Dr","San Jose","US",,,19900314,,"1",,,"MID ",,"N",,"Mrs.",,"A Carter",
................
1,"70006245","10000022","Empl",,"(415) 296-9450",,,"xxxxxxxx","133-34-2345",,,,,,"US",,,19770918,,"M","US","B ",,,,"Drivers License",,"English","Andrew I Jensen","44 Montgomery St,Ste 3705,San Francisco,94104"
1,"70004432","10000006","Empl",,"(408) 850-6400",,,"S67856745","",,,,,,"US",,,19760802,,"M","US","C ",,,,"Drivers License",,"English","Gayle Fagan","2315 N 1st St,,San Jose,95119"
1,"70002305","10000023","Empl",,"(415) 282-0219",,,"S12312311","234-45-3434",,,,,,"US",,,19451022,,"F","US","B ",,,,"Drivers License",,"English","Anette A Jensen","77 Grand View Ave,Apt 202,San Francisco,94114"
1,"70002268","10000004","Empl",,"(800) 817-8232",,"134785432",,"xxx-xx-xxxx",,,,,,"US",,,19980803,,"M","US","B ",,,,"Passport",,"English","Anders Olsson","2050 North First Street,,San Jose,95119"
1,"70006863","10000020","Empl",,"(415) 683-0763",,,"xxxxxxxx","123-45-6789",,,,,,"US",,20081001,19670502,,"M","US","B ",,,,"Drivers License",,"English","Aaron Jensen","1363 14th Ave,,San Francisco,94122"
1,"70007096","10000027","Empl",,"(415) 677-9723",,"111345674",,"",,,,,,"US",,,19960411,,"M","US","A ",,,,"Passport",,"English","Allan Preston","720 Market St,Ste 900,San Francisco,94102"
1,"70005799","10000008","Empl",,"(800) 553-6387",,,"S98765432","345-34-2378",,,,,,"US",,,19370901,,"F","US","C ",,,,"Drivers License",,"English","Arcangelo Fanelli","170 W Tasman Dr,,San Jose,95119"
1,"70003060","10000024","Empl",,"(415) 586-7966",,,"S34565422","123-22-2222",,,,,,"US",,,19660825,,"X","US","B ",,,,"Drivers License",,"English","Anton T & Larue Jensen","258 Lisbon St,,San Francisco,94112"
1,"70005333","10000001","Empl",,"(408) 226-2327",,,"S13494673","453-42-1234",,,,,,"US",,,19450312,,"F","US","B ",,,,"Drivers License",,"English","Christina Anderson","6181 Camino Verde Dr,,San Jose,95119"
1,"70007859","10000002","Empl",,"(408) 782-7100",,,"S33433434","543-23-9999",,,,,,"US",,,19671201,,"F","US","B ",,,,"Drivers License",,"English","Alexandra Anderson","321 Curie Drivee,,San Jose,95119"
................
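An export like this one is easy to spot-check with a few lines of script. The following Python sketch, which is an illustration rather than part of our process, parses a handful of rows copied from Example 5-3 and collects the GENDER values seen for each SRCSYSTEMID. The function name gender_values_by_source is hypothetical, and the column position assumes that the export follows the column order of the Example 5-2 DDL.

```python
import csv
import io
from collections import defaultdict

COLUMN_COUNT = 31   # columns in STAGING.CANONICAL_TBL
GENDER_INDEX = 20   # zero-based position of GENDER in the DDL column order

# Two Savings rows and two Checking rows copied from Example 5-3.
SAMPLE = '''
2,"200000161","20000016",,"(415) 923-1998","(408) 269-0922",,,"S99887766","111-345-2312",,,,"2584 Junction Ave","San Jose","US",,,19971102,,"0",,,"HIGH ",,"Y",," ",,"A Fanelli",
2,"200000001","20000000",,"(408) 236-2527","(999) 999-9999",,,"S45118674","232-22-4444",,,,"6177 Purple Sage Ct","San Jose","US",,,19370802,,"1",,,"MID ",,"Y",," ",,"Bruce H Anderson",
................
1,"70004432","10000006","Empl",,"(408) 850-6400",,,"S67856745","",,,,,,"US",,,19760802,,"M","US","C ",,,,"Drivers License",,"English","Gayle Fagan","2315 N 1st St,,San Jose,95119"
1,"70003060","10000024","Empl",,"(415) 586-7966",,,"S34565422","123-22-2222",,,,,,"US",,,19660825,,"X","US","B ",,,,"Drivers License",,"English","Anton T & Larue Jensen","258 Lisbon St,,San Francisco,94112"
'''


def gender_values_by_source(text):
    """Return {SRCSYSTEMID: set of GENDER values} for complete rows."""
    seen = defaultdict(set)
    for fields in csv.reader(io.StringIO(text)):
        if len(fields) != COLUMN_COUNT:
            continue  # skip blank lines and the "......" elision markers
        seen[int(fields[0])].add(fields[GENDER_INDEX])
    return dict(seen)


if __name__ == "__main__":
    for source_id, values in sorted(gender_values_by_source(SAMPLE).items()):
        print(source_id, sorted(values))
```

For these four rows, the sketch reports M and X for source 1 (Checking) and 0 and 1 for source 2 (Savings), which matches the Information Analyzer findings: the X value in Checking is the invalid one, and Savings uses numeric codes.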
5.5.4 Validate and modify efficacy of the RDP MDM rule sets
As mentioned previously, the purpose of creating a canonical form was to have a
single format for validating the efficacy of the RDP for MDM rule sets, and for
simplifying the DataStage jobs for creating the SIF, regardless of the number of
data sources involved. The data used for validating the efficacy of the RDP for
MDM rule sets should be a representative sample of all the data.
Note: In our test environment, the volume of data was quite small and we
therefore chose to use all of it as input to this process.
If the RDP for MDM rule sets are modified to address your organization’s data,
then these modified rule sets must replace the corresponding default ones in the
RDP for MDM jobs.
Important: We adopted this approach after our bad experience with directly
executing the RDP for MDM jobs on the canonical form data (without this
validation step) in which a majority of the rows got rejected by the RDP for
MDM jobs because the critical CITY field in the SIF record was empty
(because of standardization errors). We then introduced this approach of
validating the canonical form data with the RDP for MDM rule sets, with the
intention of modifying the default RDP for MDM rule sets to address potential
problems with the CITY field in particular.
Finally, because of time constraints, we chose to override the USPREP rule
set (which is not in the RDP for MDM rule sets) only to ensure that the CITY
name in the input was passed on to the SIF record appropriately. We did not
make any changes to improve the quality of the standardization such as
modifying the classifications and other overrides to fix problems such as
misspellings of “Drivee” and “Avedue.”
Be sure to also perform the overrides needed to correct such problems, so that
quality data is loaded into your MDM data repository. For this reason, we
describe the process of exporting the RDP for MDM rule sets and
importing them back after they are changed.
In this section, we describe the following tasks:
• Importing “out-of-the-box” (OOTB) RDP for MDM rule sets into a DataStage
project
• Validating the RDP for MDM rule sets in the standardization job
• Overriding input patterns, rerunning the standardization job, and exporting the
modified rule set
• Importing the modified RDP for MDM rule sets into the RDP for MDM jobs
Note: As mentioned previously, the following information is not meant to be a
tutorial about the use of QualityStage because that is beyond the scope of this
book. See IBM WebSphere QualityStage Methodologies, Standardization,
and Matching, SG24-7546 for information about using QualityStage. In this
book, we include several relevant figures to facilitate a better understanding of
the process adopted and guidelines proposed.
Import OOTB RDP for MDM rule sets into a DataStage project
We created a project named MDMTESTRULE, in which we created a
standardization job to analyze all the data created in canonical form (described in
5.5.3, “Create canonical form from the data sources” on page 151) for efficacy of
the OOTB RDP for MDM rule sets.
Figure 5-25 through Figure 5-31 on page 176 show several windows in the
import process, as follows:
1. Launch the WebSphere DataStage and QualityStage Designer, and from the
task bar, click Import > DataStage Components, as shown in Figure 5-25.
Figure 5-25 Import OOTB RDP for MDM rule sets into a standardization job (1 of 7)
2. In the DataStage Repository Import window, specify the RDP for MDM jobs
DSX file, select the Import selected radio button to select the components we
want to import, and click OK, as shown in Figure 5-26.
Figure 5-26 Import OOTB RDP for MDM rule sets into a standardization job (2 of 7)
Figure 5-27 through Figure 5-29 on page 174 show the available
components.
Figure 5-27 Import OOTB RDP for MDM rule sets into a standardization job (3 of 7)
Figure 5-28 Import OOTB RDP for MDM rule sets into a standardization job (4 of 7)
Figure 5-29 Import OOTB RDP for MDM rule sets into a standardization job (5 of 7)
3. Because we are interested only in the components related to name and
address standardization, we select only those components (four shared
containers and all the rule sets) and click OK. The progress of the import of
the selected components is shown in Figure 5-30.
Figure 5-30 Import OOTB RDP for MDM rule sets into a standardization job (6 of 7)
Note: As mentioned previously, we actually only made changes to the
USPREP rule set, which is not in the RDP for MDM rule sets.
4. At the completion of the import, review the imported components in the
ValidationStanContainers in the navigation pane in Figure 5-31 on page 176.
Figure 5-31 Import OOTB RDP for MDM rule sets into a standardization job (7 of 7)
You can now proceed to validate the efficacy of the OOTB RDP for MDM rule
sets in the standardization job.
Validate RDP for MDM rule sets on the standardization job
We created a copy (named ORGUSPREP; see footnote 7) of the USPREP OOTB RDP for
MDM rule set and validated it with a representative sample of the canonical
form data; in our case, because our volume of data was quite small, we used all
the data in our sources as input to the validation effort.
Important: We validated the address field only, because our focus was on
ensuring that the city name (see note a) could be parsed from the SIF
FREEFORMADDRESS field to populate the CITY_NAME column, which is
required by the MDM data repository. Other critical fields (see note b), such as
LAST_NAME and POSTAL_CODE, could have been standardized, but we
chose not to include them here.
Our objective here was to ensure that most, if not all, input data was loaded by
RDP for MDM into the MDM data repository. Toward this goal, we focused on
ensuring that the critical columns (CITY in this case) had the necessary
information. This led us to work on the USPREP (see note c) rule set. As mentioned
earlier, because of time constraints, we did not expend additional effort to
modify the other rule sets to enhance the quality of the standardization
performed by the OOTB RDP for MDM rule sets. However, given our
recommendation to work with all the OOTB RDP for MDM rule sets, we
demonstrate here the process of validating the efficacy of the OOTB RDP for
MDM rule sets and modifying them if necessary for subsequent replacement
of the original OOTB rule sets.
a. The CITY name field must be populated during load by RDP for MDM for inserting
into the MDM data repository.
b. For a list of all critical fields, see the “Required for Insert” and “Required for
Update” columns in the MDM_RDP_SIF mapping template.
c. As mentioned earlier, the USPREP rule set is not supplied with the RDP for MDM
jobs. It is a part of the standard QualityStage rule sets.
Footnote 7: A copy was created as a backup, and also to be able to run standardization by
using the original OOTB RDP for MDM rule sets. If modifications are required, they have to be
performed on USPREP, which would then be imported into the RDP for MDM jobs to replace the
original OOTB USPREP rule set.
Figure 5-32 on page 180 through Figure 5-44 on page 188 show several
windows that describe the validation process, as follows:
1. Launch the WebSphere DataStage and QualityStage Designer and display
the VSSTANAddress shared container on the Designer canvas.
VSSTANAddress is the shared container that contains the USPREP stage to
be validated. This stage processes address data from one or more source
columns and moves it into appropriate domain columns. Because we were
only interested in the address fields, our focus was on reviewing the street
address in the AddressDomain_USPREP column and the city name, state,
and Zip code in the AreaDomain_USPREP column.
2. We inspected the USPREP stage to see which rule sets were used and how
they were used. We did not modify it. We reviewed them to generate a
corresponding standardization job (J02_ORGUSPREP_STAN) to test the
OOTB RDP for MDM rule sets. The current names of the source columns are
ADDR_LINE_ONE, ADDR_LINE_TWO, and ADDR_LINE_THREE in the SIF
files. The literal ZQADDRZQ (see footnote 8) is included here.
3. A copy of these rule sets was created in an ORGUSPREP folder.
4. We created our standardization job (J02_ORGUSPREP_STAN) with the
ORGUSPREP rule set. The ORGUSPREP stage identifies the two address
columns in the canonical form data (FREEFORMADDRESS and
STREETADDRESS; see footnote 9) with the literal ZQADDRZQ, as shown in
Figure 5-40 on page 185 and Figure 5-41 on page 186 (which has two literals
ZQADDRZQ).
The reason for including the two literals ZQADDRZQ is that our inspection of
the use of USPREP by RDP for MDM (as shown in Figure 5-32 on page 180
and Figure 5-33 on page 180) shows the following sequence:
ZQADDRZQ
ADDR_LINE_ONE
ZQADDRZQ
ADDR_LINE_TWO
ZQADDRZQ
ADDR_LINE_THREE
We do not supply values for ADDR_LINE_TWO and
ADDR_LINE_THREE in our canonical form data, and instead have the
following sequence:
ZQADDRZQ
FREEFORMADDRESS
STREETADDRESS
ZQADDRZQ
ZQADDRZQ
The last two literals are essential to make our patterns match the patterns that
are generated in RDP for MDM.
5. Figure 5-42 on page 186 shows execution of the J02_ORGUSPREP_STAN
job.
6. Figure 5-43 on page 187 shows the data in the STREETADDRESS and
FREEFORMADDRESS columns in the canonical form data. The
FREEFORMADDRESS column contains the city name and Zip code data;
STREETADDRESS contains only the street address, and the CITY column has
city information corresponding to the data in the STREETADDRESS column.
7. Figure 5-44 on page 188 shows the AreaDomain_ORGUSPREP column
contents after the processing by the ORGUSPREP rule set. It shows a failure to
move the city name to this column from the FREEFORMA in the input. The
InputPattern_ORGUSPREP column shows the input pattern for the
addresses that were not processed correctly by ORGUSPREP.
Footnote 8: This literal specifies that after field overrides and field modifications are
applied, the rule set checks for common Address patterns. If none are found, it checks for
Name and Area patterns. If none are found, the field is defaulted to Address. See IBM
WebSphere QualityStage Methodologies, Standardization, and Matching, SG24-7546 for details
about such literals.
Footnote 9: FREEFORMADDRESS is really the column we wanted to target. But we knew that
either FREEFORMADDRESS or STREETADDRESS contained the data we needed. Therefore, this
setup ensures that one address is generated to be standardized for each row.
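The role of the two trailing ZQADDRZQ literals is easier to see when the two sequences are written out side by side. The following Python sketch is a deliberately simplified model, not QualityStage behavior: it treats each sequence as a list of literals and column values, drops empty values, and compares the resulting token streams. All function names are hypothetical, and the sample address is taken from our canonical data.

```python
LITERAL = "ZQADDRZQ"


def rdp_sequence(addr_line_one, addr_line_two="", addr_line_three=""):
    """Order of literals and columns used by USPREP in RDP for MDM."""
    return [LITERAL, addr_line_one, LITERAL, addr_line_two,
            LITERAL, addr_line_three]


def canonical_sequence(freeform_address, street_address):
    """Order used in J02_ORGUSPREP_STAN: two columns, two trailing literals."""
    return [LITERAL, freeform_address, street_address, LITERAL, LITERAL]


def tokens(parts):
    """Flatten a sequence into whitespace-separated tokens, skipping empties."""
    return " ".join(part for part in parts if part).split()


if __name__ == "__main__":
    address = "2315 N 1st St,,San Jose,95119"
    # Only one of the two canonical address columns is populated per row.
    print(tokens(canonical_sequence(address, "")))
    print(tokens(rdp_sequence(address)))
```

When address lines two and three are empty, both sequences reduce to one literal, the address tokens, and two trailing literals. Without the two extra literals in the validation job, the patterns it produces could differ from those that RDP for MDM generates for the same address.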
Important: Because there is no city information, these records would fail
validation by the RDP for MDM jobs and be rejected. As
mentioned previously, our earlier attempt without the pre-validation
phase resulted in the rejection of a majority of the rows by the RDP for
MDM jobs. Our analysis of the errors indicated that the missing critical
CITY field was the cause of the rejections. Our pre-validation phase
confirmed the missing values of this critical field. Other critical
fields that can cause rows to be rejected include LAST_NAME and
POSTAL_CODE, although problems with these fields did not appear in our
processing.
Because we did not want these records to be rejected, we proceeded to
override the input patterns so that city names are processed properly,
as described in section “Overriding patterns” on page 188.
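The pre-validation idea itself is simple: before handing records to the load jobs, separate those whose critical fields are empty. The following Python sketch shows the principle on plain dictionaries. It is an assumption-based illustration, not the logic of the RDP for MDM jobs; the function name split_rejects and the sample records are hypothetical, and the field names are the critical fields mentioned in this section.

```python
# Critical field checked in our scenario; LAST_NAME and POSTAL_CODE are
# other critical fields that could be added to this tuple.
CRITICAL_FIELDS = ("CITY_NAME",)


def split_rejects(records, critical_fields=CRITICAL_FIELDS):
    """Split records into (accepted, rejected).

    A record is rejected when any critical field is missing, None, or
    blank. Each rejected entry is (record, list of missing fields).
    """
    accepted, rejected = [], []
    for record in records:
        missing = [name for name in critical_fields
                   if not (record.get(name) or "").strip()]
        if missing:
            rejected.append((record, missing))
        else:
            accepted.append(record)
    return accepted, rejected


if __name__ == "__main__":
    sample = [
        {"LAST_NAME": "Fagan", "CITY_NAME": "San Jose"},
        {"LAST_NAME": "Jensen", "CITY_NAME": ""},
    ]
    accepted, rejected = split_rejects(sample)
    print(len(accepted), "accepted;", len(rejected), "rejected")
```

Running such a check on a representative sample gives an early count of the rows that would be rejected, which is the information that led us to work on the input patterns.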
Figure 5-32 Validate RDP for MDM rule set on the standardization job (1 of 13)
Figure 5-33 Validate RDP for MDM rule set on the standardization job (2 of 13)
Figure 5-34 Validate RDP for MDM rule set on the standardization job (3 of 13)
Figure 5-35 Validate RDP for MDM rule set on the standardization job (4 of 13)
Figure 5-36 Validate RDP for MDM rule set on the standardization job (5 of 13)
Figure 5-37 Validate RDP for MDM rule set on the standardization job (6 of 13)
Figure 5-38 Validate RDP for MDM rule set on the standardization job (7 of 13)
Figure 5-39 Validate RDP for MDM rule set on the standardization job (8 of 13)
Figure 5-40 Validate RDP for MDM rule set on the standardization job (9 of 13)
Figure 5-41 Validate RDP for MDM rule set on the standardization job (10 of 13)
Figure 5-42 Validate RDP for MDM rule set on the standardization job (11 of 13)
Figure 5-43 Validate RDP for MDM rule set on the standardization job (12 of 13)
Figure 5-44 Validate efficacy of RDP for MDM rule sets (13 of 13)
Overriding patterns
We override the input patterns that were not handled in Figure 5-44 in the
USPREP rule set (see footnote 10), rerun the standardization to ensure that
the address columns are processed correctly, and then export the modified
USPREP rule set to a DSX file.
Footnote 10: The USPREP is modified, rather than ORGUSPREP, because that is the
rule set in the RDP for MDM jobs which would need to be replaced.
Figure 5-45 on page 190 through Figure 5-55 on page 197 show windows that
describe the input pattern override, rerun of the standardization job with the
modified USPREP rule set, and the modified USPREP export process as follows:
1. The Rule Management window in Figure 5-45 on page 190 shows the various
parts in a rule set, including Overrides. Click Overrides in the Rule
Management window to add, copy, edit, or delete overrides to rule sets.
2. We perform input pattern overrides as shown in Figure 5-46 on page 191 and
Figure 5-47 on page 192.
With input pattern override, you can specify token overrides that are based on
the input pattern. The input pattern overrides take precedence over the
pattern-action file. Input pattern overrides are specified for the entire input
pattern.
Note: The override code A (for ADDRESS) circled in the Enter Input
Pattern text field corresponds to the literal ZQADDRZQ explained
previously. The boxed values correspond to the characters overridden with
A in the Override Code column of the Current Pattern List.
3. The modified rule sets are then provisioned (see footnote 11), as shown in
Figure 5-48 on page 193.
4. A copy of the J02_ORGUSPREP_STAN job is created as
J12_USPREP_STAN using a stage named USPREP as shown in Figure 5-49
on page 193.
The USPREP stage is modified to refer to the canonical form data columns
STREETADDRESS and FREEFORMADDRESS as shown in Figure 5-50 on
page 194.
5. Figure 5-51 on page 194 shows the execution of this job.
6. Figure 5-52 on page 195 shows the results of processing by the modified
USPREP rule set: the AreaDomain_USPREP column is now populated with the
city name for the relevant rows, which indicates that the input pattern
overrides succeeded.
7. Figure 5-53 on page 196 through Figure 5-55 on page 197 show the
successful export of the modified USPREP rule set as a DSX file
(USPREPCHANGED.dsx).
We then proceed to import the modified USPREP rule set into the RDP for MDM
jobs as described in “Import modified RDP for MDM rule sets into RDP for MDM
project” on page 198.
Footnote 11: You must provision new, copied, or customized rule sets in the
Designer client before you can compile and run a job that uses them.
Figure 5-45 Override input pattern and rerun the standardization job (1 of 11)
Figure 5-46 Override input pattern and rerun the standardization job (2 of 11)
Figure 5-47 Override input pattern and rerun the standardization job (3 of 11)
Figure 5-48 Override input pattern and rerun the standardization job (4 of 11)
Figure 5-49 Override input pattern and rerun the standardization job (5 of 11)
Figure 5-50 Override input pattern and rerun the standardization job (6 of 11)
Figure 5-51 Override input pattern and rerun the standardization job (7 of 11)
Figure 5-52 Override input pattern and rerun the standardization job (8 of 11)
Figure 5-53 Override input pattern and rerun the standardization job (9 of 11)
Figure 5-54 Override input pattern and rerun the standardization job (10 of 11)
Figure 5-55 Override input pattern and rerun the standardization job (11 of 11)
Import modified RDP for MDM rule sets into RDP for MDM project
Figure 5-56 on page 199 through Figure 5-60 on page 201 show the windows
involved in importing the modified RDP for MDM rule sets (created in
“Overriding patterns” on page 188) into the RDP for MDM project using
WebSphere DataStage Designer.
The RDP_rule set_Change project contains all the RDP for MDM jobs into which
the modified rule sets are imported. The windows and steps are as follows:
1. Figure 5-56 on page 199 shows the contents of the RDP_rule set_Change
project that lists all the RDP for MDM jobs in it.
2. Figure 5-57 on page 200 through Figure 5-59 on page 200 show the import
of all the components of the modified rule set file (USPREPCHANGED.dsx)
into this project.
3. Finally, the modified rule set is provisioned as shown in Figure 5-60 on
page 201.
Note: After provisioning, the job that is using the modified rule sets must be
recompiled.
With the SIF created (this work proceeded in parallel and is described in 5.5.5,
“Create SIF” on page 202), we proceed to execute the RDP for MDM jobs as
described in 5.5.6, “Execute RDP for MDM jobs” on page 221.
Figure 5-56 Import modified RDP for MDM rule sets into RDP for MDM jobs (1 of 5)
Figure 5-57 Import modified RDP for MDM rule sets into RDP for MDM jobs (2 of 5)
Figure 5-58 Import modified RDP for MDM rule sets into RDP for MDM jobs (3 of 5)
Figure 5-59 Import modified RDP for MDM rule sets into RDP for MDM jobs (4 of 5)
Figure 5-60 Import modified RDP for MDM rule sets into RDP for MDM jobs (5 of 5)
5.5.5 Create SIF
As part of the custom work to create the SIF, you will typically create a mapping
document/spreadsheet that specifies the mapping of source-to-SIF target
columns, and any transformation of values especially where code tables are
concerned. For code table transformations, a cross reference table is
recommended (instead of hard coding the transformations) that contains the
mapping of source code values to target code values. A custom DataStage job,
or other program or utility, should read the source data (canonical form data
created in 5.5.3, “Create canonical form from the data sources” on page 151 in
our case) and perform a lookup of the cross reference tables that are created
manually (or generated by Information Analyzer) to pick up the corresponding
MDM code values for loading to the SIF.
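The cross-reference lookup described above can be sketched in a few lines of Python. This is a minimal illustration rather than the DataStage job we used: the function names, the sample rows, and the decision to pass null values through unchanged are assumptions. The three source-to-target pairs are the SRCSYSTEMID mappings that are keyed in later in this section for the CTS_LKP_SRCID reference table.

```python
# Cross-reference table: source SRCSYSTEMID value -> MDM code value.
# The pairs mirror the CTS_LKP_SRCID mappings described in this section.
SRCID_XREF = {"1": "1000000", "2": "1000001", "3": "1000002"}


def translate_code(value, xref, column):
    """Return the MDM code for a source value; fail loudly if it is unmapped."""
    if value is None:
        # Assumption: nulls pass through and the target column stays empty.
        return None
    key = str(value).strip()
    if key not in xref:
        raise KeyError(f"no cross-reference entry for {column}={key!r}")
    return xref[key]


def translate_rows(rows, xref, column):
    """Yield copies of the rows with a TRANSFORMVALUE field from the lookup."""
    for row in rows:
        out = dict(row)
        out["TRANSFORMVALUE"] = translate_code(row.get(column), xref, column)
        yield out


# Illustrative canonical form rows (hypothetical sample data).
rows = [
    {"CUSTOMERID": "8000637", "SRCSYSTEMID": "3"},
    {"CUSTOMERID": "200000281", "SRCSYSTEMID": "2"},
]
translated = list(translate_rows(rows, SRCID_XREF, "SRCSYSTEMID"))
```

Keeping the mapping in a table instead of in conditional logic means that a new source code value requires only a new row in the cross-reference table, which is the reason a cross-reference table is recommended over hard coding.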
In this section, we describe the following information:
- Creation of a reference table using Information Analyzer.
- Generation of the SIF from the data stored in the canonical form data table
  created in section “Create canonical form from the data sources” on
  page 151.
Creation of a reference table using Information Analyzer
As mentioned earlier, reference tables for mapping may be created manually or
using Information Analyzer.
Because we had used Information Analyzer in the DQA, we briefly describe the
process of creating one reference table to serve as a lookup for mapping values
in the canonical form table to code values stored in the MDM code tables.
These values must be retrieved to correctly populate the SIF.
Figure 5-61 on page 204 through Figure 5-66 on page 209 describe the windows
for creating a single reference table. It involves determining the code values in
the appropriate MDM code table and then creating the reference table in
Information Analyzer using these values as follows:
1. Figure 5-61 on page 204 shows the navigation pane in the MDM Server UI.
Select Administration Console > Navigation tree > Code Tables. In the
content pane, select a code table of interest (CdAdminSysTp) from the
drop-down list, and click GO.
2. Figure 5-62 on page 205 shows the list of valid values in this table. We added
code values for the Checking (1000000), Savings (1000001), and Loan
(1000002) systems through this GUI.
Note: Repeat this process for all the code tables of interest in the MDM data
repository for which reference tables need to be created.
3. After the code values have been verified, repeat the following steps for all
columns in the canonical form table that are required to be translated to the
target MDM code table values:
a. Select the Column Analysis tab (see footnote 12) and View Analysis
Summary for the CANONICAL_TBL table in Information Analyzer to view all
the columns in this table, as shown in Figure 5-63 on page 206. Select the
column (SRCSYSTEMID in this case) requiring a code value lookup and click
View Details.
b. Select the Frequency Distribution tab and key in the transformation of
values in the Transformation Value column as shown in Figure 5-64 on
page 207. It shows the Data Value of 3 (Loan system) in the
SRCSYSTEMID column to be transformed to 1000002; Data Value of 1
(Checking system) in the SRCSYSTEMID column to be transformed to
1000000, and Data Value of 2 (Savings system) in the SRCSYSTEMID
column to be transformed to 1000001.
c. Select Reference Tables > New Reference Table to create a reference
table with these mappings, as shown in Figure 5-65 on page 208.
d. Provide the Name (CTS_LKP_SRCID) of the reference table and set the
radio button Mapping (All Values), as shown in Figure 5-66 on page 209.
Click Save.
Note: Repeat this process for all the columns in the canonical form table
that have code values.
Footnote 12: We assume that Column Analysis has been performed on the canonical
form table, and that we know the columns in the canonical form table that must
have their values transformed to those in the corresponding code table in the
MDM data repository.
Figure 5-61 Creation of a reference table using Information Analyzer (1 of 6)
Figure 5-62 Creation of a reference table using Information Analyzer (2 of 6)
Figure 5-63 Creation of a reference table using Information Analyzer (3 of 6)
Figure 5-64 Creation of a reference table using Information Analyzer (4 of 6)
Figure 5-65 Create SIF (5 of 6)
Figure 5-66 Create SIF (6 of 6)
Table 5-2 summarizes the code table value mappings between the source and
the target MDM data repository for our scenario.
Table 5-2 Code table mapping between the canonical form columns and the MDM
Source column   Source domain values                          MDM Server domain values            MDM Server column
COUNTRY         US, (null)                                    185 and other country codes         COUNTRY_TP_CD in CDCOUNTRYTP
CUSTOMERPERF    LOW, MID, HIGH, GOLD, SILVER, BRONZE, (null)  1, 2, 3                             CLIENT_IMP_TP_CD in CDCLIENTIMPTP
CUSTOMERSTATUS  A, B, C, D, (null)                            1, 2, 3, 4                          CLIENT_ST_TP_CD in CDCLIENTSTTP
GENDER          M, F, 0, 1, (null)                            M, F                                Not validated in MDM
MARITALSTATUS   Married, Single, Divorced, (null)             1, 2, 3                             MARITAL_ST_TP_CD in CDMARITALSTTP
NATIONALITY     US, (null)                                    185 and other country codes         COUNTRY_TP_CD in CDCOUNTRYTP
PREF_LANGUAGE   English, (null)                               100 and other language codes        LANG_TP_CD in CDLANGTP
SALUTATION      Mr., Mrs., (spaces), (null)                   14, 15, and other salutation codes  PREFIX_NAME_TP_CD in CDPREFIXNAMETP
SIF generation from canonical form
We used FastTrack Version 8.0.1 to define the mapping between the columns in
the canonical form data table and the SIF, and generated a DataStage job to load
the SIF tables. A subsequent DataStage job extracted the data from the SIF
tables and created the SIF file for processing by the RDP for MDM jobs.
Because the FastTrack process was similar to the one described in the creation
of the canonical form data table, it is not repeated here.
In the following process, we describe only those steps that differ from the ones
performed earlier: the creation of the SIF tables (Figure 5-67 on page 211
shows the job jpSchemaPrep, which reads the DDL for the 22 SIF tables (see
footnote 13) and creates the tables in the FBANKCOT database), the lookup of
the reference tables created earlier, and the execution of the job that creates
the SIF file from the SIF tables.
Figure 5-67 Create SIF tables
Footnote 13: These tables map one-to-one with the RT/ST combinations described
in Appendix B, “Standard Interface File details” on page 295. The DDL for these
tables can be downloaded from the IBM Redbooks website
ftp://www.redbooks.ibm.com/redbooks/SG247704/
Table 5-3 shows the mapping between the columns in the canonical form data
table to the corresponding SIF columns.
Table 5-3 Canonical form to SIF mapping
Each entry reads: Canonical form column -> SIF column, grouped by RT/ST.

PA
  CANONICAL_TBL.FREEFORMADDRESS,CANONICAL_TBL.STREETADDRESS -> ADDRESS.ADDR_LINE_ONE
  CANONICAL_TBL.CUSTOMERID -> ADDRESS.ADMIN_CLIENT_ID
  CTS_LKP_SRCID.TRANSFORMVALUE -> ADDRESS.ADMIN_SYS_TP_CD
  CANONICAL_TBL.CITY -> ADDRESS.CITY_NAME
  CTS_LKP_COUNTRY.TRANSFORMVALUE -> ADDRESS.COUNTRY_TP_CD
  CANONICAL_TBL.ZIPCODE -> ADDRESS.POSTAL_CODE

PP
  CANONICAL_TBL.CUSTOMERID -> CONTACT.ADMIN_CLIENT_ID
  CTS_LKP_SRCSYSTEM.TRANSFORMVALUE -> CONTACT.ADMIN_SYS_TP_CD
  CTS_LKP_AGEVERDOC.TRANSFORMVALUE -> CONTACT.AGE_VER_DOC_TP_CD
  CANONICAL_TBL.DOB -> CONTACT.BIRTH_DT
  CTS_LKP_NATIONALITY.TRANSFORMVALUE -> CONTACT.CITIZENSHIP_TP_CD
  CTS_LKP_CUSTPERF.TRANSFORMVALUE -> CONTACT.CLIENT_IMP_TP_CD
  CTS_LKP_CUSTSTATUS.TRANSFORMVALUE -> CONTACT.CLIENT_ST_TP_CD
  CANONICAL_TBL.DOD -> CONTACT.DECEASED_DT
  CTS_LKP_GENDER.TRANSFORMVALUE -> CONTACT.GENDER_TP_CODE
  CTS_LKP_MARITALST.TRANSFORMVALUE -> CONTACT.MARITAL_ST_TP_CD
  CTS_LKP_PREFLANG.TRANSFORMVALUE -> CONTACT.PREF_LANG_TP_CD

PC
  CANONICAL_TBL.CUSTOMERID -> CONTACTMETHOD.ADMIN_CLIENT_ID
  CTS_LKP_SRCID.TRANSFORMVALUE -> CONTACTMETHOD.ADMIN_SYS_TP_CD
  CANONICAL_TBL.CELLNB (when it is NOT NULL) -> CONTACTMETHOD.REF_NUM

PC
  CANONICAL_TBL.CUSTOMERID -> CONTACTMETHOD.ADMIN_CLIENT_ID
  CTS_LKP_SRCID.TRANSFORMVALUE -> CONTACTMETHOD.ADMIN_SYS_TP_CD
  CANONICAL_TBL.EMAIL (when it is NOT NULL) -> CONTACTMETHOD.REF_NUM

PC
  CANONICAL_TBL.CUSTOMERID -> CONTACTMETHOD.ADMIN_CLIENT_ID
  CTS_LKP_SRCID.TRANSFORMVALUE -> CONTACTMETHOD.ADMIN_SYS_TP_CD
  CANONICAL_TBL.PHONENB (when it is NOT NULL) -> CONTACTMETHOD.REF_NUM

CH
  CANONICAL_TBL.ACCOUNTID -> CONTRACT.ADMIN_CONTRACT_ID
  CTS_LKP_SRCID.TRANSFORMVALUE -> CONTRACT.ADMIN_SYS_TP_CD

CC
  CANONICAL_TBL.ACCOUNTID -> CONTRACTCOMPONENT.ADMIN_CONTRACT_ID
  CTS_LKP_SRCID.TRANSFORMVALUE -> CONTRACTCOMPONENT.ADMIN_SYS_TP_CD
  CTS_LKP_PRODTP.TRANSFORMVALUE -> CONTRACTCOMPONENT.PROD_TP_CD

CR
  CANONICAL_TBL.CUSTOMERID -> CONTRACTROLE.ADMIN_CLIENT_ID
  CTS_LKP_SRCID.TRANSFORMVALUE -> CONTRACTROLE.ADMIN_CLIENT_SYS_TP_CD
  CANONICAL_TBL.ACCOUNTID -> CONTRACTROLE.ADMIN_CONTRACT_ID
  CTS_LKP_SRCID.TRANSFORMVALUE -> CONTRACTROLE.ADMIN_SYS_TP_CD
  CTS_LKP_PRODTP.TRANSFORMVALUE -> CONTRACTROLE.PROD_TP_CD

HN
  REGIONS_STAGING.REGION_ID -> HIERARCHY_NODE.ADMIN_CLIENT_ID
  REGIONS_STAGING.REGION_DESCRIPTION -> HIERARCHY_NODE.DESCRIPTION

HN
  CANONICAL_TBL.CUSTOMERID when SRCSYSTEMID = '2' -> HIERARCHY_NODE.ADMIN_CLIENT_ID

HN (when PARENT_REGION_ID IS NOT NULL)
  REGIONS_STAGING.REGION_ID -> HIERARCHY_REL.ADMIN_CLIENT_ID_CHILD
  REGIONS_STAGING.PARENT_REGION_ID -> HIERARCHY_REL.ADMIN_CLIENT_ID_PARENT

HN
  Lookup Client Id Parent.REGION_ID -> HIERARCHY_REL.ADMIN_CLIENT_ID_PARENT
  CANONICAL_TBL.CUSTOMERID -> HIERARCHY_REL.ADMIN_CLIENT_ID_CHILD

HN
  REGIONS_STAGING.REGION_DESCRIPTION when PARENT_REGION_ID IS NULL -> HIERARCHY.DESCRIPTION

HN (when PARENT_REGION_ID is null)
  REGIONS_STAGING.REGION_ID -> HIERARCHY_UP.ADMIN_CLIENT_ID
  REGIONS_STAGING.REGION_DESCRIPTION -> HIERARCHY_UP.DESCRIPTION

PI
  CANONICAL_TBL.CUSTOMERID -> IDENTIFIER.ADMIN_CLIENT_ID
  CTS_LKP_SRCID.TRANSFORMVALUE -> IDENTIFIER.ADMIN_SYS_TP_CD
  CANONICAL_TBL.DRIVERLICNB (when it is NOT NULL) -> IDENTIFIER.REF_NUM

PI
  CANONICAL_TBL.CUSTOMERID -> IDENTIFIER.ADMIN_CLIENT_ID
  CTS_LKP_SRCID.TRANSFORMVALUE -> IDENTIFIER.ADMIN_SYS_TP_CD
  CANONICAL_TBL.PASSPORTNB (when it is NOT NULL) -> IDENTIFIER.REF_NUM

PI
  CANONICAL_TBL.CUSTOMERID -> IDENTIFIER.ADMIN_CLIENT_ID
  CTS_LKP_SRCID.TRANSFORMVALUE -> IDENTIFIER.ADMIN_SYS_TP_CD
  CANONICAL_TBL.SSN (when it is NOT NULL) -> IDENTIFIER.REF_NUM

PH
  CANONICAL_TBL.CUSTOMERID -> PERSONNAME.ADMIN_CLIENT_ID
  CTS_LKP_SRCID.TRANSFORMVALUE -> PERSONNAME.ADMIN_SYS_TP_CD
  CANONICAL_TBL.FREEFORMNAME -> PERSONNAME.FREE_FORM_NAME
  CANONICAL_TBL.FIRSTNAME,CANONICAL_TBL.FREEFORMNAME -> PERSONNAME.GIVEN_NAME_ONE
  CANONICAL_TBL.FREEFORMNAME,CANONICAL_TBL.LASTNAME -> PERSONNAME.LAST_NAME
  CTS_LKP_SALUTATION.TRANSFORMVALUE -> PERSONNAME.PREFIX_NAME_TP_CD

CL
  CANONICAL_TBL.CUSTOMERID -> ROLELOCATION.ADMIN_CLIENT_ID
  CTS_LKP_SRCID.TRANSFORMVALUE -> ROLELOCATION.ADMIN_SYS_TP_CD
  CANONICAL_TBL.ACCOUNTID -> ROLELOCATION.ADMIN_CONTRACT_ID
  CTS_LKP_SRCID.TRANSFORMVALUE -> ROLELOCATION.ADMIN_CLIENT_SYS_TP_CD
  CTS_LKP_PRODTP.TRANSFORMVALUE -> ROLELOCATION.PROD_TP_CD
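Several mappings in Table 5-3 are conditional: a PC (contact method) entry is produced for CELLNB, EMAIL, or PHONENB only when that column is not null. The following Python sketch illustrates that rule only. The function name is hypothetical, the output is a dictionary keyed by the SIF column names from the table rather than the actual pipe-delimited record layout (which is defined in Appendix B), and treating only None as null is an assumption.

```python
# Canonical form columns that each feed one PC (contact method) entry.
CONTACT_COLUMNS = ("CELLNB", "EMAIL", "PHONENB")


def contact_method_rows(canonical_row, admin_sys_tp_cd):
    """Build one CONTACTMETHOD entry per contact column that is not null.

    admin_sys_tp_cd is the already-translated source system code
    (CTS_LKP_SRCID.TRANSFORMVALUE in Table 5-3).
    """
    rows = []
    for column in CONTACT_COLUMNS:
        value = canonical_row.get(column)
        if value is None:
            # Assumption: a null column produces no PC entry at all.
            continue
        rows.append({
            "ADMIN_CLIENT_ID": canonical_row["CUSTOMERID"],
            "ADMIN_SYS_TP_CD": admin_sys_tp_cd,
            "REF_NUM": value,
        })
    return rows


# Hypothetical canonical row with only an e-mail address populated.
sample = {"CUSTOMERID": "70004432", "CELLNB": None,
          "EMAIL": "user@example.com", "PHONENB": None}
pc_rows = contact_method_rows(sample, "1000000")
```

The same pattern applies to the PI entries, which are produced for DRIVERLICNB, PASSPORTNB, and SSN only when the respective column is not null.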
Figure 5-68 on page 214 through Figure 5-73 on page 219 describe the windows
that perform the lookup of the reference tables for transforming the code table
values.
Figure 5-74 on page 220 shows the execution of the job jpGenerateOutputSIF
that extracts the contents of the SIF tables and generates the SIF file with the
pipe ( | ) delimiters between the columns.
The steps shown by the figures are as follows:
1. Figure 5-68 shows the FastTrack Mapping Editor for a specific mapping
specification (CANONICAL_ROLELOCATION_SIF). Click Lookup
Definitions in the task bar on the left to list the lookup definitions for the
columns that must be translated into code values.
Figure 5-68 Create SIF (1 of 7)
2. Figure 5-69 shows the names of the lookup definitions (such as
CTS_LKP_SRCID), the corresponding lookup table (CTS_LKP_SRCID) and
the source table (CANONICAL_TBL). Click New Lookup Definition to add
another lookup definition.
Figure 5-69 Create SIF (2 of 7)
3. Provide the name (CTS_LKP_SRC) of the lookup definition and click OK, as
shown in Figure 5-70.
Figure 5-70 Create SIF (3 of 7)
4. After the new lookup definition has been created and saved, the lookup table
needs to be defined.
Note: We assume that the reference tables created using Information
Analyzer (or manually) have been imported into the InfoSphere Information
Server metadata repository.
In Figure 5-71, the reference table IAUSER.CTS_LKP_SRCID from the
database IADB on host VIRGO.ITSOSJ.SANJOSE.IBM.COM is to be used as
the lookup table. Drag the table from the database metadata tab to the
Lookup Table field.
Figure 5-71 Create SIF (4 of 7)
5. Figure 5-72 shows the next step which defines the source table for the lookup
definition. In our case this is the canonical form table
STAGING.CANONICAL_TBL in the database FBANKCOT on the host
ORION.ITSOSJ.SANJOSE.IBM. Drag the table from the database metadata
tab to the Source Table field.
Figure 5-72 Create SIF (5 of 7)
6. Figure 5-73 shows the next step which is to define a join key for the lookup
definition. Make note of the key columns in both the lookup and the source
tables to be joined, and then click Add Key to open the Add Join Entry
window, as shown in Figure 5-73. Select the key columns in the Lookup Table
and Source Table from the drop-down list.
After completing the definition of the mapping specification, save it, and
then generate and verify the DataStage job (see footnote 14). This process is
similar to the process described in “Define the sources to canonical form
table target mapping” on page 152, and is not repeated here. Running the
generated job, which is also not shown here, loads the SIF tables from the
canonical form data table.
Figure 5-73 Create SIF (6 of 7)
Footnote 14: The job name of a FastTrack generated DataStage job generally
defaults to the mapping name.
7. After all the SIF tables have been loaded, the DataStage job
jpGenerateOutputSIF (Figure 5-74) is run to create the SIF file from the SIF
tables. The DSX file of the jpGenerateOutputSIF DataStage job can be
downloaded from the IBM Redbooks website at:
ftp://www.redbooks.ibm.com/redbooks/SG247704/
Figure 5-74 Create SIF (7 of 7)
Example 5-4 shows the partial contents of the SIF file generated by this process,
corresponding to the canonical form data.
Example 5-4 Partial contents of SIF file
C|R|1000002|30000029|A|1000002|8000637||10|1||||||||||||0|0|0|0|0|0|0|0|0|0|
C|R|1000002|30000022|A|1000002|8000712||10|1||||||||||||0|0|0|0|0|0|0|0|0|0|
C|R|1000002|30000024|A|1000002|8000917||10|1||||||||||||0|0|0|0|0|0|0|0|0|0|
C|R|1000002|30000031|A|1000002|8000467||10|1||||||||||||0|0|0|0|0|0|0|0|0|0|
C|R|1000002|30000010|A|1000002|8000263||10|1||||||||||||0|0|0|0|0|0|0|0|0|0|
C|R|1000001|20000028|A|1000001|200000281||7|1||||||||||||0|0|0|0|0|0|0|0|0|0|
C|R|1000001|20000030|A|1000001|200000301||7|1||||||||||||0|0|0|0|0|0|0|0|0|0|
C|R|1000001|20000005|A|1000001|200000051||7|1||||||||||||0|0|0|0|0|0|0|0|0|0|
C|R|1000002|30000023|A|1000002|8000259||10|1||||||||||||0|0|0|0|0|0|0|0|0|0|
.....................
P|I|1000001|200000161|A|3||S99887766|||||||||||0|0|0|0|0|0|0|0|0|0|
P|I|1000001|200000291|A|3||S347936486|||||||||||0|0|0|0|0|0|0|0|0|0|
P|I|1000001|200000251|A|3||D12456745|||||||||||0|0|0|0|0|0|0|0|0|0|
P|I|1000000|70004432|A|3||S67856745|||||||||||0|0|0|0|0|0|0|0|0|0|
P|I|1000000|70006363|A|3||S12314517|||||||||||0|0|0|0|0|0|0|0|0|0|
P|I|1000000|70004262|A|3||S76193782|||||||||||0|0|0|0|0|0|0|0|0|0|
P|I|1000001|200000001|A|3||S45118674|||||||||||0|0|0|0|0|0|0|0|0|0|
P|I|1000001|200000091|A|3||S12314517|||||||||||0|0|0|0|0|0|0|0|0|0|
P|I|1000000|70007287|A|3||S12312399|||||||||||0|0|0|0|0|0|0|0|0|0|
P|I|1000001|200000131|A|3||S12312456|||||||||||0|0|0|0|0|0|0|0|0|0|
......................
P|P|1000002|8000885|A|N|||||||3||||||||||||||||||||1|||||F|2001-03-17 00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
P|P|1000002|8000362|A|N|||||||2||||||||||||||||||||1|||||F|1989-08-23 00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
P|P|1000002|8000263|A|N|||||||2||||||||||||||||||||1|||||F|1986-06-02 00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
P|P|1000001|200000301|A|N|||||||3|||||||||||||||||||||||||F|1987-07-06 00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
P|P|1000000|70005799|A|N|||100|||||3|||||||||||||||||||||185|||F|1937-09-01 00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
P|P|1000000|70002070|A|N|||100|||||4|||||||||||||||||||||185|||M|1923-04-02 00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
P|P|1000001|200000051|A|N|||||||1|||||||||||||||||||||||||M|2011-08-02 00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
......................
P|O|1000001|4|A|N||||||||||||||||||||||7|||||||||||||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
P|O|1000001|8|A|N||||||||||||||||||||||7|||||||||||||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
C|C|1000002|30000035|A|10|2||||||||||||0|0|0|0|0|0|0|0|0|0|
C|C|1000002|30000028|A|10|2||||||||||||0|0|0|0|0|0|0|0|0|0|
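Before handing a SIF file to the RDP for MDM jobs, it can be useful to check how many records of each RT/ST combination it contains. The following Python sketch is one possible way to do that, not part of the RDP assets. It relies only on what Example 5-4 shows: fields are separated by the pipe character, and the first two fields form the RT/ST combination. The function names are hypothetical, and skipping lines that contain no pipe character (such as the elision rows in the example) is an assumption.

```python
from collections import Counter


def rt_st(line):
    """Return the RT/ST combination (first two pipe-delimited fields)."""
    fields = line.rstrip("\n").split("|")
    return fields[0] + fields[1]


def count_by_rt_st(lines):
    """Count SIF records per RT/ST, ignoring lines without a delimiter."""
    counts = Counter()
    for line in lines:
        if "|" not in line:
            continue  # blank lines or elision markers, not SIF records
        counts[rt_st(line)] += 1
    return counts


# Records copied from Example 5-4.
sample = [
    "C|R|1000002|30000029|A|1000002|8000637||10|1||||||||||||0|0|0|0|0|0|0|0|0|0|",
    "P|I|1000001|200000161|A|3||S99887766|||||||||||0|0|0|0|0|0|0|0|0|0|",
    "C|C|1000002|30000035|A|10|2||||||||||||0|0|0|0|0|0|0|0|0|0|",
    "C|C|1000002|30000028|A|10|2||||||||||||0|0|0|0|0|0|0|0|0|0|",
]
counts = count_by_rt_st(sample)
```

Comparing these counts with the row counts of the corresponding SIF tables is a quick consistency check on the jpGenerateOutputSIF output.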
5.5.6 Execute RDP for MDM jobs
After the SIF was created, we proceeded to drop the triggers and referential
constraints on the MDM data repository, configured the parameters for the RDP
for MDM jobs, and then launched the IL_000_INITIAL_LOAD job.
Drop triggers and RI constraints
All the triggers and referential integrity (RI) constraints in the MDM data
repository were dropped as required. The following command was used to drop
the triggers and deactivate the referential integrity constraints:
db2 -tsvf Filename
In the command, Filename contains the SQL to perform this operation.
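When this step is automated, the command can be wrapped in a small script. The following Python sketch is illustrative only: the function names and the file name in the usage note are hypothetical, and it assumes that the db2 command line processor is on the PATH and that a database connection is available to it (for example, because the SQL file itself contains the CONNECT statement). In the option string -tsvf, t selects the semicolon as the statement terminator, s stops execution on an error, v echoes each command, and f reads the commands from the named file.

```python
import subprocess


def db2_script_command(sql_file):
    """Build the argument list for: db2 -tsvf <sql_file>."""
    return ["db2", "-tsvf", sql_file]


def run_db2_script(sql_file):
    """Run the script with the DB2 command line processor.

    Returns the exit code of the db2 process. Assumes db2 is on the PATH.
    """
    completed = subprocess.run(db2_script_command(sql_file), check=False)
    return completed.returncode
```

For example, run_db2_script("drop_triggers_and_ri.sql") would run a (hypothetically named) script extracted from Drop_Scripts.zip; because of the s option, processing stops at the first failing statement.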
Notes:
- Scripts for dropping triggers and RI constraints are in the RDP assets TAR
  file. The relative path of the scripts is as follows:
  <MDMRDPRuntimeAssets_Install_Home>/DB/<DB_TYPE>/
  The Drop_Scripts.zip file provides scripts for dropping all triggers and
  referential integrity constraints.
- Scripts for resuming triggers and RI constraints are in the MDM installation
  folder:
  <MDM_Install_Home>/database/MDM/<DB_TYPE>/Standard/ddl/
  After the initial load completes, use these scripts to resume the triggers
  and constraints.
Provide configuration parameters
Table 5-4 on page 223 shows our choices for the MUST MODIFY parameters.
The main parameters of interest that were enabled or modified are as follows:
- Enabled standardization and matching in the RDP for MDM jobs:
  – QS_PERFORM_ORG_MATCH set to 1
  – QS_PERFORM_PERSON_MATCH set to 1
  – QS_STAN_ADDRESS set to 1
  – QS_STAN_ORG_NAME set to 1
  – QS_STAN_PERSON_NAME set to 1
- Modified the national ID from the default as follows:
  – QS_MATCH_ORG_NATID set to I2 (Corporate Tax Identification; see footnote 15)
  – QS_MATCH_PERSON_NATID set to I1 (Social Security Number; see footnote 16)
- Modified the database connection details, file system directories, and other
  parameters for our particular environment.
Note: We make these changes in the configuration parameter file because we
used the DL_000_DELTA_LOAD job rather than the
DL_000_AutoStart_PS_DELTA_LOAD job.
Footnote 15: The default was I8, which is the passport number. Having it as a
national identification for an organization does not make sense. We therefore
chose I2.
Footnote 16: The default was C2, which is the business phone number. Having it
as a national identification for a person does not make sense. We therefore
chose I1.
Table 5-4 RDP configuration parameters MUST MODIFY list
Columns: Frequency, Category, Sub category, Parameter, Scenario value
Each entry reads: Parameter = Scenario value

Frequency, category, and sub category labels for the first block of rows, in
the order in which they appear in the table: One time, SETUP, Connection,
DS PARAMETER, Miscellaneous, File location, Runtime, ADVANCED, Runtime,
DS PARAMETER

  DB_CONNECT_STRING = MDM_DB
  DB_INSTANCE = db2inst1
  DB_PASSWORD = encrypted value
  DB_SCHEMA = DB2INST1. (a)
  DB_USERID = db2inst1
  $APT_DB2INSTANCE_HOME = /home/dsadm/remote_db2config
  $APT_IMPORT_PATTERN_USES_FILESET_MOUNTED = True
  $APT_STRING_PADCHAR = (blank)
  DS_PARALLEL_APT_CONFIG_FILE = /opt/IBM/InformationServer/Server/Configurations/MDM_Default.apt
  DS_SEQUENTIAL_APT_CONFIG_FILE = /opt/IBM/InformationServer/Server/Configurations/MDM_1X1.apt
  MDM_DEPLOYMENT_NAME = WebSphere Customer Center
  DS_LANGUAGE_TYPE_CODE = 100
  DS_SUPPORT_FILE_DIR = /data/RDP/FREQ/
  FS_DATA_SET_HEADER_DIR = /data/RDP/DATA/
  FS_ERROR_DIR = /data/RDP/ERROR/
  FS_LOG_DIR = /data/RDP/LOG/
  FS_PARAM_SET_DIR = ./ParameterSets/
  FS_REJECT_DIR = /data/RDP/REJECT/
  FS_SK_FILE_DIR = /data/RDP/SK/
  FS_TMP_DIR = /data/RDP/TMP/
  BATCH_ID (auto assigned) = canonical_HIER
  DS_PROCESSING_DATE (auto assigned) = 2008-05-30 09:36:10
  FS_HIERARCHY_SIF_FILE_PATTERN = /data/RDP/SIF_IN/canonical_1/*.hsif
  FS_SIF_FILE_PATTERN = /data/RDP/SIF_IN/canonical_1/*.sif
  $APT_IMPEXP_ALLOW_ZERO_LENGTH_FIXED_NULL = True
  $APT_IMPORT_PATTERN_USES_FILESET = True
  $APT_IMPORT_REJECT_STRING_FIELD_OVERRUNS = True
  $APT_SORT_INSERTION_OPTIMIZATION = True

Recurring / SETUP / QualityStage
  QS_MATCH_ORG_NATID (b) = I2
  QS_MATCH_PERSON_NATID (c) = I1
  QS_PERFORM_ORG_MATCH (d) = 1
  QS_PERFORM_PERSON_MATCH (e) = 1
  QS_STAN_ADDRESS (f) = 1
  QS_STAN_ORG_NAME (g) = 1
  QS_STAN_PERSON_NAME (h) = 1

a. Period is required
b. Default is the passport number; it should be I2, which is the Corporate Tax Identification
c. The default setting of C2 equates to Business Phone Number, which is not a reasonable national ID document. We therefore changed it to I1, which is SSN.
d. We chose to perform Org match.
e. We chose to perform Person match.
f. We chose to perform standardization on address.
g. We chose to perform standardization on OrgName.
h. We chose to perform standardization on PersonName.
Table 5-5 shows our choices for the CONSIDER MODIFYING parameters
according to the guidelines in Table A-6 on page 290.
Table 5-5 RDP configuration parameters in the CONSIDER MODIFYING list

Frequency: One time. Category: SETUP. Sub category: Miscellaneous.
  Parameter                                     Recommendation
  DS_SOURCE_DATE_FORMAT (a)                     %yyyy-%mm-%dd %hh:%nn:%ss.6
  DS_USE_NATIVE_KEY                             1

Frequency: One time. Category: ADVANCED. Sub category: SURROGATE.
  Parameter                                     Recommendation
  SK_MID_ADDRESS_ID_NEXT_VAL                    1
  SK_MID_ALERT_ID_NEXT_VAL                      1
  SK_MID_CONT_EQUIV_ID_NEXT_VAL                 1
  SK_MID_CONT_ID_NEXT_VAL                       1
  SK_MID_CONT_REL_ID_NEXT_VAL                   1
  SK_MID_CONTACT_METHOD_ID_NEXT_VAL             1
  SK_MID_CONTR_COMP_VAL_ID_NEXT_VAL             1
  SK_MID_CONTR_COMPONENT_ID_NEXT_VAL            1
  SK_MID_CONTRACT_ID_NEXT_VAL                   1
  SK_MID_CONTRACT_ROLE_ID_NEXT_VAL              1
  SK_MID_HIER_ULT_PAR_ID_NEXT_VAL               1
  SK_MID_HIERARCHY_ID_NEXT_VAL                  1
  SK_MID_HIERARCHY_NODE_ID_NEXT_VAL             1
  SK_MID_HIERARCHY_REL_ID_NEXT_VAL              1
  SK_MID_IDENTIFIER_ID_NEXT_VAL                 1
  SK_MID_LOB_REL_ID_NEXT_VAL                    1
  SK_MID_LOCATION_GROUP_ID_NEXT_VAL             1
  SK_MID_MISCVALUE_ID_NEXT_VAL                  1
  SK_MID_NATIVE_KEY_ID_NEXT_VAL                 1
  SK_MID_ORG_NAME_ID_NEXT_VAL                   1
  SK_MID_PERSON_NAME_ID_NEXT_VAL                1
  SK_MID_PERSON_SEARCH_ID_NEXT_VAL              1
  SK_MID_PPREF_ID_NEXT_VAL                      1
  SK_MID_ROLE_LOCATION_ID_NEXT_VAL              1
  SK_MID_SUSPECT_ID_NEXT_VAL                    1
  SK_PREFIX_CONT_ID_NEXT_VAL                    1
  SK_PREFIX_CONTRACT_ID_NEXT_VAL                1
  SK_PREFIX_HIERARCHY_ID_NEXT_VAL               1

Frequency: Recurring. Category: SETUP. Sub category: QualityStage.
  Parameter                                     Recommendation
  QS_ALLOW_LOB_MATCH                            0
  QS_EXCLUDE_FIELDS_FROM_MATCH_ORGANIZATION     (blank)
  QS_EXCLUDE_FIELDS_FROM_MATCH_PERSON           (blank)
  QS_MATCH_ORG_1 (b)                            I2
  QS_MATCH_ORG_2                                (blank)
  QS_MATCH_ORG_3                                (blank)
  QS_MATCH_ORG_4                                (blank)
  QS_MATCH_PERSON_1                             C1
  QS_MATCH_PERSON_2                             C3
  QS_MATCH_PERSON_3                             C5
  QS_MATCH_PERSON_4                             C7
  QS_PHONETIC_CODING_TYPE_ADDRESS               QSNYSIIS
  QS_PHONETIC_CODING_TYPE_ORGANIZATION          QSNYSIIS
  QS_PHONETIC_CODING_TYPE_PERSON                QSNYSIIS
  QS_REJECT_ADDRESS_IF_NOT_STANDARDIZED         0
  QS_REJECT_ORG_NAME_IF_NOT_STANDARDIZED        0
  QS_REJECT_PERSON_NAME_IF_NOT_STANDARDIZED     0

Frequency: Recurring. Category: Error Handling. Sub categories: DROP,
Notification, Abort handling.
  Parameter                                     Recommendation
  DS_DETECTED_DUPLICATES_ACTION                 E
  DS_PARTY_DROP_SEVERITY_LEVEL (c)              0
  DS_EMAIL_ERROR_CHECK_DISTRIBUTION             [email protected]
  DS_EMAIL_ERROR_CHECK_REPORT                   1
  DS_DROP_MAX_ITERATIONS                        10
  DS_FAILED_COLUMNIZATION_ACTION (d)            C
  DS_FAILED_RECORDIZATION_ACTION (e)            C
  DS_SIF_ERROR_THRESHOLD                        120
  DS_SIF_INDIVIDUAL_ERROR_THRESHOLD             50
  DS_SIF_INDIVIDUAL_ERROR_THRESHOLD_KOUNT       12
a. We chose the time stamp format to capture finer-grained date and time information.
b. We did not have organizations in our input data. However, if we had, the default value of C1 (which is SSN) would not be appropriate.
c. Defines the severity level below which parties are dropped; we chose the least sensitive setting.
d. Defines what you want to do with a parsing failure; we chose C(ontinue).
e. Defines what you want to do with a parsing failure; we chose C(ontinue).
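To make the time stamp layout from footnote a concrete, the following Java sketch parses a value laid out like the DS_SOURCE_DATE_FORMAT setting. It assumes that %nn denotes minutes and that .6 denotes six fractional-second digits, and it maps that layout to a java.time pattern. The class name and the sample value are hypothetical and are not part of RDP.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of RDP: parses source time stamps laid out
// like the DS_SOURCE_DATE_FORMAT value chosen in Table 5-5.
class SourceTimestampFormat {
    // Assumed java.time equivalent of %yyyy-%mm-%dd %hh:%nn:%ss.6
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss.SSSSSS");

    static LocalDateTime parse(String value) {
        return LocalDateTime.parse(value, FORMAT);
    }

    public static void main(String[] args) {
        // Made-up sample value with microsecond precision
        LocalDateTime ts = parse("2011-04-15 23:59:58.123456");
        System.out.println(ts.getNano() / 1000); // microseconds within the second
    }
}
```

A date-only format would drop the time portion, which is the granularity that the time stamp format preserves.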
Launch RDP for MDM jobs
We launched DL_000_DELTA_LOAD using Director as shown in Figure 5-75.
Figure 5-75 Launch RDP for MDM jobs (1 of 2)
The Job Run Options are shown in Figure 5-76 on page 228.
We checked the successful completion of the job in Director Client.
Note: The RDP job names have changed since the initial release. If you used
the initial version of the RDP jobs, the job names referenced here differ
from the ones you are familiar with.
We proceed to verify the successful load of the MDM data repository as
described in 5.5.7, “Verify successful load” on page 228.
Figure 5-76 Launch RDP for MDM jobs (2 of 2)
5.5.7 Verify successful load
After loading the MDM data repository using RDP for MDM jobs, you must verify
that the load was successful as follows:
1. Re-establish referential integrity constraints and triggers.
2. Use the MDM Server UI to query select rows.
Re-establish referential integrity constraints and triggers
The referential integrity constraints and triggers in the MDM data repository that
were dropped prior to the launch of the RDP for MDM jobs must be
re-established before proceeding with the query of selected rows.
Note: If the RDP for MDM jobs have successfully validated all the rows, then
no errors are highlighted by this process. If RI constraints are found to be
violated, then an error (SQLSTATE 23512) is raised and the table is put into a
check pending state. You then have to resolve these errors before proceeding
further.
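If you check the outcome from a Java program rather than from the command line, the SQLSTATE mentioned in the note can be detected on the JDBC exception. The following is a minimal sketch, assuming that the failure surfaces as a java.sql.SQLException (possibly chained). The class and method names are hypothetical.

```java
import java.sql.SQLException;

// Hypothetical helper: recognizes the referential integrity violation
// reported with SQLSTATE 23512 when constraints are re-established.
class ConstraintCheck {
    static final String RI_VIOLATION = "23512";

    // Walks the chain of SQLExceptions and reports whether any of them
    // carries the RI violation SQLSTATE.
    static boolean isRiViolation(SQLException e) {
        for (SQLException cur = e; cur != null; cur = cur.getNextException()) {
            if (RI_VIOLATION.equals(cur.getSQLState())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulated exception; a real one would come from executing the DDL.
        SQLException simulated = new SQLException("constraint violated", "23512");
        System.out.println(isRiViolation(simulated));
    }
}
```

When the check returns true, the affected rows must be corrected before you continue, as the note describes.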
Resume dropped triggers and referential integrity constraints. Use the following
command to re-create the triggers and referential integrity constraints that were
dropped previously:
db2 [email protected] -svf Filename
In the command, Filename contains the SQL to create all the triggers and
referential integrity constraints.
Note: The scripts that re-create the triggers and referential integrity
constraints are located in the MDM installation folder:
<MDM_Install_Home>/database/MDM/<DB_TYPE>/Standard/ddl/
Run these scripts after the initial load completes to restore the triggers
and constraints.
Use the MDM Server UI to query select rows
After the RI constraints and triggers have been successfully re-instated, you can
use the MDM Server UI to query select rows to verify that they were successfully
loaded.
Figure 5-77 on page 230 through Figure 5-81 on page 234 show the search and
successful retrieval of information relating to a customer whose given name is
Torben:
1. Use the web address for the MDM Server UI, and navigate to Party
Maintenance Console → Navigation → Search → Party Search in the
Navigation pane, as shown in Figure 5-77 on page 230. Do the following
steps:
a. In the Family Name field, type a question mark (?), which is a wildcard
character.
b. In the Given Name 1 field, type Torben.
c. Click SUBMIT.
Figure 5-77 Verify successful load (1 of 5)
2. Figure 5-78 shows the results of the search and identifies only a single
customer with a Family Name of Andersom as satisfying the conditions of the
search. Select the customer by clicking the Party Id link as shown.
Figure 5-78 Verify successful load (2 of 5)
3. Figure 5-79 on page 232 shows the Master Data for this customer such as
Party Info, Addresses, Identifiers, and Contact Method. Click the icon
corresponding to Next in the Identifiers portion, as shown, to view details of
additional identifiers associated with Torben Andersom.
Note: The information associated with Torben Andersom has been
merged from separate records (a) in the input because the RDP for MDM
jobs were able to automatically match (“A1”) the Torben Andersom records
in the CHECKING, SAVINGS, and LOAN systems.
a. Passport Number is from the LOAN system; Social Security number is from
the CHECKING and SAVINGS systems.
Figure 5-79 Verify successful load (3 of 5)
4. Figure 5-80 shows details of Torben Andersom’s driver’s license. Click the
Next icon again to view information about additional identifiers.
Note: Review the Master Data information of other important customers as
well. When the retrieved information is deemed to be accurate, you can
conclude that the load by the RDP for MDM jobs was successful.
Figure 5-80 Verify successful load (4 of 5)
5. Figure 5-81 shows passport details of Torben Andersom.
Figure 5-81 Verify successful load (5 of 5)
5.6 Suspect resolution
If matching is enabled in the RDP for MDM configuration parameters
(QS_PERFORM_PERSON_MATCH and QS_PERFORM_ORG_MATCH both
set to 1, as shown in Figure 5-82 on page 236), the RDP for MDM jobs are
likely to conclusively identify certain parties as duplicates of each other
and consolidate the information from these parties into a single
record in the MDM data repository. However, when the match score falls below
the A1 cutoff value but above the B cutoff value (as shown in Figure 5-82 on
page 236), the jobs cannot conclusively determine that the parties are
duplicates, and therefore mark such potential duplicates as “suspects” for
manual review.
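The cutoff logic described above can be sketched in Java as follows. This is an illustration only: the class and enum names and the numeric cutoff values are hypothetical, and the treatment of a score exactly equal to a cutoff is an assumption, because the text states only what happens between the two cutoffs.

```java
// Hypothetical illustration of the A1 and B cutoff decision; not RDP code.
class MatchOutcome {
    enum Outcome { AUTO_MERGE, SUSPECT, NO_MATCH }

    static Outcome classify(double score, double a1Cutoff, double bCutoff) {
        if (bCutoff > a1Cutoff) {
            throw new IllegalArgumentException("B cutoff must not exceed A1 cutoff");
        }
        if (score >= a1Cutoff) {
            return Outcome.AUTO_MERGE; // conclusive duplicate (assumed inclusive bound)
        }
        if (score > bCutoff) {
            return Outcome.SUSPECT;    // below A1 but above B: manual review
        }
        return Outcome.NO_MATCH;       // treated as distinct parties
    }

    public static void main(String[] args) {
        // Made-up cutoffs: A1 = 20.0, B = 10.0
        System.out.println(classify(25.0, 20.0, 10.0));
        System.out.println(classify(15.0, 20.0, 10.0));
        System.out.println(classify(5.0, 20.0, 10.0));
    }
}
```

Only the middle band produces work for the data steward; the other two outcomes are resolved by the jobs without manual review.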
The process associated with manually reviewing these suspects and resolving
potential duplicates is called suspect resolution. The MDM Server UI provides
the capability to find the identified suspects and resolve (and mark) them as
duplicates or not, as described in Figure 5-82 on page 236 through Figure 5-90
on page 244. It involves searching for suspects, reviewing their details, and
collapsing them into a single record and choosing the column values to store in
the collapsed record.
Note: Resolving suspects through the real-time services of MDM Server
ensures that the same parties are not raised as possible duplicates again.
The steps are as follows:
1. Use the web address for the MDM Server UI and navigate to Data
Stewardship Console → Navigation → Party Suspect Processing →
Suspect Search in the Navigation pane as shown in Figure 5-82. Click
SEARCH to view up to 100 potential suspects in both persons and
organizations.
Figure 5-82 Suspect resolution (1 of 9)
2. Figure 5-83 shows the results of the search and identifies a list of persons (we
did not have organizations in our data), which have suspects that are
associated with them. We chose to resolve potential duplicates associated
with the person named JASTINDERK with Party Id 1120000000101288 by
selecting the person and clicking the icon corresponding to Open Suspect List
as shown.
Figure 5-83 Suspect resolution (2 of 9)
3. Figure 5-84 shows one other person, with Party Id 4070000000000788, who
has the same name and a Match Reason code of A2 (see footnote 17). Because
we believe this person to be a duplicate, we select this suspect and click
COLLAPSE CANDIDATES LIST to view the Collapsed Candidates List window,
shown in Figure 5-85 on page 239. The window shows the single candidate
(Party Id 4070000000000788) with the Suspect Status of Parties are
Duplicates. Click PREVIEW COLLAPSE to review what the collapsed party
information should be.
Figure 5-84 Suspect resolution (3 of 9)
17 A2 indicates that there is a reasonable certainty that the two records represent the same party.
Figure 5-85 Suspect resolution (4 of 9)
4. Figure 5-86 through Figure 5-89 on page 243 show the source data and
the suspect data side by side and allow you to decide what the collapsed
party column values should be. For example, in Figure 5-86 you can use the
drop-down list for Marital Status to select Single. See Figure 5-87 on
page 241.
Figure 5-86 Suspect resolution (5 of 9)
Figure 5-87 Suspect resolution (6 of 9)
The Solicitation Indicator can be set to any value from the drop-down list, as
shown in Figure 5-88 and Figure 5-89 on page 243, even though neither the
source nor the suspect has this information. Click SUBMIT when the
collapsed party information has been updated to your satisfaction. The UI
then sends a business service request to MDM Server to collapse the suspects.
Figure 5-88 Suspect resolution (7 of 9)
Figure 5-89 Suspect resolution (8 of 9)
5. Figure 5-90 shows the collapsed party information, which now has a new
Party Id 869122548964091809.
Figure 5-90 Suspect resolution (9 of 9)
Note: Each time parties are collapsed, MDM Server invokes suspect
processing for the newly created party to identify any new potential suspects. If
you have modified the Standardization and Matching QualityStage rules in the
RDP for MDM initial load, the same rules must be deployed in the MDM Server
run time to ensure identical business logic between runtime service requests
and the load. For more information about integrating runtime QualityStage rules
with MDM Server, see the InfoSphere MDM Server Developer Guide.
You should repeat this process for all the persons that have suspects associated
with them.
You may now integrate the real-time services of MDM Server into your existing
applications as described earlier.
In our scenario, we wrote a new application to consume master data in MDM
Server as described in 5.8, “MDM consumption application” on page 266.
5.7 Hierarchies
Hierarchies provide a view of relationships between parties/contacts. The
deletion of a party/contact, or the collapse of multiple parties/contacts into a
single party/contact can have an impact on existing hierarchy structures.
5.7.1 Hierarchy overview
MDM supports a directed acyclic graph hierarchy. You can define multiple
hierarchies (the primary key is the HIERARCHY_ID column in the HIERARCHY
table). A hierarchy includes hierarchy nodes (the primary key is the
HIERARCHY_NODE_ID column in the HIERARCHYNODE table), and each
hierarchy node (footnote 18) has an INSTANCE_PK column whose value matches
the Party Id of the corresponding party/contact (footnote 19). Each
HIERARCHY_NODE_ID value corresponds to a single hierarchy, which is
identified by the HIERARCHY_ID column in this table and is a foreign key to
the HIERARCHY table. A party/contact may have zero or many hierarchy nodes
associated with it, and through those nodes it may be associated with
multiple hierarchies.
18 Each node must reference a valid Hierarchy using the Hierarchy Name (such as Legal, Marketing,
and Finance) and TypeCode (1, 2, and 3).
19 The RDP for MDM data model supports only a party/contact hierarchy, even though MDM Server
also supports product hierarchies.
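The relationship between parties and hierarchy nodes can be sketched with a small in-memory model. The following Java example is a hedged illustration: the class names are hypothetical, only the three columns named in the text are represented, and the identifier values are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical mirror of the HIERARCHYNODE columns named in the text.
class HierarchyNode {
    final long hierarchyNodeId; // HIERARCHY_NODE_ID (primary key)
    final long hierarchyId;     // HIERARCHY_ID (foreign key to HIERARCHY)
    final long instancePk;      // INSTANCE_PK (Party Id of the party/contact)

    HierarchyNode(long hierarchyNodeId, long hierarchyId, long instancePk) {
        this.hierarchyNodeId = hierarchyNodeId;
        this.hierarchyId = hierarchyId;
        this.instancePk = instancePk;
    }
}

class HierarchyModel {
    // Returns the nodes that reference the given party; the list may be empty.
    static List<HierarchyNode> nodesForParty(List<HierarchyNode> nodes, long partyId) {
        List<HierarchyNode> result = new ArrayList<>();
        for (HierarchyNode node : nodes) {
            if (node.instancePk == partyId) {
                result.add(node);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Party 501 has one node in each of hierarchies 1 and 2; party 502 has none.
        List<HierarchyNode> nodes = Arrays.asList(
                new HierarchyNode(10, 1, 501),
                new HierarchyNode(11, 2, 501),
                new HierarchyNode(12, 1, 503));
        System.out.println(nodesForParty(nodes, 501).size());
        System.out.println(nodesForParty(nodes, 502).size());
    }
}
```

Each node carries exactly one HIERARCHY_ID, so a party appears in several hierarchies only by having several nodes.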
Figure 5-91 illustrates the concept, as follows:
• Three hierarchies: National, Western Region, and Eastern Region
• Six parties: Austin, Bill, Charles, David, Estelle, and Frank
• Twelve hierarchy nodes in all: six hierarchy nodes (corresponding to Austin,
  Bill, Charles, David, Estelle, and Frank) associated with the National hierarchy,
  three hierarchy nodes (corresponding to Austin, Bill, and Charles) associated
  with the Western Region hierarchy, and three hierarchy nodes (corresponding
  to David, Estelle, and Frank) associated with the Eastern Region hierarchy.
  Note: Each party has two corresponding hierarchy nodes associated with
  it.
• All six parties belong to the National hierarchy and are in a hierarchy of
  reporting where Estelle and Frank report to David, who in turn reports to
  Charles. Bill and Charles report to Austin, who is at the top of the hierarchy
  and is defined as the hierarchy ultimate parent.
• Three parties (Austin, Bill, and Charles) are associated with the Western
  Region hierarchy. Austin is again at the top of the hierarchy here and is
  defined as the hierarchy ultimate parent.
• Three parties (David, Estelle, and Frank) are associated with the Eastern
  Region hierarchy. David is at the top of this hierarchy and is defined as the
  hierarchy ultimate parent.
Note: Business rules have been defined to ensure that a cyclic graph does not
occur.
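The kind of rule the note refers to can be illustrated with a reachability check. The following Java sketch is not the MDM Server business rule itself, whose implementation is not described here. It is a hypothetical check that rejects a new reporting relationship when the proposed parent is already the child or one of its descendants, using the National hierarchy described above as sample data.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical cycle guard for the relationships of one hierarchy.
class HierarchyCycleCheck {
    // parent name -> names of its direct children
    private final Map<String, Set<String>> children = new HashMap<>();

    // True if making 'child' report to 'parent' would close a loop, that is,
    // if 'parent' is 'child' itself or is reachable from 'child'.
    boolean wouldCreateCycle(String parent, String child) {
        Deque<String> pending = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        pending.push(child);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (current.equals(parent)) {
                return true;
            }
            if (seen.add(current)) {
                pending.addAll(children.getOrDefault(current, Collections.emptySet()));
            }
        }
        return false;
    }

    // Adds the relationship, rejecting it if the graph would become cyclic.
    void addRelationship(String parent, String child) {
        if (wouldCreateCycle(parent, child)) {
            throw new IllegalArgumentException(child + " -> " + parent + " creates a cycle");
        }
        children.computeIfAbsent(parent, k -> new HashSet<>()).add(child);
    }

    // Builds the National hierarchy reporting lines from the text.
    static HierarchyCycleCheck national() {
        HierarchyCycleCheck h = new HierarchyCycleCheck();
        h.addRelationship("Austin", "Bill");
        h.addRelationship("Austin", "Charles");
        h.addRelationship("Charles", "David");
        h.addRelationship("David", "Estelle");
        h.addRelationship("David", "Frank");
        return h;
    }

    public static void main(String[] args) {
        // Austin reporting to Frank would loop back through Charles and David.
        System.out.println(national().wouldCreateCycle("Frank", "Austin"));
        // Frank reporting to Bill adds a second parent but no loop.
        System.out.println(national().wouldCreateCycle("Bill", "Frank"));
    }
}
```

Because the structure is a directed acyclic graph rather than a strict tree, the sketch allows a node to have more than one parent and rejects only relationships that loop back.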
[Figure 5-91: three hierarchy diagrams. National hierarchy: hierarchy nodes
Austin (hierarchy ultimate parent, HUP), Bill, Charles, David, Estelle, and
Frank. Western Region hierarchy: hierarchy nodes Austin (HUP), Bill, and
Charles. Eastern Region hierarchy: hierarchy nodes David (HUP), Estelle, and
Frank.]
Figure 5-91 Hierarchy scenario example
Hierarchy data is processed as a separate feed after all other party/contact data
has been validated, matched, keys assigned and the data loaded into the MDM
data repository. The input hierarchy data is validated against the hierarchy data
and party/contact information already in the MDM data repository.
The Hierarchy RT/ST (Table B-20 on page 307 through Table B-23 on page 308)
data is processed in the same manner as the non-hierarchy RT/ST party/contact
and contract data.
5.7.2 Hierarchy scenario
After successfully loading the MDM data repository for the FBankCoT scenario,
and performing suspect resolution, we defined a single hierarchy of parties
(persons and organizations), which showed the relationship of customers to the
bank’s marketing organizations.
Note: No organizations are in our FBankCoT data. For the purposes of
creating a hierarchy, we loaded organization records into the MDM data
repository. The loading of these organization records is not described here.
Figure 5-92 on page 248 shows the MARKETING hierarchy, the various party
(person (footnote 20) and organization (footnote 21)) hierarchy nodes, and the
hierarchy node relationships defined for the FBankCoT scenario. We have
combined persons and organizations in the same hierarchy. We did not have
persons in some organizations in our scenario, which is unlikely in a
production environment.
20 Oval shape
21 Rectangle shape
[Figure 5-92: diagram of the MARKETING hierarchy. Organization nodes:
US-Wide Marketing; State - California, State - Massachusetts, State -
Washington; Local Marketing - San Jose, Local Marketing - San Francisco,
Local Marketing - Eugene, Local Marketing - Salem, Local Marketing - Seattle.
Person nodes: Torben Andersom, Carol Hansson, Anna Fanelli, Bruce H Anderson,
Anders Olsson, Arcangelo Fanelli, Aaron Jensen, Renee Jackson, Jackie Jackson,
Anton T & Larue Jensen, Allan Jensen, Brandon Jensen, Andrew I Jensen,
Steven C Preston, Yesica Anderson, A Carter, Christina Anderson, Alex Skov,
Denise Farrel, Kurt Madi, Barry Rosen.]
Figure 5-92 FBankCoT hierarchy scenario
In this section, we describe the following information:
• The SIF hierarchy RT/ST records (that we created manually) to define the
  hierarchy, hierarchy nodes, hierarchy relationships, and hierarchy ultimate
  parent
• RDP for MDM jobs Director output
• Verification of successful hierarchy creation using the MDM Server UI
SIF hierarchy RT/ST records
The hierarchy RT/ST records (HH, HN, HR, and HU) in the SIF corresponding to
the hierarchy shown in Figure 5-92 on page 248 are shown in Example 5-5.
The sequence of the SIF records is immaterial because they are sorted into the
required sequence by the RDP for MDM jobs.
Example 5-5 SIF Hierarchy RT/ST records
H|R|A|MARKETING|2|1000001|3|1000001|7||||0|0|
H|R|A|MARKETING|2|1000001|2|1000001|9||||0|0|
H|R|A|MARKETING|2|1000001|5|1000001|200000031|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|7|1000001|200000281|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|5|1000001|200000051|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|6|1000001|200000201|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|5|1000001|200000141|Leaf Node Link|||0|0|
H|N|A|MARKETING|2|1000001|7|CONTACT|City|||1||0|0|0|0|
H|N|A|MARKETING|2|1000001|5|CONTACT|City|||1||0|0|0|0|
H|N|A|MARKETING|2|1000001|200000141|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000171|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000251|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000181|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000121|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000211|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|R|A|MARKETING|2|1000001|1|1000001|2||||0|0|
H|R|A|MARKETING|2|1000001|1|1000001|4||||0|0|
H|R|A|MARKETING|2|1000001|5|1000001|200000161|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|6|1000001|200000251|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|5|1000001|200000181|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|5|1000001|200000121|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|6|1000001|200000211|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|5|1000001|200000081|Leaf Node Link|||0|0|
H|N|A|MARKETING|2|1000001|2|CONTACT|State|||1||0|0|0|0|
H|N|A|MARKETING|2|1000001|4|CONTACT|State|||1||0|0|0|0|
H|N|A|MARKETING|2|1000001|9|CONTACT|City|||1||0|0|0|0|
H|N|A|MARKETING|2|1000001|200000081|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000221|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000261|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000301|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000131|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|H|A|MARKETING|2|United States of America|||0|0|
H|U|A|MARKETING|2|1000001|1|United States Of America|||0|0|
H|R|A|MARKETING|2|1000001|1|1000001|3||||0|0|
H|R|A|MARKETING|2|1000001|2|1000001|5||||0|0|
H|R|A|MARKETING|2|1000001|8|1000001|200000291|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|5|1000001|200000071|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|5|1000001|200000041|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|6|1000001|200000241|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|5|1000001|200000011|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|6|1000001|200000221|Leaf Node Link|||0|0|
H|N|A|MARKETING|2|1000001|6|CONTACT|City|||1||0|0|0|0|
H|N|A|MARKETING|2|1000001|1|CONTACT|United States Of America|||1||0|0|0|0|
H|N|A|MARKETING|2|1000001|200000011|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000091|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000031|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000281|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000051|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000201|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|R|A|MARKETING|2|1000001|2|1000001|6||||0|0|
H|R|A|MARKETING|2|1000001|4|1000001|8||||0|0|
H|R|A|MARKETING|2|1000001|5|1000001|200000171|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|6|1000001|200000261|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|9|1000001|200000301|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|5|1000001|200000131|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|5|1000001|200000001|Leaf Node Link|||0|0|
H|R|A|MARKETING|2|1000001|5|1000001|200000091|Leaf Node Link|||0|0|
H|N|A|MARKETING|2|1000001|3|CONTACT|State|||1||0|0|0|0|
H|N|A|MARKETING|2|1000001|8|CONTACT|City|||1||0|0|0|0|
H|N|A|MARKETING|2|1000001|200000001|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000161|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000291|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000071|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000041|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
H|N|A|MARKETING|2|1000001|200000241|CONTACT|CUSTOMER|||1|LOCAL|0|0|0|0|
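Because the records are pipe-delimited and their sequence is immaterial, a quick sanity check before running the job is to count the records of each RT/ST combination. The following Java sketch assumes only what Example 5-5 shows, namely that the first two fields are the record type (RT) and subtype (ST). The class and method names are hypothetical, and the meaning of the remaining fields is not interpreted.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for inspecting SIF hierarchy records; not part of RDP.
class SifHierarchyRecords {
    // Returns RT followed by ST, for example "HN" for a hierarchy node record.
    static String recordType(String line) {
        // Limit -1 keeps trailing empty fields such as the one after the last '|'.
        String[] fields = line.split("\\|", -1);
        if (fields.length < 2) {
            throw new IllegalArgumentException("Not a SIF record: " + line);
        }
        return fields[0] + fields[1];
    }

    // Counts records per RT/ST combination, ignoring blank lines.
    static Map<String, Integer> countByType(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                counts.merge(recordType(line), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // A few records copied from Example 5-5
        List<String> sample = Arrays.asList(
                "H|H|A|MARKETING|2|United States of America|||0|0|",
                "H|U|A|MARKETING|2|1000001|1|United States Of America|||0|0|",
                "H|N|A|MARKETING|2|1000001|1|CONTACT|United States Of America|||1||0|0|0|0|",
                "H|R|A|MARKETING|2|1000001|1|1000001|2||||0|0|",
                "H|R|A|MARKETING|2|1000001|1|1000001|3||||0|0|");
        System.out.println(countByType(sample));
    }
}
```

For a single hierarchy such as the one in this scenario, you would expect one HH record and one HU record, with the HN and HR counts reflecting the nodes and relationships drawn in the figure.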
RDP for MDM jobs director output
The DL_200_Hierarchy job processes the SIF containing the hierarchy RT/STs:
1. Figure 5-93 shows the launch of the DL_200_Hierarchy job from the Director
and the selection of the canonical_1 file as the configuration parameter file for
this run; the contents of the canonical_1 file are not shown here.
2. Director Client shows the successful execution of this job or any errors and
warnings associated with this job run.
Figure 5-93 RDP for MDM jobs Director output
Verify successful hierarchy creation
We used the MDM Server UI to verify that the hierarchies were successfully
created, shown in Figure 5-94 through Figure 5-108 on page 265.
1. Expand the navigation pane in the MDM Server UI and click Search
Hierarchy By Party, as shown in Figure 5-94.
Figure 5-94 Hierarchy view using MDM Server UI (1 of 15)
2. In the Party Search pane (Figure 5-95), enter the following information:
– In the Family Name field, type Andersom.
– In the Given Name 1 field, type Torben.
Click Submit to view the results of the search.
Figure 5-95 Hierarchy view using MDM Server UI (2 of 15)
3. Figure 5-96 shows one qualifying person in the Party Search Results pane.
Click the link 1180000000001888 under Party Id to view the associated
hierarchy information.
Figure 5-96 Hierarchy view using MDM Server UI (3 of 15)
4. Figure 5-97 shows Torben Andersom belonging to the MARKETING
hierarchy. To view more information about the MARKETING hierarchy, click
the MARKETING link under Hierarchy Name.
Figure 5-97 Hierarchy view using MDM Server UI (4 of 15)
5. Figure 5-98 on page 256 and Figure 5-99 on page 257 show the FULL VIEW
of the MARKETING hierarchy that shows the ancestors and the children in
this hierarchy.
Figure 5-98 Hierarchy view using MDM Server UI (5 of 15)
To view more details of a specific node in the hierarchy, such as Marketing - California, click the corresponding link as shown in Figure 5-99.
Figure 5-99 Hierarchy view using MDM Server UI (6 of 15)
6. Figure 5-100 shows details of the specific node Marketing - California. Click
RETURN to go back to the previous window.
Figure 5-100 Hierarchy view using MDM Server UI (7 of 15)
7. Click US-Wide Marketing(UP) in Figure 5-101 to view details of this node
(root or hierarchy ultimate parent).
Figure 5-101 Hierarchy view using MDM Server UI (8 of 15)
8. Figure 5-102 shows the details that identify this node as the root: Ultimate Parent is Yes.
Figure 5-102 Hierarchy view using MDM Server UI (9 of 15)
9. You may view the same information (FULL VIEW, ANCESTORS VIEW,
DESCENDENTS VIEW) by searching on an organization with Party Id
2030000000200388 as shown in Figure 5-103 through Figure 5-108 on
page 265.
Figure 5-103 Hierarchy view using MDM Server UI (10 of 15)
Figure 5-104 Hierarchy view using MDM Server UI (11 of 15)
Figure 5-105 Hierarchy view using MDM Server UI (12 of 15)
Figure 5-106 Hierarchy view using MDM Server UI (13 of 15)
Figure 5-107 Hierarchy view using MDM Server UI (14 of 15)
Figure 5-108 Hierarchy view using MDM Server UI (15 of 15)
5.8 MDM consumption application
The business requirement for the MDM solution implementation was one of
coexistence, where Master Data was maintained in the MDM repository that was
synchronized with the changes occurring in the source system (or systems) at
each end-of-day.
As mentioned previously, you typically integrate the real-time services of MDM
Server into your existing applications to access the master data therein.
However, our scenario involved writing a new, simple MDM consumption
application that obtains a 360-degree view of a customer, where Master Data is
obtained from the MDM Server through a web service call and non-Master Data is
retrieved from the corresponding source systems (checking, savings, and loan).
Our application provided a GUI for searching on first name
and last name in the MDM repository, returning the address (Master Data from
the MDM repository) and the balance (non-Master Data) from the appropriate
source systems (checking, savings, and loan).
The MDM consumption application (footnote 22) was developed as a JSP and J2EE
application. It uses a JSP to provide the GUI, shown in Figure 5-109 on
page 267, which allows you to search on first name and last name. The
application performs the following functions:
1. It uses the PartyServiceProxy() web service to obtain the party ID and
address information for persons matching the search criteria (footnote 23).
The code invoking the web service is highlighted in Example 5-6 on page 269.
Note: In our sample application, we did not provide for wildcard searches
and assumed that the results of the search would either be zero or one row
from the MDM repository with the associated party ID.
2. This party ID is then used to retrieve the address information from the MDM
repository, and the corresponding source system keys (SSK) for the checking,
savings, and loan systems as highlighted in Example 5-6 on page 269.
3. The SSKs are then used to connect to the DB2 for LUW source system (or
systems) to retrieve the balance (non-key) data (as highlighted in
Example 5-6 on page 269) and present it to the user as shown in
Figure 5-110 on page 268.
22 Download from the IBM Redbooks website ftp://www.redbooks.ibm.com/redbooks/SG247704
23 In our case, we assumed only zero or one row to be returned matching the criteria, which is highly
unlikely in a real-world environment.
In this case, the customer Renee Jackson only has a savings account, and no
checking or loan accounts.
4. Figure 5-111 on page 268 and Figure 5-112 on page 269 show the customer
Torben Andersom having accounts in Checking, Savings, and Loan systems.
Note: The code shown in Example 5-6 on page 269 is only meant to show the
web service calls and subsequent access to the source systems. It has no
error handling capabilities, which are essential in a real-world application.
The CUSTOMERID value in the ADMIN_CLIENT_ID field of the CONTEQUIV
table in MDM is used to access the balance information for the Checking and
Loan systems.
In the case of the Savings system, the CUSTOMERID value in the
ADMIN_CLIENT_ID was generated from the SAVINGSID by adding another
character to it. The MDM consumption application strips this additional
character to get the SAVINGSID key which is then used to retrieve the
balance.
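The key derivation for the Savings system can be isolated in a small helper. The following Java sketch mirrors the substring call used in Example 5-6 and adds an input check that the sample application does not have. The class name, the method name, and the sample key value are hypothetical.

```java
// Hypothetical helper: derives the SAVINGSID from the ADMIN_CLIENT_ID value,
// which the text describes as the SAVINGSID with one extra trailing character.
class SourceSystemKeys {
    static String toSavingsId(String adminClientId) {
        if (adminClientId == null || adminClientId.length() < 2) {
            throw new IllegalArgumentException("Unexpected ADMIN_CLIENT_ID: " + adminClientId);
        }
        // Strip the single additional character that was appended at load time.
        return adminClientId.substring(0, adminClientId.length() - 1);
    }

    public static void main(String[] args) {
        // Made-up key value for illustration
        System.out.println(toSavingsId("10000011"));
    }
}
```

The Checking and Loan keys need no such step, because their ADMIN_CLIENT_ID values are used directly.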
Figure 5-109 MDM consumption application (1 of 4)
Figure 5-110 MDM consumption application (2 of 4)
Figure 5-111 MDM consumption application (3 of 4)
Figure 5-112 MDM consumption application (4 of 4)
The code invoking the web service is highlighted in Example 5-6.
Example 5-6 FBankCoT 360 view test.jsp
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<%@page language="java" contentType="text/html; charset=ISO-8859-1"
pageEncoding="ISO-8859-1"%>
<%@page import="com.ibm.www.xmlns.prod.websphere.wcc.common.intf.schema.Control" %>
<%@page import="com.ibm.www.xmlns.prod.websphere.wcc.party.intf.schema.PersonSearchResultsResponse" %>
<%@page import="com.ibm.www.xmlns.prod.websphere.wcc.party.port.PartyServiceProxy" %>
<%@page import="com.ibm.www.xmlns.prod.websphere.wcc.party.schema.PersonSearch" %>
<%@page import="com.ibm.www.xmlns.prod.websphere.wcc.party.schema.PersonSearchResult"%>
<%@page import="com.ibm.www.xmlns.prod.websphere.wcc.party.intf.schema.PartyAdminSysKeyResponse"%>
<%@page import="java.sql.*"%>
<html>
<head>
<title>test</title>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<meta name="GENERATOR" content="Rational Application Developer">
</head>
<body>
<%
if(request.getParameter("query") != null){
// retrieve the first and last name as search parameters
String lastname = request.getParameter("lastname");
String firstname = request.getParameter("firstname");
// Retrieve the party
try{
PartyServiceProxy myPSP = new PartyServiceProxy();
PersonSearch myPSrch = new PersonSearch();
myPSrch.setLastName(lastname);
myPSrch.setGivenNameOne(firstname);
Control myControl = new Control();
myControl.setRequesterName("wasadmin");
myControl.setRequestId(12312);
PersonSearchResultsResponse myPSRR = myPSP.searchPerson(myControl, myPSrch);
%>
<table border="1" width="70%">
<tr><td colspan="2"><p>Customer Details:</td></tr>
<%
PersonSearchResult myPSrchR = myPSRR.getSearchResult(0);
%>
<tr><td colspan="2"><p><%=myPSrchR.getGivenNameOne()%> <%=myPSrchR.getLastName()%>
<%
PersonSearch myPSResult = myPSrchR.getMatchedFields();
long partyId = myPSResult.getPartyId().longValue(); %>
<p style="line-height: normal"><%=myPSResult.getAddrLineOne()%><br>
<%=myPSResult.getCityName() %></p></td></tr>
<%// get the balance for checking from DB
PartyAdminSysKeyResponse myPASKR = null;
String strCustId = null;
boolean proceedFlag = false;
try{
myPASKR = myPSP.getPartyAdminSysKeyByPartyId(myControl,"1000000",partyId);
strCustId = myPASKR.getAdminSysKey().getAdminSysPartyId();
proceedFlag = true;
} catch (Exception e){
proceedFlag = false;
}
// define the vars so that we can re-use them.
javax.sql.DataSource ds = null;
java.sql.Connection con = null;
String query = null;
PreparedStatement stmt = null;
int custid = 0;
ResultSet rs = null;
javax.naming.InitialContext ctx = new javax.naming.InitialContext();
ds = (javax.sql.DataSource) ctx.lookup("jdbc/test");
con = ds.getConnection("db2inst1","itso13sj");
if(proceedFlag){
query = "select balance from db2inst1.checking " + " where customerid =?";
stmt = con.prepareStatement(query);
custid = Integer.parseInt(strCustId);
stmt.setInt(1, custid);
rs = stmt.executeQuery();
if(rs.next()){ %>
<tr><td width="40%">Balance for Checking Account:</td><td
align="right"><%=rs.getBigDecimal("balance")%></td></tr>
<% }
stmt.close();
} else { %>
<tr><td colspan="2"><p> The customer does not have a checking account.</p></td></tr>
<% }
// now get the balance for the savings account
try{
myPASKR = myPSP.getPartyAdminSysKeyByPartyId(myControl,"1000001",partyId);
strCustId = myPASKR.getAdminSysKey().getAdminSysPartyId();
strCustId = strCustId.substring(0,strCustId.length()-1);
proceedFlag = true;
} catch (Exception e){
proceedFlag = false;
}
if(proceedFlag){
query = "select balance from db2inst1.savings where savingsid =?";
custid = Integer.parseInt(strCustId);
stmt = con.prepareStatement(query);
stmt.setInt(1, custid);
rs = stmt.executeQuery();
if(rs.next()){ %>
<tr><td width="40%">Balance for Savings Account:</td><td
align="right"><%=rs.getBigDecimal("balance")%></td></tr>
<% }
stmt.close();
} else { %>
<tr><td colspan="2"><p> The customer does not have a savings account.</p></td></tr>
<% }
// now get the balance for the Loan account
try{
myPASKR = myPSP.getPartyAdminSysKeyByPartyId(myControl,"1000002",partyId);
strCustId = myPASKR.getAdminSysKey().getAdminSysPartyId();
proceedFlag = true;
} catch (Exception e){
proceedFlag = false;
}
if(proceedFlag){
query = "select balance from db2inst1.loan where customerid =?";
custid = Integer.parseInt(strCustId);
stmt = con.prepareStatement(query);
stmt.setInt(1, custid);
rs = stmt.executeQuery();
if(rs.next()){ %>
<tr><td width="40%">Balance for Loan:</td><td
align="right"><%=rs.getBigDecimal("balance")%></td></tr>
<% }
stmt.close();
} else { %>
<tr><td colspan="2"><p> The customer does not have a Loan.</p></td></tr>
<% } %>
</table>
<%
} catch(Exception e) {
e.printStackTrace();
}
} else {
%>
<p>Please enter the search criteria:
<form action="test.jsp" method="get">
<p>First Name: <input type="text" name="firstname"/>
<p>Last Name: <input type="text" name="lastname"/>
<input type="hidden" name="query"/>
<p><input type="submit" value="Submit"/>
</form>
<%
}
%>
</body>
</html>
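The three balance lookups in the listing above differ only in the table and key column, and none of them closes its ResultSet. The following is a minimal sketch of one way to factor them into a single method, assuming Java 8 or later. The class and method names (BalanceLookupSketch, balanceQuery, balance) are hypothetical and are not part of the sample application; only the db2inst1 schema, the table and column names, and the balance column come from the listing.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Optional;

// Hypothetical helper that replaces the three copies of the lookup code in test.jsp.
final class BalanceLookupSketch {

    // Builds the parameterized query. The table and key column must come from
    // fixed values in the code (never from request parameters), because they
    // are concatenated into the SQL text.
    static String balanceQuery(String table, String keyColumn) {
        return "select balance from db2inst1." + table + " where " + keyColumn + " = ?";
    }

    // Returns the balance for the given key, or an empty Optional when no row matches.
    // try-with-resources closes the statement and result set even if an exception occurs.
    static Optional<BigDecimal> balance(Connection con, String table, String keyColumn, int id)
            throws SQLException {
        try (PreparedStatement stmt = con.prepareStatement(balanceQuery(table, keyColumn))) {
            stmt.setInt(1, id);
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next() ? Optional.ofNullable(rs.getBigDecimal("balance")) : Optional.empty();
            }
        }
    }
}
```

With such a helper, the checking lookup would reduce to a call like balance(con, "checking", "customerid", custid), and the savings lookup to balance(con, "savings", "savingsid", custid).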
5.9 Operational processing
After the initial load succeeds, FBankCoT needs to update the MDM repository
with changed master data at periodic intervals. For this purpose, an
operational load is scheduled at specific times during the week. Although
there are other options for operational load processing, such as using RDP
MDM jobs, for this example we use the MDM RDP asset, which uses the MDM
runtime maintenance services to process the operational load. This section
explains the steps required to implement the operational load for FBankCoT
using the MDM RDP asset.
The steps for processing operational load are as follows:
1. Extract delta data from the source system. Because we have three data
sources, we receive three operational data files. This extraction can also be
done using the ETL tool, depending on the customer situation. However, for
this example, we assume that no direct connectivity is available for ETL, and
FBankCoT has scheduled a batch process on the source system to receive
data files for daily operational processing. These files could have the updated
records or new records coming from the source systems.
2. Use DataStage to convert the received files from the source system to SIF,
which can be done in a similar way as was done for the initial load.
3. After converting the data to one SIF file, convert that file into a Sequence
SIF file, because the batch processor utility that we use requires the data to
be in a specific sequence. A DataStage job named IL_000_Autostart_EX in the
DataStage and QualityStage RDP assets converts a SIF file to the
Sequence SIF format, as shown in Figure 5-113.
4. See 2.3, “Standard Interface File (SIF)” on page 16 to get details of the
sequence file format.
Figure 5-113 Job IL_000_Autostart_EX
5. Open the batch_extension.properties file, which is located in the following
folder:
/opt/IBM/MDM/CAM_MDM902_08192010_2159_DB2_BE01/BatchProcessor/properties
Modify the file by setting the following parameter:
ParseAndExecConfiguration.Parser = TCRMService
6. Open the Batch.properties file, which is located in the following folder:
/opt/IBM/MDM/CAM_MDM902_08192010_2159_DB2_BE01/BatchProcessor/properties
Modify the file by setting the following two parameters:
– ServerConfiguration.provider_url=corbaloc:iiop:<ServerName:portNumber>
For example:
ServerConfiguration.provider_url=corbaloc:iiop:gandalf.torolab.ibm.com:9825
– ServerConfiguration.context_factory = <CTX_FACTORY>
For example:
ServerConfiguration.context_factory=com.ibm.websphere.naming.WsnInitialContextFactory
7. The Sequence SIF file is now ready and the required parameters are set in
the properties files. We can now call the batch processor.
– For Linux, the batch processor utility (runbatch.sh) is in the following path:
/opt/IBM/MDM/CAM_MDM902_08192010_2159_DB2_BE01/BatchProcessor/bin/
– Run the following command:
runbatch.sh inputFile outputPath batch_extensionPropertiesFileName
For example:
runbatch.sh /opt/IBM/MDM/Regression/BatchFramework/seed/delta.sif /opt/IBM/MDM/Regression/BatchFramework/seed/logs batch_extension
8. Check the log files for successful execution of delta load in the folder specified
as the output path.
9. Go to the following directory:
/opt/IBM/MDM/Regression/BatchFramework/seed/logs
The directory contains three log files:
– batchLoadSuccess.out
– batchLoadFail.out
– batchLoadSuspect.out
10. Review the batchLoadFail.out file for any records that failed to load into
MDM Server. It contains error messages and error reason codes for debugging
purposes.
11. After you determine that no errors are listed in the output log files, go to the
data stewardship console to verify that the delta load changes are
reflected in the MDM repository.
See 5.5.7, “Verify successful load” on page 228 for the process of checking
data in the data stewardship UI.
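Steps 6 and 7 above can be sketched in code as follows. This is a minimal illustration rather than part of the RDP assets: the class BatchRunSketch and its methods are hypothetical, and only the property keys, the example host and port, the context factory class name, and the runbatch.sh argument order are taken from the steps.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical helper (not part of MDM Server) that assembles the settings from steps 6 and 7.
final class BatchRunSketch {

    // Builds the provider URL in the form corbaloc:iiop:<ServerName:portNumber> (step 6).
    static String providerUrl(String serverName, int port) {
        return "corbaloc:iiop:" + serverName + ":" + port;
    }

    // Collects the two Batch.properties values that step 6 asks you to set.
    static Properties batchProperties(String serverName, int port, String contextFactory) {
        Properties p = new Properties();
        p.setProperty("ServerConfiguration.provider_url", providerUrl(serverName, port));
        p.setProperty("ServerConfiguration.context_factory", contextFactory);
        return p;
    }

    // Builds the runbatch.sh argument list in the order given in step 7:
    // inputFile, outputPath, batch_extensionPropertiesFileName.
    static List<String> runbatchCommand(String binDir, String inputFile,
                                        String outputPath, String extensionProps) {
        return Arrays.asList(binDir + "runbatch.sh", inputFile, outputPath, extensionProps);
    }

    public static void main(String[] args) {
        Properties p = batchProperties("gandalf.torolab.ibm.com", 9825,
                "com.ibm.websphere.naming.WsnInitialContextFactory");
        p.forEach((k, v) -> System.out.println(k + "=" + v));

        String seed = "/opt/IBM/MDM/Regression/BatchFramework/seed/";
        List<String> cmd = runbatchCommand(
                "/opt/IBM/MDM/CAM_MDM902_08192010_2159_DB2_BE01/BatchProcessor/bin/",
                seed + "delta.sif", seed + "logs", "batch_extension");
        System.out.println(String.join(" ", cmd));
    }
}
```

A scheduler written in Java could pass the returned list to java.lang.ProcessBuilder to launch the utility, and then inspect batchLoadFail.out in the output path as described in step 10.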
Appendix A. Configuration parameter file
A number of parameters are provided to control the execution of the RDP for
MDM jobs. These are all listed in Table A-1 on page 278 through Table A-4 on
page 284 with a brief description and their default value.
In this book, for ease of understanding, we classified the various parameters into
broad categories and sub-categories based on their function. We then also
identified the parameters in these categories that must be modified (Table A-5 on
page 288) before the RDP for MDM jobs can be executed, and those that you
should consider modifying (Table A-6 on page 290) before the RDP for MDM
jobs can be executed.
Note: For a detailed description of these parameters, see the IBM WebSphere
DataStage and QualityStage Version 8 Parallel Job Developer Guide,
SC18-9891.
For the parameters that work in conjunction with the CONFIGELEMENT table from
MDM, as described in 2.2, “MDMIS Parameter Set” on page 8, the following
tables list the names of the CONFIGELEMENT records used to derive them.
Categories and sub-categories
The broad categories and sub-categories that are defined are as follows:
• SETUP category (Table A-1 on page 278) identifies parameters that are
associated with setting up the environment for the RDP for MDM jobs to run,
such as the database instance (DB_INSTANCE) to connect to, the location of
the various libraries ($APT_DB2INSTANCE_HOME), and the error file
directories (FS_ERROR_DIR).
We defined the following sub-categories:
– Connection sub-category includes parameters to access the MDM
repository database and includes DB_CONNECT_STRING,
DB_INSTANCE, and $APT_DB2INSTANCE_HOME.
– DS PARAMETER sub-category includes parameters that identify the
DataStage code libraries and configuration files and includes
DS_PARALLEL_APT_CONFIG_FILE.
– Miscellaneous sub-category includes parameters such as the date format
in the SIF file (DS_SOURCE_DATE_FORMAT) and the language code
(DS_LANGUAGE_TYPE_CODE).
– File location sub-category includes parameters that identify the path of the
various files such as error (FS_ERROR_DIR), log (FS_LOG_DIR), and
parameter sets (FS_PARAM_SET_DIR).
– QualityStage sub-category includes parameters that specify whether
standardization and matching should be performed, such as
QS_STAN_ADDRESS, QS_STAN_PERSON_NAME,
QS_STAN_ORG_NAME, QS_PERFORM_MATCH, and
QS_PERFORM_ORG_MATCH. If standardization and matching is
requested, then parameters in support of these functions can be
customized, such as QS_A1_MATCH_CUTOFF_PERSON,
QS_A2_MATCH_CUTOFF_PERSON, and
QS_MATCH_PERSON_NATID.
• ERROR HANDLING category (Table A-2 on page 282) identifies parameters
that define the action to be taken when errors are detected by the RDP for
MDM jobs, such as the action to be taken when duplicates are detected in the
SIF (DS_DETECTED_DUPLICATES_ACTION), the number of errors
threshold beyond which a job should be aborted
(DS_DROP_MAX_ITERATIONS), and the severity level above which parties
should be dropped (DS_PARTY_DROP_SEVERITY_LEVEL).
We defined the following sub-categories:
– DROP sub-category includes parameters that define when records should
be dropped given an error condition, such as
DS_DETECTED_DUPLICATES_ACTION and
DS_PARTY_DROP_SEVERITY_LEVEL.
– Notification sub-category includes parameters that identify the persons to
be notified when errors occur, such as
DS_EMAIL_ERROR_CHECK_DISTRIBUTION and
DS_EMAIL_ERROR_CHECK_REPORT.
– Abort handling sub-category includes parameters that specify the error
conditions and thresholds that should cause the job to abort, such as
DS_DROP_MAX_ITERATIONS,
DS_FAILED_COLUMNIZATION_ACTION, and
DS_SIF_INDIVIDUAL_ERROR_THRESHOLD.
– Error Consolidation sub-category includes parameters that specify
whether a party or contract should be dropped when a specific error
occurs, such as DROP_ON_PRVBY_ERR and DROP_ON_REPLBY_ERR.
• RUNTIME category (Table A-3 on page 284) identifies parameters that must
be provided at run time to uniquely identify that execution and include the SIF
files to be processed (FS_SIF_FILE_PATTERN and
FS_HIERARCHY_SIF_FILE_PATTERN), the processing date
(DS_PROCESSING_DATE) and batch ID (BATCH_ID). There are no
sub-categories.
• ADVANCED category (Table A-4 on page 284) identifies parameters that
afford greater control over the performance and functionality of the RDP for
MDM jobs such as the type of history records created
(LOAD_HISTORY_FLAG), the columns used to calculate the checksum used
in address matching (DS_MD5_CRITICAL_ADDRESS_COLUMNS), and the
next value to be used in surrogate key generation
(SK_MID_CONT_ID_NEXT_VAL) and the file holding it
(SK_MID_CONT_ID_SF).
We defined the following sub-categories:
– DEBUG sub-category includes parameters for debugging, such as
DS_LAND_FILE_FLAG.
– DS PARAMETER sub-category includes parameters that control the
DataStage environment variables, such as
$APT_IMPEXP_ALLOW_ZERO_LENGTH_FIXED_NULL and
$APT_NO_SORT_INSERTION.
– HISTORY sub-category includes the parameter LOAD_HISTORY_FLAG
that controls history record creation.
– SETUP sub-category includes the
DS_MD5_CRITICAL_ADDRESS_COLUMNS parameter that specifies the
columns used to calculate the checksum used in address matching.
– SURROGATE sub-category includes parameters that specify, for the various
identifiers, the next surrogate key value to be used
(SK_MID_CONT_ID_NEXT_VAL) and the file (SK_MID_CONT_ID_SF) that holds it.
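To make the surrogate key parameters more concrete, the following sketch shows how a key could be laid out according to the SK_MASK parameter listed in Table A-4, where P, M, and S set the sizes of the cyclical sequence, the mid sequence, and the load suffix. It is an illustration only: the class SurrogateKeySketch is hypothetical, and the zero padding and simple concatenation are assumptions, not the documented behavior of the RDP jobs.

```java
// Hypothetical illustration of a surrogate key laid out according to SK_MASK.
final class SurrogateKeySketch {

    // Counts how many positions the mask reserves for the given field letter (P, M, or S).
    static int width(String mask, char field) {
        int n = 0;
        for (char c : mask.toCharArray()) {
            if (c == field) {
                n++;
            }
        }
        return n;
    }

    // Left-pads a non-negative value with zeros to the given width.
    // Values that do not fit in the reserved width are rejected.
    static String pad(long value, int width) {
        String s = Long.toString(value);
        if (value < 0 || s.length() > width) {
            throw new IllegalArgumentException(
                    "value " + value + " does not fit in " + width + " digits");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = s.length(); i < width; i++) {
            sb.append('0');
        }
        return sb.append(s).toString();
    }

    // Assumption: the key is the cyclical sequence, the mid sequence, and the
    // load suffix, each zero-padded to its mask width and concatenated in that order.
    static String compose(String mask, long cyclical, long mid, long loadSuffix) {
        return pad(cyclical, width(mask, 'P'))
                + pad(mid, width(mask, 'M'))
                + pad(loadSuffix, width(mask, 'S'));
    }

    public static void main(String[] args) {
        String mask = "PPPMMMMMMMMMMMSS";   // default SK_MASK
        long loadSuffix = 88;               // default SK_LOAD_SUFFIX
        long nextMid = 1;                   // default SK_MID_CONT_ID_NEXT_VAL
        System.out.println(compose(mask, 1, nextMid, loadSuffix));
    }
}
```

Under these assumptions, the constant load suffix occupies the last positions of every key, which is consistent with its stated purpose of avoiding collisions with IDs generated by the MDM services.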
Table A-1 RDP configuration parameters by the SETUP category

Sub category: Connection

DB_CONNECT_STRING
  Default: (blank)
  Description: Values to control the database access. Will vary between environments. Is set in the MDM_CONNECTION parmset
DB_USERID
  Default: (blank)
  Description: Values to control the database access. Will vary between environments. Is set in the MDM_CONNECTION parmset
DB_PASSWORD
  Default: (blank)
  Description: Values to control the database access. Will vary between environments. Is set in the MDM_CONNECTION parmset
DB_SCHEMA
  Default: (blank)
  Description: Values to control the database access. Will vary between environments. Is set in the MDM_CONNECTION parmset
DB_CLIENT_INSTANCE
  Default: (blank)
  Description: Values to control the database access. Will vary between environments. Is set in the MDM_CONNECTION parmset
DB_SERVER_INSTANCE
  Default: (blank)
  Description: Values to control the database access. Will vary between environments. Is set in the MDM_CONNECTION parmset
DB_ALIAS
  Default: (blank)
  Description: Values to control the database access. Will vary between environments. Is set in the MDM_CONNECTION parmset
$APT_DB2INSTANCE_HOME
  Default: /home/dsadm/remote_db2config
  Description: (blank)

Sub category: DS PARAMETER

DS_STRING_PADCHAR
  Default: 0x0
  Description: Pad character to be used in the Load jobs
DS_PARALLEL_APT_CONFIG_FILE
  Default: /opt/IBM/InformationServer/Server/Configurations/MDM_Default.apt
  Description: DataStage Configuration file for parallel jobs
DS_SEQUENTIAL_APT_CONFIG_FILE
  Default: /opt/IBM/InformationServer/Server/Configurations/MDM_1X1.apt
  Description: DataStage Configuration file for sequential jobs

Sub category: Miscellaneous

DS_SOURCE_DATE_FORMAT
  Default: %yyyy-%mm-%dd
  Description: Timestamp format in the SIF Files
MDM_DEPLOYMENT_NAME
  Default: WebSphere Customer Center
  Description: MDM deployment name required by the jobs reading and writing to the CONFIGELEMENT table. Must match the deployed MDM application name in order for it to update the correct values.
DS_LANGUAGE_TYPE_CODE
  Default: 100
  Description: MDM Language ID - 100 (default). 100 = English
DS_USE_NATIVE_KEY
  Default: 1
  Description: Use NativeKey for Contract_ID resolution - 1 (true) / 0 (false).
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/Contract/useNativeKey/enabled
Sub category: File location

DS_SUPPORT_FILE_DIR
  Default: /mdmisdata03/Projects/MDMISINT3/FREQ/
  Description: Directory where required files are installed. For instance FREQUENCY files used by QS Match. (at present this seems to be the only files stored there)
FS_DATA_SET_HEADER_DIR
  Default: /mdmisdata03/Projects/MDMISINT3/DATA/
  Description: Data set headers directory. The place where .ds files descriptors are stored. Actually ds data are stored in the database.
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/Install/ISDataSetHeaders/path
FS_ERROR_DIR
  Default: /mdmisdata03/Projects/MDMISINT3/ERROR/
  Description: Error files directory
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/Install/ErrorFiles/path
FS_LOG_DIR
  Default: /mdmisdata03/data/MDMIS/LOG/
  Description: Log files directory
FS_PARAM_SET_DIR
  Default: ./ParameterSets/
  Description: Parameter Set directory
FS_REJECT_DIR
  Default: /mdmisdata03/Projects/MDMISINT3/REJECT/
  Description: Reject files directory
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/Install/RejectFiles/path
FS_SK_FILE_DIR
  Default: /mdmisdata03/Projects/MDMISINT3/SK/
  Description: Surrogate key files directory.
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/Install/SKFiles/path
FS_TMP_DIR
  Default: /mdmisdata03/data/MDMIS/TMP/
  Description: Temporary files directory

Sub category: QualityStage

QS_A1_MATCH_CUTOFF_ORGANIZATION
  Default: 205
  Description: Specify Org A1 (see note a) Minimum Match Score - 205
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/MatchScores/a1
QS_A1_MATCH_CUTOFF_PERSON
  Default: 205
  Description: Specify Person A1 Minimum Match Score - 205
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/MatchScores/a1
QS_A2_MATCH_CUTOFF_ORGANIZATION
  Default: 175
  Description: Specify Org A2 (see note b) Minimum Match Score - 175
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/MatchScores/a2
QS_A2_MATCH_CUTOFF_PERSON
  Default: 175
  Description: Specify Person A2 Minimum Match Score - 175 (default).
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/MatchScores/a2
QS_ALLOW_LOB_MATCH
  Default: 0
  Description: Allow match across Line of Business - 1 (true) / 0 (false)
  Name in CONFIGELEMENT: /IBM/Party/SuspectProcessing/PersistDuplicateParties/enabled
QS_B_MATCH_CUTOFF_ORGANIZATION
  Default: 150
  Description: Specify Org B Minimum Match Score - 150
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/MatchScores/b
QS_B_MATCH_CUTOFF_PERSON
  Default: 150
  Description: Specify Person B Minimum Match Score - 150
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/MatchScores/b
Sub category: QualityStage

QS_EXCLUDE_FIELDS_FROM_MATCH_ORGANIZATION
  Default: (blank)
  Description: Select Critical Data Fields for Organization Match.
  Name in CONFIGELEMENT:
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgAddress/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgCity/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgCountry/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgState/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgCountry/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgPostCode/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgEstablishedDate/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString1/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString2/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString3/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString4/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgNationalID/enabled
QS_EXCLUDE_FIELDS_FROM_MATCH_PERSON
  Default: (blank)
  Description: Select Critical Data Fields for Individual Match.
  Name in CONFIGELEMENT:
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonAddress/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonBirthDate/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonCity/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonCountry/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonGender/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString1/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString2/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString3/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString4/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonNationalID/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonPostCode/enabled
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonState/enabled
QS_MATCH_ORG_1
  Default: I1
  Description: Specify Variable Match String type and TpCd for organization: these are values I1 through I12 which correspond to entries in the CDIDTP table in the ID_TP_CD column. For example, I1 corresponds to the Social Security Number as shown in Figure A-1 on page 292. The MDM UI allows you to add up to 4 user specified columns as shown in Figure A-1 on page 292 to include in the match beside the 8 already pre-specified ones. The actual values here are MDM codes denoting the available match identifiers.
  Name in CONFIGELEMENT:
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString1/type
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString1/TpCd
QS_MATCH_ORG_2
  Default: I2
  Description: (blank)
  Name in CONFIGELEMENT:
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString2/type
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString2/TpCd
QS_MATCH_ORG_3
  Default: (blank)
  Description: (blank)
  Name in CONFIGELEMENT:
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString3/type
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString3/TpCd
QS_MATCH_ORG_4
  Default: I3
  Description: (blank)
  Name in CONFIGELEMENT:
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString4/type
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgMatchString4/TpCd
QS_MATCH_ORG_NATID
  Default: I8
  Description: Specify Variable Match NationalId for organization: I8 (default) corresponds to the passport number. The MDM UI allows you to specify the document used for national ID (drivers license, passport number, SSN etc.)
  Name in CONFIGELEMENT:
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgNationalID/type
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Org/OrgNationalID/TpCd
Sub category: QualityStage

QS_MATCH_PERSON_1
  Default: C1
  Description: Specify Variable Match String type and TpCd for person, C1 through C8: these correspond to entries in the CDCONTMETHTP table in the CONT_METH_TP_CD column as shown in Figure A-2 on page 293. The MDM UI allows you to add up to 4 user specified columns to include in the match beside the already pre-specified ones. The actual values here are MDM codes denoting the available match contact methods.
  Name in CONFIGELEMENT:
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString1/type
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString1/TpCd
QS_MATCH_PERSON_2
  Default: C3
  Description: (blank)
  Name in CONFIGELEMENT:
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString2/type
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString2/TpCd
QS_MATCH_PERSON_3
  Default: C5
  Description: (blank)
  Name in CONFIGELEMENT:
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString3/type
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString3/TpCd
QS_MATCH_PERSON_4
  Default: C7
  Description: (blank)
  Name in CONFIGELEMENT:
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString4/type
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonMatchString4/TpCd
QS_MATCH_PERSON_NATID
  Default: C2
  Description: Specify Variable Match NationalId for person - C2 (default). The MDM UI allows you to specify the document used for national ID (drivers license, passport number, SSN, and so on.)
  Name in CONFIGELEMENT:
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonNationalID/type
    /IBM/ELMDM/IIS/CriticalDataFieldsMatch/Person/PersonNationalID/TpCd
QS_PERFORM_ORG_MATCH
  Default: 0
  Description: Perform Organization Match - 1 (true) / 0 (false)
  Name in CONFIGELEMENT: /IBM/Party/SuspectProcessing/enabled
QS_PERFORM_PERSON_MATCH
  Default: 0
  Description: Perform Person Match - 1 (true) / 0 (false)
  Name in CONFIGELEMENT: /IBM/Party/SuspectProcessing/enabled
QS_PHONETIC_CODING_TYPE_ADDRESS
  Default: QSNYSIIS
  Description: May be QSSOUNDEX or custom
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/StandardizeAddress/PhoneticCodingType
QS_PHONETIC_CODING_TYPE_ORGANIZATION
  Default: QSNYSIIS
  Description: May be QSSOUNDEX or custom
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/StandardizeOrganizationName/PhoneticCodingType
QS_PHONETIC_CODING_TYPE_PERSON
  Default: QSNYSIIS
  Description: May be QSSOUNDEX or custom
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/StandardizePersonName/PhoneticCodingType
Sub category: QualityStage

QS_REJECT_ADDRESS_IF_NOT_STANDARDIZED
  Default: 0
  Description: Reject record if standardization fails - 1 (true) / 0 (false)
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/StandardizeAddress/RejectOnFail
QS_REJECT_ORG_NAME_IF_NOT_STANDARDIZED
  Default: 0
  Description: Reject record if standardization fails - 1 (true) / 0 (false). If standardization leaves unhandled data AND STREET_NAME, BOX_ID, DEL_ID are all null
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/StandardizeOrganizationName/RejectOnFail
QS_REJECT_PERSON_NAME_IF_NOT_STANDARDIZED
  Default: 0
  Description: Reject record if standardization fails - 1 (true) / 0 (false)
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/StandardizePersonName/RejectOnFail
QS_STAN_ADDRESS
  Default: 0
  Description: If not equal to com.ibm.mdm.thirdparty.integration.iis8.adapter.InfoServerStandarizerAdapter, will bypass standardization for address.
  Name in CONFIGELEMENT: /IBM/Party/Standardizer/Address/className
QS_STAN_ORG_NAME
  Default: 0
  Description: If not equal to com.ibm.mdm.thirdparty.integration.iis8.adapter.InfoServerStandarizerAdapter, will bypass standardization for name.
  Name in CONFIGELEMENT: /IBM/Party/Standardizer/Name/className
QS_STAN_PERSON_NAME
  Default: 0
  Description: If not equal to com.ibm.mdm.thirdparty.integration.iis8.adapter.InfoServerStandarizerAdapter, will bypass
  Name in CONFIGELEMENT: /IBM/Party/Standardizer/Name/className

a. A1 is described in 3.6, “Matching” on page 47
b. A2 is described in 3.6, “Matching” on page 47
Table A-2 RDP configuration parameters by the ERROR HANDLING category

Sub category: DROP

DS_DETECTED_DUPLICATES_ACTION
  Default: E
  Description: Action to take if duplicate records (same key only) are detected in the SIF file. The duplicate records will be removed from input. E: Error all duplicates / K: Keep first, error others.
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/Errors/detectedDuplicated/action
DS_PARTY_DROP_SEVERITY_LEVEL
  Default: 4
  Description: Party will be dropped if there are errors with severity <= DS_PARTY_DROP_SEVERITY_LEVEL. Severity level ranges from 0 (worst) to 10 (least)
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/Errors/partyDropSeverity/level
Sub category: Notification

DS_EMAIL_ERROR_CHECK_DISTRIBUTION
  Description: Space-separated list of email addresses to receive the error count report of SIF errors by file (abort or not is controlled by three parameters: DS_SIF_ERROR_THRESHOLD, DS_SIF_INDIVIDUAL_ERROR_THRESHOLD, DS_SIF_INDIVIDUAL_ERROR_THRESHOLD_KOUNT)
DS_EMAIL_ERROR_CHECK_REPORT
  Default: 1
  Description: Flag to indicate whether the error report (of SIF file error count) should be emailed at all. (abort or not is controlled by three parameters: DS_SIF_ERROR_THRESHOLD, DS_SIF_INDIVIDUAL_ERROR_THRESHOLD, DS_SIF_INDIVIDUAL_ERROR_THRESHOLD_KOUNT)

Sub category: Abort handling

DS_DROP_MAX_ITERATIONS
  Default: 10
  Description: Number of times the job Contract_Iterative_Drop or Party_Iterative_Drop will run (at max). The job will abort when this level is reached. If it fails, there are problems with the data. Either fix the data or increase this value and restart the job.
DS_FAILED_COLUMNIZATION_ACTION
  Default: F
  Description: Action if ANY row fails columnization - F: Fail (Default) / C: Continue. Fail == abort the job
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/Errors/failedColunmization/action
DS_FAILED_RECORDIZATION_ACTION
  Default: F
  Description: Action if ANY row fails recordization. Warning! Setting to C may break the row counter - F: Fail (Default) / C: Continue. Fail == abort the job
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/Errors/failedRecordization/action
DS_SIF_ERROR_THRESHOLD
  Default: 101
  Description: Percentage of ALL SIF records with errors that cause the job stream to abort (any value above 100 will skip this check)
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/Errors/failedIfInputRowsInError/percentage
DS_SIF_INDIVIDUAL_ERROR_THRESHOLD
  Default: 101
  Description: Percentage of an individual SIF File's Records with errors that will cause the job stream to abort (Any value over 100 will skip this check)
DS_SIF_INDIVIDUAL_ERROR_THRESHOLD_KOUNT
  Default: 101
  Description: Number of Individual SIF Files, whose Error Threshold has been exceeded, that are required for an abort.

Sub category: Error Consolidation (MDM_EC parameter set)

DROP_ON_ASSIGNEDBY_ERR
  Default: 1
  Description: Identifier assigned by party was dropped. 0 - do not drop the party, but drop the identifier record. 1 - drop the party. ReasonCode100385 severity <=party drop
DROP_ON_FROM_ERR
  Default: 1
  Description: Contact Rel from party error action. 0 - do not drop the party, but drop the contact rel record. 1 - drop the party. ReasonCode100383 severity <=party drop
DROP_ON_PRVBY_ERR
  Default: 1
  Description: Provided by party ID RI Validation 0 - Party will not be dropped, 1 - drop parties when the provided by party was dropped. ReasonCode100381 severity <=party drop severity level
DROP_ON_REPLBY_ERR
  Default: 1
  Description: Reply by contract ID RI Validation 0 - Contract will not be dropped. 1 - drop contracts when the REPLBY contract was dropped. ReasonCode 100388 severity <=party drop severity level
DUPLICATES_ERR_MSG_TP_CD
  Default: 12
  Description: Err_message_tp_cd associated with Duplicate primary key. Used in conjunction with MDMIS.DS_DETECTED_DUPLICATES_ACTION
INCLUDE_LIST_ERR_MSG_TP_CD
  Default: |0|110127|110184|110125|110126|1626|110208|110209|110385|
  Description: ErrorCodeIncludeList PipeDelimited List of ErrorCodes used for error thresholding in job EC_Error_Check |0|110127|110184|110125|110126|1626|110208|110209|110385|
RESET_ON_ASSIGNEDBY_ERR
  Default: 0
  Description: Reset the assigned by to null, if the assigned by party was dropped. This parameter is only useful when DROP_ON_ASSIGNEDBY_ERR=0. Error Reason code 100391 severity <=party drop severity level
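The drop and abort rules in Table A-2 can be illustrated with a short sketch. It is a reading of the parameter descriptions, not the RDP job logic itself: the class ErrorThresholdSketch and its methods are hypothetical, and treating a percentage that equals the threshold as exceeding it is an assumption.

```java
// Hypothetical illustration of two Table A-2 rules; not taken from the RDP jobs.
final class ErrorThresholdSketch {

    // DS_PARTY_DROP_SEVERITY_LEVEL: a party is dropped when it has an error whose
    // severity is at or below the configured level (0 is the worst severity).
    static boolean shouldDropParty(int errorSeverity, int partyDropSeverityLevel) {
        return errorSeverity <= partyDropSeverityLevel;
    }

    // DS_SIF_ERROR_THRESHOLD: percentage of SIF records in error that aborts the
    // job stream. Any threshold above 100 (such as the default 101) skips the check.
    static boolean exceedsThreshold(long errorRecords, long totalRecords, int thresholdPercent) {
        if (thresholdPercent > 100 || totalRecords <= 0) {
            return false;
        }
        double percentInError = 100.0 * errorRecords / totalRecords;
        // Assumption: reaching the threshold exactly counts as exceeding it.
        return percentInError >= thresholdPercent;
    }

    public static void main(String[] args) {
        // With the default level of 4, a severity 3 error drops the party.
        System.out.println(shouldDropParty(3, 4));
        // With the default threshold of 101, the check is skipped even if every record fails.
        System.out.println(exceedsThreshold(100, 100, 101));
        // With a threshold of 10 percent, 25 errors in 200 records trigger an abort.
        System.out.println(exceedsThreshold(25, 200, 10));
    }
}
```

The sketch makes the effect of the defaults visible: a threshold of 101 disables the percentage check, so the check must be lowered to 100 or less before it can stop a job stream.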
Table A-3 RDP configuration parameters by the RUNTIME category

Sub category: Runtime

FS_HIERARCHY_SIF_FILE_PATTERN
  Default: /mdmisdata03/Projects/MDMISINT3/SIF_IN/sanitycheck/*.hsif
  Description: Hierarchy SIF files pattern. Includes full path and file mask. All files meeting this pattern are read by RDP jobs.
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/Install/HierarchySIFInputFiles/path
FS_SIF_FILE_PATTERN
  Default: /mdmisdata03/Projects/MDMISINT3/SIF_IN/sanitycheck/*.sif
  Description: SIF files pattern. Includes full path and file mask. All files meeting this pattern are read by the RDP jobs
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/Install/SIFInputFiles/path
BATCH_ID (auto assigned)
  Default: 1
  Description: Batch ID generated at run time if IL_000_AutoStart_EX is used. If IL_000_INITIAL_LOAD is used batch ID is assigned through the parameterset. Will be appended to every output filename generated during the job run
DS_PROCESSING_DATE (auto assigned)
  Default: 1900-01-01 00:00:00
  Description: Generated at run time. Can be used to fix the processing date if you are restarting the load at a later date.
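The file patterns in Table A-3 combine a full path with a file mask. The following sketch shows which file names such a mask selects, using the glob syntax of java.nio.file as a stand-in; DataStage might interpret the mask differently, and the class SifPatternSketch is hypothetical. The example assumes a UNIX-style file system, as used elsewhere in this scenario.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical illustration of a path-plus-mask pattern such as FS_SIF_FILE_PATTERN.
final class SifPatternSketch {

    // Returns true when the file path satisfies the pattern, using Java glob rules:
    // "*" matches any characters within one directory level only.
    static boolean matches(String pattern, String file) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
        return matcher.matches(Paths.get(file));
    }

    public static void main(String[] args) {
        String dir = "/mdmisdata03/Projects/MDMISINT3/SIF_IN/sanitycheck/";
        String sifPattern = dir + "*.sif";    // default FS_SIF_FILE_PATTERN
        String hsifPattern = dir + "*.hsif";  // default FS_HIERARCHY_SIF_FILE_PATTERN

        // The file names below are made up for the example.
        System.out.println(matches(sifPattern, dir + "delta.sif"));
        System.out.println(matches(sifPattern, dir + "hierarchy.hsif"));
        System.out.println(matches(hsifPattern, dir + "hierarchy.hsif"));
    }
}
```

Under glob rules the two default masks select disjoint sets of files, because a name ending in .hsif does not end in .sif preceded by a period boundary that the *.sif mask requires.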
Table A-4 RDP configuration parameters by the ADVANCED category

Sub category: DEBUG

DS_LAND_FILE_FLAG
  Default: 0
  Description: When set to 1 only Insert Update data sets are created and the load jobs are not run. 0 implies database will be updated.

Sub category: DS PARAMETER

$APT_IMPEXP_ALLOW_ZERO_LENGTH_FIXED_NULL
  Default: True
  Description: (blank)
$APT_IMPORT_PATTERN_USES_FILESET
  Default: True
  Description: (blank)
$APT_IMPORT_PATTERN_USES_FILESET_MOUNTED
  Default: True
  Description: (blank)
$APT_IMPORT_REJECT_STRING_FIELD_OVERRUNS
  Default: True
  Description: (blank)
$APT_NO_PART_INSERTION
  Default: True
  Description: (blank)
$APT_NO_SORT_INSERTION
  Default: True
  Description: (blank)
$APT_SORT_INSERTION_OPTIMIZATION
  Default: True
  Description: (blank)
$APT_OLD_BOUNDED_LENGTH
  Default: True
  Description: (blank)

Sub category: HISTORY

LOAD_HISTORY_FLAG
  Default: C
  Description: History flag to set history records creation type - C: Compound / S: Simple / N: None.
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/Install/History/type

Sub category: SETUP

DS_MD5_CRITICAL_ADDRESS_COLUMNS
  Default: ADDR_LINE_ONE,ADDR_LINE_TWO,ADDR_LINE_THREE,CITY_NAME,POSTAL_CODE,PROV_STATE_TP_CD,COUNTRY_TP_CD,RESIDENCE_NUM
  Description: The columns used to calculate the MD5 checksum used in address "matching"
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/MD5CriticalAddressColumns/value
Sub category: SURROGATE

SK_LOAD_SUFFIX
  Default: 88
  Description: Constant value that is appended to each surrogate key. This avoids possible key collisions with MDM Service generated IDs - 88
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/Key/LoadSuffix/value
SK_MASK
  Default: PPPMMMMMMMMMMMSS
  Description: Format of surrogate keys. Example PPPMMMMMMMMMMMSS. P=set size of Cyclical Sequence, M=set size of midSequence, S=set size of load suffix - PPPMMMMMMMMMMMSS
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/Key/Mask/value
SK_MID_ADDRESS_ID_NEXT_VAL
  Default: 1
  Description: Surrogate key value for ADDRESS_ID
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/StartAtKeys/AddressID/part2
SK_MID_ADDRESS_ID_SF
  Default: skMid_ADDRESS_ID.sf
  Description: The file that holds the previous surrogate key
SK_MID_ALERT_ID_NEXT_VAL
  Default: 1
  Description: Surrogate key value for ALERT_ID
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/StartAtKeys/AlertID/part2
SK_MID_ALERT_ID_SF
  Default: skMid_ALERT_ID.sf
  Description: The file that holds the previous surrogate key
SK_MID_CONT_EQUIV_ID_NEXT_VAL
  Default: 1
  Description: Surrogate key value for CONT_EQUIV_ID
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/StartAtKeys/ContEquivID/part2
SK_MID_CONT_EQUIV_ID_SF
  Default: skMid_Contacts_CONTEQUIV_ID.sf
  Description: The file that holds the previous surrogate key
SK_MID_CONT_ID_NEXT_VAL
  Default: 1
  Description: Surrogate key value for CONT_ID
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/StartAtKeys/ContactID/part2
SK_MID_CONT_ID_SF
  Default: skMid_Contacts_CONT_ID.sf
  Description: The file that holds the previous surrogate key
SK_MID_CONT_REL_ID_NEXT_VAL
  Default: 1
  Description: Surrogate key value for CONT_REL_ID
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/StartAtKeys/ContactRelID/part2
SK_MID_CONT_REL_ID_SF
  Default: skMid_ContactRel_CONT_REL_ID.sf
  Description: The file that holds the previous surrogate key
SK_MID_CONTACT_METHOD_ID_NEXT_VAL
  Default: 1
  Description: Surrogate key value for CONTACT_METHOD_ID
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/StartAtKeys/ContactMethodID/part2
SK_MID_CONTACT_METHOD_ID_SF
  Default: skMid_CONTACT_METHOD_ID.sf
  Description: The file that holds the previous surrogate key
SK_MID_CONTR_COMP_VAL_ID_NEXT_VAL
  Default: 1
  Description: Surrogate key value for CONTR_COMP_VAL_ID
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/StartAtKeys/ContractComponentValID/part2
SK_MID_CONTR_COMP_VAL_ID_SF
  Default: skMid_CONTR_COMP_VAL_ID.sf
  Description: The file that holds the previous surrogate key
SK_MID_CONTR_COMPONENT_ID_NEXT_VAL
  Default: 1
  Description: Surrogate key value for CONTR_COMPONENT_ID
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/StartAtKeys/ContractComponentID/part2
SK_MID_CONTR_COMPONENT_ID_SF
  Default: skMid_CONTR_COMPONENT_ID.sf
  Description: The file that holds the previous surrogate key
SK_MID_CONTRACT_ID_NEXT_VAL
  Default: 1
  Description: Surrogate key value for CONTRACT_ID
  Name in CONFIGELEMENT: /IBM/ELMDM/IIS/StartAtKeys/ContractID/part2
Sub category
Parameter
Default
Description
SURROGATE
SK_MID_CONTRACT_ID_SF
skMid_CONTRACT_ID.sf
The file that holds the previous surrogate key
SK_MID_CONTRACT_ROLE_ID_NEXT_VAL
1
Surrogate key value for
Name in CONFIGELEMENT
CONTRACT_ROLE_ID
/IBM/ELMDM/IIS/StartAtKeys/ContractRoleID/part2
SK_MID_CONTRACT_ROLE_ID_SF
skMid_CONTRACT_ROLE_ID.sf
The file that holds the previous surrogate key
SK_MID_HIER_ULT_PAR_ID_NEXT_VAL
1
Surrogate key value for HIER_ULT_PAR_ID
/IBM/ELMDM/IIS/StartAtKeys/HierarchyUltimateParentID/part2
SK_MID_HIER_ULT_PAR_ID_SF
skMid_HIER_ULT_PAR_ID.sf
The file that holds the previous surrogate key
SK_MID_HIERARCHY_ID_NEXT_VAL
1
Surrogate key value for HIERARCHY_ID
/IBM/ELMDM/IIS/StartAtKeys/HierarchyID/part2
SK_MID_HIERARCHY_ID_SF
skMid_HIERARCHY_ID.sf
The file that holds the previous surrogate key
SK_MID_HIERARCHY_NODE_ID_NEXT_VAL
1
Surrogate key value for
HIERARCHY_NODE_ID
/IBM/ELMDM/IIS/StartAtKeys/HierarchyNodeID/part2
SK_MID_HIERARCHY_NODE_ID_SF
skMid_HIERARCHY_NODE_ID.sf
The file that holds the previous surrogate key
SK_MID_HIERARCHY_REL_ID_NEXT_VAL
1
Surrogate key value for
HIERARCHY_REL_ID
/IBM/ELMDM/IIS/StartAtKeys/HierarchyRelID/part2
SK_MID_HIERARCHY_REL_ID_SF
skMid_HIERARCHY_REL_ID.sf
The file that holds the previous surrogate key
SK_MID_IDENTIFIER_ID_NEXT_VAL
1
Surrogate key value for IDENTIFIER_ID
/IBM/ELMDM/IIS/StartAtKeys/IdentifierID/part2
SK_MID_IDENTIFIER_ID_SF
skMid_Identifier_IDENTIFIER_ID.sf
The file that holds the previous surrogate key
SK_MID_LOB_REL_ID_NEXT_VAL
1
Surrogate key value for LOB_REL_ID
/IBM/ELMDM/IIS/StartAtKeys/LOBRelID/part2
SK_MID_LOB_REL_ID_SF
skMid_LOB_REL_ID.sf
The file that holds the previous surrogate key
SK_MID_LOCATION_GROUP_ID_NEXT_VAL
1
Surrogate key value for
LOCATION_GROUP_ID
/IBM/ELMDM/IIS/StartAtKeys/LocationGroup/part2
SK_MID_LOCATION_GROUP_ID_SF
skMid_LOCATION_GROUP_ID.sf
The file that holds the previous surrogate key
SK_MID_MISCVALUE_ID_NEXT_VAL
1
Surrogate key value for MISCVALUE_ID
/IBM/ELMDM/IIS/StartAtKeys/MiscValueID/part2
SK_MID_MISCVALUE_ID_SF
skMid_MISCVALUE_ID.sf
The file that holds the previous surrogate key
SK_MID_NATIVE_KEY_ID_NEXT_VAL
1
Surrogate key value for NATIVE_KEY_ID
/IBM/ELMDM/IIS/StartAtKeys/NativeKeyID/part2
SK_MID_NATIVE_KEY_ID_SF
skMid_NativeKey_NATIVE_KEY_ID.sf
The file that holds the previous surrogate key
SK_MID_ORG_NAME_ID_NEXT_VAL
1
Surrogate key value for ORG_NAME_ID
/IBM/ELMDM/IIS/StartAtKeys/OrgName/part1
SK_MID_ORG_NAME_ID_SF
skMid_OrgName_ORG_NAME_ID.sf
The file that holds the previous surrogate key
SK_MID_PERSON_NAME_ID_NEXT_VAL
1
Surrogate key value for PERSON_NAME_ID
/IBM/ELMDM/IIS/StartAtKeys/PersonName/part2
SK_MID_PERSON_NAME_ID_SF
skMid_PersonName_PERSON_NAME_ID.sf
The file that holds the previous surrogate key
SK_MID_PERSON_SEARCH_ID_NEXT_VAL
1
Surrogate key value for
PERSON_SEARCH_ID
/IBM/ELMDM/IIS/StartAtKeys/PersonSearch/part2
SK_MID_PERSON_SEARCH_ID_SF
skMid_PersonName_PERSON_SEARCH_ID.sf
The file that holds the previous surrogate key
SK_MID_PPREF_ID_NEXT_VAL
1
Surrogate key value for PPREF_ID
/IBM/ELMDM/IIS/StartAtKeys/PPrefID/part2
SK_MID_PPREF_ID_SF
skMid_PrivPref_PPREF_ID.sf
The file that holds the previous surrogate key
SK_MID_ROLE_LOCATION_ID_NEXT_VAL
1
Surrogate key value for
ROLE_LOCATION_ID
/IBM/ELMDM/IIS/StartAtKeys/RoleLocationID/part2
SK_MID_ROLE_LOCATION_ID_SF
skMid_ROLE_LOCATION_ID.sf
The file that holds the previous surrogate key
SK_MID_SUSPECT_ID_NEXT_VAL
1
Surrogate key value for SUSPECT_ID
/IBM/ELMDM/IIS/StartAtKeys/SuspectID/part2
SK_MID_SUSPECT_ID_SF
skMid_Contacts_SUSPECT_ID.sf
The file that holds the previous surrogate key
SK_PREFIX_CONT_ID_NEXT_VAL
1
Surrogate key value for CONT_ID
/IBM/ELMDM/IIS/StartAtKeys/ContactID/part1
SK_PREFIX_CONT_ID_SF
skPrefix_Contacts_CONT_ID.sf
The file that holds the previous surrogate key
SK_PREFIX_CONTRACT_ID_NEXT_VAL
1
Surrogate key value for CONTRACT_ID
/IBM/ELMDM/IIS/StartAtKeys/ContractID/part1
SK_PREFIX_CONTRACT_ID_SF
skPrefix_Contracts_CONTRACT_ID.sf
The file that holds the previous surrogate key
SK_PREFIX_HIERARCHY_ID_NEXT_VAL
1
Surrogate key value for HIERARCHY_ID
/IBM/ELMDM/IIS/StartAtKeys/HierarchyID/part1
SK_PREFIX_HIERARCHY_ID_SF
skPrefix_HIERARCHY_ID.sf
The file that holds the previous surrogate key
MUST MODIFY parameters
Table A-5 lists all the parameters that, in our opinion, it is a good practice to
modify. Database connection details, file names and directories, and key DataStage
parameters must all be provided before RDP for MDM jobs can be launched.
It is also a good practice to enable standardization and matching; the default is to
disable these jobs. Guidelines are provided where appropriate.
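As a hedged illustration of the check implied above, the following Python sketch reports which connection parameters are still blank before any job is launched. It assumes the parameter values have already been read into a dictionary. The function name find_blank_connection_params and the dictionary representation are hypothetical and are not part of RDP for MDM; only the parameter names come from Table A-5.

```python
# Connection parameters that Table A-5 lists with a blank recommendation,
# meaning that a site-specific value has to be supplied.
CONNECTION_PARAMS = (
    "DB_CONNECT_STRING",
    "DB_INSTANCE",
    "DB_PASSWORD",
    "DB_SCHEMA",
    "DB_USERID",
)


def find_blank_connection_params(params):
    """Return the connection parameters that are missing or blank.

    `params` is a hypothetical dict of parameter name -> value; how the
    values are read from the configuration parameter file is not shown here.
    """
    return [
        name
        for name in CONNECTION_PARAMS
        if not str(params.get(name) or "").strip()
    ]


if __name__ == "__main__":
    # Illustrative values only.
    sample = {"DB_INSTANCE": "db2inst1", "DB_USERID": " "}
    for name in find_blank_connection_params(sample):
        print("Parameter still blank:", name)
```

A check of this kind could be run once after the initial setup, because Table A-5 marks the connection parameters as one-time settings.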
Table A-5 RDP configuration parameters MUST MODIFY list
Frequency
Category
Sub
category
Parameter
Recommendation to set parameter to
One time
SETUP
Connection
DB_CONNECT_STRING
(blank)
DB_INSTANCE
(blank)
DB_PASSWORD
(blank)
DB_SCHEMA
(blank)
DB_USERID
(blank)
$APT_DB2INSTANCE_HOME
/home/dsadm/remote_db2config
DS PARAMETER
Miscellaneous
File location
Runtime
Runtime
$APT_IMPORT_PATTERN_USES_FILESET_MOUNTED
TRUE
DS_STRING_PADCHAR
0x0
DS_PARALLEL_APT_CONFIG_FILE
/opt/IBM/InformationServer/Server/Configurations/MDM_Default.apt
DS_SEQUENTIAL_APT_CONFIG_FILE
/opt/IBM/InformationServer/Server/Configurations/MDM_1X1.apt
MDM_DEPLOYMENT_NAME
WebSphere Customer Center (a)
DS_LANGUAGE_TYPE_CODE
100
DS_SUPPORT_FILE_DIR
/mdmisdata03/data/MDMIS/PARAMETERS/
FS_DATA_SET_HEADER_DIR
/mdmisdata03/Projects/MDMISINT3/DATA/
FS_ERROR_DIR
/mdmisdata03/Projects/MDMISINT3/ERROR/
FS_LOG_DIR
/mdmisdata03/data/MDMIS/LOG/
FS_PARAM_SET_DIR
./ParameterSets/
FS_REJECT_DIR
/mdmisdata03/Projects/MDMISINT3/REJECT/
FS_SK_FILE_DIR
/mdmisdata03/Projects/MDMISINT3/SK/
FS_TMP_DIR
/mdmisdata03/data/MDMIS/TMP/
BATCH_ID (auto assigned)
1
DS_PROCESSING_DATE (auto assigned)
1/1/1900
FS_HIERARCHY_SIF_FILE_PATTERN
/mdmisdata03/Projects/MDMISINT3/SIF_IN/sanitycheck/*.hsif
FS_SIF_FILE_PATTERN
/mdmisdata03/Projects/MDMISINT3/SIF_IN/sanitycheck/*.sif
One time
ADVANCED
DS PARAMETER
$APT_IMPEXP_ALLOW_ZERO_LENGTH_FIXED_NULL
true
$APT_IMPORT_PATTERN_USES_FILESET
true
$APT_IMPORT_REJECT_STRING_FIELD_OVERRUNS
true
$APT_SORT_INSERTION_OPTIMIZATION
true
QS_MATCH_ORG_NATID
I2
QS_MATCH_PERSON_NATID
I1
Recurring
SETUP
QualityStage
QS_PERFORM_ORG_MATCH
1
QS_PERFORM_PERSON_MATCH
1
QS_STAN_ADDRESS
1
QS_STAN_ORG_NAME
1
QS_STAN_PERSON_NAME
1
a. This name must match the name used when deploying the MDM application.
CONSIDER MODIFYING parameters
Table A-6 lists all the parameters that, in our opinion, you should consider
modifying. Guidelines are provided where appropriate.
Table A-6 RDP configuration parameters in the CONSIDER MODIFYING list
Frequency
Category
Sub
category
Parameter
Recommendation to set parameter to
One time
SETUP
Miscellaneous
DS_SOURCE_DATE_FORMAT
%yyyy-%mm-%dd %hh:%nn:%ss.6
DS_USE_NATIVE_KEY
1
One time
ADVANCED
SURROGATE
SK_MID_ADDRESS_ID_NEXT_VAL
1
SK_MID_ALERT_ID_NEXT_VAL
1
SK_MID_CONT_EQUIV_ID_NEXT_VAL
1
SK_MID_CONT_ID_NEXT_VAL
1
SK_MID_CONT_REL_ID_NEXT_VAL
1
SK_MID_CONTACT_METHOD_ID_NEXT_VAL
1
SK_MID_CONTR_COMP_VAL_ID_NEXT_VAL
1
SK_MID_CONTR_COMPONENT_ID_NEXT_VAL
1
SK_MID_CONTRACT_ID_NEXT_VAL
1
SK_MID_CONTRACT_ROLE_ID_NEXT_VAL
1
SK_MID_HIER_ULT_PAR_ID_NEXT_VAL
1
SK_MID_HIERARCHY_ID_NEXT_VAL
1
SK_MID_HIERARCHY_NODE_ID_NEXT_VAL
1
SK_MID_HIERARCHY_REL_ID_NEXT_VAL
1
SK_MID_IDENTIFIER_ID_NEXT_VAL
1
SK_MID_LOB_REL_ID_NEXT_VAL
1
SK_MID_LOCATION_GROUP_ID_NEXT_VAL
1
SK_MID_MISCVALUE_ID_NEXT_VAL
1
SK_MID_NATIVE_KEY_ID_NEXT_VAL
1
SK_MID_ORG_NAME_ID_NEXT_VAL
1
SK_MID_PERSON_NAME_ID_NEXT_VAL
1
SK_MID_PERSON_SEARCH_ID_NEXT_VAL
1
SK_MID_PPREF_ID_NEXT_VAL
1
SK_MID_ROLE_LOCATION_ID_NEXT_VAL
1
SK_MID_SUSPECT_ID_NEXT_VAL
1
SK_PREFIX_CONT_ID_NEXT_VAL
1
SK_PREFIX_CONTRACT_ID_NEXT_VAL
1
SK_PREFIX_HIERARCHY_ID_NEXT_VAL
1
Recurring
SETUP
QualityStage
QS_ALLOW_LOB_MATCH
0
QS_EXCLUDE_FIELDS_FROM_MATCH_ORGANIZATION
(blank)
QS_EXCLUDE_FIELDS_FROM_MATCH_PERSON
(blank)
QS_MATCH_ORG_1
(blank)
QS_MATCH_ORG_2
(blank)
QS_MATCH_ORG_3
(blank)
QS_MATCH_ORG_4
(blank)
QS_MATCH_PERSON_1
C1
QS_MATCH_PERSON_2
C3
QS_MATCH_PERSON_3
C5
QS_MATCH_PERSON_4
C7
QS_PHONETIC_CODING_TYPE_ADDRESS
QSNYSIIS
Error Handling
DROP
Notification
Abort handling
QS_PHONETIC_CODING_TYPE_ORGANIZATION
QSNYSIIS
QS_PHONETIC_CODING_TYPE_PERSON
QSNYSIIS
QS_REJECT_ADDRESS_IF_NOT_STANDARDIZED
0
QS_REJECT_ORG_NAME_IF_NOT_STANDARDIZED
0
QS_REJECT_PERSON_NAME_IF_NOT_STANDARDIZED
0
DS_DETECTED_DUPLICATES_ACTION
E
DS_PARTY_DROP_SEVERITY_LEVEL
4
DS_EMAIL_ERROR_CHECK_DISTRIBUTION
DS_EMAIL_ERROR_CHECK_REPORT
1
DS_DROP_MAX_ITERATIONS
10
DS_FAILED_COLUMNIZATION_ACTION
F
DS_FAILED_RECORDIZATION_ACTION
F
DS_SIF_ERROR_THRESHOLD
101
DS_SIF_INDIVIDUAL_ERROR_THRESHOLD
101
DS_SIF_INDIVIDUAL_ERROR_THRESHOLD_KOUNT
101
The match columns for organization (QS_MATCH_ORG_*) and person
(QS_MATCH_PERSON_*) in Table A-1 on page 278 allow you to specify the match
fields to be used:
• Allowable values I1 through I12 correspond to entries in the ID_TP_CD
column in the CDIDTP table, as shown in Figure A-1. The
actual values here are MDM codes denoting the available match identifiers.
For example, I1 corresponds to the Social Security Number.
• Allowable values C1 through C8 correspond to entries in the
CONT_METH_TP_CD column in the CDCONTMETHTP table, as shown in
Figure A-2 on page 293. The actual values here are MDM codes denoting the
available match contact methods. For example, C1 corresponds to the Home
Telephone number.
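The value ranges described above can be sketched as a small Python check. This is only an illustration under stated assumptions: the function name describe_match_code is hypothetical, and the sketch assumes that the numeric suffix of a match field value (for example, the 1 in I1) is the type code looked up in the named MDM code table.

```python
import re

# Value ranges and code tables as described in the text above.
_MATCH_CODE = re.compile(r"^([IC])([1-9][0-9]*)$")
_MAX_INDEX = {"I": 12, "C": 8}
_SOURCE = {
    "I": ("CDIDTP", "ID_TP_CD"),
    "C": ("CDCONTMETHTP", "CONT_METH_TP_CD"),
}


def describe_match_code(code):
    """Return (table, column, type code) for a match field value such as 'I1'.

    Assumption: the numeric suffix is the type code in the code table.
    Raises ValueError for anything outside I1..I12 and C1..C8.
    """
    match = _MATCH_CODE.match(code.strip())
    if match is None:
        raise ValueError("not a match field value: %r" % code)
    prefix, index = match.group(1), int(match.group(2))
    if index > _MAX_INDEX[prefix]:
        raise ValueError("%r is outside the allowable range" % code)
    table, column = _SOURCE[prefix]
    return (table, column, index)


if __name__ == "__main__":
    for value in ("I1", "C1", "C7"):
        print(value, "->", describe_match_code(value))
```

Blank values, which Table A-6 shows as the default for the QS_MATCH_ORG_* parameters, are deliberately not handled here and would need a separate check.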
Figure A-1 CDIDTP table contents: corresponds to the I’n’ columns
Figure A-2 CDCONTMETHTP table contents: corresponds to the C’n’ columns
Note: The match columns for organization (QS_MATCH_ORG_*) and person
(QS_MATCH_PERSON_*) in Table A-1 on page 278 are also derived from
the CONFIGELEMENT table in MDM when using the Matching Critical Data
Rules UI as shown in 2.5, “Configuration screens in the MDM Server UI” on
page 19.
Appendix B. Standard Interface File details
This appendix provides an overview of the Record Type/Sub Type (RT/ST)
mapping of the Standard Interface File (SIF).
The SIF has 23 RT/ST combinations (Table B-1 on page 297 through Table B-23
on page 308) to populate party and contact information in the MDM data
repository, each with specific fields that almost mirror corresponding columns
and tables in the MDM data repository model. It includes RT/ST combinations to
define hierarchies (HH, HN, HR and HU).
You must map the key data columns in your source systems to the corresponding
columns in the appropriate SIF RT/ST records before the data can be loaded into
the MDM repository using RDP for MDM jobs.
Note: SIF supports both inserts to and updates of records in the MDM
repository, but not delete operations. In this book, we cover both inserts (for
initial load) and updates to perform delta processing.
To map the columns in your source systems to the SIF, you must know the data
type of each column in the RT/ST; this information is defined in the RT/ST
templates that are provided as part of the RDP for MDM solution. Table B-1 on
page 297 through Table B-23 on page 308 do not contain the data type
information.
When a value in the column of an RT/ST record can be empty (as indicated by the
letter Y in the “Can be empty?” column in Table B-1 on page 297), you can define
the action to be taken on the value in the corresponding column of the MDM data
repository when no value is supplied in that RT/ST column, as described
next.
Set the null indicator for that column in the RT/ST to 1 or 0. The Mapping Rule
specifies the action to be taken on the value in the corresponding column of the
MDM data repository. The null indicator columns (names beginning with
“NULL_”) and their corresponding Mapping Rule are shown in Table B-1 on
page 297 through Table B-23 on page 308. For example, the
NULL_PREF_LANG_TP_CD column in Table B-1 on page 297 corresponds to
the PREF_LANG_TP_CD column (which can be empty), and the Mapping Rule for
it specifies that the following action be taken:
If 1 then set to null, if 0 and column is empty use prior value, if 0
and column is not empty overwrite prior value
This action specifies the following information:
• Setting a 1 in the NULL_PREF_LANG_TP_CD column in the PP RT/ST SIF
record indicates that you want the value in the corresponding column in the
MDM repository to be set to NULL.
• Setting to 0 with a null in the PREF_LANG_TP_CD column indicates that the
corresponding column in the MDM data repository should retain its prior
value. This applies to the case of an update operation.
• Setting to 0 with a non-null value in the PREF_LANG_TP_CD column
indicates that the corresponding column in the MDM data repository should
be overwritten with the value in the PREF_LANG_TP_CD column. This
setting is applicable in an update operation.
Note: If the PREF_LANG_TP_CD column has a value, the null indicator
setting does not apply.
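The Mapping Rule and the note above can be summarized in a short Python sketch. It is an illustration only, not the logic of the RDP for MDM jobs: the function name resolve_column_value is hypothetical, and the precedence (a supplied value wins over the null indicator) is our reading of the note.

```python
def resolve_column_value(null_indicator, sif_value, prior_value):
    """Return the value an MDM column would hold after an update.

    null_indicator: 1 or 0, the NULL_<column> field of the RT/ST record.
    sif_value:      the value supplied in the RT/ST column ("" or None if empty).
    prior_value:    the value currently stored in the MDM repository.

    Assumption: a supplied value takes precedence over the null indicator,
    following the note that the indicator does not apply in that case.
    """
    supplied = sif_value is not None and sif_value != ""
    if supplied:
        # Column is not empty: overwrite the prior value.
        return sif_value
    if null_indicator == 1:
        # Column is empty and the indicator is 1: set to null.
        return None
    # Column is empty and the indicator is 0: keep the prior value.
    return prior_value


if __name__ == "__main__":
    # Hypothetical PREF_LANG_TP_CD values, for illustration only.
    print(resolve_column_value(1, "", "100"))
    print(resolve_column_value(0, "", "100"))
    print(resolve_column_value(0, "200", "100"))
```

Returning None for the null case keeps "set to NULL" distinct from "keep the prior value", which matters only for update operations.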
Table B-1 on page 297 through Table B-23 on page 308 provide a high-level
overview of the individual columns and mapping rules for each of the 23 RT/ST
combinations.
Table B-1 Contact information RT/ST is PP
Column name
Can be
empty?
Mapping rule
Validate
to table
RECTYPE
SUBTYPE
ADMIN_SYS_TP_CD
ADMIN_CLIENT_ID
LOAD_TYPE
FORCE_MATCH
CONTEQUIV_DESCRIPTION
ACCE_COMP_TP_CD
PREF_LANG_TP_CD
CONTACT_NAME
SOLICIT_IND
CONFIDENTIAL_IND
CLIENT_IMP_TP_CD
CLIENT_ST_TP_CD
CLIENT_POTEN_TP_CD
RPTING_FREQ_TP_CD
LAST_STATEMENT_DT
ALERT_IND
PRVBY_ADMIN_SYS_TP_CD
PRVBY_ADMIN_CLIENT_ID
DO_NOT_DELETE_IND
SOURCE_IDENT_TP_CD
LAST_USED_DT
LAST_VERIFIED_DT
SINCE_DT
LEFT_DT
ACCESS_TOKEN_VALUE
ORG_TP_CD
INDUSTRY_TP_CD
ESTABLISHED_DT
BUY_SELL_AGR_TP_CD
PROFIT_IND
MARITAL_ST_TP_CD
BIRTHPLACE_TP_CD
CITIZENSHIP_TP_CD
HIGHEST_EDU_TP_CD
AGE_VER_DOC_TP_CD
GENDER_TP_CODE
BIRTH_DT
DECEASED_DT
CHILDREN_CT
DISAB_START_DT
DISAB_END_DT
USER_IND
N
N
N
N
Y
N
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
"P"
"P" or "O" (Cannot be updated)
NULL_DESCRIPTION
NULL_ACCE_COMP_TP_CD
NULL_PREF_LANG_TP_CD
NULL_CONTACT_NAME
NULL_SOLICIT_IND
NULL_CONFIDENTIAL_IND
NULL_CLIENT_IMP_TP_CD
NULL_CLIENT_ST_TP_CD
NULL_CLIENT_POTEN_TP_CD
NULL_RPTING_FREQ_TP_CD
NULL_LAST_STATEMENT_DT
NULL_ALERT_IND
NULL_PROVIDED_BY_CONT
NULL_DO_NOT_DELETE_IND
NULL_SOURCE_IDENT_TP_CD
NULL_LAST_USED_DT
NULL_LAST_VERIFIED_DT
NULL_SINCE_DT
NULL_LEFT_DT
NULL_ACCESS_TOKEN_VALUE
NULL_INDUSTRY_TP_CD
NULL_ESTABLISHED_DT
NULL_BUY_SELL_AGR_TP_CD
NULL_PROFIT_IND
NULL_MARITAL_ST_TP_CD
NULL_BIRTHPLACE_TP_CD
NULL_CITIZENSHIP_TP_CD
NULL_HIGHEST_EDU_TP_CD
NULL_AGE_VER_DOC_TP_CD
NULL_GENDER_TP_CODE
NULL_BIRTH_DT
NULL_DECEASED_DT
NULL_CHILDREN_CT
NULL_DISAB_START_DT
NULL_DISAB_END_DT
NULL_USER_IND
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
CDADMINSYSTP
“U” update, “A” add, “empty” either add or update as applicable
"Y" or "N"
CDACCETOCOMPTP
CDLANGTP
CDCLIENTIMPTP
CDCLIENTSTTP
CDCLIENTPOTENTP
CDRPTINGFREQTP
CDADMINSYSTP
CDSOURCEIDENTTP
MUST BE EMPTY if SUBTYPE = "P", REQUIRED FOR SUBTYPE = "O"
MUST BE EMPTY if SUBTYPE = "P"
MUST BE EMPTY if SUBTYPE = "P"
MUST BE EMPTY if SUBTYPE = "P"
MUST BE EMPTY if SUBTYPE = "P"
MUST BE EMPTY if SUBTYPE = "O"
MUST BE EMPTY if SUBTYPE = "O"
MUST BE EMPTY if SUBTYPE = "O"
MUST BE EMPTY if SUBTYPE = "O"
MUST BE EMPTY if SUBTYPE = "O"
MUST BE EMPTY if SUBTYPE = "O"
MUST BE EMPTY if SUBTYPE = "O"
MUST BE EMPTY if SUBTYPE = "O"
MUST BE EMPTY if SUBTYPE = "O"
MUST BE EMPTY if SUBTYPE = "O"
MUST BE EMPTY if SUBTYPE = "O"
MUST BE EMPTY if SUBTYPE = "O"
CDORGTP
CDINDUSTRYTP
CDBUYSELLAGREETP
CDMARITALSTTP
CDCOUNTRYTP
CDCOUNTRYTP
CDHIGHESTEDUTP
CDAGEVERDOCTP
not validated
Table B-2 OrgName information RT/ST is PG
Column name
Can be
empty?
Mapping rule
Validate
to table
RECTYPE
SUBTYPE
ADMIN_SYS_TP_CD
ADMIN_CLIENT_ID
LOAD_TYPE
ORG_NAME_TP_CD
ORG_NAME
S_ORG_NAME
START_DT
END_DT
LAST_USED_DT
LAST_VERIFIED_DT
SOURCE_IDENT_TP_CD
P_ORG_NAME
N
N
N
N
Y
N
N
Y
Y
Y
Y
Y
Y
Y
"P"
"G"
NULL_S_ORG_NAME
NULL_END_DT
NULL_LAST_USED_DT
NULL_LAST_VERIFIED_DT
NULL_SOURCE_IDENT_TP_CD
N
N
N
N
N
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
Not required.
“U” update, “A” add, “empty” either add or update as applicable
CDORGNAMETP
Use Processing Date if not supplied.
CDSOURCEIDENTTP
Table B-3 Person Name / Person Search information RT/ST is PH
Column name
Can be
empty?
Mapping rule
RECTYPE
SUBTYPE
ADMIN_SYS_TP_CD
ADMIN_CLIENT_ID
LOAD_TYPE
PREFIX_NAME_TP_CD
PREFIX_DESC
NAME_USAGE_TP_CD
FREE_FORM_NAME
GIVEN_NAME_ONE
GIVEN_NAME_TWO
GIVEN_NAME_THREE
GIVEN_NAME_FOUR
LAST_NAME
GENERATION_TP_CD
SUFFIX_DESC
START_DT
END_DT
USE_STANDARD_IND
LAST_USED_DT
LAST_VERIFIED_DT
SOURCE_IDENT_TP_CD
P_LAST_NAME
P_GIVEN_NAME_ONE
P_GIVEN_NAME_TWO
P_GIVEN_NAME_THREE
P_GIVEN_NAME_FOUR
GIVEN_NAME_ONE_SEARCH
GIVEN_NAME_TWO_SEARCH
GIVEN_NAME_THREE_SEARCH
GIVEN_NAME_FOUR_SEARCH
LAST_NAME_SEARCH
N
N
N
N
Y
Y
Y
N
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
"P"
"H"
NULL_PREFIX_NAME_TP_CD
NULL_PREFIX_DESC
NULL_GIVEN_NAME_ONE
NULL_GIVEN_NAME_TWO
NULL_GIVEN_NAME_THREE
NULL_GIVEN_NAME_FOUR
NULL_GENERATION_TP_CD
NULL_SUFFIX_DESC
NULL_END_DT
NULL_USE_STANDARD_IND
NULL_LAST_USED_DT
NULL_LAST_VERIFIED_DT
NULL_SOURCE_IDENT_TP_CD
N
N
N
N
N
N
N
N
N
N
N
N
N
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
Validate
to table
Not required.
“U” update, “A” add, “empty” either add or update as applicable
CDPREFIXNAMETP
CDNAMEUSAGETP
Must be supplied if LAST_NAME is empty. Must be empty if GIVEN_NAME or LAST_NAME present.
Must be empty if FREE_FORM_NAME supplied
CDGENERATIONTP
Use Processing Date if not supplied.
CDSOURCEIDENTTP
Table B-4 External Match RT/ST is PE
Column name
Can be
empty?
Mapping rule
RECTYPE
SUBTYPE
ADMIN_SYS_TP_CD
ADMIN_CLIENT_ID
LOAD_TYPE
DESCRIPTION
LINKTO_ADMIN_SYS_TP_CD
LINKTO_ADMIN_CLIENT_ID
N
N
N
N
Y
Y
N
N
"P"
"E"
Validate
to table
CDADMINSYSTP
“U” update, “A” add, “empty” either add or update as applicable
Not required.
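To show how the "Can be empty?" and Mapping rule columns of Table B-4 might be applied before a record is written to the SIF, here is a minimal Python sketch. It assumes a PE record is held as a dictionary keyed by column name; the function name validate_pe_record and the sample values are hypothetical, and the check against the CDADMINSYSTP code table is not modeled.

```python
# (column name, can be empty?) in the order listed in Table B-4.
PE_COLUMNS = (
    ("RECTYPE", False),
    ("SUBTYPE", False),
    ("ADMIN_SYS_TP_CD", False),
    ("ADMIN_CLIENT_ID", False),
    ("LOAD_TYPE", True),
    ("DESCRIPTION", True),
    ("LINKTO_ADMIN_SYS_TP_CD", False),
    ("LINKTO_ADMIN_CLIENT_ID", False),
)
FIXED_VALUES = {"RECTYPE": "P", "SUBTYPE": "E"}
LOAD_TYPES = {"", "U", "A"}  # update, add, or empty (add or update as applicable)


def validate_pe_record(record):
    """Return a list of problems found in a PE record (empty list if none)."""
    errors = []
    for name, can_be_empty in PE_COLUMNS:
        value = record.get(name) or ""
        if not value and not can_be_empty:
            errors.append(name + " must not be empty")
        elif name in FIXED_VALUES and value != FIXED_VALUES[name]:
            errors.append(name + " must be " + FIXED_VALUES[name])
    if (record.get("LOAD_TYPE") or "") not in LOAD_TYPES:
        errors.append("LOAD_TYPE must be U, A, or empty")
    return errors


if __name__ == "__main__":
    # Sample values are made up for illustration.
    record = {
        "RECTYPE": "P",
        "SUBTYPE": "E",
        "ADMIN_SYS_TP_CD": "1",
        "ADMIN_CLIENT_ID": "CUST-001",
        "LOAD_TYPE": "A",
        "DESCRIPTION": "",
        "LINKTO_ADMIN_SYS_TP_CD": "2",
        "LINKTO_ADMIN_CLIENT_ID": "CUST-777",
    }
    print(validate_pe_record(record))
```

The same pattern could be repeated for the other RT/ST combinations by swapping in the column list of the relevant table.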
Table B-5 Location_Group_Address_Group Address RT/ST is PA
Column name
Can be
empty?
Mapping rule
RECTYPE
SUBTYPE
ADMIN_SYS_TP_CD
ADMIN_CLIENT_ID
LOAD_TYPE
UNDEL_REASON_TP_CD
MEMBER_IND
PREFERRED_IND
SOLICIT_IND
EFFECT_START_MMDD
EFFECT_END_MMDD
EFFECT_START_TM
EFFECT_END_TM
START_DT
END_DT
LAST_USED_DT
LAST_VERIFIED_DT
SOURCE_IDENT_TP_CD
CARE_OF_DESC
ADDR_USAGE_TP_CD
COUNTRY_TP_CD
RESIDENCE_TP_CD
PROV_STATE_TP_CD
ADDR_LINE_ONE
ADDR_LINE_TWO
ADDR_LINE_THREE
CITY_NAME
POSTAL_CODE
ADDR_STANDARD_IND
OVERRIDE_IND
RESIDENCE_NUM
COUNTY_CODE
LATITUDE_DEGREES
LONGITUDE_DEGREES
POSTAL_BARCODE
P_CITY
BUILDING_NAME
STREET_NUMBER
STREET_NAME
P_STREET_NAME
STREET_SUFFIX
PRE_DIRECTIONAL
POST_DIRECTIONAL
BOX_DESIGNATOR
BOX_ID
STN_INFO
STN_ID
REGION
DEL_DESIGNATOR
DEL_ID
DEL_INFO
N
N
N
N
Y
Y
Y
N
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
"P"
"A"
Validate
to table
Not required.
“U” update, “A” add, “empty” either add or update as applicable
CDUNDELREASONTP
Use Processing Date if not supplied.
CDSOURCEIDENTTP
CDADDRUSAGETP
CDCOUNTRYTP
CDRESIDENCETP
CDPROVSTATETP
NULL_UNDEL_REASON_TP_CD
NULL_MEMBER_IND
NULL_PREFERRED_IND
NULL_SOLICIT_IND
NULL_EFFECT_START_MMDD
NULL_EFFECT_END_MMDD
NULL_EFFECT_START_TM
NULL_EFFECT_END_TM
NULL_END_DT
NULL_LAST_USED_DT
NULL_LAST_VERIFIED_DT
NULL_SOURCE_IDENT_TP_CD
NULL_CARE_OF_DESC
NULL_CONTRY_TP_CD
NULL_RESIDENCE_TP_CD
NULL_PROV_STATE_TP_CD
NULL_ADDR_LINE_TWO
NULL_ADDR_LINE_THREE
NULL_POSTAL_CODE
NULL_ADDR_STANDARD_IND
NULL_OVERRIDE_IND
NULL_RESIDENCE_NUM
NULL_COUNTY_CODE
NULL_LATITUDE_DEGREES
NULL_LONGITUDE_DEGREES
NULL_POSTAL_BARCODE
NULL_BUILDING_NAME
NULL_STREET_NUMBER
NULL_STREET_NAME
NULL_STREET_SUFFIX
NULL_PRE_DIRECTIONAL
NULL_POST_DIRECTIONAL
NULL_BOX_DESIGNATOR
NULL_BOX_ID
NULL_STN_INFO
NULL_STN_ID
NULL_REGION
NULL_DEL_DESIGNATOR
NULL_DEL_ID
NULL_DEL_INFO
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
Table B-6 LocationGroup_ContactMethodGroup_ContactMethod RT/ST is PC
Column name
Can be
empty?
Mapping rule
RECTYPE
SUBTYPE
ADMIN_SYS_TP_CD
ADMIN_CLIENT_ID
LOAD_TYPE
UNDEL_REASON_TP_CD
MEMBER_IND
PREFERRED_IND
SOLICIT_IND
EFFECT_START_MMDD
EFFECT_END_MMDD
EFFECT_START_TM
EFFECT_END_TM
START_DT
END_DT
LAST_USED_DT
LAST_VERIFIED_DT
SOURCE_IDENT_TP_CD
CONT_METH_TP_CD
METHOD_ST_TP_CD
ATTACH_ALLOW_IND
TEXT_ONLY_IND
MESSAGE_SIZE
COMMENT_DESC
REF_NUM
CONT_METH_STD_IND
COUNTRY_CODE
AREA_CODE
EXCHANGE
PH_NUMBER
EXTENSION
N
N
N
N
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
N
Y
Y
Y
Y
Y
N
Y
Y
Y
Y
Y
Y
"P"
"C"
Validate
to table
Not required.
“U” update, “A” add, “empty” either add or update as applicable
CDUNDELREASONTP
Use Processing Date if not supplied.
CDSOURCEIDENTTP
CDCONTMETHTP
CDMETHODSTATUSTP
S
NULL_UNDEL_REASON_TP_CD
NULL_MEMBER_IND
NULL_PREFERRED_IND
NULL_SOLICIT_IND
NULL_EFFECT_START_MMDD
NULL_EFFECT_END_MMDD
NULL_EFFECT_START_TM
NULL_EFFECT_END_TM
NULL_END_DT
NULL_LAST_USED_DT
NULL_LAST_VERIFIED_DT
NULL_SOURCE_IDENT_TP_CD
NULL_METHOD_ST_TP_CD
NULL_ATTACH_ALLOW_IND
NULL_TEXT_ONLY_IND
NULL_MESSAGE_SIZE
NULL_COMMENT_DESC
NULL_CONT_METH_STD_IND
NULL_COUNTRY_CODE
NULL_AREA_CODE
NULL_EXCHANGE
NULL_PH_NUMBER
NULL_EXTENSION
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
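The NULL_ indicator rule repeated throughout these tables can be expressed as a small function. This is a minimal sketch and not part of the RDP assets: the function name and the use of the strings "1" and "0" for the indicator are assumptions made for illustration.

```python
from typing import Optional


def resolve_nullable(null_flag: str, incoming: Optional[str], prior: Optional[str]) -> Optional[str]:
    """Apply the NULL_<column> rule to a single column of an update record.

    null_flag is the value of the NULL_<column> indicator (assumed "1" or "0"),
    incoming is the value supplied in the file, prior is the value already stored.
    """
    if null_flag == "1":
        return None      # indicator set: the stored value is set to null
    if incoming is None or incoming == "":
        return prior     # indicator 0, column empty: keep the prior value
    return incoming      # indicator 0, column populated: overwrite the prior value
```

For example, an END_DT that is left empty with NULL_END_DT set to 0 keeps the date that is already stored, while NULL_END_DT set to 1 clears it regardless of what the END_DT column holds.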
Table B-7 Identifier RT/ST is PI
Column name
Can be
empty?
Mapping rule
Validate
to table
RECTYPE
SUBTYPE
ADMIN_SYS_TP_CD
ADMIN_CLIENT_ID
LOAD_TYPE
ID_TP_CD
ID_STATUS_TP_CD
REF_NUM
START_DT
END_DT
EXPIRY_DT
ASSIGNEDBY_ADMIN_SYS_TP_CD
ASSIGNEDBY_ADMIN_CLIENT_ID
IDENTIFIER_DESC
ISSUE_LOCATION
LAST_USED_DT
LAST_VERIFIED_DT
SOURCE_IDENT_TP_CD
N
N
N
N
Y
N
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
"P"
"I"
NULL_ID_STATUS_TP_CD
NULL_REF_NUM
N
N
NULL_END_DT
NULL_EXPIRY_DT
NULL_ASSIGNED_BY
NULL_IDENTIFIER_DESC
NULL_ISSUE_LOCATION
NULL_LAST_USED_DT
NULL_LAST_VERIFIED_DT
NULL_SOURCE_IDENT_TP_CD
N
N
N
N
N
N
N
N
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
ref_num can only be null for 1 identifier status type. If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not
empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
#########################################################################################################
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
Not required.
“U” update, “A” add, “empty” either add or update as applicable
CDIDTP
CDIDSTATUSTP
Use Processing Date if not supplied.
CDADMINSYSTP
CDSOURCEIDENTTP
Table B-8 LobRel RT/ST is PB
Column name | Can be empty? | Mapping rule | Validate to table
RECTYPE | N | "P" |
SUBTYPE | N | "B" |
ADMIN_SYS_TP_CD | N | | Not required.
ADMIN_CLIENT_ID | N | |
LOAD_TYPE | Y | “U” update, “A” add, “empty” either add or update as applicable |
ENTITY_NAME | N | "CONTACT" |
LOB_TP_CD | N | | CDLOBTP
LOB_REL_TP_CD | N | | CDLOBRELTP
START_DT | Y | Use Processing Date if not supplied. |
END_DT | Y | |
NULL_END_DT | N | If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value |
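The LOAD_TYPE rule that appears in most of these tables can be sketched as follows. The function name and the `exists` flag (whether a matching record is already present in the target) are hypothetical. How a conflicting request is handled, such as "A" for a record that already exists, is not stated here, so the sketch simply returns the requested action.

```python
def resolve_load_action(load_type: str, exists: bool) -> str:
    """Return "add" or "update" for one record, following the LOAD_TYPE rule."""
    code = (load_type or "").strip()
    if code == "A":
        return "add"
    if code == "U":
        return "update"
    if code == "":
        # Empty LOAD_TYPE: add or update, whichever is applicable.
        return "update" if exists else "add"
    raise ValueError(f"Unsupported LOAD_TYPE: {load_type!r}")
```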
Table B-9 ContactRel RT/ST is PR
Column name
Can be
empty?
Mapping rule
Validate
to table
RECTYPE
SUBTYPE
ADMIN_SYS_TP_CD_TO
ADMIN_CLIENT_ID_TO
ADMIN_SYS_TP_CD_FROM
ADMIN_CLIENT_ID_FROM
LOAD_TYPE
REL_TP_CD
REL_DESC
START_DT
END_DT
REL_ASSIGN_TP_CD
END_REASON_TP_CD
N
N
N
N
N
N
Y
N
Y
Y
Y
Y
Y
"P"
"R"
NULL_REL_DESC
NULL_END_DT
NULL_REL_ASSIGN_TP_CD
NULL_END_REASON_TP_CD
N
N
N
N
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
Not required.
Not required.
TO and FROM SSKs cannot be the same.
“U” update, “A” add, “empty” either add or update as applicable
CDRELTP
Use Processing Date if not supplied.
CDRELASSIGNTP
CDENDREASONTP
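The ContactRel rule that the TO and FROM SSKs cannot be the same lends itself to a simple pre-load check. The sketch below assumes that an SSK is the pair of ADMIN_SYS_TP_CD and ADMIN_CLIENT_ID on each side of the relationship. That reading, the function name, and the dictionary record layout are illustrative assumptions.

```python
from typing import Dict, List


def check_contact_rel(rec: Dict[str, str]) -> List[str]:
    """Return validation errors for one ContactRel (RT/ST "PR") record."""
    errors: List[str] = []
    # Assumed definition of the source system key (SSK): system code plus client ID.
    to_ssk = (rec.get("ADMIN_SYS_TP_CD_TO"), rec.get("ADMIN_CLIENT_ID_TO"))
    from_ssk = (rec.get("ADMIN_SYS_TP_CD_FROM"), rec.get("ADMIN_CLIENT_ID_FROM"))
    if to_ssk == from_ssk:
        errors.append("TO and FROM SSKs cannot be the same")
    return errors
```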
Table B-10 Contract RT/ST is CH
Column name
Can be
empty?
Mapping rule
RECTYPE
SUBTYPE
ADMIN_SYS_TP_CD
ADMIN_CONTRACT_ID
LOAD_TYPE
CONTR_LANG_TP_CD
CURRENCY_TP_CD
FREQ_MODE_TP_CD
BILL_TP_CD
PREMIUM_AMT
NEXT_BILL_DT
CURR_CASH_VAL_AMT
LINE_OF_BUSINESS
BRAND_NAME
SERVICE_ORG_NAME
BUS_ORGUNIT_ID
SERVICE_PROV_ID
REPLBY_ADMIN_SYS_TP_CD
REPLBY_ADMIN_CONTRACT_ID
ISSUE_LOCATION
PREMAMT_CUR_TP
CASHVAL_CUR_TP
ACCESS_TOKEN_VALUE
MANAGED_ACCOUNT_IND
AGREEMENT_NAME
AGREEMENT_NICKNAME
SIGNED_DT
EXECUTED_DT
END_DT
ACCOUNT_LAST_TRANSACTION_DT
TERMINATION_DT
TERMINATION_REASON_TP_CD
AGREEMENT_DESCRIPTION
AGREEMENT_ST_TP_CD
AGREEMENT_TP_CD
SERVICE_LEVEL_TP_CD
LAST_VERIFIED_DT
LAST_REVIEWED_DT
PRODUCT_ID
CLUSTER_KEY
N
N
N
N
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
"C"
"H"
Validate
to table
CDADMINSYSTP
“U” update, “A” add, “empty” either add or update as applicable
CDLANGTP
CDCURRENCYTP
CDFREQMODETP
CDBILLTP
Required if Replaced by contract ID present.
CDADMINSYSTP
CDCURRENCYTP
CDCURRENCYTP
IMPORTANT: Leave null unless advised by MDM Server expert.
CDTERMINATIONREASONTP
CDAGREEMENTSTTP
CDAGREEMENTTP
CDSERVICELEVELTP
NOT USED
Column name
Can be
empty?
Mapping rule
Validate
to table
NULL_CONTR_LANG_TP_CD
NULL_CURRENCY_TP_CD
NULL_FREQ_MODE_TP_CD
NULL_BILL_TP_CD
NULL_PREMIUM_AMT
NULL_NEXT_BILL_DT
NULL_CURR_CASH_VAL_AMT
NULL_LINE_OF_BUSINESS
NULL_BRAND_NAME
NULL_SERVICE_ORG_NAME
NULL_BUS_ORGUNIT_ID
NULL_SERVICE_PROV_ID
NULL_REPL_BY_CONTRACT
NULL_ISSUE_LOCATION
NULL_PREMAMT_CUR_TP
NULL_CASHVAL_CUR_TP
NULL_ACCESS_TOKEN_VALUE
NULL_MANAGED_ACCOUNT_IND
NULL_AGREEMENT_NAME
NULL_AGREEMENT_NICKNAME
NULL_SIGNED_DT
NULL_EXECUTED_DT
NULL_END_DT
NULL_REPLACES_CONTRACT
NULL_ACCOUNT_LAST_TRANSACTION_DT
NULL_TERMINATION_DT
NULL_TERMINATION_REASON_TP_CD
NULL_AGREEMENT_DESCRIPTION
NULL_AGREEMENT_ST_TP_CD
NULL_AGREEMENT_TP_CD
NULL_SERVICE_LEVEL_TP_CD
NULL_LAST_VERIFIED_DT
NULL_LAST_REVIEWED_DT
NULL_PRODUCT_ID
NULL_CLUSTER_KEY
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
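Table B-10 contains one conditional requirement. From the order of the flattened cells, it appears to mean that REPLBY_ADMIN_SYS_TP_CD must be supplied whenever REPLBY_ADMIN_CONTRACT_ID is present. The check below is a sketch under that reading; the function name and the dictionary record layout are hypothetical.

```python
from typing import Dict, List


def check_replaced_by(rec: Dict[str, str]) -> List[str]:
    """Return validation errors for the replaced-by columns of a Contract (RT/ST "CH") record."""
    errors: List[str] = []
    has_contract_id = bool(rec.get("REPLBY_ADMIN_CONTRACT_ID"))
    has_system_code = bool(rec.get("REPLBY_ADMIN_SYS_TP_CD"))
    if has_contract_id and not has_system_code:
        errors.append(
            "REPLBY_ADMIN_SYS_TP_CD is required when REPLBY_ADMIN_CONTRACT_ID is present"
        )
    return errors
```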
Table B-11 Contract RT/ST is CK
Column name | Can be empty? | Mapping rule | Validate to table
RECTYPE | N | "C" |
SUBTYPE | N | "K" |
ADMIN_FLD_NM_TP_CD | N | | CDADMINFLDNMTP
ADMIN_CONTRACT_ID | N | |
LINKTO_ADMIN_FLD_NM_TP_CD | N | | CDADMINFLDNMTP
LINKTO_ADMIN_CONTRACT_ID | N | |
CONTRACT_COMP_IND | Y | ANY VALUE INPUT WILL BE OVERRIDDEN TO "N" |
Table B-12 Contract Component RT/ST is CC
Column name
Can be
empty?
Mapping rule
RECTYPE
SUBTYPE
ADMIN_SYS_TP_CD
ADMIN_CONTRACT_ID
LOAD_TYPE
PROD_TP_CD
CONTRACT_ST_TP_CD
CURR_CASH_VAL_AMT
PREMIUM_AMT
ISSUE_DT
VIATICAL_IND
BASE_IND
CONTR_COMP_TP_CD
SERV_ARRANGE_TP_CD
EXPIRY_DT
PREMAMT_CUR_TP
CASHVAL_CUR_TP
CLUSTER_KEY
N
N
N
N
Y
N
N
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
"C"
"C"
Validate
to table
Not required.
“U” update, “A” add, “empty” either add or update as applicable
CDPRODTP
CDCONTRACTSTTP
CDCONTRCOMPTP
CDARRANGEMENTTP
CDCURRENCYTP
CDCURRENCYTP
Column name
Can be
empty?
Mapping rule
Validate
to table
NULL_CURR_CASH_VAL_AMT
NULL_PREMIUM_AMT
NULL_ISSUE_DT
NULL_VIATICAL_IND
NULL_BASE_IND
NULL_SERV_ARRANGE_TP_CD
NULL_EXPIRY_DT
NULL_PREMAMT_CUR_TP
NULL_CASHVAL_CUR_TP
NULL_CLUSTER_KEY
N
N
N
N
N
N
N
N
N
N
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
Table B-13 Contract Role RT/ST is CR
Column name
Can be
empty?
Mapping rule
Validate
to table
RECTYPE
SUBTYPE
ADMIN_SYS_TP_CD
ADMIN_CONTRACT_ID
LOAD_TYPE
ADMIN_CLIENT_SYS_TP_CD
ADMIN_CLIENT_ID
CONTR_COMP_TP_CD
PROD_TP_CD
CONTR_ROLE_TP_CD
REGISTERED_NAME
DISTRIB_PCT
IRREVOC_IND
START_DT
END_DT
RECORDED_START_DT
RECORDED_END_DT
SHARE_DIST_TP_CD
ARRANGEMENT_TP_CD
ARRANGEMENT_DESC
END_REASON_TP_CD
N
N
N
N
Y
N
N
Y
N
N
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
"C"
"R"
NULL_REGISTERED_NAME
NULL_DISTRIB_PCT
NULL_IRREVOC_IND
NULL_END_DT
NULL_RECORDED_START_DT
NULL_RECORDED_END_DT
NULL_SHARE_DIST_TP_CD
NULL_ARRANGEMENT_TP_CD
NULL_ARRANGEMENT_DESC
NULL_END_REASON_TP_CD
N
N
N
N
N
N
N
N
N
N
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
Not required.
“U” update, “A” add, “empty” either add or update as applicable
Not required.
CDCONTRCOMPTP
CDPRODTP
CDCONTRACTROLETP
Use Processing Date if not supplied.
CDSHAREDISTTP
CDARRANGEMENTTP
CDENDREASONTP
Table B-14 Role Location RT/ST is CL
Column name
Can be
empty?
Mapping rule
RECTYPE
SUBTYPE
ADMIN_SYS_TP_CD
ADMIN_CONTRACT_ID
LOAD_TYPE
ADMIN_CLIENT_SYS_TP_CD
ADMIN_CLIENT_ID
CONTR_COMP_TP_CD
PROD_TP_CD
CONTR_ROLE_TP_CD
ADDR_USAGE_TP_CD
START_DT
END_DT
UNDEL_REASON_TP_CD
N
N
N
N
Y
N
N
Y
N
N
N
Y
"C"
"L"
NULL_END_DT
NULL_UNDEL_REASON_TP_CD
N
N
Validate
to table
Not required.
“U” update, “A” add, “empty” either add or update as applicable
Not required.
CDCONTRCOMPTP
CDPRODTP
CDCONTRACTROLETP
CDADDRUSAGETP
Use Processing Date if not supplied.
CDUNDELREASONTP
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
Table B-15 Role Location RT/ST is CL
Column name
Can be
empty?
Mapping rule
RECTYPE
SUBTYPE
ADMIN_SYS_TP_CD
ADMIN_CONTRACT_ID
LOAD_TYPE
ADMIN_CLIENT_SYS_TP_CD
ADMIN_CLIENT_ID
CONTR_COMP_TP_CD
PROD_TP_CD
CONTR_ROLE_TP_CD
ADDR_USAGE_TP_CD
START_DT
END_DT
UNDEL_REASON_TP_CD
N
N
N
N
Y
N
N
Y
N
N
N
Y
"C"
"L"
NULL_END_DT
NULL_UNDEL_REASON_TP_CD
N
N
Validate
to table
Not required.
“U” update, “A” add, “empty” either add or update as applicable
Not required.
CDCONTRCOMPTP
CDPRODTP
CDCONTRACTROLETP
CDADDRUSAGETP
Use Processing Date if not supplied.
CDUNDELREASONTP
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
Table B-16 ContractCompVal RT/ST is CV
Column name | Can be empty? | Mapping rule | Validate to table
RECTYPE | N | "C" |
SUBTYPE | N | "V" |
ADMIN_SYS_TP_CD | N | | Not required.
ADMIN_CONTRACT_ID | N | |
LOAD_TYPE | Y | “U” update, “A” add, “empty” either add or update as applicable |
CONTR_COMP_TP_CD | Y | | CDCONTRCOMPTP
PROD_TP_CD | N | | CDPRODTP
DOMAIN_VALUE_TP_CD | N | | CDDOMAINVALUETP
VALUE_STRING | N | |
START_DT | Y | Use Processing Date if not supplied. |
END_DT | Y | |
NULL_END_DT | N | If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value |
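The START_DT rule, "Use Processing Date if not supplied", amounts to a default applied before the record is loaded. The sketch below is illustrative only: the function name is hypothetical, and the ISO date string is an assumed format, not the format the Standard Interface File requires.

```python
from datetime import date
from typing import Dict


def apply_start_dt_default(rec: Dict[str, str], processing_date: date) -> Dict[str, str]:
    """Return a copy of rec with START_DT defaulted to the processing date when empty."""
    out = dict(rec)
    if not out.get("START_DT"):
        # Assumed date format for illustration; the real file format may differ.
        out["START_DT"] = processing_date.isoformat()
    return out
```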
Table B-17 MiscValue RT/ST is CM or PM
Column name
Can be
empty?
Mapping rule
RECTYPE
SUBTYPE
ADMIN_SYS_TP_CD
ADMIN_CLIENT_OR_CONTRACT_ID
LOAD_TYPE
MISCVALUE_TP_CD
VALUE_STRING
PRIORITY_TP_CD
SOURCE_IDENT_TP_CD
DESCRIPTION
START_DT
END_DT
VALUEATTR_TP_CD_0
ATTR0_VALUE
VALUEATTR_TP_CD_1
ATTR1_VALUE
VALUEATTR_TP_CD_2
ATTR2_VALUE
VALUEATTR_TP_CD_3
ATTR3_VALUE
VALUEATTR_TP_CD_4
ATTR4_VALUE
VALUEATTR_TP_CD_5
ATTR5_VALUE
VALUEATTR_TP_CD_6
ATTR6_VALUE
VALUEATTR_TP_CD_7
ATTR7_VALUE
VALUEATTR_TP_CD_8
ATTR8_VALUE
VALUEATTR_TP_CD_9
ATTR9_VALUE
N
N
N
N
Y
N
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
Y
"C" or "P"
"M"
Validate
to table
Not required.
“U” update, “A” add, “empty” either add or update as applicable
CDMISCVALUETP
CDPRIORITYTP
CDSOURCEIDENTTP
Use Processing Date if not supplied.
CDMISCVALUEATTRTP
CDMISCVALUEATTRTP
CDMISCVALUEATTRTP
CDMISCVALUEATTRTP
CDMISCVALUEATTRTP
CDMISCVALUEATTRTP
CDMISCVALUEATTRTP
CDMISCVALUEATTRTP
CDMISCVALUEATTRTP
CDMISCVALUEATTRTP
Column name
Can be
empty?
Mapping rule
Validate
to table
NULL_VALUE_STRING
NULL_PRIORITY_TP_CD
NULL_SOURCE_IDENT_TP_CD
NULL_DESCRIPTION
NULL_END_DT
NULL_VALUEATTR_TP_CD_0
NULL_ATTR0_VALUE
NULL_VALUEATTR_TP_CD_1
NULL_ATTR1_VALUE
NULL_VALUEATTR_TP_CD_2
NULL_ATTR2_VALUE
NULL_VALUEATTR_TP_CD_3
NULL_ATTR3_VALUE
NULL_VALUEATTR_TP_CD_4
NULL_ATTR4_VALUE
NULL_VALUEATTR_TP_CD_5
NULL_ATTR5_VALUE
NULL_VALUEATTR_TP_CD_6
NULL_ATTR6_VALUE
NULL_VALUEATTR_TP_CD_7
NULL_ATTR7_VALUE
NULL_VALUEATTR_TP_CD_8
NULL_ATTR8_VALUE
NULL_VALUEATTR_TP_CD_9
NULL_ATTR9_VALUE
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
N
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
Table B-18 PPrefEntity_PrivPref RT/ST is PS
Column name
Can be
empty?
Mapping rule
Validate
to table
RECTYPE
SUBTYPE
ADMIN_SYS_TP_CD
ADMIN_CLIENT_ID
LOAD_TYPE
PPREF_REASON_TP_CD
SOURCE_IDENT_TP_CD
VALUE_STRING
START_DT
END_DT
PPREF_TP_CD
PPREF_ACT_OPT_ID
N
N
N
N
Y
N
N
Y
Y
Y
N
Y
"P"
"S"
NULL_VALUE_STRING
NULL_END_DT
NULL_PPREF_ACT_OPT_ID
N
N
N
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
Not required.
“U” update, “A” add, “empty” either add or update as applicable
CDPPREFREASONTP
CDSOURCEIDENTTP
Use Processing Date if not supplied.
CDPPREFTP
PPREFACTIONOPT
Table B-19 Alert RT/ST is CT or PT
Column name
Can be
empty?
Mapping rule
RECTYPE
SUBTYPE
ADMIN_SYS_TP_CD
ADMIN_CONTRACT_OR_CLIENT_ID
LOAD_TYPE
REMOVED_BY_USER
CREATED_BY_USER
ALERT_TP_CD
ALERT_SEV_TP_CD
START_DT
END_DT
DESCRIPTION
N
N
N
N
Y
Y
Y
N
Y
Y
Y
Y
"C" or "P"
"T"
NULL_REMOVED_BY_USER
NULL_CREATED_BY_USER
NULL_ALERT_SEV_TP_CD
NULL_END_DT
NULL_DESCRIPTION
N
N
N
N
N
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
If 1 then set to null, if 0 and column is empty use prior value, if 0 and column is not empty overwrite prior value
Validate
to table
Not required.
“U” update, “A” add, “empty” either add or update as applicable
CDALERTTP
CDALERTSEVTP
Use Processing Date if not supplied.
Table B-20 Hierarchy RT/ST is HH
Column name
Can be
empty?
Mapping rule
Validate
to table
RECTYPE
SUBTYPE
LOAD_TYPE
NAME
HIERARCHY_TP_CD
DESCRIPTION
START_DT
END_DT
N
N
Y
N
N
"H"
"H"
“U” update, “A” add, “empty” either add or update as applicable
NULL_DESCRIPTION
NULL_END_DT
N
N
if 1 set null, if 0 use prior
if 1 set null, if 0 use prior
CDHIERARCHYTP
Table B-21 Hierarchy Node RT/ST is HN
Column name
Can be
empty?
Mapping rule
Validate
to table
RECTYPE
SUBTYPE
LOAD_TYPE
NAME
HIERARCHY_TP_CD
ADMIN_SYS_TP_CD
ADMIN_CLIENT_ID
ENTITY_NAME
DESCRIPTION
START_DT
END_DT
NODEDESIG_TP_CD
LOCALEDESCRIPTION
N
N
Y
N
N
N
N
"H"
"N"
“U” update, “A” add, “empty” either add or update as applicable
NULL_DESCRIPTION
NULL_END_DT
NULL_NODEDESIG_TP_CD
NULL_LOCALEDESCRIPTION
N
N
N
N
if 1 set null, if 0 use prior
if 1 set null, if 0 use prior
if 1 set null, if 0 use prior
if 1 set null, if 0 use prior
CDHIERARCHYTP
Table B-22 Hierarchy Rel RT/ST is HR
Column name
Can be
empty?
Mapping rule
Validate
to table
RECTYPE
SUBTYPE
LOAD_TYPE
NAME
HIERARCHY_TP_CD
ADMIN_SYS_TP_CD_PARENT
ADMIN_CLIENT_ID_PARENT
ADMIN_SYS_TP_CD_CHILD
ADMIN_CLIENT_ID_CHILD
DESCRIPTION
START_DT
END_DT
N
N
Y
N
N
N
N
N
N
"H"
"R"
“U” update, “A” add, “empty” either add or update as applicable
NULL_DESCRIPTION
NULL_END_DT
N
N
if 1 set null, if 0 use prior
if 1 set null, if 0 use prior
CDHIERARCHYTP
CDADMINSYSTP
CDADMINSYSTP
Table B-23 Hierarchy Ultimate Parent RT/ST is HU
Column name
Can be
empty?
Mapping rule
RECTYPE
SUBTYPE
LOAD_TYPE
NAME
HIERARCHY_TP_CD
ADMIN_SYS_TP_CD
ADMIN_CLIENT_ID
DESCRIPTION
START_DT
END_DT
N
N
Y
N
N
N
N
"H"
"U"
“U” update, “A” add, “empty” either add or update as applicable
NULL_DESCRIPTION
NULL_END_DT
N
N
if 1 set null, if 0 use prior
if 1 set null, if 0 use prior
Validate
to table
CDHIERARCHYTP
CDADMINSYSTP
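The RT/ST codes in the headings of Tables B-6 through B-23 can be gathered into a lookup that identifies a record from its RECTYPE and SUBTYPE values. This is a sketch for orientation only: the dictionary and function names are hypothetical, record types from earlier tables are not included, and the entry for "CL" follows the RECTYPE and SUBTYPE values shown in the Role Location tables.

```python
from typing import Dict, Tuple

# (RECTYPE, SUBTYPE) -> record name, taken from the table headings in this appendix.
RECORD_NAMES: Dict[Tuple[str, str], str] = {
    ("P", "C"): "LocationGroup_ContactMethodGroup_ContactMethod",
    ("P", "I"): "Identifier",
    ("P", "B"): "LobRel",
    ("P", "R"): "ContactRel",
    ("C", "H"): "Contract",
    ("C", "K"): "Contract",  # the heading of Table B-11 also reads "Contract"
    ("C", "C"): "Contract Component",
    ("C", "R"): "Contract Role",
    ("C", "L"): "Role Location",
    ("C", "V"): "ContractCompVal",
    ("C", "M"): "MiscValue",
    ("P", "M"): "MiscValue",
    ("P", "S"): "PPrefEntity_PrivPref",
    ("C", "T"): "Alert",
    ("P", "T"): "Alert",
    ("H", "H"): "Hierarchy",
    ("H", "N"): "Hierarchy Node",
    ("H", "R"): "Hierarchy Rel",
    ("H", "U"): "Hierarchy Ultimate Parent",
}


def record_name(rectype: str, subtype: str) -> str:
    """Look up the record name for a RECTYPE/SUBTYPE pair."""
    try:
        return RECORD_NAMES[(rectype, subtype)]
    except KeyError:
        raise ValueError(f"Unknown RT/ST combination: {rectype}{subtype}") from None
```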
Appendix C. MDM customization considerations
This appendix describes the extensions that are supported by Master Data
Management (MDM) Server and the impact of such extensions on the Rapid
Deployment Package (RDP) for MDM jobs.
C.1 Introduction
Because MDM Server source code is not accessible to clients, there are a
number of extension and configuration mechanisms available to adapt the
product to your environment. The Extension Framework¹ is one of these
mechanisms. It is tightly integrated with the kernel of the product.
The primary types of extensions are as follows:
 Data extensions and additions, which allow you to add new data elements
and create new business entities with a set of business services to maintain
them.
 Behavior extensions, which allow you to plug in new business rules or
functionality.
Note: MDM Server also comes with MDM Server Workbench, a development
tool to help with the creation of these data and behavior extensions. This
workbench is in the form of a plug-in to IBM Rational® Software Architect.
You may also create new transactions or services using the MDM Server
application framework. You can build transactions by constructing new controller
and business components, and using the existing Request Framework and
Common Components.
This appendix briefly describes the following extension information:
 Data extensions and additions
 Behavior extensions
 Impact of extensions on RDP for MDM
A brief overview of several considerations involved in extending RDP for MDM is
covered here as follows:
 Extending RDP for MDM
 Runtime Column Propagation
 Adding new elements (columns)
 Modifying existing elements (columns)
¹ MDM Server also uses its own extension framework to plug in certain modules, such as Rules of
Visibility, to keep it loosely coupled and easily configurable to turn on or off.
C.2 Data extensions and additions
MDM Server provides a mechanism for extending the data model. You can add
new attributes to existing tables and add new tables. Extended data elements
can be persisted and retrieved as part of existing MDM Server transactions
without the need to modify MDM Server code. MDM Server has the following
responsibilities when dealing with extended data:
 Parsing extended data as part of an XML service request and creating
extended business objects
 Invoking validation routines on the extended business objects
 Populating the extended data elements as part of the MDM Server meta-data
so that features such as external validation rules can be used
 Invoking methods on the extended business object when required to persist
or retrieve the extended data elements
 Constructing XML data as part of the service completion
C.3 Behavior extensions
MDM Server provides a mechanism for extending the behavior of the product in
an event-based way. The Pre/Post Transaction and Pre/Post Action points within
the product can be extended to provide additional functionality.
A transaction equates to a published service, or Controller Component
operation. An action equates to an operation on a business logic component.
There may be other predefined points that can be extended. They are
documented as part of the service specification. You can write extensions to
MDM Server behavior as Java code or in a rules engine language. Extensions
are organized into Extension Sets, which are similar to the rule sets within a rules
engine. Examples include generic prospective client rules or line of
business-specific rules like life insurance client rules. The Extension Controller is
the gateway from the core application to behavior extensions and is invoked at
the extension points listed above. It is provided with the following information:
 Data about the extension point that invoked it
 The transaction’s object hierarchy
 The action’s object hierarchy, in the case of an action extension point
 The transaction header that was provided in the original MDM Server request
The Extension Controller uses the parameters to determine if any Extension Sets
must be further evaluated. Relevant Extension Sets are then interrogated and
qualified extensions, either Java or rules sets, are invoked.
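The control flow described above can be illustrated with a small sketch. This is not the MDM Server API, which is Java-based. Every class, field, and function name below is hypothetical and serves only to show how a controller might qualify Extension Sets and then invoke their extensions.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class ExtensionContext:
    """The information handed to the controller at an extension point."""
    point: str                       # for example "pre_transaction" (illustrative label)
    transaction_objects: Any         # the transaction's object hierarchy
    action_objects: Optional[Any]    # the action's object hierarchy, if an action point
    header: Dict[str, str]           # transaction header from the original request


@dataclass
class ExtensionSet:
    """A named group of extensions with a condition that qualifies it."""
    name: str
    applies: Callable[[ExtensionContext], bool]
    extensions: List[Callable[[ExtensionContext], None]] = field(default_factory=list)


class ExtensionController:
    """Gateway from the core application to behavior extensions."""

    def __init__(self, extension_sets: List[ExtensionSet]) -> None:
        self.extension_sets = list(extension_sets)

    def invoke(self, ctx: ExtensionContext) -> List[str]:
        """Run every qualifying set's extensions; return the names of the sets that ran."""
        fired: List[str] = []
        for ext_set in self.extension_sets:
            if ext_set.applies(ctx):          # decide whether the set is relevant
                for extension in ext_set.extensions:
                    extension(ctx)            # invoke each qualified extension
                fired.append(ext_set.name)
        return fired
```

In this sketch, a line-of-business set such as life insurance client rules would carry an `applies` condition that inspects the transaction header, so its extensions run only for requests from that line of business.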
C.4 Impact of data/behavior extensions on RDP for MDM
The process of extending the MDM data model to support your organization's
specific master data requirements is beyond the scope of this IBM Redbooks
publication. However, this appendix provides considerations for extending
MDM Server and describes the corresponding impact on the RDP for MDM assets.
Because RDP for MDM loads directly into the MDM target tables, creating new
MDM Server services or behavior extensions will have no impact on RDP for
MDM. However, with extensions to the MDM data model, changes must be made
to both the MDM Server and the corresponding RDP for MDM assets.
MDM Server provides a code generation tool to allow clients to change existing
column attributes, add new columns to existing tables, and add new tables to
satisfy business requirements. The code generation tool also generates the web
services integration code for these data extensions.
Corresponding changes to the RDP for MDM assets depend on the type of data
model change, as summarized in Table C-1.
Table C-1 MDM extensions impact on RDP for MDM
Type of MDM extension: Data extensions and additions
Nature of MDM extension: Either:
 Add a new element to an existing SIF record when that element does not participate in some transformation or aggregation
 Modify an existing element's data type and precision/scale/length when that element does not participate in some transformation or aggregation
Impact on RDP for MDM: Change to the corresponding ImportSIF shared container (names starting with ILIS…) will propagate through to the target. For BulkLoad, no further changes are required. For Insert (Upsert), change to the corresponding DB shared container (names starting with ILDBIN…) is also required.

Type of MDM extension: Data extensions and additions
Nature of MDM extension: Either:
 Add a new element to an existing SIF record when that element participates in some transformation or aggregation
 Modify an existing element's data type and precision/scale/length when that element participates in some transformation or aggregation
Impact on RDP for MDM: Change to the corresponding ImportSIF shared container (names starting with ILIS…) will propagate through to the target. For BulkLoad, no further changes are required. For Insert (Upsert), change to the corresponding DB shared container (names starting with ILDBIN…) is also required. Search for dependent objects where the element is used, and examine transformation or validation logic to see if changes are necessary.

Type of MDM extension: Data extensions and additions
Nature of MDM extension: Add a new table (new SIF record)
Impact on RDP for MDM: Impact is beyond the scope of a typical RDP for MDM implementation.

Type of MDM extension: Behavior extensions
Impact on RDP for MDM: No impact

Type of MDM extension: New transaction or service
Impact on RDP for MDM: No impact
C.5 Extending RDP for MDM
In general, RDP for MDM jobs have been built using modular design techniques
and reusable shared containers. This allows changes to be made to source,
target, edit, and validation logic without changing the actual jobs themselves.
In this way, if the extensions are confined to existing shared containers,
upgrading core RDP jobs without losing client-specific customizations is
possible.
Shared containers are provided for the following jobs:
 ImportSIF format
 Database select (code tables), and target (upsert and bulk load methods)
 Edit points for pre-validation
 Validation and Standardization
 Match Processing
 Error Conditions
 ID Assignment
C.6 Runtime column propagation (RCP)
RCP is a feature of IBM InfoSphere Information Server that allows job designs to
accommodate additional columns beyond those defined by the DataStage or
QualityStage job developer.
Using RCP judiciously facilitates re-usable job designs based on input metadata,
rather than using a large number of jobs with hard-coded table definitions to
perform the same tasks. Furthermore, RCP facilitates re-use through parallel
shared containers.
By using RCP, only the columns explicitly referenced within the shared container
logic must be defined; the remaining columns pass through at run time, as long
as each stage in the shared container has RCP enabled on its stage Output
properties.
Before a DataStage developer can use RCP, it must be enabled at the project
level through the administrator client. RCP is then enabled or disabled on the
Output tab of each stage. When RCP is enabled, columns not explicitly defined
will be passed across the stage from input to output.
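The pass-through behavior can be pictured with a small Python analogy. This is not DataStage code and does not use any Information Server API; the function run_stage, the rcp_enabled flag, and the column NEW_ELEMENT are hypothetical names chosen only for illustration.

```python
def run_stage(row, derivations, rcp_enabled):
    """Apply one stage to one row (illustrative analogy only).

    row          -- dict of input column name -> value
    derivations  -- dict of output column name -> function(row), standing in
                    for the columns that the stage design defines explicitly
    rcp_enabled  -- when True, input columns the design does not mention
                    are passed through to the output unchanged
    """
    # Columns explicitly defined by the stage design.
    output = {name: fn(row) for name, fn in derivations.items()}
    if rcp_enabled:
        # Propagate every input column that the design did not define.
        for name, value in row.items():
            output.setdefault(name, value)
    return output


# Hypothetical input row carrying one column the design knows nothing about.
row = {"CONT_ID": 101, "CITY": "san jose", "NEW_ELEMENT": "x"}
design = {
    "CONT_ID": lambda r: r["CONT_ID"],
    "CITY": lambda r: r["CITY"].title(),
}

with_rcp = run_stage(row, design, rcp_enabled=True)
without_rcp = run_stage(row, design, rcp_enabled=False)
```

In this sketch, with_rcp still carries NEW_ELEMENT while without_rcp does not, which mirrors why a stage with RCP disabled can silently drop a newly added column.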
C.7 Adding new elements (columns)
Most RDP for MDM job designs enable RCP across their stages. In the simplest
case, this allows additional columns to be defined on input by modifying the
corresponding ImportSIF shared container. If these new columns are not needed
in additional derivations or validations, they flow from the
source SIF to the target database table. For BulkLoad, no additional changes are
necessary. For Insert (Upsert), the table definition within the corresponding DB
shared container must also be updated with the new column.
For some RDP for MDM jobs, stages, and shared containers, RCP is explicitly
disabled. In most cases, RCP is disabled for QualityStage match, because this is
a standard practice (only matching key columns are output). However, other
objects and jobs where RCP is disabled should be reviewed to
ensure the additional columns are passed down when necessary.
Table C-2 summarizes the jobs and containers that might require review.
Note: This is an incomplete table, and may change with new releases of RDP
assets.
If a newly-added element must also be validated, the corresponding EditPoint or
Validation shared container should be changed instead of changing an existing
base RDP for MDM job.
Table C-2 RDP for MDM objects with RCP disabled
Category                                              Job/Container name
MDMIS R4                                              IL_000_PS_Stage_ErrReasonTbl
MDMIS R4                                              IL_010_IS_Import_SIF
MDMIS R4                                              IL_020_VS_Address
MDMIS R4/Shared Containers/EditPointContainers        EPCVSAddress (Container)
MDMIS R4/Shared Containers/ValidationStanContainers   VSVALAddress (Container)
MDMIS R4/Shared Containers/ImportSIFContainers        (names start with ILIS…)
MDMIS R4/Shared Containers/DBContainers               (names start with "ILDBIN…")
C.8 Modifying existing elements (columns)
By using RCP, changes to existing column attributes such as length, precision,
and in some cases even data type can also flow from source to target. Similar to
adding new columns, changes must be made in the source ImportSIF shared
container, and (if using Insert) target DB shared container.
An existing column might also be used in a transformation or
aggregation, which must then be reviewed and updated. Although the Advanced Find
feature might be useful in locating some objects, the easiest way to identify and
change all dependent transformations is to export the entire RDP for MDM project
to a DSX file.
DSX files contain a clear-text representation of all DataStage and QualityStage
objects that can be easily searched using a text editor or command-line tool. Edit
a copy of the .DSX file, and update all transformations and stages where an
existing column is used. This updated copy can be re-imported into the RDP for
MDM project. This method is particularly effective for simple derivations.
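As a minimal sketch of the command-line search described above, the following Python helper lists every line of an exported DSX file that mentions a given column name as a whole word. It assumes only that the export is clear text; the file name RDP_export.dsx and the column name ADDR_LINE_ONE are hypothetical, and the file encoding is an assumption that may need adjusting.

```python
import re


def find_column_usages(dsx_text, column_name):
    """Return (line_number, line) pairs where column_name appears as a whole word."""
    # Reject matches that are part of a longer identifier, such as NAME_X.
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(column_name) + r"(?![A-Za-z0-9_])"
    )
    hits = []
    for number, line in enumerate(dsx_text.splitlines(), start=1):
        if pattern.search(line):
            hits.append((number, line.strip()))
    return hits


def search_dsx_file(path, column_name, encoding="utf-8"):
    """Read an exported DSX file and list the lines that mention column_name.

    The encoding is an assumption; undecodable bytes are replaced rather
    than raising an error.
    """
    with open(path, encoding=encoding, errors="replace") as handle:
        return find_column_usages(handle.read(), column_name)


# Hypothetical usage:
# for number, line in search_dsx_file("RDP_export.dsx", "ADDR_LINE_ONE"):
#     print(number, line)
```

The hits are only a starting list; each one still has to be reviewed by hand before the copy of the DSX file is edited and re-imported.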
Changes to some elements may require more advanced knowledge of
DataStage and QualityStage. For example, if a column is used in a QualityStage
standardization or match, those specifications will need to be updated. These
advanced or more extensive changes may require the guidance of an IBM
Professional Services consultant experienced with RDP for MDM.
Appendix D. Error processing
This appendix describes the most commonly encountered data-related problems
in the Standard Interface File (SIF) and how they are highlighted in the Rapid
Deployment Package (RDP) for Master Data Management (MDM) error log.
D.1 Introduction
Figure D-1 shows that errors in processing the SIF can be identified during the
various phases of RDP for MDM processing (such as Import SIF, Validation &
Standardization, and Error Consolidation & Referential Integrity). The errors
are consolidated into the consolidated error log.
Figure D-1 Main components of RDP processing and error logs generated
The general format of the error messages in the error logs is shown in Table D-1.
A document about error codes is available from the download site:
http://www.redbooks.ibm.com/redpieces/abstracts/sg247704.html
Table D-1 Error message format
Field#  SIF column name                 Description
1       RECTYPE                         Together with SUBTYPE, identifies the record type, such as PA (address), PI (identifier), CC (contract component).
2       SUBTYPE                         See RECTYPE.
3       ADMIN_SYS_TP_CD                 Together with ADMIN_CLIENT_ID_OR_CONTRACT_ID, forms the SSK (Source System Key).
4       ADMIN_CLIENT_ID_OR_CONTRACT_ID  See ADMIN_SYS_TP_CD.
5       CONT_ID                         The surrogate key (SK) generated for each row.
6       SIF_FILE_NAME                   Together with SIF_ROW_NUMBER, gives the physical location of the error row among the input data files.
7       SIF_ROW_NUMBER                  See SIF_FILE_NAME.
8       ERR_CODE                        The error type, such as invalid state/province type code. This field is an integer.
9       ERR_REASON_CODE                 A specific instance of an error code, such as invalid state/province type code in the Address Validation job. This field is also an integer.
10      ERR_MSG                         The text message that corresponds to the error code (ERR_CODE), for example the literal string "invalid state/province type code".
11      ERR_SEVERITY_LEVEL              The severity of the error.
12      BATCH_ID                        Set this to a unique ID for each run of the jobs. It is used in the names of all files created by the RDP for MDM jobs.
13      INTERNAL_ID                     A surrogate key applied and used inside the RDP for MDM jobs. It is dropped before loading the database, where the CONT_ID is used instead.
14      ERR_TS                          A time stamp corresponding to when the error was detected.
15      ERR_STAGE_NAME                  The name of the stage that detected the error and produced the error row.
16      ERR_JOB_NAME                    The name of the job in which the ERR_STAGE_NAME stage resides.
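To make the field layout concrete, the following Python sketch splits one error log record into the 16 fields of Table D-1. It assumes that each record occupies a single physical line in the real log (the examples in this appendix are wrapped for printing) and that the ERR_MSG text contains no pipe character. The helper names parse_error_record and sort_key are hypothetical; the sample record is the first record of Example D-2 with its wrapped lines rejoined.

```python
from collections import namedtuple

# Field names in the order given in Table D-1.
ERROR_FIELDS = [
    "RECTYPE", "SUBTYPE", "ADMIN_SYS_TP_CD", "ADMIN_CLIENT_ID_OR_CONTRACT_ID",
    "CONT_ID", "SIF_FILE_NAME", "SIF_ROW_NUMBER", "ERR_CODE", "ERR_REASON_CODE",
    "ERR_MSG", "ERR_SEVERITY_LEVEL", "BATCH_ID", "INTERNAL_ID", "ERR_TS",
    "ERR_STAGE_NAME", "ERR_JOB_NAME",
]
ErrorRecord = namedtuple("ErrorRecord", ERROR_FIELDS)


def parse_error_record(line):
    """Split one error log record into the 16 fields of Table D-1.

    Returns None when the field count is not 16, so that the caller can
    set the record aside for manual inspection.
    """
    parts = line.rstrip("\r\n").split("|")
    if len(parts) != len(ERROR_FIELDS):
        return None
    return ErrorRecord(*parts)


def sort_key(record):
    """Order records by input file and row number for a stable review order."""
    return (record.SIF_FILE_NAME, int(record.SIF_ROW_NUMBER))


sample = (
    "P|A|1000000|70005817|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.pipe|28|"
    "110362|100401|Unable to parse record at RT/ST Level|0|canonical_errPipe|"
    "6696|2008-05-30 09:36:10|tx_RTST_ci_Rejects|IL_010_Parse_Columnization"
)
record = parse_error_record(sample)
```

Sorting the parsed records with sort_key gives a repeatable reading order even when the jobs write the log in a different order on each run.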
Important: The error log does not identify the offsets in the record of the fields
in error. Also, the sequence of error messages does not reflect the actual time
sequence of occurrence of the error. The parallel framework with the default
automated partitioning used in the RDP for MDM jobs causes the sequence of
these errors to be non-deterministic. This means that reruns of the same job
are likely to show the error messages generated in a different order each time.
In this appendix, we created a number of SIF files containing the most commonly
encountered errors to identify the corresponding error messages generated by
the RDP for MDM jobs. The contents of the consolidated error log are shown here.
Note: We chose to document each error in isolation to review the
corresponding errors generated in the logs. In practice, a combination of these
errors is likely to occur. Carefully review the error logs to determine and rectify
the errant rows in the input SIF.
The most commonly encountered errors are as follows:
 Pipe (|) character in the data
 Validation error with the code table
 RT/ST/ADMIN_SYS_TP_CD error
 End of record missing
 Start date after end date error
 Date format error
D.2 Pipe character (|) in the data
The pipe character (|) is the field delimiter in the SIF, and its presence in the
data causes the SIF parser to fail with an error message.
We introduced a pipe character in the address field of row 28 of the SIF
(highlighted in Example D-1; only partial contents of the SIF are shown here)
to view the error messages generated by the SIF parser.
Example D-2 shows the contents of the error log for this error:
 The first record highlights row 28 in the SIF file SIF_Out.pipe with the SSK of
(1000000,70005817) that the SIF parser is unable to parse. The error message
is “Unable to parse record at RT/ST Level”, and the error severity level
is 0. The stage name (tx_RTST_ci_Rejects) and job name
(IL_010_Parse_Columnization) are also provided.
This row is rejected.
 The subsequent records show the rows in the SIF that are also rejected
because they are associated with the previous row. The following message is
generated to identify the corresponding row numbers (496, 579, 126, 698,
and 17) in the input SIF file:
Record dropped by association. Fatal errors were detected on related
party records.
Note: Currently, another character cannot be substituted for the pipe as the
field delimiter, nor is an escape character provided.
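Because no escape character is available, one option is to screen a SIF before loading it. The following Python sketch flags rows whose field count differs from the most common count for their RT/ST combination. This is a heuristic under stated assumptions: the authoritative field count comes from the SIF metadata, which is not reproduced here, and the function name flag_suspect_rows is hypothetical.

```python
from collections import Counter, defaultdict


def flag_suspect_rows(lines):
    """Return (row_number, rectype, subtype, field_count) for rows whose
    field count differs from the most common count for their record type.

    Heuristic only: the most common count per (RT, ST) stands in for the
    field count that the SIF metadata would define.
    """
    counts = defaultdict(Counter)  # (RT, ST) -> Counter of field counts
    parsed = []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\r\n").split("|")
        if len(fields) < 2:
            # Too short to carry a record type and subtype.
            parsed.append((number, None, len(fields)))
            continue
        key = (fields[0], fields[1])
        counts[key][len(fields)] += 1
        parsed.append((number, key, len(fields)))

    suspects = []
    for number, key, field_count in parsed:
        if key is None:
            suspects.append((number, "", "", field_count))
            continue
        usual = counts[key].most_common(1)[0][0]
        if field_count != usual:
            suspects.append((number, key[0], key[1], field_count))
    return suspects


# Made-up rows for illustration only (not real SIF content).
rows = ["P|A|1|x|y", "P|A|2|x|y", "P|A|3|x|y|z", "P|P|1|a", "P|P|2|a"]
suspects = flag_suspect_rows(rows)
```

A flagged row is only a candidate; a stray pipe and a genuinely different layout look the same to this check, so each hit still needs a manual look.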
Example: D-1 Pipe character in the data error: partial contents of SIF
P|P|1000002|8000719|A|N|||||||1||||||||||||||||||||3||||||1984-05-07
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000090|A|N|||||||2||||||||||||||||||||3|||||M|1975-09-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000071|A|N|||||||2|||||||||||||||||||||||||F|1989-08-23
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000041|A|N|||||||1|||||||||||||||||||||||||M|1998-08-03
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70008172|A|N|||100|||||2|||||||||||||||||||||185|||M|1937-08-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000037|A|N|||||||1||||||||||||||||||||3|||||F|1986-09-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000297|A|N|||||||1||||||||||||||||||||2|||||F|1995-10-30
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000212|A|N|||||||1||||||||||||||||||||3|||||F|1997-11-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000291|A|N|||||||1|||||||||||||||||||||||||F|1977-03-03
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70004432|A|N|||100|||||3|||||||||||||||||||||185|||M|1976-08-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70004182|A|N|||100|||||2|||||||||||||||||||||185|||M|1975-09-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000640|A|N|||||||2||||||||||||||||||||3|||||F|1975-07-12
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000469|A|N|||||||2||||||||||||||||||||1|||||M|1990-03-14
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000111|A|N|||||||1||||||||||||||||||||1|||||M|1984-06-21
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000201|A|N|||||||1|||||||||||||||||||||||||M|1967-05-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70005333|A|N|||100|||||2|||||||||||||||||||||185|||F|1945-03-12
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70005817|A|N|||100|||||4|||||||||||||||||||||185|||M|1957-03-29
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000232|A|N|||||||1||||||||||||||||||||1||||||1966-08-25
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000259|A|N|||||||1||||||||||||||||||||2|||||F|1991-02-04
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000011|A|N|||||||3|||||||||||||||||||||||||F|1945-03-12
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000221|A|N|||||||2|||||||||||||||||||||||||M|1977-09-18
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70003022|A|N|||100|||||2|||||||||||||||||||||185|||M|1989-12-11
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|A|1000002|8000640|A|||||||||2008-10-26 19:11:54.000000||||||1|185|||6177 Purple
Sage Ct|||San
Jose|||||||||||||||||||||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|0|0|0|0|0|0|0|0|
P|A|1000002|8000469|A|||||||||2008-10-26 19:11:54.000000||||||1|185|||5528 Muir
Dr|||San
Jose|||||||||||||||||||||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|0|0|0|0|0|0|0|0|
P|A|1000002|8000111|A|||||||||2008-10-26 19:11:54.000000||||||1|185|||631 Ofarrell
St|||San
Francisco|||||||||||||||||||||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0
|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
P|A|1000001|200000201|A|||||||||2008-10-26 19:11:54.000000||||||1|185|||1363 14th
Ave|||San
Francisco|||||||||||||||||||||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0
|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
P|A|1000000|70005333|A|||||||||2008-10-26 19:11:54.000000||||||1|185|||6181 Camino
Verde Dr,,San
Jose,95119||||||||||||||||||||||||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0
|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
P|A|1000000|70005817|A|||||||||2008-10-26 19:11:54.000000||||||1|185|||PO Box
7424||San
Francisco,94120||||||||||||||||||||||||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
P|A|1000002|8000340|A|||||||||2008-10-26 19:11:54.000000||||||1|185|||44 Montgomery
St|||San
Francisco|||||||||||||||||||||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0
|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
............
Example: D-2 Pipe character in the data error log
P|A|1000000|70005817|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.pipe|28|110362|100401|U
nable to parse record at RT/ST Level|0|canonical_errPipe|6696|2008-05-30
09:36:10|tx_RTST_ci_Rejects|IL_010_Parse_Columnization
P|C|1000000||70005817|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.pipe|496|110387|100387
|Record dropped by association. Fatal errors were detected on related party
records.|0|canonical_errPipe|6696|2008-05-30
09:36:10|Split_Kept_Dropped|IL_040_EC_Party_Last_Drop
P|H|1000000|70005817|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.pipe|579|110387|100387|
Record dropped by association. Fatal errors were detected on related party
records.|0|canonical_errPipe|6696|2008-05-30
09:36:10|Split_Kept_Dropped|IL_040_EC_Party_Last_Drop
P|I|1000000|70005817|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.pipe|126|110387|100387|
Record dropped by association. Fatal errors were detected on related party
records.|0|canonical_errPipe|6696|2008-05-30
09:36:10|Split_Kept_Dropped|IL_040_EC_Party_Last_Drop
P|I|1000000|70005817|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.pipe|698|110387|100387|
Record dropped by association. Fatal errors were detected on related party
records.|0|canonical_errPipe|6696|2008-05-30
09:36:10|Split_Kept_Dropped|IL_040_EC_Party_Last_Drop
P|P|1000000|70005817|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.pipe|17|110387|100387|R
ecord dropped by association. Fatal errors were detected on related party
records.|0|canonical_errPipe|6696|2008-05-30
9:36:10|Split_Kept_Dropped|IL_040_EC_Party_Last_Drop
D.3 Validation error with the code table
We introduced an invalid code (-13) in the CLIENT_POTEN_TP_CD field of row
2 of the SIF (highlighted in Example D-3; only partial contents of the SIF are
shown here) to view the error messages generated.
Example D-4 shows the contents of the error log for this error:
 The first record highlights row 2 in the SIF file SIF_Out.CodeError with the
SSK of (1000002,8000090) that is in error. The error message is “The
following is not correct: ClientPotentialType”, and the error severity
level is 0. The stage name
(020_Contact.CheckCodeAndContentValidationErrors) and job name
(IL_020_VS_Contact) are also provided.
 The second record has the error message “Record In Error Dropped” for the
same row (2) in the SIF. It also shows the name of the stage in which this occurs
(020Contact.DropErrorRows) and the job name
(IL_020_VS_Contact).
 The subsequent records show the rows (689, 780, and 41) in the SIF that are
also rejected because they are associated with row 2 that was dropped. The
messages “Invalid PersonName Records: No Matching Contact Record”
(row 689) and “Record dropped by association. Fatal errors were
detected on related party records.” (rows 780 and 41) are generated.
Example: D-3 Validation error with the code table error: partial contents of SIF
P|P|1000002|8000719|A|N|||||||1||||||||||||||||||||3||||||1984-05-07
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000090|A|N|||||||2||-13||||||||||||||||||3|||||M|1975-09-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000071|A|N|||||||2|||||||||||||||||||||||||F|1989-08-23
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000041|A|N|||||||1|||||||||||||||||||||||||M|1998-08-03
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70008172|A|N|||100|||||2|||||||||||||||||||||185|||M|1937-08-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000037|A|N|||||||1||||||||||||||||||||3|||||F|1986-09-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000297|A|N|||||||1||||||||||||||||||||2|||||F|1995-10-30
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000212|A|N|||||||1||||||||||||||||||||3|||||F|1997-11-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000291|A|N|||||||1|||||||||||||||||||||||||F|1977-03-03
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70004432|A|N|||100|||||3|||||||||||||||||||||185|||M|1976-08-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70004182|A|N|||100|||||2|||||||||||||||||||||185|||M|1975-09-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000640|A|N|||||||2||||||||||||||||||||3|||||F|1975-07-12
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000469|A|N|||||||2||||||||||||||||||||1|||||M|1990-03-14
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000111|A|N|||||||1||||||||||||||||||||1|||||M|1984-06-21
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000201|A|N|||||||1|||||||||||||||||||||||||M|1967-05-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70005333|A|N|||100|||||2|||||||||||||||||||||185|||F|1945-03-12
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70005817|A|N|||100|||||4|||||||||||||||||||||185|||M|1957-03-29
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000232|A|N|||||||1||||||||||||||||||||1||||||1966-08-25
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000259|A|N|||||||1||||||||||||||||||||2|||||F|1991-02-04
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000011|A|N|||||||3|||||||||||||||||||||||||F|1945-03-12
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000221|A|N|||||||2|||||||||||||||||||||||||M|1977-09-18
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70003022|A|N|||100|||||2|||||||||||||||||||||185|||M|1989-12-11
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
............
Example: D-4 Validation error with the code table error log output
P|P|1000002|8000090|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.CodeError|2|1624|100054|
The following is not correct: ClientPotentialType|0|canonical_errCode|8611|2008-11-01
09:24:13|020_Contact.CheckCodeAndContentValidationErrors|IL_020_VS_Contact
P|P|1000002|8000090|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.CodeError|2|110184|10006
6|Record In Error Dropped|0|canonical_errCode|8611|2008-11-01
09:24:13|020Contact.DropErrorRows|IL_020_VS_Contact
P|H|1000002|8000090|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.CodeError|689|110126|100
246|Invalid PersonName Records: No Matching Contact
Record|0|canonical_errCode|8611|2008-11-01 09:25:10|030_CONTACT_RIV.Party Join
Proc|IL_030_RI_Contact_Person_Org
P|I|1000002|8000090|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.CodeError|780|110387|100
387|Record dropped by association. Fatal errors were detected on related party
records.|0|canonical_errCode|8611|2008-05-30
09:36:10|Split_Kept_Dropped|IL_040_EC_Party_Last_Drop
P|A|1000002|8000090|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.CodeError|41|110387|1003
87|Record dropped by association. Fatal errors were detected on related party
records.|0|canonical_errCode|8611|2008-05-30
09:36:10|Split_Kept_Dropped|IL_040_EC_Party_Last_Drop
D.4 RT/ST/ADMIN_SYS_TP_CD error
We introduced an invalid RT/ST/ADMIN_SYS_TP_CD code (PP1) in row 3 of the
SIF (highlighted in Example D-5; only partial contents of the SIF are shown
here) to view the error messages generated. Example D-6 shows
the contents of the error log for this error:
 The third record highlights row 3 in the SIF file SIF_Out.errRTST with the SSK
of (1,200000071) that is in error. The error message is “Invalid Contact
Record: No Match found in PersonName nor OrganizationName”, and the
error severity level is 0. The stage name
(030_CONTACT_RIV.Process_Contact_Join) and job name
(IL_030_RI_Contact_Person_Org) are also provided.
 The first two records and the ones following the third record are errors
resulting from the invalid RT/ST/ADMIN_SYS_TP_CD. Note the various rows
(42, 102, 115, 474, and 755), the error messages, and the stage and job in which
these errors were detected.
Example: D-5 RT/ST/ADMIN_SYS_TP_CD error: partial contents of SIF
P|P|1000002|8000719|A|N|||||||1||||||||||||||||||||3||||||1984-05-07
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000090|A|N|||||||2||||||||||||||||||||3|||||M|1975-09-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1|200000071|A|N|||||||2|||||||||||||||||||||||||F|1989-08-23
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000041|A|N|||||||1|||||||||||||||||||||||||M|1998-08-03
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70008172|A|N|||100|||||2|||||||||||||||||||||185|||M|1937-08-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000037|A|N|||||||1||||||||||||||||||||3|||||F|1986-09-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000297|A|N|||||||1||||||||||||||||||||2|||||F|1995-10-30
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000212|A|N|||||||1||||||||||||||||||||3|||||F|1997-11-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000291|A|N|||||||1|||||||||||||||||||||||||F|1977-03-03
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70004432|A|N|||100|||||3|||||||||||||||||||||185|||M|1976-08-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70004182|A|N|||100|||||2|||||||||||||||||||||185|||M|1975-09-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000640|A|N|||||||2||||||||||||||||||||3|||||F|1975-07-12
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000469|A|N|||||||2||||||||||||||||||||1|||||M|1990-03-14
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000111|A|N|||||||1||||||||||||||||||||1|||||M|1984-06-21
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000201|A|N|||||||1|||||||||||||||||||||||||M|1967-05-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70005333|A|N|||100|||||2|||||||||||||||||||||185|||F|1945-03-12
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70005817|A|N|||100|||||4|||||||||||||||||||||185|||M|1957-03-29
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000232|A|N|||||||1||||||||||||||||||||1||||||1966-08-25
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000259|A|N|||||||1||||||||||||||||||||2|||||F|1991-02-04
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000011|A|N|||||||3|||||||||||||||||||||||||F|1945-03-12
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000221|A|N|||||||2|||||||||||||||||||||||||M|1977-09-18
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70003022|A|N|||100|||||2|||||||||||||||||||||185|||M|1989-12-11
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
..................
Example: D-6 RT/ST/ADMIN_SYS_TP_CD error log output
P|A|1000001|200000071|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.errRTST|42|110107|1000
35|Invalid Internal ID|0|canonical_errRTST|0|2008-11-01
10:00:12|020_Address.CheckCodeAndContentValidationErrors|IL_020_VS_Address
P|A|1000001|200000071|-1|/data/RDP/SIF_IN/canonical_err/SIF_Out.errRTST|42|110184|100
024|Record In Error Dropped|0|canonical_errRTST|0|2008-11-01
10:00:11||IL_020_VS_Address
P|P|1|200000071|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.errRTST|3|110105|100244|Inva
lid Contact Record: No Match found in PersonName nor
OrganizationName|0|canonical_errRTST|8625|2008-11-01
10:00:01|030_CONTACT_RIV.Process_Contact_Join|IL_030_RI_Contact_Person_Org
P|C|1000001|200000071|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.errRTST|102|100995|100
995|Invalid Internal Id|0|canonical_errRTST|0|2008-11-01
09:58:44|020_ContactMethod.Type_Code_Chkup|IL_020_VS_ContactMethod
P|C|1000001|200000071|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.errRTST|474|100995|100
995|Invalid Internal Id|0|canonical_errRTST|0|2008-11-01
09:58:44|020_ContactMethod.Type_Code_Chkup|IL_020_VS_ContactMethod
P|C|1000001|200000071|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.errRTST|102|110184|100
077|Record In Error Dropped.|0|canonical_errRTST|0|2008-11-01
09:58:43|020_ContactMethod.Final_Recs_Process|IL_020_VS_ContactMethod
P|C|1000001|200000071|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.errRTST|474|110184|100
077|Record In Error Dropped.|0|canonical_errRTST|0|2008-11-01
09:58:43|020_ContactMethod.Final_Recs_Process|IL_020_VS_ContactMethod
P|I|1000001|200000071|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.errRTST|115|110107|100
161|Invalid Internal ID|0|canonical_errRTST|0|2008-11-01
10:00:30|020_Identifier.CheckCodeAndContentValidationErrors|IL_020_VS_Identifier
P|I|1000001|200000071|-1|/data/RDP/SIF_IN/canonical_err/SIF_Out.errRTST|115|110184|10
0154|Record in Error Dropped|0|canonical_errRTST|0|2008-11-01
10:00:30|020_Identifier.DropErrorRows|IL_020_VS_Identifier
P|H|1000001|200000071|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.errRTST|755|110107|100
218|Invalid Internal Id|0|canonical_errRTST|0|2008-11-01
09:58:59|020_PersonName.Validation_Chk|IL_020_VS_PersonName
P|H|1000001|200000071|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.errRTST|755|110184|100
219|Record In Error Dropped|0|canonical_errRTST|0|2008-11-01
09:58:58|020_PersonName.Final_Rec_Process|IL_020_VS_PersonName
D.5 End of record missing error
We assumed that a problem in the code generating the SIF file caused the end of
record (DOS Record Terminator Character) to be dropped at the end of a row. We
simulated this error by concatenating two SIF records into a single row, as shown
in row 2 of Example D-7. As a result, the two SSKs (1,955742003) and
(1,955742002) appear to be columns in the same row of the SIF.
When this error occurs, any additional columns detected after the final expected
column (as defined by the metadata) are ignored, and a warning message
[“Import consumed only 74bytes of the record's 164 bytes (no further
warnings will be generated from this partition)”] is written to the Director
log as shown in Figure D-2.
Note: Carefully review the Director log output for such warnings, because they
do not appear in the RDP for MDM error logs. The count of bytes (74 in our
example) begins after the SSK because that is where the columns begin; the count
includes the column delimiter pipe “|” character.
Example: D-7 End of record missing error: partial contents of SIF
P|H|1|955742001||||1||Alley|Mary|||Barton|||||||||||||||||||1|1|0|0|1|1|0|1|1|0|0|0|0
|
P|H|1|955742003||||1||Georgina|Elly|||Colborn|||||||||||||||||||1|1|0|0|1|1|0|1|1|0|0
|0|0|P|H|1|955742002||||1||Cheryl|Lynn|||Ainsworth|||||||||||||||||||1|1|0|0|1|1|0|1|
1|0|0|0|0|
P|H|1|955742004||||1||Margaret|F|||Conway|||||||||||||||||||1|1|0|0|1|1|0|1|1|0|0|0|0
|
Figure D-2 End of record missing error: partial contents of Director log output
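Because this condition is reported only in the Director log, it can help to screen a SIF before it is submitted. The following Python sketch is not part of RDP for MDM. It is a minimal illustration that assumes one record per line, the pipe delimiter, and that most rows of a given RT/ST pair carry the correct number of columns. The function name find_suspect_rows is hypothetical.

```python
from collections import Counter, defaultdict


def find_suspect_rows(lines, delimiter="|"):
    """Flag rows whose column count differs from the most common count
    seen for the same RT/ST pair (the first two columns of the row).

    Returns a list of (row_number, column_count, expected_count) tuples.
    The "expected" count is only a majority vote over the file, not the
    count defined by the job metadata.
    """
    rows = []
    counts_by_type = defaultdict(Counter)

    for row_number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines but keep the row numbering
        fields = line.split(delimiter)
        key = tuple(fields[:2])  # (RT, ST)
        counts_by_type[key][len(fields)] += 1
        rows.append((row_number, key, len(fields)))

    suspects = []
    for row_number, key, column_count in rows:
        expected = counts_by_type[key].most_common(1)[0][0]
        if column_count != expected:
            suspects.append((row_number, column_count, expected))
    return suspects
```

A row that contains two concatenated records has roughly twice the usual number of columns, so it stands out against its neighbors. A majority vote cannot detect a file in which most rows are damaged, so this check complements a review of the Director log rather than replacing it.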
D.6 Start date after end date error
We introduced an invalid end date (DISAB_END_DT) that preceded the start date
(DISAB_START_DT), which is a date bounds error, in row 8 of the SIF to view the
error messages generated. Row 8, the record with the SSK (1000002,8000212), is
highlighted in Example D-8, which shows only part of the SIF.
Example D-9 shows the contents of the error log for this error:
• The second record identifies row 8 in the SIF file
SIF_Out.endBeforeStartDate, with the SSK (1000002,8000212), as the row in
error. The error message is “EndDate must be after StartDate”, and the
error severity level is 0. The stage name
(020_Contact.CheckCodeAndContentValidationErrors) and the job name
(IL_020_VS_Contact) are also provided.
• The first record shows that row 8 is dropped, with the error
message “Record In Error Dropped”.
• The subsequent records are errors that result from the invalid date bounds.
Note the various rows (157, 254, 353, and 675), the error messages, and the
stage and job in which these errors were detected.
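The date bounds rule described above can be sketched in Python as follows. This is an illustration only, not the logic of the IL_020_VS_Contact job. The timestamp format is assumed from the sample rows, the function name is hypothetical, and treating an end date equal to the start date as an error is our reading of the message wording.

```python
from datetime import datetime

# Assumed from the sample SIF rows, for example "1989-08-23 00:00:00.000000".
SIF_TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def end_date_error(start_value, end_value):
    """Return an error message when both dates are present and the end
    date is not after the start date; otherwise return None."""
    if not start_value or not end_value:
        return None  # an empty column cannot violate the bounds
    start = datetime.strptime(start_value, SIF_TIMESTAMP_FORMAT)
    end = datetime.strptime(end_value, SIF_TIMESTAMP_FORMAT)
    if end <= start:
        return "EndDate must be after StartDate"
    return None
```

With the two timestamps that appear in row 8 of the sample SIF, taking the first as the start date and the second as the end date, `end_date_error("1999-08-23 00:00:00.000000", "1989-08-23 00:00:00.000000")` returns the message.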
Example: D-8 Start date after end date error: partial contents of SIF
P|P|1000002|8000719|A|N|||||||1||||||||||||||||||||3||||||1984-05-07
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000090|A|N|||||||2||||||||||||||||||||3|||||M|1975-09-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000071|A|N|||||||2|||||||||||||||||||||||||F|1989-08-23
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000041|A|N|||||||1|||||||||||||||||||||||||M|1998-08-03
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70008172|A|N|||100|||||2|||||||||||||||||||||185|||M|1937-08-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000037|A|N|||||||1||||||||||||||||||||3|||||F|1986-09-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000297|A|N|||||||1||||||||||||||||||||2|||||F|1995-10-30
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000212|A|N|||||||1||||||||||||||||||||3|||||F|1997-11-02
00:00:00.000000|||1999-08-23 00:00:00.000000|1989-08-23
00:00:00.000000||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|
P|P|1000001|200000291|A|N|||||||1|||||||||||||||||||||||||F|1977-03-03
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70004432|A|N|||100|||||3|||||||||||||||||||||185|||M|1976-08-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70004182|A|N|||100|||||2|||||||||||||||||||||185|||M|1975-09-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000640|A|N|||||||2||||||||||||||||||||3|||||F|1975-07-12
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000469|A|N|||||||2||||||||||||||||||||1|||||M|1990-03-14
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000111|A|N|||||||1||||||||||||||||||||1|||||M|1984-06-21
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000201|A|N|||||||1|||||||||||||||||||||||||M|1967-05-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70005333|A|N|||100|||||2|||||||||||||||||||||185|||F|1945-03-12
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70005817|A|N|||100|||||4|||||||||||||||||||||185|||M|1957-03-29
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000232|A|N|||||||1||||||||||||||||||||1||||||1966-08-25
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000259|A|N|||||||1||||||||||||||||||||2|||||F|1991-02-04
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000011|A|N|||||||3|||||||||||||||||||||||||F|1945-03-12
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000221|A|N|||||||2|||||||||||||||||||||||||M|1977-09-18
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70003022|A|N|||100|||||2|||||||||||||||||||||185|||M|1989-12-11
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
...........
Example: D-9 Start date after end date error log output
P|P|1000002|8000212|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.endBeforeStartDate|8|110
184|100066|Record In Error Dropped|0|canonical_endBeforeStartDate|9107|2008-11-04
04:44:17|020Contact.DropErrorRows|IL_020_VS_Contact
P|P|1000002|8000212|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.endBeforeStartDate|8|102
|100056|EndDate must be after
StartDate|0|canonical_endBeforeStartDate|9107|2008-11-04
04:44:17|020_Contact.CheckCodeAndContentValidationErrors|IL_020_VS_Contact
P|H|1000002|8000212|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.endBeforeStartDate|675|1
10126|100246|Invalid PersonName Records: No Matching Contact
Record|0|canonical_endBeforeStartDate|9107|2008-11-04 04:45:44|030_CONTACT_RIV.Party
Join Proc|IL_030_RI_Contact_Person_Org
P|I|1000002|8000212|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.endBeforeStartDate|254|1
10387|100387|Record dropped by association. Fatal errors were detected on related
party records.|0|canonical_endBeforeStartDate|9107|2008-05-30
09:36:10|Split_Kept_Dropped|IL_040_EC_Party_Last_Drop
P|C|1000002|8000212|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.endBeforeStartDate|353|1
10387|100387|Record dropped by association. Fatal errors were detected on related
party records.|0|canonical_endBeforeStartDate|9107|2008-05-30
09:36:10|Split_Kept_Dropped|IL_040_EC_Party_Last_Drop
P|A|1000002|8000212|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.endBeforeStartDate|157|1
10387|100387|Record dropped by association. Fatal errors were detected on related
party records.|0|canonical_endBeforeStartDate|9107|2008-05-30
09:36:10|Split_Kept_Dropped|IL_040_EC_Party_Last_Drop
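When one error causes related records to be dropped, it can be easier to read the error log grouped by SSK. The following Python sketch shows one way to do that. It is not an RDP for MDM utility. The column positions are inferred from the sample records in this appendix rather than from a documented layout, the names ErrorRecord, parse_error_record, and group_by_ssk are hypothetical, and each record is assumed to occupy a single line in the actual log (the printed examples wrap long lines).

```python
from collections import defaultdict
from typing import NamedTuple


class ErrorRecord(NamedTuple):
    record_type: str   # first column, for example "P"
    sub_type: str      # second column, for example "P", "H", "I", "C", "A"
    ssk: tuple         # third and fourth columns
    sif_row: int       # row number in the SIF (seventh column, inferred)
    message: str       # error message text (tenth column, inferred)
    stage: str         # second-to-last column
    job: str           # last column


def parse_error_record(line):
    """Split one pipe-delimited error log line into the columns used here."""
    fields = line.rstrip("\r\n").split("|")
    if len(fields) < 12:
        raise ValueError(f"unexpected column count: {len(fields)}")
    return ErrorRecord(
        record_type=fields[0],
        sub_type=fields[1],
        ssk=(fields[2], fields[3]),
        sif_row=int(fields[6]),
        message=fields[9],
        stage=fields[-2],
        job=fields[-1],
    )


def group_by_ssk(lines):
    """Return a dict that maps each SSK to its error records, in log order."""
    grouped = defaultdict(list)
    for line in lines:
        if line.strip():
            record = parse_error_record(line)
            grouped[record.ssk].append(record)
    return dict(grouped)
```

Applied to the log in Example D-9, every record falls under the SSK (1000002,8000212), which makes it clear that the drops by association follow from the single date bounds error in row 8.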
D.7 Date format error
We introduced an invalid date format for BIRTH_DT (dd-mm-yyyy, rather than the
yyyy-mm-dd form used in the other rows) in row 1 of the SIF to view the error
messages generated. Row 1 is highlighted in Example D-10, which shows only part
of the SIF.
Example D-11 shows the contents of the error log for this error:
• The first record identifies row 1 in the SIF file SIF_Out.dateFormatError,
with the SSK (1000002,8000719), as the row in error. The error message is
“Unable to parse record at RT/ST Level”, and the error severity level is 0.
The stage name (tx_RTST_ci_Rejects) and the job name
(IL_010_Parse_Columnization) are also provided.
• The subsequent records are errors that result from the invalid date format.
Note the various rows (227, 422, 513, and 859), the error messages, and the
stage and job in which these errors were detected.
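A date format problem of this kind can be caught before the SIF is submitted. The following Python sketch is an illustration only, not the parsing logic of IL_010_Parse_Columnization. The expected format is assumed from the valid rows in the sample SIF, and the function name is hypothetical.

```python
from datetime import datetime

# Assumed from the valid sample rows, for example "1984-05-07 00:00:00.000000".
EXPECTED_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def has_expected_timestamp_format(value, fmt=EXPECTED_FORMAT):
    """Return True if value parses with the expected timestamp format."""
    try:
        datetime.strptime(value, fmt)
    except ValueError:
        return False
    return True
```

The value from row 1 of the sample, `07-05-1984 00:00:00.000000`, fails this check because the year is expected first. Note that strptime accepts months and days without a leading zero, so this check is slightly more lenient than a strict fixed-width comparison.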
Example: D-10 Date format error: partial contents of SIF
P|P|1000002|8000719|A|N|||||||1||||||||||||||||||||3||||||07-05-1984
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000002|8000090|A|N|||||||2||||||||||||||||||||3|||||M|1975-09-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000071|A|N|||||||2|||||||||||||||||||||||||F|1989-08-23
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000001|200000041|A|N|||||||1|||||||||||||||||||||||||M|1998-08-03
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
P|P|1000000|70008172|A|N|||100|||||2|||||||||||||||||||||185|||M|1937-08-02
00:00:00.000000||||||0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|
0|0|0|0|
...........
Example: D-11 Date format error log output
P|P|1000002|8000719|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.dateFormatError|1|110362
|100022|Unable to parse record at RT/ST
Level|0|canonical_dateFormatError|5925|2008-05-30
09:36:10|tx_RTST_ci_Rejects|IL_010_Parse_Columnization
P|H|1000002|8000719|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.dateFormatError|859|1101
26|100246|Invalid PersonName Records: No Matching Contact
Record|0|canonical_dateFormatError|5925|2008-10-28 15:13:22|030_CONTACT_RIV.Party
Join Proc|IL_030_RI_Contact_Person_Org
P|I|1000002|8000719|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.dateFormatError|513|1103
87|100387|Record dropped by association. Fatal errors were detected on related party
records.|0|canonical_dateFormatError|5925|2008-05-30
09:36:10|Split_Kept_Dropped|IL_040_EC_Party_Last_Drop
P|A|1000002|8000719|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.dateFormatError|422|1103
87|100387|Record dropped by association. Fatal errors were detected on related party
records.|0|canonical_dateFormatError|5925|2008-05-30
09:36:10|Split_Kept_Dropped|IL_040_EC_Party_Last_Drop
P|C|1000002|8000719|0|/data/RDP/SIF_IN/canonical_err/SIF_Out.dateFormatError|227|1103
87|100387|Record dropped by association. Fatal errors were detected on related party
records.|0|canonical_dateFormatError|5925|2008-05-30
09:36:10|Split_Kept_Dropped|IL_040_EC_Party_Last_Drop
Appendix E. Additional material
This book refers to additional material that can be downloaded from the Internet
as described in this appendix.
Locating the web material
The web material associated with this book is available in softcopy on the
Internet from the IBM Redbooks web server. Point your web browser at:
ftp://www.redbooks.ibm.com/redbooks/SG247704
Alternatively, you can go to the IBM Redbooks website at:
ibm.com/redbooks
Select Additional materials and open the directory that corresponds with
the IBM Redbooks form number, SG247704.
Using the web material
The additional web material that accompanies this book includes the following
file:
File name: SG247704Code.zip
Description: Compressed code and data used in the scenario
System requirements for downloading the web material
We used the following system configuration:
Hard disk space: 500 MB minimum
Operating System: Windows
How to use the web material
Create a subdirectory (folder) on your workstation, and extract the contents of the
web material ZIP file into this folder.
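If you prefer to script this step, the following Python sketch creates the folder and extracts the archive with the standard zipfile module. The function name and the folder name in the usage line are hypothetical, and the sketch assumes the ZIP file has already been downloaded.

```python
import zipfile
from pathlib import Path


def extract_web_material(zip_path, target_dir):
    """Create target_dir if needed, extract the archive into it, and
    return the sorted list of member names that the archive contains."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return sorted(archive.namelist())
```

For example, `extract_web_material("SG247704Code.zip", "SG247704Code")` extracts the web material into a folder named SG247704Code in the current directory.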
Back cover
Master Data Management
IBM InfoSphere
Rapid Deployment Package
Implementing faster to see the benefits faster
Seeing benefits with a financial services scenario
Getting control of your data environment
IBM InfoSphere Rapid Deployment Package (RDP) for Master
Data Management (MDM) is a services offering that
combines the pre-integration of IBM InfoSphere software
with a prescriptive MDM implementation approach to
significantly reduce the cost of MDM implementations, and
reduce the overall risk. The RDP MDM delivers a fully
integrated solution that provides, to your enterprise, a single
view of the customer. It also provides a seamless upgrade
path to IBM InfoSphere MDM Server, to give you a wide and
robust range of MDM functionality.
This IBM Redbooks publication is aimed at IT architects,
Information Management specialists, and Information
Integration specialists responsible for implementing an IBM
InfoSphere Master Data Management solution on a Red Hat
Enterprise Linux 4.0 platform.
A simple financial services MDM scenario describes the RDP
for MDM offering. The scenario shows how RDP can deliver a
return on investment in a short time frame by using a phased
approach. MDM solutions can provide significant benefits to
an enterprise. Realizing those benefits and return on
investment requires implementation of an MDM solution and
a change in how the organization does business. For this
reason, how your MDM solution is implemented is often as
important as the solution itself.
INTERNATIONAL TECHNICAL SUPPORT ORGANIZATION
BUILDING TECHNICAL INFORMATION BASED ON PRACTICAL EXPERIENCE
IBM Redbooks are developed by the IBM International Technical Support
Organization. Experts from IBM, Customers and Partners from around the world
create timely technical information based on realistic scenarios. Specific
recommendations are provided to help you implement IT solutions more
effectively in your environment.
For more information:
ibm.com/redbooks
SG24-7704-01
ISBN 0738435422
