THE DATA INTEGRATION FORUM
Conference Sessions
(NOTE: Some sessions have "Pre-Reading Materials."  Follow the links to check them out!)


Monday, November 4, 2002
5:00 pm - 6:00 pm
Night School 

Concordance: Managing Mismatched Data from Multiple Sources

Denise  Draper
Chief Software Architect
Nimble Technology 

When integrating data from multiple sources, one of the primary hurdles to overcome is how to match the data that refer to the same entity across different sources, when there is no 'natural key.'   A classic example is two systems that house customer data that have to be matched on customer name or address, but names and addresses are subtly different.

This problem is solved with various kinds of 'merge/purge' techniques for creating warehouses or cleaning source data, but the problem is different when doing virtual data integration, when the underlying source data remains 'dirty' but cannot be changed.  We call this the 'concordance' problem.

In this talk, we will describe the issues involved in the concordance problem, and describe a solution based on creating an independent concordance database, which tracks the relationships between records in multiple sources.  The issues and steps involved in designing and constructing a concordance database will be described, as well as how the concordance database is then used.
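
As a rough illustration of the idea (not the speaker's actual design), a concordance database can be sketched as a single table that records which source records refer to the same entity, leaving the sources themselves untouched. The table layout, the source names "billing" and "crm", and the keys below are all hypothetical.

```python
import sqlite3

def build_concordance(links):
    """Create an in-memory concordance table from (entity_id, source, source_key) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE concordance ("
        " entity_id INTEGER NOT NULL,"
        " source TEXT NOT NULL,"
        " source_key TEXT NOT NULL,"
        " PRIMARY KEY (source, source_key))"
    )
    conn.executemany("INSERT INTO concordance VALUES (?, ?, ?)", links)
    return conn

def counterparts(conn, source, source_key):
    """Return the other (source, source_key) records linked to the same entity."""
    return conn.execute(
        "SELECT c2.source, c2.source_key"
        " FROM concordance c1 JOIN concordance c2 ON c1.entity_id = c2.entity_id"
        " WHERE c1.source = ? AND c1.source_key = ?"
        " AND NOT (c2.source = c1.source AND c2.source_key = c1.source_key)"
        " ORDER BY c2.source, c2.source_key",
        (source, source_key),
    ).fetchall()

# Hypothetical links: two systems spell the same customer differently, so the
# match is recorded here instead of by editing either "dirty" source.
LINKS = [
    (1, "billing", "B-1001"),  # e.g. "Jon Smith, 12 Oak St."
    (1, "crm", "C-77"),        # e.g. "Jonathan Smith, 12 Oak Street"
    (2, "billing", "B-1002"),  # no counterpart found yet
]
conn = build_concordance(LINKS)
print(counterparts(conn, "billing", "B-1001"))
```

A virtual integration layer could consult such a table at query time to join records across sources that share no natural key.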


Monday, November 4, 2002
5:00 pm - 6:00 pm
SIG

Database Product Futures: IBM

Berni  Schiefer
Distinguished Engineer
IBM Corporation

What's next for DB2?  IBM is focusing on product features that simplify and automate database management, as in DB2 Version 8, which will incorporate new self-managing and data-integration features.  Come and hear what's in store for IBM's enterprise data management offerings.


Tuesday, November 5, 2002
5:00 pm - 6:00 pm
Night School

Introduction to RosettaNet

Robert  Oberwetter
Application Development Manager
Tokyo Electron America 

RosettaNet combines the disciplines of XML, Data Integration and Modeling into an integrated business process.  All three areas are critical to, and used by, RosettaNet.  This presentation describes:

Attendees will learn that there is more to application integration than just the integration activities.  Each integration has a business process which must be defined and modeled.


Tuesday, November 5, 2002
5:00 pm - 6:00 pm
SIG

Database Product Futures: Microsoft

Sam  Batterman
Senior Technical Specialist
Microsoft 

Microsoft's offerings continue to make inroads into most areas of the database market. During the briefing you'll learn what's next for SQL Server, its various enhancements and other major developments in Microsoft's enterprise data management product line. 


Tuesday, November 5, 2002
5:00 pm - 6:00 pm
SIG

ARUG Meeting

Tom  Bilcze
Roadway Express
   

The Advantage Repository Users Group will meet to conduct business and introduce interested attendees to the activities of the users group. 


Wednesday, November 6, 2002
7:15 am - 8:15 am
SIG

Sharing Live Application Data Across the Internet: A New Concept in Data Storage

Harry Ellis
British Army

This presentation will outline a number of principles and associated technology that should enable automatic rule-based distribution of live application data.

Pre-Reading Materials


Wednesday, November 6, 2002
10:30 am - 11:30 am
Conference Session

Using Web Services for Integration Within and Outside the Enterprise

Leo  Kraunelis
Director
OASIS/XML.org
 

Web Services are reducing the cost of integration.  This presentation explains what web services are, and demonstrates their application through case studies and ROI examples. It also offers insight on market trends and tools.


Wednesday, November 6, 2002
10:30 am - 11:30 am
Conference Session 

The Many Become One ... Integrating Disparate Data into an Enterprise Data Warehouse

Alan  Chow
SVP, R&D
Teradata, a division of NCR

At its onset, data warehousing promised businesses a better understanding of their customers' businesses as a basis for better decision-making.  Fifteen years later, some organizations have achieved that goal.  They know their customers better, adapt to change faster, and their more accurate predictions pay off in business terms.  However, other organizations have poured money into data warehousing efforts, but haven't realized potential returns.  What's the difference?  All too often analysis is hindered by islands of data scattered across their organization. 

Businesses house data throughout the organization in unconnected and incompatible data marts, creating multiple versions of the truth.  Relatively speaking, data marts appear to be a cheap way, especially to business units, to have control over specific information.  However, current research highlights the disadvantages of those data marts in terms of the cost of ongoing support and maintenance.

In this presentation, Alan Chow, SVP, R&D, Teradata, a division of NCR, explores how to integrate data from disparate transactional ERP systems into an enterprise active data warehouse - giving users a single view of their business and avoiding the use of costly data marts and ODS.  Alan includes a discussion of consolidation strategies, from higher-level architectures to specific technical solutions, that increase the speed and reduce the cost of data integration between disparate systems. 


Wednesday, November 6, 2002
10:30 am - 11:30 am
Conference Session

Be The Master Of Your Domain

Doug  Stacey
Team Leader, Metadata Infrastructure Support
Allstate Insurance Company

Renee Zea
Data Analyst
Allstate Insurance Company
 

Domain Management is at the core of Allstate Insurance Company's data integration strategy.  Through the management of Business Domains, Allstate's Enterprise Data Management team has achieved consistency in business definitions, documented and integrated the multiple sets of values and codes used throughout the enterprise, and provided the links between physical schemas and logical data models.  By building a Domain Management set of tools, Allstate has created a solution for researching, managing, and standardizing both encoded and non-encoded data.  This presentation will discuss the tools and techniques we used for building the environment and how the application community is now leveraging that information for the integration of systems.


Wednesday, November 6, 2002
11:40 am - 12:40 pm
Panel

Database Futures

Alan  Chow
SVP, R&D
Teradata, a division of NCR 

William  Ruh
Senior Vice President of Professional Service
Software AG, Inc

Sam  Batterman
Senior Technical Specialist
Microsoft

This panel session brings together three distinctly different, but strongly held, views of the future of database technology.  Software AG sees that native XML databases will explode in popularity, and soon.  Indeed, they are betting the company on this view.  Teradata, always the proponent of “big” solutions, sees enterprise warehousing as the model for the future, at the expense of smaller, distributed data marts.  Intuitively, consolidation seems to make sense, but does it work as well in practice?  Then there’s Microsoft, perhaps the only organization big enough to cover all its bases.  Where does it see the marketplace heading in the near, medium and long term?


Wednesday, November 6, 2002
11:40 am - 12:40 pm
Conference Session

Are We Headed Towards Massively Distributed Integration?

Michael  Hoskins
President
Data Junction Corporation 

As integration is now requisite both inside and outside the enterprise, new kinds of problems are created demanding new solutions. In today’s environment of Distributed Application Integration (DAI), each new application or integration point (inside or outside the enterprise) spawns a new set of integration issues, highlighting the dynamic, exponential (as opposed to the traditional linear) nature of today’s integration challenges. Subsequently, new problems are not readily addressed by the traditional standby solutions. Custom code, EAI tools and XML-only B2Bi solutions all fail to address these new challenges and concerns that crop up when attempting to "connect everyone to everything." The modus operandi of each of these solutions is to resolve one integration issue at a time, relying on a "problem du jour" framework that does not adequately confront integration at a widespread, systemic level. 

To solve today’s massively distributed application integration projects, solutions must be massively distributed as well.  Basic patterns in biology teach us what type of architecture effectively solves massively distributed problems -- not only must the solution itself be massively distributed, it must also be highly intelligent and dynamic, changing and developing as the challenges themselves evolve. Consequently, it is through emergent integration systems, working at the firewall of each business in an integration chain, that disparate data can be mediated (semantically and syntactically) into the enterprise’s own unique systems.  The presentation will also broach how Web services fit into the picture.


Wednesday, November 6, 2002
1:45 pm - 2:45 pm
Panel

The Semantic Web

Brett  Champlin
Process Center of Expertise
Allstate Insurance Company 
 

William  Ruh
Senior Vice President of Professional Service
Software AG, Inc

Dave  McComb
President
Semantic Arts

The Semantic Web is a much anticipated (and yet often misunderstood) concept.  Fundamentally, the Semantic Web is a vision of the future in which documents and data contain descriptive metadata which allows them to be easily understood by computers.  Like so many nascent ideas in technology, the potential payoffs are huge, but the implementation questions remain unanswered.  Nonetheless, it deserves a much closer look.

In this session we’ll evaluate the business implications of the Semantic Web, and how much time and effort your organization should devote to further research.

Pre-Reading Materials


Wednesday, November 6, 2002
1:45 pm - 2:45 pm
Conference Session

ETL vs. EAI: Comparing Data Integration Approaches

Faisal  Shah
Chief Technology Officer
Knightsbridge Solutions

EAI follows ETL as the latest category of data integration tools.  Many organizations are tempted to address all of their integration needs through just one category of tool.  At first, this seems like the most cost-effective and efficient way to address the integration issue. 

Unfortunately, the long-term costs of trying to solve ETL issues with EAI tools (and vice versa) can far outweigh the upfront costs. The two categories treat latency, unit of work granularity, meta data integration, third-party product integration, and other product dimensions differently.

An organization needs to address ETL and EAI holistically and at the same time understand that there are still significant differences between the tools and ways to approach integration projects.  EAI and ETL tools continue to grow closer together, but there are still significant advantages to using each for its original purpose, and knowing how to leverage these will allow an integration project to deliver the right information at the right time and at the right cost.


Wednesday, November 6, 2002
1:45 pm - 2:45 pm
Conference Session

Data Refactoring: Enabling Iterative and Incremental Database Development

Scott W. Ambler
President and Senior Consultant
Ronin International

“Traditional” development practices, practices that are still followed by many data professionals today, are nearly serial in nature and “driven” by one or more forms of entity/data model that were baselined early in the software lifecycle.  Times have changed.  Dramatically.  Modern software development methodologies, including both rigorous processes such as the Rational Unified Process (RUP) and agile processes such as eXtreme Programming (XP), are based on the premise that software should be developed in an iterative and incremental manner.  Furthermore, these processes are often driven by new types of artifacts, use cases and user stories respectively, and not data-oriented artifacts.  Application developers are adopting new ways to work; why can’t data professionals?

Data refactoring is a technique that enables data professionals to work in an iterative and incremental manner, just like the application developers they support.  Like source code refactoring, data refactoring is based on the idea that you can evolve your data schema over time by applying small changes that improve its design without destroying its original invariants.  This presentation explores the issues surrounding data refactoring (which is quite simple in greenfield environments but becomes quite complex in the highly coupled reality of legacy databases) and gives an overview of the techniques and philosophies that data professionals need to adopt to support modern development projects.  Data refactoring is an enabling technique of the Agile Data method.
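
To make "small changes that preserve the original invariants" concrete, here is a minimal sketch of one such step, assuming a hypothetical customer table whose column fname is being renamed to first_name. It is an illustration only, not a technique taken from the presentation: the new column is introduced alongside the old one and backfilled, so existing readers keep working during a transition period.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical legacy schema with a poorly named column.
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, fname TEXT)")
conn.execute("INSERT INTO customer (id, fname) VALUES (1, 'Ada')")

def introduce_first_name(conn):
    """One small refactoring step: add first_name next to fname, then backfill it."""
    conn.execute("ALTER TABLE customer ADD COLUMN first_name TEXT")
    conn.execute("UPDATE customer SET first_name = fname")
    conn.commit()

introduce_first_name(conn)

# Legacy code that reads fname still works; new code can read first_name.
old = conn.execute("SELECT fname FROM customer WHERE id = 1").fetchone()[0]
new = conn.execute("SELECT first_name FROM customer WHERE id = 1").fetchone()[0]
print(old, new)
```

Dropping the old column would be a separate, later step, taken only once no application depends on it. In a highly coupled legacy database, finding out when that is true is where most of the difficulty lies.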

Pre-Reading Materials


Wednesday, November 6, 2002
1:45 pm - 2:45 pm
Conference Session 

Enterprise Data Integration: Development of an Enterprise Data Model

Noreen  Kendle
Enterprise Architect
Delta Technology - Delta Air Lines

This presentation focuses on how to develop an Enterprise Data Model.  It describes the approach developed and used at Delta Air Lines, whose Enterprise Data Model is now being used to create the Operational or Enterprise Data Stores that integrate operational data across the airline business.  The approach is a practical seven-step methodology that combines "top-down" and "bottom-up" techniques: it incorporates the enterprise view needed for integration to support an ODS and/or DW, as well as the current state (work already accomplished, such as existing models) for practicality and quicker development.


Wednesday, November 6, 2002
3:15 pm - 4:15 pm
Conference Session 

XML Tools: XML Views

Bradley  Wright
Vice President, Product Development
MetaMatrix, Inc

Mark Milodragovich
Senior Information Engineer
Nimble Technology, Inc.

Integration is clearly one of the core benefits of XML deployment, and can take various forms.  In this session we examine two aspects of XML data integration.

The first, sometimes called XML Views (or virtual XML documents), dynamically mediates and integrates data from heterogeneous data sources.  The speaker will present a high-level survey of vendor claims/announcements of XML Views to help attendees sort through confusing terminology.

In the second part we will discuss the integration of legacy systems into XML standard schemas.  This presentation will show how the OMG Meta Object Facility (MOF) extends UML modeling to apply to modeling diverse information sources, including XML schemas, to create Platform Independent Models (PIMs).  The speaker will show how the schema can be represented as a virtual model and then mapped to non-XML physical sources, as well as XML sources.  He will then show how the virtual models are applied to the integration of the diverse information sources.
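
The virtual XML document idea described above can be sketched in a few lines: the XML is assembled at request time from the underlying sources and is never stored as a document. This is only an illustration under assumed names; the customer table, the ORDERS_BY_CUSTOMER mapping, and the element names are hypothetical and do not reflect any vendor's product.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Source 1 (hypothetical): a relational table of customers.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO customer VALUES (1, 'Acme Corp')")

# Source 2 (hypothetical): a non-relational order feed keyed by customer id.
ORDERS_BY_CUSTOMER = {1: ["PO-17", "PO-42"]}

def customer_view(customer_id):
    """Build a <customer> document on demand by combining both sources."""
    row = db.execute(
        "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    root = ET.Element("customer", id=str(row[0]))
    ET.SubElement(root, "name").text = row[1]
    orders = ET.SubElement(root, "orders")
    for po in ORDERS_BY_CUSTOMER.get(customer_id, []):
        ET.SubElement(orders, "order", ref=po)
    return ET.tostring(root, encoding="unicode")

print(customer_view(1))
```

Because the view is computed per request, a change in either source shows up the next time the document is asked for.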


Wednesday, November 6, 2002
3:15 pm - 4:15 pm
Conference Session

Roadmap to Federated Data Architecture

Ho-Chun  Ho
President
HoTech Corp

The goal of architectural planning is to enable organizations to optimize revenue and increase shareholder value by establishing the supporting strategy, standard process, culture, technology and best practices. Over the years organizations have been building silo systems and isolated data islands, often for practical reasons. What is largely overlooked is that a poorly designed data architecture organization contributes to this fragmentation.  This presentation will discuss typical models of data architecture organizations in the U.S., the pros and cons of each type of organization, the concept of federation governance and local autonomy, and the roadmap to establish data architecture in a federated manner based on real-life experience.


Wednesday, November 6, 2002
3:15 pm - 4:15 pm
Conference Session

Information Quality through Semantic Models

Joshua  Fox
Software Architect
Unicorn Solutions Ltd. 

Understanding data source semantics and their reference to a unified business model is central to ensuring total information quality.

This presentation will show data managers how to apply a central conceptual model to provide semantics to data schemas. It will answer the two critical questions - where is the data? and what does it mean? Combining such a model with a formal development process ensures information quality that transcends the limits of a single system, transformation, or data warehouse. 

Data integrators today analyze the business concepts behind their data, and design transformation logic to unify metadata. These procedures must be repeated individually for each data source and transformation, with the resulting integrations providing low quality output that is often impossible to maintain. Participants will learn how to apply these techniques when moving data into a data warehouse with an ETL tool, when integrating databases from recently merged organizations, or when cleansing legacy databases with badly-structured data.

The presentation will demonstrate how analysts can understand their numerous data sources without re-analyzing each schema’s semantics and structure. When the rich semantic model helps implement business information quality coherently across the enterprise, disjointed data is transformed into meaningful information.

This talk is targeted at data managers, data modelers, information quality specialists, and data stewards.  The topic is also relevant to EAI specialists who develop transformations for EAI message brokers.  The presentation will also appeal to conference attendees who are interested in new ideas in the fields of ontology and the Semantic Web.   

Pre-Reading Materials


Thursday, November 7, 2002
8:30 am - 9:30 am
Conference Session

Engaging Data Administration in the Enterprise

Tom  Bilcze
Senior Group Coordinator
Roadway Express
 

Is your corporate Data Administration group in danger of falling like a house of cards? Why have companies abandoned sound principles of data design and administration? Often the bottom line is that Data Architects who offered the promise of building a sound data infrastructure ended up littering the road to systems development with walls and obstructions.

In this session you will see how to build a collaborative environment by partnering with applications developers and end-user business staffs. You will discover some value-added techniques that will draw you in and make you a key player in business projects. You will see how your data modeling toolkit, analysis techniques and your company's Intranet can help you make this technique a reality.  

Attendees will also learn:


Thursday, November 7, 2002
8:30 am - 9:30 am
Conference Session

New Approaches to Customer Data Integration

A) Reference-Based Customer Data Integration: What it is and Why it’s Better

Chandos  Quill
Vice President, Strategic Marketing
Experian

Integrating customer data is, by nature, a reference process. Knowing whether data is accurate or not requires a picture of reality to which data cleansers and integrators can compare records. Any other process is a mathematical guessing game that tends to over- or under-merge customer records. If companies aren’t careful, they can accidentally eliminate customer relationships and perpetuate data inaccuracies.

This presentation details new reference-based data integration methods that achieve dramatically better results. These methods go beyond mere matching formulas to compare customer data to historical customer reference repositories. Case studies will be presented that demonstrate how reference-based matching has helped companies increase the accuracy and number of matching customer records, eliminate over-matching, reduce processing times and costs, and keep data integrated over time.

Pre-Reading Materials


B) Data Synchronization – A New Approach to Enterprise Customer Data Integration

Jeff  Canter
Vice President of Operations
Innovative Systems, Inc.

Data integration projects are complex and challenging. Customer data integration projects are even more complex and challenging because they usually support multiple business units, each with different requirements for defining "customer."  The departments’ competing definitions and different business objectives often undermine the success of the traditional customer data integration project.  

Data Synchronization provides a new approach to customer data integration, an approach that accommodates competing business objectives, and still provides an integrated, enterprise customer view. 

This session will present a new vision for enterprise customer data integration, and real-world applications of this valuable approach.  In this session, Jeff Canter will identify and explain the critical success factors for creating a sharable, enterprise customer profile that can easily be segmented into "purpose-driven" views to support the different requirements of departments and applications across the enterprise.  

Critical to the success of this new approach to customer data integration is data quality.  Canter will outline how organizations can ensure that these different "purpose-driven" views are consistent and accurate. 


Thursday, November 7, 2002
8:30 am - 9:30 am
Conference Session

Good (Data) to Great (Data) - Part 1

Robert  Seiner
Publisher, The Data Administration Newsletter (TDAN.com)
& Principal, KIK Consulting

In 2001, Jim Collins wrote a best-selling business book titled “Good to Great: Why Some Companies Make the Leap … and Some Don’t.”  In this book, Mr. Collins described how successful companies develop detailed business plans and build strict disciplines to go from “good” (or even mediocre or poor) companies to “great” companies.

This presentation by Robert S. Seiner (and ensuing brief discussions) will highlight a number of instances where companies did (or did not) create fundamental data management plans and implement disciplined enterprise data management efforts and what we as practitioners can learn from these efforts. 

The presentation will focus on specific business needs for an enterprise data management approach, the pragmatic disciplines that will be most effective most quickly, and the data focused technologies that can be looked upon as the accelerator to enterprise data management success.   

Pre-Reading Materials


Thursday, November 7, 2002
9:45 am - 10:45 am
Conference Session

EAI Aftermath - What Next?

Sheila  Jeffrey
Vice President
Wachovia

This presentation will discuss possible consequences of EAI (Enterprise Application Integration) strategies. The goal is to highlight less visible aspects of current EAI approaches for attendees, and outline an alternative end state.

EAI technology and business drivers will be briefly reviewed to explain the current challenges. The relationships between business organizations, processes and data will be presented as a context for examining the potential legacies of today’s EAI implementations. The evolution of data to information to knowledge (to wisdom) will be outlined as a driver for increasing solution complexity and growth in data volumes. Application architectures intended to address these expanding expectations will be reviewed - distributed solutions to complement legacy applications, ERP as EAI, data warehouses, and current middleware approaches.

Practical considerations of EAI implementations will be assessed in this context -- what is the real probable business model for Web services, security considerations, retaining control/ownership of ‘your’ data, data quality issues, and metadata management concerns. The need for a simplified, re-engineered, rationalized, distributed application portfolio will be presented as the conclusion.

The take-away from this session will be an awareness of pitfalls to avoid, design techniques to employ, and criteria for assessing EAI solutions for sustained benefit.


Thursday, November 7, 2002
9:45 am - 10:45 am
Conference Session 

Achieving Semantic Interoperability in Transactional Environments

Chito  Jovellanos
President & CEO
forward look, inc.

This presentation will examine and critique large-scale real-world applications that address the semantic interoperability problem.  Using the Financial Securities industry as a reference point, attendees will understand the evolution of interoperability problems between trading systems, the resolution strategies to date, and the tactical approaches needed to achieve semantic interoperability.  You will gain an understanding of the challenges presented by real-world objectives and constraints.  The speaker will also explore a new approach to the semantic interoperability problem called "semantic signaling", and will debunk the notion that XML is a prerequisite for semantic interoperability.

The approaches taken within the securities industry are indicative of the problems that will need to be dealt with in other industries that are attempting to participate in eCommerce (e.g., governments, manufacturing, etc.). Every transaction has counter-parties, a need for usable product information, a reference data framework such as calendar information, currency regime, taxes, fees and so on - all of which are presented to industry partners using the semantics inherent in the enterprise's internal systems.

The presentation will be of interest to both practitioners and applied researchers who are currently engaged in large-scale Enterprise Application Integration (EAI) projects. This presentation offers a practical hands-on assessment of commercial solutions and new techniques for addressing the deep semantic issues in EAI.

Attendees should have a fundamental understanding of metadata management, basic statistics, ontology development, XML, distributed systems, and middleware.

The Semantic Interoperability Problem

Case Studies from the Securities Industry (including application demonstrations and walkthroughs)


Thursday, November 7, 2002
9:45 am - 10:45 am
Conference Session

Good (Data) to Great (Data) - Part 2

Robert  Seiner
Publisher,
The Data Administration Newsletter (TDAN.com)
Founder & Principal, KIK Consulting

Hal Davis
Project Manager
Mellor Financial Services

William Lewis
Senior Technology Specialist
Cambridge Technology Partners

In 2001, Jim Collins wrote a best-selling business book titled “Good to Great: Why Some Companies Make the Leap … and Some Don’t.”  In this book, Mr. Collins described how successful companies develop detailed business plans and build strict disciplines to go from “good” (or even mediocre or poor) companies to “great” companies.

This presentation by Robert S. Seiner (and ensuing brief discussions) will highlight a handful of case studies where companies did (or did not) create fundamental data management plans and implement disciplined enterprise data management efforts and what we as practitioners can learn from these efforts.

The presentation will focus on specific business needs for an enterprise data management approach, the pragmatic disciplines that will be most effective most quickly, and the data focused technologies that can be looked upon as the accelerator to enterprise data management success.


Thursday, November 7, 2002
11:00 am - 12:00 pm
Conference Session 

Real-Time Integration and Analytics

Seth  Grimes
Principal Consultant
Alta Plana Corporation

John  Ko
Product Marketing Manager
DataMirror
 

Ron Agresta
Products Engineer
DataFlux

In this session we look at the inexorable push for instantaneous information.  Whatever your preferred terminology -- “Real-time”, “Zero latency”, “Information on Demand”, “Active Warehousing” -- you need to be working towards shorter and shorter timeframes for getting data into and out of analytical systems.

How is real time integration accomplished in an XML world?  Companies must be able to capture selected events such as purchase orders or invoicing from any application database and send them in industry standard XML formats across the enterprise and beyond. Is the “streaming” of XML documents to application servers, B2B exchanges or other XML-driven applications the answer?  

Pre-Reading Materials

