THE MODELING FORUM
Conference Sessions
(NOTE: Some sessions have
"Pre-Reading Materials." Follow the links to check them out!)
Monday, November 4, 2002
5:00 pm - 6:00 pm
SIG
Stewardship - The Road Taken
Loretta Smith
Information Architect
T. Rowe Price Group
Although articles, books, and conference presentations on high-quality software development frequently identify Stewardship as a critical success factor, we rarely get clear directions on how to get there. As in any journey, there are many routes that you can take, each having different obstacles and signposts. In this presentation we will show the audience the travel journal from T. Rowe Price’s trip and describe our experiences. Finally, the team will facilitate a Q&A session, encouraging the audience to share their own Stewardship experiences.
Baseline definitions of primary terms
Roadmap identifying Stewardship Milestones
Outline Stewardship Policy
Opportunity to establish contacts with peers in the Stewardship issue arena
Complete documentation of recommendations and suggestions from the facilitated session
Tuesday, November 5, 2002
5:00 pm - 6:00 pm
SIG
CMM for Data Management
Peter Aiken
Founding Director
Institute for Data Research
The Capability Maturity Model (CMM) was originally developed at the Software Engineering Institute (SEI) at Carnegie Mellon University, in Pittsburgh, as a tool for the assessment, measurement and comparison of software and systems development practices. Recently, a number of independent initiatives have started to think about ways to incorporate data and metadata management practices into the existing CMM structure. Some have even suggested a specific Data Management CMM (DMCMM). The SIG will discuss a number of these efforts and see if the groundswell of interest in a DMCMM is likely to bear fruit.
Tuesday, November 5, 2002
5:00 pm - 6:00 pm
SIG
The Data Vault: The Next Evolution in Data Modeling
Daniel Linstedt
Chief Technology Officer
Core Integration Partners
The Data Vault is a patent-pending technique which some industry experts have predicted may start a revolution as the next big thing in data modeling for enterprise warehousing. This SIG session, led by the creator of the Data Vault, will explain what this new concept is, what its architecture and components are, its applications, and the advantages of the Data Vault over existing techniques.
Wednesday, November 6, 2002
10:30 am - 11:30 am
Conference Session
Data Model Patterns, Generalizations and XML
Roland Berg
Principal Consultant
ThinkSpark
Patterns and generalizations have long been touted as the solution to managing unstable data environments. The use of metadata-centered database designs creates a great deal of flexibility in the structures and allows the database to become evolutionary in content while maintaining structural consistency.
XML and related technologies are widely considered to be the solution to the problem of exchanging dynamic data across diverse systems.
It is only natural that we should explore the implications of applying XML-based data interchange to patterned/generalized databases and vice-versa. This presentation is a discussion of the impact that each technological approach has on the other and an exploration of the alternatives available for representing a patterned/generalized database as an XML structure.
Specific issues addressed will include the interaction of the object-like nature of the highly generalized database with the hierarchical structure of XML and options for transporting the highly metadata-driven information via XML.
Attendees will learn:
How XML and model patterns/generalizations interact
Options for using XML to transport these structures
Strengths and weaknesses of the various options
Wednesday, November 6, 2002
10:30 am - 11:30 am
Conference Session
Unified Modeling
Robert A. Maksimchuk
dotNET Senior Evangelist
Rational Software Corporation
One of the major reasons software projects fail to meet their milestones is ineffective communication. Using the UML to model business processes, system requirements, software applications and database design allows you to improve communication within your software project. Come see how the UML can be used to streamline communication across your whole development team.
In this session you will learn how the different UML diagrams can be used throughout the entire development lifecycle.
Wednesday, November 6, 2002
11:40 am - 12:40 pm
Conference Session
Analytical Modeling Manifesto
Tom Haughey
Chief Technology Officer
Pepsi Bottling Group
This presentation will re-examine the concept of data design for analytical systems such as the data warehouse. It will take a close look at dimensional modeling and define its proper role and context. It will position ER modeling, dimensional modeling, and other forms within a general framework. Dimensional modeling is usually presented as the end-all and be-all of data warehousing. Is dimensional modeling one of the great con jobs in data management history? In fact, dimensional modeling has strengths and weaknesses. In some ways it has become outmoded. In other ways, it has been around for decades (and will continue to be). There are three ways to improve performance: use better hardware, use better software, and optimize the data. Dimensional modeling uses the third: its primary justification is to improve performance by compromising the data to compensate for the inefficiency of technology. A secondary purpose is to provide a consistent base for analysis. Dimensional modeling comes with a price and with restrictions. There are times and places where dimensional modeling is appropriate and will work, and other times and places where it is inappropriate and will actually interfere with the goals of a warehouse.
To make matters worse, the data warehouse industry suffers from a host of double-entendres that make it difficult to communicate meaningfully. It is not uncommon for two “gurus” to disagree about something without realizing that they are not talking about the same thing. Because of this it is actually necessary to start over and define some terms. This presentation will do just that: it will reexamine these concepts and redefine them; it will establish a framework for integration; and it will address a number of specific analytical modeling issues or situations, such as the following:
The main characteristics of analytical models
How to distinguish logical from physical models
The importance of using principles (not patterns) to do design
How to do database optimization
Logical vs. physical models
ER model vs. dimensional model
Data model optimization
Different fundamental grains of facts
Seamless extensibility of a database
Changing dimensions
Assignment of keys, including surrogate keys
Aggregates
Prodigal data
Ragged hierarchies
Dimensions with multiple values or roles
Representing what did and did not happen
Conforming dimensions
Unexpected data
Time variant models
Dealing with changes in the model
Wednesday, November 6, 2002
11:40 am - 12:40 pm
Conference Session
Experiences of a Data Modeler on a RUP Project
Christine Mandracchia
Manager - Data Administration
American ReInsurance
RUP (Rational Unified Process) is a system development methodology that provides a disciplined approach for object-oriented software engineering projects. The designated roles and artifacts within RUP have impacted the traditional involvement of this logical data modeler on application development projects. In this presentation, Christine will cover her pre-conceived notions about data modeling for a RUP project, in the areas of roles, the object class model, other related deliverables, and the iterative process. She will then describe her actual experiences on the RUP project in these same areas, and her current perspective on when to develop an entity-relationship data model and an object class model.
Wednesday, November 6, 2002
1:45 pm - 2:45 pm
Conference Session
Data Refactoring: Enabling Iterative and Incremental Database Development
Scott W. Ambler
President and Senior Consultant
Ronin International
“Traditional” development practices, which are still followed by many data professionals today, are nearly serial in nature and “driven” by one or more forms of entity/data model baselined early in the software lifecycle. Times have changed. Dramatically. Modern software development methodologies, including both rigorous processes such as the Rational Unified Process (RUP) and agile processes such as eXtreme Programming (XP), are based on the premise that software should be developed in an iterative and incremental manner. Furthermore, these processes are often driven by new types of artifacts (use cases and user stories, respectively) rather than by data-oriented artifacts. Application developers are adopting new ways to work; why can’t data professionals?
Data refactoring is a technique that enables data professionals to work in an iterative and incremental manner, just like the application developers they support. Like source code refactoring, data refactoring is based on the idea that you can evolve your data schema over time by applying small changes that improve its design without destroying its original invariants. This presentation explores the issues surrounding data refactoring (although quite simple in greenfield environments, it becomes quite complex in the highly coupled reality of legacy databases) and gives an overview of the techniques and philosophies that data professionals need to adopt to support modern development projects. Data refactoring is an enabling technique of the Agile Data method.
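As a rough illustration of evolving a schema through small changes, the sketch below renames a column in steps while keeping the old column readable. It is not taken from the presentation: the helper `rename_column_in_steps`, the `customer` table, and its column names are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3


def rename_column_in_steps(conn, table, old, new):
    """Hypothetical helper: start a column rename as small, checkable steps.

    Identifiers are interpolated directly, so this is for trusted,
    illustrative input only.
    """
    cur = conn.cursor()
    # Step 1: add the new column next to the old one (nothing breaks yet).
    cur.execute(f"ALTER TABLE {table} ADD COLUMN {new} TEXT")
    # Step 2: copy existing values across.
    cur.execute(f"UPDATE {table} SET {new} = {old}")
    # Step 3: check the invariant that both columns agree (IS NOT is null-safe).
    mismatches = cur.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {new} IS NOT {old}"
    ).fetchone()[0]
    conn.commit()
    # A later, separate step would drop the old column once no
    # application reads it any more; that step is not performed here.
    return mismatches


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, fname TEXT)")
conn.executemany("INSERT INTO customer (fname) VALUES (?)", [("Ada",), ("Grace",)])
mismatches = rename_column_in_steps(conn, "customer", "fname", "first_name")
```

Existing queries against `fname` keep working during the transition, which is the point of making the change in small steps.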
Wednesday, November 6, 2002
1:45 pm - 2:45 pm
Conference Session
Enterprise Data Integration: Development of an Enterprise Data Model
Noreen Kendle
Enterprise Architect
Delta Technology - Delta Air Lines
This presentation focuses on how to develop an Enterprise Data Model. It describes the approach developed and used at Delta Air Lines, whose Enterprise Data Model is now being used to create the Operational or Enterprise Data Stores, integrating operational data across the airline business. The approach is a practical seven-step methodology that combines "top-down" and "bottom-up" techniques: it incorporates the enterprise view needed for integration to support an ODS and/or DW, as well as the current state (work already accomplished, such as existing models) for practicality and quicker development.
Definition of an Enterprise Data Model
Explanation of the methodology used, a combined top-down and bottom-up approach
A description of how to create an Enterprise Subject Area Model, with real-world examples
A description of the development of the Enterprise Conceptual Models, with real-world examples
A description of data rationalization in the bottom-up/top-down integration resulting in the Enterprise Data Model
An explanation of the supportive documentation and continuation of the iterative process
Wednesday, November 6, 2002
3:15 pm - 4:15 pm
Conference Session
Information Quality through Semantic Models
Joshua Fox
Software Architect
Unicorn Solutions Ltd.
Understanding data source semantics and their reference to a unified business model is central to ensuring total information quality.
This presentation will show data managers how to apply a central conceptual model to provide semantics to data schemas. It will answer two critical questions: where is the data, and what does it mean? Combining such a model with a formal development process ensures information quality that transcends the limits of a single system, transformation, or data warehouse.
Data integrators today analyze the business concepts behind their data, and design transformation logic to unify metadata. These procedures must be repeated individually for each data source and transformation, with the resulting integrations providing low quality output that is often impossible to maintain. Participants will learn how to apply these techniques when moving data into a data warehouse with an ETL tool, when integrating databases from recently merged organizations, or when cleansing legacy databases with badly-structured data.
The presentation will demonstrate how analysts can understand their numerous data sources without re-analyzing each schema’s semantics and structure. When the rich semantic model helps implement business information quality coherently across the enterprise, disjointed data is transformed into meaningful information.
This talk is targeted at data managers, data modelers, information quality specialists, and data stewards. The topic is also relevant to EAI specialists who develop transformations for EAI message brokers. The presentation will also appeal to conference attendees who are interested in new ideas in the fields of ontology and the Semantic Web.
Wednesday, November 6, 2002
3:15 pm - 4:15 pm
Panel
Current Controversies in Data Modeling
Brett Champlin
Process Center of Expertise
Allstate Insurance Company
Graeme Simsion
Senior Fellow
Melbourne University
Robert A. Maksimchuk
dotNET Senior Evangelist
Rational Software
Scott W. Ambler
President and Senior Consultant
Ronin International
David C. Hay
President
Essential Strategies
What are the key issues that data modelers are grappling with today? What topics are igniting flames on the discussion boards? In the “real” world, when people sit down to address new projects or new requirements, what are the things that generate arguments between colleagues? Where are the conflicts between modelers and other stakeholders? And what new business and technology trends are changing the face of data modeling right now?
UML vs. E/R vs. whatever else
“Agile” modeling
Buy vs. build
Enterprise vs. project
Renaissance vs. Disillusionment: What is data modeling’s place in the world?
Thursday, November 7, 2002
7:15 am - 8:15 am
SIG
ERwin Users "Get Together"
Marcie Barkin Goodwin
President/CEO
Axis Software Designs
Thursday, November 7, 2002
8:30 am - 9:30 am
Conference Session
Surviving and Thriving using Data Modeling Standards & Procedures
Marcie Barkin Goodwin
President/CEO
Axis Software Designs
"No problem is so big or so complicated that it can’t be run away from."
--Linus, in Peanuts
Linus was probably talking about Standards & Procedures in the data modeling environment. Standards and procedures are those seemingly nasty things that everyone knows they should have, but don’t want to admit to. Or they do have them but don’t use them. They cause universal grimaces and moans when someone is faced with the writing, implementing and enforcing of these vitally important (though unpopular) bastions of development.
There is, however, such an enormous advantage to using standards & procedures that the issue is not whether they add value, but how an organization can most efficiently and effectively realize their return on investment.
This presentation will provide the whys and how-tos of establishing an effective and maintainable data modeling development infrastructure. Fortified with useful handouts as well as ‘in the trenches’ anecdotes, there will be something of interest for the new, the experienced and the sometimes beleaguered data modeler, business analyst, project manager, business user, and IT Upper Manager.
Best Practices - Suggestions For A Successful Modeling Effort
The Big Picture
An Enterprise Wide View
Methods, Standards & Procedures
Model Reviews
Logical Model Validation
Physical Model Validation
Model Guidelines
Quality Assurance
Logical & Physical Models
Keep The Link (Baby)
Thursday, November 7, 2002
9:45 am - 10:45 am
Conference Session
The Process Potential of Temporal Data Structures
Henry Feinman
Principal
HJF Information Solutions
Difficulties encountered in developing structures that incorporate time have stunted the growth of the data-centric approach to modelling business process.
Temporal enablement has a foothold in data warehousing. Methods and techniques used here can be extended to operational data models, freeing these structures to define process and state transition.
Some Definitions
Process
Activity
Event
Workflow
State
State Transition
Brief primer in temporally enabled models
A state transition data model
A state transition reference data model (the rules)
A process model
A process reference data model (the rules)
Attendees will learn:
To temporally enable data models
To extend data modelling's definitional capability to business process
Why is it important?
Business process is the method by which the organization attempts to manage state transitions for the benefit of itself and its customers. A definition of required and desired state transitions, or business rules, can be created in procedural code, or within the data structure, though it is usually done in procedural code. Defining business process in procedural code locks process change to IT systems change.
There are many advantages to moving state transition definition to data (ease of modification, flexibility, agility), but the difficulties encountered in developing structures that incorporate time have prevented widespread exploitation of the data-centric approach.
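The contrast between rules in procedural code and rules in data can be sketched as follows. This is a minimal illustration under stated assumptions, not the models from the session: the state names, the `ALLOWED_TRANSITIONS` rule set, and both helper functions are hypothetical. The permitted transitions are plain data, and each state change is recorded with the time it took effect.

```python
from datetime import datetime

# Hypothetical rules: permitted (from_state, to_state) pairs, held as data.
# In a database these would be rows in a reference table, so changing the
# process means changing rows rather than code.
ALLOWED_TRANSITIONS = {
    ("Submitted", "Approved"),
    ("Submitted", "Rejected"),
    ("Approved", "Shipped"),
}


def apply_transition(history, new_state, at):
    """Append (new_state, at) to the history if the rule data permits it."""
    current_state = history[-1][0]
    if (current_state, new_state) not in ALLOWED_TRANSITIONS:
        raise ValueError(f"transition {current_state} -> {new_state} not allowed")
    history.append((new_state, at))
    return history


def state_as_of(history, when):
    """Return the state in effect at `when`, or None if before the first entry."""
    result = None
    for state, effective_from in history:
        if effective_from <= when:
            result = state
    return result


# Each entry is (state, effective_from); the history is kept in time order.
order = [("Submitted", datetime(2002, 11, 4, 9, 0))]
apply_transition(order, "Approved", datetime(2002, 11, 5, 9, 0))
```

Because every state keeps its effective time, the same structure answers both "what is allowed next?" and "what was the state on a given date?".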
Thursday, November 7, 2002
11:00 am - 12:00 pm
Conference Session
A Common Model for Classification Hierarchies
William Lewis
Senior Technology Specialist
Cambridge Technology Partners
Among the most frequently occurring challenges faced by data modelers are classification hierarchies, or taxonomies, used for grouping and analyzing common business entities such as Customers, Products, Accounts or Organization Units. Widespread technologies implementing such classification hierarchies include OLAP dimensions, data mining clusters, knowledge taxonomies and LDAP directories. With the growing emphasis on both corporate portals and information security, these requirements have taken on increasing urgency. Do common, reusable and innovative patterns exist for modeling these requirements?
This presentation begins with examples of common business cases including Customers, Products, Accounts or Organization Units. Conventional patterns for modeling these requirements are then addressed, along with examples of techniques for "flattening" hierarchical structures. Then a highly abstract, yet widely applicable and implementable model for addressing classifications and hierarchies across multiple application domains will be presented and described. Capabilities of this model to support flexible and dynamic hierarchical classifications of detailed, summary and historical time-series data, by incorporating features of object, multi-dimensional and entity-relationship models, will be explained in detail. The presentation will conclude with examples of actual physical implementations of such a model.
With the growing emphasis on both corporate portals and information security, requirements for flexible modeling of classification hierarchies have taken on increasing urgency.
In this session, attendees will learn:
Widely-used classification hierarchy patterns for Customers, Products, Accounts and Organization Units
Examples of techniques for “flattening” hierarchical structures
An innovative, reusable modeling pattern for classification hierarchies, applicable across multiple application domains and technologies
Examples of actual physical implementations of such a model
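One common reading of "flattening" a hierarchy is to turn a parent-child structure into rows with a fixed number of level columns. The sketch below shows that idea only; it is not the model presented in the session, and the function `flatten_hierarchy`, the sample product names, and the padding convention for short branches are all assumptions.

```python
def flatten_hierarchy(parent_of, depth):
    """Map each node to a fixed-width tuple of its ancestors, root first.

    `parent_of` maps a node to its parent (None for the root). Branches
    shorter than `depth` are padded by repeating the lowest node, which is
    one of several possible conventions; longer paths are truncated.
    """
    rows = {}
    for node in parent_of:
        path = []
        current = node
        while current is not None:
            path.append(current)
            current = parent_of[current]
        path.reverse()  # root first, node last
        padded = path + [path[-1]] * (depth - len(path))
        rows[node] = tuple(padded[:depth])
    return rows


# Hypothetical product classification held as child -> parent pairs.
parent_of = {
    "All Products": None,
    "Beverages": "All Products",
    "Cola": "Beverages",
    "Snacks": "All Products",
}
flat = flatten_hierarchy(parent_of, depth=3)
```

Here `flat["Cola"]` holds one value per level, so it can be stored as three ordinary columns and grouped on without recursive queries.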
Thursday, November 7, 2002
11:00 am - 12:00 pm
Conference Session
Bringing Data Modeling/Data Administration into an Organization: What's Worked...and Not
Abbie Allen
Systems Technical Consultant
Farmland Insurance
Introducing Data Modeling/Data Administration into an organization can be very challenging and frustrating, yet very rewarding and exciting at the same time…when it works! Amazingly, what works for one company may not work as well for another.
This presentation will examine some of the trials and tribulations experienced when introducing Data Administration into various companies in order to provide some examples of what worked well, what didn’t work so well and some thoughts on why.
This presentation will not give you a magic checklist to take back and implement, but rather different ideas to incorporate into your particular situation, i.e., new ‘pitches’ for you to try…
For each company we’ll take a look at:
the environment
the structure of the Data Administration group
the techniques used to educate both the modeler and the knowledge worker
the deliverables introduced
the results