Subject Area Models 1

Status
Not open for further replies.

yaffle

MIS
Sep 15, 1999
127
US
I'd like to pick up a point made initially under the Star Schema and Technical Keys thread (kalyan Jul 30, 1999) about the Subject Area Models mentioned by SASMAN.
<quote>
The Subject Area Model is not specifically an ERD in the traditional sense of the word. It is a highly abstracted model representing the subject areas in an enterprise and the relationship between those subjects. As far as 2 weeks to do the model is concerned, most of the Architect's time will probably be spent arranging or waiting for meetings with organisational stake-holders rather than modelling. When it comes down to the model, a corporation probably has around 20 subject areas and to model these in a subject model is quite possible in 2 weeks.
</quote>
I can see how such a model would be useful when designing a DW initially. Even if only some of the identified subject areas are to be covered by the DW and associated datamarts at 'go-live', it helps to know of the others in order to estimate future requirements that may need extra DW capacity. If future subject areas are already identified, then it may also be decided to hold a few extra data values that would allow linkage when they are developed (e.g. a key for a dimension to be built later). The ERDs for individual subject areas would presumably be developed as and when they are (individually) approved for DW inclusion.

My question would be: if the SAM is not in entity relational format, what should it look like?
Do you see a SAM as simply a series of named boxes, with perhaps a few key data elements inside, or as something a bit more sophisticated?
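
The idea above of holding a key for a dimension to be built later can be sketched roughly as follows. This is only an illustration under assumed names: the tables `sales_fact` and `campaign_dim` and all of their columns are hypothetical and do not come from the thread. An in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Go-live: only the (hypothetical) sales subject area is in scope, but the
# fact table already carries campaign_id so a later dimension can link to it.
cur.execute("""
    CREATE TABLE sales_fact (
        sale_id     INTEGER PRIMARY KEY,
        product_id  INTEGER NOT NULL,
        campaign_id INTEGER,          -- held for a dimension to be built later
        amount      REAL NOT NULL
    )
""")
cur.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
    [(1, 10, 500, 20.0), (2, 11, 500, 35.0), (3, 10, 501, 15.0)],
)

# Later release: the (hypothetical) marketing subject area is approved
# and its dimension table is built.
cur.execute(
    "CREATE TABLE campaign_dim (campaign_id INTEGER PRIMARY KEY, campaign_name TEXT)"
)
cur.executemany(
    "INSERT INTO campaign_dim VALUES (?, ?)",
    [(500, "Spring"), (501, "Autumn")],
)

# The facts loaded at go-live join to the new dimension without a reload.
totals = cur.execute("""
    SELECT d.campaign_name, SUM(f.amount)
    FROM sales_fact AS f
    JOIN campaign_dim AS d ON d.campaign_id = f.campaign_id
    GROUP BY d.campaign_name
    ORDER BY d.campaign_name
""").fetchall()
print(totals)
```

The point of the sketch is only that the extra key costs one column at go-live, whereas adding it afterwards would mean revisiting the fact table and its load.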
 
Perhaps you didn't read my rejoinder to SASMAN. The ERD that is the subject area model to which I refer is of the variety sometimes referred to as a logical model or business model by those who are trained as DBAs, instead of the off-the-top-of-the-head database design usually dubbed an ERD by the DBA crowd. The logical model to which I refer is usually thrown away by the DBAs after database design is complete in a typical application development project. If that is your background, it is easy to understand how you would see little value in such a model.

If, in your organization, there is no need for the statistics generated by one data mart to be consistent with statistics from another, then there is really little need for modeling of any kind. Star joins are easy to understand and easy to design. All you really need is Informatica and a really enthusiastic entry-level application developer. You will be cranking out data marts like sausages in no time.

However, if you are in a situation, like most of us out here in the World of Private Enterprise, where consistency, far from being the bugbear of small minds, is the source of vast power, then the subject area model takes on a great deal of importance. Its most important contribution is that it helps integrate new subject matter into the warehouse over time without unnecessary change to other structures, such as main warehouse tables, star join data marts, and other specialized reporting databases that are dependent on the design of the warehouse.

The subject area model is a high-level model. I agree with you that, in most cases, two weeks is ambitious unless the subject area is a very small one. That is not to say that a subject area model needs to take the same amount of time as a full-scale ERD. Most data warehouses begin as a means of satisfying a limited number of business cases. Because of this, most of the design effort goes into solving the immediate business problem. A subject area model acts as a balance against this tendency, but will still tend to be biased by the business problems that are the immediate concern of the warehouse project initially, unless it is painted with a broad brush at a high level. It acts as a template for additional construction.
 
Yes, WHArchitect, I did read your rejoinder to SASMAN, and the rest of the original thread - it contains some very useful discussion of general DW principles. My question related primarily to the Subject Area Models as described by SASMAN that, as you pointed out in your 30-Jul post in that thread, are not the same as the high-level ERDs that you discussed.
To attempt to answer my own question: the format of a SAM should be whatever best shows a mainly non-technical audience (e.g. executives, potential users, new joiners) what the principal areas of the enterprise are, and what sort of information is (or is desired to be) available for each. It would also be useful to identify, for each area, the systems within the enterprise that hold such information. Such a model could then be used to present the scope of a DW project, i.e. "this is the whole enterprise, we're looking at these areas, which will provide this information, and once we've done that we'd like to look at this other area to provide this".
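
As a rough sketch of the format just described, a SAM of this overview kind could be held as a simple list of named subject areas, each with the information it offers, the systems that hold it, and the project phase in which it is to be covered. Everything here is an assumption made for illustration: the `SubjectArea` class, the `scope_by_phase` helper, and the example areas and systems are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SubjectArea:
    name: str
    information: List[str]        # what the area can (or should) tell us
    source_systems: List[str]     # systems in the enterprise holding that data
    phase: Optional[int] = None   # DW project phase; None = not yet scheduled


def scope_by_phase(areas: List[SubjectArea]) -> Dict[str, List[str]]:
    """Group subject area names by the project phase that will cover them."""
    phases: Dict[str, List[str]] = {}
    for area in areas:
        label = "unscheduled" if area.phase is None else f"phase {area.phase}"
        phases.setdefault(label, []).append(area.name)
    return phases


# Hypothetical enterprise overview with four subject areas.
areas = [
    SubjectArea("Customer", ["who buys from us"], ["CRM"], phase=1),
    SubjectArea("Sales", ["what was sold, when, for how much"], ["Order Entry"], phase=1),
    SubjectArea("Finance", ["revenue and cost by period"], ["General Ledger"], phase=2),
    SubjectArea("Human Resources", ["headcount by department"], ["Payroll"]),
]

print(scope_by_phase(areas))
```

Grouping by phase gives the "we're looking at these areas now, and this other area next" view of the project scope without any entity-level detail.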
I agree that the next level down, moving from tens of subject areas to perhaps scores or hundreds of high-level logical entities in an ERD, will take considerably more effort, but that this effort is worth it in the longer term. Yes, this is distinct again from the 'near-physical' low-level ERD most commonly produced, which may run to hundreds of entities per subject area.
To restate my question: if creating a SAM of the business overview type, would you use the same tools and methods as you would to produce ERDs (of whatever level), and what would be the best format for presenting the information in such a model?

 
Ah, now we are getting to the heart of the matter. A "subject area model" à la SASMAN, as you seem to understand it, sort of encapsulates a number of entities in each of a collection of "superentities" that we shall call "subject areas". This is the impression I get from the discussion so far. A "business model", as you appear to have understood from me so far, is a full-blown, normalized logical model such as one might build to support development of an application. If this is the case, you have misunderstood me but you probably have SASMAN's intention right.

When I talk about a "high level" ERD, I am referring to an ERD that contains fundamental entity types only, and only enough attribution to clarify the identity and scope of these entity types. It is like a sketch. As each business case comes up for implementation, this model becomes the template after it has been corrected to eliminate any errors that were included in the first, rapidly developed cut. The detailed ERD that results, still a business view, becomes the physical design of the main warehouse, if you are using that design paradigm. There is no application layer here that requires an "elegant" database solution. Some partitioning may be necessary if the ERD translates into a table or two that is too wide and too deep. In my experience, this doesn't usually happen if an appropriate granularity has been established. If you find yourself making a lot of adjustments to the logical model in order to get to physical, then in all likelihood your design has too fine a grain and you need to do more analysis.

To get back to the purpose of the subject area model (my version): it is the glue that holds the whole mess together while you are building it piecemeal. It is also an excellent vehicle to define the ultimate scope of the effort as envisioned for the future. It is also very useful for illustrating to end users the scope of the subject matter that will be implemented with each addition to the warehouse, starting with the initial build. This type of model allows you to present additions to the subject matter within the context of the business area. This makes it easier to visualize the scope and grasp the practical implications of the new development. It is developed and presented using the same tools as you use for the logical modeling. It is an ERD. It just isn't as well developed as the logical ERD will eventually be. It is preserved and maintained at this level to avoid the inconsistencies that usually result from piecemeal development by different project teams.
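
One way to picture a high-level ERD of the kind described in this post, offered only as a sketch: fundamental entity types carrying nothing but their identifying attributes, plus the relationships between them. The `EntityType` and `Relationship` classes, the `dangling_relationships` check, and the example entities are all hypothetical, and the check shown is just one possible use of such a model as a template.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EntityType:
    name: str
    identifying_attributes: Tuple[str, ...]  # just enough to fix identity and scope


@dataclass(frozen=True)
class Relationship:
    parent: str
    child: str
    verb: str


def dangling_relationships(
    entities: List[EntityType], relationships: List[Relationship]
) -> List[Relationship]:
    """Return relationships that mention an entity type missing from the sketch."""
    known = {entity.name for entity in entities}
    return [
        rel for rel in relationships
        if rel.parent not in known or rel.child not in known
    ]


# Hypothetical first cut of the high-level model.
entities = [
    EntityType("Customer", ("customer_id",)),
    EntityType("Order", ("order_id",)),
    EntityType("Product", ("product_id",)),
]
relationships = [
    Relationship("Customer", "Order", "places"),
    Relationship("Order", "Product", "contains"),
    # A later project team proposes this; Supplier is not in the sketch yet.
    Relationship("Supplier", "Product", "supplies"),
]

dangling = dangling_relationships(entities, relationships)
print(dangling)
```

A check like this flags an addition that refers to something the shared model does not yet contain, which is where piecemeal development by separate teams tends to drift apart.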
 
FYI

IBM has a book available for free download in PDF format entitled:

Data Modeling Techniques for Data Warehousing
SG24-2238-00

It is available from the IBM Redbooks site.
It has a dimensional modeling notation that could be considered a SAM (agree?). See page 90, 97...
 
Dimensional modeling is a powerful tool and databases based on star joins can be powerful data structures. However, the usefulness of the dimensional model is limited to database design, whether as an analog representation of a dimensional database or as a star or snowflake schema. While these structures are excellent and highly flexible structures from a reporting perspective, they can be difficult to update incrementally and resist design modification once they are in place. While dimensional modeling is a very useful tool, it doesn't replace the entity relationship model, because it can't represent the business context, or as the UML guys say, the "problem domain".
 