The following provides a summary of some design options, which the development team should review and check whether any of the snowflaked models are mapped as a many to many relationships and can be re-design using one of these approaches:
Technique #1: Select a Primary
Although not a technical solution, the best way to solve the M:M problem is to eliminate it. By selecting one of the many dimensional records that are associated with a fact, the entire problem can be avoided. In the OLTP, Primaries are used throughout the model, which are carried over and used in the Oracle BI model. If it is at all possible to identify a primary, and the use of the primary is acceptable to the user community, then it is recommended to use this technique.
Technique #2: Direct Modeling into the Dimension
A straightforward technique is where the table that serves as the intersection table is modeled into a lower level in the Dimension. The specifics of this technique are similar to those outlined in Solution B of the No direct physical link between a base Dimension and a Fact table section above.
Note that over-counting will occur when performing the many-to-many join.
Technique #3a: Use of a Bridge Table
Instead of modeling the relationship table into a new lower level in the dimension as in Technique #2, the relationship table can become a separate logical table that servers as the Bridge between the dimension and the facts. Create a new Logical table with the M:M relationship table as the source, mark the logical table as a Bridge table, and adjust the Business model to show the relationship of Facts:Bridge as 1:M and Bridge:Dimension as M:1. The indication that the Logical Table is a Bridge table is merely an indicator to Analytics that the table is not a Fact table, which it assumes to be any lowest-level table in the data model.
Note that over-counting will occur when performing the many-to-many join
Technique #3b: Use a Weighted Bridge Table
Similar to Technique #3a, this technique is the classic Kimball approach, where the Bridge table employs weighting factors to prorate a total value over multiple records. For example, if there is one Opportunity worth $1,000,000 and there are two Employees associated with it, the bridge table might contain a record for each with a weighting factor of 0.5. In this way, each employee will be associated with 0.5 of the whole amount of $1,000,000, or $500,000. If it is determined that Employee A should receive 75% of the credit, then the weighting factors would be stored as 0.75 and 0.25, which would give Employee A 75 of the total or $750,000.
It is important to note that the weighting factors must all add up to 1 (One), as they are effectively percentages of a whole. Additional ETL effort will be required to complete this solution.
This technique eliminates over-counting, but may be difficult to implement if users are not comfortable prorating a value over several records.
Technique #4: Use Level Based Measures
As an enhancement to Techniques 2 and 3, the use of level based measures can help prevent the over counting problem associated with each. When a metric or measure is explicitly bound to a specific level in a dimension, it is indicating that the metric will be viewed at that level. If the metrics in a fact table are to be viewed by a Dimension with which it has an M:M relationship, those metrics can be set to a level in the dimension, thereby forcing that the records be broken out across that dimension. By forcing a breakout of rows (one fact row for each dimensional row), aggregation is prevented, and therefore over counting will not occur.
As an example, suppose there is an M:M between Employee and Fact_Opty_Revenue. The data in the tables indicate that Tom, Larry and Bill are all linked to an Opportunity worth $9 million. The user makes a report that asks for the Opportunity Type and the total Potential Opportunity Revenue. Without level setting the metrics on the fact table, a report that does not include the employee dimension will over count, as each of the three dim records will be brought into the query and aggregated into one:
Opportunity Type | Potential Opportunity Revenue |
Software Sales | $27,000,000 |
By level setting the Revenue metrics to the Employee level in the Employee Dimension, this same report will return the following:
Opportunity Type | Potential Opportunity Revenue |
Software Sales | $9,000,000 |
Software Sales | $9,000,000 |
Software Sales | $9,000,000 |
Although not intuitively obvious as to the cause of the breakout to the end user, the over counting scenario is prevented. When the user adds the Employee to the report, the breakout becomes clearer:
Opportunity Type | Employee | Potential Opportunity Revenue |
Software Sales | Larry | $9,000,000 |
Software Sales | Tom | $9,000,000 |
Software Sales | Bill | $9,000,000 |
Technique #5: Lower the Fact Table
The most complicated and involved solution is to lower the level of the fact table, and create a 1:M between the Dimensions and the Facts. This involves a business rule to split up the metrics and spread them over all possible dimensional records. In the example above, the simplest spread would be to assign Larry, Tom and Bill each 1/3 of the total amount of $9,000,000, or $3,000,000. Thus, a report that does not break out by employee still totals to the correct $9,000,000. Note that this would require three records in the fact table instead of one, hence the concept of lowering the level of detail in the fact.
When modeling M:M bridge tables, ensure that the requirements for reporting against the measures are reviewed.
Also refer to “Siebel Analytics Repository Design Best Practices [ID 477431.1]” on Oracle My Oracle Support which provides details on modelling snowflake relationships in OBIEE.