A walk through some of the hard-won lessons of using Oracle Business Intelligence, the experience of upgrading to 11g, and how it all works.
Wednesday, 17 October 2012
Getting to Grips With WLST
The other day I decided to challenge myself to start the dashboard server from the command line using WLST, and what follows is an edited version of what I did.
Start the managed server using WLST
Start the managed server using the following commands, replacing <servername>
with the name of the machine you are running on:
cd <MW_HOME>\wlserver_10.3\server\bin
setWLSEnv.cmd
cd <MW_HOME>\wlserver_10.3\common\bin
wlst.cmd
connect('weblogic','<weblogicpassword>','t3://<servername>:7001')
state('bi_server1','Server')
- You should see SUSPENDED or SHUTDOWN
start('bi_server1','Server','t3://<servername>:9704')
exit()
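The same steps can be saved as a small WLST script and run non-interactively with `wlst.cmd start_bi.py`. This is a sketch: the file name, credentials, and the 9704 listen port are placeholders for your own environment, and `connect()`, `state()`, and `start()` are WLST built-ins, so the script only runs inside WLST, not under a standalone Python interpreter.

```
# start_bi.py -- run with: wlst.cmd start_bi.py
# Credentials, host names, and ports below are placeholders.
connect('weblogic', '<weblogicpassword>', 't3://<servername>:7001')

state('bi_server1', 'Server')   # prints the current state of the managed server

# Arguments are: server name, type, and the listen-address URL of the
# server being started (9704 is the usual OBIEE managed server port).
start('bi_server1', 'Server', 't3://<servername>:9704')

exit()
```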
Monday, 15 October 2012
Modelling Many-to-Many Relationships
The following summarises some design options. The development team should review the model, check whether any of the snowflaked structures are mapped as many-to-many relationships, and consider whether they can be redesigned using one of these approaches:
Technique #1: Select a Primary
Although not a technical solution, the best way to solve the M:M problem is to eliminate it. By selecting one of the many dimensional records associated with a fact, the entire problem can be avoided. In the OLTP system, primaries are used throughout the model, and these carry over into the Oracle BI model. If it is at all possible to identify a primary, and the use of the primary is acceptable to the user community, then this technique is recommended.

Technique #2: Direct Modeling into the Dimension
A straightforward technique is to model the table that serves as the intersection table into a lower level of the dimension. The specifics of this technique are similar to those outlined in Solution B of the "No direct physical link between a base Dimension and a Fact table" section above.

Note that over-counting will occur when performing the many-to-many join.
Technique #3a: Use of a Bridge Table
Instead of modeling the relationship table into a new lower level in the dimension as in Technique #2, the relationship table can become a separate logical table that serves as the bridge between the dimension and the facts. Create a new logical table with the M:M relationship table as the source, mark the logical table as a Bridge table, and adjust the business model to show the relationship of Fact:Bridge as 1:M and Bridge:Dimension as M:1. Marking the logical table as a Bridge table is merely an indicator to Analytics that the table is not a fact table, which it would otherwise assume any lowest-level table in the data model to be.

Note that over-counting will still occur when performing the many-to-many join.
Technique #3b: Use a Weighted Bridge Table
Similar to Technique #3a, this technique is the classic Kimball approach, where the bridge table employs weighting factors to prorate a total value over multiple records. For example, if there is one opportunity worth $1,000,000 and there are two employees associated with it, the bridge table might contain a record for each with a weighting factor of 0.5. In this way, each employee is associated with 0.5 of the whole $1,000,000, or $500,000. If it is determined that Employee A should receive 75% of the credit, then the weighting factors would be stored as 0.75 and 0.25, giving Employee A 75% of the total, or $750,000.

It is important to note that the weighting factors must sum to 1, as they are effectively percentages of a whole. Additional ETL effort will be required to complete this solution.
This technique eliminates over-counting, but may be difficult to implement if users are not comfortable prorating a value over several records.
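The weighting-factor arithmetic above can be sketched as follows. This is an illustration of the proration logic only, not ETL code; the function and employee names are invented.

```python
# Sketch of the weighting-factor logic behind a weighted bridge table.
def prorate(total, weights):
    """Split a fact amount across bridge rows using weighting factors.

    The factors must sum to 1.0, since each is a percentage of the whole.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weighting factors must sum to 1")
    return {emp: total * w for emp, w in weights.items()}

# One $1,000,000 opportunity credited 75/25 between two employees:
shares = prorate(1_000_000, {"Employee A": 0.75, "Employee B": 0.25})
print(shares)  # {'Employee A': 750000.0, 'Employee B': 250000.0}
```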
Technique #4: Use Level Based Measures
As an enhancement to Techniques 2 and 3, level-based measures can help prevent the over-counting problem associated with each. When a metric is explicitly bound to a specific level in a dimension, it indicates that the metric will be viewed at that level. If the metrics in a fact table are to be viewed by a dimension with which the fact has an M:M relationship, those metrics can be set to a level in that dimension, forcing the records to be broken out across it. By forcing a break-out of rows (one fact row for each dimensional row), aggregation is prevented, and over-counting cannot occur.

As an example, suppose there is an M:M between Employee and Fact_Opty_Revenue. The data in the tables indicates that Tom, Larry, and Bill are all linked to an opportunity worth $9 million. The user builds a report asking for the Opportunity Type and the total Potential Opportunity Revenue. Without level-setting the metrics on the fact table, a report that does not include the Employee dimension will over-count, as each of the three dimension records is brought into the query and aggregated into one:
| Opportunity Type | Potential Opportunity Revenue |
|------------------|-------------------------------|
| Software Sales   | $27,000,000                   |
By level-setting the Revenue metrics to the Employee level of the Employee dimension, the same report returns the following:
| Opportunity Type | Potential Opportunity Revenue |
|------------------|-------------------------------|
| Software Sales   | $9,000,000                    |
| Software Sales   | $9,000,000                    |
| Software Sales   | $9,000,000                    |
Although the cause of the break-out is not intuitively obvious to the end user, the over-counting scenario is prevented. When the user adds Employee to the report, the break-out becomes clearer:
| Opportunity Type | Employee | Potential Opportunity Revenue |
|------------------|----------|-------------------------------|
| Software Sales   | Larry    | $9,000,000                    |
| Software Sales   | Tom      | $9,000,000                    |
| Software Sales   | Bill     | $9,000,000                    |
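The over-counting mechanics can be seen in a toy simulation of the M:M join (data mirrors the example above; table and column names are invented, and this is not what the BI Server executes, just the arithmetic it implies):

```python
# One fact row fanned out across three bridge/dimension rows.
fact = [{"opty": 1, "opty_type": "Software Sales", "revenue": 9_000_000}]
bridge = [{"opty": 1, "employee": e} for e in ("Larry", "Tom", "Bill")]

# The M:M join produces one copy of the fact row per employee.
joined = [dict(f, employee=b["employee"])
          for f in fact for b in bridge if f["opty"] == b["opty"]]

# Without the Employee level set, the three copies aggregate into one figure:
total = sum(r["revenue"] for r in joined)
print(total)  # 27000000 -- over-counted

# Level-setting forces a break-out per employee, so each row keeps $9M:
per_employee = {r["employee"]: r["revenue"] for r in joined}
print(per_employee)  # {'Larry': 9000000, 'Tom': 9000000, 'Bill': 9000000}
```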
Technique #5: Lower the Fact Table
The most complicated and involved solution is to lower the level (the grain) of the fact table, creating a 1:M between the dimension and the facts. This involves a business rule to split the metrics and spread them over all the related dimensional records. In the example above, the simplest spread would be to assign Larry, Tom, and Bill each one third of the total $9,000,000, i.e. $3,000,000 apiece. A report that does not break out by employee then still totals to the correct $9,000,000. Note that this requires three records in the fact table instead of one, hence the idea of lowering the level of detail in the fact.

When modelling M:M bridge tables, ensure that the reporting requirements against the measures are reviewed.
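The even-spread business rule above can be sketched like this (function and column names are invented; a real implementation would live in the ETL):

```python
def lower_grain(fact_row, employees):
    """Split one fact row into one row per employee, spreading the amount
    evenly so reports that ignore Employee still total correctly."""
    share = fact_row["revenue"] / len(employees)
    return [dict(fact_row, employee=e, revenue=share) for e in employees]

rows = lower_grain({"opty_type": "Software Sales", "revenue": 9_000_000},
                   ["Larry", "Tom", "Bill"])
# Three $3M rows replace the single $9M row, and the total is preserved:
print(sum(r["revenue"] for r in rows))  # 9000000.0
```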
Also refer to "Siebel Analytics Repository Design Best Practices [ID 477431.1]" on My Oracle Support, which provides details on modelling snowflake relationships in OBIEE.
Friday, 12 October 2012
Only Measure columns in the Facts
OBIEE Repository Best Practice
Every logical column within a fact logical table must be a measure column, and therefore have an aggregation rule assigned.
When a query includes fact columns that are not measures (i.e. defined without an aggregate) and generates a GROUP BY or ORDER BY, the Oracle BI Server will raise an error when the data returned is not at the lowest level of granularity (for example, dimensional data queried from the fact source).
Recommendation
Generally, dimensional data is placed on dimension tables to prevent the fact table from growing in size, and to ensure that queries executed against the fact tables can roll the data up appropriately. Bear in mind that the fact table will contain a large number of records compared to the dimension tables.

Storing dimension data on fact tables therefore increases their size unnecessarily, potentially hurting the performance of queries executed against them. It could also change the grain of the fact data if further dimensional data is added at a later stage.
Note, however, that there are cases where it is more efficient to place dimensional data on the fact table, where the data remains at the same grain regardless of the queries performed. An example is the Activity fact table in the standard DW schema, which holds data at the lowest grain; extracting the dimensional data into a separate dimension table would not make that table any smaller than the fact table.
If the reason for placing the data on the fact table is that the data elements are unique to each record in the fact source (transactional data, such as an order number), then the Oracle BI RPD logical model should be built in a specific way to ensure that no errors occur:

In this case, if there is no requirement to define an aggregate for the column, it should be configured as dimensional data rather than fact data. This can be achieved by using the base physical fact table as a logical table source for both a fact and a dimension logical table, whereby the fact logical table contains all the aggregated columns and the dimension logical table contains all the other columns.
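The split described above can be sketched as a simple classification of the physical columns by whether they carry an aggregation rule. The column names and rules here are invented for illustration; this is not RPD metadata, just the partitioning logic:

```python
# Illustrative split of one physical fact source into a fact logical table
# (measure columns, with aggregation rules) and a dimension logical table
# (everything else, e.g. transactional attributes like order number).
physical_columns = {
    "revenue": "SUM",        # measure: has an aggregation rule
    "quantity": "SUM",       # measure
    "order_number": None,    # transactional attribute: no aggregate
    "order_status": None,    # attribute
}

fact_columns = {c for c, agg in physical_columns.items() if agg}
dimension_columns = {c for c, agg in physical_columns.items() if not agg}
print(sorted(fact_columns))       # ['quantity', 'revenue']
print(sorted(dimension_columns))  # ['order_number', 'order_status']
```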
Wednesday, 10 October 2012
Guided Navigation
Guided navigation is a way of making OBIEE dashboards and reports more dynamic, personal, and relevant to the user currently viewing the dashboard. For example, specific regions of the dashboard can be made to appear only when there is important information to report.
To do this you create a condition based on either an analysis or a KPI, where the result is boolean (true/false): for example, a count greater than 20, or no rows returned.
The conditions can be used to determine:
- If agents should run, and/or to whom they should deliver their content
- If action links are displayed
- If reports are shown
For example, if the sales department has a target to meet, an analysis could be built so that a report is shown only when sales targets are not met, with an agent sending out an email report of underperforming sales. If targets are met, no email is sent, or a 'well done' email goes out instead.
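The logic of a condition can be sketched as a boolean test on an analysis result (the function name and threshold are invented for illustration; this is not an OBIEE API):

```python
def evaluate_condition(row_count, threshold=20):
    """A condition is just a boolean test on an analysis result:
    true -> show the section / run the agent, false -> hide it."""
    return row_count > threshold

# 25 underperforming-sales rows returned: the alert section is shown
# and the agent fires; 5 rows would leave it hidden.
print(evaluate_condition(25))  # True
print(evaluate_condition(5))   # False
```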
There are two types of condition - Named and Inline.
Inline Conditions:
If you only ever intend to use a condition once, and only for dashboard reporting, you can create an inline condition. This is created within the dashboard page as you build and design the layout, and is saved as part of the page.

Named Conditions:
A Named Condition is created from the New dropdown or from the Create --> More panel on the home page. It is, as its name suggests, named, and therefore saved. This means the same condition can be reused in many places to guide action links, raise alerts, or expose parts of the dashboard.

Judicious use of guided navigation via conditions can improve the entire BI experience for your users, but as always, too much of a good thing will confuse rather than clarify the story you tell with the dashboard as your canvas.
Friday, 5 October 2012
ODBC or OCI?
How should you set your connection pool - OCI or ODBC?
When possible, configure your connection pools to use a "native driver" to connect to your physical databases: for example, use OCI for connecting to an Oracle database rather than ODBC. The native drivers understand the database better than ODBC does. ODBC is the lowest common denominator: it uses simple statements to select data and then passes the load to the business intelligence server for aggregation and transformation.
The native drivers allow the load to be passed from the application server to the database, which then returns only the required result.
A simple example would be to ask for a sales total.
Using ODBC
With ODBC, the query pushed to the database selects the relevant sales rows. These are returned to the BI Server, where the total is calculated and sent on to the dashboard. You may even find that all the sales records are returned to the BI Server and filtered there for the relevant records before any aggregation takes place.
Using OCI
Using a native driver helps the OBI Server understand what the database where the data resides is capable of, and how much complexity can be pushed down to it, relieving the load on the OBI Server. With the native drivers (e.g. OCI for Oracle) you will find filters, aggregations, and complex calculations appearing in the physical queries being passed down. I have examined queries while 'debugging' various parts of the repository and been surprised by the SQL constructs the OBIEE server pushes down; I have been writing SQL queries for a good many years, and OBIEE still finds ways to expand my knowledge.
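The difference can be sketched with SQLite standing in for Oracle (table and data are invented): the "ODBC-style" path pulls detail rows back and aggregates in the application tier, while the "native-driver-style" path pushes the aggregation down so only one result row crosses the wire.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("EMEA", 100.0), ("EMEA", 250.0), ("APAC", 75.0)])

# "ODBC-style": fetch the detail rows and aggregate in the BI tier.
rows = conn.execute("SELECT amount FROM sales WHERE region = 'EMEA'").fetchall()
total_app_side = sum(amount for (amount,) in rows)  # work done in the app

# "Native-driver-style": push the aggregation down to the database,
# so only the single result row is returned.
(total_db_side,) = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE region = 'EMEA'").fetchone()

print(total_app_side, total_db_side)  # 350.0 350.0
```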
Labels: connecting, OBIEE, obiee 11g, OCI, ODBC, optimising, SQL
Monday, 1 October 2012
No more Complex Joins please!
Avoid Physical Layer Complex Joins.
While complex joins are technically acceptable in the physical layer, they can (and usually do) result in poor query performance when OBIEE generates the query against the database. It is far better to use the foreign-key joins that should be implicit in the star/snowflake structures of your data model.

For example, where the complex join involves a BETWEEN condition, you are looking at index range scans on the database; repeat this several million times when you need an average over a large dataset, and you will potentially be watching a spinning clock symbol for a long time.
Any work that has to be done by the BI Server costs time. Just as you tune a database query to avoid function calls in the WHERE clause, so you should aim to do the same in your physical joins: no more BICOLUMN = TRUNC(DATECOLUMN) or BICOLUMN = SUBSTR(CHARCOLUMN,1,4).
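The effect of wrapping a join or filter column in a function is easy to see with SQLite's EXPLAIN QUERY PLAN as a stand-in for an Oracle execution plan (table, index, and column names are invented; the exact plan wording varies by SQLite version):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, amount REAL)")
conn.execute("CREATE INDEX ix_orders_date ON orders (order_date)")

def plan(sql):
    """Return SQLite's query plan for a statement as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(r[-1] for r in rows)

# A plain equality on the indexed column can use the index...
indexed = plan("SELECT * FROM orders WHERE order_date = '2012-10-01'")
# ...but wrapping the column in a function forces a full table scan.
scanned = plan("SELECT * FROM orders WHERE substr(order_date, 1, 4) = '2012'")

print(indexed)  # mentions the index, e.g. "SEARCH ... USING INDEX ix_orders_date"
print(scanned)  # a scan, e.g. "SCAN orders"
```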