Pages

Wednesday, 7 March 2012

The Forgotton Side of OBIEE - The Database (Part 1).

All too often the glamour of repository modelling and gleaming dashboards distracts from some of the more mundane and forgotten things that have a direct bearing on OBIEE, and especially the performance of OBIEE. I am of course talking about the database.

Some of the changes that we can make have a direct and measureable impact on our front end (in this case the dashboard), for want of a better word, 'snappiness'. Our users are after reports that respond in 2-3 seconds, they don't care what happens in the background, they have grown accustomed to the new world of instant.

So what can we do in the source system, namely the database, to maximise throughput to our OBI server.
Unfortunately I can't comment on systems other than Oracle, but some of the recommendations should stand regardless of database platform.

Database Performance Analysis

Library Cache Hit RatioLibrary hit ratio should be > 0.9. However, using ASMM should automatically size the shared pool to the most optimal value given the amount of SGA memory available.
Executions          Cache Misses     Hit Ratio
6,658,846,369   18,005,199         1

Data Dictionary Cache Hit RatioDictionary cache hit ratio should be > 0.95. However, using ASMM should automatically size the shared pool to the most optimal value given the amount of SGA memory available.
Data Dictionary Gets  Data Dictionary Misses            Hit Ratio
############          513,493,838                           0.96

Buffer Pool Hit RatioBuffer cache hit ratio should be > 0.9. However, using ASMM should automatically size the shared pool to the most optimal value given the amount of SGA memory available.
Physical Reads        DB Gets                    Consistent Gets        Hit Ratio
5,834,233,365       11,477,901,612         11,730,931,334        0.75


Database Recommendations

1. DB_BLOCK SIZE   

The recommended db_block_size is 32K for an OBIEE Data Warehouse. The effect of setting an inappropriate database block size in a data warehouse can negatively impact the overall performance of the database by causing inefficient I/O for the majority of the queries. The Oracle OS manual will provide the acceptable ranges for your operating system, but the generally accepted wisdom is to create your database blocks as large as your operating system will allow. Remember, minimizing disk I/O is one of the most important factors in data warehouse tuning, and the more data that can be read in a single I/O, the faster your warehouse will perform. Oracle b-tree indexes are built in flatter structures in 32k block sizes.  We also see a huge reduction in logical I/O during index range scans and sorting within the TEMP tablespace because adjacent rows are located inside the same data block. The db_block_size parameter should be set to a multiple of the operating system block size. Set it to the largest value supported by Oracle that your I/O subsystem can handle in a single read. A database block size of 32 Kb or 64 Kb is not uncommon for a data warehouse database. Ideally a row should be contained within a single block. You get the best benefit from larger block size if your database is configured on raw devices or direct I/O is available to you.
Note that not only the database block size should be 32K, but on OS level and disk system level, the working unit should be 32K as well.


2. DB_FILE_MULTIBLOCK_READ_COUNT

This parameter determines how many Oracle blocks the I/O processes should read during a full table scan.
Set the value of DB_FILE_MULTIBLOCK_READ_COUNT to 32

Remember, the parameter db_file_multiblock_read_count is only applicable for tables/indexes that are full scanned, but it also affects the SQL optimizer in its calculation of the cost of a full-table scan. The maximum effective setting for db_file_multiblock_read_count is OS and disk dependant. As of Oracle Database 10g release 2, the default value of this parameter is a value that corresponds to the maximum I/O size that can be performed efficiently. This value is platform-dependent and is 1MB for most platforms. Because the parameter is expressed in blocks, it will be set to a value that is equal to the maximum I/O size that can be performed efficiently divided by the standard block size.

Note that if the number of sessions is extremely large the multiblock read count value is decreased to avoid the buffer cache getting flooded with too many table scan buffers.

Even though the default value may be a large value, the optimizer will not favor large plans if you do not set this parameter. It would do so only if you explicitly set this parameter to a large value. Online transaction processing (OLTP) and batch environments typically have values in the range of 4 to 16 for this parameter. DSS and data warehouse environments tend to benefit most from maximizing the value of this parameter. The optimizer is more likely to choose a full table scan over an index if the value of this parameter is high. The maximum value is the operating system's maximum I/O size expressed as Oracle blocks ((max I/O size)/DB_BLOCK_SIZE). If you set this parameter to a value greater than the maximum, Oracle uses the maximum.


3. High Availability   

For database high availability it is recommended to have a RAC or a clustered Database server.


4. PGA_AGGREGATE_TARGET

Oracle recommends to setup initially PGA_AGGREGATE_TARGET to about one third of the available physical memory, the rest being given to the SGA (refer to the OBIEE Applications Installation and Administration Guide).  The Oracle Database can automatically tune the distribution of the memory components in two memory areas. As a result, you need to set only the following parameters:
• SGA_TARGET
• PGA_AGGREGATE_TARGET

The SGA_TARGET parameter is the amount of memory you want to allocate for shared memory. For a data warehouse, the SGA can be relatively small compared to the total memory consumed by the PGA. The SGA, at a minimum, should be 100 MB. The PGA_AGGREGATE_TARGET parameter is the target amount of memory that you want the total PGA across all sessions to consume. If SGA_TARGET is specified, then the following memory pools are automatically sized:

 ■ Buffer cache (DB_CACHE_SIZE)
 ■ Shared pool (SHARED_POOL_SIZE)
 ■ Large pool (LARGE_POOL_SIZE)
 ■ Java pool (JAVA_POOL_SIZE)
 ■ Streams pool (STREAMS_POOL_SIZE)


5. Oracle Tablespace capacity

It is recommended 20% free capacity is configured in your tablespaces; to assure unexpected capacity limits impacting availability, are not reached before remedial action can be applied. Oracle leading practice is to adopt a pro-active capacity and layout strategy, based on expected and experienced data volume growth.
The Oracle Database can proactively help in managing disk space for tablespaces by alerting you when available space is running low. Two alert thresholds are defined by default: warning and critical. The warning threshold is the limit at which space is beginning to run low. The critical threshold is a serious limit that warrants your immediate attention. The database issues alerts at both thresholds. For each tablespace, you can set just percent-full thresholds, just free-space-remaining thresholds, or both types of thresholds simultaneously. Setting either type of threshold to zero disables it.

The ideal setting for the warning threshold is one that issues an alert early enough for you to resolve the problem before it becomes critical. The critical threshold should be one that issues an alert still early enough so that you can take immediate action to avoid loss of service. It is possible to view alerts by accessing the home page of Enterprise Manager Database Control.

It is also possible to use Grid Control to monitor the space available and assign space / datafiles once certain thresholds have been reached.

6. Set and Monitor SESSION_CACHED_CURSORS

Set the Oracle initialization parameter SESSION_CACHED_CURSORS.  Initially a figure of 200 is suggested which is the value set presently but this should be reviewed with use of the following SQL, which identifies a ratio of usage (note that this SQL only returns a meaningful value once the parameter has been set).

select 'session_cached_cursors' parameter, lpad(value, 5)  value,
       decode(value, 0, '  n/a',
          to_char(100 * used / value, '990') || '%')  usage
from  (select max(s.value) used
       from   v$statname  n, v$sesstat  s
       where  n.name = 'session cursor cache count'
       and    s.statistic# = n.statistic#),
     (select value
      from   v$parameter
      where  name = 'session_cached_cursors')


Increase the parameter until the ratio is not decreased significantly. A value of 500 is not uncommon.

7. Query Rewrite Parameter

Set the parameter QUERY_REWRITE_INTEGRITY to TRUSTED. This allows relationships to be defined in the database, which can influence the optimizer to choose different query paths. Setting this parameter to TRUSTED allows rewrites to use such relationships without the relationships having to be enforced in the database itself. This functionality can be used to improve performance for specific queries. Note that the use of Oracle parameter ‘Query Rewrite’ requires the creation of foreign keys in the database (but the Oracle database must be configured to not enforce FK relationships).


8. STAR_TRANSFORMATION_ENABLED

Set the parameter STAR_TRANSFORMATION_ENABLED to TRUE. This allows the optimizer to consider a cost-based query transformation to be applied to star queries typically encountered within the DW database. Again, this may require the creation of foreign keys in the database (but Oracle must be configured to not enforce the FK relationships), along with the creation of the necessary Oracle bitmap indexes.

This recommendation may be investigated and applied in the future, once other performance improvement options recommended in this document have been exhausted. Note that there are problems with setting this parameter to true with specific versions of Oracle, as specified in Oracle MetaLink note 4722645.8. These problems are fixed in versions v9.2.0.8 and v10.2.0.3 of Oracle.

9. ARCHIVELOG Mode

You can specify one of two modes for writing redo log files: ARCHIVELOG and NOARCHIVELOG. Using the redo logs in ARCHIVELOG mode allows data recovery in the event of media, software, or system failure. This is a Data Warehouse DB (and not as critical typically as a transaction OLTP system) the choice maybe to restore the whole database if there is failure of the database or to complete a full load to restore the whole data depending on the customers’ needs. In this case the recovery mechanism needs to be fully tested.  You must use the ARCHIVELOG mode to recover from media failure.

Initial ETL may cause higher than usual generation of REDO logs, when loading large data volumes in a data warehouse database. If your target database is configured to run in ARCHIVELOG mode, you can consider two options:

1. Switch the database to NOARCHIVELOG mode, execute Initial ETL, take a cold backup and switch the database back to ARCHIVELOG mode.
2. Allocate up to 10-15% of additional space to accommodate for archived REDO logs during Initial ETL.
Below is a calculation of generated REDO amount in an internal initial ETL run:
redo log file sequence:
start : 641 (11 Jan 21:10)
end : 1624 (12 Jan 10:03)
total # of redo logs : 983
log file size : 52428800
redo generated: 983*52428800 = 51537510400 (48 GB)
Data Loaded in warehouse:

SQL> select sum(bytes)/1024/1024/1024 Gb from dba_segments where owner='DWH' and segment_type='TABLE';
Gb
----------
280.49


Note if you are using Oracle Data Guard then the primary database must run in ARCHIVELOG mode at all times. The Primary database has to be running in ARCHIVELOG mode to capture all REDO changes.


10. Redo Logs Switching

The log file groups are used to segregate logs onto separate disks in case of failure. The size of the redo log files can influence performance.

Although the size of the redo log files does not affect LGWR performance, it can affect DBWR and checkpoint behaviour. Checkpoint frequency is affected by several factors, including log file size and the setting of the FAST_START_MTTR_TARGET initialization parameter. If the FAST_START_MTTR_TARGET parameter is set to limit the instance recovery time, Oracle automatically tries to checkpoint as frequently as necessary. Under this condition, the size of the log files should be large enough to avoid additional checkpointing due to under sized log files.

Oracle suggests that you do a log file switch no more frequently than 10-15 minutes to reduce overhead. Check if there are too few or too small redo logs. If you have a few redo logs or small redo logs (for example two x 100k logs), and your system produces enough redo to cycle through all of the logs before DBWR has been able to complete the checkpoint, then increase the size or number of redo logs. It may be necessary to increase the size of the online redo logs if the log switches more than 5 times each hour.
The size of the redo log files can influence performance, because the behaviour of the database writer and archiver processes depend on the redo log sizes. Generally, larger redo log files provide better performance. Undersized log files increase checkpoint activity and reduce performance. Although the size of the redo log files does not affect LGWR performance, it can affect DBWR and checkpoint behaviour. Checkpoint frequency is affected by several factors, including log file size and the setting of the FAST_START_MTTR_TARGET initialization parameter. If the FAST_START_MTTR_TARGET parameter is set to limit the instance recovery time, Oracle automatically tries to checkpoint as frequently as necessary. Under this condition, the size of the log files should be large enough to avoid additional checkpointing due to under sized log files. The optimal size can be obtained by querying the OPTIMAL_LOGFILE_SIZE column from the V$INSTANCE_RECOVERY view. You can also obtain sizing advice on the Redo Log Groups page of Oracle Enterprise Manager Database Control.

It may not always be possible to provide a specific size recommendation for redo log files, but redo log files in the range of a hundred megabytes to a few gigabytes are considered reasonable. Size the online redo log files according to the amount of redo your system generates. A rough guide is to switch logs at most once every twenty minutes. See also the Oracle Database Administrator's Guide for information on managing the redo log.


11. OPEN_CURSORS

The usual current setting (300) is unlikely to present a problem, however increasing this parameter to 2000 will further reduce the risk of errors.

The default value for this parameter is too small for systems such as WebLogic Server. It is important to set the value of OPEN_CURSORS high enough to prevent system from running out of open cursors.
Note: assuming that a session does not open the number of cursors specified by OPEN_CURSORS, there is no added overhead to setting this value higher than actually needed.


12. TEMPORARY TABLESPACE


The following table summarizes space allocation estimates in a data warehouse by its data volume range:

This reference is from the “Oracle Business Intelligence Applications Version 7.9.6 Performance Recommendations - An Oracle Technical Note August 2009”.



13. STATISTICS GATHERING

Statistics are typically collected on the OBIEE schema using the dbms_stats package and the ‘GATHER_SCHEMA_STATS’ option. 

This is usually done once or twice a week and in some cases just once during the week-end. Gathering stats should be done whenever there have been large changes to the data content, for example a large number of deletes or inserts. If the table structure has changed you should gather stats also. Do this as an automated process out of business hours if possible, or if you have to do it during business hours then choose a time when there is minimum access to the tables you wish to gather stats for.

Use the Oracle package dbms_stats to collect statistics.  This is the method recommended by Oracle.  Use the ‘GATHER_TABLE_STATS’ procedure, as this provides increased flexibility for collecting statistics. Existing statistics on the tables should be deleted, as the remnants of these can remain after the collection of statistics.  To delete statistics execute the command:

exec dbms_stats.delete_schema_stats(ownname=>'MYBI') or (for each table):
exec dbms_stats.delete_table_stats(ownname=>'MYBI', tabname=>'<table name>')


Initially, the ‘FOR ALL COLUMNS’ option should be used, subsequent performance monitoring might identify issues and this approach might require modification for specific tables. Note that the ‘SIZE AUTO’ clause results in different results, in particular for non-unique columns, and should be avoided initially. The recommended command to gather the statistics for the tables is, therefore:

exec dbms_stats.gather_table_stats (ownname=>'MYBI', tabname=>'<table name>', method_opt=>'FOR ALL COLUMNS', granularity=>'ALL', cascade=>TRUE)

Statistics can be estimated for larger tables using the parameter “estimate_percent=><percentage>”. Statistics should be re-collected on objects when the data has changed by 10-15%.  Export a well performing set of statistics prior to re-analyzing objects (in case the new analysis causes performance degradation).  Exporting the statistics can be performed by first creating an ‘export’ table:

exec dbms_stats.create_stat_table (ownname=>'MYBI', stattab=>'<table name>',
tablespace=>'<tablespace name>')


The command to export statistics is:

exec dbms_stats.export_table_stats (ownname=>'MYBI', tabname=>'<table name>', stattab=>'<statistics table name>', statid=>'<optional statistics id>', cascade=>TRUE, statown=>'<owner of statistics table, if not current schema>')

The command to restore a statistics profile is:

exec dbms_stats.import_table_stats (ownname=>'MYBI', tabname=>'<table name>', stattab=>'<statistics table name>', statid=>'<optional statistics id>', cascade=>TRUE, statown=>'<owner of statistics table, if not current schema>')

Full details of these commands are provided in the Oracle Supplied Packages Reference documentation. All tables and indexes being part of the ETL process should be analyzed after loading data. For further information see Oracle (Note: 153761.1 and Note: 149560.1) for further information regarding the collection of system statistics. Also review ‘Recommendations for Gathering Optimizer Statistics on 10g [ID 605439.1]’.


14. UNDO_RETENTION


UNDO_RETENTION specifies (in seconds) the low threshold value of undo retention. For AUTOEXTEND undo tablespaces, the system retains undo for at least the time specified in this parameter, and automatically tunes the undo retention period to satisfy the undo requirements of the queries. For fixed- size undo tablespaces, the system automatically tunes for the maximum possible undo retention period, based on undo tablespace size and usage history, and ignores UNDO_RETENTION unless retention guarantee is enabled.

The setting of this parameter should account for any flashback requirements of the system. Automatic tuning of undo retention is not supported for LOBs. The RETENTION value for LOB columns is set to the value of the UNDO_RETENTION parameter. UNDO_RETENTION may need to be tuned in order to take care of some long running ETL processes (The default value for the UNDO_RETENTION parameter is 900). The suggested value range is from 30000 to 90000.

The UNDO_RETENTION parameter can only be honoured if the current undo tablespace has enough space. If an active transaction requires undo space and the undo tablespace does not have available space, then the system starts reusing unexpired undo space. This action can potentially cause some queries to fail with a "snapshot too old" message. The amount of time for which undo is retained for the Oracle Database for the current undo tablespace can be obtained by querying the TUNED_UNDORETENTION column of the V$UNDOSTAT dynamic performance view. The UNDO_RETENTION value should at least be equal to the length of longest running query on a given database instance. Note that the UNDO_RETENTION parameter works best if the current undo tablespace has enough space for the active transactions.  If an active transaction needs undo space and the undo tablespace does not have any free space, then the system will start reusing undo space that would have been retained. This may cause long queries to fail.

1 comment:

  1. This is truly excellent. I am very appreciative for this post. I am installing the database this weekend and this is giving me a good start.

    ReplyDelete