I was just discussing code instrumentation with a tool vendor, and he brought up an interesting point: Too many applications are not instrumented well enough to be monitored. Many system management tools can run SQL queries against a database, gather historical data and trends, compare to thresholds, issue alerts, etc. But this can only work if the tool has some way of querying the state of the application. As part of every custom-built application, always create a couple of views that summarize application health. In that way, operations can see if there is a problem with the application - whether using a tool or manually issuing the query.
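For example, a health-summary view might expose the most recent run of each batch job together with a simple failure count. A minimal sketch, assuming a hypothetical APP_BATCH_RUN table (substitute whatever your application actually records):

create or replace view app_health_vw as
select job_name,
       max(end_time)                                      as last_completed,
       sum(case when status = 'FAILED' then 1 else 0 end) as failures_last_24h
from   app_batch_run
where  start_time > sysdate - 1
group  by job_name;

An operator, or a monitoring tool, can then simply select from APP_HEALTH_VW and alert when last_completed is too old or failures_last_24h is non-zero.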
Blog Post: When is Your Application Running Well?
Blog Post: EM12c Management Agent, OutOfMemoryError and Ulimits
While enjoying the lovely Liverpool, UK weather at Tech14 with UKOUG, (just kidding about that weather part and apologies to the poor guy who asked me the origin of "Kevlar", which in my pained, sleep-deprived state I answered with a strange, long-winded response… :)) a customer contacted me regarding a challenge he was experiencing starting an agent on a host that was home to hundreds of targets.

oracle_database.DB301.com - LOAD_TARGET_DYNAMIC running for 596 seconds
oracle_database.rcvcat11 - LOAD_TARGET_DYNAMIC running for 596 seconds
oracle_database.DB302.com - LOAD_TARGET_DYNAMIC running for 596 seconds
oracle_database.B303.com - LOAD_TARGET_DYNAMIC running for 596 seconds
oracle_database.DB304.com - LOAD_TARGET_DYNAMIC running for 596 seconds
oracle_database.DB305.com - LOAD_TARGET_DYNAMIC running for 596 seconds
oracle_database.DB307.com - LOAD_TARGET_DYNAMIC running for 596 seconds
oracle_database.DB309.com - LOAD_TARGET_DYNAMIC running for 596 seconds
oracle_database.B311.com - LOAD_TARGET_DYNAMIC running for 596 seconds
Dynamic property executor tasks running
------------------------------
---------------------------------------------------------------
Agent is Running but Not Ready

The output from "emctl start agent" wasn't showing him anything he didn't already know, but I asked him to send me the output anyway, and the following showed the actual issue that was stopping the agent from finishing out the run:

MaxThreads=96
agentJavaDefines=-Xmx345M -XX:MaxPermSize=96M
SchedulerRandomSpreadMins=5
UploadMaxNumberXML=5000
UploadMaxMegaBytesXML=50.0
Auto tuning was successful
----- Tue Dec 9 12:50:04 2014::5216::Finished auto tuning the agent at time Tue Dec 9 12:50:04 2014 -----
----- Tue Dec 9 12:50:04 2014::5216::Launching the JVM with following options: -Xmx345M -XX:MaxPermSize=96M -server -Djava.security.egd=file:///dev/./urandom -Dsun.lang.ClassLoader.allowArraySyntax=true -XX:+UseLinuxPosixThreadCPUClocks -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+UseCompressedOops -----
Agent is going down due to an OutOfMemoryError

This host was a unique environment in that it was home to so many targets, especially database targets. One of the reasons the management agent was created, and OEM processing removed from an internal database back-end process, was to lighten the footprint. As EM12c introduced numerous features that have moved it towards the center of the Oracle universe, the footprint became heavier, but I've been very impressed with development's continued investment in lightening that footprint, even as considerable additions arrive in the form of plug-ins and metric extensions. With all of this, the server administrator may have set resource-usage limits to values different from what your particular environment requires. I asked the customer to run the following for me:

ulimit -Su
ulimit -Hu

Which returned the following expected values:

$ ulimit -Su
8192
$ ulimit -Hu
3100271

The user limit values with these arguments are:
-H display hard resource limits.
-S display soft resource limits.

I asked him to have the server administrator set both of these values to unlimited with the chuser command and restart the agent. The customer came back to confirm that the agent had now started (promptly!) and added the remaining 86 database targets without issue. The customer and his administrator were also insightful enough to recognise that the unlimited values were not meant to stay in place indefinitely, but were a troubleshooting step.
The next step was to monitor the actual resource usage of the agent and then set the limits to values that would not only support the existing requirements, but also allow enough of a ceiling to support additional database consolidation, metric extensions and plug-in growth. Copyright © DBA Kevlar [ EM12c Management Agent, OutOfMemoryError and Ulimits ], All Rights Reserved. 2014.
Wiki Page: Automatic/Adaptive Dynamic Sampling in Oracle 12C part 3
by Chinar Aliyev
This is the third article of my 12c Adaptive Sampling series. In the previous parts we saw how ADS works for a single table (including the group by clause) and for joins. As you know, join cardinality is calculated (Jonathan Lewis has explained this in his book) as:

Join Selectivity = ((num_rows(t1) - num_nulls(t1.c1)) / num_rows(t1)) *
                   ((num_rows(t2) - num_nulls(t2.c2)) / num_rows(t2)) /
                   greater(num_distinct(t1.c1), num_distinct(t2.c2))

Join Cardinality = Join Selectivity * filtered cardinality(t1) * filtered cardinality(t2)

So to estimate the join cardinality you need to estimate three factors: the filtered cardinality of each of the two tables in the join, and the join selectivity. Can this mechanism benefit from dynamic sampling? Estimating the cardinality of both tables can obviously benefit from DS, but what about the join selectivity? The answer is yes: using dynamic sampling the database can estimate or calculate the column statistics that feed it, namely the number of distinct values and the number of null values, and that information is enough to estimate the join cardinality.
In the previous articles we used the t2 and t3 tables. In this part we will use the same tables, but larger than before. To increase the segment sizes I ran several insert into ... select from statements, so our case is:

SEGMENT_NAME   MB          (user_segments)
------------   -------
T2             3110.00
T3             3589.00

TABLE_NAME   NUM_ROWS   BLOCKS    (user_tables)
T3           14208000   231391
T2           11810048   204150

TABLE_NAME   COLUMN_NAME   NUM_DISTINCT   HISTOGRAM    (user_tab_col_statistics)
T2           OBJECT_NAME   53744          NONE
T3           TABLE_NAME    9004           NONE

TABLE_NAME   STALE_STATS    (user_tab_statistics)
------------ -----------
T2           YES
T3           YES

And we will look at the following query:

select count(*)
from t2, t3
where -- t2.owner=t3.owner and
      t2.object_name = t3.table_name

Now I have decided to set the NDV of t2.object_name to 15000 (actually it is 53744), which means the number of distinct values recorded in the dictionary will be less than the actual number of distinct values. The statistics are stale, but the dictionary NDV of 53744 still reflects reality, because repeated insert into ... select from statements do not change the number of distinct values of the object_name column. Why did I decide to change the column statistic? You will see shortly. Here is the change:

DECLARE
    l_distcnt NUMBER;
    l_density NUMBER;
    l_nullcnt NUMBER;
    l_srec    DBMS_STATS.StatRec;
    l_avgclen NUMBER;
BEGIN
    DBMS_STATS.get_column_stats (
        ownname => 'sh',
        tabname => 't2',
        colname => 'object_name',
        distcnt => l_distcnt,
        density => l_density,
        nullcnt => l_nullcnt,
        srec    => l_srec,
        avgclen => l_avgclen);
    l_distcnt := 15000;
    l_density := 1/15000;
    DBMS_STATS.set_column_stats (
        ownname => 'sh',
        tabname => 't2',
        colname => 'object_name',
        distcnt => l_distcnt,
        density => l_density,
        nullcnt => l_nullcnt,
        srec    => l_srec,
        avgclen => l_avgclen);
END;
/

First I want to note that Oracle completely ignores the dynamic sampling column statistics when computing the cardinality estimate here. For our query, the 10046 trace file contains the following (similar statements appear for both tables, but I show only the one for T3):
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring
  optimizer_features_enable(default) no_parallel */
  sum(vsize(C1))/count(*),
  substrb(dump(max(substrb(C2,1,32)), 16,0,32), 1,120),
  substrb(dump(min(substrb(C3,1,32)), 16,0,32), 1,120),
  SUM(C4), COUNT(DISTINCT C5)
FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "T3") */
        "T3"."TABLE_NAME" AS C1, "T3"."TABLE_NAME" AS C2, "T3"."TABLE_NAME" AS C3,
        CASE WHEN ("T3"."TABLE_NAME" IS NULL) THEN 1 ELSE 0 END AS C4,
        "T3"."TABLE_NAME" AS C5
      FROM "T3" SAMPLE BLOCK(0.174483, 8) SEED(1) "T3") innerQuery

Sometimes Oracle decides to increase the sample size (generally doubling it), as below:

SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring
  optimizer_features_enable(default) no_parallel */
  sum(vsize(C1))/count(*),
  substrb(dump(max(substrb(C2,1,32)), 16,0,32), 1,120),
  substrb(dump(min(substrb(C3,1,32)), 16,0,32), 1,120),
  SUM(C4), COUNT(DISTINCT C5)
FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "T3") */
        "T3"."TABLE_NAME" AS C1, "T3"."TABLE_NAME" AS C2, "T3"."TABLE_NAME" AS C3,
        CASE WHEN ("T3"."TABLE_NAME" IS NULL) THEN 1 ELSE 0 END AS C4,
        "T3"."TABLE_NAME" AS C5
      FROM "T3" SAMPLE BLOCK(0.348966, 8) SEED(2) "T3") innerQuery

What does this SQL mean? This statement is used to estimate/calculate column statistics for the columns involved in the join (the same approach can also be applied to a filter predicate), and it is the third method for estimating join cardinality. Here COUNT(DISTINCT C5) gives the number of distinct values of the join (or filter) column and SUM(C4) gives the number of null values. I removed all the columns from that SQL except the NDV, this time for the T2 table, and got:

SQL> SELECT COUNT(DISTINCT C5) as num_dist
     FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "T2") */
             "T2"."OBJECT_NAME" AS C1, "T2"."OBJECT_NAME" AS C2, "T2"."OBJECT_NAME" AS C3,
             CASE WHEN ("T2"."OBJECT_NAME" IS NULL) THEN 1 ELSE 0 END AS C4,
             "T2"."OBJECT_NAME" AS C5
           FROM "T2" SAMPLE BLOCK(0.402778, 8) SEED(2) "T2") innerQuery;

  NUM_DIST
----------
     37492

Remember that I updated the NDV to 15000 for that column; the plan was:

--------------------------------------------
| Id  | Operation           | Name | Rows  |
--------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |
|   1 |  SORT AGGREGATE     |      |     1 |
|*  2 |   HASH JOIN         |      |   11G |
|   3 |    TABLE ACCESS FULL| T2   |   11M |
|   4 |    TABLE ACCESS FULL| T3   |   14M |
--------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T2"."OBJECT_NAME"="T3"."TABLE_NAME")

Note
-----
   - Dynamic statistics used: dynamic sampling (level=AUTO)

And from the 10053 trace file:

Best NL cost: 2037555914706.766602
  resc: 2037555914706.766602  resc_io: 2030576990663.999756  resc_cpu: 75477063522522944
  resp: 2037555914706.766602  resp_io: 2030576990663.999756  resc_cpu: 75477063522522944
SPD: Return code in qosdDSDirSetup: NOCTX, estType = JOIN
Join Card:  11186477465.600000 = outer (11810048.000000) * inner (14208000.000000) * sel (6.6667e-05)

As you can see, the optimizer did not use the number of distinct values computed via sampling; instead it used num_distinct from the dictionary. The optimizer has the mechanism, but it does not use it as you would expect, even though the dictionary statistics are STALE and the sampled number of distinct values (37492) is greater than dba_tab_col_statistics.num_distinct (15000). At this point DS gives us more accurate information than the dictionary, but the optimizer ignores that fact. That should not happen.
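A quick sanity check of that Join Card line shows where the selectivity comes from: sel = 6.6667e-05 is exactly 1/15000, i.e. 1/num_distinct of t2.object_name as set in the dictionary, and 11810048 * 14208000 / 15000 = 11,186,477,465.6, which matches the computed join cardinality (and the 11G estimate in the plan). The sampled NDV of 37492 plays no part in the calculation.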
Conclusion
As we have seen, ADS helps when your tables are small; otherwise it may not help, or may even make things worse. Depending on the size of the tables involved in the join and on the predicate type (join/filter), dynamic sampling can be completely ignored. The optimizer also tries (in some cases) to estimate column statistics for the join selectivity, but it does not yet take advantage of them. I hope these opportunities will be improved or fixed in future releases. In addition, ADS increases the parse time of statements and can therefore introduce additional concurrency (latches/mutexes) in an OLTP environment.
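If that parse-time overhead does become visible, one standard control (a general Oracle option, not something specific to these tests) is to run sensitive sessions at a lower dynamic sampling level, for example:

alter session set optimizer_dynamic_sampling = 2;

Note, however, that in 12c SQL Plan Directives can still trigger adaptive sampling independently of this parameter, so this only reduces, rather than eliminates, the DS_SVC activity.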
Wiki Page: Automatic/Adaptive Dynamic Sampling in Oracle 12C part 2
by Chinar Aliyev
This is the second article of my 12c Adaptive Sampling series. In part I, we saw how dynamic sampling estimates single table and group by cardinality. In this part we will focus on how Adaptive Dynamic Sampling works with joins.
The model
Let's create the following two tables and gather statistics without histograms:

create table t1 as select * from dba_users;
create table t2 as select * from dba_objects;
execute dbms_stats.gather_table_stats(user, 't1', method_opt=>'for all columns size 1');
execute dbms_stats.gather_table_stats(user, 't2', method_opt=>'for all columns size 1');

Then I will set optimizer_dynamic_sampling to the new 12c value (11):

alter session set optimizer_dynamic_sampling=11;

And finally I will execute the following simple two table join query:

select count(*) from t1, t2 where t1.username=t2.owner;

The execution plan of the above query is:

SQL_ID a28zr3kmq7psn, child number 0
-------------------------------------
---------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |
---------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |
|*  2 |   HASH JOIN         |      |      1 |  52070 |  55220 |
|   3 |    TABLE ACCESS FULL| T1   |      1 |     42 |     42 |
|   4 |    TABLE ACCESS FULL| T2   |      1 |  92254 |  92254 |
---------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."USERNAME"="T2"."OWNER")

Note
-----
   - dynamic statistics used: dynamic sampling (level=AUTO)

The investigation
In previous releases dynamic sampling was not applied to estimate join cardinality, and the above estimate is quite good. Looking at the corresponding 10053 and 10046 trace files respectively:

10053 trace file

SPD: Return code in qosdDSDirSetup: NOCTX, estType = JOIN
Join Card:  92254.000000 = outer (42.000000) * inner (92254.000000) * sel (0.023810)
Join Card adjusted from 92254.000000 to 52070.040000 due to adaptive dynamic sampling, prelen=2
Adjusted Join Cards: adjRatio=0.564420 cardHjSmj=52070.040000 cardHjSmjNPF=52070.040000
  cardNlj=52070.040000 cardNSQ=52070.040000 cardNSQ_na=92254.000000
Join Card - Rounded: 52070 Computed: 52070.040000
Outer table: T1 Alias: T1

10046 trace file

SQL ID: 86cd0yqkg18hx Plan Hash: 3696410285
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring
  optimizer_features_enable(default) no_parallel result_cache(snapshot=3600) */
  SUM(C1)
FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "T2#0") */ 1 AS C1
      FROM "T2" SAMPLE BLOCK(50.5051, 8) SEED(1) "T2#0", "T1" "T1#1"
      WHERE ("T1#1"."USERNAME"="T2#0"."OWNER")) innerQuery

call     count    cpu  elapsed   disk   query  current   rows
------- ------  ----- --------  -----  ------  -------  -----
Parse        1   0.00     0.00      0       2        0      0
Execute      1   0.00     0.00      0       0        0      0
Fetch        1   0.03     0.03      0     763        0      1
------- ------  ----- --------  -----  ------  -------  -----
total        3   0.03     0.03      0     765        0      1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 105  (recursive depth: 1)

Rows     Row Source Operation
-------  --------------------
      1  RESULT CACHE
      1   SORT AGGREGATE
  26298    HASH JOIN
     42     TABLE ACCESS FULL T1
  44761     TABLE ACCESS SAMPLE T2

we can infer that Oracle estimated just a portion (fraction) of the join, then ended up making a guess for the whole join (let me call this method 1).
By sampling only table T2 at a sampling rate of 50.5051, Oracle estimated that the hash join operation between T2 and T1 will generate 26,298 rows. Then, via a simple arithmetic scale-up, Oracle estimated that the entire hash join operation will generate (26298 / 50.5051) * 100 = 52069.98 ≈ 52070 rows. You might ask why table T1 has not been sampled. As already explained in part I of this series, the whole strategy followed by Oracle when using Adaptive Dynamic Sampling is: "estimate a cardinality for a fraction of the single table, then guess the cardinality of the whole result set". In order to follow the same strategy when estimating the cardinality of joins, Oracle may follow three methods (the third, estimating join column statistics via sampling, is the subject of part 3 of this series):
i. Sample a fraction of the larger table in the join, then join that sampled fraction to the second (smaller) table. The cardinality that results from that join is used to derive the cardinality of the entire join result set via the "scale-up" formula shown above.
ii. When both tables in the join are large (what does "large" mean here? when does the DBMS consider a segment "large"? I think it depends on the I/O count, i.e. how many I/Os the DBMS performs to read the segment, or the criterion may be controllable via a threshold), Oracle will sample a fraction of both tables using different sampling sizes (depending on the table sizes), join these "small" result sets and compute the cardinality of that small fraction of the join.
How efficient can these methods be in practice? In order to answer this question, I will go step by step.

create table t3 as
select OWNER, TABLE_NAME, COLUMN_NAME, DATA_TYPE, DATA_TYPE_MOD, DATA_TYPE_OWNER, DATA_LENGTH,
       DATA_PRECISION, DATA_SCALE, NULLABLE, COLUMN_ID, DEFAULT_LENGTH, NUM_DISTINCT, LOW_VALUE,
       HIGH_VALUE, DENSITY, NUM_NULLS, NUM_BUCKETS, LAST_ANALYZED, SAMPLE_SIZE, CHARACTER_SET_NAME,
       CHAR_COL_DECL_LENGTH, GLOBAL_STATS, USER_STATS, AVG_COL_LEN, CHAR_LENGTH, CHAR_USED,
       V80_FMT_IMAGE, DATA_UPGRADED, HISTOGRAM, DEFAULT_ON_NULL, IDENTITY_COLUMN, SENSITIVE_COLUMN,
       EVALUATION_EDITION, UNUSABLE_BEFORE, UNUSABLE_BEGINNING
from dba_tab_columns;

execute dbms_stats.gather_table_stats(user, 't3', method_opt=>'for all columns size 1');

alter system flush shared_pool;
alter session set optimizer_dynamic_sampling=11;

select count(*)
from t2, t3
where t2.owner=t3.owner
and   t2.object_name=t3.table_name
and   t2.object_type='TABLE';

And the execution plan is:

---------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |
---------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |
|*  2 |   HASH JOIN         |      |      1 |  31573 |  30286 |
|*  3 |    TABLE ACCESS FULL| T2   |      1 |   1938 |   2479 |
|   4 |    TABLE ACCESS FULL| T3   |      1 |   110K |   110K |
---------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T2"."OWNER"="T3"."OWNER" AND "T2"."OBJECT_NAME"="T3"."TABLE_NAME")
   3 - filter("T2"."OBJECT_TYPE"='TABLE')

Note
-----
   - dynamic statistics used: dynamic sampling (level=AUTO)

As you can see, the optimizer estimations are not bad.
In this example the 10046 trace file reveals that Oracle executed two dynamic sampling statements. The first one concerns the sampling of table T2, as shown below (it seems this choice is driven by the presence of a predicate on T2 using a constant value):

Plan Hash: 3252009800
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring
  optimizer_features_enable(default) no_parallel result_cache(snapshot=3600) */
  SUM(C1)
FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "T2") */ 1 AS C1
      FROM "T2" SAMPLE BLOCK(50.5051, 8) SEED(1) "T2"
      WHERE ("T2"."OBJECT_TYPE"='TABLE')) innerQuery

call     count    cpu  elapsed   disk   query  current   rows
------- ------  ----- --------  -----  ------  -------  -----
Parse        1   0.00     0.00      0       2        0      0
Execute      1   0.00     0.00      0       0        0      0
Fetch        1   0.01     0.01    118     760        0      1
------- ------  ----- --------  -----  ------  -------  -----
total        3   0.01     0.01    118     762        0      1

Rows     Row Source Operation
-------  ---------------------
      1  RESULT CACHE
      1   SORT AGGREGATE
    979    TABLE ACCESS SAMPLE T2

(979 / 50.5051) * 100 = 1938.41 ≈ 1938

The second sampling concerns the join operation and uses the OPT_ESTIMATE hint, as shown below:

SQL ID: 6wxwa7hvmmnb3 Plan Hash: 2702931906
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring
  optimizer_features_enable(default) no_parallel result_cache(snapshot=3600)
  OPT_ESTIMATE(@"innerQuery", TABLE, "T2#1", ROWS=1938.42) */
  SUM(C1)
FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "T3#0") */ 1 AS C1
      FROM "T3" SAMPLE BLOCK(44.1258, 8) SEED(1) "T3#0", "T2" "T2#1"
      WHERE ("T2#1"."OBJECT_TYPE"='TABLE')
        AND ("T2#1"."OWNER"="T3#0"."OWNER")
        AND ("T2#1"."OBJECT_NAME"="T3#0"."TABLE_NAME")) innerQuery

call     count    cpu  elapsed   disk   query  current   rows
------- ------  ----- --------  -----  ------  -------  -----
Parse        1   0.00     0.00      0       2        0      0
Execute      1   0.00     0.00      0       0        0      0
Fetch        1   0.04     0.04    143    2352        0      1
------- ------  ----- --------  -----  ------  -------  -----
total        3   0.05     0.05    143    2354        0      1

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  RESULT CACHE
      1   SORT AGGREGATE
  13932    HASH JOIN
   2479     TABLE ACCESS FULL T2
  49701     TABLE ACCESS SAMPLE T3

Finally, putting the two pieces together, we can say that Oracle started by sampling table T2 and ended up estimating a T2 cardinality of 1938. It then used this estimated cardinality and sampled the join operation with a sample size of 44.1258, with which Oracle found a join cardinality of 13932. Using this estimated join cardinality, Oracle finally derived that the entire join result set will be 13932 * 100 / 44.1258 = 31573.36 ≈ 31573 rows. This kind of estimation is based on method 2 (see point ii above), where both tables in the join are large. But what happens if you have really big tables? Having investigated a case where the first method (point i above) was used, I will next investigate the second method (point ii above) where both tables are large. To get there I issued several insert into t2 (t3) select * from t2 (t3) statements until I reached the following size picture:

SEGMENT_NAME   MB
------------   -------
T2             1600.00
T3             1813.00

TABLE_NAME   NUM_ROWS   BLOCKS
------------ ---------  -------
T3           110996     1813
T2           92254      158

Those inserts have not been followed by a call to the dbms_stats package, so the statistics are now stale.

select count(*) from t2;   -- returns 11,808,512 rows
select count(*) from t3;   -- returns 14,207,488 rows
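Since no statistics were gathered after the inserts, the dictionary now flags both tables as stale; a quick check before running the next test (output will of course depend on when DBMS_STATS last ran):

select table_name, stale_stats
from   user_tab_statistics
where  table_name in ('T2','T3');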
The following query

select count(*)
from t2, t3
where t2.object_name=t3.table_name
and   t2.object_type='TABLE';

when executed with optimizer_dynamic_sampling set to 11, gives the excerpt of the corresponding 10046 trace file reproduced below:

SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring
  optimizer_features_enable(default) no_parallel result_cache(snapshot=3600) */
  SUM(C1)
FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "T2") */ 1 AS C1
      FROM "T2" SAMPLE BLOCK(0.391869, 8) SEED(1) "T2"
      WHERE ("T2"."OBJECT_TYPE"='TABLE')) innerQuery

Rows     Row Source Operation
-------  --------------------
      1  RESULT CACHE
      1   SORT AGGREGATE
   1114    TABLE ACCESS SAMPLE T2

Oracle used method 1 in this case to estimate the T2 table cardinality (because of the filter on T2, Oracle expects the predicate to reduce the data from T2 and hence to benefit from sampling), as shown in the above trace extract. Additionally, the same trace file shows the following SQL statement:

SQL ID: 22fjpqb6fyafj Plan Hash: 527772662
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring
  optimizer_features_enable(default) no_parallel result_cache(snapshot=3600)
  OPT_ESTIMATE(@"innerQuery", TABLE, "T2#1", ROWS=284278.875) */
  SUM(C1)
FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "T2#1") */ 1 AS C1
      FROM "T2" SAMPLE BLOCK(0.391869, 8) SEED(1) "T2#1", "T3" "T3#0"
      WHERE ("T2#1"."OBJECT_NAME"="T3#0"."TABLE_NAME")
        AND ("T2#1"."OBJECT_TYPE"='TABLE')) innerQuery

call     count    cpu  elapsed   disk   query  current   rows
------- ------  ----- --------  -----  ------  -------  -----
Parse        1   0.00     0.00      0       0        0      0
Execute      1   0.00     0.00      0       0        0      0
Fetch        1   0.93     1.09   1889327006        0      0
------- ------  ----- --------  -----  ------  -------  -----
total        3   0.93     1.09   1889327006        0      0

Rows     Row Source Operation
-------  --------------------
      0  RESULT CACHE
      0   SORT AGGREGATE
      0    HASH JOIN (cr=0 pr=0 pw=0 …)
1298249     TABLE ACCESS FULL T3
      0     TABLE ACCESS SAMPLE T2 (cr=0 pr=0 pw=0 …)

You can notice that the above query has not been fully executed. After scanning roughly 1.3 million rows from the T3 table, Oracle stopped the query, because it had worked out that the sampling it was doing would not be efficient in this situation. So how did Oracle manage to get its cardinality in this case? From the 10053 trace file we can isolate the lines below, related to the join estimation:

Best NL cost: 24711383775.180614
  resc: 24711383775.180614  resc_io: 24667427343.000000  resc_cpu: 475388814033344
  resp: 24711383775.180614  resp_io: 24667427343.000000  resc_cpu: 475388814033344
SPD: Return code in qosdDSDirSetup: NOCTX, estType = JOIN
Join Card:  587113.315151 = outer (284278.875000) * inner (110996.000000) * sel (1.8607e-05)
Join Card - Rounded: 587113 Computed: 587113.315151

Here the outer table cardinality was estimated using dynamic sampling, but for the inner table cardinality Oracle used num_rows from the dictionary (object statistics), even though those statistics are stale and really quite different from reality. Sampling was only applied to the T2 table, because it has a filter column and Oracle therefore assumes sampling will be more efficient there. From user_tab_col_statistics:

TABLE_NAME   COLUMN_NAME   NUM_DISTINCT   DENSITY
T2           OBJECT_NAME   53744          0.0000186067281929145
T3           TABLE_NAME    9003           0.000111074086415639

From the above information the CBO selected the join selectivity as 1/num_distinct of t2.object_name (i.e. the density of this column).
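A quick check of the numbers confirms this: 284278.875 * 110996 * 0.0000186067 ≈ 587113, so the selectivity used (1.8607e-05) is exactly the dictionary density of t2.object_name (1/53744), while the inner cardinality of 110996 is simply the stale num_rows of T3.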
And the final execution plan was:

---------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |
---------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |
|*  2 |   HASH JOIN         |      |      1 |   587K |   500M |
|   3 |    TABLE ACCESS FULL| T3   |      1 |   110K |    14M |
|*  4 |    TABLE ACCESS FULL| T2   |      1 |   284K |   317K |
---------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T2"."OBJECT_NAME"="T3"."TABLE_NAME")
   4 - filter("T2"."OBJECT_TYPE"='TABLE')

Note
-----
   - dynamic statistics used: dynamic sampling (level=AUTO)

It is interesting to note that the T3 table cardinality has not been estimated using sampling; instead Oracle used the stale dictionary statistics to get the T3 cardinality estimate, and this is the reason the join cardinality estimate is so bad. Even when we refresh (gather) statistics again, including histograms, the join cardinality estimate will still not be good; re-gathering statistics only helps in getting a better T3 table cardinality estimate. After gathering statistics (including histograms) the new execution plan is:

---------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |
---------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      1 |
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |
|*  2 |   HASH JOIN         |      |      1 |    75M |   500M |
|*  3 |    TABLE ACCESS FULL| T2   |      1 |   284K |   317K |
|   4 |    TABLE ACCESS FULL| T3   |      1 |    14M |    14M |
---------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T2"."OBJECT_NAME"="T3"."TABLE_NAME")
   3 - filter("T2"."OBJECT_TYPE"='TABLE')

Note
-----
   - dynamic statistics used: dynamic sampling (level=AUTO)

In this case, with both dynamic sampling at level 11 and fresh, representative statistics, the T2 table cardinality estimate is 284K against an actual 317K. We can say that even when accurate statistics are present, if dynamic sampling is used at level 11 Oracle will still sample table T2, because it thinks sampling will be efficient when applied to a table with a filter predicate. This is why I decided to check the same query without the filter predicate:

select count(*)
from t2, t3
where t2.object_name=t3.table_name
-- and t2.owner=t3.owner   -- I commented out this predicate part

The above query now contains only an equality on the join columns. In this case, as we will see in the corresponding CBO trace file, Oracle completely ignores dynamic sampling. Dynamic sampling didn't kick in, not because of the large size of the tables, but because there is no filter predicate on the tables and there are fresh and accurate statistics (the CBO ignores dynamic sampling in this case even if the statistics are stale). The CBO trace shows clearly that Oracle used dictionary statistics to estimate both the single table and the join cardinality:

SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T2[T2]
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE

*** 2014-10-10 06:24:57.445
** Performing dynamic sampling initial checks. **
** Not using old style dynamic sampling since ADS is enabled.
Table: T2  Alias: T2
  Card: Original: 11810048.000000  Rounded: 11810048  Computed: 11810048.000000  Non Adjusted: 11810048.000000
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T3[T3]
  SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE

*** 2014-10-10 06:24:57.445
** Performing dynamic sampling initial checks. **
** Not using old style dynamic sampling since ADS is enabled.
Table: T3  Alias: T3
  Card: Original: 14208000.000000  Rounded: 14208000  Computed: 14208000.000000  Non Adjusted: 14208000.000000

Best NL cost: 1029989931239.724365
  resc: 1029989931239.724365  resc_io: 1024777132847.000000  resc_cpu: 56376414617314440
  resp: 1029989931239.724365  resp_io: 1024777132847.000000  resc_cpu: 56376414617314440
SPD: Return code in qosdDSDirSetup: NOCTX, estType = JOIN
Join Card:  3122156184.578744 = outer (11810048.000000) * inner (14208000.000000) * sel (1.8607e-05)

*** 2014-10-10 06:24:59.238
Join Card - Rounded: 3122156185 Computed: 3122156184.578744

In the next (final) article I will discuss the optimizer's additional mechanism for estimating join cardinality, and I will summarize all of this.
Blog Post: Managing Oracle Database 12c with Enterprise Manager – Part XVI
We are discussing the management of Oracle Database 12c in Oracle Enterprise Manager 12c. In our previous blog post on this topic, we were exploring the Activity tab in the Performance Hub of Enterprise Manager Database Express 12c. Let us move to the Monitored SQL tab. This is the Real-Time SQL Monitoring feature of the Tuning Pack. This screen shows all the long-running SQL statements (those that have consumed 5 seconds or more of combined CPU and I/O time in a single execution, or that are using parallel query). Ever wondered why that report was taking so long? It is possible to drill down and see the plan steps executing for the SQL statement, as can be seen in the following screenshot. This helps considerably in analyzing long-running SQL statements. The next two tabs of the Performance Hub show ADDM (Automatic Database Diagnostic Monitor) and its results, including Real-Time ADDM, which was previously used for emergency database issues but now runs proactively to catch database issues before they cause a real problem. For example, the following Real-Time ADDM report shows library cache contention, and the "Show Reasons" button suggests that the system is CPU bound. Real-Time ADDM runs in the database automatically every 3 seconds and in this way is able to detect transient performance issues. The performance data in memory is examined, any performance spikes are detected, and the administrator is then informed of the spike and its root cause.
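The same Real-Time SQL Monitoring data shown in this tab can also be pulled from the command line, which is handy when EM Express is not to hand. A minimal sketch using the standard (Tuning Pack licensed) APIs; substitute a real SQL_ID for the placeholder:

select sql_id, status, username, elapsed_time
from   v$sql_monitor;

select dbms_sqltune.report_sql_monitor(
           sql_id => '&sql_id',
           type   => 'TEXT') as report
from   dual;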
Blog Post: Oracle Database 12c Installation failed with Error "network/lib/ins_net_client.mk" on Oracle Solaris 11.2
Installation of Oracle Database 12cR1 on Oracle Solaris 11.2 (x86, 64-bit) failed with the error "network/lib/ins_net_client.mk", even though all prerequisite checks were successful. The installation failed on the following screen:

Error log content from the installation logfile:

INFO: rm -f ntcontab.*
INFO: (if [ "assemble" = "compile" ] ; then \
   /u01/oradb/oracle/product/12.1.0/dbhome_1/bin/gennttab ntcontab.c ;\
   cc -c ntcontab.c ;\
   rm -f /u01/oradb/oracle/product/12.1.0/dbhome_1/lib/ntcontab.o ;\
   mv ntcontab.o /u01/oradb/oracle/product/12.1.0/dbhome_1/lib/ ;\
   /usr/ccs/bin/ar rv /u01/oradb/oracle/product/12.1.0/dbhome_1/lib/libn12.a /u01/oradb/oracle/product/12.1.0/dbhome_1/lib/ntcontab.o ; fi)
INFO: (if [ "assemble" = "assemble" ] ; then \
   /u01/oradb/oracle/product/12.1.0/dbhome_1/bin/gennttab ntcontab.s ;\
   /usr/ccs/bin/as -m64 -Kpic -o ntcontab.o ntcontab.s ;\
   rm -f /u01/oradb/oracle/product/12.1.0/dbhome_1/lib/ntcontab.o ;\
   mv ntcontab.o /u01/oradb/oracle/product/12.1.0/dbhome_1/lib/ ;\
   /usr/ccs/bin/ar rv /u01/oradb/oracle/product/12.1.0/dbhome_1/lib/libn12.a /u01/oradb/oracle/product/12.1.0/dbhome_1/lib/ntcontab.o ; fi)
INFO: sh[2]: /usr/ccs/bin/as: not found [No such file or directory]
INFO: *** Error code 127
INFO: make: Fatal error:
INFO: Command failed for target `
INFO: ntcontab.o
INFO: '
INFO:
INFO: End output from spawned process.
INFO: ----------------------------------
INFO: Exception thrown from action: make
Exception Name: MakefileException
Exception String: Error in invoking target 'mkldflags ntcontab.o nnfgt.o' of makefile '/u01/oradb/oracle/product/12.1.0/dbhome_1/network/lib/ins_net_client.mk'. See '/u01/oradb/oraInventory/logs/installActions2014-12-14_09-04-46AM.log' for details.
Exception Severity: 1

Cause: The installer is not able to find the binary "/usr/ccs/bin/as" in its expected location.

soladmin@soltest2:~$ /usr/ccs/bin/as
bash: /usr/ccs/bin/as: No such file or directory
soladmin@soltest1:~$ ls -lrt /usr/ccs/bin/as
/usr/ccs/bin/as: No such file or directory
soladmin@soltest2:~$

Solution: The package "developer/assembler" is missing from the Oracle Solaris 11.2 OS. You do not actually have to abort the installation; just install the missing package and retry, and it should work.

soladmin@soltest2:~$ pkg info developer/assembler
pkg: info: no packages matching the following patterns you specified are installed on the system.
Try specifying -r to query remotely:
        developer/assembler
soladmin@soltest2:~$

The above package is not installed on the host. Make sure you have the full IPS repository configured to install this missing package.

root@soltest2:~# pkg publisher
PUBLISHER      TYPE     STATUS P LOCATION
solaris        origin   online F file:///IPS/
root@soltest2:~# pkg search assembler
INDEX       ACTION VALUE                                           PACKAGE
pkg.fmri    set    solaris/developer/assembler                     pkg:/developer/assembler@0.5.11-0.175.2.0.0.37.0
pkg.summary set    Converts assembler source code to object code.  pkg:/developer/assembler@0.5.11-0.175.2.0.0.37.0
root@soltest2:~#

Install the package:

root@soltest2:~# pkg install solaris/developer/assembler
pkg install: The following pattern(s) did not match any allowable packages.
Try using a different matching pattern, or refreshing publisher information:
        solaris/developer/assembler

root@soltest2:~# pkg install assembler
           Packages to install:  1
       Create boot environment: No
Create backup boot environment: No

DOWNLOAD                                PKGS         FILES    XFER (MB)   SPEED
Completed                                1/1           6/6      0.2/0.2    0B/s

PHASE                                          ITEMS
Installing new actions                         13/13
Updating package state database                 Done
Updating package cache                           0/0
Updating image state                            Done
Creating fast lookup database                   Done
Updating package cache                           1/1
root@soltest2:~#

soladmin@soltest2:~$ pkg info assembler
          Name: developer/assembler
       Summary: Converts assembler source code to object code.
      Category: Development/Other Languages
         State: Installed
     Publisher: solaris
       Version: 0.5.11
 Build Release: 5.11
        Branch: 0.175.2.0.0.37.0
Packaging Date: April 14, 2014 01:02:41 PM
          Size: 625.46 kB
          FMRI: pkg://solaris/developer/assembler@0.5.11,5.11-0.175.2.0.0.37.0:20140414T130241Z
soladmin@soltest2:~$

oracle@soltest2:~$ ls -lrt /usr/ccs/bin/as
-rwxr-xr-x   1 root     bin       632072 Dec 14 09:28 /usr/ccs/bin/as
oracle@soltest2:~$

After installing the package the binary is now available and the installer should proceed without any issues. Just click "Retry" - the installation completed successfully. Thanks for reading. regards, X A H E E R
Blog Post: Oracle Database 12c fix-up scripts not fixing up the kernel parameters on Oracle Solaris 11.2
The prerequisite check for Oracle Database 12c fails on the configuration of kernel parameters. Since version 11gR2 the Oracle installer generates fix-up scripts for those parameters that can be corrected automatically. Here the prerequisite check failed for the kernel parameter "project.max-shm-memory".
- Execute the fix-up script as the root user and check the fix-up result. The fix-up result is successful, but if we check again the prerequisites still fail on the same kernel parameter. On Solaris 9/10 we would need to reboot the server for kernel parameter changes to take effect. On Solaris 11, simply exit the installer, open a new session as the oracle user with which you are installing the database, and start the installer again; it should now read the newly configured kernel parameter, so no reboot is required. Thanks for reading. regards, X A H E E R
Blog Post: Implementing a Database Authentication Scheme in APEX
The following tangential opening was written especially for Scott Wesley in the hope that he'll be minded to point out any errors in what follows. The same applies to Jeff Kemp (although I don't know if he's into the AFL). Unlike me, both of these guys are APEX experts. Football. It's a term that means different things to different people. To a European, it's most likely to be a reference to good old Association Football (or Soccer). To an American, it's more likely to be the Grid-iron game. A New Zealander will probably immediately think of Rugby Union. An Australian? Well, it's probably a fair bet that they'll think of Aussie Rules Football. On the face of it, the rules appear rather arcane to an outsider. 18-a-side teams kicking, catching and punching something that resembles a Rugby ball around a pitch that resembles a cricket oval. Then there is the scoring system. "Nice Behind", to an AFL player, is more likely to be taken as a compliment of their skill at the game than an appreciation of their anatomy. Then again, it's easy to scoff at any sport with which you are unfamiliar. For example, Rugby could be characterised as 30 people chasing after an egg. Occasionally, they all stop and half of them go into some strange kind of group hug. I wonder if the backs ever get paranoid because they think the forwards are talking about them? As for soccer, even afficionados will acknowledge that there's something a bit odd about a game where 22 millionaires spend lots of time chasing after one ball…when they're not rolling around in apparent agony after appearing to trip over an earthworm. I mean, the ball isn't that expensive, surely they can afford one each?
The point of all of this? Well, what is considered to be obscure, eccentric, or just plain odd often depends on the perspective of the observer. Take APEX authentication schemes for example. Whilst not the default, Database Authentication is a scheme that is readily available. However, there doesn't seem to be much written on this subject. In contrast, there is a fair bit out there about APEX Custom Authentication. A lot of it would appear to reinforce the idea that implementing security by hand is fraught with difficulty. Just one example can be seen here. If we were to approach this topic from the perspective of looking to migrate an elderly Oracle Forms application – where each user has their own database account – to APEX, we might be attracted to the idea of a Database Authentication Scheme and want to find out more. What follows is my adventure through setting up such an Authentication Scheme. Specifically, I'm going to cover :
Creating an APEX Database Authentication Scheme
Default behaviour
Adding a Verification Function to restrict access to a sub-set of Database Users
The vexed question of password resets
Why use Database Authentication
The Oracle documentation states : "Database Account Credentials is a good choice if having one database account for each named user of your application is feasible and account maintenance using database tools meets your needs." If we're migrating an application from Oracle Forms, then chances are that this is what we're doing now, so a Database Authentication Scheme should save us a fair bit of work. The other major advantage is that utilising the Database's built-in User and Security management means that we don't have to try and re-invent the wheel.
So, the objective here is to implement Authentication in our new Application without having to :
Create and maintain extra tables
Write lots of extra code
Figure out a secure way of storing passwords
The Application
Firing up my trusty XE 11g installation, I'll be using a simple APEX application that consists of a standard login page and, initially at least, a Home Page with two read-only fields in an HTML Region called WHOAMI. These are :
Application User – the APP_USER that I'm connected to APEX as
Database User – the actual user connected to the database
For the P1_APPLICATION_USER, the Source Type is Item (application or page item name). The source value is APP_USER. For the P1_DATABASE_USER, the Source Type is set to SQL Query (return single value). The source value is simply the query :

select user from dual

A Note on the Design
In this example, I've taken the approach that the code required to implement this functionality is included in the parsing schema (HR in this case). As a consequence, the privileges required to execute this code are also granted to the parsing schema. I've done this for the purposes of clarity. Careful consideration needs to be given to this design decision if you're planning to implement it in a "proper" production environment.
Creating a Database Authentication Scheme
After navigating to the Application in Application Builder, rather than do anything to the Application itself, we need to create a Shared Component… The type of component we want is an Authentication Scheme. NOTE – Authentication Scheme – controls login to the Application. Authorisation Scheme – governs which bits of the Application the user can see…once they're connected. Anyway, in the Security Region, select Authentication Scheme : …and then hit the Create button… We want to create a scheme "Based on a pre-configured scheme in the gallery" … In the next screen :
Name : HR_DB
Scheme Type : Database Accounts
And finally, we click the Create Authentication Scheme button and… We can see from this that HR_DB is now the Authentication Scheme currently being used by any Application in the Workspace. Anyway, now to test it. To this point, I haven't set up any users for this application. So, can I log in as a user that does exist in the database? Well, I have a user called MIKE :

select 1 from dba_users where username = 'MIKE'
/

         1
----------
         1

SQL>

So, if I now run my application and try to connect using my database credentials… … I can connect using my database credentials. It's worth noting that, despite this, the actual database connection from APEX is as the ANONYMOUS user. If you're using the APEX Listener instead of the Embedded PL/SQL Gateway (the default in XE), then it'll probably be APEX_PUBLIC_USER. So, in order to login to my application, you now have to be a database user. All the messy password encryption stuff is handled by Oracle and I can now get on with polishing my finely crafted APEX Application….or so you might think. Just consider this : …also lets you connect : We're not fussy, we'll let anyone in ! Now, my imaginary Forms application – remember, that's the one I want to migrate to APEX – may be sitting on a Database Instance with a number of other Applications. So, how do I restrict access to my application to a subset of the users in the database? Time for a bit of a re-think then…
The verify function
What we need is a means of identifying a database user as an Application user. At this point it may well be worth revisiting the role of database roles in APEX applications.
Hang on, you're thinking, last time you said they were pretty much useless in APEX. Well, bear with me.
Roles as Privileges, sort of
What we're going to do here is to simply create an empty role and assign it to a database user :

create role hr_user
/

grant hr_user to mike
/

We now have some means of determining which database users are our application users :

select 1
from dba_role_privs
where granted_role = 'HR_USER'
and grantee = 'MIKE'
/

The function
Now all we need is a function that checks to see if the user attempting to login has this role granted to them. It's worth bearing in mind here that, for a function based on the above statement, select privileges on DBA_ROLE_PRIVS are required. To start with I'm going to grant the privilege to HR :

grant select on sys.dba_role_privs to hr
/

and then I'm going to create the function in the HR schema :

create or replace function is_hr_user_fn return boolean is
--
-- Is this user a database user with privileges to access the APEX Application ?
-- NOTE - the owner of this function requires SELECT privilege on DBA_ROLE_PRIVS
--
    l_dummy pls_integer;
begin
    select 1 into l_dummy
    from sys.dba_role_privs
    where granted_role = 'HR_USER'
    and grantee = apex_040200.v('APP_USER');

    return true;
exception
    when no_data_found then
        raise_application_error(-20000, 'You are not an application user');
end;
/

You'll note that the references to both DBA_ROLE_PRIVS and the V function are done directly on the objects themselves rather than through their public synonyms. In many cases, but especially where security is concerned, it's usually a good idea to make sure that you're referencing the object that you intend rather than relying on a synonym. If you want to see an example of how public synonyms can be changed to point to objects other than those originally intended, then have a look here.
Now we need to tell our Authentication scheme to use this function as the Verify Function. In the Application Builder, go back to the Shared Components screen then select Authentication Schemes. Now click on the pencil icon next to HR_DB – Current : If you want to be a bit more discerning… In the Session Not Valid section, there is a field called Verify Function Name. In here, simply enter the name of our function – i.e. is_hr_user_fn : …add a Verify Function And save the changes. So, we should now be able to connect as MIKE, but not as any other database user. Connecting as MIKE works as before. However, for SYSTEM, the results are slightly different : Your name's not down, you're not coming in ! As we can see, the Application Error raised by the function is displayed. If you hit the OK button, you'll then be returned to the Login Page.
The Principle of Least Privilege
In case you're not familiar with the term, it basically boils down to the principle that access to an application should be restricted to the minimum level required for a user, application or program to function. Have a look here for a proper explanation. It's probably worth noting that implementing this approach to Authentication means that, in order to create a new application user, all that's required is the following :

create user plugger identified by pwd
/

grant hr_user to plugger
/

In case you're wondering, Plugger is the nickname of a certain Tony Lockett who, apparently, was a pretty good Aussie Rules player in his time. Anyway, as you can see, our new user requires no system privileges, not even CREATE SESSION.
They simply need to be granted the role so that they can be identified as an application user. Whilst we're on the subject of least privilege, you might consider that it is by no means necessary for the parsing schema of an APEX application to have CREATE SESSION privileges, or indeed, to even be the owner of the application's database objects. This applies irrespective of the Authentication Scheme being used. We now have a robust and efficient Authentication Scheme. There is, however, one rather thorny issue that we still need to consider.
Changing Passwords
Whilst we now have a mechanism for authenticating users through their database accounts, unless we give them the facility to change their passwords before they expire, we'll be storing up a significant amount of admin for the poor, hard-pressed DBA. The venerable Forms Application we're migrating was written in the days prior to SSO becoming prevalent, and authentication is still managed entirely within the database. Remember, the whole point of choosing Database Authentication is so that we minimise the amount of effort required to migrate this application onto APEX in terms of re-coding the Application's Security Model. This is where things get a bit tricky. Whilst our users are authenticating as themselves, they are actually connecting to the database as ANONYMOUS or APEX_PUBLIC_USER. Therefore, we need a procedure in a schema with ALTER USER privileges to change passwords from within the APEX application. So, how do we provide this functionality in our application?
Danger ! Assumption Imminent !
As I'm all too aware (often through bitter experience), Assumption is the Mother of all ***-ups. Therefore, the assumption I'm about to make here requires careful explanation. Here goes then… I'm assuming that I can safely call a stored procedure from within APEX, passing a user password in clear text. Clear text ! I hear you cry, Have you gone mad ? Well, possibly. On the other hand, a trawl through the APEX documentation reveals that there are a few package members in the APEX packages themselves where this takes place. These are :
APEX_UTIL.IS_LOGIN_PASSWORD_VALID
APEX_UTIL.EDIT_USER
APEX_CUSTOM_AUTH.LOGIN
APEX_AUTHENTICATION.LOGIN
Further research reveals that, certainly in the latest versions of APEX, there do not appear to be any exploits available to compromise these procedures. The most recent one I found was for APEX 3.1, an example of which can be seen on the Red Database Security site. As well as giving the user the ability to change their password at any time, we also want to check immediately after the user connects and find out whether their password is near to expiry. If so, then we need to re-direct them to a password change page. What was Jeff saying about scary code? Anyway, the steps to build this functionality are, in order :
Create a Change Password Procedure to be called from the application
Create a Change Password Page where the user can change their password (and which will call the procedure)
Create a branch in the Application to re-direct a user to the Change Password Page if their password is due to expire
Allowable characters in the password
As we're going to have to change the password by executing an ALTER USER command from within a PL/SQL procedure, we're going to have to use dynamic SQL. Critically, we're not going to be able to use bind variables for this command because it's a DDL statement.
In order to ensure that the resulting procedure is not vulnerable to SQL Injection, we're going to have to make sure that passwords do not contain the single quote (') character. To do this, we're going to create a profile for our application users which includes a password verify function, and assign it to them. So, the Password Verify Function, which needs to be created in the SYS schema, looks like this :

create or replace function new_hr_verify_fn
(
    username varchar2,
    password varchar2,
    old_password varchar2
)
    return boolean
is
--
-- A very simple password verify function. In this case, all we're interested in
-- is that the password should not contain a quote (') character.
-- NOTE : this function is purely to illustrate this sole restriction.
-- A proper password verification function would be rather more extensive.
--
begin
    if instr(password, q'[']') > 0 then
        raise_application_error(-20000, q'[Password cannot contain a "'"]');
    end if;
    return true;
end;
/

A quick test of this function shows that it works as expected :

set serveroutput on size unlimited
declare
    --
    -- Script to test the password verify function
    --
    type typ_passwords is table of varchar2(100) index by pls_integer;
    tbl_passwords typ_passwords;
    lc_old_password constant varchar2(50) := 'DUMMY';
    l_dummy boolean;
begin
    tbl_passwords(1) := q'[Hawthorn Top O' the heap!]';
    tbl_passwords(2) := 'Tony||chr(39)||or 1=1||chr(39)';
    tbl_passwords(3) := q'[Tony '; select * from dba_users --]';
    tbl_passwords(4) := 'beware men with funny shaped balls';
    for i in 1..tbl_passwords.count loop
        begin
            l_dummy := new_hr_verify_fn
            (
                username => user,
                password => tbl_passwords(i),
                old_password => lc_old_password
            );
            dbms_output.put_line('Test '||i||' Password '||tbl_passwords(i)||' is allowed');
        exception
            when others then
                dbms_output.put_line('Test '||(i)||' ERROR : '||sqlerrm);
        end;
    end loop;
end;
/

Run this and we get…

SQL> @test_verify
Test 1 ERROR : ORA-20000: Password cannot contain a "'"
Test 2 Password Tony||chr(39)||or 1=1||chr(39) is allowed
Test 3 ERROR : ORA-20000: Password cannot contain a "'"
Test 4 Password beware men with funny shaped balls is allowed

PL/SQL procedure successfully completed.

Note that, although the string containing "chr(39)" is allowed, because there is no way to concatenate a quote into the entry string, this is treated as a collection of characters rather than a call to the CHR function. Incidentally, 39 is the ASCII code for a single quote. Also note that this particular password verify function has been kept deliberately simple for the purposes of clarity. Something rather more complex is likely to be in place in a real-life production scenario. The profile, then, looks like this :

create profile hr_default limit
    failed_login_attempts 10
    password_life_time 30
    password_reuse_time 1800
    password_reuse_max 60
    password_lock_time 1
    password_grace_time 7
    password_verify_function new_hr_verify_fn
/

Finally, we're going to assign the profile to PLUGGER :

alter user plugger profile hr_default
/

The Change Password Procedure
Once again, this procedure is being created in the HR schema. It will ultimately be used to issue the ALTER USER command to change the passwords. Therefore, we need to grant the ALTER USER privilege to HR :

grant alter user to hr
/

As this procedure also needs to reference DBA_USERS, we'll need to grant SELECT on that too.

grant select on sys.dba_users to hr
/

When writing this procedure, paranoia is the watchword.
Objects need to be referenced directly, rather than via synonyms, and any user input needs to be sanitised before we plug it into the dynamic SQL statement we need to run. The result might look something like this :

create or replace procedure change_apex_user_pwd_pr
(
    i_old_pwd in varchar2,
    i_new_pwd in varchar2
)
is
--
-- Procedure to change the password for a user of the NEW_HR APEX application.
-- The old password is required, as well as the new one because, if we're
-- using a verify function in the profile the user is assigned to, the
-- old password must be specified in the ALTER USER statement.
--
    l_user sys.dba_users.username%type;
    lc_apex_user constant sys.dba_users.username%type := 'ANONYMOUS';
    l_dummy pls_integer;

    cursor c_validate_user( cp_user sys.dba_users.username%type) is
        select 1
        from sys.dba_users usr
        inner join sys.dba_role_privs rol
            on rol.grantee = usr.username
        where usr.username = cp_user;
begin
    --
    -- Make sure that the parameter values have been specified
    --
    if i_new_pwd is null or i_old_pwd is null then
        raise_application_error(-20000, 'Both the Old Password and the New Password must be specified');
    end if;
    --
    -- Sanitize the user input parameters to prevent SQL Injection.
    -- This boils down to rejecting strings that contain a "'"
    --
    if instr(i_old_pwd, q'[']') > 0 or instr(i_new_pwd, q'[']') > 0 then
        raise_application_error(-20001, 'Passwords must not contain the '||CHR(39)||' character.');
    end if;
    --
    -- Additionally, check that the password does not exceed the maximum length
    -- allowed ( 50 in 11g)
    --
    if length( i_old_pwd) > 50 or length( i_new_pwd) > 50 then
        raise_application_error(-20002, 'Passwords must not exceed 50 characters in length.');
    end if;
    --
    -- Now validate that the user is indeed
    -- (a) calling the function from APEX
    -- (b) exists in the database
    -- (c) is a user of this application
    -- ...also check that the username does not contain a quote character
    -- to guard against a Blind Injection.
    --
    l_user := apex_040200.v('APP_USER');
    if l_user is null or user != lc_apex_user then
        raise_application_error(-20003, 'This function can only be called from APEX');
    end if;

    open c_validate_user( l_user);
    fetch c_validate_user into l_dummy;
    if c_validate_user%notfound then
        close c_validate_user;
        raise_application_error(-20004, 'This user is not a NEW_HR Application user');
    end if;
    close c_validate_user;
    --
    -- Now change the password. The REPLACE clause is required in case the
    -- user's default profile has a password verify function specified
    --
    execute immediate 'alter user '||l_user||' identified by '||i_new_pwd||' replace '||i_old_pwd;
end;
/

In the procedure itself, we're taking a number of precautions :
Values for both parameters must be supplied
The input parameter values must not exceed 50 characters – the maximum length of an 11g password
The input parameter values must not contain a single quote character
The user currently connected to the database is the APEX user (in my case ANONYMOUS)
A call to the V function for the application user returns a value
The application user we're changing is indeed a valid user of the NEW_HR APEX application – and a database user
References to any database objects are done directly and not via synonyms
Hopefully, that's enough paranoia to prevent the procedure being misused. Once again, we can use a test harness to check the parameter tests at least :

set serveroutput on size unlimited
declare
    --
    -- test for the change_apex_user_pwd_pr procedure.
-- Note all of these tests should fail as we're running from SQL*Plus and
-- are not connected as ANONYMOUS.
--
    type rec_params is record
    (
        old_pwd varchar2(100),
        new_pwd varchar2(100)
    );
    type typ_params is table of rec_params index by pls_integer;
    tbl_params typ_params;
begin
    -- populate the test parameter array
    -- Test 1 - missing old password value
    tbl_params(1).old_pwd := null;
    tbl_params(1).new_pwd := 'Boring';
    -- Test 2 - missing new password value
    tbl_params(2).old_pwd := 'Boring';
    tbl_params(2).new_pwd := null;
    -- Test 3 - old password contains a quote
    tbl_params(3).old_pwd := q'[I'm a silly password]';
    tbl_params(3).new_pwd := 'sensible';
    -- Test 4 - new password contains a quote
    tbl_params(4).old_pwd := 'sensible';
    tbl_params(4).new_pwd := q'[Who's sensible now ?]';
    -- Test 5 - old password > 50 characters
    tbl_params(5).old_pwd := 'just leaning on the keyboard until i have printed over 50 characters zzzzz';
    tbl_params(5).new_pwd := 'short_and_to_the_point';
    -- Test 6 - new password > 50 characters
    tbl_params(6).old_pwd := 'short_and_to_the_point';
    tbl_params(6).new_pwd := 'just leaning on the keyboard until i have printed over 50 characters zzzzz';
    -- Test 7 - parameters are valid but we're not connected through APEX...
    tbl_params(7).old_pwd := 'Valid_pwd';
    tbl_params(7).new_pwd := 'anotherboringpassword';
    --
    -- Execute the tests
    --
    for i in 1..tbl_params.count loop
        begin
            change_apex_user_pwd_pr
            (
                i_old_pwd => tbl_params(i).old_pwd,
                i_new_pwd => tbl_params(i).new_pwd
            );
            dbms_output.put_line('Test '||i||' - Something has gone wrong - no error !');
        exception
            when others then
                dbms_output.put_line('Test '||i||' Error : '||sqlerrm);
        end;
    end loop;
end;
/

Running this gives us :

SQL> @change_pwd_test
Test 1 Error : ORA-20000: Both the Old Password and the New Password must be specified
Test 2 Error : ORA-20000: Both the Old Password and the New Password must be specified
Test 3 Error : ORA-20001: Passwords must not contain the ' character.
Test 4 Error : ORA-20001: Passwords must not contain the ' character.
Test 5 Error : ORA-20002: Passwords must not exceed 50 characters in length.
Test 6 Error : ORA-20002: Passwords must not exceed 50 characters in length.
Test 7 Error : ORA-20003: This function can only be called from APEX

PL/SQL procedure successfully completed.

SQL>

To test the rest of the function, we will of course need to be connected via APEX.

The Change Password Page

Now we come to the page we will be using to call the procedure we've just created. The page will have :

- a password field for the application user to enter their current password
- a password field for the application user to enter their new password and another one for them to re-type it
- some validation that the new password and confirm password match
- a button to call the change password procedure
- a field to present a message to the user after the password change call

Sounds simple (dangerous) enough… In Application Builder hit the Create Page button… select Blank Page…. In the Page Attributes…

Page Alias : change_db_pwd

In the Page Name…

Name : Change My Password
HTML Region1 : change password

In Tab Options…

Tab Options : Use an existing tab set and create a new tab within the existing tab set
New Tab Label : Change Password

…and hit Finish. Now Edit the Page. Create a new field with an Item Type of Password. In the Display Position and Name screen :

Item Name : PX_OLD_PWD (where X is the number of the page you're editing).
In the Item Attributes Screen :

Label : Current Password
Field Width : 50

In the Settings Screen :

Value Required : Yes
Submit when Enter pressed : No

In the Source Screen :

Source Used : Always, replacing any existing session state

Hopefully, the APEX5 Graphical Page Designer will result in fewer screenshots being required in future ! And hit Create Item. Now create two further fields with the same properties except :

PX_NEW_PWD has a label of New Password
PX_CONFIRM_PWD has a label of Confirm New Password

Next, we create a Display Only field called PX_MESSAGE. We'll use this to provide feedback to the user. We define this with no label so that it doesn't show up on the screen until it's populated. Now that we've got all of the fields on the page, the next step is to create the Change Password button. Accept the defaults for Button Region and Button Position. In the Button Attributes Page :

Button Name : change_pwd_btn
Label : Change Password

Then just hit Create Button. Finally, we need to add a Dynamic Action to validate that the values in PX_NEW_PWD and PX_CONFIRM_PWD are not null and identical, and then to call the procedure. NOTE – I daresay any APEX experts reading this may have a better way of doing this ! So, Create a Dynamic Action. In the Identification Page :

Name : change_pwd_da

In the When Page :

Action : Click
Selection Type : Button
Button : CHANGE_PWD_BTN

In the True Action Page :

Action : Execute PL/SQL Code

The PL/SQL Code is as follows :

begin
    if nvl(:P6_NEW_PWD, 'x') != nvl(:P6_CONFIRM_PWD, 'y') then
        :P6_MESSAGE := 'Confirm Password does not match New Password.';
    else
        hr.change_apex_user_pwd_pr
        (
            i_old_pwd => :P6_OLD_PWD,
            i_new_pwd => :P6_NEW_PWD
        );
        :P6_MESSAGE := 'Your password has been changed';
    end if;
exception
    when others then
        :P6_MESSAGE := SQLERRM;
end;

Page Items to Submit : P6_OLD_PWD,P6_NEW_PWD,P6_CONFIRM_PWD,P6_MESSAGE
Page Items to Return : P6_MESSAGE

Click Create Dynamic Action. Now to test. I'm connected as PLUGGER and I want to change my password. So, I click on the Change Password Tab and I see :

If the new and confirm password fields don't match, I get an error from the Dynamic Action itself, before it calls the procedure :

Someone's having a fat-finger moment

If I try to enter a password that contains a single quote, I get :

We'll have none of those naughty quotes thank you very much.

Finally, I manage to get it right and am rewarded with :

Invoking the Change Password Programmatically

All that remains now is for us to arrange for the user to be re-directed to the change password page when they connect and their password is near expiry. The password expiry_date is available in the DBA_USERS view, so we need to grant SELECT on this to HR :

grant select on sys.dba_users to hr
/

As I'm re-directing them to a page that belongs specifically to the current application, I'm going to put the re-direction in the application itself. So, I'm going to add a Branch to the Home Page. Once again we need to pause here for the APEX gurus to explain the proper way to do this ! Edit the Home Page and Create a Branch…

In Branch Attributes :

Name : pwd_change_br
Branch Point : On Load : Before Header

In Target Page : the number of the Change Password Page ( 6 in my case)

In Branch Conditions :

Condition Type : Exists( SQL query returns at least one row)

In Expression 1, enter the query :

select 1
from sys.dba_users
where username = apex_040200.v('APP_USER')
and expiry_date < trunc(sysdate) + 7

This will return 1 if the password is due to expire within the next 7 days.
…and hit Create Branch.

In order to test the branch, I've had a bit of a fiddle with the FIXED_DATE parameter [link to post] so that PLUGGER's password is now due to expire in less than 7 days. Now, when I log in as plugger… …I go straight to the Change Password Page…

Summary

What started off as a fairly short post about Database Authentication Schemes in APEX has grown quite a bit more than I intended. I believe that the solution to password management, which I have outlined here, is secure. Obviously, if anyone can spot any flaws in this, I (and anyone reading this) would find it immensely helpful if you could provide reasons/code as to why and how this approach could be exploited. Whilst the Change Password functionality is something of an overhead in going down the Database Authentication route, the use of database roles, not to mention the RDBMS itself, does mean that this is an approach worth considering when porting older applications to APEX… or maybe it isn't. I wonder if there's a passing Australian who'd like to share their opinion on this?

Filed under: APEX, Oracle, PL/SQL, SQL Tagged: APEX Database Authentication Scheme, change password procedure, dba_role_privs, dba_users, password verify function, profile, verify function
↧
Comment on APEX 503 – Service Unavailable – And you don’t know the APEX_PUBLIC_USER Password
@Damir, at the time I ran into this problem, I didn't know how to rerun the listener setup as I hadn't set it up in the first place. Is it the case that you should be able to re-execute the setup using the apex.war and then just "bounce" it for the new settings to take effect ? When doing this, do you need to provide all of the setup settings again or can you just change the one you're interested in ? Additionally, is this something you would reasonably consider doing every, say three months, to enable you to have an APEX_PUBLIC_USER account with a password set to expire ? Thanks, Mike
↧
↧
Blog Post: UKOUG Tech14 Slides: Testing Jumbo Frames for RAC
Just a quick post to link to the slides from my presentation at the UKOUG Tech14 conference. The slides do not appear to be visible on the Super Sunday site, hence this post. The presentation was called “Testing Jumbo Frames for RAC”, the emphasis as much on the testing as on Jumbo Frames. Abstract: A discussion on the usage of Jumbo Frames for the RAC interconnect. We’ll cover background information and how this can be tested at the Oracle level and in the Operating System. We’ll also discuss how testing initially failed and why it’s interesting. This is a topic that enters the realm of network and Unix admins, but the presentation is aimed at DBAs who want to know more and want to know how to use Operating System tools to investigate further. Slides: http://www.osumo.co.uk/presentations/Jumbo%20Frames-Tech14-Public.pdf The conference itself was another success, with particular highlights for me being James Morle talking about SAN replication vs DataGuard, Ludovico Caldara talking about PDBs with MAA, Richard Foote because I was like a silly little fanboi, and Bryn Llewellyn purely for the way he weaves the English language into something beautiful on the ears. All of my programs will have “prologues” and “epilogues” from now on; “headers” and “footers” are so passe :-) Equal to the presentations was the social side of things too. I do enjoy hanging around in pubs and talking shop.
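As a small addendum to the abstract above: one simple Oracle-level check when testing interconnect changes is confirming which interface and address the cluster is actually using, and where that setting came from. A minimal sketch against the standard RAC view (nothing here is specific to the presentation itself):

select inst_id, name, ip_address, is_public, source
from   gv$cluster_interconnects
order by inst_id;

The SOURCE column is the useful part here, since it indicates whether the interconnect came from the cluster registry or from the CLUSTER_INTERCONNECTS parameter.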
↧
Blog Post: The UKOUG Apps14 Conference
Cedar had a strong presence at the recent Apps14 conference in the ACC, Liverpool. It is supposed to be the largest applications conference in Europe with over 800 attendees, and we were particularly interested in the PeopleSoft and Apps Innovation streams. This year we’d decided not to have a stand (as the conference is mixed with the other applications and the database/tech community, it means the exhibition stands are much more expensive than at the PeopleSoft-only Roadshow earlier in the year, while the attendance of the PeopleSoft community is much lower) but we did support the conference with four of our team and three speaking slots. It was nice to see a different city, although I’m pleased to see that it’s back in Birmingham next year as that is a lot more central for everyone. This is the view from my hotel room out over Albert Docks (the ACC is out to the left). In terms of our speakers, our GP guru Alex spoke about Global Payroll upgrades to v9.2 (as Cedar has helped several clients either complete their move to v9.2 or with an upgrade in progress at the moment): Several of us also took part in the Oracle Usability Feedback session. We have ahem quite strong opinions on how a UX should be so it was really interesting to be part of the process. We can’t talk about the product that was being tested, but it looks really nice. I can’t wait until it hits GA. Here’s a picture of Simon giving it a thorough test with the lovely Rhonda (the screen has been intentionally blurred): Cedar’s tech guru Neville also spoke on a couple of topics. He covered PeopleSoft Selective Adoption / PeopleSoft Update Manager in one session (joint with Hays, who are using some of our upgraders at the moment on their massive upgrade – HCM, Fins, CRM, Portal, ELM, all at the same time!). He also spoke about Oracle Secure Enterprise Search (joint with Allen & Overy, who we helped to upgrade to 9.2 earlier in the year). The evening in between the two PeopleSoft days was pretty fun too. It started off with a familiar looking (for Liverpool, especially) band called ‘The Cheatles’ (pic below) and then many from the PeopleSoft community sat down for ‘off the record’ chat over a decent meal.
↧
Blog Post: 12cR1 Upgrade in Oracle Applications EBS R12.2
Introduction: Oracle E-Business suite R12.2 is shipped with Oracle Database Version 11.2.0.3. There will be always a requirement for upgrading your database to latest available release for fixing bugs, using new features and to be on supported version of database. This article will outline all steps required for upgrading and using 12cR1 database with your Oracle E-Business suite R12.2 Environment details: Host : erpnode3 ( Oracle Enterprise Linux 5.7) Installation Type : Host Details erpnode3 - Oracle Enterprise Linux 5.7 64 Bit EBS Installation Type Single Node (DB + Application) EBS Version R12.2.4 Database Version 11.2.0.3 STEPS for Upgrade: 1) Pre-Upgrade Steps: 1.1 - Install 12cR1 RDBMS Software 1.2 - Install R12cR1 Examples CD 1.3 - Create /nls/data/9idata directory 1.4 - Install all pre-requisite Database and Application patches 1.5 - Verify the JRE version in Oracle Home 1.6 - Verify Application patching cycle is complete 1.7 - Drop SYS.ENABLED$INDEXES 1.8 - Remove the MGDSYS schema 1.9 - Run pre upgrade tool 2) Upgrade Database 2.1 - Use DBUA for upgrading database 3) Post Upgrade Steps: 3.1 - Install patch post Installation steps for all RDBMS patches (opatches) 3.2 - Start Listener from 12cR1 Home 3.3 - Application Database accounts Expired & Locked 3.4 - Run adgrants.sql 3.5 - Grant create procedure privilege on CTXSYS 3.6 - Compile Invalid Objects 3.7 - Set CTXSYS parameter 3.8 - Validate Workflow ruleset 3.9 - Create context file and run autoconfig on dbTier 3.10 - Run script "adstats.sql" to gather SYS stats 3.11 - Create new MGDSYS schema 3.12 - Apply post upgrade WMS patches 3.13 - Recreate grants and synonyms 3.14 - Start Application services 3.15 - Run concurrent Request "Synchronize workflow views" 3.16 - Verify the upgraded version from OAM 4) Issues 4.1 - Unable to proceed with 12c runInstaller with error "[INS-10102] Installer initialization failed" 4.2 - DBUA not listing the database 4.3 - Invalid specification of system parameter "LOCAL_LISTENER" 4.4 - Application Database account locked after upgrade 1.1 - Install 12cR1 RDBMS Software Install 12cR1 RDBMS Oracle Home in a separate directory from the existing Oracle Home. The current Oracle Home location is "/u01/ora_test/11.2.0" and 12cR1 Oracle Home will be Installed in "/u01/ora_test/12.1.0" - Set the Proper Display Variable and execute "runInstaller" - Here select option "Install Database software only" - Select "Single Instance Database" - If you have any other language Installed and configured than add required languages. - Select Enterprise Edition - Provide the valid Oracle Base and Oracle home location. - Select the valid groups for all roles. In my case only one group is used. - If there are any missing pre-requisites listed then fix it. Some of these pre-requisites can be fixed using fixup script and missing rpm package cannot be fixed using fixup script. Hence install any listed missing rpm packages. - Execute fixup script from the specified location as "root" user. - Check "summary" of setting configured for Installation. If there are any changes you can edit it from the same screen. - Execute "root.sh" script from root user. - Here the Installation of RDBMS software completed. 1.2 - Install R12cR1 Examples CD - Installation of Examples CD is mandatory and one should not skip its Installation before starting the upgrade process. Examples CD will be Installed in existing newly Installed 12cR1 Oracle home. 
- set the proper display variable and execute runInstaller - Here select the existing 12cR1 Oracle home (/u01/ora_test/12.1.0) - Check "summary" for configured settings. - Installation of "Examples CD" completed successfully. 1.3 - Create /nls/data/9idata directory Execute "cr9idata.pl" script from 12c Home ( $ORACLE_HOME/nls/data/old/cr9idata.pl ). Configure "ORA_NLS10" Environment variable with directory created in 12c Home [oraebs@erpnode3 ~]$ perl $ORACLE_HOME/nls/data/old/cr9idata.pl Creating directory /u01/ora_test/12.1.0/nls/data/9idata ... Copying files to /u01/ora_test/12.1.0/nls/data/9idata... Copy finished. Please reset environment variable ORA_NLS10 to /u01/ora_test/12.1.0/nls/data/9idata! [oraebs@erpnode3 ~]$ 1.4 - Install all pre-requisite Database and Application patches Application Patches: Current EBS environment is running on latest release EBS R12.2.4 so there are no additional application patches are required to be Installed. Database Patches: The following RDBMS patches need to be Installed as a pre-requisites before upgrading the Database. Patches Linux X86-84 Bit - Version 12.1.0.1.0 17695298 (see Footnote 2 ), 14237793, 16989137, 17184721, 17448638, 17600719, 17801303, 17892268, 17912217, 17973865, 17973883, 18288676, 18419770, 18604144, 18614015, 18665660, 18685209, 19466632, 19603897 (see Footnote 3 ), 19393542 Footnote 2 - This is the Database Bundle Patch for 12.1.0.1.0 and must be applied first. This includes the database patch for 18241194 which will be removed in one of the subsequent patches as it is no longer needed. Footnote 3 - If a conflict is reported with 18241194, either roll back 18241194 first (opatch rollback, then apply 19603897), or allow opatch to roll it back when applying Patch 19603897 . Configure 12c Enviroment variables: [oraebs@erpnode3 ~]$ cat 12c.env export ORACLE_HOME=/u01/ora_test/12.1.0 export PATH=$ORACLE_HOME/bin:$ORACLE_HOME/OPatch:$PATH export ORA_NLS10=/u01/ora_test/12.1.0/nls/data/9idata [oraebs@erpnode3 ~]$ [oraebs@erpnode3 ~]$ export ORACLE_HOME=/u01/ora_test/12.1.0 [oraebs@erpnode3 ~]$ export PATH=$ORACLE_HOME/bin:$PATH [oraebs@erpnode3 ~]$ export PATH=$ORACLE_HOME/bin:$ORACLE_HOME/OPatch:$PATH [oraebs@erpnode3 ~]$ vi 12c.env [oraebs@erpnode3 ~]$ which opatch /u01/ora_test/12.1.0/OPatch/opatch [oraebs@erpnode3 ~]$ opatch lsinventory Oracle Interim Patch Installer version 12.1.0.1.0 Copyright (c) 2012, Oracle Corporation. All rights reserved. Oracle Home : /u01/ora_test/12.1.0 Central Inventory : /u01/ora_test/oraInventory from : /u01/ora_test/12.1.0/oraInst.loc OPatch version : 12.1.0.1.0 OUI version : 12.1.0.1.0 Log file location : /u01/ora_test/12.1.0/cfgtoollogs/opatch/opatch2014-11-17_15-53-52PM_1.log Lsinventory Output file location : /u01/ora_test/12.1.0/cfgtoollogs/opatch/lsinv/lsinventory2014-11-17_15-53-52PM.txt ------------------------------------------------------------------------------ Installed Top-level Products (2): Oracle Database 12c 12.1.0.1.0 Oracle Database 12c Examples 12.1.0.1.0 There are 2 products installed in this Oracle Home. There are no Interim patches installed in this Oracle Home. ------------------------------------------------------------------------------OPatch succeeded. [oraebs@erpnode3 ~]$ The above command lists the Installed products in an Existing Oracle home. All pre-requisites patches needs to be Installed in 12.1.0.1.0 Oracle home. As per footnote2 patch " 17695298" should be Installed first as this patch contains the consolidated bug fixes. 
As per footnode3 If a conflict is detected with patch " 18241194 " than uninstall this patch than Install patch " 19603897 ". Refer Attached file for Installing all RDBMS opatches. [applebs@erpnode3 appl_test]$ adop phase=cutover,cleanup 12c_Opatches.doc 1.5 - Verify the JRE version in Oracle Home To upgrade to 12cR1 minimum version of JRE required is version 6. Please make sure that Installed version of JRE in an existing Oracle Home. If the Installed version is lower than required then please upgrade it. In current setup the Installed version is version 7 so no action is required. [oraebs@erpnode3 bin]$ pwd /u01/ora_test/11.2.0/appsutil/jre/bin [oraebs@erpnode3 bin]$ ./java -version java version "1.7.0_17" Java(TM) SE Runtime Environment (build 1.7.0_17-b02) Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode) [oraebs@erpnode3 bin]$ 1.6 - Verify Application patching cycle is complete Before starting the upgrade process verify the Application patching cycle is complete and there no pending actions. [applebs@erpnode3 appl_test]$ adop phase=cutover,cleanup 1.7 - Drop SYS.ENABLED$INDEXES If table SYS.ENABLED$INDEXEX exists then drop this table with sysdba user. In current setup this table doesn't exists. [oraebs@erpnode3 11.2.0]$ sqlplus / as sysdba SQL*Plus: Release 11.2.0.3.0 Production on Tue Nov 18 11:39:15 2014 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options SQL drop table sys.enabled$indexes; drop table sys.enabled$indexes * ERROR at line 1: ORA-00942: table or view does not exist SQL desc sys.enabled$indexes; ERROR: ORA-04043: object sys.enabled$indexes does not exist SQL 1.8 - Remove the MGDSYS schema If upgrading from database version prior to 12c than drop MGDSYS schema from the existing database. Execute script " catnomgdidcode.sql " from an existing Oracle home. SQL @?/md/admin/catnomgdidcode.sql User dropped. Synonym dropped. Synonym dropped. Synonym dropped. Synonym dropped. SQL 1.9 - Run pre upgrade tool Pre Upgrade tool will lists all changes need to be performed before starting the upgrade process. Copy pre upgrade scripts from 12cR1Oracle home to any other directory. I used "db_scripts" directory to keep all upgrade related scripts in one location. [oraebs@erpnode3]$cp /u01/ora_test/12.1.0/rdbms/admin/preupgrd.sql . [oraebs@erpnode3]$cp /u01/ora_test/12.1.0/rdbms/admin/utluppkg.sql . Make sure that preupgrd.sql and utluppkg.sql files are copied in the same directory. - Connect from 11.2.0.3 Home as sysdba and run pre-upgrade tool SQL @preupgrd.sql Loading Pre-Upgrade Package... Executing Pre-Upgrade Checks... Pre-Upgrade Checks Complete. ************************************************************ Results of the checks are located at: /u01/ora_test/11.2.0/cfgtoollogs/test/preupgrade/preupgrade.log Pre-Upgrade Fixup Script (run in source database environment): /u01/ora_test/11.2.0/cfgtoollogs/test/preupgrade/preupgrade_fixups.sql Post-Upgrade Fixup Script (run shortly after upgrade): /u01/ora_test/11.2.0/cfgtoollogs/test/preupgrade/postupgrade_fixups.sql ************************************************************ Fixup scripts must be reviewed prior to being executed. 
************************************************************ ************************************************************ ==== USER ACTION REQUIRED ==== ************************************************************ The following are *** ERROR LEVEL CONDITIONS *** that must be addressed prior to attempting your upgrade. Failure to do so will result in a failed upgrade. 1) Check Tag: INVALID_SYS_TABLEDATA Check Summary: Check for invalid (not converted) table data Fixup Summary: "UPGRADE Oracle supplied table data prior to the database upgrade." +++ Source Database Manual Action Required +++ You MUST resolve the above error prior to upgrade ************************************************************ SQL Oracle Database 12c Pre upgrade utility will generate preupgrade.log, preupgrade_fixups.sql and postupgrade_fixup.sql We need to review preupgrade.log file and check for all recommendation. This will be helpful if you are performing the manual upgrade. - perform full database backup. 2) Upgrade Database 2.1 - Use DBUA for upgrading database - Configure the temporary environment file during upgrade process. After enabling autoconfig on 12c Home it will generate new environment file with all required EBS parameters - Execute dbua from 12c Home Make sure that oratab file contains a valid SID entry in "/etc/oratab" file. In my environment after fresh installation this entry was missing and dbua was not able to list the database. After manually adding entry in oratab file it was listed in dbua window. Its gathering the information required for upgrading the database. - Performing the pre-requisite checks As mentioned in the tech notewe can safely ignore DBMS LDAP dependencies. If there are any Invalid objects in the database try to compile them. - Select "ignore" and set action "ignore" then click on next. - You can select Parallelism parameters based on available hardware resources. If you want to upgrade the Timezone Data then check "Upgrade Timezone Data" it will upgrade your Timezone Data to V18 , the current version of Timezone is 17. - Listener will be configured after upgrading the database. So do not select to configure listener. - You can perform a backup before starting the upgrade process if there is no backup policy in place. If you backed up your database already then you can select "I have my own backup and restore strategy" - This error was encountered as there are no listeners configured in 12.1.0.1 Oracle Home. You can safely ignore this error. We will manually configure the listener after completion of database upgrade process. To avoid this error you can create the listener from 12cR1 Oracle home before starting dbua. - You can also avoid this error by removing the LOCAL_LISTENER parameter from the spfile created by dbua and re-run dbua. - There was an error in upgrade alert log " SYSTEM.EBS_LOGON ORA-01031:Insufficient privileges " By default SYSTEM.EBS_LOGON trigger is enabled. To avoid this error disable this trigger. - Upgrade completed successfully. Click on "upgrade results" to check full details. - The new spfile has been created in 12cR1 Oracle home. - Time zone version has been upgraded from version 17 to version 18 - sec_case_sensitive_logon parameter was removed. 3) Post Upgrade Steps: 3.1 - Install patch post Installation steps for all RDBMS patches (opatches ) Perform post Installation steps for all Installed patches prior to upgrade. If DATAPATCH is required to run then it can be run but only once. 
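One note on the SYSTEM.EBS_LOGON error mentioned during the DBUA run above: the workaround is simply a disable/enable pair around the upgrade. A minimal sketch, run as SYSDBA (re-enable only once the upgrade has completed successfully):

-- before starting DBUA
alter trigger system.ebs_logon disable;

-- after the upgrade has finished
alter trigger system.ebs_logon enable;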
3.2 - Start Listener from 12cR1 Home: Copy TNS_ADMIN directory from 11gR2 Home to 12cR1 home and modify all required files with 12c Oracle Home location. [oraebs@erpnode3 admin]$ ls -lrt total 12 -rw-r--r-- 1 oraebs dbaerp 205 May 11 2011 shrept.lst drwxr-xr-x 2 oraebs dbaerp 4096 Oct 27 21:25 samples drwxr-xr-x 2 oraebs dbaerp 4096 Oct 27 23:11 test_erpnode3 [oraebs@erpnode3 admin]$ cp -pr test_erpnode3/u01/ora_test/12.1.0/network/admin [oraebs@erpnode3 admin]$ pwd /u01/ora_test/11.2.0/network/admin [oraebs@erpnode3 admin]$ - Configured listener.ora and tnsnames.ora Network_files_12c.doc - Start listener from 12c Home [oraebs@erpnode3 test_erpnode3]$ pwd /u01/ora_test/12.1.0/network/admin/test_erpnode3 [oraebs@erpnode3 test_erpnode3 ]$ export TNS_ADMIN=/u01/ora_test/12.1.0/network/admin/test_erpnode3 [oraebs@erpnode3 test_erpnode3]$ lsnrctl start test LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 20-NOV-2014 08:55:35 Copyright (c) 1991, 2013, Oracle. All rights reserved. Starting /u01/ora_test/12.1.0/bin/tnslsnr: please wait... TNSLSNR for Linux: Version 12.1.0.1.0 - Production System parameter file is /u01/ora_test/12.1.0/network/admin/test_erpnode3/listener.ora Log messages written to /u01/ora_test/12.1.0/network/admin/test.log Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=erpnode3.oralabs.com)(PORT=1529))) Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=erpnode3.oralabs.com)(PORT=1529))) STATUS of the LISTENER ------------------------ Alias test Version TNSLSNR for Linux: Version 12.1.0.1.0 - Production Start Date 20-NOV-2014 08:55:36 Uptime 0 days 0 hr. 0 min. 0 sec Trace Level off Security ON: Local OS Authentication SNMP OFF Listener Parameter File /u01/ora_test/12.1.0/network/admin/test_erpnode3/listener.ora Listener Log File /u01/ora_test/12.1.0/network/admin/test.log Listening Endpoints Summary... (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=erpnode3.oralabs.com)(PORT=1529))) Services Summary... Service "test" has 1 instance(s). Instance "test", status UNKNOWN, has 1 handler(s) for this service... The command completed successfully [oraebs@erpnode3 test_erpnode3]$ 3.3 - Application Database accounts Expired& Locked After successful completion of upgrade we noticed that all database accounts was expired and locked Including APPS, APPLSYS and APPLSYSPUB except "SYS" and "SYSTEM". This is the default behavior of 12c DBUA and its mentioned in MOS tech note " 1516557.1 " in section known Issues. You should not encounter this Issue if you are manually upgrading your database using upgrade script. As per Oracle documentation and procedure we should change the Application Database account using FNDCPASS utility. As the APPS account is locked and expired it was not allowing to change the password using FNDCPASS utility. Solution: - As there was no alternate available then changed the password for APPS using alter command to the same old password. - Changed password again using FNCPASS utility for APPLSYS and APPLSYSPUB user. - Changed password for all Application Oracle users - Run autoconfig on application and database Tiers Tried login with APPS user: [oraebs@erpnode3 db_scripts]$ sqlplus SQL*Plus: Release 12.1.0.1.0 Production on Thu Nov 20 00:23:07 2014 Copyright (c) 1982, 2013, Oracle. All rights reserved. 
Enter user-name: apps Enter password: ERROR: ORA-28000: the account is locked Enter user-name: /as sysdba Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options SQL alteruser apps account unlock; User altered. SQL Check the account Status: SQL select username,account_status, lock_date , expiry_date from dba_users where username='APPS'; USERNAME ACCOUNT_ST LOCK_DATE EXPIRY_DA ---------- ---------- --------- --------- APPS EXPIRED 18-NOV-14 [applebs@erpnode3 sql]$ pwd /u01/appl_test/fs1/EBSapps/appl/ad/12.0.0/patch/115/sql [applebs@erpnode3 sql]$ FNDCPASS apps/APPS 0 Y system/manager APPLSYS APPS APP-FND-01564: ORACLE error 28001 in AFPCOA Cause: AFPCOA failed due to ORA-28001: the password has expired . The SQL statement being executed at the time of the error was: and was executed from the file . [applebs@erpnode3 sql]$ - Changed password manually for APPS user: [applebs@erpnode3 sql]$ sqlplus SQL*Plus: Release 10.1.0.5.0 - Production on Thu Nov 20 09:18:26 2014 Copyright (c) 1982, 2005, Oracle. All rights reserved. Enter user-name: apps Enter password: ERROR: ORA-28001: the password has expired Changing password for apps New password: Retype new password: Password changed Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options SQL - In Weblogic admin server APPS password is configured, hence changed password again with FNDCAPSS utility and updated the same password in weblogic admin server datasource. [applebs@erpnode3 ~]$ FNDCPASS apps/APPS 0 Y system/manager SYSTEM APPLSYS APPS Log filename : L462324.log Report filename : O462324.out [applebs@erpnode3 ~]$ In R12.2 the process for changing the APPS password is same as prior releases with some additional steps in Weblogic Admin Server. Steps: Start Admin server (Do not start any other services/managed servers) Log in to Weblogic Server Admin Console Click Lock & Edit in Change Center In the Domain expand Services and select Data Sources On "Summary of JDBC Data Sources" page select EBSDataSource On the "Settings for EBSDataSource" select the Connection Pool tab Enter the new password Click on Save Click on Activate Changes in Change Center stop admin server Run autoconfig on Database Tier and Application Tier All other Oracle Application schemas accounts are expired and locked. We need to change password for APPLSYSPUB and other schema (like AR, GL) using FNDCPASS alloracle . - Change password for APPLYSPUB using alter command. - Change password for all Oracle User using FNDCPASS utility. [applebs@erpnode3 appl_test]$ FNDCPASS apps/apps 0 Y system/manager ALLORACLE oracle Log filename : L466353.log Report filename : O466353.out [applebs@erpnode3 appl_test]$ - After changing passwords all Application database accounts will now have new password but still these accounts are locked. All these users should be unlocked using alter command. - Use below SQL command to generate script for altering users which are locked. - script attached "db_unlock_users.sql" SQL select 'alter user '||username||' account unlock;' from dba_users where ACCOUNT_STATUS='LOCKED' ; - As per the procedure we should run autoconfig after changing the password using FNDCPASS. But we will run autoconfig later while enabling autoconfig on 12c Home dbTier. 
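Incidentally, instead of spooling and re-running the generated ALTER USER statements, the same unlock can be done in a single anonymous block. A minimal sketch, run as SYSDBA; the WHERE clause below is deliberately broad, so review the account list before running anything like this on a real system (ideally restrict it to the EBS schemas):

begin
    for r in ( select username
               from   dba_users
               where  account_status like '%LOCKED%'
               and    username not in ('SYS', 'SYSTEM') )
    loop
        -- username comes from dba_users, not from user input
        execute immediate 'alter user ' || r.username || ' account unlock';
    end loop;
end;
/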
After successful completion of autconfig on dbTier than run autoconfig on Application Tier. - Script to unlock Application database accounts: db_account_unlock.sql 3.4 - Run adgrants.sql Run "adgrants.sql" script from $APPL_TOP/admin [oraebs@erpnode3 db_scripts]$ ls adgrants.sql preupgrd.sql utluppkg.sql [oraebs@erpnode3 db_scripts]$ mv adgrants.sql adgrants.sql.12.2.0 [oraebs@erpnode3 db_scripts]$ cp /u01/appl_test/fs1/EBSapps/appl/admin/adgrants.sql . [oraebs@erpnode3 db_scripts]$ ls -lrt total 596 -rwxr-xr-x 1 oraebs dbaerp 99663 Nov 4 21:11 adgrants.sql.12.2.0 -rw-r--r-- 1 oraebs dbaerp 5231 Nov 14 23:48 preupgrd.sql -rw-r--r-- 1 oraebs dbaerp 381893 Nov 14 23:53 utluppkg.sql -rwxr-xr-x 1 oraebs dbaerp 99663 Nov 20 00:14 adgrants.sql [oraebs@erpnode3 db_scripts]$ SQL @adgrants.sql APPS Connected. --------------------------------------------------- --- adgrants.sql started at 2014-11-20 00:16:02 --- Creating PL/SQL profiler objects. PL/SQL procedure successfully completed. End of PURGE DBA_RECYCLEBIN. Commit complete. Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options [oraebs@erpnode3 db_scripts]$ 3.5 - Grant create procedure privilege on CTXSYS Run script "adctxprv.sql" from $AD_TOP/sql by connecting as SYSDBA user [oraebs@erpnode3 db_scripts]$ cp /u01/appl_test/fs1/EBSapps/appl/ad/12.0.0/patch/115/sql/adctxprv.sql . [oraebs@erpnode3 db_scripts]$ sqlplus SQL*Plus: Release 12.1.0.1.0 Production on Thu Nov 20 09:24:00 2014 Copyright (c) 1982, 2013, Oracle. All rights reserved. Enter user-name: apps Enter password: Last Successful login time: Thu Nov 20 2014 09:21:59 +03:00 Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options SQL @adctxprv.sql manager CTXSYS Connecting to SYSTEM Connected. PL/SQL procedure successfully completed. Commit complete. Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options [oraebs@erpnode3 db_scripts]$ 3.6 - Compile Invalid Objects [oraebs@erpnode3 db_scripts]$ sqlplus / as sysdba SQL*Plus: Release 12.1.0.1.0 Production on Thu Nov 20 09:25:57 2014 Copyright (c) 1982, 2013, Oracle. All rights reserved. Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options SQL @$ORACLE_HOME/rdbms/admin/utlrp.sql 3.7 - Set CTXSYS parameter Execute following procedure with "SYSDBA" account SQL exec ctxsys.ctx_adm.set_parameter('file_access_role', 'public'); PL/SQL procedure successfully completed. SQL 3.8 - Validate Workflow ruleset On admin server node, run script "$FND_TOP/patch/115/sql/wfaqupfix.sql" as below: [applebs@erpnode3 sql]$ pwd /u01/appl_test/fs1/EBSapps/appl/fnd/12.0.0/patch/115/sql [applebs@erpnode3 sql]$ ls -lrt wfaqupfix.sql -rwxr-xr-x 1 applebs dbaerp 4942 Nov 6 18:39 wfaqupfix.sql [applebs@erpnode3 sql]$ sqlplus SQL*Plus: Release 10.1.0.5.0 - Production on Thu Nov 20 09:32:35 2014 Copyright (c) 1982, 2005, Oracle. All rights reserved. 
Enter user-name: apps Enter password: Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options SQL @wfaqupfix.sql APPLSYS APPS PL/SQL procedure successfully completed. Commit complete. Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options [applebs@erpnode3 sql]$ 3.9 - Create context file and run autoconfig on dbTier In 12c Oracle Home "appsutil" directory doesn’t exists, We have to copy the appsutil directory from 11g Oracle Home to 12c Oracle Home. § Run script to generate latest appsutil.zip § Copy appsutil directory from 11g Home to 12c Home § copy and unzip appsutil.zip in 12c Oracle Home § Generate new contextfile using "adbldxml.pl" script § Execute adconfig.sh script [applebs@erpnode3 bin]$ ls -l admkappsutil.pl -rwxr-xr-x 1 applebs dbaerp 7232 Nov 24 2012 admkappsutil.pl [applebs@erpnode3 bin]$ perl admkappsutil.pl Starting the generation of appsutil.zip Log file located at /u01/appl_test/fs1/inst/apps/test_erpnode3/admin/log/MakeAppsUtil_11200942.log output located at /u01/appl_test/fs1/inst/apps/test_erpnode3/admin/out/appsutil.zip MakeAppsUtil completed successfully. [applebs@erpnode3 bin]$ Copy Existing appsutil directory from 11gR2 Home to 12cR1 Home. Run script "$AD_TOP/bin/admkapsutil.pl" to generate new appsutil.zip file, It will generate file in $INST_TOP/out directory. Copy this file to 12c Oracle Home and unzip it. [oraebs@erpnode3 12.1.0]$ cp -pr /u01/ora_test/11.2.0/appsutil . [oraebs@erpnode3 12.1.0]$ cp /u01/appl_test/fs1/inst/apps/test_erpnode3/admin/out/appsutil.zip . [oraebs@erpnode3 12.1.0]$ ls -lrt appsutil.zip -rw-r--r-- 1 oraebs dbaerp 3538270 Nov 20 09:45 appsutil.zip [oraebs@erpnode3 12.1.0]$ - Unzip file "appsutil.zip" file from 12.1 Home [oraebs@erpnode3 12.1.0]$ unzip -o appsutil.zip Archive: appsutil.zip inflating: appsutil/template/adclobtmp.sql inflating: appsutil/template/afinit_db112.ora inflating: appsutil/template/adstrtdb.sql inflating: appsutil/template/adregtools.drv......................................... [oraebs@erpnode3 bin]$ perl adbldxml.pl Starting context file generation for db tier.. Using JVM from /u01/ora_test/12.1.0/appsutil/jre/bin/java to execute java programs.. APPS Password: The log file for this adbldxml session is located at: /u01/ora_test/12.1.0/appsutil/log/adbldxml_11200955.log Could not Connect to the Database with the above parameters, Please answer the Questions below Enter Hostname of Database server: erpnode3 Enter Port of Database server: 1529 Enter SID of Database server: test Enter Database Service Name: test Enter the value for Display Variable: 0.0 The context file has been created at: /u01/ora_test/12.1.0/appsutil/test_erpnode3.xml [oraebs@erpnode3 bin]$ - Execute autoconfig on database Tier: [oraebs@erpnode3 appsutil]$ pwd /u01/ora_test/12.1.0/appsutil [oraebs@erpnode3 appsutil]$ cd bin/ [oraebs@erpnode3 bin]$ sh adconfig.sh Enter the full path to the Context file: /u01/ora_test/12.1.0/appsutil/test_erpnode3.xml Enter the APPS user password: The log file for this session is located at: /u01/ora_test/12.1.0/appsutil/log/test_erpnode3/11201009/adconfig.log AutoConfig is configuring the Database environment... AutoConfig will consider the custom templates if present. 
Using ORACLE_HOME location : /u01/ora_test/12.1.0 Classpath : :/u01/ora_test/12.1.0/jdbc/lib/ojdbc6.jar:/u01/ora_test/12.1.0/appsutil/java/xmlparserv2.jar:/u01/ora_test/12.1.0/appsutil/java:/u01/ora_test/12.1.0/jlib/netcfg.jar:/u01/ora_test/12.1.0/jlib/ldapjclnt12.jar Using Context file : /u01/ora_test/12.1.0/appsutil/test_erpnode3.xml Context Value Management will now update the Context file Updating Context file...COMPLETED Attempting upload of Context file and templates to database...COMPLETED Updating rdbms version in Context file to db121 Updating rdbms type in Context file to 64 bits Configuring templates from ORACLE_HOME ... AutoConfig completed successfully. [oraebs@erpnode3 bin]$ - Run autoconfig on appsTier Run &Patch File system: Execute autconfig on run filesystem by sourcing environmental variables for RUN file system and on patch file system after sourcing environmental variable on patch file system. Ignore errors encountered while running autoconfig on patch file system. [applebs@erpnode3 scripts]$ adautocfg.sh Enter the APPS user password: The log file for this session is located at: /u01/appl_test/fs1/inst/apps/test_erpnode3/admin/log/11201225/adconfig.log AutoConfig is configuring the Applications environment... AutoConfig will consider the custom templates if present. Using CONFIG_HOME location : /u01/appl_test/fs1/inst/apps/test_erpnode3 Classpath : /u01/appl_test/fs1/FMW_Home/Oracle_EBS-app1/shared-libs/ebsappsborg/WEBINF/lib/ebsAppsborgManifest.jar:/u01/appl_test/fs1/EBSapps/comn/java/classes Using Context file : /u01/appl_test/fs1/inst/apps/test_erpnode3/appl/admin/test_erpnode3.xml Context Value Management will now update the Context file Updating Context file...COMPLETED Attempting upload of Context file and templates to database...COMPLETED Configuring templates from all of the product tops... Configuring AD_TOP........COMPLETED Configuring FND_TOP.......COMPLETED Configuring ICX_TOP.......COMPLETED Configuring MSC_TOP.......COMPLETED Configuring IEO_TOP.......COMPLETED ................................... ................................... Configuring GMF_TOP.......COMPLETED Configuring PON_TOP.......COMPLETED Configuring FTE_TOP.......COMPLETED Configuring ONT_TOP.......COMPLETED Configuring AR_TOP........COMPLETED Configuring AHL_TOP.......COMPLETED Configuring IES_TOP.......COMPLETED Configuring OZF_TOP.......COMPLETED Configuring CSD_TOP.......COMPLETED Configuring IGC_TOP.......COMPLETED AutoConfig completed successfully. [applebs@erpnode3 scripts]$ 3.10 - Run script "adstats.sql" to gather SYS stats Run script "$APPL_TOP/admin/adstats.sql" as SYSDBA user to gather statistic for sys user. Enable restricted mode before running the script and should be disabled once finished. [oraebs@erpnode3 db_scripts]$ cd /u01/appl_test/fs1/EBSapps/appl/admin [oraebs@erpnode3 admin]$ ls -l adstats.sql -rwxr-xr-x 1 applebs dbaerp 2752 Nov 24 2012 adstats.sql [oraebs@erpnode3 admin]$ sqlplus / as sysdba SQL*Plus: Release 12.1.0.1.0 Production on Thu Nov 20 13:53:08 2014 Copyright (c) 1982, 2013, Oracle. All rights reserved. Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options SQL alter system enable restricted session; System altered. SQL @adstats.sql Connected. ------------------------------------------------- --- adstats.sql started at 2014-11-20 13:53:50 --- Checking for the DB version and collecting statistics ... 
PL/SQL procedure successfully completed. ------------------------------------------------ --- adstats.sql ended at 2014-11-20 14:30:35 --- Commit complete. Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options [oraebs@erpnode3 admin]$ [oraebs@erpnode3 ~]$ sqlplus / as sysdba SQL*Plus: Release 12.1.0.1.0 Production on Thu Nov 20 14:33:18 2014 Copyright (c) 1982, 2013, Oracle. All rights reserved. Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options SQL alter system disable restricted session; System altered. 3.11 - Create new MGDSYS schema MGDSYS schema was dropped before starting the upgrade. If you're planning to use MGDSYS schema then you should create it again by running script "@?/rdbms/admin/catmgd.sql" SQL @?/rdbms/admin/catmgd.sql .. Creating MGDSYS schema User created. .. Granting permissions to MGDSYS Grant succeeded. Grant succeeded. 3.12 - Apply post upgrade WMS patches If upgraded prior version to 12c we have to apply patch "19007053". As we are upgraded from version 11g then we should apply this patch. [applebs@erpnode3 admin]$ adop phase=apply patches=19007053 apply_mode=downtime 3.13 - Recreate grants and synonyms Using adadmin from Admin server node recreate grants and synonyms for APPS schema: AD Administration Main Menu -------------------------------------------------- 1. Generate Applications Files menu 2. Maintain Applications Files menu 3. Compile/Reload Applications Database Entities menu 4. Maintain Applications Database Entities menu 5. Exit AD Administration Enter your choice [5] : 4 Maintain Applications Database Entities --------------------------------------------------- 1. Validate APPS schema 2. Re-create grants and synonyms for APPS schema 3. Maintain multi-lingual tables 4. Check DUAL table 5. Return to Main Menu - Verify adadmin logfile for Errors. 3.14 - Start Application services Start Application services using “adstrtal.sh" script and make sure all services started successfully without any errors. - Login to WLS Admin server and check both Admin and managed servers are working normally. - Login to Oracle Application and verify all functionalities are working normally. 3.15 - Run concurrent Request "Synchronize workflow views" Run concurrent request by specifying the following parameters and make sure request completed successfully without any Issues. 3.16 - Verify the upgraded version from OAM Login to OAM as sysadmin user: Navigate to system administrator responsibility sitemap Monitor database 4. Issues: 4.1 - Unable to proceed with 12c runInstaller with error "[INS-10102] Installer initialization failed" Refer: http://www.toadworld.com/platforms/oracle/b/weblog/archive/2014/11/14/ins-21003-installer-has-detected-that-an-invalid-inventory-pointer-location-file-was-specified-oracle-12c.aspx 4.2 - DBUA not listing the database This is the fresh Installation of EBS R12.2. File "/etc/oratab" does not contain entry for SID. Hence DBUA was unable to detect SID from 11g Oracle home. Manually add entry in "/etc/oratab" file and rerun dbua. 
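For reference, an /etc/oratab entry is just SID:ORACLE_HOME:startup-flag. For the environment described in this article (paths as shown earlier; the trailing N simply means the database is not started automatically at boot), it would look something like:

test:/u01/ora_test/11.2.0:N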
4.3 - Invalid specification of system parameter "LOCAL_LISTENER"

"ORA-00119: Invalid specification for system parameter LOCAL_LISTENER"
"ORA-00132: syntax error or unresolved network name test_LOCAL"

This issue occurred due to an incorrect LOCAL_LISTENER setting in the spfile created by DBUA. It can be avoided by setting a valid value for LOCAL_LISTENER in the spfile, or by removing the parameter from the spfile altogether.

4.4 - Application Database accounts locked after upgrade

After completion of the upgrade we noticed that all database accounts were expired and locked, including APPS, APPLSYS and APPLSYSPUB, except SYS and SYSTEM. This is a known issue: if the database is upgraded using the 12c DBUA, it will expire all database accounts except SYS and SYSTEM. Please refer to section "3.3 - Application Database accounts Expired & Locked".

5) References:
1. Interoperability Notes Oracle EBS 12.2 with Oracle Database 12c Release 1 (Doc ID 1926201.1)
2. Oracle E-Business Suite Release 12.2: Consolidated List of Patches and Technology Bug Fixes (Doc ID 1594274.1)
3. Complete Checklist for Upgrading to Oracle Database 12c Release 1 using DBUA (Doc ID 1516557.1)
4. FNDCPASS Troubleshooting Guide For Login and Changing Applications Passwords (Doc ID 1306938.1)
↧
Blog Post: SQL Tuning
In the first article of my contention series we discussed how contention can be regarded as a literal “bottleneck”, limiting the amount of work that the database can sustain - Figure 1 illustrates the concept. By reducing contention, we essentially widen the funnel so that more of the application demand can be met.

Figure 1 Contention can be thought of as a literal bottleneck

Tuning contention is appealing from a DBA perspective, since you don’t have to change the application code or schema. But it’s often more effective to reduce the demand on the database by making the application work smarter. In general, this involves tuning SQL and PL/SQL. SQL tuning is a big topic; many books have been written on it, including mine. I can’t reproduce a complete guide to SQL tuning here, but I can get you started on finding SQL to tune, and on how to use Toad DBA Suite tools to tune them.

Finding problem SQL

There are a number of ways to identify SQL that might need tuning, but since time immemorial (OK, since Oracle 7!) the easiest way has been to examine the cached SQL information held in the V$SQL view. This view contains information about the SQL statements that are stored in the shared pool. Providing SQL statements are sharable – usually meaning that they use bind variables appropriately – most of the high resource SQL will be represented in this view.

Figure 2 V$SQL and related tables

Originally V$SQL contained only limited information about SQL statement execution – logical IO and disk IO counts. Today, V$SQL includes CPU time, elapsed time and time spent in other high level categories such as Application time (which includes lock wait time). While SQL statements which consume the most logical IO or have the highest elapsed times are often good targets for tuning, it’s often only examination of individual steps that will pinpoint the best tuning opportunity. In Oracle Database 10g, we can use cached query plan statistics to pinpoint individual steps within an SQL execution that might warrant attention. The view V$SQL_PLAN shows the execution plan for all cached SQL statements, while V$SQL_PLAN_STATISTICS shows execution counts, IO and rows processed by each step in the plan (you may have to up your STATISTICS_LEVEL from TYPICAL to ALL to get some of this new information). Figure 2 shows the essential columns in these new views.

Mining V$SQL, we can identify SQL statements that have high elapsed times, CPU or IO requirements. Using the newer plan tables, we can find SQLs that are performing actions that might be undesirable, such as table scans of large tables. We can also see the “real” execution plan and even get information about which steps in the plan are most troublesome. The easiest way to mine this information is through Spotlight on Oracle.

Figure 3 Finding tuneable SQLs in Spotlight

The Spotlight top SQL facility allows us to filter SQLs for analysis in a number of ways. We can set a minimum filter condition, such as only those SQLs that have high elapsed times, and we can sort by a wide range of metrics.

Figure 4 Viewing the SQL details

Figure 4 shows the detail information that is available for a specific SQL. Not only can we see the execution timings overall, but we can see them on a per-execution basis, and we can see a breakdown of timing in high level categories. In this case, the SQL spends about 70% of its elapsed time performing IO.
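If you don’t have Spotlight to hand, a rough equivalent of this “top SQL” mining can be done straight against V$SQL. A minimal sketch: the columns are standard V$SQL columns, but the ordering and the top-20 cut-off are arbitrary choices of mine, not anything Spotlight itself uses:

select *
from (
       select sql_id,
              executions,
              round(elapsed_time / 1000000, 1) as elapsed_secs,
              round(cpu_time / 1000000, 1)     as cpu_secs,
              buffer_gets,
              disk_reads,
              substr(sql_text, 1, 60)          as sql_text
       from   v$sql
       where  executions > 0
       order by elapsed_time desc
     )
where rownum <= 20;

Returning to the Figure 4 example and its 70% IO time: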
An examination of the cached execution plan (Figure 5) shows us why: the SQL is performing a full table scan on the relatively massive LINE_ITEMS table. Figure 5 Viewing the cached SQL plan It’s worth noting elapsed times in cached execution plans are cumulative. So for instance, if I perform a full table scan and then sort the results, the sort step will include the elapsed time from the scan step. That’s why every elapsed time above the full table scan in Figure 5 is roughly the same; the time spent performing the scan dominates every other step. Tuning the SQL As I said earlier, whole books can and have been written about SQL tuning and it takes a long time to get up to speed. But if you’re a Toad DBA suite or Toad Xpert owner then most of the hard work is removed, since you can use SQL Optimizer to automatically tune your SQL. Clicking the SQL optimizer button (highlighted in Figure 4) launches SQL optimizer to tune the current SQL. I usually go straight for “Optimize and Index” to get indexing suggestions as well as options for optimizing the SQL with hints or rewrites. Figure 6 SQL Optimizer in action SQL Optimizer identifies several rewrites, one of which (#4) is a significant improvement over the original SQL, reducing elapsed time by 20%. It’s interesting to note however, that the improved SQL actually has a higher estimated plan cost – Oracle’s cost based optimizer is smart, but not infallible. Conclusion Getting started with SQL tuning involves mastering two fundamentals: finding SQL that warrants tuning, and working out how to make those SQLs work more efficiently. Spotlight and SQL Optimizer – within the Toad DBA Suite – make these two steps extremely easy even if you’re new to SQL tuning. Tuning SQL reduces the demands that the application makes on the database. Every downstream activity – contention, memory efficiency and IO load – will be improved as a result. For this reason, tuning SQL is an essential pre-requisite for an efficient and scalable database server.
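A practical footnote to the cached-plan discussion above: if you want to pull a cached plan without any GUI at all, DBMS_XPLAN can read it straight from the shared pool. A minimal sketch, where the sql_id is whatever your V$SQL query reported; the ALLSTATS LAST format only shows per-step row statistics if STATISTICS_LEVEL is ALL or the statement was run with the gather_plan_statistics hint:

select *
from   table(dbms_xplan.display_cursor('&sql_id', null, 'ALLSTATS LAST'));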
↧
↧
Blog Post: Reducing Oracle Network Contention
Unclogging those network tubes

It’s usual for your Oracle client – be it an application server, a Windows client or even humble old SQL*Plus – to be running on a computer other than the server hosting the database. Obviously, in those circumstances, Oracle must transfer data between the client and the database server. Occasionally, this network traffic can in itself become a bottleneck.

When a US politician described the internet as “a series of tubes” he was ridiculed for his lack of technical sophistication. However, I think a “series of tubes” is not a bad analogy for a layperson, since it at least intuitively explains the concepts of bandwidth and latency. Messages between two locations on a network are passed as packets that travel across the physical network layers – such as fibre optic cable – at a significant fraction of the speed of light. However, packets must pass through routers that direct the packets to their ultimate destinations. The time spent in routers is usually far greater than the time spent travelling down the “pipes”. Also, if more packets hit the router than can be immediately processed, then packets will be queued. In extreme cases, packets might be lost – and then resent – if the number of packets queued exceeds the buffer capacity of a router. Consequently, if the number of packets sent across the network is excessive then routers will begin to queue messages and network latency – the time taken to send a packet from source to destination – will increase. We might call that situation network contention. The maximum number of packets that can be sent from source to destination in any unit of time is referred to as the bandwidth of the network.

From all of the above, it’s fairly clear that to reduce network contention we need to either improve the bandwidth of the network or reduce the number of packets sent across the network. Increasing network bandwidth is a complex topic, but reducing packet transmissions in Oracle is actually fairly straightforward. When an Oracle client wants to retrieve rows from a result set, the code at some level will look something like this:

UNTIL all rows retrieved
    FETCH next ROW
DONE

If each of the FETCH operations results in a network operation, then there will be two network packets for each row retrieved: one to request the next row, one with the row in it. However, network packets are typically at least 2K in size, and if each row were 100 bytes long, then we could fit 200 rows in each packet. So we would be better off doing something like this:

UNTIL all rows retrieved
    FETCH 100 ROWS
DONE

Oracle’s array interface allows you to do exactly this. Using the array interface reduces the number of network packets and so will reduce network contention. Just as importantly, each network round trip involves some network latency, so we improve our response time as well. Figure 1 shows how response time improves dramatically as we increase the array size. Eventually, however, increasing the array size stops helping, since the network packets are already full. As well as using the array interface to improve bulk selects, it’s also very important in optimizing bulk inserts.

Figure 1 Relationship between array size and elapsed time

So exactly how do we exploit this important feature? Unfortunately, each programming environment is different in this respect. For a lot of tools, the array size is set with a configuration parameter. For instance, in SQL*Plus the ARRAYSIZE setting transparently controls the array size.
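For example, in SQL*Plus you can bump the fetch size for a session and watch the round trips drop. A trivial sketch, assuming a placeholder table name and that autotrace is available to your account:

set autotrace traceonly statistics
set arraysize 15
-- the table name is only a placeholder for any reasonably large table
select * from line_items;
-- note the "SQL*Net roundtrips to/from client" statistic, then repeat with a bigger array size
set arraysize 200
select * from line_items;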
In TOAD, it’s set in the Oracle | General options (Figure 2).

Figure 2 Setting the array size in TOAD

In many programming languages you often have to modify your code to exploit the array interface. For instance, in PL/SQL the BULK COLLECT keyword can be used to perform array selects (Figure 3), while the FORALL keyword can be used to perform array inserts; a minimal sketch appears at the end of this post. Quest PL/SQL guru Steven Feuerstein explains how to do this in this Toad World database tip. Although PL/SQL does not perform network round trips (it runs inside the database server), using the array interface in PL/SQL still reduces overhead quite dramatically.

Figure 3 Using array fetch in PL/SQL

Most versions of the Oracle Java drivers will automatically use the array interface, but you can always set the fetch size explicitly using the setFetchSize method, as shown in Figure 4. Doing array inserts is a little trickier, but can be done using the addBatch() method.

Figure 4 Setting the array size in Java

Virtually every programming language – C#, PHP, Perl, Ruby, etc. – supports the array interface, but it’s often up to the programmer to make it happen. Table 1 provides a starting point for some popular languages.

Table 1: How to do array fetch in various programming languages
- Oracle Data Provider for .NET (C#, VB.NET): set the FetchSize property of the OracleCommand or OracleDataReader object.
- Java: use the setFetchSize method of the Statement object.
- Perl (DBI): set the RowCacheSize property of the database handle (e.g. $db->{RowCacheSize} = 100;).
- OCI: use OCIDefineByPos() to define an array as the destination for output and OCIStmtFetch() to supply the array size.
- PHP (OCI8 interface): set the oci8.default_prefetch property.
- Python (cx_Oracle interface): set the arraysize property of the cursor object.

Spotlight will raise an alarm should it detect that a significant proportion of network traffic could be eliminated by the use of the array interface. Drilldowns from the alarm will identify the sessions and SQLs associated with poor array interface usage.

Figure 5 Spotlight's array interface alarm

Using the array interface is one of the simplest ways to optimize a program that retrieves or inserts large numbers of rows. As well as optimizing individual SQLs and programs, it also reduces overall network packet transmissions, thereby reducing the chance of network contention or – as some might put it – “clogging up the tubes”.
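Here is the minimal PL/SQL sketch mentioned above (the sales table and the 100-row batch size are illustrative assumptions, not recommendations):

DECLARE
   -- Fetch rows in batches of 100 rather than one at a time
   CURSOR sales_csr IS
      SELECT * FROM sales;                 -- "sales" is a hypothetical table
   TYPE sales_tab_t IS TABLE OF sales_csr%ROWTYPE;
   sales_rows sales_tab_t;
BEGIN
   OPEN sales_csr;
   LOOP
      FETCH sales_csr BULK COLLECT INTO sales_rows LIMIT 100;
      FOR i IN 1 .. sales_rows.COUNT LOOP
         NULL;                             -- process each row here
      END LOOP;
      EXIT WHEN sales_csr%NOTFOUND;        -- checked after processing the partial last batch
   END LOOP;
   CLOSE sales_csr;
END;
/

The same batching idea applies to inserts: collect the rows into a PL/SQL collection and insert them with FORALL rather than row by row.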
↧
Blog Post: Oracle 10G RAC Scalability – Lessons Learned
Note: This paper is based on a collaborative research effort between Quest Software and Dell Computers, with all research having been done by the four authors whose bios are listed at the end of this paper. This presentation is being offered on behalf of both companies and all the authors involved. An extra special thanks to Dell for allocating a million dollars' worth of equipment to make such testing possible.

When businesses start small and have a low capital budget to invest, the configuration of their computer systems is minimal, typically consisting of a single database server with a storage array to support the database implementation. As the business grows, expansion follows, and the current configuration (single node plus storage system) is eventually unable to handle the increased load. It is typical at this point either to upgrade the current hardware by adding more CPUs and other resources, or to add additional servers that work together as a cohesive unit, or cluster. Simply increasing resources is like placing a temporary bandage on the system: it addresses the current increase in demand, but only defers the issue, because the workload will invariably increase again. Instead of adding more memory or CPU to the existing configuration – which could be referred to as vertical scaling – additional servers should be added to provide load balancing, workload distribution and availability. This functionality is achieved by scaling the boxes horizontally, or in a linear fashion, and by configuring these systems to work as a single cohesive unit or cluster.

Figure 1.0 Vertical and Horizontal / Linear Scalability Representation

Figure 1.0 illustrates the advantages of horizontal scalability over vertical scalability. The growth potential of a vertically scalable system is limited and, as explained earlier, reaches a point where the addition of resources does not provide proportionally improved results. From a systems perspective, hardware clustering addresses this. (Note: a cluster is a group of independent hardware systems, or nodes, that are interconnected to provide a single computing resource.) Linear (horizontal) scalability brought about by clustered hardware solutions also distributes the user workload among many servers (nodes). Because clusters offer both horizontal (linear) and vertical scalability, the cluster model provides investment protection. Horizontal scalability is the ability to add more nodes to the cluster to provide more capacity; these nodes may be relatively small and/or inexpensive commodity hardware, offering more economical upgrade options than enhancements to a single large system. Vertical scalability is the ability to upgrade individual nodes to higher specifications. Oracle Real Application Clusters (RAC) 10g, with its unique architecture, is the only database solution available today that provides true clustering in a shared database environment and delivers horizontal scalability.

REAL APPLICATION CLUSTERS

RAC is a configuration of two or more instances, running on two or more nodes, clustered together using Oracle Clusterware. Using its unique Cache Fusion technology, RAC is able to share resources and balance workload, providing optimal scalability for today's high-end computing environments.
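As a minimal illustration of what this looks like from inside the database (a sketch only; instance and host names will obviously differ per environment), querying GV$INSTANCE from any node lists every instance currently servicing the single shared database:

SELECT inst_id,
       instance_name,
       host_name,
       status
  FROM gv$instance
 ORDER BY inst_id;

On the ten-node cluster described later in this paper, a query like this would return ten rows, one per instance, all opened against the same set of shared datafiles.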
A RAC configuration consists of:
- Many instances of Oracle running on many nodes
- Many instances sharing a single physical database
- All instances having common data and control files
- Each instance containing individual log files and undo segments
- All instances being able to simultaneously execute transactions against the single physical database
- Cache synchronization of user requests across the various instances using the cluster interconnect

COMPONENTS OF A RAC CLUSTER

Figure 2.0 Cluster components in a 10g RAC cluster

ORACLE CLUSTERWARE
Oracle Clusterware is three tiered, comprising the Cluster Synchronization Services (CSS), the Event Manager (EVM) and the Cluster Ready Services (CRS), and provides a unified, integrated solution bringing maximum scalability to the RAC environment.

INTERCONNECT
The interconnect is a dedicated private network between the various nodes in the cluster. The RAC architecture uses the cluster interconnect for instance-to-instance block transfers, providing cache coherency. Ideally, interconnects are Gigabit Ethernet adapters configured to transfer packets of the maximum size supported by the operating system. The suggested protocol varies by operating system; on Linux clusters the recommended protocol is UDP.

VIP
Traditionally, users and applications have connected to the RAC cluster and database over a public network interface, typically using TCP/IP. In a RAC environment, when a node or instance fails, an application unaware of the failure will still attempt to connect, and the time taken for TCP/IP to acknowledge the failure can exceed ten minutes, leaving users facing an unresponsive system. The VIP is a virtual address over the public interface. When an application or user connects using the VIP and a node fails, Oracle Clusterware – based on an event received from the EVM – transfers the VIP address to a surviving node. When a new connection is then attempted by the application, there are two possible outcomes, depending on which Oracle 10g database features are implemented. If the application uses Fast Application Notification (FAN) calls, Oracle Notification Services (ONS) informs the ONS running on the client machines, and the application, using the Oracle-provided API, can receive the notification and connect to one of the other available instances in the cluster. Such proactive notification avoids connections to the failed node. If instead the application attempts to connect using the VIP address of the failed node, an immediate failure is returned to the application through an acknowledgement, because the connection is refused due to a mismatch in the hardware address.

SHARED STORAGE
Another important component of a RAC cluster is the shared storage that all participating instances access. The shared storage contains the datafiles, control files, redo logs and undo files. Oracle Database 10g supports three different methods for storing files on shared storage.

Raw devices
A raw device partition is a contiguous region of a disk accessed through a UNIX or Linux character-device interface. This interface provides raw access to the underlying device, arranging for direct I/O between a process and the logical disk. Therefore, a write command issued by a process moves the data directly to the device.
Oracle Cluster File System (OCFS)
OCFS is a clustered file system developed by Oracle Corporation to provide easy datafile management while delivering performance characteristics similar to raw devices. Oracle's initial release, OCFS 1.0, only supports database files on devices formatted with OCFS. The latest release, OCFS 2.0, is a more generic file system supporting both Oracle and non-Oracle files. OCFS supports both Linux and Windows operating systems.

Automatic Storage Management (ASM)
ASM is a new storage management solution introduced in Oracle Database 10g. It bridges the gap in today's storage management solutions and addresses the I/O tuning burden placed on administrators. ASM integrates the file system and volume manager and, using the OMF architecture, distributes the I/O load across all available resources, optimizing performance and throughput using the SAME (stripe and mirror everything) methodology.

TESTING FOR SCALABILITY
The primary benefits that RAC systems provide, apart from improved performance, are availability and scalability. Availability, because if one of the nodes or instances in the cluster were to fail, the remaining instances would continue to provide access to the physical database. Scalability, because as the user workload or work pattern increases, users can access the database from any of the available instances that have resources to spare; scalability also provides the option to add nodes as the user base grows. When organizations move to a RAC environment, it is in their best interests to perform independent performance tests to determine the capacity of the configured cluster. Such tests help determine at what stage in the life cycle of the organization and its application the cluster will require additional instances to accommodate a higher workload. The purpose of these tests is to provide a similar baseline to readers regarding the scalability of the Dell servers using ASM.

TEST ENVIRONMENT TOOLS

BENCHMARK FACTORY
Benchmark Factory (BMF) for Databases provides a simple yet robust GUI (Figure 3.0) for creating, managing and scheduling industry-standard database benchmarks and real-world workload simulations, so as to determine accurate production database hardware and software configurations for optimal effectiveness, efficiency and scalability. Using BMF for Databases, DBAs can more easily address two of the most challenging tasks they face: what hardware architecture and platform to deploy, and what performance-related SLAs (Service Level Agreements) they can agree to.

Figure 3.0: Benchmark Factory for Databases (BMF)

While BMF for Databases offers numerous industry-standard benchmarks, the tests performed for this article were similar to the TPC-C (shown above). The TPC-C-like benchmark measures on-line transaction processing (OLTP) workloads. It combines read-only and update-intensive transactions, simulating the activities found in complex OLTP enterprise environments. The benchmark tests were set up to simulate loads from 100 to 5,000 concurrent users, in increments of 100, against a 10 GB database (which BMF for Databases creates). The idea was to ascertain two critical data points: how many concurrent users each RAC node can sustain, and whether the RAC cluster scales both predictably and linearly as additional nodes and users are added.

SPOTLIGHT ON RAC
Spotlight on RAC (SoRAC) is a new, innovative database monitoring and diagnostic tool for RAC.
It extends the proven architecture and intuitive GUI of Quest Software's Spotlight on Oracle to RAC environments. Spotlight on RAC is designed to provide a comprehensive yet comprehensible overview of numerous RAC internals, visualized by a world-class, dashboard-like display that makes clustered database monitoring and diagnostics a snap. With its simple traffic-light colour scheme (where green is good and red is bad), plus its point-and-click drill-down design, DBAs can easily monitor their clusters to detect, diagnose and correct potential problems or hotspots. SoRAC (Figure 4.0) even offers alarms with automatic prioritization and weighted escalation rankings to help less experienced RAC DBAs focus their attention on the most critical or problematic issues. SoRAC is shown below monitoring over 2,300 users on a 10-node cluster – with all being well.

Figure 4.0: Spotlight on RAC (SoRAC)

Note that SoRAC requires only a Windows client install to monitor all the nodes in the cluster. It requires no server-side agents and no data repository. It is truly a light-footprint tool that is simple to install, launch and start using.

The hardware configuration illustrated in Figure 5.0 reflects the primary configuration and hardware layout used for this benchmark, and the details below summarize the database setup.

Hardware and Software Setup: Database Configuration
Database version: Oracle Database 10g R1 (10.1.0.4) Enterprise Edition
ASM diskgroups: SYSTEMDG (50 GB), DATADG (50 GB), INDEXDG (50 GB), REDO01DG (20 GB), REDO02DG (20 GB); all diskgroups were created using the external redundancy option of ASM
Tablespaces: QUEST_DATA in the DATADG diskgroup, 40 GB, using the OMF feature; QUEST_INDEX in the INDEXDG diskgroup, 10 GB, using the OMF feature; all other database tablespaces were created in the SYSTEMDG diskgroup; redo log files were created in the REDO01DG and REDO02DG diskgroups

Figure 5.0 10-node physical hardware layout

TESTING METHODOLOGY
Methodology is of critical importance for any reliable benchmarking exercise – especially one of a complex and repetitive nature. A methodology allows comparison of current activity with previous activities, while recording any changes to the baseline criteria. RAC testing is no different: a methodology to identify performance candidates, tune parameters or settings, run the tests and then record the results is critical. And because of its highly complex multi-node architecture, RAC benchmarking should follow an iterative testing process that proceeds as follows:

1. For a single node and instance
   a. Establish a fundamental baseline
      - Install the operating system and Oracle database (keeping all normal installation defaults)
      - Create and populate the test database schema
      - Shut down and start up the database
      - Run a simple benchmark (e.g. TPC-C for 200 users) to establish a baseline for default operating system and database settings
   b. Optimize the basic operating system
      - Manually optimize typical operating system settings
      - Shut down and start up the database
      - Run a simple benchmark (e.g. TPC-C for 200 users) to establish a new baseline for basic operating system improvements
      - Repeat the prior three steps until a performance balance results
   c. Optimize the basic non-RAC database
      - Manually optimize typical database "spfile" parameters
      - Shut down and start up the database
      - Run a simple benchmark (e.g. TPC-C for 200 users) to establish a new baseline for basic Oracle database improvements
      - Repeat the prior three steps until a performance balance results
   d. Ascertain the reasonable per-node load
      - Manually optimize scalability-related database "spfile" parameters
      - Shut down and start up the database
      - Run an increasing user-load benchmark (e.g. TPC-C for 100 to 800 users, incrementing by 100) to find the "sweet spot" of how many concurrent users a node can reasonably support
      - Monitor the benchmark run via the vmstat command, looking for the point where excessive paging and swapping begins – and where the CPU idle time consistently approaches zero
      - Record the "sweet spot" number of concurrent users – this represents an upper limit
      - Reduce the "sweet spot" number of concurrent users by some reasonable percentage to account for RAC architecture and inter/intra-node overheads (e.g. reduce by, say, 10%)
   e. Establish the baseline RAC benchmark
      - Shut down and start up the database
      - Create an increasing user-load benchmark based upon the node count and the "sweet spot" (e.g. TPC-C for 100 to node count * sweet spot users, incrementing by 100)
      - Run the baseline RAC benchmark
2. For the 2nd through Nth nodes and instances
   a. Duplicate the environment
      - Install the operating system
      - Duplicate all of the base node's operating system settings
   b. Add the node to the cluster
      - Perform the node registration tasks
      - Propagate the Oracle software to the new node
      - Update the database "spfile" parameters for the new node
      - Alter the database to add node-specific items (e.g. redo logs)
3. Run the baseline RAC benchmark
   - Update the baseline benchmark criteria to include user-load scenarios from the prior run's maximum up to the new maximum based upon node count * "sweet spot" of concurrent users, using the baseline benchmark's constant increment
   - Shut down and start up the database – adding the new instance
   - Run the baseline RAC benchmark
4. Plot the transactions-per-second graph showing this run versus all the prior baseline benchmark runs – the results should show a predictable and reliable scalability factor

As with any complex testing endeavour, the initial benchmarking setup and sub-optimization procedure is very time consuming. In fact, reviewing the steps above, nearly two thirds of the overall effort is expended in getting the single node and instance correctly set up, plus the baseline benchmark properly defined. Of course, once that initial work is completed, the remaining steps of adding another node and retesting progress rather quickly. Furthermore, if the DBA simply duplicates all the nodes and instances like the first one, then the additional node benchmarking can be run with little or no DBA interaction (i.e. it eliminates steps 2-a and 2-b). This also provides the greatest flexibility to test any scenario in any order that one might prefer (e.g. test 10 nodes down to 1 node). So a little up-front work can go a long way.

TESTING
In our benchmarking test case, the first three steps (establish a fundamental baseline, optimize the basic operating system and optimize the non-RAC database) were very straightforward and quite uneventful. We simply installed Red Hat Advanced Server 4.0 update 1 and all the device drivers necessary for our hardware, installed Oracle 10g Release 1, and patched Oracle to version 10.1.0.4.
We of course then modified the Linux kernel parameters to best support Oracle by adding the following entries to /etc/sysctl.conf:

kernel.shmmax = 2147483648
kernel.sem = 250 32000 100 128
fs.file-max = 65536
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144

We then made sure that asynchronous I/O was compiled in and being used by performing the following steps:

cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk async_on
make -f ins_rdbms.mk ioracle

and setting the necessary "spfile" parameters:

disk_asynch_io = true          (default value is true)
filesystemio_options = setall  (for both async and direct I/O)

Note that in Oracle 10g Release 2, asynchronous I/O is compiled in by default.

Next we created our RAC database and initial instance using Oracle's Database Configuration Assistant (DBCA), being very careful to choose parameter settings that made sense for our proposed maximum scalability (i.e. 10 nodes). Finally, we made the last manual "spfile" adjustments shown below:

cluster_database=true
cluster_database_instances=10
db_block_size=8192
processes=16000
sga_max_size=1500m
sga_target=1500m
pga_aggregate_target=700m
db_writer_processes=2
open_cursors=300
optimizer_index_caching=80
optimizer_index_cost_adj=40

The key idea was to eke out as much SGA memory usage as possible within the 32-bit operating system limit (about 1.7 GB). Since our servers had only 4 GB of RAM each, we figured that allocating half to Oracle was sufficient – with the remaining memory to be shared by the operating system and the thousands of dedicated Oracle server processes that the TPC-C-like benchmark would be creating as its user load.

Now it was time to ascertain the reasonable per-node load that our servers could accommodate. This is arguably the most critical aspect of the entire benchmark testing process – especially for RAC environments with more than just a few nodes. We initially ran a TPC-C on the single node without monitoring the benchmark run via the vmstat command. Simply looking at the transactions-per-second graph in BMF yielded the deceptive belief that we could set the "sweet spot" at 700 users per node. The problem was that even though the graph continued in a positive direction up to 700 users, in reality the operating system was being overstressed and began to exhibit thrashing characteristics at about 600 users. Moreover, we did not temper that value by reducing it for RAC overhead. The end result was that our first attempt at running a series of benchmarks at 700 users per node did not scale either reliably or predictably beyond four servers. Our belief is that by taking each box to a near-thrashing threshold through our overzealous per-node user load selection, the nodes did not have sufficient resources available to communicate in a timely enough fashion for inter/intra-node messaging – and thus Oracle began to think that nodes were either dead or non-responsive. Furthermore, when relying upon Oracle's client- and server-side load balancing feature, which allocates connections based upon the nodes that respond, the user load per node became skewed and then exceeded our per-node "sweet spot" value. For example, when we tested 7,000 users on 10 nodes, since some nodes appeared dead to Oracle, the load balancer simply directed all the sessions to whatever nodes were responding.
So we ended up with nodes trying to handle far more than 700 users – and thus the thrashing was even worse.

Note: this will not be a concern in Oracle Database 10g Release 2. With the runtime connection load balancing feature and FAN technology, the client is proactively notified about resource availability on each node and can place connections on the instances that have the most resources to spare. Load balancing can be performed either by connections or by response time.

So, with a valuable lesson learned from our first attempt, we made two major improvements. First, we re-ascertained our true "sweet spot" by monitoring the single-node 100-to-800 user load test – watching very carefully for the onset of excessive paging, swapping or consistent CPU idle time near zero percent. That number was 600 users, not the 700 we had tried before. We then adjusted that number down to 500 users, simply figuring that the RAC architecture would require 15% overhead. This is not a recommendation per se; we simply wanted to pick a number that would yield a purely positive scalability experience for the next set of benchmark runs. If we had more time, we could have selected a less conservative "sweet spot" and kept repeating our tests until a definitive reduction percentage could be factually derived. Again, we simply erred on the side of caution and chose a "sweet spot" value that we expected to work well yet did not overcompensate. Second, we decided to rely upon Benchmark Factory's load balancing feature for clusters – which simply allocates, or directs, 1/nth of the jobs to each node. That way we could be absolutely sure that we never had more than our "sweet spot" of users running on any given node.

With the correct per-node user load now identified and load balancing guaranteed, it was a very simple (although time-consuming) exercise to run the TPC-C-like benchmarks listed below:

1 node: 100 to 500 users, increment by 100
2 nodes: 100 to 1,000 users, increment by 100
4 nodes: 100 to 2,000 users, increment by 100
6 nodes: 100 to 3,000 users, increment by 100
8 nodes: 100 to 4,000 users, increment by 100
10 nodes: 100 to 5,000 users, increment by 100

Benchmark Factory's default TPC-C-like test iteration requires about 4 minutes for a given user load, so for the single node with five user-load scenarios the overall benchmark run requires about 20 minutes. During the entire testing process the load was monitored with SoRAC to identify any hiccups. As illustrated in Figure 6.0, during our four-node tests we did notice that CPU usage on nodes racdb1 and racdb3 reached 84% and 76% respectively. The root cause was a temporary overload of users on those servers combined with the ASM response time.

Figure 6.0 Four-node tests – high CPU usage on nodes racdb1 and racdb3

We increased the following parameters on the ASM instance, re-ran our four-node tests, and all was well from this point on:

Parameter      Default value   New value
SHARED_POOL    32M             67M
LARGE_POOL     12M             67M

These were the only parameter changes we had to make to the ASM instance; beyond this, everything worked smoothly. Figure 7.0 gives another look at the cluster-level latency charts from SoRAC during our eight-node run (Figure 7.0: Cluster latency charts, eight-node tests); the interconnect latency was well within expectations and on par with typical industry network latency figures.
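As a rough cross-check of what SoRAC was showing, the average time to receive a consistent-read block over the interconnect can be estimated from GV$SYSSTAT. This is only a sketch: the statistic names shown are the 10g ones and can vary slightly between releases, and the factor of 10 assumes the receive time is reported in centiseconds.

SELECT r.inst_id,
       ROUND(10 * t.value / NULLIF(r.value, 0), 2) AS avg_cr_receive_ms
  FROM gv$sysstat r
  JOIN gv$sysstat t ON t.inst_id = r.inst_id
 WHERE r.name = 'gc cr blocks received'     -- CR blocks shipped to this instance
   AND t.name = 'gc cr block receive time'  -- cumulative receive time (centiseconds)
 ORDER BY r.inst_id;

Low single-digit millisecond averages are broadly what you would hope to see on a healthy Gigabit Ethernet interconnect.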
Returning to timings: once you factor in certain inherent latencies and delays between test runs, each overall run actually takes closer to 30 minutes. So as you add nodes, and thus more user-load scenarios, the overall benchmark runs take longer and longer; the 10-node test takes well over four hours. Other than some basic monitoring to make sure that all is well and the tests are working, there is really not very much to do while these tests run – so bring a good book to read. The final results are shown below in Figure 8.0.

Figure 8.0 Results for 1 to 10 RAC nodes

Monitoring the storage subsystem using SoRAC (Figure 9.0) indicated that ASM was performing extremely well at this user load. Ten instances with over 5,000 users still showed excellent service times from ASM; in fact the I/O rate was noticeably high, topping 2,500 I/Os per second.

Figure 9.0 ASM performance with 10 RAC nodes

The results are quite interesting. As the graph clearly shows, Oracle RAC and ASM are very predictable and reliable in terms of scalability: each successive node continues the near-linear line almost without issue. There are three or four noticeable troughs in the graph for the 8- and 10-node test runs that seem out of place. Note that we had one database instance that was throwing numerous ORA-00600 [4194] errors related to its UNDO tablespace, and that one node took significantly longer to start up and shut down than all the other nodes combined. A search of Oracle's Metalink web site located references to a known problem that would require a database restore or rebuild. Since we were tight on time, we decided to ignore those couple of valleys in the graph, because it is pretty obvious from the overall results that smoothing over those few inconsistent points would yield a near-perfect graph – showing that RAC is truly reliable and predictable in terms of scalability. Using the six-node results to project forward, Figure 10.0 shows a reasonable expectation of realizable scalability – where 17 nodes should deliver nearly 500 TPS and support about 10,000 concurrent users.

Figure 10.0 Projected RAC Scalability

CONCLUSION
Apart from the minor hiccups in the initial round, where we determined the optimal user load per node for the given hardware and processor configuration, the scalability of the RAC cluster was outstanding. The addition of every node to the cluster showed steady, close-to-linear scalability – close to linear because of the small overhead that the cluster interconnect consumes during block transfers between instances. The interconnect also performed very well; in this particular case the NIC pairing/bonding feature of Linux was implemented to provide load balancing across the redundant interconnects, which also provides availability should any one interconnect fail. The Dell/EMC storage subsystem, which consisted of six ASM diskgroups for the various datafile types, performed with high throughput, also indicating strong scalability. EMC PowerPath provided I/O load balancing and redundancy utilizing dual Fibre Channel host bus adapters on each server. It is the unique architecture of RAC that makes this possible, because irrespective of the number of instances in the cluster, the number of hops performed before the requestor receives the requested block will not exceed three under any circumstances.
This unique architecture removes limitations found in the clustering technology available from other database vendors, giving maximum scalability, as demonstrated through the tests above. These tests also suggest that Oracle's direction towards a true technology grid environment, in which RAC and ASM are only the stepping stones, is encouraging. Oracle® 10g Real Application Clusters (RAC) software running on standards-based Dell™ PowerEdge™ servers and Dell/EMC storage can provide a flexible, reliable platform for a database grid. In particular, Oracle 10g RAC databases on Dell hardware can easily be scaled out to provide the redundancy or additional capacity that the grid environment requires.

ABOUT THE AUTHORS

Anthony Fernandez: Fernandez is a senior analyst with the Dell Database and Applications Team of Enterprise Solutions Engineering, Dell Product Group. His focus is on database optimization and performance. Anthony has a bachelor's degree in Computer Science from Florida International University.

Bert Scalzo: Bert Scalzo is a product architect for Quest Software and a member of the TOAD development team. He designed many of the features in the TOAD DBA module. Mr. Scalzo has worked as an Oracle DBA with versions 4 through 10g, and has worked for both Oracle Education and Oracle Consulting. He holds several Oracle Masters certifications, a BS, MS and PhD in Computer Science, an MBA, and several insurance industry designations. His key areas of DBA interest are Linux and data warehousing (he designed 7-Eleven Corporation's multi-terabyte, star-schema data warehouse). Mr. Scalzo has written articles for Oracle's Technology Network (OTN), Oracle Magazine, Oracle Informant, PC Week, Linux Journal and www.linux.com. He has also written three books: "Oracle DBA Guide to Data Warehousing and Star Schemas", "TOAD Handbook" and "TOAD Pocket Reference". Mr. Scalzo can be reached at bert.scalzo@quest.com or bert.scalzo@comcast.net.

Murali Vallath: Murali Vallath has over 17 years of IT experience designing and developing databases, with over 13 years on Oracle products. Vallath has completed over 50 successful small, medium and terabyte-sized RAC implementations (Oracle 9i and Oracle 10g) for well-known corporate firms. He is the author of the book 'Oracle Real Application Clusters' and of an upcoming book titled 'Oracle 10g RAC, Grid, Services & Clustering'. Vallath is a regular speaker at industry conferences and user groups, including Oracle OpenWorld, UKOUG and IOUG, on RAC and Oracle RDBMS performance tuning topics. He is the president of the RAC SIG (www.oracleracsig.org) and the Charlotte Oracle Users Group (www.cltoug.org).

Zafar Mahmood: Zafar is a senior consultant with the Dell Database and Applications Team of Enterprise Solutions Engineering, Dell Product Group. He has been involved in database performance optimization, database systems, and database clustering solutions for more than eight years. Zafar has a B.S. and M.S. in Electrical Engineering, with specialization in Computer Communications, from the City University of New York.
↧
Blog Post: Getting started with Oracle in the Amazon cloud
Cloud computing is a heavily overused term, applied to a wide range of offerings – from Gmail and Flickr to cloud platforms such as Windows Azure, Google App Engine and the Amazon Web Services environment. I generally define cloud computing as the provision of virtualized application software, platforms or infrastructure across the network, in particular the internet.

Amazon Web Services (AWS) has taken an early market, technology and mindshare lead in the Infrastructure as a Service (IaaS) category of cloud computing. AWS and similar offerings provide virtualized server resources together with other infrastructure (messaging, database, etc.) over the internet, and customers build their application stacks from these components. Amazon probably provides the richest IaaS stack, including distributed storage (S3), a cloud database (SimpleDB), messaging (SQS) and payments (FPS).

Ever since Oracle announced support for Amazon AWS at Oracle OpenWorld I've been eager to play around with it. This week I managed to find the time to set up a test database and thought I'd share my experiences.

Getting into AWS

Getting started with Amazon AWS is a big topic. In a nutshell:
Get an AWS account from www.amazon.com/aws (you need your credit card)
Download and install ElasticFox (http://sourceforge.net/projects/elasticfox/). I've used the Amazon command line utilities in the past, but ElasticFox is far easier. Another alternative is to use the free web-based dashboards at Rightscale.
Install putty (www.putty.org) – a free SSH client
Enter your AWS credentials in ElasticFox
Create a keypair in the keypairs tab of ElasticFox

Most of the above went smoothly for me, other than the use of putty with ElasticFox. It turns out that putty keypairs are not compatible with normal AWS keypairs, so you have to create a putty-compatible key. To do this, import your .PEM file into the PUTTYGEN program that comes with putty and generate a .PPK key, which you then use when connecting via putty.

Port configuration

We need to open various ports to allow us to use SQL*Net, OEM and so on. ElasticFox lets you change the default port settings in the security groups page. I've got 3306 open for MySQL, and 5901-5902 so I can run VNC servers in the EC2 instance. 1521 is for SQL*Net of course, and 1158 is for Enterprise Manager. 22 and 80 are standard for SSH and HTTP.

Starting the EC2 instance

Amazon Machine Images (AMIs) are template servers that you can start with. It's easy enough to find the Oracle AMIs – just specify "oracle" in the ElasticFox search box. Then you can select the one you want (in this case 32-bit Oracle 11g) and select "Launch instance(s) of the AMI". You need to wait for the new instance to start; about 5 minutes should do it (about the time it takes to start up a Linux host on physical hardware). Once the instance is running we can connect to it by right-clicking and selecting "connect to public DNS name". This is where things will go wrong if you haven't created a putty-compatible private key file. If everything is set up correctly you will get an SSH session on your new EC2 instance (you should not be prompted for a password). All being well, you will be asked to accept the Oracle licensing agreement and then asked whether you want to create a database. Say "no" for now, since we are going to create our database on Elastic Block Storage (EBS).

Elastic Block Storage

An EC2 instance is a transient virtual machine image. Any changes you make to the virtual machine will be lost when you terminate the instance.
Obviously we don't want our database to just disappear, so we need a permanent solution. Firstly, we want our database files on persistent storage, which means we need an Elastic Block Storage (EBS) device. EBS volumes appear like raw partitions on the EC2 host, but they have a separate and persistent existence. EBS volumes can be created in ElasticFox in the "Volumes and Snapshots" page. Below I'm creating a 10GB volume where I will put the database files. Make sure that the volumes are in the same availability zone as your instance: in the screen shot below I created the volume in 'us-east-1a', but my EC2 instance is in 'us-east-1b' (see the screen shot above), so I had to discard and recreate this volume before I could use it. Now, attach the EBS volume to your instance and assign a device such as '/dev/sdb'.

Now we are ready to create a database. Below I cd to the ~oracle/scripts directory (as root!) and run the run_dbca.sh script. Creating the database is straightforward, but when asked where you want to create the database, make sure you specify /dev/sdb (or whatever device you used to mount your EBS volume). If you create your database on /dev/sda2, then the database files will be lost when the instance is terminated. Because /dev/sdb is on an EBS storage volume, it will persist even if the EC2 instance goes down, so our database files will not be lost.

Creating an AMI template

Now we have a database on our EC2 instance, and the database files are on permanent storage. But if the EC2 instance goes down I will have lost whatever customizations I've made (such as installing Perl DBI, setting up VNC servers, etc.) and I'll have to somehow get the datafiles attached to a new database on a fresh image. What I really want to do is save a copy of my EC2 image once I've got it the way I want it. The way to do this is to create an Amazon Machine Image (AMI) template. Here are the steps:
Shut down the database
Use ec2-bundle-vol on the EC2 image to create a bundle containing the image definition
Use ec2-upload-bundle to copy the bundle into S3 (Simple Storage Service) files
Use ec2-register to assign the S3 files to an AMI image

You need more space than usual to create an image of the Oracle system. The maximum bundle size is 10GB, and I initially ran out of space when trying to create such a large bundle, so I created a 20GB EBS volume and mounted it as /bundle. I then created the bundle with the ec2-bundle-vol command. The --user option provides your AWS account number without any hyphens; the --cert and --privatekey arguments require the digital certificate and private key files that should have been created for you when you first signed up for AWS. Next I upload the bundle to an S3 bucket:

ec2-upload-bundle \
  --manifest /bundle/gh-ami-oracle11.img.manifest.xml \
  --bucket gh-ami-oracle11 \
  --access-key `cat aws_access_key.txt` \
  --secret-key `cat aws_secret_key.txt`

The access-key and secret-key arguments expect you to pass the keys on the command line, but it's probably safer to cat them from files as shown above. Also, I don't want to give you my keys! Finally, I register the bundle as an AMI image. The ec2-register command is part of the AWS command line toolkit, which I have installed on my desktop:

bash-3.2$ ec2-register gh-ami-oracle11/gh-ami-oracle11.img.manifest.xml
IMAGE   ami-8f3adee6

My AMI template is now visible from ElasticFox, ready for me to launch.

Setting up a new server

So now my configuration is saved to the AMI template and the datafiles are safely stored in my EBS volume.
If I lose my existing server I can "simply" start up a new version of the EC2 instance from the template and attach the datafiles. Well, maybe "simply" is an exaggeration! There are two main things that have to happen before I can restart my database on a new image:
I need to attach the EBS storage to the new EC2 server
I need to adjust the Oracle listener and OEM to the new host name

When I launch my new instance, it won't boot up all the way unless the EBS volumes that were mounted when I bundled it are present. There is an option to provide a different fstab file with your bundle, but for now I need to attach my EBS volumes to the instance and then reboot it.

The second issue is a bit fiddlier. The new EC2 instance has a new hostname, and the listener and the OEM dbconsole are associated with the old instance name. Firstly, we should change the listener.ora file so that the hostname is the private DNS name (you can copy this to the clipboard from ElasticFox). After you've edited the listener.ora file you should restart the listener. You also need to change the value of ORACLE_HOSTNAME in ~oracle/.bash_profile; in this case you want the public DNS name, which again you can copy to the clipboard from the ElasticFox screen. Finally, rebuild the Enterprise Manager configuration by issuing the following commands from the Oracle account (in this example, the passwords for all the relevant accounts are held in the $PASSWORD variable):

emca -deconfig dbcontrol db -repos drop -SID ghec2a -PORT 1521 \
  -SYSMAN_PWD $PASSWORD -SYS_PWD $PASSWORD
emca -config dbcontrol db -repos create -SID ghec2a -PORT 1521 \
  -SYSMAN_PWD $PASSWORD -SYS_PWD $PASSWORD -DBSNMP_PWD $PASSWORD

All done! The listener, database and OEM dbconsole should now be running successfully on the new EC2 host.

A quick look in Spotlight

I'll be using Amazon images from time to time for research purposes, but the first thing I was curious about was how the database would look in Spotlight. So here's the Spotlight home page for the database I just set up: whoops! The original Oracle image had ARCHIVELOG mode on and now I'm almost out of space. It's archiving both to the recovery area and to $ORACLE_HOME/dbs/arch. The recovery area is on the EBS volume, but the filesystem location is within the Oracle home and therefore on my EC2 instance storage. In the future, I'll turn off archiving before creating an AMI (a minimal sketch of doing so appears at the end of this post), since not only could I potentially run out of disk, but the AMI is also bigger than it should be because it contains a couple of hundred megabytes of archived logs. I cleared the logs using RMAN:

connect target sys/mypassword
CONFIGURE DEVICE TYPE DISK BACKUP TYPE TO BACKUPSET;
CONFIGURE CHANNEL DEVICE TYPE DISK FORMAT '/oradata/tmp/ora_df%t_s%s_s%p';
BACKUP ARCHIVELOG ALL DELETE ALL INPUT;

Now when I look in the Spotlight archive destinations drilldown the situation is OK, and I have a few days of archive log capacity left (at current generation rates).

Conclusion

I'm a big believer in the style of cloud computing pioneered by Amazon, and I'm really pleased to see Oracle playing a part in the Amazon cloud. It's pretty easy to get started with Oracle in AWS – especially if you have a basic familiarity with the AWS environment – but a bit harder to establish a persistent database environment. For the open source software stacks that dominate within AWS, companies such as Rightscale offer facilities to help automate the construction and deployment of solutions. Hopefully there'll be similar facilities for Oracle in AWS soon.
In the meantime, I hope that sharing my experiences helps in some small way.
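Here is the minimal sketch mentioned above for turning off archiving before bundling an AMI. Run it as SYSDBA in SQL*Plus, and only on a throwaway test image where you genuinely don't need the archived logs:

-- Disable ARCHIVELOG mode (the database must be cleanly mounted, not open)
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE NOARCHIVELOG;
ALTER DATABASE OPEN;

-- Confirm the change
ARCHIVE LOG LIST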
↧
Blog Post: Optimizing the Oracle 11g Result Cache
The Oracle 11g result set cache stores entire result sets in shared memory. If a SQL query is executed and its result set is already in the cache, then almost the entire overhead of the SQL execution is avoided: this includes parse time, logical reads, physical reads and any cache contention overhead (latches, for instance) that might normally be incurred.

Sounds good, right? In fact, you might be thinking that the result set cache is better than the buffer cache. In reality, result set caching is only sometimes a good idea, because:
Multiple SQLs that have overlapping data will store that data redundantly in the cache. So the result set that contains all customers from California will duplicate some of the data in the cached result set for all customers from North America. Therefore, the result set cache is not always as memory efficient as the buffer cache.
Any change to a dependent object – to any table referenced in the query – will invalidate the entire cached result set. So the result set cache is most suitable for tables that are read-only or nearly read-only.
Really big result sets will either be too big to fit in the result set cache, or will force most of the existing entries out of the cache.
Rapid concurrent creation of result sets in the cache will result in latch contention.

We introduced diagnostics for the result set cache in Spotlight on Oracle 6.1 – part of the Toad DBA Suite. Let's examine the 11g result set cache using Spotlight as our guide.

The result set cache is part of the shared pool. By default it is sized at only 1% of the shared pool, so it is usually pretty small. I increased the size of my result set cache to 10MB using the RESULT_CACHE_MAX_SIZE parameter. We show the result set cache on the Spotlight on Oracle home page within the shared pool section: the size of the result set cache, and the number of result sets "found" in the cache each second. Any alarms relating to the result set cache (which we'll see a bit later in this article) will show up here. We also added a drilldown to show the contents and the behaviour of the result set cache. Here's an example of that drilldown for a cache that is working well: there's only one SQL in the cache, though it has 77 result sets. A single statement can have many result sets – one for each unique combination of bind variables. So for the statement above, a new result set can be cached for each value of CUST_ID that is supplied to the query.

Let's take a look at that statement in a bit more detail. The RESULT_CACHE hint – identified by a red "1" above – is what allows the SQL to be cached. In 11g Release 1, the only other way to have a query cached is to set the RESULT_CACHE_MODE parameter to FORCE; we'll see later why that is almost always a bad idea. The hit ratio – identified by a red "2" above – indicates how often a matching result set was found in the cache. In the case above, 57% of executions found their result set in the cache – pretty good! The "Execution time % saved" is our estimate of how much execution time was saved through the result cache. In this example, more than half of the execution time was avoided – again, a pretty good result considering the relatively small cost in memory (only 10MB).

That statement was a good candidate for caching, since the SQL is somewhat expensive, has a small result set (a single number) and a limited number of possible bind variable values (limited to the number of customers).
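As a minimal sketch of this kind of opt-in caching (the table and column names below echo Oracle's sample schema and are only illustrative; the 10M setting simply mirrors the size used above):

-- Reserve 10MB of the shared pool for the result cache
ALTER SYSTEM SET result_cache_max_size = 10M SCOPE = BOTH;

-- Ask for this query's result set to be cached; repeated executions with
-- the same :cust_id can then be satisfied directly from the cache
VARIABLE cust_id NUMBER
EXEC :cust_id := 42

SELECT /*+ RESULT_CACHE */ SUM(amount_sold)
  FROM sales
 WHERE cust_id = :cust_id;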
With a statement like that we have a good chance of finding a match in the cache, and we avoid a lot of work when we do. The pattern in the Result Cache statistics chart shows what we hope to see when a SQL is being cached effectively: initially there are a lot of result sets being created but not many "finds"; after a while the number of finds increases as the cache fills up with reusable results, and the number of creates decreases because the result sets are already in the cache. We generally want to see many more finds than creates, because that indicates that the cache entries being created are generally reused.

Not suitable for all SQLs

If the above example makes you enthusiastic about result set caching, brace yourself, because the following examples show how result set caching can often provide no benefit, or even severely hurt performance. Result set caching makes sense if the following are true:
The result set is small
The SQL statement is expensive
The SQL statement does not experience high rates of concurrent execution
The SQL does not generate a large number of distinct result sets
The SQL is against relatively static tables

If the result set is large, then it may be too big to fit into the result set cache. The result cache works best when the SQL is expensive, since a quick index lookup might be satisfied by the buffer cache almost as quickly as a result cache lookup. Expensive aggregate queries such as the one from the previous example are ideal for the result set cache (Bert Scalzo likes to compare these to "on-the-fly in-memory materialized views").

If a table accessed in the query is subject to DML, then the cached result sets are invalidated. You can view these dependencies in Spotlight. For instance, below we see that one SQL is dependent on the SALES_ARCHIVE table and the other on the TXN_DATA table. If either of these tables were subject to frequent updates, then the SQLs would probably not be suitable for result set caching; if we knew that SALES_ARCHIVE was updated only infrequently, but that TXN_DATA was very volatile, then we could surmise that the second statement above probably should not be cached.

SQLs that generate a large number of distinct result sets are probably not good candidates either. In the SQL below, we've cached about 5,100 result sets (1) – all that would fit in our result set cache. However, over 2.5 million executions (2) we only found 137 matches (3). This situation arises because there can be hundreds of thousands of possible TXN_ID values and they don't repeat very often, so this SQL is a poor choice for result set caching.

Spotlight will raise a warning if the ratio of finds to creates is very low. We generally want at least as many finds as creates, since this indicates that result sets are being reused. Like many "hit ratios", there is no single correct value, but low values might suggest that you should take a closer look at the configuration of your result set cache and your selection of SQLs for caching.

Don't force it!

Poor hit rates in the query cache probably do little real harm. However, high rates of concurrent result set cache creation can bring a database to its knees. There is a single latch protecting the query cache; in effect this means that only one session can create a new result set cache entry at any moment. If you set RESULT_CACHE_MODE to FORCE, you'll probably kill your system, since every single SQL execution will be blocking on this latch.
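A minimal defensive sketch (the latch name prefix is the one used in 11g, although the exact latch name varies between releases):

-- Keep the cache opt-in: only SQLs carrying the RESULT_CACHE hint are cached
ALTER SYSTEM SET result_cache_mode = MANUAL SCOPE = BOTH;

-- Check for contention on the result cache latch
SELECT name, gets, misses, sleeps
  FROM v$latch
 WHERE name LIKE 'Result Cache%';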
When the result cache latch comes under this kind of pressure, Spotlight will fire a general latch alarm, as well as specific alarms on the result set cache latch. In the Result Set Cache drilldown, we can see the rate at which sessions are sleeping on the latch, and the general level of latch waits as a percentage of total active time. The chart above tells us that, overall, about 20% of database time is spent waiting on latches, and that requests for the result set cache latch "sleep" – fail to acquire the latch after spinning around 2,000 times – about 95% of the time. This is very severe latch contention, and it results from allowing SQL statements with high execution rates to be included in the result set cache. See this Toadworld article for more information on latch contention.

Conclusion

The result set cache is a new 11g facility that allows complete result sets to be stored in memory. If a result set can be reused, then almost all the overhead of SQL execution can be avoided. The result set cache best suits small result sets from expensive queries on tables that are infrequently updated. Applying the result set cache to all SQLs is unlikely to be effective and can lead to significant latch contention. In 11g Release 1, you should use the RESULT_CACHE hint to selectively cache SQL result sets; the monitoring sketch below shows one way to keep an eye on how the cache is being used. There's more information about the result set cache in my upcoming book Oracle Performance Survival Guide.
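A minimal monitoring sketch, assuming access to the V$ views (the statistic names shown are those documented for 11g):

-- Summary of cache activity: compare 'Find Count' with 'Create Count Success'
SELECT name, value
  FROM v$result_cache_statistics;

-- Detailed report of memory consumed by the result cache
SET SERVEROUTPUT ON
EXECUTE DBMS_RESULT_CACHE.MEMORY_REPORT;

Many more finds than successful creates is the healthy pattern described above; the reverse suggests the cached SQLs are poor candidates.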
↧
↧
Blog Post: Optimizing Memory
In previous articles, we've looked at ways of reducing the demand placed on the database by application SQL, and we've looked at ways to reduce contention between the demands of concurrently executing sessions. If we've done a good job with this, then we have hopefully minimized the amount of logical IO demand that the application creates for the database. This logical IO consists of requests for information contained within the database files, as well as for temporary result sets required to resolve joins and subqueries, to sort results, or to create hash structures for joins and similar operations.

Reducing this logical IO is one of the primary aims of SQL tuning – but just as important is to prevent as much of it as possible from turning into physical IO. Logical IO operations are effectively memory operations, implemented in the real world by electrons whizzing through integrated circuits at a significant fraction of the speed of light. Today, most physical IO is implemented by spinning disk devices and the movement of read-write heads. These mechanical operations are incredibly slow compared to memory operations, and although disk devices are getting faster, they are not increasing in speed anywhere near as fast as CPU and memory access. So it's absolutely critical that we prevent as much of the logical IO as possible from turning into physical IO. The key to doing this is effective memory configuration.

Figure 1 Oracle uses the buffer pools and PGA memory to avoid datafile and temporary tablespace IO

There are essentially two types of IO that we can avoid through optimizing memory. IO to the datafiles can be minimized through effective sizing of the buffer cache, while IO to the temporary tablespace can be minimized through optimal allocations to the PGA (see Figure 1).

Tuning the buffer cache

Memory allocated to the Oracle buffer cache stores copies of database blocks in memory and thereby eliminates the need to perform physical IO if a requested block is in that memory. In the "old days", Oracle DBAs would tune the size of the buffer cache by examining the "buffer cache hit ratio" – the percentage of IO requests that were satisfied in memory. However, this approach has proved to be error prone, especially when performed prior to tuning the application workload or eliminating contention. It's true that, all other things being equal, a high buffer cache hit ratio is better than a low ratio. The problem is that things are almost never equal when these measurements are taken, and very high buffer cache hit ratios are a common side effect of unnecessarily high logical IO.

In modern Oracle, the effect of adjusting the size of the buffer cache can be accurately determined by taking advantage of the Oracle advisories. V$DB_CACHE_ADVICE shows the amount of physical IO that would be incurred or avoided had the buffer cache been of a different size. Examining this advisory will reveal whether increasing the buffer cache will help avoid IO, or whether reducing the buffer cache could free up memory without adversely affecting IO. Oracle also allows you to set up separate memory areas to cache blocks of different sizes, and lets you nominate KEEP or RECYCLE areas to cache blocks from full table scans. You can optimize your IO by placing small tables accessed by frequent table scans in KEEP, and large tables subject to infrequent table scans in RECYCLE. V$DB_CACHE_ADVICE will allow you to appropriately size each area, although this resizing can occur automatically in 10g and 11g.
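For example, a minimal sketch of examining the advisory for the default pool (the 8K block size in the predicate is simply the common default, not a requirement):

SELECT size_for_estimate      AS cache_mb,
       size_factor,
       estd_physical_reads
  FROM v$db_cache_advice
 WHERE name          = 'DEFAULT'
   AND block_size    = 8192    -- assumes the default 8K block size
   AND advice_status = 'ON'
 ORDER BY size_for_estimate;

Rows with a size_factor greater than 1 show the estimated physical reads if the cache were grown; a flat curve suggests the extra memory would be better spent elsewhere.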
The buffer cache exists in the System Global Area (SGA), which also houses other important shared memory areas such as the shared pool, java pool and large pool. Oracle Database 10g automatically sizes these areas within the constraint of the SGA_MAX_SIZE parameter.

Oracle manages blocks in the buffer pool using a modified "Least Recently Used" (LRU) algorithm. The actual algorithm is fairly sophisticated, but in general the more often a block is accessed, the longer it will stay in the buffer pool; blocks that have not been accessed for a long time will "age out" of memory. In order to avoid full table scans flooding the buffer pools with one-off data, the algorithm favours indexed reads over full table scan reads: blocks from full table scans tend to age out almost immediately, while blocks from indexed lookups are held in the buffer pool until pushed out by other blocks.

Tuning the PGA

In addition to disk reads to access data not in the buffer cache, Oracle may perform substantial IO when required to sort data or execute a hash join. Where possible, Oracle will perform a sort or hash join in memory, using memory configured within the Program Global Area (PGA). However, if sufficient memory is not available, then Oracle will write to temporary segments in the temporary tablespace.

The effect of inadequately sizing the PGA can be very substantial. If memory is very short, Oracle may have to read and write data to and from disk many times during a sort, and this IO can easily be far more significant than the IO required to get the data in the first place. Figure 2 shows how, for a full table scan with an ORDER BY clause, the response time increases dramatically as the memory available for sorting decreases; at the extreme, the IO for sorting is many times the IO for retrieving the data in the first place.

Figure 2 IO from sorting can become excessive if the memory available is too small

The amount of memory available for sorts and hash joins is determined primarily by the PGA_AGGREGATE_TARGET parameter. The V$PGA_TARGET_ADVICE advisory view will show how increasing or decreasing PGA_AGGREGATE_TARGET would affect this temporary tablespace IO (a query sketch appears at the end of this article).

Oracle 11g automatic memory management

Oracle Database 10g manages the internal memory within the PGA and SGA quite effectively. But prior to Oracle 11g, memory is not moved between the two areas, so it's up to the DBA to make sure that memory is allocated effectively between them. Unfortunately, the two advisories concerned do not measure IO in the same units: V$DB_CACHE_ADVICE uses IO counts, while V$PGA_TARGET_ADVICE uses bytes of IO. Consequently it is hard to work out whether overall IO would be reduced if one of these areas were increased at the expense of the other. It is also difficult to associate the IO savings reported by the advisories with IO time as reported in the wait interface. Nevertheless, determining an appropriate trade-off between PGA and SGA sizing is probably the most significant memory configuration decision facing today's DBA, and it can have a substantial impact on the amount of physical IO that the database must perform.

If the Oracle 11g parameter MEMORY_TARGET is set, then Oracle will attempt to optimize memory between the SGA and the PGA, and will attempt to keep the sum of both within the specified memory target. This is potentially a big step forward, since it's so hard to determine the best setting in 10g. Automatic memory management is worth trying, especially if you are concerned about your overall IO load.
However, for some applications automatic memory management might not be the best solution. Automatic memory management tends to adjust memory allocations over a short-term window of about one hour. If your system has abrupt changes in workload, or short-term spikes in demand, automatic memory management may play catch-up throughout the day without ever delivering a good outcome. In this case it is better to determine the optimal memory settings and either leave them at fixed values, or alternate between known good configurations. For instance, we might switch memory from the SGA to the PGA at night, when long batch jobs require large sort areas, and move that memory back to the SGA during the day to minimize datafile IO during OLTP activity (a sketch of such a switch appears at the end of this article).

Working out the exact amount of memory for your workload is a complex task and requires that you convert the various advisories into a common unit of measure. Luckily, if you’re a TOAD DBA Suite owner, this is all done for you in Spotlight’s memory management facility. Figure 3 shows the memory management module in action.

Figure 3. Spotlight memory management

Spotlight accumulates all the advisory information and converts the IO or byte savings into estimated time savings based on the actual IO performance observed on your system. It then works out which combination of memory allocations would most reduce processing time on your system. You can adopt the recommendation in Spotlight or save it to be implemented at a later time.

Conclusion

Physical disk IO is the slowest part of the database architecture, so minimizing IO is one of the most important objectives in database tuning. The first step is to minimize logical IO demand, as outlined in previous instalments of this series. Once that is done, try to prevent the remaining logical IO from turning into physical IO through careful memory configuration. The two most important areas of memory in this context are the buffer cache and the Program Global Area (PGA). The former helps prevent logical reads turning into physical reads against the datafiles; the latter helps prevent sort and hash operations from generating IO to the temporary tablespace. Getting the balance right is difficult. 11g Automatic Memory Management can help, but it is not suitable for every system. Spotlight's memory management facility can automatically calculate an appropriate breakdown of memory.
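The sketch referred to above: a minimal example of alternating between known good configurations when automatic memory management is not in use. It assumes SGA_TARGET and PGA_AGGREGATE_TARGET are managed separately, that SGA_MAX_SIZE is at least as large as the daytime SGA_TARGET, and that the sizes and the scheduling mechanism (cron, DBMS_SCHEDULER and so on) are placeholders for illustration.

-- Night: shrink the SGA and grow the PGA ahead of batch sorts and hash joins
ALTER SYSTEM SET sga_target = 4G SCOPE = MEMORY;
ALTER SYSTEM SET pga_aggregate_target = 8G SCOPE = MEMORY;

-- Day: give the memory back to the SGA to minimize datafile IO during OLTP
ALTER SYSTEM SET pga_aggregate_target = 2G SCOPE = MEMORY;
ALTER SYSTEM SET sga_target = 10G SCOPE = MEMORY;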
Blog Post: EM12c and Hardware
I often hear, “My EM12c environment is slow!” and just as often, when I’m granted access to the environment, I find that EVERYTHING is running on an old single-core server with 2GB of RAM that someone found lying around. Enterprise Manager is often an afterthought for many IT organizations. After all the work has gone into the production environments that build revenue, there is often a disconnect about how important information on the health and status of those revenue machines is, with the result that Enterprise Manager receives the “cast-off” servers for its hosted environment.

So today we are just going to talk about sizing an EM12c environment. No tuning yet, but we’ll get to it! How does the team I work with decide what recommendations to make, and why is it important to make those recommendations? The questions to ask when designing or upgrading an EM12c environment are:

- How many targets will you be monitoring? (If you already have a repository, the query at the end of this post gives a quick count by target type.)
- Outside of basic monitoring, which of the available features do you foresee the business finding the most value in?
- How many users will be accessing the Enterprise Manager console?
- Do you have any unique firewall or network configurations?

The basic sizing recommendations, along with recommendations to meet MAA (Maximum Availability Architecture), are shown below; the decision factors are the number of targets and users*.

Armed with this info, is your EM12c environment under-sized or under-powered? Next, we’ll talk about why the database is not the only thing you should be tuning in your EM12c environment and why the SCP (Strategic Customer Program) has so much value!

*There are other factors, including features, management packs, plug-ins, etc., that can also impact the build and design of an EM12c environment for optimal performance.

Copyright © DBA Kevlar [ EM12c and Hardware ], All Rights Reserved. 2014.
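A footnote to the sizing questions above (mine, not part of the original post): if you already have a repository, a count of targets by type is one quick input to the sizing exercise. This sketch assumes access to the standard SYSMAN repository views.

-- Count managed targets by type in an existing EM12c repository
SELECT target_type,
       COUNT(*) AS target_count
FROM   sysman.mgmt$target
GROUP  BY target_type
ORDER  BY target_count DESC;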
Blog Post: How to Configure EM12c to NOT Use Load Balancers
This may not come up very often, but for some reason an administrator might have to reconfigure an EM12c environment to NOT use load balancers. This could be due to:

1. A hardware issue on the load balancers
2. Mis-configured load balancers
3. Re-allocation of load balancer hardware for other purposes

Whatever the reason may be, I noted that the instructions to set up the load balancers are easy to find, but instructions on how to configure the EM12c environment to stop using them once they’ve been put in place are not so easy to find.

To revert to a non-load-balancer configuration, you will need the SYSMAN password. The first step is to reconfigure and re-secure the OMS without the load balancers. The second step is to secure each agent without the wallet credentials and load balancer information.

Tip: If you have a large number of agents that will need to be re-secured, scripting the task may be advisable to limit the downtime and the overhead (a query for pulling the list of agents from the repository appears at the end of this post).

Reconfigure the OMS

Log onto the OMS host, proceed to the $OMS_HOME and secure the OMS, telling it to bypass the load balancers:

cd $OMS_HOME
./emctl stop oms -all
./emctl secure oms -no_slb

Oracle Enterprise Manager Cloud Control 12c Release 4
Copyright (c) 1996, 2014 Oracle Corporation. All rights reserved.
Securing OMS... Started.
Enter Enterprise Manager Root (SYSMAN) Password :
Enter Agent Registration Password :
Securing OMS... Succeeded.

Once securing completes, restart the OMS:

./emctl start oms

Verify Changes

Check your OMS details to ensure that the change has taken place:

./emctl status oms -details

Oracle Enterprise Manager Cloud Control 12c Release 4
Copyright (c) 1996, 2014 Oracle Corporation. All rights reserved.
Enter Enterprise Manager Root (SYSMAN) Password :
....
OMS is not configured with SLB or virtual hostname
Agent Upload is unlocked.
...

Re-secure the Agents

Each agent (other than the one on the OMS host, which was secured without the load balancers when you ran the command in the previous step) will need to be secured with the following command:

cd $AGENT_HOME
./emctl secure agent

Oracle Enterprise Manager Cloud Control 12c Release 4
Copyright (c) 1996, 2014 Oracle Corporation. All rights reserved.
Agent is already stopped... Done.
Securing agent... Started.
Enter Agent Registration Password :
Securing agent... Succeeded.

That is all there is to reverting your EM12c environment to its pre-load-balancer configuration!

Copyright © DBA Kevlar [ How to Configure EM12c to NOT Use Load Balancers ], All Rights Reserved. 2014.
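The query referred to in the tip above (mine, not from the original post): a starting point for scripting the agent re-secure step. It assumes the standard SYSMAN repository views and that agents are registered under the target type oracle_emd.

-- List agent targets and their hosts to drive a re-secure script
SELECT target_name,
       host_name
FROM   sysman.mgmt$target
WHERE  target_type = 'oracle_emd'
ORDER  BY host_name;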