Introduction

In this article, I want to talk about parallelism in the Oracle database. First, let me introduce some concepts. Parallel execution in the Oracle database is the ability to split a large serial task and/or a huge dataset into many smaller chunks that are processed physically at the same time in order to reduce the overall processing time. Typical problems solved using parallelism include:

· Parallel query: execute a single costly query in parallel using many OS processes, for example queries involving full table scans, fast full index scans, partitioned index scans, partitioned table scans, large sorts, etc.
· Parallel DML: perform bulk INSERT, UPDATE, DELETE and MERGE operations.
· Parallel DDL: perform large DDL operations in parallel, for example creating and rebuilding large indexes, loading data via CREATE TABLE AS SELECT, reorganizing large tables, etc.
· Procedural parallelism: the ability to run PL/SQL code in parallel.

It is worth stressing that applying parallelism is only possible if we physically have the required computing resources, such as several processing units (multiple CPUs and/or multiple cores) and I/O channels (a RAID formation spreading data across multiple disks). The reason is that we need to execute processing operations and data access at the same time, so we need the infrastructure to do it physically in parallel; otherwise the physical resources (CPU and I/O) become the bottleneck of the logical operations we want to run in parallel. To illustrate the idea: if we have a node with 4 processors and we execute 2 intensive queries in parallel with a degree of parallelism of 2 (giving us 4 units of work), then each processor takes one unit of work while the queries execute in parallel. If we increase the workload and run 50 intensive queries in parallel with a degree of parallelism of 2 (giving us 100 units of work), and the four processors can physically run only 4 units of work at the same time, then the OS scheduler tries to give a time slice to every one of the 100 units of work, and the context switching between the units competing for the processors becomes the bottleneck. In this scenario parallelism is not a solution that improves performance; it is the source of performance and bottleneck problems. The same logic applies when several tasks access data on disk. The rule of thumb is that parallelism is not a silver bullet for reducing the processing time of intensive operations; it can only be used when the physical computing resources (sufficient free CPU, memory and I/O bandwidth) are available.

Parallel architecture in the Oracle database

Now let's talk about how the Oracle database implements parallelism. The basic architecture of the Oracle database, based on background and foreground processes, can be easily extended to support parallelism (more details about the basic architecture of the Oracle database can be found in my previous articles). When a SQL statement is executed in parallel, the server process (the process that represents the user session and receives the request to execute the SQL statement) becomes the query coordinator (QC). The main role of the QC is to parse the SQL statement and then partition the work between parallel query slaves (PQS). During the parse phase, the QC creates serial and parallel execution plans based on the defined degree of parallelism.
To move on to the partition and execution phase, the QC tries to obtain a sufficient number of PQS to execute the SQL statement. If the QC is unable to obtain the resources, the SQL statement is executed serially. Otherwise, if the QC obtains sufficient resources, it sends the commands to execute the SQL statement to the allocated PQS. A queue mechanism named table queues (TQ) is used to coordinate the QC and the PQS, as well as the PQS among themselves. A TQ is a communication mechanism that enables data rows to flow between the processes involved in the parallel SQL statement. A PQS is a background process that executes a portion of the work for the SQL statement. The PQS are part of a pool of processes and are coordinated using a producer/consumer pattern: when acting as a producer, a PQS puts data into a queue for the next step; when acting as a consumer, it gets data from a queue and performs some operation on it.

To illustrate these ideas, let's look at an architecture diagram that represents the key concepts and data flow, as shown in Figure 1. PQS0 and PQS1 read data from the database and put the data rows into their output queues TQ0 and TQ1. The queues TQ0 and TQ1 move the data rows into the queues TQ2 and TQ3. PQS2 and PQS3 then get the data rows from their input queues TQ2 and TQ3 in order to process the underlying SQL statement. After PQS2 and PQS3 finish their processing, they put the results into their output queues TQ4 and TQ5. The results from the queues TQ4 and TQ5 are moved into the consolidation queue TQ6, and finally the QC takes the results from the queue TQ6 and returns them. It is worth noting that the PQS bypass the buffer cache and perform I/O operations directly on the storage. This avoids creating contention on the buffer cache and enables I/O to be distributed more optimally between the PQS. The data read from the storage is processed directly in the PGA of each PQS.

Figure 1

There are also intra-operations and inter-operations that can be executed in parallel to support the parallel execution of SQL statements. Intra-operation (data) parallelism is the parallelization of an individual operation, where the same operation is performed on chunks of a big dataset, while inter-operation (task) parallelism happens when two operations run concurrently on different data, with data flowing from one operation into the other. Let's see these ideas in action by executing the following query in parallel, as shown in Listing 1.

SELECT /*+ parallel(c,4) */ * FROM CUSTOMER c ORDER BY CUSTOMER_NAME

Listing 1

As we can see in Figure 2, in this case we want to get a list of customers ordered by customer name, with a degree of parallelism of four. In other words, we have two phases: a scan to get the data rows and a sort to order them. The scan phase is executed by four processes (PQS) in parallel, and the sort phase is likewise executed by four PQS in parallel. Finally, the QC (the session's server process) combines the resulting data rows and returns them to the client process. The process formation consists of eight PQS executing at the same time, because the inter-operation parallelism happens simultaneously. Since the PQS run at the same time, we can see a scenario where a set of rows with customer names starting with A-G is scanned by one PQS and sent to another PQS in the sort phase, which finally returns them to the QC.
Figure 2

Another question to ask about Oracle database parallelism is: how many PQS are needed to execute a SQL statement? From the previous example we can see that executing a query with a sort at a degree of parallelism of 4 requires 8 PQS, so by that logic we need 2*(degree of parallelism) PQS in total. So, if we have a query with three phases, that is, a query that includes a table scan phase, a sort (ORDER BY) phase and a grouping (GROUP BY) phase, how many PQS are needed? In this case the formula is not (number of phases)*(degree of parallelism), because the Oracle database reuses the PQS of the scan phase to perform the grouping phase. At the end of the day, we need at most 2*(degree of parallelism) PQS in total to execute any SQL statement. This concept is illustrated in the following figures.

Figure 3

Figure 4

Figure 5

And finally, let's talk about configuring the Oracle database instance to tune parallelism. The Oracle database keeps a pool of parallel query slaves (PQS) to execute SQL statements in parallel. There are two important configuration parameters that control this pool:

· PARALLEL_MIN_SERVERS: specifies the minimum number of PQS in the pool for the instance. A value of 0 indicates that we don't need any PQS. A value greater than 0 tells the instance to create that many PQS in the pool at startup. Note that after a period of inactivity, idle PQS processes are shut down until the pool shrinks back to the value of this parameter.
· PARALLEL_MAX_SERVERS: specifies the maximum number of PQS in the pool for the instance. Processes beyond the minimum are not created at instance startup, but on demand when needed. We need to be careful with this parameter, because if the value is too high and there are not sufficient physical resources (CPU, I/O, memory), the system can degrade. It is important to remember that each PQS process allocates a memory area in the PGA.

How to use parallel features

And finally, let's talk about how to use the parallel features of the Oracle database in your SQL statements. As we have seen, when we declare a query as shown in Listing 1, the hint /*+ parallel(c,4) */ tells the Oracle database to scan the CUSTOMER table, represented by the alias c, using four PQS. We can also specify a fixed degree of parallelism per table, as shown in the following listing. Although the Oracle database permits this kind of statement, it is not well suited for an OLTP system, where we want to keep control of parallelism and apply it only when strictly necessary; the recommended approach is to add the hint to the SQL statements that should run in parallel and to tune the degree of parallelism there.

ALTER TABLE CUSTOMER PARALLEL 4;

Listing 2

It is worth noting that the final number of PQS dedicated to executing a SQL statement in parallel is ultimately determined by the database at run time; the degree of parallelism is our desired number of PQS and only specifies an upper limit. There are several hints related to parallelism (see the sketch after this list):

· PARALLEL: specifies the desired number of PQS to be used for the parallel SQL statement. This hint applies to SELECT, INSERT, UPDATE and DELETE. In queries that join multiple tables and use aliases, the hint should reference the table alias.
· NOPARALLEL: specifies, as its name indicates, that parallelism should not be used.
· PQ_DISTRIBUTE: improves the performance of parallel join operations by specifying how rows of the joined tables should be distributed among producers and consumers.
· PARALLEL_INDEX: specifies the desired number of PQS that can be used to parallelize index range scans for partitioned indexes. As the definition says, this hint makes sense only on partitioned indexes and only when range scans are performed.
· NOPARALLEL_INDEX: explicitly negates the previous hint.
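To make the hints above concrete, here is a minimal sketch. The CUSTOMER table comes from Listing 1, but the ORDERS and ORDERS_HISTORY tables, the ORDERS_PART_IDX index and the degree of 4 are only hypothetical examples; note also that parallel DML has to be enabled at the session level before the hint takes effect.

-- check the current pool configuration
SELECT name, value FROM v$parameter
WHERE  name IN ('parallel_min_servers', 'parallel_max_servers');

-- parallel query on a join, referencing the table aliases in the hint
SELECT /*+ parallel(c,4) parallel(o,4) */ c.customer_name, o.order_date
FROM   customer c JOIN orders o ON o.customer_id = c.customer_id;

-- parallel DML must be enabled for the session
ALTER SESSION ENABLE PARALLEL DML;
INSERT /*+ parallel(t,4) */ INTO orders_history t
SELECT /*+ parallel(o,4) */ * FROM orders o;
COMMIT;

-- parallel index range scan on a (hypothetical) partitioned index
SELECT /*+ parallel_index(o, orders_part_idx, 4) */ *
FROM   orders o
WHERE  order_date >= DATE '2014-01-01';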
Summary

In this article, I have shown the key concepts, principles and use cases for parallelism in the Oracle database, as well as the architecture and implementation behind this technique. Finally, I have described some examples of applying parallelism. Now you can apply this technique in your own database.
Wiki Page: Parallelism in the Oracle Database
Wiki Page: TUNING VIEW'S PERFORMANCE
TUNING VIEWS' PERFORMANCE Author JP Vijaykumar Date November 22 2014 /* I was reviewing, a poorly performing report in one of our datawarehouse dbs. This report is selecting all data from a view and performing some aggregations. The view's definition is : All the column_names & table_names were altered/masked. */ select text from dba_views where view_name='ORIGINAL_COL_VW'; --PROBLEMATIC/ORIGINAL VIEW DEFINITION(VIEW 01): --------------------------------------------------------------------------------------------- SELECT "TBL1_ID", "COL2", "COL3", "COL4", "COL5", "COL6", "COL7", "COL8", "COL9", "COL10", "COL11", "COL12", "COL13", "COL14", "COL15", "COL16", "COL17", "COL18", "COL19", "COL20", "COL21", "COL22", "COL23", "COL24", "COL25" FROM (SELECT PREF.TBL1_ID, PREF.COL2, PREF.COL3, PREF.COL4, (SELECT P.SHORT_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 901 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL5, (SELECT P.LONG_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 901 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL6, (SELECT P.SHORT_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 902 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL7, (SELECT P.LONG_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 902 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL8, (SELECT P.SHORT_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 903 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL9, (SELECT P.LONG_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 903 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL10, (SELECT P.SHORT_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 904 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL11, (SELECT P.LONG_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 904 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL12, (SELECT P.SHORT_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 905 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL13, (SELECT P.LONG_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 905 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL14, (SELECT P.SHORT_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 906 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL15, (SELECT P.LONG_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 906 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL16, (SELECT P.SHORT_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 907 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL17, (SELECT P.LONG_NM FROM 
SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 907 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL18, (SELECT P.SHORT_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 908 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL19, (SELECT P.LONG_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 908 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL20, (SELECT P.SHORT_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 73 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL21, (SELECT P.LONG_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 909 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL22, (SELECT P.LONG_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 912 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL23, (SELECT P.LONG_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 910 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL24, (SELECT P.LONG_NM FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.PREDICATE_ID = 911 AND PP.TBL4_ID = P.TBL4_ID AND P.COL3 = PREF.COL3 AND PGA.GROUP_ID = PREF.COL2) COL25 FROM SCOTT.TBL1 PREF ) --Tested the performance on the original view. set autot on stat timing on echo on feedback on linesize 120 pagesize 0 --EXECUTION PLAN DETAILS WERE OMITTED. --PLS NOTE THE ELAPSED TIME & VALUE FOR "consistent gets" FOR EXECUTING A QUERY AGAINST (ORIGINAL) VIEW 01. SQL select * from ORIGINAL_COL_VW; ... ... ... 1710238 rows selected. Elapsed: 00:27:42.44 -- Statistics ---------------------------------------------------------- 546 recursive calls 0 db block gets 196754124 consistent gets -- 10335 physical reads 0 redo size 128500878 bytes sent via SQL*Net to client 1254688 bytes received via SQL*Net from client 114017 SQL*Net roundtrips to/from client 46 sorts (memory) 0 sorts (disk) 1710238 rows processed /* The sql query against the above view is consuming high resources and performing poorly. Re-written the view, with the following construct. The PP.PREDICATE_ID column is having multiple values. For different values of PP.PREDICATE_ID, the view should fetch all the available values into separate aliased columns. As such, the below view is not fetching the expected values. 
*/ CREATE OR REPLACE VIEW MODIFIED_COL_VW_JP1 as SELECT "TBL1_ID", "COL2", "COL3", "COL4", "COL5", "COL6", "COL7", "COL8", "COL9", "COL10", "COL11", "COL12", "COL13", "COL14", "COL15", "COL16", "COL17", "COL18", "COL19", "COL20", "COL21", "COL22", "COL23", "COL24", "COL25" FROM (SELECT PREF.TBL1_ID, PREF.COL2, PREF.COL3, PREF.COL4, case when PP.PREDICATE_ID = 901 then P.SHORT_NM else '' end COL5, case when PP.PREDICATE_ID = 901 then P.LONG_NM else '' end COL6, case when PP.PREDICATE_ID = 902 then P.SHORT_NM else '' end COL7, case when PP.PREDICATE_ID = 902 then P.LONG_NM else '' end COL8, case when PP.PREDICATE_ID = 903 then P.SHORT_NM else '' end COL9, case when PP.PREDICATE_ID = 903 then P.LONG_NM else '' end COL10, case when PP.PREDICATE_ID = 904 then P.SHORT_NM else '' end COL11, case when PP.PREDICATE_ID = 904 then P.LONG_NM else '' end COL12, case when PP.PREDICATE_ID = 905 then P.SHORT_NM else '' end COL13, case when PP.PREDICATE_ID = 905 then P.LONG_NM else '' end COL14, case when PP.PREDICATE_ID = 906 then P.SHORT_NM else '' end COL15, case when PP.PREDICATE_ID = 906 then P.LONG_NM else '' end COL16, case when PP.PREDICATE_ID = 907 then P.SHORT_NM else '' end COL17, case when PP.PREDICATE_ID = 907 then P.LONG_NM else '' end COL18, case when PP.PREDICATE_ID = 908 then P.SHORT_NM else '' end COL19, case when PP.PREDICATE_ID = 908 then P.LONG_NM else '' end COL20, case when PP.PREDICATE_ID = 73 then P.SHORT_NM else '' end COL21, case when PP.PREDICATE_ID = 909 then P.LONG_NM else '' end COL22, case when PP.PREDICATE_ID = 912 then P.LONG_NM else '' end COL23, case when PP.PREDICATE_ID = 910 then P.LONG_NM else '' end COL24, case when PP.PREDICATE_ID = 911 then P.LONG_NM else '' end COL25 FROM SCOTT.TBL1 PREF, SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P where PGA.TBL3_ID = PP.TBL3_ID AND PP.TBL4_ID = P.TBL4_ID and p.COL3 = PREF.COL3 AND pga.GROUP_ID = PREF.COL2 ); /* Modified the logic and re-written the veiw with the following constuct. 
*/ (VIEW 02) CREATE OR REPLACE VIEW MODIFIED_COL_VW_JP2 AS with t as (SELECT P.SHORT_NM,P.LONG_NM,P.COL3,PGA.GROUP_ID,PP.PREDICATE_ID FROM SCOTT.TBL2 PGA,SCOTT.TBL3 PP,SCOTT.TBL4 P WHERE PGA.TBL3_ID = PP.TBL3_ID AND PP.TBL4_ID = P.TBL4_ID order by PREDICATE_ID,COL3,GROUP_ID) select "TBL1_ID", "COL2", "COL3", "COL4", a901.SHORT_NM COL5, a901.LONG_NM COL6, a902.SHORT_NM COL7, a902.LONG_NM COL8, a903.SHORT_NM COL9, a903.LONG_NM COL10, a904.SHORT_NM COL11, a904.LONG_NM COL12, a905.SHORT_NM COL13, a905.LONG_NM COL14, a906.SHORT_NM COL15, a906.LONG_NM COL16, a907.SHORT_NM COL17, a907.LONG_NM COL18, a908.SHORT_NM COL19, a908.LONG_NM COL20, a73.SHORT_NM COL21, --a909.SHORT_NM, a909.LONG_NM COL22, a912.LONG_NM COL23, --a910.SHORT_NM, a910.LONG_NM COL24, --a911.SHORT_NM, a911.LONG_NM COL25 --a912.SHORT_NM, --a73.LONG_NM from SCOTT.TBL1 PREF, (SELECT t.SHORT_NM,t.LONG_NM,t.COL3,t.GROUP_ID FROM t WHERE t.PREDICATE_ID = 901) a901, (SELECT t.SHORT_NM,t.LONG_NM,t.COL3,t.GROUP_ID FROM t WHERE t.PREDICATE_ID = 902) a902, (SELECT t.SHORT_NM,t.LONG_NM,t.COL3,t.GROUP_ID FROM t WHERE t.PREDICATE_ID = 903) a903, (SELECT t.SHORT_NM,t.LONG_NM,t.COL3,t.GROUP_ID FROM t WHERE t.PREDICATE_ID = 904) a904, (SELECT t.SHORT_NM,t.LONG_NM,t.COL3,t.GROUP_ID FROM t WHERE t.PREDICATE_ID = 905) a905, (SELECT t.SHORT_NM,t.LONG_NM,t.COL3,t.GROUP_ID FROM t WHERE t.PREDICATE_ID = 906) a906, (SELECT t.SHORT_NM,t.LONG_NM,t.COL3,t.GROUP_ID FROM t WHERE t.PREDICATE_ID = 907) a907, (SELECT t.SHORT_NM,t.LONG_NM,t.COL3,t.GROUP_ID FROM t WHERE t.PREDICATE_ID = 908) a908, (SELECT t.SHORT_NM,t.LONG_NM,t.COL3,t.GROUP_ID FROM t WHERE t.PREDICATE_ID = 909) a909, (SELECT t.SHORT_NM,t.LONG_NM,t.COL3,t.GROUP_ID FROM t WHERE t.PREDICATE_ID = 910) a910, (SELECT t.SHORT_NM,t.LONG_NM,t.COL3,t.GROUP_ID FROM t WHERE t.PREDICATE_ID = 911) a911, (SELECT t.SHORT_NM,t.LONG_NM,t.COL3,t.GROUP_ID FROM t WHERE t.PREDICATE_ID = 912) a912, (SELECT t.SHORT_NM,t.LONG_NM,t.COL3,t.GROUP_ID FROM t WHERE t.PREDICATE_ID = 73) a73 where PREF.COL3 = a901.COL3(+) and PREF.COL2 = a901.GROUP_ID(+) and PREF.COL3 = a902.COL3(+) and PREF.COL2 = a902.GROUP_ID(+) and PREF.COL3 = a903.COL3(+) and PREF.COL2 = a903.GROUP_ID(+) and PREF.COL3 = a904.COL3(+) and PREF.COL2 = a904.GROUP_ID(+) and PREF.COL3 = a905.COL3(+) and PREF.COL2 = a905.GROUP_ID(+) and PREF.COL3 = a906.COL3(+) and PREF.COL2 = a906.GROUP_ID(+) and PREF.COL3 = a907.COL3(+) and PREF.COL2 = a907.GROUP_ID(+) and PREF.COL3 = a908.COL3(+) and PREF.COL2 = a908.GROUP_ID(+) and PREF.COL3 = a909.COL3(+) and PREF.COL2 = a909.GROUP_ID(+) and PREF.COL3 = a910.COL3(+) and PREF.COL2 = a910.GROUP_ID(+) and PREF.COL3 = a911.COL3(+) and PREF.COL2 = a911.GROUP_ID(+) and PREF.COL3 = a912.COL3(+) and PREF.COL2 = a912.GROUP_ID(+) and PREF.COL3 = a73.COL3(+) and PREF.COL2 = a73.GROUP_ID(+) set autot on stat timing on echo on feedback on linesize 120 pagesize 0 --EXECUTION PLAN DETAILS WERE OMITTED. --PLS NOTE ELAPSED TIME & VALUE FOR "consistent gets" FOR EXECUTING A QUERY AGAINST (RE-WRITTEN) VIEW 02. SQL select * from MODIFIED_COL_VW_JP2; ... ... ... 1710238 rows selected. 
Elapsed: 00:10:31.55 -- Statistics ---------------------------------------------------------- 564 recursive calls 789 db block gets 134042 consistent gets -- 11051 physical reads 884 redo size 128500878 bytes sent via SQL*Net to client 1254688 bytes received via SQL*Net from client 114017 SQL*Net roundtrips to/from client 48 sorts (memory) 0 sorts (disk) 1710238 rows processed The modified view was completing in 10:30 minutes as compared to 27:40 minutes for the original view. The modified view was submitted for further testing/validation by the application team.
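As an aside on the intermediate attempt above: MODIFIED_COL_VW_JP1 joins all four tables directly, so each TBL1 row fans out into one row per matching predicate row and the CASE columns land on different rows, which is why it did not fetch the expected values. A common alternative to the outer-join rewrite, shown here only as a hedged sketch that was not part of the original test, is to aggregate the CASE expressions so the rows collapse back to one per TBL1 row (unlike VIEW 02's outer joins, this inner join would drop TBL1 rows with no matching predicate rows):

-- Hypothetical alternative, not from the original article
CREATE OR REPLACE VIEW MODIFIED_COL_VW_JP3 AS
SELECT PREF.TBL1_ID,
       PREF.COL2,
       PREF.COL3,
       PREF.COL4,
       MAX(CASE WHEN PP.PREDICATE_ID = 901 THEN P.SHORT_NM END) COL5,
       MAX(CASE WHEN PP.PREDICATE_ID = 901 THEN P.LONG_NM  END) COL6,
       MAX(CASE WHEN PP.PREDICATE_ID = 902 THEN P.SHORT_NM END) COL7,
       MAX(CASE WHEN PP.PREDICATE_ID = 902 THEN P.LONG_NM  END) COL8
       -- ... remaining PREDICATE_ID values follow the same pattern ...
FROM   SCOTT.TBL1 PREF,
       SCOTT.TBL2 PGA,
       SCOTT.TBL3 PP,
       SCOTT.TBL4 P
WHERE  PGA.TBL3_ID  = PP.TBL3_ID
AND    PP.TBL4_ID   = P.TBL4_ID
AND    P.COL3       = PREF.COL3
AND    PGA.GROUP_ID = PREF.COL2
GROUP BY PREF.TBL1_ID, PREF.COL2, PREF.COL3, PREF.COL4;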
Blog Post: Restore database after fatal user error
Even though Oracle enhances the restore and fault-tolerance capabilities with every release, there are still fatal errors which lead to a major outage of the database. Those errors are mainly produced by human beings – sorry guys, but that's how it is. In this blog I'll show you how to restore the database to a specific SCN after a major incident. Actually, there is a true story behind this: some years ago I had a customer whose production database was named "TEST". Even though I told them to rename the database to something meaningful, the DBA was totally convinced that this was not a problem... As time went by he had a development database where he wanted to test a new version of the application, so he dropped the user... You can probably guess what happened: he dropped the application user on the "TEST" database. The entire production line came to a full stop – no wonder – and the DBA became a little nervous. He was shocked and called me to help restore his database.

Find out when the problem occurred

My first question was: "When did you execute the drop?" – silence. Actually, I think he did not even know what day it was, so his answer was: "about one hour ago – or so". Not a real help if you need to restore a database to a very specific point in time, because even though production had stopped, we didn't want to lose too much data. So I used Toad and the LogMiner wizard to find out when he executed the fatal script. To give you an impression of how to look for the last DML and the first DDL (the drop), I inserted one row in the customer table and checked for that time:

Picture 1: Insert Customer data

As you can see, the time of the insert was 16:31:58 on November 27th, 2014. And this is how the fatal SQL occurred:

SQL> DROP USER demoeng CASCADE;

User dropped

Now we can start the LogMiner wizard and give it some estimations of where to look for the data: Menu → Database → Diagnostic → LogMiner

Picture 2: Dictionary select in LogMiner

Even though I'm not on the server (the database JOHANN is running on a Linux box), I'm able to directly use the Online Data Dictionary and check for the most recent redo log files.

Picture 3: find files for LogMiner session

Picture 4: select Online redologs

At this point it doesn't matter whether you use the online redo logs or the archived ones, as long as they are still available. That's why it makes sense to keep the archived redo logs on disk as long as you can. With the next step the content of the redo logs is analyzed. Unfortunately, the timestamp given back on the screen is wrong; the only value that matters is the SCN, where the high number indicates that the SCN is invalid. So I'm reading until the end of the log files. I changed the "From" and "To" date fields to the approximate time window I was looking for, and I eliminated uncommitted data.

Picture 5: narrow date for LogMiner session

Before I press the green triangle to execute the LogMiner query, I add some more columns to the list because they might help me find the right SCN. Those columns are "Segment Owner", "Segment Name" and "Operation".

Picture 6: select additional columns

So let's do the analysis:

Picture 7: query operations of DEMOENG

Because I dropped the schema "DEMOENG", I filter the output to show only that schema owner. I had expected to see the "INSERT INTO CUSTOMER ..." statement first, but actually only the DROP statements occurring right after the INSERT were shown. The reason is simple: because that user had been dropped, there is no longer a relationship between the tables and the owner (except for DDLs).
So a new filter on the SCN gave me an indication of the DML statements occurring right before that fatal DROP.

Picture 8: get scn before fatal drop

As you can see and verify with the first screenshot, the INSERT INTO CUSTOMER still exists (highlighted), but there is no longer any meaningful data, as there are no corresponding column or object names. That doesn't matter, though. Looking at the details, I can see that I probably logged in before the fatal error, at approximately 16:35:00. So I can recover until that time or SCN.

Restoring the database

In a real-life environment I would have opened the database in restricted mode to prevent any other application from continuing to work, and I would have taken a second backup just to make sure I had some more tries left in case I miscalculated the SCN. In my example I now shut down the instance and use RMAN to restore the database.

RMAN> SHUTDOWN IMMEDIATE
RMAN> STARTUP MOUNT
RMAN> RESTORE DATABASE UNTIL SCN 859045;
RMAN> RECOVER DATABASE UNTIL SCN 859045;
RMAN> ALTER DATABASE OPEN RESETLOGS;

A successful login to the schema "DEMOENG" shows that the recovery worked, and a query for custid=200000 lists the row I inserted right before the DROP USER.

Conclusion

The LogMiner wizard is easy to use if you are able to read the content of V$LOGMNR_CONTENTS (that's the view behind the result set). It doesn't matter that we are on Oracle Standard Edition, as in this case, and have to restore the database using RMAN; if you are on Enterprise Edition you might want to use flashback database instead, but the behavior is similar. During my tests I realized that the INSERT command for the single row I added did not show up in the LogMiner session at first; I had to enable supplemental logging. So I would suggest it makes sense to enable supplemental logging even if you don't use replication. In case of a fatal user error it will probably help identify the correct SCN.
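If you don't have Toad at hand, a roughly equivalent LogMiner session can be run manually from SQL*Plus with the DBMS_LOGMNR package. This is only a hedged sketch: the redo log file name is a placeholder you would adapt, and the online catalog is used as the dictionary as in the wizard above.

-- supplemental logging is needed for LogMiner to resolve row-level DML
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;

-- register one or more redo/archived log files (file name is a placeholder)
BEGIN
  DBMS_LOGMNR.ADD_LOGFILE(
    logfilename => '/u01/app/oracle/oradata/JOHANN/redo01.log',
    options     => DBMS_LOGMNR.NEW);
END;
/

-- start the LogMiner session using the online data dictionary,
-- showing committed data only
BEGIN
  DBMS_LOGMNR.START_LOGMNR(
    options => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG +
               DBMS_LOGMNR.COMMITTED_DATA_ONLY);
END;
/

-- look for the last DML and the first DDL around the incident
SELECT scn, timestamp, operation, seg_owner, seg_name, sql_redo
FROM   v$logmnr_contents
WHERE  seg_owner = 'DEMOENG' OR operation = 'DDL';

-- end the session when done
EXEC DBMS_LOGMNR.END_LOGMNR;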
Blog Post: RMOUG Training Days 2015!
OK, Oracle EM hat off, RMOUG Training Days hat on! As many of you know, the RMOUG Board of Directors made a smart move after I joined Oracle: instead of losing a valuable member of the board, they moved me to being a non-voting board member emeritus and realized that I could still serve as the Training Days Conference Director. The conference is by far the most demanding position on the board, and it's a role that I relish and have the skills for. Oracle is happy. RMOUG is happy. Membership is happy. Training Days is taken care of... This year I'm taking it up a notch and I wanted to talk about why RMOUG Training Days 2015 is the one conference you DON'T want to miss!

Project O.W.L.

The OWL is not just our mascot; it stands for Oracle Without Limits, and Project O.W.L. is a new event at Training Days that will offer the attendee some great new opportunities to learn, to interact with those in the industry and to immerse in the technology we love. The event will be centered behind our great exhibition area and will have the following:

RAC Attack
RAC Attack will be back this year and better than ever! Learn all the ins and outs of Oracle RAC by building one on your laptop! Experts will be on hand from the ACE and Oracle community to help you with your questions and make you a RAC Attack ninja!

Clone Attack
Delphix is bringing Clone Attack for those who want to find out how quickly you can provision environments! Find out how much space and time savings can be reached, and do it all on a VM on your laptop!

Oracle Engineered System and Hardware Demo
Want to get up close and personal with some great Oracle hardware? You'll get the chance at Project O.W.L. Oracle is going to be bringing some of the newest, coolest appliances and engineered systems so you can find out just how cool it really is!

Stump the Chump
Have a real technical conundrum? Want to see if you have the tech question that our experts can't answer? We'll have opportunities to ask the experts your tough questions, and if they can't get you an answer, you'll get an "I stumped the chump" button to wear proudly at the conference!

New Attendee Recommendation Initiative
I was first introduced to Training Days by the recommendation of a Senior DBA I worked with back in 2004. There is nothing more valuable than word of mouth, and we are going to reward that at TD2015. If you recommend someone new to Training Days and they list your name on their registration form as the one who recommended them, we'll reward you with a $25 Amazon gift card after the conference!

Special Interest Meetup Lunch
Our SIGs are an important part of our membership. Show your support the first day by taking your box lunch and sitting in on one of the SIG meetups! There are so many special interest groups to be a part of, so find out what you've been missing out on! Hyperion, Big Data and Cloud, Enterprise Manager, Higher Education, Database 12c, APEX and others!

ACE Lunches
We're bringing back our ACE lunches both days again! If you aren't in on a SIG lunch, sit with your favorite ACE, ACE Associate or ACE Director and find out what got them where they are in the Oracle community. Talk tech with the best in the industry!

Deep Dive and Hands on Lab
For those with a full registration pass (we have single-day passes for those that can't get away for the full conference...), the first half day is our gift to you!
From 1-5:15pm on the 17th, you will get to immerse yourself in hands-on labs and deep dives from the best of the best in Oracle database, development (ADF and APEX) and even one of my favorites, Enterprise Manager Database as a Service! These are first-come, first-served sessions that day (until we hit the room capacity limit; trust me, you do not want to tick off the fire marshal at the convention center... :)), so get there early and get the most out of your full registration!

Two Full Days, 100 Sessions!
Yes, you heard me right: two days, 100 sessions, 9 tracks! We have Steven Feuerstein, Jeff Smith, Iggy Fernandez, Kyle Hailey, David Peake, Scott Spendolini, Alex Gorbachev, Graham Wood, Bryn Llewellyn, Carlos Sierra and John King. New speakers this year for RMOUG (we work very hard to introduce new speakers into our schedule...) include Bjoern Rost, Rene Antunez, Werner De Gruyter (Yoda!) and Wayne Van Sluys.

Professional Development and More!
I will be heading up our ever-popular Women in Technology series on the first full day again this year! The panel is starting to form, and I look forward to everyone who attends getting the most out of the session and furthering their love of their tech career. This session is not just for women, but for fathers of daughters, husbands who want more for their wives, managers of women employees and even those that are hoping to hire more diversity in their departments! Jeff Smith and I will be doing another year of Social Media for the Database Professional! Come learn HOW to do social media instead of just the WHY. We'll teach you how to automate, find the easy button for social media and how to find your social media style to make it work for you and your career. The schedule is set up so that there is something impressive all day every day for the 2015 conference, and I'm excited to share it with everyone! Registration is open, so don't miss out on what is going to be the best Training Days yet!
Blog Post: DBMS_RANDOM using the Normal function to return numbers in a pattern on either side of a supplied value
Hi,

Last week I showed you my favorite random number generator: DBMS_RANDOM.VALUE. This week's submission expands on DBMS_RANDOM but uses the NORMAL function to return numbers in a pattern on either side of a supplied value. With no options, DBMS_RANDOM.NORMAL returns numbers on either side of 0. Let's build on this. Multiply the returned value by 10,000 and truncate off the decimal positions, and you get whole numbers centered on 0 with a spread (standard deviation) of about 10,000 in either direction. OK... you want larger random numbers, to use for salaries perhaps. Add in 50,000 (or whatever number you would like the returned values to be near) as the center point for the test data. This will give you positive numbers, and most of them will cluster around the 50,000 center point (see the sketch below). I hope you find this information useful in your day-to-day use of the Oracle RDBMS.

Dan Hotka
Oracle ACE Director
Instructor/Author/CEO
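A minimal sketch of the construct described above; the 10,000 multiplier and the 50,000 center point are just the example values from the post, so adjust them to your own data:

-- normally distributed "salaries": centered on 50,000, spread of about 10,000
SELECT TRUNC(DBMS_RANDOM.NORMAL * 10000) + 50000 AS salary
FROM   dual;

-- the same expression can feed test data directly, e.g. 20 values at once
SELECT TRUNC(DBMS_RANDOM.NORMAL * 10000) + 50000 AS salary
FROM   dual
CONNECT BY LEVEL <= 20;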
Blog Post: Using Random Number Generator in the Oracle Database
Hi!

I gotta tell you, I like this random number generator that appeared a while back in the Oracle database: DBMS_RANDOM. I use it to select the winners when I do book draws at user groups. I just type the query in and run it, and it returns a random number between the low and high values supplied. Interesting... some people have gotten used to my drawings this way and will pick a number! I just make sure the min and max cover the list of names, and if the number generator comes up with a number not assigned to a name... I simply run it again. It's a function, so I just 'select ... from dual', running the function once. It returns up to 6 decimal places, so ROUND does a fair job of not returning 2 identical values. When using the TRUNC function, sometimes I get the same value twice... it's the decimal places that differ... You can use this to generate random numbers for test data too: just plug it into an insert statement or a loader script, and away you go. OK... that's cool, and you can probably use a modulo math function (for you geeks... email me with the syntax!), modify this and run it through PL/SQL for 1 thru 52 for a deck of cards... make a PL/SQL solitaire game perhaps... Say you want 10 random rows from a table... maybe you are making test data. I'll use my EMP table (I just love EMP and DEPT...). Run it again and you get 10 different rows (see the sketch below). Next week I'll show you some other practical uses for Oracle's random number generators. I'll show you a random method to return values near a fixed value amount! Better for random dollar amounts for test data.

Dan Hotka
Oracle ACE Director
Instructor/Author/CEO
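A minimal sketch of the two uses described above, assuming the classic SCOTT.EMP demo table; the 1 to 100 range is only an example:

-- pick a winner: random whole number between a low and high value
SELECT ROUND(DBMS_RANDOM.VALUE(1, 100)) AS winner
FROM   dual;

-- 10 random rows from EMP: shuffle the rows, then keep the first 10
SELECT *
FROM  (SELECT * FROM emp ORDER BY DBMS_RANDOM.VALUE)
WHERE  rownum <= 10;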
Blog Post: Enterprise Manager 12c- Management Packs and Licensing Information
I'm often asked what management pack is used by what feature, and there is actually a very easy way to find this information in the EM12c console. Let's say we are in ASH Analytics and want to view what management packs are required to use this feature: click on the global Setup menu in the upper right, then click on Management Packs and Packs for this Page. You will quickly see a pop-up page that shows you what management pack(s) are required to use the ASH Analytics feature. If we jumped to SQL Monitor and did the same process of clicking on Setup, Management Packs, Packs for this Page, we'd see that SQL Monitor requires the Database Tuning Pack. You can click on host monitoring and this process will confirm for you that no packs are required, unlike the previous two features. This will show you the management packs required for any of the features and can save you a lot of time if you just quickly want to verify (I know I often mix up the Tuning and Diagnostics packs, and some features require both!). It's always better to be safe than sorry. License info can be very daunting as well, so having some explanation of which licenses cover which features (at a high level) is helpful. You can access information on EM12c licensing by going again to Setup, Management Packs, and then clicking on License Information. Each license required, along with a high-level description, is included in this section. It also identifies the acronym/abbreviation used for each pack license, so when it's time to renew, here's a bit more information to help you decide what you need to support your environment!
Wiki Page: Recovering Table in Non-Container Database and Pluggable Database (PDB) in Container Database (CDB)
Table Recovery in Pluggable Database/Non-Container Database: With Oracle Database 12c, Recover Manager (RMAN) enables you to recover one or more tables or your table partitions to a specified point in time without effecting the remaining database objects either for Pluggable database and Non-Container Database. This feature will enabled through Recover Manager (RMAN), reduces time and disk space compared to earlier Oracle Database versions. In earlier Oracle database versions, recovering table will involve the functionality of recovering the entire tablespace containing the table in a separate disk location and export the desired table and import into the original database location (originally dropped location) Creating Table in Pluggable Database (PDB) in Container Database (CDB) [oracle@12cdb ~]$ sqlplus /nolog SQL conn scott/scott@12cdb:1521/pdb1 Connected. SQL create table case (name varchar2(10)); Table created. SQL insert into case values('ORACLE'); 1 row created. SQL insert into case values('MYSQL'); 1 row created. SQL commit; Commit complete. Check Current SCN from Pluggable Database (PDB) SQL conn sys/oracle@12cdb:1521/pdb1 as sysdba Connected. SQL select current_scn from v$database; CURRENT_SCN -------------------- 2326559 Check Current SCN from Container Database (CDB) SQL select current_scn,name from v$database; CURRENT_SCN NAME -------------------- ------- 2326563 CDB1 SQL exit Login into ‘Recover Manager (RMAN)’ prompt. Once connected, we have to make sure that when we backup the container database, the control file will be backed up as well. Therefore, issue the following command: [oracle@12cdb ~]$ rman target / connected to target database: CDB1 (DBID=838297997) ü RMAN configure controlfile autobackup on; ü RMAN backup as compressed backupset pluggable database pdb1; Starting backup at 28-NOV-14 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=69 device type=DISK channel ORA_DISK_1: starting compressed full datafile backup set channel ORA_DISK_1: specifying datafile(s) in backup set input datafile file number=00021 name=/u01/app/oracle/oradata/cdb2/pdb_plug_move/example01.dbf input datafile file number=00019 name=/u01/app/oracle/oradata/cdb2/pdb_plug_move/sysaux01.dbf input datafile file number=00018 name=/u01/app/oracle/oradata/cdb2/pdb_plug_move/system01.dbf input datafile file number=00037 name=/u01/app/oracle/oradata/cdb1/pdb1/scott1.dbf input datafile file number=00020 name=/u01/app/oracle/oradata/cdb2/pdb_plug_move/SAMPLE_SCHEMA_users01.dbf channel ORA_DISK_1: starting piece 1 at 28-NOV-14 channel ORA_DISK_1: finished piece 1 at 28-NOV-14 piece handle=/u01/app/oracle/fast_recovery_area/CDB1/08D1916DA8ED7561E0536538A8C05650/backupset/2014_11_28/o1_mf_nnndf_TAG20141128T105623_b7j1w040_.bkp tag=TAG20141128T105623 comment=NONE channel ORA_DISK_1: backup set complete, elapsed time: 00:00:55 Finished backup at 28-NOV-14 Starting Control File and SPFILE Autobackup at 28-NOV-14 piece handle=/u01/app/oracle/fast_recovery_area/CDB1/autobackup/2014_11_28/o1_mf_s_864817039_b7j1xqky_.bkp comment=NONE Finished Control File and SPFILE Autobackup at 28-NOV-14 RMAN list backup of pluggable database 'pdb1'; List of Backup Sets ============== BS Key Type LV Size Device Type Elapsed Time Completion Time ------- ---- -- ---------- ----------- ------------ --------------- 31 Full 171.67M DISK 00:00:58 27-NOV-14 BP Key: 31 Status: AVAILABLE Compressed: YES Tag: TAG20141127T165157 Piece Name: 
/u01/app/oracle/fast_recovery_area/CDB1/08D1916DA8ED7561E0536538A8C05650/backupset/2014_ ……Output Truncated…………. RMAN exit Recovery Manager complete. Dropping Table from Pluggable Database (PDB) [oracle@12cdb ~]$ sqlplus /nolog SQL conn scott/scott@12cdb:1521/pdb1 Connected. SQL drop table scott.case purge; Table dropped. SQL exit [oracle@12cdb ~]$ rman target / connected to target database: CDB1 (DBID=838297997) Recovering Table for Pluggable Database (PDB1) We will now recover the dropped table to the SCN displayed right before the table drop, using the UNTIL SCN clause. If we didn’t know this SCN, we could’ve recovered the table using the UNTIL TIME clause. The recovery will be made by using an auxiliary destination, under '/u01/app/oracle/stage'. ü RMAN recover table scott.case of pluggable database pdb1 until scn 2326559 auxiliary destination '/u01/app/oracle/stage'; Note: In Case of you want to recover table from Non-Container Database check the syntax below ü RMAN RECOVER TABLE "USER1".USER_TABLE UNTIL SCN 2116818 AUXILIARY DESTINATION '/u01/app/oracle/orcl_backup'; Once the table recovery begins, RMAN creates an automatic instance. Starting recover at 28-NOV-14 using channel ORA_DISK_1 RMAN-05026: WARNING: presuming following set of tablespaces applies to specified Point-in-Time List of tablespaces expected to have UNDO segments Tablespace SYSTEM Tablespace UNDOTBS1 Creating automatic instance, with SID='AFlC' initialization parameters used for automatic instance: db_name=CDB1 db_unique_name=AFlC_pitr_pdb1_CDB1 compatible=12.1.0.2.0 db_block_size=8192 db_files=200 diagnostic_dest=/u01/app/oracle _system_trig_enabled=FALSE sga_target=2560M processes=200 db_create_file_dest=/u01/app/oracle/stage log_archive_dest_1='location=/u01/app/oracle/stage' enable_pluggable_database=true _clone_one_pdb_recovery=true #No auxiliary parameter file used Once the instance is started, RMAN finds the appropriate backup pieces and loads them into this instance. 
starting up automatic instance CDB1 Oracle instance started Total System Global Area 2684354560 bytes Fixed Size 2928008 bytes Variable Size 587203192 bytes Database Buffers 2080374784 bytes Redo Buffers 13848576 bytes Automatic instance created contents of Memory Script: { # set requested point in time set until scn 2326559; # restore the controlfile restore clone controlfile; # mount the controlfile sql clone 'alter database mount clone database'; # archive current online log sql 'alter system archive log current'; } executing Memory Script executing command: SET until clause Starting restore at 28-NOV-14 allocated channel: ORA_AUX_DISK_1 channel ORA_AUX_DISK_1: SID=22 device type=DISK channel ORA_AUX_DISK_1: starting datafile backup set restore channel ORA_AUX_DISK_1: restoring control file channel ORA_AUX_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/CDB1/autobackup/2014_11_28/o1_mf_s_864816848_b7j1qrq6_.bkp channel ORA_AUX_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/CDB1/autobackup/2014_11_28/o1_mf_s_864816848_b7j1qrq6_.bkp tag=TAG20141128T105408 channel ORA_AUX_DISK_1: restored backup piece 1 channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:01 output file name=/u01/app/oracle/stage/CDB1/controlfile/o1_mf_b7j31vrk_.ctl Finished restore at 28-NOV-14 sql statement: alter database mount clone database sql statement: alter system archive log current contents of Memory Script: { # set requested point in time set until scn 2326559; # set destinations for recovery set and auxiliary set datafiles set newname for clone datafile 1 to new; set newname for clone datafile 4 to new; set newname for clone datafile 3 to new; set newname for clone datafile 18 to new; set newname for clone datafile 19 to new; set newname for clone tempfile 1 to new; set newname for clone tempfile 3 to new; # switch all tempfiles switch clone tempfile all; # restore the tablespaces in the recovery set and the auxiliary set restore clone datafile 1, 4, 3, 18, 19; switch clone datafile all; } executing Memory Script executing command: SET until clause executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME renamed tempfile 1 to /u01/app/oracle/stage/CDB1/datafile/o1_mf_temp_%u_.tmp in control file renamed tempfile 3 to /u01/app/oracle/stage/CDB1/datafile/o1_mf_temp_%u_.tmp in control file RMAN proceeds to restore and recover objects into the auxiliary instance. 
Starting restore at 28-NOV-14 using channel ORA_AUX_DISK_1 channel ORA_AUX_DISK_1: starting datafile backup set restore channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set channel ORA_AUX_DISK_1: restoring datafile 00001 to /u01/app/oracle/stage/CDB1/datafile/o1_mf_system_%u_.dbf channel ORA_AUX_DISK_1: restoring datafile 00004 to /u01/app/oracle/stage/CDB1/datafile/o1_mf_undotbs1_%u_.dbf channel ORA_AUX_DISK_1: restoring datafile 00003 to /u01/app/oracle/stage/CDB1/datafile/o1_mf_sysaux_%u_.dbf channel ORA_AUX_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/CDB1/backupset/2014_11_27/o1_mf_nnndf_TAG20141127T180141_b7g6fg17_.bkp channel ORA_AUX_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/CDB1/backupset/2014_11_27/o1_mf_nnndf_TAG20141127T180141_b7g6fg17_.bkp tag=TAG20141127T180141 channel ORA_AUX_DISK_1: restored backup piece 1 channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:02:05 channel ORA_AUX_DISK_1: starting datafile backup set restore channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set channel ORA_AUX_DISK_1: restoring datafile 00018 to /u01/app/oracle/stage/CDB1/datafile/o1_mf_system_%u_.dbf channel ORA_AUX_DISK_1: restoring datafile 00019 to /u01/app/oracle/stage/CDB1/datafile/o1_mf_sysaux_%u_.dbf channel ORA_AUX_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/CDB1/08D1916DA8ED7561E0536538A8C05650/backupset/2014_11_28/o1_mf_nnndf_TAG20141128T105303_b7j1oqc9_.bkp channel ORA_AUX_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/CDB1/08D1916DA8ED7561E0536538A8C05650/backupset/2014_11_28/o1_mf_nnndf_TAG20141128T105303_b7j1oqc9_.bkp tag=TAG20141128T105303 channel ORA_AUX_DISK_1: restored backup piece 1 channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:45 Finished restore at 28-NOV-14 datafile 1 switched to datafile copy input datafile copy RECID=12 STAMP=864818372 file name=/u01/app/oracle/stage/CDB1/datafile/o1_mf_system_b7j3228c_.dbf datafile 4 switched to datafile copy input datafile copy RECID=13 STAMP=864818372 file name=/u01/app/oracle/stage/CDB1/datafile/o1_mf_undotbs1_b7j322bt_.dbf datafile 3 switched to datafile copy input datafile copy RECID=14 STAMP=864818372 file name=/u01/app/oracle/stage/CDB1/datafile/o1_mf_sysaux_b7j322c3_.dbf datafile 18 switched to datafile copy input datafile copy RECID=15 STAMP=864818372 file name=/u01/app/oracle/stage/CDB1/datafile/o1_mf_system_b7j35zfx_.dbf datafile 19 switched to datafile copy input datafile copy RECID=16 STAMP=864818372 file name=/u01/app/oracle/stage/CDB1/datafile/o1_mf_sysaux_b7j35zfg_.dbf contents of Memory Script: { # set requested point in time set until scn 2326559; # online the datafiles restored or switched sql clone "alter database datafile 1 online"; sql clone "alter database datafile 4 online"; sql clone "alter database datafile 3 online"; sql clone 'PDB1' "alter database datafile 18 online"; sql clone 'PDB1' "alter database datafile 19 online"; # recover and open database read only recover clone database tablespace "SYSTEM", "UNDOTBS1", "SYSAUX", "PDB1":"SYSTEM", "PDB1":"SYSAUX"; sql clone 'alter database open read only'; } executing Memory Script executing command: SET until clause sql statement: alter database datafile 1 online sql statement: alter database datafile 4 online sql statement: alter database datafile 3 online sql statement: alter database datafile 18 online sql statement: alter database datafile 19 online Starting recover at 28-NOV-14 using channel ORA_AUX_DISK_1 starting media 
recovery archived log for thread 1 with sequence 3 is already on disk as file /u01/app/oracle/fast_recovery_area/CDB1/archivelog/2014_11_27/o1_mf_1_3_b7g6ofl8_.arc archived log for thread 1 with sequence 4 is already on disk as file /u01/app/oracle/fast_recovery_area/CDB1/archivelog/2014_11_27/o1_mf_1_4_b7gnsgkh_.arc archived log for thread 1 with sequence 5 is already on disk as file /u01/app/oracle/fast_recovery_area/CDB1/archivelog/2014_11_28/o1_mf_1_5_b7gx6nrv_.arc archived log for thread 1 with sequence 6 is already on disk as file /u01/app/oracle/fast_recovery_area/CDB1/archivelog/2014_11_28/o1_mf_1_6_b7j0cj1f_.arc archived log for thread 1 with sequence 7 is already on disk as file /u01/app/oracle/fast_recovery_area/CDB1/archivelog/2014_11_28/o1_mf_1_7_b7j2yxg8_.arc archived log file name=/u01/app/oracle/fast_recovery_area/CDB1/archivelog/2014_11_27/o1_mf_1_3_b7g6ofl8_.arc thread=1 sequence=3 archived log file name=/u01/app/oracle/fast_recovery_area/CDB1/archivelog/2014_11_27/o1_mf_1_4_b7gnsgkh_.arc thread=1 sequence=4 archived log file name=/u01/app/oracle/fast_recovery_area/CDB1/archivelog/2014_11_28/o1_mf_1_5_b7gx6nrv_.arc thread=1 sequence=5 archived log file name=/u01/app/oracle/fast_recovery_area/CDB1/archivelog/2014_11_28/o1_mf_1_6_b7j0cj1f_.arc thread=1 sequence=6 archived log file name=/u01/app/oracle/fast_recovery_area/CDB1/archivelog/2014_11_28/o1_mf_1_7_b7j2yxg8_.arc thread=1 sequence=7 media recovery complete, elapsed time: 00:00:26 Finished recover at 28-NOV-14 sql statement: alter database open read only contents of Memory Script: { sql clone 'alter pluggable database PDB1 open read only'; } executing Memory Script sql statement: alter pluggable database PDB1 open read only contents of Memory Script: { sql clone "create spfile from memory"; shutdown clone immediate; startup clone nomount; sql clone "alter system set control_files = ''/u01/app/oracle/stage/CDB1/controlfile/o1_mf_b7j31vrk_.ctl'' comment= ''RMAN set'' scope=spfile"; shutdown clone immediate; startup clone nomount; # mount database sql clone 'alter database mount clone database'; } executing Memory Script sql statement: create spfile from memory database closed database dismounted Oracle instance shut down connected to auxiliary database (not started) Oracle instance started Total System Global Area 2684354560 bytes Fixed Size 2928008 bytes Variable Size 603980408 bytes Database Buffers 2063597568 bytes Redo Buffers 13848576 bytes sql statement: alter system set control_files = ''/u01/app/oracle/stage/CDB1/controlfile/o1_mf_b7j31vrk_.ctl'' comment= ''RMAN set'' scope=spfile Oracle instance shut down connected to auxiliary database (not started) Oracle instance started Total System Global Area 2684354560 bytes Fixed Size 2928008 bytes Variable Size 603980408 bytes Database Buffers 2063597568 bytes Redo Buffers 13848576 bytes sql statement: alter database mount clone database contents of Memory Script: { # set requested point in time set until scn 2326559; # set destinations for recovery set and auxiliary set datafiles set newname for datafile 37 to new; # restore the tablespaces in the recovery set and the auxiliary set restore clone datafile 37; switch clone datafile all; } executing Memory Script executing command: SET until clause executing command: SET NEWNAME Starting restore at 28-NOV-14 allocated channel: ORA_AUX_DISK_1 channel ORA_AUX_DISK_1: SID=22 device type=DISK channel ORA_AUX_DISK_1: starting datafile backup set restore channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set 
channel ORA_AUX_DISK_1: restoring datafile 00037 to /u01/app/oracle/stage/AFLC_PITR_PDB1_CDB1/datafile/o1_mf_scott_%u_.dbf channel ORA_AUX_DISK_1: reading from backup piece /u01/app/oracle/fast_recovery_area/CDB1/08D1916DA8ED7561E0536538A8C05650/backupset/2014_11_28/o1_mf_nnndf_TAG20141128T105303_b7j1oqc9_.bkp channel ORA_AUX_DISK_1: piece handle=/u01/app/oracle/fast_recovery_area/CDB1/08D1916DA8ED7561E0536538A8C05650/backupset/2014_11_28/o1_mf_nnndf_TAG20141128T105303_b7j1oqc9_.bkp tag=TAG20141128T105303 channel ORA_AUX_DISK_1: restored backup piece 1 channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:01 Finished restore at 28-NOV-14 datafile 37 switched to datafile copy input datafile copy RECID=18 STAMP=864818463 file name=/u01/app/oracle/stage/AFLC_PITR_PDB1_CDB1/datafile/o1_mf_scott_b7j3b6gp_.dbf contents of Memory Script: { # set requested point in time set until scn 2326559; # online the datafiles restored or switched sql clone 'PDB1' "alter database datafile 37 online"; # recover and open resetlogs recover clone database tablespace "PDB1":"scott", "SYSTEM", "UNDOTBS1", "SYSAUX", "PDB1":"SYSTEM", "PDB1":"SYSAUX" delete archivelog; alter clone database open resetlogs; } executing Memory Script executing command: SET until clause sql statement: alter database datafile 37 online Starting recover at 28-NOV-14 using channel ORA_AUX_DISK_1 starting media recovery archived log for thread 1 with sequence 7 is already on disk as file /u01/app/oracle/fast_recovery_area/CDB1/archivelog/2014_11_28/o1_mf_1_7_b7j2yxg8_.arc archived log file name=/u01/app/oracle/fast_recovery_area/CDB1/archivelog/2014_11_28/o1_mf_1_7_b7j2yxg8_.arc thread=1 sequence=7 media recovery complete, elapsed time: 00:00:00 Finished recover at 28-NOV-14 database opened RMAN proceeds to perform an export of the table that needs to be recovered and the appropriate metadata. This export will result in a dump file. contents of Memory Script: { sql clone 'alter pluggable database PDB1 open'; } executing Memory Script sql statement: alter pluggable database PDB1 open contents of Memory Script: { # create directory for datapump import sql 'PDB1' "create or replace directory TSPITR_DIROBJ_DPDIR as '' /u01/app/oracle/stage''"; # create directory for datapump export sql clone 'PDB1' "create or replace directory TSPITR_DIROBJ_DPDIR as '' /u01/app/oracle/stage''"; } executing Memory Script sql statement: create or replace directory TSPITR_DIROBJ_DPDIR as ''/u01/app/oracle/stage'' sql statement: create or replace directory TSPITR_DIROBJ_DPDIR as ''/u01/app/oracle/stage'' Performing export of tables... EXPDP Starting "SYS"."TSPITR_EXP_AFlC_nBEi": EXPDP Estimate in progress using BLOCKS method... EXPDP Processing object type TABLE_EXPORT/TABLE/TABLE_DATA EXPDP Total estimation using BLOCKS method: 64 KB EXPDP Processing object type TABLE_EXPORT/TABLE/TABLE EXPDP Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS EXPDP Processing object type TABLE_EXPORT/TABLE/STATISTICS/MARKER EXPDP . . 
exported "scott"."CASE" 5.070 KB 2 rows EXPDP Master table "SYS"."TSPITR_EXP_AFlC_nBEi" successfully loaded/unloaded EXPDP ****************************************************************************** EXPDP Dump file set for SYS.TSPITR_EXP_AFlC_nBEi is: EXPDP /u01/app/oracle/stage/tspitr_AFlC_71419.dmp Export completed contents of Memory Script: { # shutdown clone before import shutdown clone abort } executing Memory Script Oracle instance shut down The exported dump file, containing the recovered table, is then imported into the target database. Once the recovery processed is finished, we should disconnect from RMAN. Performing import of tables... IMPDP Master table "SYS"."TSPITR_IMP_AFlC_lfoq" successfully loaded/unloaded IMPDP Starting "SYS"."TSPITR_IMP_AFlC_lfoq": IMPDP Processing object type TABLE_EXPORT/TABLE/TABLE IMPDP Processing object type TABLE_EXPORT/TABLE/TABLE_DATA IMPDP . . imported "scott"."CASE" 5.070 KB 2 rows IMPDP Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS IMPDP Processing object type TABLE_EXPORT/TABLE/STATISTICS/MARKER IMPDP Job "SYS"."TSPITR_IMP_AFlC_lfoq" successfully completed at Fri Nov 28 11:22:30 2014 elapsed 0 00:00:05 Import completed Removing automatic instance Automatic instance removed auxiliary instance file /u01/app/oracle/stage/CDB1/datafile/o1_mf_temp_b7j38m84_.tmp deleted auxiliary instance file /u01/app/oracle/stage/CDB1/datafile/o1_mf_temp_b7j38c16_.tmp deleted auxiliary instance file /u01/app/oracle/stage/AFLC_PITR_PDB1_CDB1/onlinelog/o1_mf_3_b7j3bb0j_.log deleted auxiliary instance file /u01/app/oracle/stage/AFLC_PITR_PDB1_CDB1/onlinelog/o1_mf_2_b7j3b9gg_.log deleted auxiliary instance file /u01/app/oracle/stage/AFLC_PITR_PDB1_CDB1/onlinelog/o1_mf_1_b7j3b8p6_.log deleted auxiliary instance file /u01/app/oracle/stage/AFLC_PITR_PDB1_CDB1/datafile/o1_mf_scott_b7j3b6gp_.dbf deleted auxiliary instance file /u01/app/oracle/stage/CDB1/datafile/o1_mf_sysaux_b7j35zfg_.dbf deleted auxiliary instance file /u01/app/oracle/stage/CDB1/datafile/o1_mf_system_b7j35zfx_.dbf deleted auxiliary instance file /u01/app/oracle/stage/CDB1/datafile/o1_mf_sysaux_b7j322c3_.dbf deleted auxiliary instance file /u01/app/oracle/stage/CDB1/datafile/o1_mf_undotbs1_b7j322bt_.dbf deleted auxiliary instance file /u01/app/oracle/stage/CDB1/datafile/o1_mf_system_b7j3228c_.dbf deleted auxiliary instance file /u01/app/oracle/stage/CDB1/controlfile/o1_mf_b7j31vrk_.ctl deleted auxiliary instance file tspitr_AFlC_71419.dmp deleted Finished recover at 28-NOV-14 RMAN exit Recovery Manager complete. We will now connect with sqlplus to the Pluggable database (pdb1), with the SCOTT user. Once connected to SQL Plus, we will issue the following command to see if the table was successfully recovered. SQL conn scott/scott@12cdb:1521/pdb1 Connected. SQL select * from case; NAME ---------- ORACLE MYSQL Conclusion: Table has recovered seamlessly for Pluggable Database (PDB) using the option ‘Recover Table” from Recovery Manager (RMAN).
Wiki Page: Loading Excel Spreadsheet Data into Oracle Database 11g
Loading Excel Spreadsheet Data into Oracle Database 11g A commonly asked question is “How to load Excel Spreadsheet data into Oracle Database?”. Oracle Loader for Hadoop does not support the Excel spreadsheet format directly. But, OLH does support an input format to which a Excel spreadsheet may be first saved, the oracle.hadoop.loader.examples.CSVInputFormat . The input format loads data from a comma-separated value (CSV) file. In this tutorial we shall store a Excel spreadsheet as a CSV file and subsequently load the CSV file into Oracle Database using OLH 3.0.0. This tutorial has the following sections. Setting the Environment Storing Excel Spreadsheet as CSV File Storing CSV File in HDFS Loading CSV File in Oracle Database Setting the Environment The following software is required for the tutorial. Oracle Database 11g Oracle Loader for Hadoop 3.0.0 CDH 4.6 Hadoop 2.0.0 Java 7 Microsoft Excel 2010 We have installed the software on Oracle Linux 6.5, which is a guest OS on Oracle Vrtual Box 4.3. First, create a Linux directory to install the software an set its permissions. mkdir /csv chmod -R 777 /csv cd /csv Download Java 7 from http://www.oracle.com/technetwork/java/javase/downloads/index.html and extract the file to the /csv directory. tar zxvf jdk-7u55-linux-i586.gz Download and install CDH 4.6 Hadoop 2.0.0 to the /csv directory also. wget http://archive.cloudera.com/cdh4/cdh/4/hadoop-2.0.0-cdh4.6.0.tar.gz tar -xvf hadoop-2.0.0-cdh4.6.0.tar.gz Create symlinks for the bin and conf directories. ln -s /csv/hadoop-2.0.0-cdh4.6.0/bin /csv/hadoop-2.0.0-cdh4.6.0/share/hadoop/mapreduce2/bin ln -s /csv/hadoop-2.0.0-cdh4.6.0/etc/hadoop /csv/hadoop-2.0.0-cdh4.6.0/share/hadoop/mapreduce2/conf Download Oracle Loader for Hadoop 3.0.0 from http://www.oracle.com/technetwork/database/database-technologies/bdc/big-data-connectors/downloads/index.html and extract the file to the /csv directory. unzip oraloader-3.0.0-h2.x86_64.zip Set the Hadoop configuration properties fs.defaultFS and hadoop.tmp.dir in the /csv/hadoop-2.0.0-cdh4.6.0/etc/hadoop/core-site.xml file. The fs.defaultFS property specifies the NameNode URI and the hadoop.tmp.dir property specifies the temporary directory used by a MapReduce job. ?xml-stylesheet type="text/xsl" href="configuration.xsl"? !-- Put site-specific property overrides in this file. -- configuration property name fs.defaultFS /name value hdfs://10.0.2.15:8020 /value /property property name hadoop.tmp.dir /name value file:///var/lib/hadoop-0.20/cache /value /property /configuration Remove any previously created tmp directory and create the tmp directory and set its permissions. rm -rf /var/lib/hadoop-0.20/cache mkdir -p /var/lib/hadoop-0.20/cache chmod -R 777 /var/lib/hadoop-0.20/cache Set the HDFS configuration properties dfs.permissions.superusergroup , dfs.namenode.name.dir , dfs.replication , and dfs.permissions in /csv/hadoop-2.0.0-cdh4.6.0/etc/hadoop/hdfs-site.xml file. The dfs.namenode.name.dir property specifies the NameNode storage directory and the dfs.replication property specifies the replication factor for storing file blocks. The other properties are permissions related. ?xml version="1.0" encoding="UTF-8"? ?xml-stylesheet type="text/xsl" href="configuration.xsl"? !-- Put site-specific property overrides in this file. 
-- configuration property name dfs.permissions.superusergroup /name value hadoop /value /property property name dfs.namenode.name.dir /name value file:///data/1/dfs/nn /value /property property name dfs.replication /name value 1 /value /property property name dfs.permissions /name value false /value /property /configuration Remove any previously created NameNode storage directory/ies and create the NameNode storage directory and set its permissions. rm -rf /data/1/dfs/nn mkdir -p /data/1/dfs/nn chmod -R 777 /data/1/dfs/nn Set the environment variables for Oracle Database, Java, Hadoop, and Oracle Loader for Hadoop in the bash shell. vi ~/.bashrc export HADOOP_PREFIX=/csv/hadoop-2.0.0-cdh4.6.0 export HADOOP_CONF=$HADOOP_PREFIX/etc/hadoop export OLH_HOME=/csv/oraloader-3.0.0-h2 export JAVA_HOME=/csv/jdk1.7.0_55 export ORACLE_HOME=/home/oracle/app/oracle/product/11.2.0/dbhome_1 export ORACLE_SID=ORCL export HADOOP_MAPRED_HOME=/csv/hadoop-2.0.0-cdh4.6.0/bin export HADOOP_HOME=/csv/hadoop-2.0.0-cdh4.6.0/share/hadoop/mapreduce2 export HADOOP_CLASSPATH=$HADOOP_HOME/*:$HADOOP_HOME/lib/*:$OLH_HOME/jlib/* export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_MAPRED_HOME:$ORACLE_HOME/bin export CLASSPATH=$HADOOP_CLASSPATH Create the Oracle Database table OE.WLSLOG to load the Excel spreadsheet data into. SQL CREATE TABLE OE.wlslog (time_stamp VARCHAR2(255), category VARCHAR2(255), type VARCHAR2(255), servername VARCHAR2(255), code VARCHAR2(255), msg VARCHAR2(255)); Storing Excel Spreadsheet as CSV File Create the Excel Spreadsheet that is to be loaded into Oracle Database. We have used the WebLogic Server log data as sample data. Select File Save As CSV (comma delimited) to store the Excel spreadsheet as a CSV file wlslog.csv . The wlslog.csv file has the following data. Apr-8-2014-7:06:16-PM-PDT,Notice,WebLogicServer,AdminServer,BEA-000365,Server state changed to STANDBY Apr-8-2014-7:06:17-PM-PDT,Notice,WebLogicServer,AdminServer,BEA-000365,Server state changed to STARTING Apr-8-2014-7:06:18-PM-PDT,Notice,WebLogicServer,AdminServer,BEA-000365,Server state changed to ADMIN Apr-8-2014-7:06:19-PM-PDT,Notice,WebLogicServer,AdminServer,BEA-000365,Server state changed to RESUMING Apr-8-2014-7:06:20-PM-PDT,Notice,WebLogicServer,AdminServer,BEA-000331,Started WebLogic AdminServer Apr-8-2014-7:06:21-PM-PDT,Notice,WebLogicServer,AdminServer,BEA-000365,Server state changed to RUNNING Apr-8-2014-7:06:22-PM-PDT,Notice,WebLogicServer,AdminServer,BEA-000360,Server started in RUNNING mode Storing CSV File in HDFS We need to store the CSV file wlslog.csv in HDFS before we may load the file data into Oracle Database. Format the Hadoop NameNode and start the HDFS (NameNode and DataNode). hadoop namenode -format hadoop namenode hadoop datanode Create a directory ( /wlslog ) in HDFS to store the CSV file and set its permissions. hdfs dfs -mkdir hdfs://localhost:8020/wlslog hadoop dfs -chmod -R g+w hdfs://localhost:8020/wlslog Put the wlslog.csv file into HDFS. hdfs dfs -put wlslog.csv hdfs://localhost:8020/wlslog We also need to add OLH to the runtime classpath in HDFS. Create a directory /csv in HDFS and set its permissions. hdfs dfs -mkdir hdfs://localhost:8020/csv hadoop dfs -chmod -R g+w hdfs://localhost:8020/csv Put the OLH directory into the /csv directory. hdfs dfs -put /csv/oraloader-3.0.0-h2 hdfs://localhost:8020/csv Loading CSV File in Oracle Database The oracle.hadoop.loader.examples.CSVInputFormat automatically assigns field names F0 , F1 , F2, .. Fn to the fields in a CSV file. 
The fields are of type String by default. A field “Fi” is loaded into Oracle Database only if either a COLUMN property is specified in configuration file for field “Fi” or if the target Oracle Database has a column named “Fi”. Two types of mapping is supported: automatic and manual. For automatic mapping the target Oracle Database table must have columns F0 , F1 , F2 , Fn corresponding to the input fields. If the target Oracle Database table has different column names than F0 , F1 , Fn , manual mapping must be used for which the oracle.hadoop.loader.loaderMap.columnNames must be set to the columns in the target database table. For column to field mapping the oracle.hadoop.loader.loaderMap.column_name.field property must be set for each column/field. Optionally the oracle.hadoop.loader.loaderMap.column_name.format property may be set to specify the data format for each column. Set the format class using the mapreduce.inputformat.class property in the configuration file. Set the target table using the oracle.hadoop.loader.loaderMap.targetTable property. Create a configuration file OraLoadConf.xml (the name is arbitrary). Set the following configuration properties in OraLoadConf.xml file. Property Value Description mapreduce.inputformat.class oracle.hadoop.loader.examples.CSVInputFormat The input format class. oracle.hadoop.loader.loaderMap.columnNames TIME_STAMP,CATEGORY,TYPE, SERVERNAME,CODE,MSG The Oracle Database column names. oracle.hadoop.loader.loaderMap.TIME_STAMP.field F0 The TIME_STAMP column is mapped to field F0. oracle.hadoop.loader.loaderMap.CATEGORY.field F1 The CATEGORY column is mapped to field F1. oracle.hadoop.loader.loaderMap.TYPE.field F2 The TYPE column is mapped to field F2. oracle.hadoop.loader.loaderMap.SERVERNAME.field F3 The SERVERNAME column is mapped to field F3. oracle.hadoop.loader.loaderMap.CODE.field F4 The CODE column is mapped to field F4. oracle.hadoop.loader.loaderMap.MSG.field F5 The MSG column is mapped to field F5. mapred.input.dir hdfs://localhost:8020/wlslog The input directory for the MapReduce job to load data from the CSV file. mapreduce.job.outputformat.class oracle.hadoop.loader.lib.output.JDBCOutputFormat The output format for Oracle Database. mapreduce.output.fileoutputformat.outputdir oraloadout The output directory. oracle.hadoop.loader.loaderMap.targetTable OE.WLSLOG The target Oracle Database table. oracle.hadoop.loader.connection.url jdbc:oracle:thin:@${HOST}:${TCPPORT}:${SID} The connection URL. TCPPORT 1521 The Oracle Database TCP port. HOST localhost The Oracle Database hostname. SID ORCL The Oracle Database SID. oracle.hadoop.loader.connection.user OE The Oracle Database user name. oracle.hadoop.loader.connection.password OE The Oracle Database password. The OraLoadConf.xml file is listed: ?xml version="1.0" encoding="UTF-8" ? 
configuration !-- Input settings -- property name mapreduce.inputformat.class /name value oracle.hadoop.loader.examples.CSVInputFormat /value /property property name oracle.hadoop.loader.loaderMap.columnNames /name value TIME_STAMP,CATEGORY,TYPE,SERVERNAME,CODE,MSG /value /property property name oracle.hadoop.loader.loaderMap.TIME_STAMP.field /name value F0 /value /property property name oracle.hadoop.loader.loaderMap.CATEGORY.field /name value F1 /value /property property name oracle.hadoop.loader.loaderMap.TYPE.field /name value F2 /value /property property name oracle.hadoop.loader.loaderMap.SERVERNAME.field /name value F3 /value /property property name oracle.hadoop.loader.loaderMap.CODE.field /name value F4 /value /property property name oracle.hadoop.loader.loaderMap.MSG.field /name value F5 /value /property property name mapred.input.dir /name value hdfs://localhost:8020/wlslog /value /property !-- Output settings -- property name mapreduce.job.outputformat.class /name value oracle.hadoop.loader.lib.output.JDBCOutputFormat /value /property property name mapreduce.output.fileoutputformat.outputdir /name value oraloadout /value /property !-- Table information -- property name oracle.hadoop.loader.loaderMap.targetTable /name value OE.WLSLOG /value /property !-- Connection information -- property name oracle.hadoop.loader.connection.url /name value jdbc:oracle:thin:@${HOST}:${TCPPORT}:${SID} /value /property property name TCPPORT /name value 1521 /value /property property name HOST /name value localhost /value /property property name SID /name value ORCL /value /property property name oracle.hadoop.loader.connection.user /name value OE /value /property property name oracle.hadoop.loader.connection.password /name value OE /value /property /configuration The oracle.hadoop.loader.examples.CSVInputFormat class is in the oraloader-examples.jar JAR. Include the oraloader-examples.jar with the –libjars option on the command line to run OLH. Run OLH with the following hadoop command using the configuration file specified with the –conf option. hadoop jar $OLH_HOME/jlib/oraloader.jar oracle.hadoop.loader.OraLoader -conf OraLoadConf.xml -libjars $OLH_HOME/jlib/oraloader-examples.jar The MapReduce job runs to load the CSV file data into Oracle Database table OE.WLSLOG . A detailed output from the OLH command is as follows. [root@localhost csv]# hadoop jar $OLH_HOME/jlib/oraloader.jar oracle.hadoop.loader.OraLoader -conf OraLoadConf.xml -libjars $OLH_HOME/jlib/oraloader-examples.jar Oracle Loader for Hadoop Release 3.0.0 - Production Copyright (c) 2011, 2014, Oracle and/or its affiliates. All rights reserved. 14/09/15 09:53:47 INFO loader.OraLoader: Oracle Loader for Hadoop Release 3.0.0 - Production Copyright (c) 2011, 2014, Oracle and/or its affiliates. All rights reserved. 14/09/15 09:53:47 INFO loader.OraLoader: Built-Against: hadoop-2.2.0-cdh5.0.0-beta-2 hive-0.12.0-cdh5.0.0-beta-2 avro-1.7.3 jackson-1.8.8 14/09/15 09:54:11 INFO loader.OraLoader: oracle.hadoop.loader.loadByPartition is disabled because table: WLSLOG is not partitioned 14/09/15 09:54:11 INFO loader.OraLoader: oracle.hadoop.loader.enableSorting disabled, no sorting key provided 14/09/15 09:54:11 INFO loader.OraLoader: Reduce tasks set to 0 because of no partitioning or sorting. Loading will be done in the map phase. 
14/09/15 09:54:11 INFO output.DBOutputFormat: Setting map tasks speculative execution to false for : oracle.hadoop.loader.lib.output.JDBCOutputFormat 14/09/15 09:54:17 INFO loader.OraLoader: Sampling time=0D:0h:0m:0s:905ms (905 ms) 14/09/15 09:54:17 INFO loader.OraLoader: Submitting OraLoader job OraLoader 14/09/15 09:54:18 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 14/09/15 09:54:20 INFO input.FileInputFormat: Total input paths to process : 1 14/09/15 09:54:20 INFO mapreduce.JobSubmitter: number of splits:1 14/09/15 09:54:21 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local100120898_0001 14/09/15 09:54:31 INFO mapreduce.Job: The url to track the job: http://localhost:8080/ 14/09/15 09:54:31 INFO mapred.LocalJobRunner: OutputCommitter set in config null 14/09/15 09:54:32 INFO mapred.LocalJobRunner: OutputCommitter is oracle.hadoop.loader.lib.output.DBOutputCommitter 14/09/15 09:54:32 INFO mapred.LocalJobRunner: Waiting for map tasks 14/09/15 09:54:32 INFO mapred.LocalJobRunner: Starting task: attempt_local100120898_0001_m_000000_0 14/09/15 09:54:33 INFO loader.OraLoader: map 0% reduce 0% 14/09/15 09:54:34 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/09/15 09:54:34 INFO mapred.MapTask: Processing split: hdfs://localhost:8020/wlslog/wlslog.csv:0+724 14/09/15 09:54:35 INFO output.DBOutputFormat: conf prop: defaultExecuteBatch: 100 14/09/15 09:54:35 INFO output.DBOutputFormat: conf prop: loadByPartition: false 14/09/15 09:54:37 INFO output.DBOutputFormat: Insert statement: INSERT INTO "OE"."WLSLOG" ("TIME_STAMP", "CATEGORY", "TYPE", "SERVERNAME", "CODE", "MSG") VALUES (?, ?, ?, ?, ?, ?) 14/09/15 09:54:37 INFO mapred.LocalJobRunner: 14/09/15 09:54:39 INFO mapred.Task: Task:attempt_local100120898_0001_m_000000_0 is done. And is in the process of committing 14/09/15 09:54:39 INFO mapred.LocalJobRunner: 14/09/15 09:54:39 INFO mapred.Task: Task attempt_local100120898_0001_m_000000_0 is allowed to commit now 14/09/15 09:54:39 INFO output.JDBCOutputFormat: Committed work for task attempt attempt_local100120898_0001_m_000000_0 14/09/15 09:54:39 INFO mapred.LocalJobRunner: map 14/09/15 09:54:39 INFO output.FileOutputCommitter: Saved output of task 'attempt_local100120898_0001_m_000000_0' to hdfs://10.0.2.15:8020/user/root/oraloadout/_temporary/0/task_local100120898_0001_m_000000 14/09/15 09:54:39 INFO mapred.LocalJobRunner: map 14/09/15 09:54:39 INFO mapred.Task: Task 'attempt_local100120898_0001_m_000000_0' done. 14/09/15 09:54:39 INFO mapred.LocalJobRunner: Finishing task: attempt_local100120898_0001_m_000000_0 14/09/15 09:54:39 INFO mapred.LocalJobRunner: Map task executor complete. 
14/09/15 09:54:40 INFO loader.OraLoader: map 100% reduce 0% 14/09/15 09:54:40 INFO loader.OraLoader: Job complete: OraLoader (job_local100120898_0001) 14/09/15 09:54:40 INFO loader.OraLoader: Counters: 23 File System Counters FILE: Number of bytes read=10411605 FILE: Number of bytes written=10725736 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=9777663 HDFS: Number of bytes written=9769986 HDFS: Number of read operations=238 HDFS: Number of large read operations=0 HDFS: Number of write operations=36 Map-Reduce Framework Map input records=7 Map output records=7 Input split bytes=104 Spilled Records=0 Failed Shuffles=0 Merged Map outputs=0 GC time elapsed (ms)=312 CPU time spent (ms)=0 Physical memory (bytes) snapshot=0 Virtual memory (bytes) snapshot=0 Total committed heap usage (bytes)=20082688 File Input Format Counters Bytes Read=724 File Output Format Counters Bytes Written=1624 [root@localhost csv]# Run a SELECT query in SQL*Plus to list the loaded data, as shown in the example below. The 7 rows in the CSV file are loaded as 7 rows in the Oracle Database table OE.WLSLOG . In this tutorial we loaded data from an Excel-spreadsheet-generated CSV file into an Oracle Database table.
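A verification query of the following general form (using columns of the OE.WLSLOG table created earlier; pick whichever columns you need) lists the seven loaded WebLogic log records:
SQL> SELECT time_stamp, code, msg FROM oe.wlslog;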
Blog Post: Create Histograms On Columns That Already Have One
The default value of METHOD_OPT from 10g onwards is ‘FOR ALL COLUMNS SIZE AUTO’. The definition of AUTO as per the Oracle documentation is : AUTO : Oracle determines the columns to collect histograms based on data distribution and the workload of the columns. This basically implies that Oracle will automatically create histograms on those columns which have skewed data distribution and are referenced by SQL statements. However, this gives rise to the problem that Oracle generates too many unnecessary histograms. Let’s demonstrate: – Create a table with skewed data distribution in two columns SQL drop table hr.skewed purge; create table hr.skewed ( empno number, job_id varchar2(10), salary number); insert into hr.skewed select employee_id, job_id, salary from hr.employees; – On gathering statistics for the table using default options, it can be seen that a histogram is not gathered on any column although the data distribution in columns JOB_ID and SALARY is skewed SQL exec dbms_stats.gather_table_stats('HR','SKEWED'); col table_name for a10 col column_name for a10 select TABLE_NAME,COLUMN_NAME,HISTOGRAM from dba_tab_columns where table_name = 'SKEWED'; TABLE_NAME COLUMN_NAM HISTOGRAM ---------- ---------- --------------- SKEWED SALARY NONE SKEWED JOB_ID NONE SKEWED EMPNO NONE – Let’s now issue some queries against the table based on its three columns, each followed by statistics gathering, to verify that histograms get automatically created only on columns with skewed data distribution. – No histogram gets created if column EMPNO is queried, which has data distributed uniformly SQL select * from hr.skewed where empno = 100 ; exec dbms_stats.gather_table_stats('HR','SKEWED'); col table_name for a10 col column_name for a10 select TABLE_NAME,COLUMN_NAME,HISTOGRAM from dba_tab_columns where table_name = 'SKEWED'; TABLE_NAME COLUMN_NAM HISTOGRAM ---------- ---------- --------------- SKEWED SALARY NONE SKEWED JOB_ID NONE SKEWED EMPNO NONE – A histogram gets created on the JOB_ID column as soon as we search for records with a given JOB_ID, as data distribution is non-uniform in the JOB_ID column SQL select * from hr.skewed where job_id = 'CLERK' ; exec dbms_stats.gather_table_stats('HR','SKEWED'); col table_name for a10 col column_name for a10 select TABLE_NAME,COLUMN_NAME,HISTOGRAM from dba_tab_columns where table_name = 'SKEWED'; TABLE_NAME COLUMN_NAM HISTOGRAM ---------- ---------- --------------- SKEWED SALARY NONE SKEWED JOB_ID FREQUENCY SKEWED EMPNO NONE – A histogram gets created on the SALARY column when a search is made for employees drawing a salary of more than 10000, as data distribution is non-uniform in the SALARY column. SQL select * from hr.skewed where salary > 10000 ; exec dbms_stats.gather_table_stats('HR','SKEWED'); col table_name for a10 col column_name for a10 select TABLE_NAME,COLUMN_NAME,HISTOGRAM from dba_tab_columns where table_name = 'SKEWED'; TABLE_NAME COLUMN_NAM HISTOGRAM ---------- ---------- --------------- SKEWED SALARY FREQUENCY SKEWED JOB_ID FREQUENCY SKEWED EMPNO NONE Thus gathering statistics using default options, manually or as part of the automatic maintenance task, might lead to the creation of histograms on all columns which have skewed data distribution and have been part of a search clause even once. That is, Oracle makes even the histograms you didn’t ask for.
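To get a quick picture of how many histograms have accumulated across a schema, a query against DBA_TAB_COL_STATISTICS such as the following (using the HR schema from the demonstration) lists every column that currently has one:
SQL> select table_name, column_name, histogram
     from dba_tab_col_statistics
     where owner = 'HR'
     and histogram <> 'NONE';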
Some of the histograms might not be needed by the application and hence are undesirable, as computing histograms is a resource-intensive operation and, moreover, they might degrade performance as a result of their interaction with bind peeking. Solution Employ the FOR ALL COLUMNS SIZE REPEAT option of the METHOD_OPT parameter, which prevents deletion of existing histograms and collects histograms only on the columns that already have histograms. The first step is to eliminate unwanted histograms and have histograms only on the desired columns. Well, there are two options: OPTION-I: Delete histograms from unwanted columns and use the REPEAT option henceforth, which collects histograms only on the columns that already have histograms. – Delete the unwanted histogram for the SALARY column SQL exec dbms_stats.gather_table_stats('HR','SKEWED', - METHOD_OPT => 'for columns salary size 1' ); -- Verify that the histogram for the salary column has been deleted col table_name for a10 col column_name for a10 select TABLE_NAME,COLUMN_NAME,HISTOGRAM from dba_tab_columns where table_name = 'SKEWED'; TABLE_NAME COLUMN_NAM HISTOGRAM ---------- ---------- --------------- SKEWED SALARY NONE SKEWED JOB_ID FREQUENCY SKEWED EMPNO NONE – Issue a SQL statement with the salary column in the where clause and verify that gathering stats using the repeat option retains the histogram on the JOB_ID column and does not cause a histogram to be created on the salary column. SQL select * from hr.skewed where salary > 10000 ; exec dbms_stats.gather_table_stats('HR','SKEWED',- METHOD_OPT => 'for columns salary size REPEAT '); col table_name for a10 col column_name for a10 select TABLE_NAME,COLUMN_NAME,HISTOGRAM from dba_tab_columns where table_name = 'SKEWED'; TABLE_NAME COLUMN_NAM HISTOGRAM ---------- ---------- --------------- SKEWED SALARY NONE SKEWED JOB_ID FREQUENCY SKEWED EMPNO NONE OPTION-II: Wipe out all histograms and manually add only the desired ones. Use the REPEAT option henceforth, which collects histograms only on the columns that already have one. – Delete histograms on all columns SQL exec dbms_stats.gather_table_stats('HR','SKEWED',- METHOD_OPT => 'for all columns size 1 '); – Verify that histograms on all columns have been dropped SQL col table_name for a10 col column_name for a10 select TABLE_NAME,COLUMN_NAME,HISTOGRAM from dba_tab_columns where table_name = 'SKEWED'; TABLE_NAME COLUMN_NAM HISTOGRAM ---------- ---------- --------------- SKEWED SALARY NONE SKEWED JOB_ID NONE SKEWED EMPNO NONE – Create a histogram only on the desired JOB_ID column SQL exec dbms_stats.gather_table_stats('HR','SKEWED',- METHOD_OPT => 'for columns JOB_ID size AUTO '); – Verify that the histogram has been created on JOB_ID SQL col table_name for a10 col column_name for a10 select TABLE_NAME,COLUMN_NAME,HISTOGRAM from dba_tab_columns where table_name = 'SKEWED'; TABLE_NAME COLUMN_NAM HISTOGRAM ---------- ---------- --------------- SKEWED SALARY NONE SKEWED JOB_ID FREQUENCY SKEWED EMPNO NONE - Verify that gathering stats using the repeat option creates a histogram only on the JOB_ID column, on which one already exists SQL exec dbms_stats.gather_table_stats('HR','SKEWED',- METHOD_OPT => 'for columns salary size REPEAT '); SQL col table_name for a10 col column_name for a10 select TABLE_NAME,COLUMN_NAME,HISTOGRAM from dba_tab_columns where table_name = 'SKEWED'; TABLE_NAME COLUMN_NAM HISTOGRAM ---------- ---------- --------------- SKEWED SALARY NONE SKEWED JOB_ID FREQUENCY SKEWED EMPNO NONE That is, now Oracle will no longer make histograms you didn’t ask for.
– Finally, change the preference for the METHOD_OPT parameter of the automatic stats gathering job from the default value of AUTO to REPEAT so that it will gather histograms only for the columns already having one. – Get the current value – SQL select dbms_stats.get_prefs ('METHOD_OPT') from dual; DBMS_STATS.GET_PREFS('METHOD_OPT') ----------------------------------------------------------------------- FOR ALL COLUMNS SIZE AUTO – Set the preference to REPEAT – SQL exec dbms_stats.set_global_prefs ('METHOD_OPT',' FOR ALL COLUMNS SIZE REPEAT '); – Verify – SQL select dbms_stats.get_prefs ('METHOD_OPT') from dual; DBMS_STATS.GET_PREFS('METHOD_OPT') ----------------------------------------------------------------------- FOR ALL COLUMNS SIZE REPEAT From now on, gathering statistics, manually or automatically, will not create any new histograms, while all existing ones are retained. I hope this post is useful. Happy reading…. References: https://blogs.oracle.com/optimizer/entry/how_does_the_method_opt http://www.pythian.com/blog/stabilize-oracle-10gs-bind-peeking-behaviour/ https://richardfoote.wordpress.com/2008/01/04/dbms_stats-method_opt-default-behaviour-changed-in-10g-be-careful/ Copyright © ORACLE IN ACTION, All Rights Reserved. 2014. The post Create Histograms On Columns That Already Have One appeared first on ORACLE IN ACTION.
Blog Post: The Real Secret of vi
The old and famous vi editor is like Diego Armando Maradona. On one side are its unconditional lovers; on the other, its bitter enemies. Some time ago, a vi evangelist told me: “If you don’t use vi correctly, it feels as if the keyboard were possessed. The real secret to a peaceful relationship with vi lies in a proper initiation process.” At first, this argument struck me as somewhat esoteric. But then I realized it is the plain truth. The great secret of vi is paying attention to what the manuals tell us in their very first pages. However, as in E. A. Poe’s story “The Purloined Letter”, the most obvious place is the one most easily overlooked. And what do the courses and manuals say at the very beginning? During an editing session with vi, we are always in one of the following three modes: Command mode Insert (input) mode Last-line (ex) mode Depending on the mode we are in, the keys perform different functions. If we are in command mode, the keys let us move through the file and position the cursor. If we are in insert mode, the characters we type are entered into the file. If we are in last-line mode, we can type various commands on the editor’s last line. How do we switch from one mode to another? The following state diagram gives the answer. During our initiation process, using vi demands an “extremely conscious” effort, in which we have to stay very alert and think carefully about which key to press. But as we keep using it, switching from one mode to another starts to happen more “unconsciously” and the use of vi “flows”. If you are in the camp of the bitter enemies, why not give vi another chance?
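Since the state diagram referenced above does not survive in this text version of the post, here is a minimal sketch of the standard transitions it would show (stock vi keys, nothing installation-specific):
command mode  --( i, I, a, A, o, O )-->  insert mode
insert mode   --( Esc )-->               command mode
command mode  --( : )-->                 last-line mode
last-line mode --( Enter, after the command runs )--> command mode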
Comment on 12cR1 Upgrade in Oracle Applications EBS R12.2
Nice piece of work, Appreciated!!!
Wiki Page: Object Types and Subtypes
This article teaches you how to use subtypes or subclasses. You can define an object type with or without dependencies. Object types can have two types of dependencies. The simplest case occurs when you define an object attribute with an object type instead of a data type. The more complex case occurs when you define an object subtype because it inherits the behavior of the base object type. The base object type is a superclass and a parent class. The subtype is a subclass and a child class. The ability to capture various result sets is a key use case for object types and subtypes. That’s because you can define a table’s column with the object type, and then you can store in that column the object type or any of its subtypes. A base object type should contain a unique identifier and an object name. The “Object Types & Bodies Basic” article explains the best practice for unique identifiers. It suggests that you populate the unique ID value with a no argument constructor function. The object name attribute should hold the object type name. I’d like to suggest we consider base_t as the name of our superclass. You can define a base_t object type like this: SQL CREATE OR REPLACE 2 TYPE base_t IS OBJECT 3 ( obj_id NUMBER 4 , obj_name VARCHAR2(30) 5 , CONSTRUCTOR FUNCTION base_t RETURN SELF AS RESULT 6 , MEMBER FUNCTION to_string RETURN VARCHAR2 ) 7 INSTANTIABLE NOT FINAL; 8 / Line 2 and 3 define two attributes. They are the unique identifier, or ID, and the object. The no argument constructor function assigns values to the obj_id and obj_name attributes. It assigns the base_t_s sequence value to the obj_id attribute and it assigns a string literal to the obj_name attribute. The to_string member function returns a concatenated string of the obj_id and obj_name values. The return value of the to_string function is what you want to disclose about the contents of an object type. Line 7 declares the class as instantiable and not final. You can create an instance of a class when its instantiable, and you can create subtypes of a type when it’s NOT FINAL. You need to create the base_t_s sequence before we can compile the base_t object body. The following statement creates the base_t_s sequence as a set of values starting at 1: SQL CREATE SEQUENCE base_t_s; The object body for the base_t object type is: SQL CREATE OR REPLACE 2 TYPE BODY base_t IS 3 4 /* Default constructor. */ 5 CONSTRUCTOR FUNCTION base_t RETURN SELF AS RESULT IS 6 BEGIN 7 /* Assign a sequence value and string literal 8 to the instance. */ 9 self.obj_id := base_t_s.NEXTVAL; 10 self.obj_name := 'BASE_T'; 11 RETURN; 12 END; 13 14 /* A text output function. */ 15 MEMBER FUNCTION to_string RETURN VARCHAR2 IS 16 BEGIN 17 RETURN 'UID#: ['||obj_id||']'||CHR(10) 18 || 'Type: ['||obj_name||']'; 19 END; 20 / Line 9 assigns a base_t_s sequence value to the obj_id attribute, which serves as a unique identifier. Line 10 assigns a string literal to the obj_name attribute. The obj_name attribute identifies the object type. Line 17 and 18 prints the contents of the base_t object type as a two-row string. 
You can test the construction of the base_t object type with this query: SQL SELECT base_t() FROM dual; It displays: BASE_T()(OBJ_ID, OBJ_NAME) ---------------------------- BASE_T(1, 'BASE_T') Alternatively, you can test the to_string member function with the TREAT function, like: SQL SELECT TREAT(base_t() AS base_t).to_string() AS "Text" 2 FROM dual It displays: Text ---------------- UID#: [2] Type: [BASE_T] Alternatively, you can test to_string member function with an anonymous block (by enabling SERVEROUTPUT): SQL SET SERVEROUTPUT ON SIZE UNLIMITED SQL BEGIN 2 dbms_output.put_line(base_t().to_string); 3 END; 4 / The anonymous block displays: UID#: [3] Type: [BASE_T] There’s another way to query the object instance with a query. While I don’t think it’s effective for this situation, you should know how the syntax works. It requires that you create a collection of the base_t object type, which you can do with this syntax: SQL CREATE OR REPLACE 2 TYPE base_t_tab IS TABLE OF base_t; 3 / You can query the base_t object type from inside a collection by using the CAST and COLLECT functions. The COLLECT function puts a single object instance into a base_t_tab collection. The CAST function puts the generic collection into a specific collection. The syntax to perform this operation is: SQL COLUMN obj_id FORMAT 9999 SQL COLUMN obj_name FORMAT A20 SQL SELECT * 2 FROM TABLE(SELECT CAST(COLLECT(base_t()) as base_t_tab) 3 FROM dual); Assuming the base_t_s sequence holds a current value of 3, the query returns: OBJ_ID OBJ_NAME ------ -------------------- 5 BASE_T This type of query isn’t too useful in day-to-day programming. It’s more of a corner use case for testing an object type with a sequence value. While you expect an obj_id value of 4, the query returns a value of 5. Somewhere in the execution Oracle appears to call the sequence twice. The COLLECT and TREAT functions increment the value of sequence when you put them inside object types. So, you shouldn’t use a sequence as a unique identifier inside an object type. I plan to cover the better approach in subsequent article. Now that you have a solid base_t object, let’s create a hobbit_t subtype. The hobbit_t subtype adds one attribute to the two attributes in the base_t object type. The following declares the hobbit_t object type as a subtype and overrides the to_string member function: SQL CREATE OR REPLACE 2 TYPE hobbit_t UNDER base_t 3 ( hobbit_name VARCHAR2(30) 4 , CONSTRUCTOR FUNCTION hobbit_t 5 ( hobbit_name VARCHAR2 ) RETURN SELF AS RESULT 6 , OVERRIDING MEMBER FUNCTION to_string RETURN VARCHAR2) 7 INSTANTIABLE NOT FINAL; 8 / Line 2 declares the hobbit_t subtype as UNDER the base_t object type. There isn’t a no argument constructor that mirrors the parent base_t object type. You also can’t call the parent type’s constructor like they do in Java. Line 4 and 5 declare a single argument constructor. The hobbit_t object type’s constructor assigns values to the obj_id and obj_name attributes. More or less it performs the same function as its parent’s constructor. Then, the constructor assigns the parameter value to the hobbit_name attribute of the hobbit_t object type. Line 6 declares an overriding to_string member function. The overriding to_string member function replaces the behavior of our parent class. It provides the subclass with its own a specialized behavior. You implement the hobbit_t object type like this: SQL CREATE OR REPLACE 2 TYPE BODY hobbit_t IS 3 4 /* One argument constructor. 
*/ 5 CONSTRUCTOR FUNCTION hobbit_t 6 ( hobbit_name VARCHAR2 ) RETURN SELF AS RESULT IS 7 BEGIN 8 /* Assign a sequence value and string literal 9 to the instance. */ 10 self.obj_id := base_t_s.NEXTVAL; 11 self.obj_name := 'HOBBIT_T'; 12 13 /* Assign a parameter to the subtype only attribute. */ 14 self.hobbit_name := hobbit_name; 15 RETURN; 16 END; 17 18 /* An output function. */ 19 OVERRIDING MEMBER FUNCTION to_string RETURN VARCHAR2 IS 20 BEGIN 21 RETURN (self AS base_t) .to_string||CHR(10) 22 || 'Name: ['||hobbit_name||']'; 23 END; 24 END; 25 / Lines 10 assigns a sequence value to the obj_id attribute. Line 11 assigns a string literal to the obj_name attribute. Line 14 assigns the parameter value of the constructor to the hobbit_name attribute of the hobbit_t subtype. Line 21 is more complex than a simple assignment. Line 21 contains a “generalized invocation” of the base_t object. A generalized invocation calls a parent or super class method. PL/SQL member functions or procedures are methods. Line 21 calls the base_t type’s to_string function. This way, the overriding to_string function returns a specialized result. It returns the result from the parent class and the value of its own hobbit_name attribute. You can test the generalized invocation with the following query: SQL SELECT 2 TREAT( 3 hobbit_t('Bilbo') AS hobbit_t).to_string() AS "Text" 4 FROM dual; The query prints: Text ----------------------- UID#: [1] Type: [HOBBIT_T] Name: [Bilbo] Together we’ve explored of how you create types and subtypes. You’ve learned a type is a generalization or superclass, and a subtype is a specialization or subclass. You’ve also learned how to create both a generalization and specialization. At this point, you may ask, “Why should I bother with subtypes?” The benefit of subtypes is dynamic dispatch. Dynamic dispatch is the process of selecting an object type from an inverted tree of object types. The topmost object type is the root node or most generalized version of an object type. The bottommost object type is a leaf node or the most specialized version of an object type. All nodes between the root node and leaf nodes are simply nodes. Nodes become more specialized as you step down the hierarchy from the root node. The process of selecting an object type from an inverted tree is polymorphism. Polymorphism means your program specifies the most general node at compile time. Then, the program accepts the root node or any subordinate nodes at runtime. Moreover, dynamic dispatch is like writing a function or procedure to do many things. Another form of dynamic dispatch occurs when you overload a function or procedure in a PL/SQL package. Calls to overloaded functions or procedure choose which version to run based on the data types of the call parameters. The key difference between overloading and selecting object types is simple. The first deals with choosing between different data types or object types. The second deals with choosing between object types in the same node tree. You have two choices to demonstrate dynamic dispatch. One would use a SQL table or varray collection and the other would use column substitutability. Creating a table that uses substitutability seems the easiest approach. The following creates a table of the base_t object type: SQL CREATE TABLE dynamic 2 ( character_type BASE_T); You can now insert a base_t object type or any of the base_t subtypes. 
The base_t_s sequence is reset for the test case INSERT statements: SQL INSERT INTO dynamic VALUES (base_t()); SQL INSERT INTO dynamic VALUES (hobbit_t('Bilbo Baggins')); SQL INSERT INTO dynamic VALUES (hobbit_t('Peregrin Took')); The following query uses a CASE statement to identify whether the column returns a base_t or hobbit_t object type: SQL SELECT 2 CASE 3 WHEN TREAT(character_type as hobbit_t) IS NOT NULL THEN 4 TREAT(character_type AS hobbit_t).to_string() 5 ELSE 6 TREAT(character_type AS base_t).to_string() 7 END AS "Text" 8 FROM dynamic; The query returns the following: Text ----------------------- UID#: [3] Type: [BASE_T] UID#: Type: [HOBBIT_T] Name: [Bilbo Baggins] UID#: [13] Type: [HOBBIT_T] Name: [Peregrin Took] The result set shows you that the character_type column holds different types of the base_t object type. It should also show you how you may store different result logs from DML row level triggers in a single table. Another article, I hope to write soon. The unique identifier appears to increment three times with the first INSERT statement and five times with subsequent inserts. Actually, each INSERT statement increments the sequence five times. A debug statement would show you that it assigns the third call to the .NEXTVAL pseudo column value to the obj_id value. This is true for both the base_t and hobbit_t object type, and any other derived subtypes. This article has shown you how to implement object types and subtypes. It also has explained how dynamic dispatch works and it provides a working example of dynamic dispatch leveraging column substitutability.
Blog Post: Database Professional's Introduction to Records Management
Virtually every medium to large organization has Information Management policies and procedures in place that are very important to its core business activities. Now, because all (or almost all) of your information is stored in the database - chances are that you, as a database professional, will be called upon to help support and implement these policies and procedures or manage related data. As always, combining business and technical knowledge will make you more valuable to your organization, up your scores with decision makers and take your career to the next level. That is why this straight-to-the point, quick introduction may give you a rocket boost on your current or your next project. It’s also a quick read. Below you’ll find the key concepts and principles of Records Management - that will help you speak the same language and understand what your Information Management and Records People are talking about. As I said, almost every organization has policies and procedures in place - to ensure compliance with regulations and to make certain that key information is retained, kept accessible and that it is safely preserved or destroyed after a specified period of time. This is all there is to it! Let’s start by reviewing the main reasons why most organizations do implement Records Management in the first place and what results should you expect out of a successful implementation. Why Organizations use Record Management? Essentially, there are three main reasons: 1. Business needs. Here’re the most popular a. Flag outdated content for deletion or archival, b. Synchronize physical (paper, CDs microfiche and so on) and electronic (scanned PDF etc.) versions of content c. Keep track of the physical files, folders and boxes floating around your offices. d. Ensure the content is retained for the period that it should be – regardless of the source or the system that stores it. e. Share content across organization more effectively 2. Compliance . Ensure regulatory compliance (SOX, SEC, industry regulations) by reliably storing the content for required periods of time and consistently following the processing rules, such as destruction, review or archival – when this period is over. 3. Litigation support . Systematically disposing of certain types of documents, when you’re no longer required to keep them may reduce your organization’s risk of being sued. You may also choose to freeze some important content, related to legal action you’re about to take – so it doesn’t get “accidentally” updated or destroyed by a disgruntled employee. Where Records Management Software fits in Records Management Software can only be successful if it can ensure that documents are preserved for just as long as they need to stick around - and not longer - no matter where they actually reside. Same policies needs to apply to vendor invoices in your SharePoint, email archive and historic ones stored in a file share. To accomplish that, your Records Management system acts as an overarching layer on top of your other enterprise systems. This way it allows you to apply records management policies and practices on external and remote repositories - file systems, other CMS (content management system’s) repositories and email archives, as well as your non-records content. This is usually accomplished through adapters that allow your Records Management tool to interact with your other repositories. 
For example, Oracle WebCenter Records comes with adapters for File System, ECM Documentum, IBM FileNet, MS SharePoint and others, allowing you manage retention of all of your enterprise’s content. Understanding the concepts and terminology Here’s the light speed introduction to just the essential Records Management terms you’ll need. Retention Retention is the act of keeping information for specified periods of time, depending on the type of content or depending on applicable laws or regulations. The most common types of retention are time-based (i.e. keeping the content for 10 years from the date of filing), event-based (i.e. employee terminated) and a combination of the two. Disposition After retention period is over, authorized people within the organization take appropriate action to dispose of the content. Please note that most Records Management tools do not simply delete content when it no longer need to be retained. Disposition does not always mean deletion. Yes, content may need to be physically or electronically destroyed after a period of retention, but it can also be stored internally in an archive location or transferred to external storage facility. Disposition is there for defined as a set of disposition instructions that define when and how the content will be disposed of. Disposition instructions are based on the time periods and triggers – item status changes or simple act of previous disposition action completing. Record Content Record content, as opposed to regular managed content, is managed on a retention schedule . It is being marked as record on filing, retained according to retention rules and then goes into disposition. As shown on the diagram, the item remains constant until the Cutoff , and then it enters Disposition. Once again, cut off may be time– or event-based. The filing date may be the item’s check in date (which is the most common case) or it may be the date that it is marked as being retained. Retention Schedule Retention Schedule, also known as File Plan, is a hierarchy that defines specifically how the content is actually retained. Retention schedule is defined on 3 levels: ● Retention Categories – is the place where disposition instructions and security settings are actually applied. ● Series is a hierarchy that allows you to organize your retention categories (Retention Categories cannot be nested). Series can also be hidden which allows you to prevent your contributors from filing content into the work-in-progress structures. ● Record Folders is a level below Retention Categories that allows you to organize items with similar retention and security. Record Folders can be nested. So here’s how the entire Retention Schedule looks in Oracle WebCenter Records user interface: Freeze Items in a record folder or individual content items may be frozen which prevents them from being updated, deleted or destroyed. This is useful when you need to comply with audit or litigation requirements. Internal and External Retained Content Exactly as it sounds, Internal content stored in in your Records Management system. External content can either be physical, like filing boxes, DVD media and microfiche or it can be electronic content stored externally, for instance in email archive, shared drive and so on. Classified Content Classified content is sensitive information that requires protection against unauthorized disclosure. Unclassified content is the opposite and Declassified content is the content that used to be classified in the past. 
Typical classifications include “Top Secret”, “Secret”, “Confidential” and “Unclassified”. DoD 5015 DoD 5015 is the software standard that provides implementation guidance for the management of records in US Department of Defense. It is a de-facto industry standard and it establishes requirements for managing classified records and support of the Freedom of Information Act, Privacy Act, and interoperability. NARA NARA is a US National Archives and Records Administration. The DoD 5015 standard was originally developed specifically for software vendors wanting to sell to NARA. Later on, the standard was widely adopted and used across industry. Creating Retention Schedule and Disposition To create Retention Schedule Information Management (IM) people start by creating your File Plan Hierarchy - Series, Retention Categories and Record folders. Then they may add or customize triggers, time periods, and freezes. When that’s complete, the meat and potatoes of creating a retention schedule comes down to defining disposition instructions for each type of your Record Folders. Conclusion There you have it. Now you know what Records and Retention Management is and why organizations implement it. You know where typical Records Management software fits within the Enterprise and you should have a pretty good idea about the key concepts and terminology. You should now be prepared and be more helpful and play a more active role in projects that involve Records and Information Management.
Blog Post: Invoker Rights for PL/SQL Compiler Options
Invoker Rights Hi! Another blog on PL/SQL compiler options. Let’s discuss invoker rights. The invoker rights setting controls whose objects a procedure, function, or package resolves at run time, based on the option used. There are 2 options: Definer Current_User The default setting is ‘Definer’. This tells Oracle to resolve the referenced objects against the creator (definer) of the PL/SQL code. The setting Current_User allows everyone to share one set of PL/SQL programs while accessing similarly named objects that hold their own data. [Figure: user1.my_proc referencing the user1.employee table] Note the above illustration. Using the default invoker rights setting…anyone accessing user1.my_proc will also access the user1.employee table. If user2 wanted to use the user1.my_proc program to access his own employee table, he would have to have resource privileges and access to the my_proc code to compile it into his schema. Now, there would be 2 versions of the same program. The maintenance nightmare begins! Using the setting Current_User, this problem is eliminated. The user1.my_proc will use the employee table belonging to the user who is executing the my_proc program. Now, everyone can use one set of executables and have their own data appear in the reports (or whatever my_proc does…). The syntax: Create [or replace] Procedure (or Function or Package) [AUTHID { DEFINER | CURRENT_USER } ] … When a program runs with Current_User, it executes with the privileges of the invoking user, so the code owner can effectively exercise (“inherit”) the invoker’s privileges; this mirrors the way definer rights lets callers exercise the definer’s privileges. To keep existing code working, Oracle 12c grants INHERIT PRIVILEGES on each user to PUBLIC. DBAs should consider revoking this and granting specific INHERIT privileges to those users who are using/sharing common objects. This privilege might leave an opportunity for SQL injection…maybe another blog!!! I hope you find this information useful in your day-to-day use of the Oracle RDBMS. Dan Hotka Oracle ACE Director Instructor/Author/CEO
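As a minimal sketch of the syntax above (my_proc and the employee table are the hypothetical objects from the illustration, not code taken from the original post), an invoker rights procedure would look like this:
CREATE OR REPLACE PROCEDURE my_proc
  AUTHID CURRENT_USER
AS
  v_count NUMBER;
BEGIN
  -- EMPLOYEE resolves in the schema of the user running the procedure,
  -- not in the schema of the user who created it
  SELECT COUNT(*) INTO v_count FROM employee;
  DBMS_OUTPUT.PUT_LINE('Rows in employee: ' || v_count);
END;
/
And if you decide to tighten the default 12c grant mentioned above, the statements take this form (user names are placeholders):
REVOKE INHERIT PRIVILEGES ON USER hr FROM PUBLIC;
GRANT INHERIT PRIVILEGES ON USER hr TO app_owner;
After this, only invoker rights code owned by app_owner can run with HR’s privileges when HR invokes it.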
Blog Post: EM12c and the Optimizer Statistics Console
Today we’re going to review another great feature in the EM12c that you may not have realized was available. Once logged into a database target, click on the Performance menu and navigate to the Optimizer Statistics Console: Optimizer Statistics Console Page The new console page is clean, easy to navigate and has great access points to manage and monitor optimizer statistics for the given database target. We’ll actually start at the bottom of the page with the Statistics Status and go up into the links. Viewing the graph, you get a quick and clear idea of the status of your statistics for the database target you are logged into. You can easily see if there are any stale stats that may be impacting performance and if there are any missing stats. You are shown how many objects are involved in the status category and can then move your way up into the links to review and manage your database statistics configuration. Operations View We’re going to go through the Operations by order of logic and not by order in the console, so we’ll start with View. This link will take you to a great little report console that will display information about statistics in the database. Even though our example will display results for Stale statistics, note the other great filters for the report: As we want to see everything, we’re not going to choose any other filters for our report until we get to the bottom and have the options of Current, Pending or All for our Scope We’re going to change it to All considering the version of database is 11.2.0.4 and we could have pending statistics waiting to be implemented. The report quickly showed that both data dictionary and fixed objects were stale, (schemas are up to date!) so we could multi-select objects on the left of the report and gather stats, (along with other options) or we could use the next section we’ll be covering to gather those stats in an EM job and address the stale statistics issue in what I feel, is a more user friendly interface. Gather Back in the Optimizer Statistics Console , we can click on the Gather link, you will be taken directly to the Gather Statistics Wizard : There is a clear warning at the top letting you know that as of DB11g, automated maintenance tasks should be enabled to gather nightly statistics. This is turned on by default in most databases, so this warning is a nice addition to this page for those that may not be aware. Below this warning, you are able to choose what level of statistics gathering you wish to perform, (database, schema, objects, fixed objects or data dictionary…) By default, Oracle’s guidelines for statistic collection options will be chosen, but you can change this to customize if you wish to work outside of Oracle’s recommendations. You can view the default values before deciding and if for some reason, you wish to use manual configuration options: The wizard won’t ask you to set the manual configurations until later into the setup steps and if you change your mind, you can still choose the defaults. At the bottom of the wizard, you also have the opportunity to use the Validate with the SQL Performance Analyzer, but as noted, the changes won’t be published and you’ll have to do that manually post the statistics collection run. The next page will take you through the customizes options you want to use instead of GATHER AUTO, (although, like I said, you could just leave it as is and have it just perform the default anyway! 
:)) Then you get to schedule it via the EM Job Service and would monitor and manage this job via the EM12c Job Activity console. This means that this is not an automated maintenance task in the Database Job Scheduler and if you are not aware of how to view jobs via the DBMS_JOB_SCHEDULER, then you could have two stats jobs running for a database or even worse, simultaneously, so BE AWARE. Lock/Unlock/Delete As the Lock , Unlock and Delete links take you to similar wizards that do just the opposite action, we’ll group them together in one section. Using the Unlock statistics wizard in our example, you can click on the link and choose to unlock a schema or specific tables: If you decide to unlock just a few or even just one object, the wizard makes it quite easy to search and choose: In the example above, I clicked on the magnifying glass next to the box for the Schema and then chose the DBSNMP schema. I can use a wild card search in the object name box or leave it blank and all tables in the schema are returned and a simple click in the box to the left of the object name will select it to lock, delete or unlock it, (depending which wizard you’ve chosen…) You also can view information on IF the object is locked or unlocked already, along with partitioning information, as you may have partitions that are locked while the table may not be. Restore The restore option is a great feature for those that may not be on top of their “restore statistics syntax on the top of their head” game. Now, I have to admit, some of the options in this wizard makes me very nervous. The idea that someone would dial back database level statistics vs. locating the one or two offenders that changed just seems like throwing the baby out with the bath water, but it is an option in the restore statistics command, so here it is in the wizard, as well. You have the option to override locked objects and force a restore, too. Like with locking and unlocking objects, the next screen in the wizard will allow you to choose a schema and object(s) that you wish to restore from and then once chosen, you will be asked when to restore to, including the earliest restore timestamp available: Post these choices, you then schedule the EM Job to run the task and you’re set. Manage Optimizer Statistics You must be granted the Create Job and Create Any Job privileges to take advantage of these features and will be warned if you haven’t been granted one or both. Operations links include the ability to Gather Optimizer Statistics, which includes database and schema level, along with distinct object level. Secondary links to restore, lock, unlock and delete statistics for each statistics gathering type is available as well. Related Links The Related Links section includes links for research and configuration settings, such as current object statistics, global statistic gathering options, the job scheduler to view current intervals for jobs involving statistics as well as automated maintenance tasks which inform you of any clean up and maintenance jobs that are part of the overall Cost Based Optimizer world. Configure These links will configure the Automated Maintenance Tasks, allowing you to update schedules of execution, disable/enable and work with SPA results, (SQL Performance Analyzer.) If you haven’t used SPA yet, it has some pretty cool features allowing you to simulate and analyze different performance changes before you make them. Nothing like being able to see in the future! 
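For readers who prefer the command line, the console’s Gather and Restore operations correspond to DBMS_STATS calls along these lines (the HR schema and EMPLOYEES table here are only placeholders):
-- gather statistics for the stale objects in a schema
EXEC DBMS_STATS.GATHER_SCHEMA_STATS(ownname => 'HR', options => 'GATHER STALE');
-- restore a table's statistics to what they were a day ago
EXEC DBMS_STATS.RESTORE_TABLE_STATS(ownname => 'HR', tabname => 'EMPLOYEES', as_of_timestamp => SYSTIMESTAMP - INTERVAL '1' DAY);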
Working with some of these features may require additional management packs (Tuning, Real Application Testing, etc.), but if you're curious whether you're wandering into new management pack territory, it's easy to check from any EM12c console page: you will be shown information about any management packs involved with the features you are using on the page you're on. So embrace the power of optimizer statistics in EM12c Cloud Control, and if you want to know more about managing Optimizer Statistics, click here for the Oracle documentation or this whitepaper for more info.
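For completeness, here is a hedged sketch of the DBMS_STATS calls that the Lock/Unlock/Delete and Restore wizards appear to build for you. DBSNMP matches the schema used in the example above, while SOME_TABLE and the one-day-old timestamp are purely illustrative.

BEGIN
  -- Lock a whole schema, then unlock a single table (SOME_TABLE is a placeholder)
  DBMS_STATS.lock_schema_stats(ownname => 'DBSNMP');
  DBMS_STATS.unlock_table_stats(ownname => 'DBSNMP', tabname => 'SOME_TABLE');

  -- Restore a table's statistics to a point in time, forcing past any lock
  DBMS_STATS.restore_table_stats(ownname         => 'DBSNMP',
                                 tabname         => 'SOME_TABLE',
                                 as_of_timestamp => SYSTIMESTAMP - INTERVAL '1' DAY,
                                 force           => TRUE);
END;
/

-- The earliest restore timestamp the Restore wizard can offer
SELECT DBMS_STATS.get_stats_history_availability FROM dual;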
Blog Post: Beware default tablespace during Data Pump import
Today, while importing one database into another, my colleague (Turkel) hit a tablespace issue even though he had used REMAP_TABLESPACE for almost all available tablespaces of the source database. After investigating, we saw that a lot of users had been assigned a default tablespace that was later dropped. During the import, those users were supposed to be created with the dropped tablespace assigned as their default tablespace. To avoid this, make sure you get the distinct list of default tablespaces of all users in the source database and remap each of them during import using the REMAP_TABLESPACE parameter.
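A minimal way to build that list, assuming you have access to DBA_USERS on the source database; each tablespace it returns should get its own REMAP_TABLESPACE=old:new entry on the impdp command line:

-- Distinct default tablespaces assigned to users in the source database,
-- including tablespaces that may since have been dropped
SELECT DISTINCT default_tablespace
FROM   dba_users
ORDER  BY default_tablespace;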
Blog Post: Why do Clever People Use Bad Tools?
We all like to work on projects that use our favorite tools - like the Oracle database. Some organizations arrive at a good decision through a rational decision process (which often leads to proven technology like relational databases), while others are just lucky and stumble onto a good tool. We don't want to work on the doomed projects that use bad tools and have a defective decision process that makes it impossible to change the decision, even long after everybody knows that the latest trendy Big Data NoSQL object-oriented hierarchical in-memory database won't work. But the most interesting situation is the one where you respect the decision-making process and the people making the decision, and they still choose a different tool than you would have chosen. Pay attention - those are the projects where you might really learn something new.
Wiki Page: Migrating a MySQL Database Table to Oracle Database
While Oracle Database is the most commonly used commercial database, MySQL database is the most commonly used open source database. Several migration tools are available to migrate MySQL to Oracle, the most commonly used being Oracle SQL Developer. While the migration tools are designed for migrating relatively small quantities of data, a different tool/technology is required for migrating large quantities of table data. Apache Sqoop is a tool for transferring large quantities of data between a relational database, such as MySQL and Oracle Database, and the Hadoop Distributed File System (HDFS). In this tutorial we shall migrate a MySQL database table to an Oracle Database table using Sqoop. We shall use the following procedure for the migration. Create an Oracle Database table with the same table definition as the MySQL database table. Import MySQL table data into HDFS with Sqoop. Export HDFS data to Oracle Database with Sqoop. This tutorial has the following sections. Setting the Environment Configuring the HDFS Creating a MySQL Database Table Creating an Oracle Database Table Importing MySQL Table Data into HDFS Exporting HDFS Data to Oracle Database Querying Migrated Table Data Setting the Environment We have used Oracle Linux 6.5 installed on Oracle VirtualBox 4.3. We need to download and install the following software for this tutorial. Oracle Database 11g MySQL Database 5.6 Hadoop 2.0.0 CDH 4.6 Sqoop 1.4.3 CDH 4.6 Java 7 First, create a directory to install MySQL database and other software and set its permissions to global. mkdir /mysql chmod -R 777 /mysql cd /mysql Download the Java 7 tar.gz file and extract the file to the /mysql directory. tar zxvf jdk-7u55-linux-i586.tar.gz Download CDH4.6 Hadoop 2.0.0 and extract the tar.gz file to the /mysql directory. wget http://archive.cloudera.com/cdh4/cdh/4/hadoop-2.0.0-cdh4.6.0.tar.gz tar -xvf hadoop-2.0.0-cdh4.6.0.tar.gz Create symlinks for the Hadoop installation bin and conf directories. ln -s /mysql/hadoop-2.0.0-cdh4.6.0/bin /mysql/hadoop-2.0.0-cdh4.6.0/share/hadoop/mapreduce2/bin ln -s /mysql/hadoop-2.0.0-cdh4.6.0/etc/hadoop /mysql/hadoop-2.0.0-cdh4.6.0/share/hadoop/mapreduce2/conf Download and install Sqoop 1.4.3 CDH 4.6 and extract the tar.gz file to the /mysql directory. wget http://archive-primary.cloudera.com/cdh4/cdh/4/sqoop-1.4.3-cdh4.6.0.tar.gz tar -xvf sqoop-1.4.3-cdh4.6.0.tar.gz Copy the JDBC jars for Oracle Database and MySQL database to the Sqoop lib directory. The JDBC jars may be downloaded separately or obtained from the database installations. cp mysql-connector-java-5.1.31-bin.jar /mysql/sqoop-1.4.3-cdh4.6.0/lib cp ojdbc6.jar /mysql/sqoop-1.4.3-cdh4.6.0/lib Download the MySQL Server 5.6.19 tar.gz file mysql-5.6.19-linux-glibc2.5-i686.tar.gz for the Linux Generic platform from http://dev.mysql.com/downloads/mysql/. Extract the tar.gz file to the /mysql directory. tar zxvf mysql-5.6.19-linux-glibc2.5-i686.tar.gz Add a group mysql and a user mysql to the group. groupadd mysql useradd -r -g mysql mysql Create a symlink for the MySQL installation directory. ln -s /mysql/mysql-5.6.19-linux-glibc2.5-i686 mysql cd mysql Run the following commands to install MySQL database and set the owner/group for the database to mysql. chown -R mysql . chgrp -R mysql . scripts/mysql_install_db --user=mysql chown -R root . chown -R mysql data Start MySQL Server with the following command. bin/mysqld_safe --user=mysql & MySQL database gets started. By default the root user does not require a password.
Set a password for the root user using the following command. mysqladmin -u root -p password Login to the MySQL console using the following command and specify the root password when prompted. mysql -u root -p The /mysql directory should list the following software. Set environment variables for Oracle Database, MySQL database, Hadoop, Sqoop, and Java in the bash shell. Though we shall be using MapReduce2 (MR2), some MR1 libraries are also used in running the Sqoop import/export. vi ~/.bashrc
export HADOOP_PREFIX=/mysql/hadoop-2.0.0-cdh4.6.0
export HADOOP_CONF=$HADOOP_PREFIX/etc/hadoop
export SQOOP_HOME=/mysql/sqoop-1.4.3-cdh4.6.0
export JAVA_HOME=/mysql/jdk1.7.0_55
export MYSQL_HOME=/mysql/mysql-5.6.19-linux-glibc2.5-i686
export ORACLE_HOME=/home/oracle/app/oracle/product/11.2.0/dbhome_1
export ORACLE_SID=ORCL
export HADOOP_MAPRED_HOME=/mysql/hadoop-2.0.0-cdh4.6.0
export HADOOP_HOME=/mysql/hadoop-2.0.0-cdh4.6.0/share/hadoop/mapreduce2
export HADOOP_CLASSPATH=$HADOOP_HOME/*:$HADOOP_HOME/lib/*:$SQOOP_HOME/lib/*:/mysql/hadoop-2.0.0-cdh4.6.0/share/hadoop/mapreduce1/lib/*
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_MAPRED_HOME/bin:$ORACLE_HOME/bin:$MYSQL_HOME/bin:$SQOOP_HOME/bin
export CLASSPATH=$HADOOP_CLASSPATH
Configuring the HDFS In a single-node cluster the HDFS comprises the NameNode and a DataNode. We need to set some configuration properties for the HDFS. In the $HADOOP_CONF/core-site.xml file set the fs.defaultFS and hadoop.tmp.dir properties. The fs.defaultFS property is the NameNode URI and the hadoop.tmp.dir property is the Hadoop temporary directory.
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://10.0.2.15:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:///var/lib/hadoop-0.20/cache</value>
  </property>
</configuration>
Next, remove any previously created tmp directory and create the NameNode tmp directory specified in the hadoop.tmp.dir property. rm -rf /var/lib/hadoop-0.20/cache mkdir -p /var/lib/hadoop-0.20/cache chmod -R 777 /var/lib/hadoop-0.20/cache Set the dfs.namenode.name.dir, dfs.permissions.superusergroup, dfs.permissions and dfs.replication properties in the $HADOOP_CONF/hdfs-site.xml file. The dfs.namenode.name.dir configuration property is the NameNode storage directory, dfs.replication is the replication factor and the others are permissions-related configuration properties.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.permissions.superusergroup</name>
    <value>hadoop</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///data/1/dfs/nn</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
Remove any previously created NameNode storage directory, create a new directory and set its permissions to global. rm -rf /data/1/dfs/nn mkdir -p /data/1/dfs/nn chmod -R 777 /data/1/dfs/nn Next, format the NameNode and start the NameNode and DataNode. hadoop namenode -format hadoop namenode hadoop datanode We shall be importing MySQL data into an HDFS directory. Create an HDFS directory /mysql/import for the import and set its permissions to global.
hadoop dfs -mkdir /mysql/import hadoop dfs -chmod -R 777 /mysql/import We also need to copy Sqoop to HDFS so that it is available in the runtime classpath of Sqoop. Create a directory /mysql/sqoop-1.4.3-cdh4.6.0/lib in HDFS and set its permissions to global. Put the Sqoop installation directory jars and the lib directory jars into HDFS. hadoop dfs -mkdir /mysql/sqoop-1.4.3-cdh4.6.0/lib hadoop dfs -chmod -R 777 /mysql/sqoop-1.4.3-cdh4.6.0/lib hdfs dfs -put /mysql/sqoop-1.4.3-cdh4.6.0/* hdfs://10.0.2.15:8020/mysql/sqoop-1.4.3-cdh4.6.0 Creating a MySQL Database Table In this section we shall create a MySQL database table wlslog and add data to the table. The MySQL database table must have a primary key column for Sqoop to be able to import MySQL data into HDFS. Login to the MySQL console and set the database to test. mysql -u root -p use test Run the following SQL script to create the wlslog table and add table data. CREATE TABLE wlslog(time_stamp VARCHAR(255) PRIMARY KEY,category VARCHAR(255),type VARCHAR(255),servername VARCHAR(255), code VARCHAR(255),msg VARCHAR(255)); INSERT INTO wlslog(time_stamp,category,type,servername,code,msg) VALUES('Apr-8-2014-7:06:16-PM-PDT','Notice','WebLogicServer','AdminServer','BEA-000365','Server state changed to STANDBY'); INSERT INTO wlslog(time_stamp,category,type,servername,code,msg) VALUES('Apr-8-2014-7:06:17-PM-PDT','Notice','WebLogicServer','AdminServer','BEA-000365','Server state changed to STARTING'); INSERT INTO wlslog(time_stamp,category,type,servername,code,msg) VALUES('Apr-8-2014-7:06:18-PM-PDT','Notice','WebLogicServer','AdminServer','BEA-000365','Server state changed to ADMIN'); INSERT INTO wlslog(time_stamp,category,type,servername,code,msg) VALUES('Apr-8-2014-7:06:19-PM-PDT','Notice','WebLogicServer','AdminServer','BEA-000365','Server state changed to RESUMING'); INSERT INTO wlslog(time_stamp,category,type,servername,code,msg) VALUES('Apr-8-2014-7:06:20-PM-PDT','Notice','WebLogicServer','AdminServer','BEA-000361','Started WebLogic AdminServer'); INSERT INTO wlslog(time_stamp,category,type,servername,code,msg) VALUES('Apr-8-2014-7:06:21-PM-PDT','Notice','WebLogicServer','AdminServer','BEA-000365','Server state changed to RUNNING'); INSERT INTO wlslog(time_stamp,category,type,servername,code,msg) VALUES('Apr-8-2014-7:06:22-PM-PDT','Notice','WebLogicServer','AdminServer','BEA-000360','Server started in RUNNING mode'); The MySQL console output is as follows. Creating an Oracle Database Table Next, create the target database table in Oracle Database to which the MySQL database table is to be migrated. The Oracle Database table should have the same name/type columns as the MySQL database table. Drop any previously created table OE.WLSLOG and run the following SQL script in SQL*Plus. CREATE TABLE OE.wlslog (time_stamp VARCHAR2(4000), category VARCHAR2(4000), type VARCHAR2(4000), servername VARCHAR2(4000), code VARCHAR2(4000), msg VARCHAR2(4000)); An Oracle Database table OE.WLSLOG gets created and its description may be listed with the DESC command. Importing MySQL Table Data into HDFS The Sqoop import tool is used to import a single table from a relational database into HDFS. Each row in the relational database table is imported as a single record and stored as text files (one record per line) or as binary Avro or sequence files, the default being text files. The import tool is run with the sqoop import command and some of the command arguments are discussed in Table 1; most of the command arguments are optional.
Table 1. sqoop import command arguments
Command Argument | Description | MySQL Value used
--connect jdbc-uri | Specifies the JDBC URI used to connect to the relational database. | jdbc:mysql://localhost/test
--connection-manager class-name | Specifies the connection manager class. | Not used
--driver class-name | Specifies the JDBC driver class. | com.mysql.jdbc.Driver (inferred automatically and not required to be specified)
--hadoop-home dir | Specifies the Hadoop home directory to override the $HADOOP_HOME environment variable. | Not used
-P | Used to specify the password from the console. | Not used
--password password | Specifies the password. | mysql
--username username | Specifies the user name. | root
--verbose | Outputs verbose information. | Not used
--connection-param-file filename | Specifies a file name in which additional connection parameters may be specified. | Not used
Some of the connection parameter properties that may be specified in the connection param file are discussed in Table 2. These may also be specified on the command line when the sqoop import command is run.
Table 2. Connection parameters for the sqoop import command
Connection Parameter | Description | MySQL Value used
--append | Specifies that the data is to be appended to an existing dataset in HDFS. | Not used
--as-avrodatafile | Imports as an Avro data file. | Not used
--as-sequencefile | Imports as SequenceFiles. | Not used
--as-textfile | Imports as text files, which is the default. | Not used
--columns col,col,col… | Specifies the columns to import. | time_stamp,category,type,servername, code,msg
--table table-name | Specifies the table name to import from. | wlslog
--target-dir dir | Specifies the target directory in HDFS. | /mysql/import
With the MySQL Server and HDFS running, run the following command to import using Sqoop. sqoop import --connect "jdbc:mysql://localhost/test" --password "mysql" --username "root" --table "wlslog" --columns "time_stamp,category,type,servername, code,msg" --target-dir "/mysql/import" --verbose The Sqoop import tool gets started. A MapReduce job runs to import the MySQL Database table data into HDFS. A more detailed output from the sqoop import command is as follows. sqoop import --connect "jdbc:mysql://localhost/test" --password "mysql" --username "root" --table "wlslog" --columns "time_stamp,category,type,servername, code,msg" --target-dir "/mysql/import" --verbose 14/10/13 19:44:35 INFO sqoop.Sqoop: Running Sqoop version: 1.4.3-cdh4.6.0 14/10/13 19:44:36 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/10/13 19:44:36 INFO tool.CodeGenTool: Beginning code generation 14/10/13 19:44:40 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `wlslog` AS t LIMIT 1 14/10/13 19:44:40 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `wlslog` AS t LIMIT 1 14/10/13 19:44:40 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /mysql/hadoop-2.0.0-cdh4.6.0 14/10/13 19:41:59 DEBUG manager.CatalogQueryManager: Retrieving primary key for table 'wlslog' with query SELECT column_name FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = (SELECT SCHEMA()) AND TABLE_NAME = 'wlslog' AND COLUMN_KEY = 'PRI' 14/10/13 19:41:59 DEBUG manager.CatalogQueryManager: Retrieving primary key for table 'wlslog' with query SELECT column_name FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = (SELECT SCHEMA()) AND TABLE_NAME = 'wlslog' AND COLUMN_KEY = 'PRI' 14/10/13 19:41:59 INFO mapreduce.ImportJobBase: Beginning import of wlslog 14/10/13 19:42:05 DEBUG db.DataDrivenDBInputFormat: Creating input split with lower bound '1=1' and upper bound '1=1' 14/10/13 19:42:06 INFO mapreduce.JobSubmitter: number of splits:1 14/10/13 19:42:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local985417148_0001 14/10/13 19:42:36 INFO mapreduce.Job: The url to track the job: http://localhost:8080/ 14/10/13 19:42:36 INFO mapreduce.Job: Running job: job_local985417148_0001 14/10/13 19:42:36 INFO mapred.LocalJobRunner: OutputCommitter set in config null 14/10/13 19:42:36 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter 14/10/13 19:42:36 INFO mapred.LocalJobRunner: Waiting for map tasks 14/10/13 19:42:36 INFO mapred.LocalJobRunner: Starting task: attempt_local985417148_0001_m_000000_0 14/10/13 19:42:37 INFO mapreduce.Job: Job job_local985417148_0001 running in uber mode : false 14/10/13 19:42:37 INFO mapreduce.Job: map 0% reduce 0% 14/10/13 19:42:37 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/13 19:42:37 DEBUG db.DBConfiguration: Fetching password from job credentials store 14/10/13 19:42:37 INFO mapred.MapTask: Processing split: 1=1 AND 1=1 14/10/13 19:42:37 DEBUG db.DataDrivenDBInputFormat: Creating db record reader for db product: MYSQL 14/10/13 19:42:38 INFO db.DBRecordReader: Working on split: 1=1 AND 1=1 14/10/13 19:42:38 DEBUG db.DataDrivenDBRecordReader: Using query: SELECT `time_stamp`, `category`, `type`, `servername`, `code`, `msg` FROM `wlslog` AS `wlslog` WHERE ( 1=1 ) AND ( 1=1 ) 14/10/13 19:42:38 DEBUG db.DBRecordReader: Using fetchSize for next query: -2147483648 14/10/13 19:42:38 INFO db.DBRecordReader: Executing query: SELECT `time_stamp`, `category`, `type`, `servername`, `code`, `msg` FROM `wlslog` AS `wlslog` WHERE ( 1=1 ) AND ( 1=1 ) 14/10/13 19:42:38 DEBUG mapreduce.AutoProgressMapper: Instructing auto-progress thread to quit. 14/10/13 19:42:38 INFO mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false 14/10/13 19:42:38 DEBUG mapreduce.AutoProgressMapper: Waiting for progress thread shutdown... 14/10/13 19:42:38 DEBUG mapreduce.AutoProgressMapper: Progress thread shutdown detected. 14/10/13 19:42:38 INFO mapred.LocalJobRunner: 14/10/13 19:42:39 INFO mapred.Task: Task:attempt_local985417148_0001_m_000000_0 is done. 
And is in the process of committing 14/10/13 19:42:39 INFO mapred.LocalJobRunner: 14/10/13 19:42:39 INFO mapred.Task: Task attempt_local985417148_0001_m_000000_0 is allowed to commit now 14/10/13 19:42:39 INFO output.FileOutputCommitter: Saved output of task 'attempt_local985417148_0001_m_000000_0' to hdfs://10.0.2.15:8020/mysql/import/_temporary/0/task_local985417148_0001_m_000000 14/10/13 19:42:39 INFO mapred.LocalJobRunner: map 14/10/13 19:42:39 INFO mapred.Task: Task 'attempt_local985417148_0001_m_000000_0' done. 14/10/13 19:42:39 INFO mapred.LocalJobRunner: Finishing task: attempt_local985417148_0001_m_000000_0 14/10/13 19:42:39 INFO mapred.LocalJobRunner: Map task executor complete. 14/10/13 19:42:40 INFO mapreduce.Job: map 100% reduce 0% 14/10/13 19:42:41 INFO mapreduce.Job: Job job_local985417148_0001 completed successfully 14/10/13 19:42:41 INFO mapreduce.Job: Counters: 23 File System Counters FILE: Number of bytes read=3804 FILE: Number of bytes written=15319975 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=14995703 HDFS: Number of bytes written=717 HDFS: Number of read operations=343 HDFS: Number of large read operations=0 HDFS: Number of write operations=3 Map-Reduce Framework Map input records=7 Map output records=7 Input split bytes=87 Spilled Records=0 Failed Shuffles=0 Merged Map outputs=0 GC time elapsed (ms)=35 CPU time spent (ms)=0 Physical memory (bytes) snapshot=0 Virtual memory (bytes) snapshot=0 Total committed heap usage (bytes)=50167808 File Input Format Counters Bytes Read=0 File Output Format Counters Bytes Written=717 14/10/13 19:42:41 INFO mapreduce.ImportJobBase: Transferred 717 bytes in 37.2268 seconds (19.2603 bytes/sec) 14/10/13 19:42:41 INFO mapreduce.ImportJobBase: Retrieved 7 records. 14/10/13 19:42:41 DEBUG util.ClassLoaderStack: Restoring classloader: sun.misc.Launcher$AppClassLoader@10f243b Exporting HDFS Data to Oracle Database The Sqoop export tool is used to export one or more files in HDFS to a relational database table. The target table must already be created in the database as the export tool does not create the database table. The input files are parsed and converted into a set of records and added to the database table using INSERT statements. In update mode, which is similar to the append mode for the Sqoop import tool, the database table is updated using UPDATE statements. The same command arguments as those listed in Table 1 are also supported by the export tool, but the values are different as listed in Table 3. Command Argument Oracle Database specific value --connect jdbc-uri jdbc:oracle:thin:@localhost:1521:ORCL --connection-manager class-name Not used. --driver class-name Oracle.jdbc.OracleDriver Inferred automatically --hadoop-home dir /mysql/hadoop-2.0.0-cdh4.6.0/share/hadoop/mapreduce2 -P Not used. --password password OE --username username OE --verbose Outputs verbose information --connection-param-file filename Not used. The connection parameter file supports some of the same and some different connection parameters as discussed in following table. These may also be specified on the command line with the sqoop export command. Connection Parameters Description Export specific value --export-dir dir HDFS directory to export from. /mysql/import -m,--num-mappers n Number of map tasks to use to export in parallel. Not used. --table table-name Table to export to. OE.WLSLOG --update-mode mode Specifies the update mode. 
Default setting is updateonly, which only updates rows. The allowinsert value may be used to all insert of new rows. ?? Not used. --input-null-string null-string Specifies the string to be interpreted as null for string columns. Not used. --input-null-non-string null-string Specifies the string to be interpreted as null for non-string columns. Not used. --staging-table staging-table-name The database table to be used as staging table. Not used. --batch Specifies to use as batch mode. Not used. Run the following sqoop export command to export from the /mysql/import directory to the OE.WLSLOG table in Oracle Database. The /mysql/import is the HDFS directory to which the MySQL table was imported to in the previous section. sqoop export --connect "jdbc:oracle:thin:@localhost:1521:ORCL" --hadoop-home "/mysql/hadoop-2.0.0-cdh4.6.0/share/hadoop/mapreduce2" --password "OE" --username "OE" --export-dir "/mysql/import" --table "OE.WLSLOG" --verbose The sqoop export tool gets started. A MapReduce job runs to export the /mysql/import directory data to Oracle Database. A more detailed output from the sqoop export command is as follows. 14/10/13 19:50:08 INFO input.FileInputFormat: Total input paths to process : 1 14/10/13 19:50:08 INFO input.FileInputFormat: Total input paths to process : 1 14/10/13 19:50:08 DEBUG mapreduce.ExportInputFormat: Target numMapTasks=4 14/10/13 19:50:08 DEBUG mapreduce.ExportInputFormat: Total input bytes=717 14/10/13 19:50:08 DEBUG mapreduce.ExportInputFormat: maxSplitSize=179 14/10/13 19:50:08 INFO input.FileInputFormat: Total input paths to process : 1 14/10/13 19:50:08 DEBUG mapreduce.ExportInputFormat: Generated splits: 14/10/13 19:50:09 DEBUG mapreduce.ExportInputFormat: Paths:/mysql/import/part-m-00000:0+179 Locations:localhost:; 14/10/13 19:50:09 DEBUG mapreduce.ExportInputFormat: Paths:/mysql/import/part-m-00000:179+179 Locations:localhost:; 14/10/13 19:50:09 DEBUG mapreduce.ExportInputFormat: Paths:/mysql/import/part-m-00000:358+179 Locations:localhost:; 14/10/13 19:50:09 DEBUG mapreduce.ExportInputFormat: Paths:/mysql/import/part-m-00000:537+90,/mysql/import/part-m-00000:627+90 Locations:localhost:; 14/10/13 19:50:09 INFO mapreduce.JobSubmitter: number of splits:4 14/10/13 19:50:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1075626827_0001 14/10/13 19:50:35 INFO mapreduce.Job: The url to track the job: http://localhost:8080/ 14/10/13 19:50:35 INFO mapreduce.Job: Running job: job_local1075626827_0001 14/10/13 19:50:35 INFO mapred.LocalJobRunner: OutputCommitter set in config null 14/10/13 19:50:35 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.sqoop.mapreduce.NullOutputCommitter 14/10/13 19:50:35 INFO mapred.LocalJobRunner: Waiting for map tasks 14/10/13 19:50:35 INFO mapred.LocalJobRunner: Starting task: attempt_local1075626827_0001_m_000000_0 14/10/13 19:50:36 INFO mapreduce.Job: Job job_local1075626827_0001 running in uber mode : false 14/10/13 19:50:36 INFO mapreduce.Job: map 0% reduce 0% 14/10/13 19:50:36 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/13 19:50:36 INFO mapred.MapTask: Processing split: Paths:/mysql/import/part-m-00000:537+90,/mysql/import/part-m-00000:627+90 14/10/13 19:50:36 WARN conf.Configuration: map.input.file is deprecated. Instead, use mapreduce.map.input.file 14/10/13 19:50:36 WARN conf.Configuration: map.input.start is deprecated. Instead, use mapreduce.map.input.start 14/10/13 19:50:36 WARN conf.Configuration: map.input.length is deprecated. 
Instead, use mapreduce.map.input.length 14/10/13 19:50:36 DEBUG mapreduce.CombineShimRecordReader: ChildSplit operates on: hdfs://10.0.2.15:8020/mysql/import/part-m-00000 14/10/13 19:50:36 DEBUG db.DBConfiguration: Fetching password from job credentials store 14/10/13 19:50:38 DEBUG mapreduce.CombineShimRecordReader: ChildSplit operates on: hdfs://10.0.2.15:8020/mysql/import/part-m-00000 14/10/13 19:50:38 DEBUG mapreduce.AutoProgressMapper: Instructing auto-progress thread to quit. 14/10/13 19:50:38 INFO mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false 14/10/13 19:50:38 DEBUG mapreduce.AutoProgressMapper: Waiting for progress thread shutdown... 14/10/13 19:50:38 DEBUG mapreduce.AutoProgressMapper: Progress thread shutdown detected. 14/10/13 19:50:38 INFO mapred.LocalJobRunner: 14/10/13 19:50:38 DEBUG mapreduce.AsyncSqlOutputFormat: Committing transaction of 1 statements 14/10/13 19:50:39 INFO mapred.Task: Task:attempt_local1075626827_0001_m_000000_0 is done. And is in the process of committing 14/10/13 19:50:39 INFO mapred.LocalJobRunner: map 14/10/13 19:50:39 INFO mapred.Task: Task 'attempt_local1075626827_0001_m_000000_0' done. 14/10/13 19:50:39 INFO mapred.LocalJobRunner: Finishing task: attempt_local1075626827_0001_m_000000_0 14/10/13 19:50:39 INFO mapred.LocalJobRunner: Starting task: attempt_local1075626827_0001_m_000001_0 14/10/13 19:50:39 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/13 19:50:39 INFO mapreduce.Job: map 100% reduce 0% 14/10/13 19:50:39 INFO mapred.MapTask: Processing split: Paths:/mysql/import/part-m-00000:0+179 14/10/13 19:50:39 DEBUG mapreduce.CombineShimRecordReader: ChildSplit operates on: hdfs://10.0.2.15:8020/mysql/import/part-m-00000 14/10/13 19:50:39 DEBUG db.DBConfiguration: Fetching password from job credentials store 14/10/13 19:50:41 DEBUG mapreduce.AutoProgressMapper: Instructing auto-progress thread to quit. 14/10/13 19:50:41 INFO mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false 14/10/13 19:50:41 DEBUG mapreduce.AutoProgressMapper: Waiting for progress thread shutdown... 14/10/13 19:50:41 DEBUG mapreduce.AutoProgressMapper: Progress thread shutdown detected. 14/10/13 19:50:41 INFO mapred.LocalJobRunner: 14/10/13 19:50:41 DEBUG mapreduce.AsyncSqlOutputFormat: Committing transaction of 1 statements 14/10/13 19:50:41 INFO mapred.Task: Task:attempt_local1075626827_0001_m_000001_0 is done. And is in the process of committing 14/10/13 19:50:41 INFO mapred.LocalJobRunner: map 14/10/13 19:50:41 INFO mapred.Task: Task 'attempt_local1075626827_0001_m_000001_0' done. 14/10/13 19:50:41 INFO mapred.LocalJobRunner: Finishing task: attempt_local1075626827_0001_m_000001_0 14/10/13 19:50:41 INFO mapred.LocalJobRunner: Starting task: attempt_local1075626827_0001_m_000002_0 14/10/13 19:50:41 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/13 19:50:41 INFO mapred.MapTask: Processing split: Paths:/mysql/import/part-m-00000:179+179 14/10/13 19:50:41 DEBUG mapreduce.CombineShimRecordReader: ChildSplit operates on: hdfs://10.0.2.15:8020/mysql/import/part-m-00000 14/10/13 19:50:41 DEBUG db.DBConfiguration: Fetching password from job credentials store 14/10/13 19:50:46 DEBUG mapreduce.AutoProgressMapper: Instructing auto-progress thread to quit. 14/10/13 19:50:46 DEBUG mapreduce.AutoProgressMapper: Waiting for progress thread shutdown... 14/10/13 19:50:46 INFO mapreduce.AutoProgressMapper: Auto-progress thread is finished. 
keepGoing=false 14/10/13 19:50:46 DEBUG mapreduce.AutoProgressMapper: Progress thread shutdown detected. 14/10/13 19:50:46 INFO mapred.LocalJobRunner: 14/10/13 19:50:46 DEBUG mapreduce.AsyncSqlOutputFormat: Committing transaction of 1 statements 14/10/13 19:50:46 INFO mapreduce.Job: map 50% reduce 0% 14/10/13 19:50:47 INFO mapred.Task: Task:attempt_local1075626827_0001_m_000002_0 is done. And is in the process of committing 14/10/13 19:50:47 INFO mapred.LocalJobRunner: map 14/10/13 19:50:47 INFO mapred.Task: Task 'attempt_local1075626827_0001_m_000002_0' done. 14/10/13 19:50:47 INFO mapred.LocalJobRunner: Finishing task: attempt_local1075626827_0001_m_000002_0 14/10/13 19:50:47 INFO mapred.LocalJobRunner: Starting task: attempt_local1075626827_0001_m_000003_0 14/10/13 19:50:47 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 14/10/13 19:50:47 INFO mapred.MapTask: Processing split: Paths:/mysql/import/part-m-00000:358+179 14/10/13 19:50:47 DEBUG mapreduce.CombineShimRecordReader: ChildSplit operates on: hdfs://10.0.2.15:8020/mysql/import/part-m-00000 14/10/13 19:50:47 DEBUG db.DBConfiguration: Fetching password from job credentials store 14/10/13 19:50:47 INFO mapreduce.Job: map 100% reduce 0% 14/10/13 19:50:49 DEBUG mapreduce.AutoProgressMapper: Instructing auto-progress thread to quit. 14/10/13 19:50:49 DEBUG mapreduce.AutoProgressMapper: Waiting for progress thread shutdown... 14/10/13 19:50:49 INFO mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false 14/10/13 19:50:49 DEBUG mapreduce.AutoProgressMapper: Progress thread shutdown detected. 14/10/13 19:50:49 INFO mapred.LocalJobRunner: 14/10/13 19:50:49 DEBUG mapreduce.AsyncSqlOutputFormat: Committing transaction of 1 statements 14/10/13 19:50:49 INFO mapred.Task: Task:attempt_local1075626827_0001_m_000003_0 is done. And is in the process of committing 14/10/13 19:50:49 INFO mapred.LocalJobRunner: map 14/10/13 19:50:49 INFO mapred.Task: Task 'attempt_local1075626827_0001_m_000003_0' done. 14/10/13 19:50:49 INFO mapred.LocalJobRunner: Finishing task: attempt_local1075626827_0001_m_000003_0 14/10/13 19:50:49 INFO mapred.LocalJobRunner: Map task executor complete. 14/10/13 19:50:49 INFO mapreduce.Job: Job job_local1075626827_0001 completed successfully 14/10/13 19:50:50 INFO mapreduce.Job: Counters: 23 File System Counters FILE: Number of bytes read=21070 FILE: Number of bytes written=61282544 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=59987532 HDFS: Number of bytes written=0 HDFS: Number of read operations=1434 HDFS: Number of large read operations=0 HDFS: Number of write operations=0 Map-Reduce Framework Map input records=7 Map output records=7 Input split bytes=576 Spilled Records=0 Failed Shuffles=0 Merged Map outputs=0 GC time elapsed (ms)=29 CPU time spent (ms)=0 Physical memory (bytes) snapshot=0 Virtual memory (bytes) snapshot=0 Total committed heap usage (bytes)=196083712 File Input Format Counters Bytes Read=0 File Output Format Counters Bytes Written=0 14/10/13 19:50:50 INFO mapreduce.ExportJobBase: Transferred 57.2086 MB in 42.576 seconds (1.3437 MB/sec) 14/10/13 19:50:50 INFO mapreduce.ExportJobBase: Exported 7 records. 14/10/13 19:50:50 DEBUG util.ClassLoaderStack: Restoring classloader: sun.misc.Launcher$AppClassLoader@10f243b Querying Migrated Table Data Hiving migrated MySQL table to Oracle Database, query the Oracle Database table using a SELECT statement in SQL*Plus. 
The 7 rows migrated from MySQL database table wlslog get listed. The complete output from the SELECT statement is as follows. SQL SET PAGESIZE 1000 SQL SELECT * FROM OE.WLSLOG; TIME_STAMP -------------------------------------------------------------------------------- CATEGORY -------------------------------------------------------------------------------- TYPE -------------------------------------------------------------------------------- SERVERNAME -------------------------------------------------------------------------------- CODE -------------------------------------------------------------------------------- MSG -------------------------------------------------------------------------------- Apr-8-2014-7:06:22-PM-PDT Notice WebLogicServer AdminServer BEA-000360 Server started in RUNNING mode Apr-8-2014-7:06:16-PM-PDT Notice WebLogicServer AdminServer BEA-000365 Server state changed to STANDBY Apr-8-2014-7:06:17-PM-PDT Notice WebLogicServer AdminServer BEA-000365 Server state changed to STARTING Apr-8-2014-7:06:18-PM-PDT Notice WebLogicServer AdminServer BEA-000365 Server state changed to ADMIN Apr-8-2014-7:06:19-PM-PDT Notice WebLogicServer AdminServer BEA-000365 Server state changed to RESUMING Apr-8-2014-7:06:20-PM-PDT Notice WebLogicServer AdminServer BEA-000361 Started WebLogic AdminServer Apr-8-2014-7:06:21-PM-PDT Notice WebLogicServer AdminServer BEA-000365 Server state changed to RUNNING 7 rows selected. SQL In this tutorial we migrated a MySQL Database table to Oracle Database using Sqoop.
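As a quick sanity check that is not part of the original steps, the row count in the Oracle table should match the seven rows inserted into the MySQL wlslog table:

-- Row count in the migrated table; the expected value is 7
SELECT COUNT(*) AS migrated_rows
FROM   OE.WLSLOG;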
Wiki Page: Object Types and Column Substitutability
This article shows you how to use extend parent (or superclass) objects. You extend parent classes when you implement specialized behaviors (or methods) in subtypes. That’s because SQL statements can’t work with specialized methods when a table’s column stores subclasses in a superclass column type. Substitutability is the process of storing subtypes in a super type column. It is a powerful feature of the Oracle database. The “type evolution” feature of the Oracle Database 12 c release makes it more important because it makes it more flexible. The flexibility occurs because Oracle lets you evolve parent classes. You evolve parent classes when you implement MEMBER functions or procedures, and you want to access them for all substitutable column values. That’s necessary because you need to define the MEMBER function or procedure in the column’s base object type. Prior to Oracle Database 12 c , you couldn’t change (evolve) a base type. If you’re new to the idea of object types and subtypes, you may want to check out my earlier “Object Types and Subtypes” article. Before discussing the complexity of creating and evolving object types to support column substitutability, let’s create a base_t object type. The base_t object type will become our root node object type. A root node object type is our most general object type. A root node is also the topmost node of an inverted tree of object types. All subtypes of the root node become child nodes, and child nodes without their own children are at the bottom of the tree and they’re leaf nodes. The following creates the base_t object type. It is similar to object types that I use in related articles to keep ideas consistent and simple across the articles. This version of the base_t object doesn’t try to maintain an internal unique identifier because the table maintains it as a surrogate key. SQL CREATE OR REPLACE 2 TYPE base_t IS OBJECT 3 ( oname VARCHAR2(30) 4 , CONSTRUCTOR FUNCTION base_t 5 RETURN SELF AS RESULT 6 , MEMBER FUNCTION get_oname RETURN VARCHAR2 7 , MEMBER PROCEDURE set_oname (oname VARCHAR2) 8 , MEMBER FUNCTION to_string RETURN VARCHAR2) 9 INSTANTIABLE NOT FINAL; 10 / The oname attribute on line two holds the name of the object type. Lines 4 and 5 define the default constructor, which has no formal parameters. Line 6 defines an accessor method, or getter , and line 7 defines a mutator , or setter . Line 8 defines a traditional to_string method that lets you print the contents of the object type. Next, let’s implement the base_t object type’s body: SQL CREATE OR REPLACE 2 TYPE BODY base_t IS 3 /* A default constructor w/o formal parameters. */ 4 CONSTRUCTOR FUNCTION base_t 5 RETURN SELF AS RESULT IS 6 BEGIN 7 self.oname := 'BASE_T'; 8 RETURN; 9 END; 10 /* An accessor, or getter, method. */ 11 MEMBER FUNCTION get_oname RETURN VARCHAR2 IS 12 BEGIN 13 RETURN self.oname; 14 END get_oname; 15 /* A mutator, or setter, method. */ 16 MEMBER PROCEDURE set_oname 17 ( oname VARCHAR2 ) IS 18 BEGIN 19 self.oname := oname; 20 END set_oname; 21 /* A to_string conversion method. */ 22 MEMBER FUNCTION to_string RETURN VARCHAR2 IS 23 BEGIN 24 RETURN '['||self.oname||']'; 25 END to_string; 26 END; 27 / Line 7 assigns a literal value to the oname attribute. Line 24 returns the value of the oname attribute for the instance. The remainder of the object type is generic. You can read about the generic features in my “Object Types and Bodies Basics” and about accessor and mutator methods in my “Object Types with Getters and Setters” articles. 
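If you want to smoke-test the base_t type before adding any subtypes, a short anonymous block like the following should print [BASE_T]; the variable name is mine, everything else comes from the type defined above.

SET SERVEROUTPUT ON
DECLARE
  lv_base  base_t := base_t();   -- no-argument constructor defined above
BEGIN
  DBMS_OUTPUT.put_line(lv_base.to_string());
END;
/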
Let’s define and implement a hobbit_t subtype of our base_t object type. The hobbit_t object type is: SQL CREATE OR REPLACE TYPE hobbit_t UNDER base_t 2 ( genus VARCHAR2(20) 3 , name VARCHAR2(20) 4 , CONSTRUCTOR FUNCTION hobbit_t 5 ( genus VARCHAR2 6 , name VARCHAR2) RETURN SELF AS RESULT 7 , MEMBER FUNCTION get_genus RETURN VARCHAR2 8 , MEMBER FUNCTION get_name RETURN VARCHAR2 9 , MEMBER PROCEDURE set_genus (genus VARCHAR2) 10 , MEMBER PROCEDURE set_name (name VARCHAR2) 11 , OVERRIDING MEMBER FUNCTION to_string RETURN VARCHAR2) 12 INSTANTIABLE NOT FINAL; 13 / Lines 2 and 3 add two new genus and name attributes to the hobbit_t subtype. The hobbit_t subtype also inherits the oname attribute from its parent base_t type. Lines 7 and 8 define two getters and lines 9 and 10 define two setters, which support the genus and name attributes of the hobbit_t subtype. The hobbit_t object type’s getter and setters are unique to the subtype. They are also a specialization of the base_t object type. As such, these getters and setters are inaccessible to instances of the base_t object type. Line 11 defines an overriding to_string function for the base_t type’s to_string function. The following implements the hobbit_t object body: SQL CREATE OR REPLACE TYPE BODY hobbit_t IS 2 /* A default constructor with two formal parameters. */ 3 CONSTRUCTOR FUNCTION hobbit_t 4 ( genus VARCHAR2 5 , name VARCHAR2 ) 6 RETURN SELF AS RESULT IS 7 BEGIN 8 self.oname := 'HOBBIT_T'; 9 self.name := name; 10 self.genus := genus; 11 RETURN; 12 END; 13 /* An accessor, or getter, method. */ 14 MEMBER FUNCTION get_genus RETURN VARCHAR2 IS 15 BEGIN 16 RETURN self.genus; 17 END get_genus; 18 /* An accessor, or getter, method. */ 19 MEMBER FUNCTION get_name RETURN VARCHAR2 IS 20 BEGIN 21 RETURN self.name; 22 END get_name; 23 /* A mutator, or setter, method. */ 24 MEMBER PROCEDURE set_genus 25 ( genus VARCHAR2 ) IS 26 BEGIN 27 self.genus := genus; 28 END set_genus; 29 /* A mutator, or setter, method. */ 30 MEMBER PROCEDURE set_name 31 ( name VARCHAR2 ) IS 32 BEGIN 33 self.name := name; 34 END set_name; 35 /* A to_string conversion method. */ 36 OVERRIDING MEMBER FUNCTION to_string RETURN VARCHAR2 IS 37 BEGIN 38 /* Uses general invocation on parent to_string 39 function. */ 40 RETURN (self as base_t).to_string 41 || '['||self.genus||']['||self.name||']'; 42 END to_string; 43 END; 44 / Lines 4 and 5 list the parameters for the hobbit_t constructor. Line 8 assigns a literal value to the oname attribute of the base_t object type. Lines 9 and 10 assign the formal parameters to the genus and name attributes of the hobbit_t subtype. Line 40 uses a general invocation statement to call the base_t’s to_string function. You can now create a table that has a substitutable column that uses the base_t parent object type. The Oracle database assumes object type columns are substitutable at all levels, unless you turn off a column’s substitutability. The following creates a tolkien table, and it has only two columns. One column has a NUMBER data type and the other has a user-defined object type. 
The base_t object type column is substitutable at all levels: SQL CREATE TABLE tolkien 2 ( tolkien_id NUMBER 3 , character BASE_T ); You create a tolkien_s sequence for the unique tolkien_id column with the following: SQL CREATE SEQUENCE tolkien_s START WITH 1001; You can insert one base_t and two hobbit_t object types with the following INSERT statements: SQL INSERT INTO tolkien VALUES 2 ( tolkien_s.NEXTVAL, base_t() ); SQL INSERT INTO tolkien VALUES 2 ( tolkien_s.NEXTVAL, hobbit_t('HOBBIT','Bilbo') ); SQL INSERT INTO tolkien VALUES 2 ( tolkien_s.NEXTVAL, hobbit_t('HOBBIT','Frodo') ); The following simple query shows you the unique identifier in the tolkien_id column and collapsed object types in the character column of the tolkien table: SQL COLUMN character FORMAT A40 SQL SELECT tolkien_id 2 , character 3 FROM tolkien; It should display the following: TOLKIEN_ID CHARACTER(ONAME) ---------- ---------------------------------------- 1001 BASE_T('BASE_T') 1002 HOBBIT_T('HOBBIT_T', 'HOBBIT', 'Bilbo') 1003 HOBBIT_T('HOBBIT_T', 'HOBBIT', 'Frodo') Oracle always stores object instances as collapsed object instances in tables. You need to use the TREAT function in SQL to read instances of an object type. The TREAT function lets you place in memory an instance of an object type. The TREAT function requires that you designate the type of object instance. If you want the TREAT function to work with all rows of the table, you designate the column’s object type as the base (or superclass) type. Designating a subtype to work like a parent, grandparent, or any antecedent type is a form of casting. Though casting in this case is actually dynamic dispatch. Dynamic dispatch lets you pass any subtype as a parent or antecedent type. Dynamic dispatch inspects the object and treats it as a unique object. The following query uses the TREAT function to read the parent and any subtype of the parent object type: SQL COLUMN to_string FORMAT A40 SQL SELECT tolkien_id 2 , TREAT(character AS BASE_T).to_string() AS to_string 3 FROM tolkien; It prints the oname attribute for base_t instances and the oname, genus, and name attributes for hobbit_t instances, like TOLKIEN_ID TO_STRING ---------- --------------------------- 1001 [BASE_T] 1002 [BASE_T] 1003 [HOBBIT_T][HOBBIT][Bilbo] 1004 [HOBBIT_T][HOBBIT][Frodo] The TREAT function manages dynamic dispatch but requires any specialized method of a subtype to exist in the parent or antecedent type to which it is cast. Any query can cast to the root or an intermediate parent subtype. The TREAT function raises an exception when you don’t have an equivalent method stub (definition) in the parent or antecedent type. For example, let’s modify the previous query and change the method call on line 2 from the to_string function to the get_name function. The new query is: SQL COLUMN to_string FORMAT A40 SQL SELECT tolkien_id 2 , TREAT(character AS BASE_T).get_name() AS get_name 3 FROM tolkien; It fails with the following error: , TREAT(character AS BASE_T).get_name() AS get_name * ERROR at line 2: ORA-00904: "C##PLSQL"."BASE_T"."GET_NAME": invalid identifier The reason for the failure is interesting. It occurs because the get_name function is only part of the hobbit_t subtype and can’t be found as an identifier inside the base_t object type. PL/SQL identifiers are: reserved or key words; predefined identifiers; quoted identifiers; user-identifiers; and user-defined variables, subroutine, and data or object type names. 
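One way to sidestep the error without touching base_t, at the cost of only seeing hobbit_t values, is to TREAT the column as the subtype itself. For rows whose instance is not a hobbit_t, the TREAT should yield NULL, so get_name simply returns NULL for those rows instead of raising an exception; a hedged sketch:

COLUMN get_name FORMAT A20
SELECT tolkien_id
,      TREAT(character AS HOBBIT_T).get_name() AS get_name
FROM   tolkien;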
You can access the MEMBER functions or procedures (method) of a subtype when you cast to a parent type provided you meet two conditions. First, you must implement the MEMBER method in the subtype. Second, you must define the same MEMBER method in the parent type. Accessing a subtype MEMBER method differs from general invocation. General invocation occurs when you call a MEMBER method from a parent or antecedent type from a subtype’s OVERRIDING MEMBER method. Oracle doesn’t explain how you call a subtype’s method from a parent or antecedent type but there is a close corollary – packages. For example, you can only call a package function or procedure from another PL/SQL block when you’ve defined it in the package specification. This means you need to implement a stub for the get_name function inside the base_t object type because it acts as the specification. You add a get_name function to the base_t object type in the next example: SQL CREATE OR REPLACE 2 TYPE base_t IS OBJECT 3 ( oname VARCHAR2(30) 4 , CONSTRUCTOR FUNCTION base_t 5 RETURN SELF AS RESULT 6 , MEMBER FUNCTION get_name RETURN VARCHAR2 7 , MEMBER FUNCTION get_oname RETURN VARCHAR2 8 , MEMBER PROCEDURE set_oname (oname VARCHAR2) 9 , MEMBER FUNCTION to_string RETURN VARCHAR2) 10 INSTANTIABLE NOT FINAL; 11 / Line 6 adds the get_name function to the base_t object type. The following shows you how to implement get_name function stub in the object type body: SQL CREATE OR REPLACE 2 TYPE BODY base_t IS 3 CONSTRUCTOR FUNCTION base_t 4 RETURN SELF AS RESULT IS 5 BEGIN 6 self.oname := 'BASE_T'; 7 RETURN; 8 END; 9 MEMBER FUNCTION get_name RETURN VARCHAR2 IS 10 BEGIN 11 RETURN NULL; 12 END get_name; 13 MEMBER FUNCTION get_oname RETURN VARCHAR2 IS 14 BEGIN 15 RETURN self.oname; 16 END get_oname; 17 MEMBER PROCEDURE set_oname 18 ( oname VARCHAR2 ) IS 19 BEGIN 20 self.oname := oname; 21 END set_oname; 22 MEMBER FUNCTION to_string RETURN VARCHAR2 IS 23 BEGIN 24 RETURN '['||self.oname||']'; 25 END to_string; 26 END; 27 / Lines 9 through 12 implement the get_name function stub. You should note that it returns a null value because the name attribute doesn’t exist in the root node (base_t) object type. The change to the hobbit_t object type is simpler. All you need to do is add the OVERRIDING keyword before the get_name member function in the hobbit_t object type and body. With that change, you can successfully run the following query: SQL COLUMN get_name FORMAT A20 SQL SELECT tolkien_id 2 , TREAT(character AS BASE_T).get_name() AS get_name 3 FROM tolkien; It now works and prints: TOLKIEN_ID GET_NAME ---------- -------------------- 1001 1002 Bilbo 1003 Frodo This article showed you how to extend parent object types. It also showed you how to modify parent types to support generalized calls with the TREAT function. Together these principles show you how to leverage substitutability on columns.
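As a closing aside, once the rows are in place you can also combine TREAT with the IS OF condition to restrict a query to the substituted subtype rows only; a hedged sketch against the tolkien table created above:

COLUMN to_string FORMAT A40
SELECT tolkien_id
,      TREAT(character AS HOBBIT_T).to_string() AS to_string
FROM   tolkien
WHERE  character IS OF (hobbit_t);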