SQL Loader parallel sessions [message #659930] |
Sun, 05 February 2017 16:54 |
|
abhi_orcl
Messages: 40 Registered: December 2016
|
Member |
|
|
Hi Everyone,
I have requirement where I have to load 30-40 millions rows from a file into the database.I am supposed to use SQL Loader and the best I can think of is implementing parallel and direct path loads. Now, I am trying the skip and load rows options to create multiple parallel SQL Loader sessions. For eg:
Sql loader 1 : load first 1000000 rows
Sql loader 1 : skip = 1000000 load 10000000
...... and so on
I have the below parameters for show parallel:
NAME TYPE VALUE
-------------------------------------------------- ----------- ----------------------------------------------------------------------------------------------------
fast_start_parallel_rollback string LOW
parallel_adaptive_multi_user boolean FALSE
parallel_automatic_tuning boolean FALSE
parallel_degree_level integer 100
parallel_degree_limit string CPU
parallel_degree_policy string AUTO
parallel_execution_message_size integer 16384
parallel_force_local boolean FALSE
parallel_instance_group string
parallel_io_cap_enabled boolean FALSE
parallel_max_servers integer 128
parallel_min_percent integer 0
parallel_min_servers integer 24
parallel_min_time_threshold string AUTO
parallel_server boolean FALSE
parallel_server_instances integer 1
parallel_servers_target integer 48
parallel_threads_per_cpu integer 1
recovery_parallelism integer 4
Currently I am loading data for the tables in chunk sizes of 2000000. So it takes around 10-12 parallel sessions and works fine. But I am just wondering if I can still reduce the chunks to say 100k and then run parallel sessions? From the above parameters is it that I can run at least 128 parallel sessions in a given instance?
I need an optimum number and the optimum parallel which would work magic. Could someone advise.
Just to add, I already have no indexes, no constraints etc on the tables and it is a one time activity and the target tables are all empty.
Regards
|
|
|
|
|
|
|
Re: SQL Loader parallel sessions [message #659961 is a reply to message #659944] |
Mon, 06 February 2017 10:04 |
|
abhi_orcl
Messages: 40 Registered: December 2016
|
Member |
|
|
@LittleFoot : Its a sybase database. Also, I am not sure if we can create a db link between oracle and sybase.Will check that.
@Michel : Here are my comments:
1. I am using direct=true parallel=true and multithreading=true while invoking concurrent sql loaders. Do i need to do more for the parallel option?
2. I have to check with the DBA.
3. I did query v$osstat and the NUM_CPUS has value 48 and NUM_CPU_CORES 24. So i think its 24.
4. I hope you meant to ask if there would be concurrent sessions running on the db at that time. If yes then just wanted to inform that this process would be a weekend task where no process would be triggered except the above.
[Updated on: Mon, 06 February 2017 10:22] Report message to a moderator
|
|
|
|
Re: SQL Loader parallel sessions [message #659964 is a reply to message #659961] |
Mon, 06 February 2017 10:36 |
John Watson
Messages: 8962 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
abhi_orcl wrote on Mon, 06 February 2017 16:04
<snip>
1. I am using direct=true parallel=true and multithreading=true while invoking concurrent sql loaders. Do i need to do more for the parallel option?
<snip>
.
Yes, you do need more. First, you must include DIRECT=TRUE and second you must have multiple input data files and third you must launch one sqlldr session for each file.
|
|
|
|
|
|
|
|
|
|
|
|
|
Re: SQL Loader parallel sessions [message #660059 is a reply to message #660057] |
Wed, 08 February 2017 02:52 |
|
s234blog
Messages: 5 Registered: February 2017
|
Junior Member |
|
|
On 2), yes I in principle agree, but there may be situations where you have to load over the network. And if the author has say 10GbE I doubt network is his bottleneck.
On 3) - simply from experience. See here for example [...] - up to 40 SQL*Loader sessions and the bottleneck stays client CPU, not client IO, network or database (of course it depends on the environment).
Best,
Chris
[Edit MC: remove spam for blog]
[Updated on: Wed, 08 February 2017 02:55] by Moderator Report message to a moderator
|
|
|
|
Re: SQL Loader parallel sessions [message #660061 is a reply to message #660035] |
Wed, 08 February 2017 02:58 |
John Watson
Messages: 8962 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
Have you set the tables to NOLOGGING ? That can help quite a lot with direct loads. It is also worth checking whether you are in fact getting direct load. There are various circumstances (I cannot enumerate them) that cause SQL Loader to revert to conventional path load when you don't expect it.
|
|
|
|
Re: SQL Loader parallel sessions [message #660184 is a reply to message #660062] |
Fri, 10 February 2017 22:44 |
|
abhi_orcl
Messages: 40 Registered: December 2016
|
Member |
|
|
Ok Guyzz.. It is really all messed up for me now . To start with here are few of my findings which I could find till now:
The Sql loader logs state that the direct path with parallel option has been used - "Path used: Direct - with parallel option."
But as suggested I did try to query the explain plan being used and it says:
SQL_ID 6xdtp2w0zdjsq
--------------------
INSERT /*+ SYS_DL_CURSOR */ INTO "TABLE_X"
("ACCT_ID","INSTR_ID","I_TRAN","D_TRD","I_REG","I_EXTL_TRAN","C_TX_CDE",
"A_TX_COST_TRD","D_SET","A_TRD_PRC","Q_TRD","A_TRD_DOLLAR","C_TRD_TRAN_T
YPE","A_BKR_COMSN","BKR_ID","A_ACCR_INT_AMT","A_REAL_GAIN_LOSS","S_ROW_U
PD","N_SYB_DB_USER","I_ACCT_ID") VALUES (trim (:KLLUBV_1),trim
(:KLLUBV_2),trim (:KLLUBV_3),TO_TIMESTAMP(:KLLUBV_4, 'mon dd yyyy
HH:mi:ss:ff3 AM'),trim (:KLLUBV_5),trim (:KLLUBV_6),trim
(:KLLUBV_7),trim (:KLLUBV_8),TO_TIMESTAMP(:KLLUBV_9, 'mon dd yyyy
HH:mi:ss:ff3 AM'),trim (:KLLUBV_10),trim (:KLLUBV_11),trim
(:KLLUBV_12),trim (:KLLUBV_13),trim (:KLLUBV_14),trim (:KLLUBV_15),trim
(:KLLUBV_16),trim (:KLLUBV_17),TO_TIMESTAMP(:KLLUBV_18, 'mon dd yyyy
HH:mi:ss:ff3 AM'),trim (:KLLUBV_19),trim (:KLLUBV_20))
-------------------------------------------------------------
| Id | Operation | Name | Cost |
-------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 |
| 1 | LOAD TABLE CONVENTIONAL | TABLE_X| |
-------------------------------------------------------------
Note
-----
- cpu costing is off (consider enabling it)
- automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold
But the wait in v$session shows direct path load.
I am really not able to understand this parallel concept now. Is my Sql Loader really using the direct with parallel option as the logs and the explain plan contradict each other? Is it because I am trying to initiate 10-20 parallel sessions. Also, I can see in session tables that at any point of time, I can't see more than 2-3 insert statements related to my Sql Loader running. Seems like OS CPU issue (Linux OS has cpu as 2)???
Also, for other tables, I did try to load the tables after moving the files to the DB server using FTP and it did run pretty fast. So definitely there is some network issue. But the above is really killing me now. Pleaseeeee suggest
[Updated on: Fri, 10 February 2017 23:17] Report message to a moderator
|
|
|
|
|
|
|
|
|
Re: SQL Loader parallel sessions [message #664673 is a reply to message #659930] |
Thu, 27 July 2017 02:36 |
|
All of it depends on following,
1. N/W Bandwidth
2. Average ROW Size
3. How fast read is possible at Source.
4. How fast is your Target (In case of Exa its best )
I am using Named Pipe Approach to do the Data Migration from DB2 to Oracle, along with some other performance improvement mechanism like Partitioning the Target Tables and Loading the partitions using multiple SQLLDR sessions. With 1 GBPS N/W bandwidth, i can migrate 100 M rows having Avg Row size as 50-60 bytes in 6-8 Minutes. Number of parallel SQLLDR sessions are 10.
No Index, Key on the Target Table but Table is in LOGGING mode.
There is No Intermediate File getting created.
|
|
|