Re: anyone use pipelined functions? (Oracle-L)
My responses are inline below.
"Ryan" <ryan_oracle_at_cox.net>
Sent by: ml-errors_at_fatcity.com
12/31/2003 11:54 AM
Please respond to
ORACLE-L_at_fatcity.com
To
Multiple recipients of list ORACLE-L <ORACLE-L_at_fatcity.com>
cc
Subject
Re: anyone use pipelined functions?
great response. questions inline.
----- Original Message -----
To: "Multiple recipients of list ORACLE-L" <ORACLE-L_at_fatcity.com>
Sent: Wednesday, December 31, 2003 2:14 PM
> At the time, I did: I used simple sql_tracing for much of the analysis,
> and definitely analyzed in stages. Unfortunately, most of the trace data
> was lost. I have a couple of the files, from which I started with 10,000
> row inserts (with commit batches of 2000) vs. 10,000 directly appended
> rows.
>
> For 10,000 single row inserts (non-pipelined function), the average was
> about 370 rows/second.
> For 10,000 appended rows (pipelined), the average was about 2100
> rows/second; this scaled mostly linearly to 1,000,000 rows (in further
> testing), and to the total number of rows in the table. Clearly this was
> a big improvement to the original function, although the query/index was
> probably the best performance improvement overall.
did you test this with an 'insert select'? or was it not possible given your requirements?
Not sure what you mean by 'this,' but the pre-pipeline version was simply a bunch of single-row inserts within the body of the procedure. The whole point of needing to stuff this into a routine is that the data manipulation requirements made it impossible to simply use an insert into ... select func1(col1), func2(col2), ... func3(col3) ...

The pipelined version leveraged insert /*+ append */ ... as select, but I never tested omitting the append hint (which would simply have run slower). I'm sure, though, that it would still have been faster than single-row inserts. So pipelining has the advantage of allowing insert into with or without an append hint (for example, if you need to keep constraints and/or triggers enabled), something you couldn't do with looped inserts. (I believe you can get similar behavior with bulk inserts within the PL/SQL code.)
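For anyone who hasn't seen the pattern, a pipelined load along the lines described above might be sketched like this; the type, table, and function names here are hypothetical, not taken from the actual procedure:

```sql
-- Collection type matching the target table's row shape (hypothetical names).
create type target_row_t as object (id number, val varchar2(100));
/
create type target_tab_t as table of target_row_t;
/

-- Pipelined function: reads the source cursor, applies the per-row
-- transformations that plain SQL couldn't express, and pipes each
-- transformed row back to the SQL engine.
create or replace function transform_rows return target_tab_t
  pipelined
is
begin
  for r in (select id, val from source_table) loop
    -- stand-in for the real data manipulation logic
    pipe row (target_row_t(r.id, upper(r.val)));
  end loop;
  return;
end;
/

-- Direct-path load; drop the APPEND hint if constraints or triggers
-- must remain enabled during the load.
insert /*+ append */ into target_table
  select * from table(transform_rows);
commit;
```

The key point is that the procedural logic lives in the function, while the insert itself stays a single SQL statement, so the append hint becomes an option rather than an impossibility.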
>
> Another not insignificant contributor to the overall time of the original
> (and tuned) procedure was the target table sequence. Adding even a small
> cache (10) to the sequence dropped its overall contribution to the runtime
> significantly, but it was still the second-largest contributor to the
> tuned function, following the insert. I also replaced redundant calls to
> USER with a single call and variable (an obvious programming flaw).
how do you know how much time was spent on sequences?
~1400 sequence values/second with nocache; ~12000 sequence values/second with cache 100.

You only seem to get marginal benefits by increasing the cache by factors of 10: a cache of 100 was, I think, only a bit better performance-wise than a cache of 10. We probably could have gotten even better performance by selecting the initial sequence value, using a variable within the routine to generate the rest, and then recreating the sequence following the load. Almost a third of the ~30 minute runtime was spent generating sequence values.
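The bypass described above might be sketched like this (hypothetical names; this assumes a single-session load, since nothing else can safely draw from the sequence while it is out of play):

```sql
declare
  v_id number;
begin
  -- Seed once from the sequence, then generate values locally.
  select load_seq.nextval into v_id from dual;

  for r in (select val from source_table) loop
    v_id := v_id + 1;
    insert into target_table (id, val) values (v_id, r.val);
  end loop;
  commit;

  -- Afterwards, resynchronize: drop and recreate the sequence with
  -- START WITH v_id + 1 (or step it forward) so future callers
  -- don't collide with the keys just loaded.
end;
/
```

The resynchronization step is what makes this safe to repeat; skipping it would hand out duplicate keys on the next normal insert.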
It's pretty easy to test the use of a cache in a 'naive' way, too. Let's compare against manually generated values (using a PL/SQL variable):
adonahue_at_orcl/rhsac3> create sequence cache0_seq nocache;
Sequence created.
adonahue_at_orcl/rhsac3> create sequence cache10_seq cache 10;
Sequence created.
adonahue_at_orcl/rhsac3> create sequence cache100_seq cache 100;
Sequence created.
adonahue_at_orcl/rhsac3> set timing on
adonahue_at_orcl/rhsac3> set autotrace traceonly
adonahue_at_orcl/rhsac3> declare
  2  v_n number;
  3  begin
  4  for i in 1..10000 loop
  5  select cache0_seq.nextval into v_n from dual;
  6  end loop;
  7  end;
  8  /
PL/SQL procedure successfully completed.
Elapsed: 00:00:07.06 <-- NOCACHE
adonahue_at_orcl/rhsac3> /
PL/SQL procedure successfully completed.
Elapsed: 00:00:06.58
adonahue_at_orcl/rhsac3> /
PL/SQL procedure successfully completed.
Elapsed: 00:00:06.91
adonahue_at_orcl/rhsac3> declare
  2  v_n number;
  3  begin
  4  for i in 1..10000 loop
  5  select cache10_seq.nextval into v_n from dual;
  6  end loop;
  7  end;
  8  /
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.46 <-- CACHE 10
adonahue_at_orcl/rhsac3> /
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.45
adonahue_at_orcl/rhsac3> /
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.46
adonahue_at_orcl/rhsac3> declare
  2  v_n number;
  3  begin
  4  for i in 1..10000 loop
  5  select cache100_seq.nextval into v_n from dual;
  6  end loop;
  7  end;
  8  /
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.94 <-- CACHE 100
adonahue_at_orcl/rhsac3> /
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.92
adonahue_at_orcl/rhsac3> /
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.92
adonahue_at_orcl/rhsac3> declare
  2  v_n number := 1;
  3  begin
  4  for i in 1..10000 loop
  5  v_n := v_n + 1;
  6  end loop;
  7  end;
  8  /
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.03 <-- NO SEQUENCE, much faster.
adonahue_at_orcl/rhsac3> /
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.01
adonahue_at_orcl/rhsac3>
As we can see, there was a large improvement from the nocache sequence to a cache of 10. (Interestingly, the overhead of populating the cache 100 sequence actually made it cost slightly more than the cache 10 sequence for this small number of rows. Larger row counts probably make larger cache values more appropriate.)

On the other hand, bypassing the sequence entirely is by far the fastest: it returns almost instantaneously. Let's run a larger test of cache 100 vs. manually generated values:
adonahue_at_orcl/rhsac3> declare
  2  v_n number;
  3  begin
  4  for i in 1..1000000 loop
  5  select cache100_seq.nextval into v_n from dual;
  6  end loop;
  7  end;
  8  /
PL/SQL procedure successfully completed.
Elapsed: 00:02:23.73 <-- CACHE 100 for 1,000,000 values
adonahue_at_orcl/rhsac3> declare
  2  v_n number;
  3  begin
  4  select cache100_seq.nextval into v_n from dual;
  5  for i in 1..1000000 loop
  6  v_n := v_n + 1;
  7  end loop;
  8  end;
  9  /
PL/SQL procedure successfully completed.
Elapsed: 00:00:01.06 <-- MANUAL (accounting for seed)
Lessons seem to be: a nocache sequence is far slower than even a small cache; increasing the cache beyond that yields only marginal (and, at small row counts, sometimes negative) returns; and for a bulk load, seeding once and generating values manually beats the sequence by orders of magnitude.
Adam
>
>
> Fantastic results Adam.
>
> You didn't perhaps do interim testing did you, so that you
> know how much of the benefit was due to the pipelined functions?
>
> You made quite a few changes, and a breakdown of the
> benefits of each would be interesting to see.
>
> Jared
>
>
> AdamDonahue_at_maximus.com
> Sent by: ml-errors_at_fatcity.com
> 12/31/2003 09:04 AM
> Please respond to ORACLE-L
>
> To: Multiple recipients of list ORACLE-L
> <ORACLE-L_at_fatcity.com>
> cc:
> Subject: Re: anyone use pipelined functions?
>
>
>
> I recently rewrote a poor-performing data load procedure (with single row
> inserts, commit batches of 2000) to a pipelined table function, which
> enabled insert /*+ append */ into the target table, which greatly enhanced
> performance. The original routine contained an embedded select, a second
> select using a top-level select key, and then a large loop with data
> operations culminating with an insert of each row (and sequence value
> generation). The routine took about four hours to run.
>
> I joined the queries into a single inner join select, parallelized; added
> a cache to the sequence (which had been set to zero); added a second index
> to the source table to enable FFS; engineered the function to leverage
> pipelining (moving the load to an insert /*+ append */ into ... select *
> from table( function )); made the requisite modifications to the target
> table, and reran the load. The time came down to about 25 minutes. The
> total work spent in engineering the procedure was about 2 hours, so the
> work put into tuning it + its improved runtime came in at less than the
> total original runtime! (Of course, this does not include reenabling
> constraints, triggers, etc., but these bring the total runtime up to
> about ~1 hour in this case, still an improvement.)
>
> The benefit-cost ratio here was quite high!
>
> Adam
>
>
>
>
-- Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- Author: INET: AdamDonahue_at_maximus.com
-- Fat City Network Services -- 858-538-5051 -- http://www.fatcity.com -- San Diego, California
Received on Wed Dec 31 2003 - 15:59:25 CST