Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
![]() |
![]() |
Home -> Community -> Mailing Lists -> Oracle-L -> RE: Informatica Bulk Mode behavior
Tim, Waleed, we got this response from Informatica Support:
We are using the OCI DPL (direct path load) in the following way:
First, We get a block with "x" # of rows.
Do following loop
Call ColArrayToStream for these "x" rows.
Load whatever we can (call DirPathLoadStream)
Do a DirPathStreamReset.
while ColArrayToStream did not convert all the rows, repeat the above
steps for the remainder rows.
Every so often, depending on our "commit" interval, we will call DirPathFinish (and then reprepare).
The 2 differences between our process and Oracle's demo program are:
1) the # of rows "x" may be different. They may be optimizing the "x"
that they pick. We derive "x"=20
from the buffer block size.
2) Oracle does 1 DirPathFinish at the end of the process, whereas we do it on every commit. =20
Thus, we could probably simulate exactly what Oracle's test program is
doing by making sure that the=20
# of rows in the block matches their test program and the target commit
interval is really high.
So there seem to be 2 possible places where we may be causing more extents to be created than necessary. =20
We still have one X factor that we need to account for. The CR mentions that if the commit interval is high, we=20 sometimes get the following error:
ORA-01658: unable to create INITIAL extent for segment in table space=20
Our guess is that somehow we ran out of space to get an extent.
(perhaps in the temp table space) Perhaps=20
this can be limited somehow if there is a way to turn off logging (if it
was, indeed, on), but I didn't see=20
this option available. Although we haven't tested this yet, we would
think that the sample program would have=20
similar issues, although it may take a little longer because they may
not have wasted extents as listed in option=20
1) above.
Solutions:
The workaround fix is to play with the buffer block size and the commit
size. If the buffer block size fits=20
the correct amount of rows, we may have a somewhat optimal fix. We may
want to consider logging certain=20
information that may help the user pick a correct buffer size, but, of
course, this can be kinda cryptic.
-----Original Message-----
From: oracle-l-bounce_at_freelists.org
[mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Khedr, Waleed
Sent: Tuesday, April 27, 2004 12:17 AM
To: oracle-l_at_freelists.org
Subject: RE: Informatica Bulk Mode behavior
I use Informatica 6.x and so far did not have any problem. But you may be right, that this might have to do with the way =3D Informatica calls Oracle OCI.
The other way I see it, Informatica is giving bulk load as a relational
=3D method to load the target instead of external loader, which must be =
=3D
thread-safe to run without problems using their multi-thread options. =
=3D
This leads to the issues we encountered when we run direct load in =3D
parallel, which is different than parallel DML/insert in direct mode =3D
where threads try to broker the HWM.
Regards,
Waleed
-----Original Message-----
From: Tim Gorman [mailto:tim_at_sagelogix.com]
Sent: Monday, April 26, 2004 3:16 PM
To: oracle-l_at_freelists.org
Subject: Re: Informatica Bulk Mode behavior
Jeff and Waleed,
I think Informatica is blowing smoke on this error message, or at least =3D they are smoking something...
Each time you initiate a direct-path load operation, it will allocate a
=3D new extent, but the arraysize or data save frequency has no bearing =
on
size =3D or number of extents, whether using "direct-path" (a.k.a. =
"bulk")
loads or "conventional" SQL INSERT loads. In other words, based on the
behavior described, it is quite likely that Informatica has a bug in
their =3D bulk-load functionality resulting from improper use of the
Oracle Call Interface =3D
(OCI)
application programming interface (API). I don't think that the cause =
=3D
of the ORA-01658 error has to do with a problem of any kind within
Oracle =3D at all, no matter whose perspective you use.
Programs like Informatica can only use the OCI functions for direct-path loading (i.e. Functions named "OCIDirPath...()") and none of these OCI functions presents any controls for extent sizing.
The only explanation is that Informatica is doing something silly and pointless like calling "OCIDirPathFinish()" instead of "OCIDirPathDataSave()" after each data save.
The "OCIDirPathFinish()" function closes a direct-path load session,
requiring that a new direct-path load session be initiated, causing a =
=3D
new extent to be allocated. The "OCIDirPathDataSave()" function merely
=3D saves the rows to blocks on disk, but does not initiate a new extent
or have =3D any control over extents.
In which case, the Informatica documentation should read something like:
Informatica code bug: Due to a misreading of Oracle's documentation
and sample OCI code, Informatica is improperly performing data saves during bulk loads using the "direct-path" API. Solution: Set the commit interval very high, to mask the fact that Informatica is improperly ending the direct-path load with each data save. This will result in fewer extents being used and less space being wasted, depending on the table's extent size.
Just my $0.02 as a long-time OCI programmer. I don't know for sure if =
=3D
this is what is happening, but it sure as heck sounds like it. It would
be interesting to "truss" the Informatica load process, to see what OCI
function calls it is issuing... :-)
Hope this helps...
-Tim
P.S. It would be interesting to find out if Informatica v7.x
includes any "performance enhancements" for bulk loading over v6.x. If so, the only "enhancement" I can think of is that they detected their own bug and fixed it... :-) After all, nothing new has been added to the "OCIDirPath...()" functions in the way of a performance enhancement since they became available in v8.0.x, I'm pretty sure...
on 4/26/04 6:48 AM, Thomas Jeff at jeff.thomas_at_thomson.net wrote:
> Waleed, =3D3D20
>=3D20
>=3D20 >=3D20
>=3D20
>=3D20
>=3D20
>loading. To resolve this=3D3D20 issue increase the commit interval to=20 >very high value. SUPPORTING
>(using same code as bulk=3D3D20
>Direct Path loading does=3D3D20 not check for Partial blocks are not=20 >used, so no reads are needed to find them and fewer writes=3D3D20
>=3D20 >=3D20
>=3D20 >=3D20
>=3D20
>=3D20
>=3D20 >=3D20
>=3D20
>before=3D3D3D20 they reran their job. =3D3D3D20 =3D20
>=3D20
-- Archives are at http://www.freelists.org/archives/oracle-l/ FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html ----------------------------------------------------------------- ---------------------------------------------------------------- Please see the official ORACLE-L FAQ: http://www.orafaq.com ---------------------------------------------------------------- To unsubscribe send email to: oracle-l-request_at_freelists.org put 'unsubscribe' in the subject line. -- Archives are at http://www.freelists.org/archives/oracle-l/ FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html ----------------------------------------------------------------- ---------------------------------------------------------------- Please see the official ORACLE-L FAQ: http://www.orafaq.com ---------------------------------------------------------------- To unsubscribe send email to: oracle-l-request_at_freelists.org put 'unsubscribe' in the subject line. -- Archives are at http://www.freelists.org/archives/oracle-l/ FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html -----------------------------------------------------------------Received on Tue Apr 27 2004 - 07:50:51 CDT
![]() |
![]() |