Re: Client sqlplus SELECT + SPOOL knocks down server network
From: DA Morgan <damorgan_at_psoug.org>
Date: Mon, 10 Mar 2008 10:27:45 -0700
Message-ID: <1205170062.458836@bubbleator.drizzle.com>
>
> Daniel:
>
> I am doing the test from sqlplus (no ODBC this time)
>
>
> select /*+ full(a) */
> a.col1,
> a.col2
> from schema.table_name a;
>
> It starts displaying/spooling and then breaks with:
>
> ...
> 955800207 T
> ERROR:
> ORA-03135: la conexión ha perdido contacto
>
> 10078290 filas seleccionadas.
>
> ERROR:
> ORA-03114: no conectado a ORACLE
>
> (There are 11339771 rows in the table).
>
> From here on, there is no ping from or to the server.
>
> ifconfig on the server shows:
>
> RX packets:674109 errors:0 dropped:0 overruns:0 frame:0
> TX packets:674109 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:53271778 (50.8 MiB) TX bytes:259126296 (247.1 MiB)
> Interrupt:169
>
> And it's time to reboot again...
>
> Cheers.
>
> Carlos.
Date: Mon, 10 Mar 2008 10:27:45 -0700
Message-ID: <1205170062.458836@bubbleator.drizzle.com>
Carlos wrote:
> On 10 mar, 14:08, DA Morgan <damor..._at_psoug.org> wrote:
>> Carlos wrote: >>> On 10 mar, 13:21, DA Morgan <damor..._at_psoug.org> wrote: >>>> Carlos wrote: >>>>> Hello all: >>>>> I'm facing a strange problem on a 10.2.0.3 Oracle on Oracle Enterprise >>>>> Linux 5 server. >>>>> This morning someone was issuing a SELECT from a windows client via >>>>> ODBC and suddenly all the server communications broke down: even no >>>>> ping from or to that server worked. >>>>> Tried stop&start network but no avail. The server appears to be dead >>>>> at communications until reboot. >>>>> So I tried to repeat the SELECT from sqlplus on a windows client to a >>>>> OS file with SPOOL and all the communications on the server are gone >>>>> once again. Ping only answers to 127.0.0.1 or its own IP address, and >>>>> Linux is telling me that the network card is OK. Stop&start network >>>>> didn't fix the problem either. >>>>> Client sqlnet.log shows: >>>>> Fatal NI connect error 12170. >>>>> VERSION INFORMATION: >>>>> TNS for 32-bit Windows: Version 10.2.0.1.0 - Production >>>>> Windows NT TCP/IP NT Protocol Adapter for 32-bit Windows: Version >>>>> 10.2.0.1.0 - Production >>>>> Time: 10-MAR-2008 12:11:59 >>>>> Tracing not turned on. >>>>> Tns error struct: >>>>> ns main err code: 12535 >>>>> TNS-12535: Message 12535 not found; No message file for >>>>> product=NETWORK, facility=TNS >>>>> ns secondary err code: 12560 >>>>> nt main err code: 505 >>>>> TNS-00505: Message 505 not found; No message file for >>>>> product=NETWORK, facility=TNS >>>>> nt secondary err code: 60 >>>>> nt OS err code: 0 >>>>> Client address: <unknown> >>>>> The select is a two column (varchar2(9), char(1)) select of 11 million >>>>> rows aprox. >>>>> Could anybody tell me where to search next? (I didn't find anything at >>>>> metalink). >>>>> TIA. >>>>> Cheers. >>>>> Carlos. >>>> Some of what you've written seems self-conflicting. >>>> You start off telling us you've 10.2.0.3 and then everything >>>> thereafter indicates 10.2.0.1. >>>> Then you say "dead until reboot." Does that mean it is now >>>> working fine or not? >>>> How about queries using Oracle's native interface SQLNET >>>> rather than ODBC? >>>> Ping only answers to 127.0.0.1, I presume on that machine, >>>> have you verified that other addresses exist? >>>> Have you performed an ipconfig /renew? >>>> Perform an ipconfig /all ... what does it show? >>>> Are you using DHCP? >>>> Can you ping the server from elsewhere on the network? >>>> What are your firewall settings? >>>> What tool is connecting using ODBC? >>>> -- >>>> Daniel A. Morgan >>>> Oracle Ace Director & Instructor >>>> University of Washington >>>> damor..._at_x.washington.edu (replace x with u to respond) >>>> Puget Sound Oracle Users Groupwww.psoug.org >>> Hello Daniel: >>>> You start off telling us you've 10.2.0.3 and then everything >>>> thereafter indicates 10.2.0.1. >>> Yup: 10.2.0.3 is the server on EL5. 10.2.0.1 was the windows client >>> from which I did the test. >>>> Then you say "dead until reboot." Does that mean it is now >>>> working fine or not? >>> Working fine after reboot... until I issue the SELECT again. >>>> How about queries using Oracle's native interface SQLNET >>>> rather than ODBC? >>> This is what I did on the windows 10.2.0.1 client with sqlplus. >>>> Ping only answers to 127.0.0.1, I presume on that machine, >>>> have you verified that other addresses exist? >>> Yes. If I ping the server network IP address from the server itself, >>> it answers OK, but no answer from any other IP in the network. If I >>> ping the server from another machine there is no answer. >>>>> Are you using DHCP? >>> Nope. >>>> Can you ping the server from elsewhere on the network? >>> Nope >>>> What are your firewall settings? >>> No firewall. Isolated environment. >>>> What tool is connecting using ODBC? >>> Visual FoxPro. But, as said before, sqlplus bring the server >>> communications to its knees too. >>> I'll try the ipconfig (after repeating the whole process from startup >>> the server again). >>> Thank you very much. >>> Cheers. >>> Carlos. >> A few additional thoughts. >> >> 1. Upgrade the client to 10.2.0.3 >> 2. Post the SQL statement: I'd like to see it >> 3. Capture the statement and see if it contains any non-visible >> characters when Fox sends it through >> 4. What ODBC driver? Try changing drivers >> >> Is it only this one SQL statement or can more than one statement >> do it? What happens if you take the problem statement and >> reconstruct it from scratch in SQL*Plus starting with the >> simplest SELECT single_column FROM single_table and then >> incrementally add back its full functionality? When does it >> break? >> -- >> Daniel A. Morgan >> Oracle Ace Director & Instructor >> University of Washington >> damor..._at_x.washington.edu (replace x with u to respond) >> Puget Sound Oracle Users Groupwww.psoug.org
>
> Daniel:
>
> I am doing the test from sqlplus (no ODBC this time)
>
>> 2. Post the SQL statement: I'd like to see it
>
> select /*+ full(a) */
> a.col1,
> a.col2
> from schema.table_name a;
>
> It starts displaying/spooling and then breaks with:
>
> ...
> 955800207 T
> ERROR:
> ORA-03135: la conexión ha perdido contacto
>
> 10078290 filas seleccionadas.
>
> ERROR:
> ORA-03114: no conectado a ORACLE
>
> (There are 11339771 rows in the table).
>
> From here on, there is no ping from or to the server.
>
> ifconfig on the server shows:
>
> RX packets:674109 errors:0 dropped:0 overruns:0 frame:0
> TX packets:674109 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:53271778 (50.8 MiB) TX bytes:259126296 (247.1 MiB)
> Interrupt:169
>
> And it's time to reboot again...
>
> Cheers.
>
> Carlos.
Aha:
The title is: Subject: ORA-03113 Or ORA-03135 Reported for Failover with Network Outage
The doc id is: Note:455771.1
The date is: 10-MAR-2008
That's today.
The bug number is:
Bug 5758870 Select Type TAF fails with ORA-3113 If Network Cable Pulled off
The solution is:
Problem is fixed from 11g onwards and in 10.2.0.4 Patch set release.
One off patchs for some platforms are available.
Apply one-off Patch 5758870 for your platform, if available.
Are you on a RAC cluster?
-- Daniel A. Morgan Oracle Ace Director & Instructor University of Washington damorgan_at_x.washington.edu (replace x with u to respond) Puget Sound Oracle Users Group www.psoug.orgReceived on Mon Mar 10 2008 - 12:27:45 CDT