Re: ORDS Restart requirement

From: kris rice <kris.rice_at_jokr.net>
Date: Tue, 19 Oct 2021 10:43:16 -0400
Message-ID: <CAPVZWiM8+NwOw-uUe_g6eXB9QXJq2HgX=totY7N7bgBOpZ6G-w_at_mail.gmail.com>



Ruan,
  There was an issue a while back where, if a database was not available at startup, it was flagged and never revisited. That was changed, but I cannot recall in which version. There were basically two issues that got addressed: 1) what you are seeing, where a database that is down at startup is listed as bad, and 2) connecting to every defined database at startup. What version are you using?

-kris

On Mon, Oct 18, 2021 at 7:24 PM Ruan Linehan <ruandav_at_gmail.com> wrote:

> Hi Kris/Tim,
>
> After quite a few scenario-based attempts at "breaking" the
> connectivity between ORDS and the DB endpoints, I think I have observed
> what is really going on. Trying to recreate my originally suggested symptom
> by creating a service availability disconnect, even for long periods of
> time, was unsuccessful, insofar as ORDS always successfully resumed
> work once the DB connection was re-established (which is good). So
> apologies for misrepresenting the issue in my original mail.
>
> I investigated some of the older historical APEX logs. What is more likely
> happening in our situation is that restarts of the ORDS services (which
> we do frequently to introduce new endpoint configs) on the load-balanced
> VMs sometimes coincide with the periodic maintenance windows of
> individual pluggable databases.
>
> Therefore, sometimes it will happen that for 1 in 100 endpoints, there
> will be an ORDS complaint with respect to the initial startup validation of
> the pool config. e.g. "WARNING: The pool named: |apex|pu| is missing and
> will be ignored: The database service named: |apex|pu| does not exist."
> These startup errors were not being properly trapped / flagged as part of
> our processes, so I can easily fix that.
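
A minimal sketch of trapping those startup warnings, assuming the ORDS startup output is captured as text and the message wording matches the warning quoted above (other ORDS versions may phrase it differently). The function name `ignored_pools` and the sample text are illustrative only and are not part of ORDS.

```python
import re

# Pattern based on the warning text quoted in this thread (assumption:
# the real log prints the message on a single line with this wording).
IGNORED_POOL = re.compile(r"WARNING: The pool named: (\S+) is missing and")


def ignored_pools(log_text):
    """Return the pool names reported as ignored at startup, in first-seen order."""
    seen = []
    for match in IGNORED_POOL.finditer(log_text):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen


# Illustrative sample only; the INFO line is made up for the example.
sample = (
    "INFO: some unrelated startup line\n"
    "WARNING: The pool named: |apex|pu| is missing and will be ignored: "
    "The database service named: |apex|pu| does not exist.\n"
)

print(ignored_pools(sample))
```

A check like this could run right after each ORDS restart, so that a non-empty result raises an alert instead of the pool silently staying ignored.
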
>
> Testing this scenario, with an ORDS startup whilst a PDB database
> service is down, does result in the connection never being established
> once the PDB service is eventually started. So I believe this is actually
> what is occurring for us: not a disconnect of the database from ORDS, but
> the initial startup verification which, if unsuccessful for a
> particular endpoint, means that the pool entry is literally "ignored" from
> then on.
>
> I assume this is the expected behaviour and, if so, is there any
> workaround beyond a restart of ORDS at that point?
>
> Kind regards,
> Ruan
>
> On Tue, Oct 12, 2021 at 9:58 PM Ruan Linehan <ruandav_at_gmail.com> wrote:
>
>> *"and it never requires a restart. The pools should reestablish
>> themselves as needed"*
>>
>> Thanks Tim and Kris for taking the time to reply.
>>
>> Well, that is quite puzzling to me that our 'broken' connection issue
>> seems unexpected to you both; but it also makes me hopeful that this is
>> some mis-config or error in implementation on our part. I'm certainly not
>> proficient in configuring ORDS as I'm usually investigating from the other
>> side of the (database) fence. I'm intrigued now, as you mentioned Tim, that
>> maybe we have some sequence of events or triggers leading to our particular
>> issue. Yes, we have multiple ORDS installs on separate VMs behind a
>> load-balancer.
>>
>> I've just "broken" a non-production environment ORDS web services pool
>> connection in the last hour, by stopping the PDB services and killing
>> existing ORDS sessions to test. I'm going to leave this down now for a few
>> different periods of hours to see what happens and will reply back here
>> with some specifics.
>>
>> Kind regards
>> Ruan
>>
>> On Tue, Oct 12, 2021 at 1:57 PM kris rice <kris.rice_at_jokr.net> wrote:
>>
>>> I'd suggest, as always, upgrade. We have ords nodes on some databases
>>> that go up/down, active/read-only, ... and it never requires a restart. The
>>> pools should reestablish themselves as needed. For example, I have to
>>> manage all the ords nodes in Autonomous DB and those things come and go on
>>> the whim of customers kicking tires or shutting down to save money when not
>>> in use. These ords nodes run somewhere around 2k connection pools each and
>>> never need a restart, even when the pdb is relocated out from under us to
>>> another CDB.
>>>
>>> Happy to jump on a zoom or medium of choice and chat more if you'd like.
>>>
>>> -kris
>>>
>>> On Tue, Oct 12, 2021 at 7:43 AM Tim Hall <tim_at_oracle-base.com> wrote:
>>>
>>>> Hi.
>>>>
>>>> That's interesting. I can't ever remember having to restart ORDS as a
>>>> result of a database outage. Even prolonged ones. We install in the PDB,
>>>> not the CDB and have one or more ORDS instances in Docker containers for
>>>> each PDB. As a result, a problem with one instance doesn't affect
>>>> everything else. Even so, these are basic connection issues you are having,
>>>> so I don't think the topology differences can be that relevant. I think
>>>> this may be a job for the ORDS team. They could certainly tell you what the
>>>> expected behaviour is.
>>>>
>>>> Do you have a non-prod/test setup where you can test some failure
>>>> scenarios? I wonder if there are specific patterns that cause the issue,
>>>> rather than a general overarching issue.
>>>>
>>>> I guess in the interim I would consider a mitigation. I assume you have
>>>> multiple ORDS installations behind a load balancer to support this. If so,
>>>> you could script a restart of all ORDS instances (one at a time of course),
>>>> and call that at the end of every piece of scheduled maintenance. It would
>>>> minimise the apparent outage.
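
A rough sketch of the one-at-a-time restart suggested above. The `restart` and `is_healthy` callables are hypothetical placeholders for whatever your environment uses (for example a service manager command and a load balancer health check); nothing here is an ORDS API, and the node names are made up.

```python
import time


def rolling_restart(nodes, restart, is_healthy,
                    timeout_s=300, poll_s=5, sleep=time.sleep):
    """Restart nodes one at a time, waiting for each to report healthy.

    Stops with RuntimeError at the first node that does not recover within
    timeout_s, so the remaining nodes keep serving traffic.
    """
    done = []
    for node in nodes:
        restart(node)
        waited = 0
        while not is_healthy(node):
            if waited >= timeout_s:
                raise RuntimeError(
                    f"{node} did not become healthy; restarted so far: {done}"
                )
            sleep(poll_s)
            waited += poll_s
        done.append(node)
    return done


# Demo with stand-in callables (hypothetical node names, no real restarts).
restarted = []
result = rolling_restart(
    ["ords-vm1", "ords-vm2"],
    restart=restarted.append,
    is_healthy=lambda node: True,
    sleep=lambda seconds: None,
)
print(result)
```

Stopping at the first unhealthy node is a deliberate choice: it avoids taking further instances out of the load balancer pool while one is already down.
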
>>>>
>>>> I'll see if I can get someone from the ORDS team to look at this thread.
>>>>
>>>> Cheers
>>>>
>>>> Tim...
>>>>
>>>> On Tue, Oct 12, 2021 at 9:22 AM Ruan Linehan <ruandav_at_gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I've researched elsewhere but not been able to identify a suitable
>>>>> solution, so I'm asking here in the hopes that an ORDS aficionado might
>>>>> provide some direction.
>>>>>
>>>>> My issue is around the perception of a restart of ORDS being a
>>>>> requirement for re-establishing a connection to an endpoint which may have
>>>>> been unavailable for a period of time.
>>>>>
>>>>> We run ORDS v20 on a Linux VM as part of a solution accompanying an
>>>>> Exadata multitenant environment. ORDS is made available to all PDBs,
>>>>> installed in the CDB. Within the 'conf' directory of ORDS - we stage all
>>>>> the associated apex_aa.xml, apex_ab.xml, apex_ac.xml etc configuration
>>>>> mapping files. Periodically, one of the PDB environments may be made
>>>>> unavailable (e.g. closed, or else RAC services stopped, or someone
>>>>> inadvertently locks the ORDS_PUBLIC_USER account, etc.) for maintenance, for
>>>>> a day or a weekend. When this takes place, the pluggable's
>>>>> ORDS_PUBLIC_USER database sessions are terminated and the ORDS connection
>>>>> cannot be re-established for a period of time to that PDB. So far so good.
>>>>>
>>>>> Once the maintenance is complete, and the PDB is re-opened once again,
>>>>> RAC services restarted, ORDS does not automatically re-establish a database
>>>>> connection to that same PDB.
>>>>>
>>>>> If I need to get the ORDS_PUBLIC_USER connections re-established once
>>>>> more for that specific PDB, then I need to stop ORDS processes for all
>>>>> clients and restart.
>>>>> That is, the restart re-reads the URL mapping XML, validates the associated
>>>>> apex_aa.xml files etc., and eventually re-establishes ALL the
>>>>> database connections, and all is good.
>>>>>
>>>>> The difficulty is though, that we have literally hundreds of these
>>>>> PDBs in a CDB, and literally hundreds of accompanying ORDS endpoints. So,
>>>>> if one of these environments is impacted by a "maintenance" of some sort,
>>>>> and the database connection is severed for a time, then it requires a full
>>>>> restart of ORDS for ALL to get it back, which is rather painful.
>>>>>
>>>>> There must be something I am missing, right? I understand XML config
>>>>> changes require a restart of ORDS to be picked up, but I find it troubling
>>>>> that a full restart is also required when just one client endpoint
>>>>> connection out of a hundred is impacted?
>>>>> Is there any way I can force ORDS to 'reinit' and re-read the conf
>>>>> files to re-establish a single broken connection without restarting?
>>>>>
>>>>> Kind regards,
>>>>> Ruan
>>>>>
>>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Oct 19 2021 - 16:43:16 CEST
