Data Integrator [message #648544] |
Thu, 25 February 2016 02:19 |
|
deceneu
Messages: 4 Registered: February 2016
|
Junior Member |
|
|
Hi,
on project, that our company should start, raised a discussion regarding solution to use, for the following problem:
- we have a big database (1T/month - window) that scale very good on real-time data collecting project,
- was requested to build some reports, based on this database.
To meet this requirement was decided to replicate data from primary database into another one, because of performance reasons.
For this requirement I'm asking what is the best way to do it?
Our way was to use:
1. Data Gard to replicate primary database, but this solution is not applicably, because secondary database should be fully functional in read/write mode.
2. Data Pump partial, what is not applicable because, offline time for database is not accepted.
3. CDC - not applicable because of many problems that we have in the past and also is deprecated.
4. last one, to develop custom scripts for data transfers, that is not accepted from maintenance costs.
Please Help with best solution !
|
|
|
|
|
|
|
Re: Data Integrator [message #648570 is a reply to message #648568] |
Thu, 25 February 2016 09:59 |
|
Michel Cadot
Messages: 68718 Registered: March 2007 Location: Saint-Maur, France, https...
|
Senior Member Account Moderator |
|
|
It seems logical standby is what you need.
It is also possible to use cascade replication:
(1) OLTP -> (2) Data warehouse -> (3) Data mart (aggregation)
Between (2) and (3) you can use materialized views.
The architecture depends on your current (detailed) and future needs... and on the money you can have.
As I said, is the standby a 24/24, 7/7? Does it need to have the current data or a delay is possible? Often the user needs are only to have in the morning the reports of the end of the day before.
[Updated on: Thu, 25 February 2016 10:00] Report message to a moderator
|
|
|
|
Re: Data Integrator [message #648597 is a reply to message #648570] |
Fri, 26 February 2016 02:33 |
|
deceneu
Messages: 4 Registered: February 2016
|
Junior Member |
|
|
Michel Cadot wrote on Thu, 25 February 2016 09:59
It seems logical standby is what you need.
It is also possible to use cascade replication:
(1) OLTP -> (2) Data warehouse -> (3) Data mart (aggregation)
Between (2) and (3) you can use materialized views.
The architecture depends on your current (detailed) and future needs... and on the money you can have.
As I said, is the standby a 24/24, 7/7? Does it need to have the current data or a delay is possible? Often the user needs are only to have in the morning the reports of the end of the day before.
Michel you so right,
that I want to scream that way that all, my company and the customer will hear this.
All normal people will do like this, but not our customer.
He wants to see real-time info on active system, some kind of real-time Data warehouse.
I'll try to explain here in more details.
The main reason for this project to start is to be able provide some information of the processing data from active system.
If it is not able to respond with a delay of maximum 10 min, there will be no reason for the project to exists.
In other words, primary database (Oracle 11gR2) will provide data, secondary database should replicate data incrementally during the all day. We can negotiate 1/2 hour maximum gap.
The main database process medium 3 000 of transactions per second. Replicated data will be around 1G per hour, in the local fiber-link network. Network overhead is not our regarding. The stand-by time for primary is 24/24 7/7 and for secondary 20/24 7/7.
It was a hard 5 years work, of tuning at all levels, hardware and software. We can not think to introduce some overhead for queering or aggregations on the main database, not even triggering some bath processing queries for materialized views.
We are using RAC for primary database and DataGuard for backup of RAC.
The idea of DataGuard seems magical at the beginning, but has to many constraints.
RAC node - was also considered, but not applied because of reasons below.
Regarding costs, it should be fixed and known from the beginning.
Normally secondary database will suffer a lot of changes on the table structures, and also on tables rows data, and we want to keep options open for the futures.
Please give more infos about use of cascade replication:
(1-primary db) OLTP -> (2-secondary db) Data warehouse -> (3-secondary db) Data mart (aggregation)
Between (2) and (3) you can use materialized views.
Does I understood well ?
|
|
|