Re: Horizontally scaling a database

From: John Darrah <darrah.john_at_gmail.com>
Date: Fri, 16 Mar 2007 13:31:27 -0600
Message-ID: <ec40ac060703161231q11b351cckc5278940559efe0e@mail.gmail.com>

Seems like you would need to logically partition your data onto separate servers using whatever on back end (mysql, BerkleyDB, etc). You would then need application servers sitting in front that have an index into where the data is physically stored based on the partition key. These application servers would act as routers and aggregaters sending the actual queries to appropriate servers and aggregating the results back to the requester. I don't know of anything that would do this out of the box but it can be done. If there wasn't a good key that could consistently reduce the number of back end servers to a small subset of the total then you would forget about the index routing strategy all together and just broadcast to all back end servers every time. How long would you need to retain the data? A much simpler strategy to implement would be to keep the data on one database and partition it you can put a LOT of rows into a partitioned table in oracle, several years worth, even at your requirements. I realize you are looking at shared nothing but it'll cost you to implement the solution I describe above. You will also need exceptional developers and architects that have implemented systems like this before.

On 3/15/07, Sandeep Dubey <dubey.sandeep_at_gmail.com> wrote:
>
> Hi all,
>
> I am looking for ideas how to scale database horizontally as opposed
> to adding more CPUs, memory and disks on a single machine.
>
> Here are the main requirements:
>
> We need to store apporx 300 GB of data a month. It will be OLTP system.
> We want to use commodity hardware and open source database. we are
> willing to sacrifice performance for cost. E.g. a single row search
> from 2 billion rows table should be returned in 2 sec.
> Start with one server, when it reaches its capacity add one more
> server and so on.
>
> Oracle RAC doesn't fit as it is single storage system. MySQL cluster
> database replicates data to all its cluster- we don't want to
> replicate the data.
>
> I am trying to emulate google type file system with RDBMS capabilities
> something like a database farm where each database work independently.
> A query is submitted to all databases in the farm, matching data found
> and gathered at one place and returned as single output.
>
>
> Any ideas?
>
> Thanks
>
> Sandeep
> --
> http://www.freelists.org/webpage/oracle-l
>
>
>

--
http://www.freelists.org/webpage/oracle-l

Received on Fri Mar 16 2007 - 14:31:27 CDT