Re: Info on large tables wanted

From: Michael Austin <maustin_at_firstdbasource.com>
Date: Mon, 07 Jun 2004 13:51:46 GMT
Message-ID: <Sp_wc.4938$Uh6.4559@newssvr22.news.prodigy.com>

Romeo Olympia wrote:

> 36 million records (or even 150 million) isn't really that big a
> number. The structure even suggests that your average row length is
> quite well.. average. Of course it would help if we know your hardware
> capacity beforehand (number of CPUs, disk configuration, and memory
> which you already provided).
> 
> Your system is probably not a dedicated data warehouse but look up the
> Data Warehousing Guide for Oracle. The link below is for 9i; find the
> one for your specific version (hopefully you have at least 8i).
> 
> Oracle9i Data Warehousing Guide
> Release 2 (9.2)
> Part Number A96520-01
> http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96520/toc.htm
> 
> This document will introduce you to the different methods of handling
> "large" data in Oracle. Things to pay attention to (not exhaustive):
> - Partitioning
> - Parallelization
> - Bitmap indexes (if applicable)
> - etc.
> 
> Hope that helps.
> 
> Romeo
> 
> "Markus Vohburger" <markus.vohburger_at_t-online.de> wrote in message news:<c9vumj$3e9$05$1_at_news.t-online.com>...
>

>>Hello out there!
>>
>>
>>I am planning a new reporting application for a large chain of
>>bakeries. I have already built applications like this, but not with this
>>amount of data.
>>
>>
>>Roughly, the bakery chain has 120 shops.
>>Each shop has about 500 customers per day.
>>Each customer buys 2 different items on average. There are about 100
>>different items.
>>
>>so this makes
>>120*500*2 = 120000 Items bought per day
>>120000*6 = 720000 per week
>>720000*50 = 36.000.000 per year with 300 business days
>>
>>Raw data is delivered in the following Format
>>
>>Shop-ID Integer
>>Item-Id Integer
>>Item Amount Float
>>Item Price Float
>>Timestamp DateTime
>>
>>
>>I need to build a database where each Item is stored with the above Data, so
>>they can make Reports on when particular Items sell best eg.
>>
>>Is there a problem with tables that contain 36.000.000 records? They may
>>want to accumulate 5 years of sales data, so this would make over 150.000.000
>>records.
>>
>>I have done several tests with about 10.000.000 Dummy Records on my test
>>machine.
>>AMD2000, 1GB RAM, index tablespace on a separate hard disk and the like, the
>>response time was quite satisfactory, but will it be with 20 times the data?
>>
>>Has anybody experience with such large Tables?
HAHAHA.. large tables... HAHAAHAHA... (sorry, it's too early...)

I agree with Romeo, 36M records/year is nothing... 350M/day at ~36GB/day is only somewhat medium sized!! 150GB/day is approaching big. If my byte count is close, you are only looking at approximately 4.6GB/year.
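
For anyone who wants to redo the arithmetic, here is a rough back-of-envelope sketch in Python. The shop, customer and item counts come from the original post. The 128 bytes per row is purely an assumption, picked only because it lands near the 4.6GB/year figure above; it is not a measured Oracle row size, so substitute your own number.

```python
# Back-of-envelope volume estimate using the figures quoted in this thread.
SHOPS = 120
CUSTOMERS_PER_SHOP_PER_DAY = 500
ITEMS_PER_CUSTOMER = 2
BUSINESS_DAYS_PER_YEAR = 300  # 6 days/week * 50 weeks, as in the original post

# ASSUMPTION: average stored bytes per row, including overhead.
# Chosen only to roughly match the ~4.6GB/year estimate; not measured.
ASSUMED_BYTES_PER_ROW = 128

rows_per_day = SHOPS * CUSTOMERS_PER_SHOP_PER_DAY * ITEMS_PER_CUSTOMER
rows_per_year = rows_per_day * BUSINESS_DAYS_PER_YEAR
rows_five_years = rows_per_year * 5


def estimate_gb(rows, bytes_per_row=ASSUMED_BYTES_PER_ROW):
    """Return the raw data volume in decimal gigabytes for a row count."""
    return rows * bytes_per_row / 1e9


if __name__ == "__main__":
    print("rows/day:    ", rows_per_day)
    print("rows/year:   ", rows_per_year)
    print("GB/year:     ", estimate_gb(rows_per_year))
    print("GB/5 years:  ", estimate_gb(rows_five_years))
```

Indexes come on top of this, so treat the result as a lower bound for disk sizing.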

>>How big are your Databases (number of Records)
I had one database at 91.2B records/year with very small record sizes, totalling ~13.4TB/year in data storage (plus index and reference tables).

>>What machines do you use?

Certainly not a PC. But in your case, it may be sufficient, provided you have a couple of mirrored 36GB drives for data and index. I would definitely use some sort of SCSI controller-based RAID for redundancy. I would consider some UNIX or UNIX derivative (i.e. Linux). Or, if you want to use a REAL system, Alpha/OpenVMS. :) I just couldn't resist.
>>
>>Any comments welcome!

The key to extracting data will be how you "mine" it: by date, by shop ID, and so on. Make sure there is an appropriate index for each of those access paths. As Romeo also stated, for reporting performance on a per-shop basis, consider partitioning (data and index). This has the side benefit of potentially improving loads by allowing multiple jobs to apply data concurrently. With your 120 shops, depending on the run time per shop, this could be a significant time savings, especially if you have more than one CPU.
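
To illustrate the concurrent-load idea only, here is a minimal Python sketch. It is not Oracle code: the row layout follows the raw format from the original post, `load_partition` is a hypothetical stand-in for whatever load job you actually run against one shop's partition, and the worker count is an arbitrary assumption.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Row layout assumed from the original post:
# (shop_id, item_id, amount, price, timestamp)


def partition_by_shop(rows):
    """Group raw rows by shop_id, the assumed partition key."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[0]].append(row)
    return partitions


def load_partition(shop_id, rows):
    """Hypothetical load job for one shop.

    A real job would insert the rows into that shop's partition;
    here it only counts rows and sums amount * price.
    """
    total = sum(amount * price for _, _, amount, price, _ in rows)
    return shop_id, len(rows), total


def load_concurrently(rows, max_workers=4):
    """Run one load job per shop partition on a pool of workers."""
    partitions = partition_by_shop(rows)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(load_partition, shop_id, part)
            for shop_id, part in partitions.items()
        ]
        results = [f.result() for f in futures]
    return {shop_id: (count, total) for shop_id, count, total in results}


if __name__ == "__main__":
    sample = [
        (1, 10, 2.0, 1.5, "2004-06-07 08:00:00"),
        (1, 11, 1.0, 2.0, "2004-06-07 08:05:00"),
        (2, 10, 3.0, 1.5, "2004-06-07 08:10:00"),
    ]
    print(load_concurrently(sample))
```

Because each job touches only its own shop's rows, the jobs do not contend for the same partition, which is the property that makes running them side by side worthwhile.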

With volumes this low, you shouldn't have too many performance issues.

>>
>>regards
>>MV

Michael Austin.

Received on Mon Jun 07 2004 - 08:51:46 CDT