Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 


Re: Migrate 200 Million of rows

From: Joel Garry <joel-garry_at_home.com>
Date: 29 Jul 2003 15:23:34 -0700
Message-ID: <91884734.0307291423.25875581@posting.google.com>


rgaffuri_at_cox.net (Ryan Gaffuri) wrote in message news:<1efdad5b.0307291000.5d987842_at_posting.google.com>...
> vslabs_at_onwe.co.za (Billy Verreynne) wrote in message news:<1a75df45.0307282103.1b5c53cf_at_posting.google.com>...
> > rgaffuri_at_cox.net (Ryan Gaffuri) wrote
> >
> > > which compression program are you using? TAR? I was told it's slow.
> >
> > TAR does not do compression - except for the Linux version that has
> > Lempel-Ziv filtering built in (you use that with the -z/-Z switches
> > for compress/gzip).
> >
> > The compression program (standard on all Unix flavours AFAIK) is
> > called compress. And as I said, it uses an adaptive Lempel-Ziv coding
> > scheme.
>
> I don't know anything about compression algorithms. I've always wanted
> to get a high-level explanation of how they actually 'compress'
> something. I'm assuming it's similar to how voice is amplified when you
> speak into the phone: samples are taken and applied from there.
>
> or am I wrong?

That would be "lossy compression," which is suitable for some video-type applications, but not for your data!
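To make the distinction concrete, here is a toy sketch (my illustration, not anything from the thread) of why lossy compression is fine for sampled voice but fatal for database rows: quantizing samples to a coarser grid saves space but throws away information permanently.

```python
def lossy_quantize(samples, step=10):
    """Lossy 'compression': keep each sample only to the nearest
    multiple of step. The exact originals cannot be recovered."""
    return [round(s / step) for s in samples]

def dequantize(codes, step=10):
    """Best-effort reconstruction; values are only approximate."""
    return [c * step for c in codes]

samples = [3, 14, 15, 92, 65, 35]
restored = dequantize(lossy_quantize(samples))
# restored approximates the input but is not identical --
# acceptable for voice, catastrophic for table data.
```

A lossless scheme like compress, by contrast, must reproduce every byte exactly on decompression.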

From the unix man page for compress:

      compress uses the modified Lempel-Ziv algorithm popularized in "A
      Technique for High Performance Data Compression", Terry A. Welch,
      IEEE Computer, vol. 17, no. 6 (June 1984), pages 8-19.  Common
      substrings in the file are first replaced by 9-bit codes 257 and
      up.  When code 512 is reached, the algorithm switches to 10-bit
      codes and continues to use more bits until the limit specified by
      the -b flag is reached (default 16).

      After the maxbits limit is attained, compress periodically checks
      the compression ratio.  If it is increasing, compress continues
      to use the existing code dictionary.  However, if the compression
      ratio is decreasing, compress discards the table of substrings
      and rebuilds it from scratch.  This allows the algorithm to adapt
      to the next "block" of the file.
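The behavior the man page describes can be sketched in a few lines. This is a simplified LZW coder (my illustration, not the actual compress source): it seeds a dictionary with the 256 single-byte codes and assigns a new code to each unseen substring, which is why common substrings shrink to a single code each. Real compress additionally packs the codes into 9- to 16-bit fields and clears the table when the ratio worsens, which this toy skips.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Toy LZW: emit integer codes for the longest substring already
    in the table, then extend the table by one entry."""
    table = {bytes([i]): i for i in range(256)}
    next_code = 257  # compress reserves code 256 as its clear code
    w = b""
    out = []
    for byte in data:
        wc = w + bytes([byte])
        if wc in table:
            w = wc  # keep growing the current match
        else:
            out.append(table[w])      # emit code for longest match
            table[wc] = next_code     # remember the new substring
            next_code += 1
            w = bytes([byte])
    if w:
        out.append(table[w])
    return out
```

On repetitive input the output shrinks quickly: `lzw_compress(b"ababababab")` emits six codes for ten bytes, and the savings improve as the dictionary fills with longer substrings.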

I've commonly seen 10-to-1 compression ratios for database files, since they are so full of... whatever.
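That kind of ratio is easy to reproduce on repetitive input. A quick check with Python's zlib (DEFLATE -- a different Lempel-Ziv-family algorithm than compress, but the same idea; the fake fixed-width "export" row below is my own invention):

```python
import zlib

# Fake database-export content: repeated padding and values,
# as fixed-width dump files tend to be.
row = b"000042  SMITH     SALES     00001000.00" + b" " * 40 + b"\n"
data = row * 5000

packed = zlib.compress(data, level=9)
ratio = len(data) / len(packed)
assert zlib.decompress(packed) == data  # lossless: exact roundtrip
print(f"compression ratio: {ratio:.0f}:1")
```

Input this regular compresses far better than 10-to-1; real database files with more varied column values land lower, but the repeated padding and key prefixes are exactly what the dictionary feeds on.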

jg

--
@home.com is bogus.
http://www.unisys.com/about__unisys/lzw
Received on Tue Jul 29 2003 - 17:23:34 CDT

