Re: Migrate 200 Million of rows
rgaffuri_at_cox.net (Ryan Gaffuri) wrote in message news:<1efdad5b.0307291000.5d987842_at_posting.google.com>...
> vslabs_at_onwe.co.za (Billy Verreynne) wrote in message news:<1a75df45.0307282103.1b5c53cf_at_posting.google.com>...
> > rgaffuri_at_cox.net (Ryan Gaffuri) wrote
> >
> > > which compression program are you using? TAR? I was told it's slow.
> >
> > TAR does not do compression - except for the GNU version shipped with
> > Linux, which can filter the archive through a compressor (you use
> > the -Z/-z switches for compress/gzip).
> >
> > The compression program (standard on all Unix flavours AFAIK) is
> > called compress. And as I said, it uses an adaptive Lempel-Ziv coding
> > scheme.
>
> i don't know anything about compression algorithms. i've always wanted
> to get a high level explanation of how they actually 'compress'
> something. I'm assuming it's similar to how voice is amplified when you
> speak into the phone. samples are taken and applied from there.
>
> or am I wrong?
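That would be "lossy" compression, which is suitable for some audio and video applications, but not for your data! Database files need lossless compression, where uncompressing gives back exactly the bytes you started with.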
From the unix man page for compress:
    compress uses the modified Lempel-Ziv algorithm popularized in
    "A Technique for High Performance Data Compression", Terry A. Welch,
    IEEE Computer, vol. 17, no. 6 (June 1984), pages 8-19. Common
    substrings in the file are first replaced by 9-bit codes 257 and up.
    When code 512 is reached, the algorithm switches to 10-bit codes and
    continues to use more bits until the limit specified by the -b flag
    is reached (default 16).

    After the maxbits limit is attained, compress periodically checks
    the compression ratio. If it is increasing, compress continues to
    use the existing code dictionary. However, if the compression ratio
    is decreasing, compress discards the table of substrings and
    rebuilds it from scratch. This allows the algorithm to adapt to the
    next "block" of the file.
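For a feel of what the man page describes, here is a rough Python sketch of the dictionary idea. It is not the actual compress source: the function names are made up for illustration, the bit count is only an estimate of the code widths (nothing is packed into bytes), and the "discard the table and rebuild" step is left out. It just stops adding substrings once the table is full.

```python
def lzw_compress(data: bytes, max_bits: int = 16):
    """Simplified LZW: return (codes, estimated_bits). Not the real compress format."""
    table = {bytes([i]): i for i in range(256)}  # every single byte starts in the table
    next_code = 257        # 256 is left unused here, mirroring "codes 257 and up"
    width = 9              # codes start out 9 bits wide
    limit = 1 << max_bits  # table is full once this many codes exist
    codes, total_bits = [], 0
    current = b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate          # keep extending the known substring
            continue
        codes.append(table[current])     # emit the code for the longest match
        total_bits += width
        if next_code < limit:
            table[candidate] = next_code  # remember the new, longer substring
            next_code += 1
            if next_code > (1 << width) and width < max_bits:
                width += 1               # e.g. 9 -> 10 bits once code 512 exists
        current = bytes([byte])
    if current:
        codes.append(table[current])
        total_bits += width
    return codes, total_bits


def lzw_decompress(codes, max_bits: int = 16) -> bytes:
    """Rebuild the same table while reading codes; returns the original bytes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 257
    limit = 1 << max_bits
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # the code refers to the substring that is being defined right now
            entry = previous + previous[:1]
        else:
            raise ValueError(f"unexpected code {code}")
        out.append(entry)
        if next_code < limit:
            table[next_code] = previous + entry[:1]
            next_code += 1
        previous = entry
    return b"".join(out)


if __name__ == "__main__":
    sample = b"SMITH     JONES     SMITH     JONES     " * 200
    codes, bits = lzw_compress(sample)
    assert lzw_decompress(codes) == sample  # lossless: exact bytes come back
    print(len(sample), "bytes in,", (bits + 7) // 8, "bytes out (estimated)")
```

Nothing is thrown away: repeated substrings are swapped for short codes, and the decompressor can rebuild the same table from the codes alone, which is why no dictionary has to be stored in the file.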
I've commonly seen 10-to-1 compression ratios for database files, since they are so full of... whatever.
jg
-- @home.com is bogus. http://www.unisys.com/about__unisys/lzw
Received on Tue Jul 29 2003 - 17:23:34 CDT