RE: Effective compression ratio with minimal performance overhead
Date: Tue, 24 May 2022 08:53:29 -0400
Message-ID: <067201d86f6d$44ba0630$ce2e1290$_at_rsiz.com>
… AND … if you go “too few,” the performance of some query that someone is counting on being fast will let you know pretty damn fast, while if you go “too many,” it requires analysis of unused or rarely used indexes to discover the useless ones.
On another note, do NOT discount the compression possible from physically ordering your data. Unless you have the rare circumstance of anti-correlation of index clustering factor for two important indexes, this can make everything a lot smaller.
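For example, a rough, untested sketch (MY_TAB, P_2022_04, and the ORDER BY columns are placeholders; copying one partition with CTAS is just one way to get the rows physically ordered before compressing):

-- Rebuild one partition's data in index-friendly order, then compare segment sizes.
CREATE TABLE my_tab_ordered
  COMPRESS FOR QUERY LOW
  PARALLEL 16
AS
SELECT *
FROM   my_tab PARTITION (p_2022_04)
ORDER  BY cust_id, txn_date;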
Give it a try on a representative partition and let us know how you make out.
Good luck,
mwf
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Andy Sayer
Sent: Monday, May 23, 2022 11:26 AM
To: Lok P
Cc: Jonathan Lewis; Mladen Gogala; Oracle L; tim.evdbt_at_gmail.com
Subject: Re: Effective compression ratio with minimal performance overhead
Oracle will tell you what it thinks the perfect number of keys to include is if you use analyze index. Check the index_stats view after (run your analyze commands in your dev environment if your data is representative there).
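Something along these lines (untested sketch; IDX1 is a placeholder, and note that the offline VALIDATE STRUCTURE locks the table while it runs, whereas the ONLINE variant does not populate INDEX_STATS):

-- Populate INDEX_STATS for one index (the view holds one row per session,
-- for the last index validated).
ANALYZE INDEX idx1 VALIDATE STRUCTURE;

-- OPT_CMPR_COUNT   = suggested number of prefix columns to compress,
-- OPT_CMPR_PCTSAVE = estimated % of leaf-block space saved at that setting.
SELECT name, opt_cmpr_count, opt_cmpr_pctsave
FROM   index_stats;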
If you’re going to guess, best to go too few than too many.
Thanks,
Andy
On Mon, 23 May 2022 at 16:17, Lok P <loknath.73_at_gmail.com> wrote:
Thank You Andy.
The top consumers were tables though, but I will definitely see how much space the index compression gives back.
Regarding the index compression, I see the "compress advanced high" and "compress advanced low" options are not available until 12.2, and we have this database on version 12.1. So is it okay to just grab the non-unique indexes (say the top ~10 partitioned/non-partitioned indexes) by size descending, and then run simple compression statements like the one below, without any prefix length, and let Oracle compress on all the keys? OR do we really have to find the exact prefix columns on which to compress, or else take some performance penalty?
Alter index IDX1 REBUILD TABLESPACE TBS_IDX COMPRESS <LOCAL> PARALLEL 16;
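I would pick the candidates with a rough dictionary query like this (just a sketch; APP_OWNER is a placeholder for our schema):

-- Largest non-unique index segments (including index partitions), as candidates.
SELECT   s.owner, s.segment_name,
         ROUND(SUM(s.bytes)/1024/1024/1024, 1) AS gb
FROM     dba_segments s
JOIN     dba_indexes i
  ON     i.owner = s.owner
 AND     i.index_name = s.segment_name
WHERE    s.segment_type IN ('INDEX', 'INDEX PARTITION', 'INDEX SUBPARTITION')
AND      i.uniqueness = 'NONUNIQUE'
AND      s.owner = 'APP_OWNER'
GROUP BY s.owner, s.segment_name
ORDER BY gb DESC
FETCH FIRST 10 ROWS ONLY;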
On Mon, 23 May 2022, 7:34 pm Andy Sayer, <andysayer_at_gmail.com> wrote:
Not to distract from all the complexities of table compression, but have you tested how much space you can save with standard prefix index compression? You might find it both reduces your storage requirements and improves your performance across the board.
Thanks,
Andrew
On Mon, 23 May 2022 at 14:34, Lok P <loknath.73_at_gmail.com> wrote:
Thank you all. The DML (mainly INSERT) that I tested against the HCC-compressed (compress for query low) table performed at the same speed as against the non-compressed table, because it was not compressing the newly inserted data; the HCC compression only applied to the existing data in the table. Advanced compression is the one that really compresses the future/incoming DML/data, but it gives a significant (~100%) performance overhead. So I was thinking it might be okay to just apply the HCC compression to the existing data and let the incoming data remain uncompressed, but as Mohamed highlighted, if the read queries won't be doing smart scan after HCC is applied to the existing table, that is going to be an issue.
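What I had in mind is roughly the following (just a sketch; the table, partition, and index names are made up, and the moved partition's local index partitions go UNUSABLE until rebuilt):

-- Compress the data that is already there, partition by partition;
-- later conventional inserts stay uncompressed.
ALTER TABLE tab1 MOVE PARTITION p1 COMPRESS FOR QUERY LOW PARALLEL 16;

-- Rebuild the index partitions marked UNUSABLE by the move.
ALTER INDEX idx1 REBUILD PARTITION p1 PARALLEL 16;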
My understanding was that, considering this is an OLTP database and we do conventional DML most of the time, Advanced compression is, per the docs, the best fit even though it gives us the lowest compression ratio. However, when I tested the DML performance overhead on the Advanced-compressed table, the execution time doubled, which is a little concerning. I have not tested the performance of a full scan on the compressed table, but index access seems to perform at the same speed as on the non-compressed table.
Regarding the HCC compression, the doc says 'compress for query low' is the one that provides the smallest compression ratio / smallest storage saving, but is the best compromise between space reduction and DML/SELECT query performance in an OLTP environment. But as you said, it won't perform smart scan after “compress for query low”; that is concerning and I will have to test it out.
I ran the select query which you posted, and it gives "COMP_FOR_QUERY_LOW" as output for the compressed table, which is expected.
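For anyone following the thread, that kind of check is something along these lines (my sketch using DBMS_COMPRESSION.GET_COMPRESSION_TYPE, not necessarily the exact query posted earlier; TAB1 is a placeholder):

-- Classify a sample row's compression type with DBMS_COMPRESSION.
DECLARE
  l_rid  ROWID;
  l_type NUMBER;
BEGIN
  SELECT rowid INTO l_rid FROM tab1 WHERE ROWNUM = 1;
  l_type := DBMS_COMPRESSION.GET_COMPRESSION_TYPE(USER, 'TAB1', l_rid);
  IF l_type = DBMS_COMPRESSION.COMP_FOR_QUERY_LOW THEN
    DBMS_OUTPUT.PUT_LINE('COMP_FOR_QUERY_LOW');
  ELSE
    DBMS_OUTPUT.PUT_LINE('Compression type code: ' || l_type);
  END IF;
END;
/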
We do not have any tables holding CLOB columns though. I also tried running the index fragmentation script from Jonathan's blog, and I see it will give us ~200-300GB of storage space back. But I am looking for more space reduction.
Regards
Lok
On Mon, 23 May 2022, 6:12 pm Tim Gorman, <tim.evdbt_at_gmail.com> wrote:
Mladen,
That is unlikely, as HCC compression (and TDE, for that matter) is performed immediately prior to direct-path INSERTs, so that completed blocks (including compression and/or encryption) are written to the underlying storage layer.
It would be interesting to see if Exadata somehow changes that?
Thanks!
-Tim
On 5/23/2022 5:05 AM, Mladen Gogala wrote:
On 5/23/22 07:43, Jonathan Lewis wrote:
_at_Mladen Gogala <mailto:gogala.mladen_at_gmail.com>
When did that change? I haven't been watching closely, but the compression used to be done at the compute node, and HCC decompression at the storage cells, through software, except for the cases where the storage cell sent the CU to the compute node either to balance the CPU usage or because the extract from a smart scan was too large for the message buffer.
Regards
Jonathan Lewis
Hi Jonathan,
I think that I've read in some white paper that HCC is done on storage nodes, transparently to the compute nodes, as of X7. I'll look for the paper and let you know.
Regards
--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com
--
http://www.freelists.org/webpage/oracle-l
Received on Tue May 24 2022 - 14:53:29 CEST