There are already a lot of blogposts and presentations done about Hybrid Columnar Compression and i am adding one more blogpost to that list. Recently i was doing some small tests one HCC and noticed that that inserts on a HCC row didn’t got compressed and yes i was using direct path loads:
DBA@TEST1> create table hcc_me (text1 varchar2(4000)) compress for archive high; Table created. KJJ@TEST1> insert /*+ append */ into hcc_me select dbms_random.string('x',100) from dual; 1 row created. KJJ@TEST1> commit; Commit complete. KJJ@TEST1> select rowid from hcc_me; ROWID ------------------ AAAWw/AAAAAACXzAAA DBA@TEST1> @compress Enter value for schemaname: kjj Enter value for tablename: hcc_me Enter value for rowid: AAAWw/AAAAAACXzAAA old 2: dbms_compression.get_compression_type(upper('&SchemaName'),upper('&TableName'),'&RowID'), new 2: dbms_compression.get_compression_type(upper('kjj'),upper('hcc_me'),'AAAWw/AAAAAACXzAAA'), COMPRESSION_TYPE --------------------- COMP_NOCOMPRESS
So my row did not got compressed, let’s insert a little bit more data into our test table:
declare sql_stmt varchar(200); begin for i in 1..1000 loop sql_stmt := 'insert /*+ append_values */ into hcc_me select dbms_random.string(''x'',100) from dual'; execute immediate sql_stmt; commit; end loop; end; /
And lets see what we end up with:
select count(*), compression_type from ( select decode(dbms_compression.get_compression_type('KJJ','HCC_ME',rowid), 1, 'COMP_NOCOMPRESS', 2, 'COMP_FOR_OLTP', 4, 'COMP_FOR_QUERY_HIGH', 8, 'COMP_FOR_QUERY_LOW', 16, 'COMP_FOR_ARCHIVE_HIGH', 32, 'COMP_FOR_ARCHIVE_LOW', 64, 'COMP_BLOCK', 1000000, 'COMP_RATIO_MINROWS', -1, 'COMP_RATIO_ALLROWS') "COMPRESSION_TYPE" from hcc_me) group by compression_type;
so none of my records got compressed:
COUNT(*) COMPRESSION_TYPE ---------- --------------------- 1000 COMP_NOCOMPRESS
Maybe it size dependent, the row i am inserting into this HCC table is extremely small. Lets re-create the table and make every row one byte bigger then the previous row:
declare sql_stmt varchar(200); v_random1 varchar2(4000); begin execute immediate 'drop table hcc_me'; execute immediate 'create table hcc_me (text1 varchar2(4000)) compress for archive high'; for i in 1..1000 loop v_random1 := dbms_random.string('x', i); sql_stmt := 'insert /*+ append_values */ into hcc_me values (:1)'; execute immediate sql_stmt using v_random1; commit; end loop; end;
This will give me a table that has row 1 being 1 byte big and the last row 1000 bytes big. re-run our select statement and see if we have HCC compressed rows now:
COUNT(*) COMPRESSION_TYPE ---------- --------------------- 697 COMP_FOR_ARCHIVE_HIGH 303 COMP_NOCOMPRESS 2 rows selected.
Victory! But now lets see where our records are starting to compress, lets adapt the query a bit:
select vsize(text1) row_bytes, decode(dbms_compression.get_compression_type('KJJ','HCC_ME',rowid), 1, 'COMP_NOCOMPRESS', 2, 'COMP_FOR_OLTP', 4, 'COMP_FOR_QUERY_HIGH', 8, 'COMP_FOR_QUERY_LOW', 16, 'COMP_FOR_ARCHIVE_HIGH', 32, 'COMP_FOR_ARCHIVE_LOW', 64, 'COMP_BLOCK', 1000000, 'COMP_RATIO_MINROWS', -1, 'COMP_RATIO_ALLROWS') COMPRESSION_TYPE from hcc_me; ROW_BYTES COMPRESSION_TYPE ---------- --------------------- 1 COMP_NOCOMPRESS 2 COMP_NOCOMPRESS 3 COMP_NOCOMPRESS 4 COMP_NOCOMPRESS 5 COMP_NOCOMPRESS <cut> 292 COMP_NOCOMPRESS 293 COMP_NOCOMPRESS 294 COMP_NOCOMPRESS 295 COMP_FOR_ARCHIVE_HIGH 296 COMP_NOCOMPRESS 297 COMP_NOCOMPRESS 298 COMP_NOCOMPRESS 299 COMP_NOCOMPRESS 300 COMP_FOR_ARCHIVE_HIGH 301 COMP_NOCOMPRESS 302 COMP_NOCOMPRESS 303 COMP_NOCOMPRESS 304 COMP_FOR_ARCHIVE_HIGH 305 COMP_FOR_ARCHIVE_HIGH 306 COMP_FOR_ARCHIVE_HIGH 307 COMP_FOR_ARCHIVE_HIGH 308 COMP_FOR_ARCHIVE_HIGH 309 COMP_FOR_ARCHIVE_HIGH 310 COMP_FOR_ARCHIVE_HIGH 311 COMP_FOR_ARCHIVE_HIGH 312 COMP_FOR_ARCHIVE_HIGH 313 COMP_FOR_ARCHIVE_HIGH 314 COMP_NOCOMPRESS 315 COMP_NOCOMPRESS 316 COMP_FOR_ARCHIVE_HIGH 317 COMP_FOR_ARCHIVE_HIGH 318 COMP_FOR_ARCHIVE_HIGH 319 COMP_FOR_ARCHIVE_HIGH <cut> 996 COMP_FOR_ARCHIVE_HIGH 997 COMP_FOR_ARCHIVE_HIGH 998 COMP_FOR_ARCHIVE_HIGH 999 COMP_FOR_ARCHIVE_HIGH 1000 COMP_FOR_ARCHIVE_HIGH 1000 rows selected.
Alright, this is unexpected. I ran this test multiple times and ended up with different results, some rows around the 300 bytes mark are getting compressed and some are not. So somewhere around the 300 bytes oracle decides randomly wetter or not to compress. Oh wait, randomly… i am using DBMS_RANDOM to fill my 1 column with data, so lets take the random factor out of the equation and fill our rows fit on fixed character:
declare sql_stmt varchar(200); begin execute immediate 'create table hcc_me (text1 varchar2(4000)) compress for archive high'; for i in 1..1000 loop sql_stmt := 'insert /*+ append_values */ into hcc_me select lpad( ''x'','||i||',''x'') from dual'; execute immediate sql_stmt; commit; end loop; end;
Now we end up with a much effective compression:
ROW_BYTES COMPRESSION_TYPE ---------- --------------------- 1 COMP_NOCOMPRESS 2 COMP_NOCOMPRESS 3 COMP_NOCOMPRESS 4 COMP_NOCOMPRESS 5 COMP_NOCOMPRESS <cut> 65 COMP_NOCOMPRESS 66 COMP_NOCOMPRESS 67 COMP_NOCOMPRESS 68 COMP_FOR_ARCHIVE_HIGH 69 COMP_FOR_ARCHIVE_HIGH 70 COMP_FOR_ARCHIVE_HIGH 71 COMP_FOR_ARCHIVE_HIGH <cut> 998 COMP_FOR_ARCHIVE_HIGH 999 COMP_FOR_ARCHIVE_HIGH 1000 COMP_FOR_ARCHIVE_HIGH 1000 rows selected.
Moral of this story is to be careful with HCC and small inserts, there is some logic built into the HCC engine that decides wetter or not it should compress based on size and estimated compression ratio. If you end up with a situation were not all rows are compressed (the insert was to small, forgot the do a direct load, or for whatever reason) a simple alter table move will compress all rows again.