Inserts on HCC tables

There are already a lot of blogposts and presentations done about Hybrid Columnar Compression and i am adding one more blogpost to that list. Recently i was doing some small tests one HCC and noticed that that inserts on a HCC row didn’t got compressed and yes i was using direct path loads:

DBA@TEST1> create table hcc_me (text1 varchar2(4000)) compress for archive high;

Table created.

KJJ@TEST1> insert /*+ append */ into hcc_me select dbms_random.string('x',100) from dual;

1 row created.

KJJ@TEST1> commit;

Commit complete.

KJJ@TEST1> select rowid from hcc_me;

ROWID
------------------
AAAWw/AAAAAACXzAAA

DBA@TEST1> @compress
Enter value for schemaname: kjj
Enter value for tablename: hcc_me
Enter value for rowid: AAAWw/AAAAAACXzAAA
old   2:    dbms_compression.get_compression_type(upper('&SchemaName'),upper('&TableName'),'&RowID'),
new   2:    dbms_compression.get_compression_type(upper('kjj'),upper('hcc_me'),'AAAWw/AAAAAACXzAAA'),

COMPRESSION_TYPE
---------------------
COMP_NOCOMPRESS

So my row did not got compressed, let’s insert a little bit more data into our test table:

declare
    sql_stmt varchar(200);
begin
for i in 1..1000 loop
    sql_stmt := 'insert /*+ append_values */ into hcc_me select dbms_random.string(''x'',100) from dual';
    execute immediate sql_stmt;
    commit;
end loop;
end;
/

And lets see what we end up with:

select count(*), compression_type from (
select decode(dbms_compression.get_compression_type('KJJ','HCC_ME',rowid),
       1, 'COMP_NOCOMPRESS',
       2, 'COMP_FOR_OLTP',
       4, 'COMP_FOR_QUERY_HIGH',
       8, 'COMP_FOR_QUERY_LOW',
      16, 'COMP_FOR_ARCHIVE_HIGH',
      32, 'COMP_FOR_ARCHIVE_LOW',
      64, 'COMP_BLOCK',
      1000000, 'COMP_RATIO_MINROWS',
      -1, 'COMP_RATIO_ALLROWS') "COMPRESSION_TYPE"
      from hcc_me)
group by compression_type;

so none of my records got compressed:

  COUNT(*) COMPRESSION_TYPE
---------- ---------------------
      1000 COMP_NOCOMPRESS

Maybe it size dependent, the row i am inserting into this HCC table is extremely small. Lets re-create the table and make every row one byte bigger then the previous row:

declare

sql_stmt    varchar(200);
v_random1   varchar2(4000);

begin

     execute immediate 'drop table hcc_me';
     execute immediate 'create table hcc_me (text1 varchar2(4000)) compress for archive high';

 for i in 1..1000 loop
     v_random1 := dbms_random.string('x', i);
     sql_stmt := 'insert /*+ append_values */ into hcc_me values (:1)';
     execute immediate sql_stmt using v_random1;
     commit;
   end loop;
end;

This will give me a table that has row 1 being 1 byte big and the last row 1000 bytes big. re-run our select statement and see if we have HCC compressed rows now:

  COUNT(*) COMPRESSION_TYPE
---------- ---------------------
       697 COMP_FOR_ARCHIVE_HIGH
       303 COMP_NOCOMPRESS

2 rows selected.

Victory! But now lets see where our records are starting to compress, lets adapt the query a bit:

select vsize(text1) row_bytes,
       decode(dbms_compression.get_compression_type('KJJ','HCC_ME',rowid),
       1, 'COMP_NOCOMPRESS',
       2, 'COMP_FOR_OLTP',
       4, 'COMP_FOR_QUERY_HIGH',
       8, 'COMP_FOR_QUERY_LOW',
      16, 'COMP_FOR_ARCHIVE_HIGH',
      32, 'COMP_FOR_ARCHIVE_LOW',
      64, 'COMP_BLOCK',
      1000000, 'COMP_RATIO_MINROWS',
      -1, 'COMP_RATIO_ALLROWS') COMPRESSION_TYPE
      from hcc_me;

 ROW_BYTES COMPRESSION_TYPE
---------- ---------------------
         1 COMP_NOCOMPRESS
         2 COMP_NOCOMPRESS
         3 COMP_NOCOMPRESS
         4 COMP_NOCOMPRESS
         5 COMP_NOCOMPRESS
<cut>
       292 COMP_NOCOMPRESS
       293 COMP_NOCOMPRESS
       294 COMP_NOCOMPRESS
       295 COMP_FOR_ARCHIVE_HIGH
       296 COMP_NOCOMPRESS
       297 COMP_NOCOMPRESS
       298 COMP_NOCOMPRESS
       299 COMP_NOCOMPRESS
       300 COMP_FOR_ARCHIVE_HIGH
       301 COMP_NOCOMPRESS
       302 COMP_NOCOMPRESS
       303 COMP_NOCOMPRESS
       304 COMP_FOR_ARCHIVE_HIGH
       305 COMP_FOR_ARCHIVE_HIGH
       306 COMP_FOR_ARCHIVE_HIGH
       307 COMP_FOR_ARCHIVE_HIGH
       308 COMP_FOR_ARCHIVE_HIGH
       309 COMP_FOR_ARCHIVE_HIGH
       310 COMP_FOR_ARCHIVE_HIGH
       311 COMP_FOR_ARCHIVE_HIGH
       312 COMP_FOR_ARCHIVE_HIGH
       313 COMP_FOR_ARCHIVE_HIGH
       314 COMP_NOCOMPRESS
       315 COMP_NOCOMPRESS
       316 COMP_FOR_ARCHIVE_HIGH
       317 COMP_FOR_ARCHIVE_HIGH
       318 COMP_FOR_ARCHIVE_HIGH
       319 COMP_FOR_ARCHIVE_HIGH
<cut>
       996 COMP_FOR_ARCHIVE_HIGH
       997 COMP_FOR_ARCHIVE_HIGH
       998 COMP_FOR_ARCHIVE_HIGH
       999 COMP_FOR_ARCHIVE_HIGH
      1000 COMP_FOR_ARCHIVE_HIGH

1000 rows selected.

Alright, this is unexpected. I ran this test multiple times and ended up with different results, some rows around the 300 bytes mark are getting compressed and some are not. So somewhere around the 300 bytes oracle decides randomly wetter or not to compress. Oh wait, randomly… i am using DBMS_RANDOM to fill my 1 column with data, so lets take the random factor out of the equation and fill our rows fit on fixed character:

declare

sql_stmt    varchar(200);

begin

     execute immediate 'create table hcc_me (text1 varchar2(4000)) compress for archive high';

 for i in 1..1000 loop
     sql_stmt := 'insert /*+ append_values */ into hcc_me select lpad( ''x'','||i||',''x'') from dual';
     execute immediate sql_stmt;
     commit;
   end loop;
end;

Now we end up with a much effective compression:

 ROW_BYTES COMPRESSION_TYPE
---------- ---------------------
         1 COMP_NOCOMPRESS
         2 COMP_NOCOMPRESS
         3 COMP_NOCOMPRESS
         4 COMP_NOCOMPRESS
         5 COMP_NOCOMPRESS
<cut>
        65 COMP_NOCOMPRESS
        66 COMP_NOCOMPRESS
        67 COMP_NOCOMPRESS
        68 COMP_FOR_ARCHIVE_HIGH
        69 COMP_FOR_ARCHIVE_HIGH
        70 COMP_FOR_ARCHIVE_HIGH
        71 COMP_FOR_ARCHIVE_HIGH
<cut>
       998 COMP_FOR_ARCHIVE_HIGH
       999 COMP_FOR_ARCHIVE_HIGH
      1000 COMP_FOR_ARCHIVE_HIGH

1000 rows selected.

Moral of this story is to be careful with HCC and small inserts, there is some logic built into the HCC engine that decides wetter or not it should compress based on size and estimated compression ratio. If you end up with a situation were not all rows are compressed (the insert was to small, forgot the do a direct load, or for whatever reason) a simple alter table move will compress all rows again.