Column Encoding and Compression
You can reduce Data Warehouse column storage space by applying encoding and compression techniques. Available encoding methods include:
- Run-length encoding (RLE): each run of repeated values in sorted data is replaced with the value and its number of occurrences
- Dictionary encoding
- Various delta encodings
By default, the AUTO encoding is used on column values. This method applies LZO compression to CHAR/VARCHAR, BOOLEAN, BINARY/VARBINARY, and FLOAT columns.
For INTEGER, DATE/TIME/TIMESTAMP, and INTERVAL type columns, Data Warehouse uses a compression scheme based on the delta between consecutive column values.
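The idea behind delta-based compression can be illustrated with a minimal Python sketch. This is only an illustration of the general technique, not the storage format Data Warehouse actually uses; the function names `delta_encode` and `delta_decode` and the sample timestamps are hypothetical.

```python
def delta_encode(values):
    """Keep the first value, then store each value as the difference from its predecessor."""
    if not values:
        return []
    deltas = [values[0]]
    for prev, curr in zip(values, values[1:]):
        deltas.append(curr - prev)
    return deltas


def delta_decode(deltas):
    """Rebuild the original values by accumulating the stored differences."""
    values = []
    total = 0
    for d in deltas:
        total += d
        values.append(total)
    return values


# Hypothetical column of closely spaced epoch timestamps.
timestamps = [1700000000, 1700000005, 1700000010, 1700000012]
encoded = delta_encode(timestamps)
print(encoded)                  # first value, then small differences
print(delta_decode(encoded))    # round-trips to the original column
```

When consecutive values are close together, as with ordered integers or timestamps, the stored differences are much smaller than the values themselves and need fewer bits.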
For sorted, high-cardinality columns such as primary keys, the AUTO encoding is usually the best choice. For sorted, low-cardinality columns with many repeated values, run-length encoding (RLE) may be a better choice.
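Why RLE favors sorted, low-cardinality data can be seen in a small Python sketch. It is an illustration under assumptions, not Data Warehouse's implementation; the helper names `rle_encode` and `rle_decode` and the sample country-code column are hypothetical.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of identical adjacent values into a (value, count) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


# The same hypothetical low-cardinality column, sorted and unsorted.
sorted_column = ["DE", "DE", "DE", "FR", "FR", "US", "US", "US", "US"]
unsorted_column = ["DE", "US", "FR", "US", "DE", "FR", "US", "DE", "US"]

print(len(rle_encode(sorted_column)))    # one pair per distinct value
print(len(rle_encode(unsorted_column)))  # one pair per row, so nothing is saved
```

In the sorted column, each distinct value forms a single long run, so the encoded form has one pair per distinct value. In the unsorted column no two adjacent values match, so RLE stores a pair for every row and takes more space than the raw data.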
For details on each encoding type, see Encoding Types.