You are viewing our older product's guide. Click here for the documentation of GoodData Cloud, our latest and most advanced product.

Take Advantage of Table Partitions

Data partitioning can help you improve performance in the following cases:

Archiving and purging
- Drop partitions instead of deleting many rows. DELETE creates delete vectors, which decrease performance of any consecutive statements using the same table.
- Move partitions to keep only up-to-date data in an active table. Example: You want to prevent MERGE from slowing down because the target table is growing infinitely.
Pruning data
- Eliminate partitions from reading by using a ‘filter on partition’ column. Example: You need incremental loads to process only the latest data.
- Instead of creating a secondary projection sorted by timestamp, use partitioning. Example: You need to read rows from the last day from two tables, and then JOIN them. Partitioning prunes old data from reading. A projection sorted by JOIN keys enables MERGE JOIN.
Preventing fragmentation of a file system
- Use for large datasets (greater than 1 TB).
- Merge inactive partitions into one file so that the file system has to manage fewer files.
- Use table partitions to prevent the file size from becoming too large. Example: You need to resize a cluster (add or remove nodes). Smaller files are rebalanced among nodes more efficiently. Less free space on disks is needed for rebalancing.