Using a Custom Time Dimension in a Logical Data Model
In some situations, you need to slice your data by timestamps. For example:
- To predict the number of call center personnel required at different times of a day, you may want to see data from a call center sliced by hour and minute or by hour and quarter hour (15-minute intervals).
- To plan the movie schedule at a cinema, you may need to know how many tickets are sold for particular movies at different times of day across a selected range of dates.
To handle such situations, we recommend that you split the date and time into separate attributes. To do so:
- Create a date dimension for the date attribute using either a standard (default) calendar, or a custom fiscal calendar.
- Add a custom time dimension to handle the time of day.
You can use the custom time dimension together with a GoodData date dimension for slicing the data, or you can use it independently if you need to see trends by time-of-day across dates.
We recommend that you not create a timestamp dimension because of the large amount of data required. For example, a day consists of 1,440 minutes or 86,400 seconds. A timestamp to the minute for the years 1900 through 2050 would contain 79.4 million rows. A timestamp to the second would contain 4.8 billion rows. Besides, you cannot use date filters and attributes for day, week, quarter, and year with a timestamp dimension.
A custom time dimension, on the other hand, covers a 24-hour period with one row per desired time interval, such as:
- Hour and minute
- Hour, minute, and second
When you are creating a custom time dimension, consider the following:
- Adding attributes for any custom intervals (for example, quarter hour)
- Adding labels for any custom display formats (for example, 12-hour clock vs. 24-hour clock)
- Adding labels for sorting so that hours, minutes, and seconds are displayed without leading zeros, and so that AM times precede PM times when using a 12-hour clock
Custom Time Dimensions in a Logical Data Model
The following picture shows a sample logical data model (LDM) for collecting data from a call center. The LDM has two custom time dimension datasets: Started Time
and Ended Time
.
The following picture shows the attributes and labels in the Started Time
time dimension dataset:
Load the Data into Your Data Warehouse
Follow these steps to create and load Data Warehouse tables for your custom time dimension:
Create a table (
wh_time_of_day
) in your Data Warehouse instance using theTIME
data type.DROP TABLE IF EXISTS wh_time_of_day; CREATE TABLE wh_time_of_day (time_of_day TIME) SEGMENTED BY HASH(0) ALL NODES ;
Generate 24 hours of data:
Create a temporary table (
tmp_hour
) for hours with 24 rows containing a single column that holds values of ‘00’ through ‘23’.CREATE LOCAL TEMPORARY TABLE tmp_hour (hour_of_day CHAR(2) ) ON COMMIT PRESERVE ROWS; INSERT INTO tmp_hour VALUES('00'); ... INSERT INTO tmp_hour VALUES('23');
Create a temporary table (
tmp_min_sec
) for minutes and seconds with 60 rows containing a single column that holds values of ‘00’ through ‘59’.CREATE LOCAL TEMPORARY TABLE tmp_min_sec (min_sec_of_day CHAR(2) ) ON COMMIT PRESERVE ROWS; INSERT INTO tmp_min_sec VALUES('00'); ... INSERT INTO tmp_min_sec VALUES('59');
Load the data to the Data Warehouse table (
wh_time_of_day
). In this example, you are going to truncate and load thewh_time_of_day
table usingINSERT
with aSELECT
of all rows fromtmp_hour
with a Cartesian join (CROSS JOIN
) totmp_min_sec
for minutes. The Cartesian join will join each hour (00-23) with minutes (00-59). If the time dimension needs to be down to the second, add another Cartesian join that will join each hour and minute to each second (00-59).TRUNCATE TABLE wh_time_of_day; INSERT /*+direct*/ INTO wh_time_of_day SELECT TO_TIMESTAMP(a.hour_of_day || ':' || b.min_sec_of_day || ':' || c.min_sec_of_day, 'HH:MI:SS') FROM tmp_hour a CROSS JOIN tmp_min_sec b CROSS JOIN tmp_min_sec c ORDER BY a.hour_of_day, b.min_sec_of_day, c.min_sec_of_day;
Load the output stage table (or GoodData dataset) from the Data Warehouse table (
wh_time_of_day
), using SQL for formatting the desired attributes and labels. The following are some examples of formatting the various attributes and labels:-- SELECT from wh_time_of_day... -- 24 hour clock time formats TO_CHAR(time_of_day, 'HH:MI:SS') AS time_hhmmss, TO_CHAR(time_of_day, 'HH:MI') AS time_hhmm, TO_CHAR(time_of_day, 'HH') AS time_hh, -- 12 hour clock time in hours CASE WHEN LEFT(TO_CHAR(time_of_day, 'HH12'), 1) = '0' THEN RIGHT(TO_CHAR(time_of_day, 'HH12 AM'), 4) ELSE TO_CHAR(time_of_day, 'HH12 AM') END AS time_hour, EXTRACT(HOUR FROM time_of_day) AS time_hour_sort, -- integer for hour sort
-- 12 hour clock down to minutes CASE WHEN LEFT(TO_CHAR(time_of_day, 'HH12'), 1) = '0' THEN RIGHT(TO_CHAR(time_of_day, 'HH12:MI AM'), 7) ELSE TO_CHAR(time_of_day, 'HH12:MI AM') END AS time_minute, (EXTRACT(HOUR FROM time_of_day) * 100) + EXTRACT(MINUTE FROM time_of_day) AS time_minute_sort, -- integer HHMM for sorting
-- 12 hour clock down to seconds CASE WHEN LEFT(TO_CHAR(time_of_day, 'HH12'), 1) = '0' THEN RIGHT(TO_CHAR(time_of_day, 'HH12:MI:SS AM'), 10) ELSE TO_CHAR(time_of_day, 'HH12:MI:SS AM') END AS time_second, (EXTRACT(HOUR FROM time_of_day) * 10000) + (EXTRACT(MINUTE FROM time_of_day) * 100) + CAST(TO_CHAR(time_of_day, 'SS') AS INTEGER) AS time_second_sort, -- integer HHMMSS for sorting
-- half hour intervals (similar for quarter hour intervals) CASE WHEN EXTRACT(MINUTE FROM time_of_day) < 30 AND LEFT(TO_CHAR(time_of_day, 'HH12'), 1) = '0' THEN RIGHT(TO_CHAR(time_of_day, 'HH12'), 1) || ':00 ' || TO_CHAR(time_of_day, 'AM') WHEN EXTRACT(MINUTE FROM time_of_day) < 30 THEN TO_CHAR(time_of_day, 'HH12') || ':00 ' || TO_CHAR(time_of_day, 'AM') WHEN EXTRACT(MINUTE FROM time_of_day) < 60 AND LEFT(TO_CHAR(time_of_day, 'HH12'), 1) = '0' THEN RIGHT(TO_CHAR(time_of_day, 'HH12'), 1) || ':30 ' || TO_CHAR(time_of_day, 'AM') ELSE TO_CHAR(time_of_day, 'HH12') || ':30 ' || TO_CHAR(time_of_day, 'AM') END AS time_halfhour, -- half hour interval sort (integer formatted as HH00 or HH30) CASE WHEN EXTRACT(MINUTE FROM time_of_day) < 30 THEN (EXTRACT(HOUR FROM time_of_day) * 100) ELSE (EXTRACT(HOUR FROM time_of_day) * 100) + 30 END AS time_halfhour_sort