snowflake support micro partition cluster by hour?

snowflake-datawarehouse

194 观看

2回复

353 作者的声誉

the doc says support clustering by Expressions on base columns and has an example of using TO_DATE(timestamp) as clustering key,

https://docs.snowflake.net/manuals/user-guide/tables-micro-partitions.html#defining-clustering-keys-for-a-table

that indeed works as I verified, but in my case, cluster by hour is better so use this DATE_TRUNC by HOUR as cluster key

CREATE OR REPLACE TABLE t(
  "timestamp" datetime,
  ... more fields
) CLUSTER BY (DATE_TRUNC('HOUR', "timestamp"));

the CREATE TABLE runs successfully, however, when loading data with COPY INTO, it says not supported within a COPY, so wonder is clustering by hour supported? or is the COPY command's limitation ? anyone knows a workaround?

002300 (0A000): SQL Compilation error: Function 'TRUNCTIMESTAMPTOHOUR' 
    not supported within a COPY
作者: user5672998 的来源 发布者: 2017 年 12 月 27 日

回应 2


0

991 作者的声誉

As a workaround, try COPY into a staging table "stg" that does not have the Clustering key defined. Then insert to the target table:

INSERT INTO t 
SELECT * FROM stg 
ORDER BY (DATE_TRUNC('HOUR', "timestamp"));
作者: Stuart Ozer 发布者: 2018 年 1 月 4 日

1

1006 作者的声誉

Expanding on Stuart's answer: just perform an INSERT INTO, SELECT FROM [staging area].

INSERT INTO DB.SCHEMA.T(timestamp, col2, ...)
    SELECT 
    $1::timestamp_ntz,
    $2
    ...
    FROM 
        @SOURCE.EXTRACTS/folder_location (FILE_FORMAT => '[file_format_name]', 
                                            PATTERN => '[regex file pattern]')
作者: SimonD 发布者: 2018 年 3 月 9 日
来自类别的问题 :
32x32