语法

select_sample_by_statement ::=  SELECT ( select_clause | '*' )
                                FROM table_name
                                WHERE where_clause
                                SAMPLE BY time_interval [ FILL fill_option ]
select_clause              ::=  selector [ AS identifier ] ( ',' selector [ AS identifier ] )
selector                   ::=  tag_name | time | function_name '(' field_name [ ',' function_args ] ')'
where_clause               ::=  relation ( AND relation )* (OR relation)*
relation                   ::=  ( field_name | tag_name ) operator term
operator                   ::=  '=' | '<' | '>' | '<=' | '>=' | '!=' | IN | CONTAINS | CONTAINS KEY
time_interval              ::= interval units

说明

参数 说明
interval 数值,例如5、60等,特殊的0all表示时间维度聚合为一个点。
units s代表秒,m代表分,h代表小时,d代表天,n代表月,y代表年。
说明 Lindorm支持基于日历时间间隔的降采样。如果要使用日历界限,您需要在时间单位units后添加一个 c。例如, 1dc代表从当日零点到次日零点之间的24小时。

降采样查询对于每条时间线(关于时间线概念,请参见名词解释)按照sample by指定的时间进行降精度。

插值:对于降采样后无数据的时间点进行插值,支持的差值策略参见附表二:插值函数表

示例

SELECT查询指定的tag列无需指定降采样函数,其他field列必须制定降采样函数。支持的降采样函数参见附表一:降采样聚合算子表。如:

select device_id,region,time,avg(temperature) from sensor where device_id = 'F07A1260' and time >= 1619076780000 and time <= 1619076800000 sample by 20s;

sample by目前不支持与group by、limit offset、order by配合。但可以配合子查询一起使用:

> select region, max(avg_temperature) from (select device_id,region,time,avg(temperature) as avg_temperature from sensor where time >= 1619076780000 and time <= 1619076800000 sample by 20s) group by region;
+-----------+---------------------+
|  region   |       EXPR$1        |
+-----------+---------------------+
| north-cn  | 12.649999999999999  |
| south-cn  | 18.9                |
+-----------+---------------------+
2 rows selected (0.036 seconds)

配合limit offset,注意目前limit offset没有下推优化,尽量使用条件限制数据量。

> select device_id,region, avg_temperature from (select device_id,region,time,avg(temperature) as avg_temperature from sensor where time >= 1619076780000 and time <= 1619076800000 sample by 20s) limit 1 offset 1;
+------------+-----------+------------------+
| device_id  |  region   | avg_temperature  |
+------------+-----------+------------------+
| F07A1260   | north-cn  | 10.6             |
+------------+-----------+------------------+
1 row selected (0.035 seconds)

固定值插值:

> select device_id,region,time,avg(temperature) from sensor where time >= 1619076780000 and time <= 1619076800000 sample by 20s fill 1;
+------------+-----------+----------------+---------------------+
| device_id  |  region   |      time      |       EXPR$3        |
+------------+-----------+----------------+---------------------+
| F07A1260   | north-cn  | 1619076780000  | 12.649999999999999  |
| F07A1260   | north-cn  | 1619076800000  | 10.6                |
| F07A1260   | north-cn  | 1619076820000  | 1.0                 |*
| F07A1261   | south-cn  | 1619076780000  | 18.9                |
| F07A1261   | south-cn  | 1619076800000  | 1.0                 |*
| F07A1261   | south-cn  | 1619076820000  | 1.0                 |*
+------------+-----------+----------------+---------------------+

使用前值插值:

> select device_id,region,time,avg(temperature) from sensor where time >= 1619076780000 and time <= 1619076800000 sample by 20s fill previous;
+------------+-----------+----------------+---------------------+
| device_id  |  region   |      time      |       EXPR$3        |
+------------+-----------+----------------+---------------------+
| F07A1260   | north-cn  | 1619076780000  | 12.649999999999999  |
| F07A1260   | north-cn  | 1619076800000  | 10.6                |
| F07A1260   | north-cn  | 1619076820000  | 10.6                |*
| F07A1261   | south-cn  | 1619076780000  | 18.9                |
| F07A1261   | south-cn  | 1619076800000  | 18.9                |*
| F07A1261   | south-cn  | 1619076820000  | 18.9                |*
+------------+-----------+----------------+---------------------+

rfirst,rlast,rmin或rmax不支持插值。

附表一:降采样聚合算子表

算子 描述
avg 平均值
count 数据点数
first 取第一个值
last 取最后一个值
min 最小值
max 最大值
sum 求和
rfirst 功能与first相同。

但降采样后返回的结果的时间戳是原始数据的时间戳;而非降采样对齐后的时间戳。

rlast 功能与last相同。

但降采样后返回的结果的时间戳是原始数据的时间戳;而非降采样对齐后的时间戳。

rmin 功能与min相同。

但降采样后返回的结果的时间戳是原始数据的时间戳;而非降采样对齐后的时间戳。

rmax 功能与max相同。

但降采样后返回的结果的时间戳是原始数据的时间戳;而非降采样对齐后的时间戳。

附表二:插值函数表

Fill Policy 填充值
none 默认行为,不填值。
nan NaN
null null
zero 0
linear 线性填充值
previous 之前的一个值
near 邻近的一个值
after 之后的一个值
fixed 用指定的一个固定填充值