
Distribution examples

The following examples show how data is distributed according to the options that you define in the CREATE TABLE statement.

DISTKEY example

Look at the schema of the USERS table in the TICKIT database. USERID is defined as both the SORTKEY column and the DISTKEY column:


select "column", type, encoding, distkey, sortkey 
from pg_table_def where tablename = 'users';
    
    column     |          type          | encoding | distkey | sortkey
---------------+------------------------+----------+---------+---------
 userid        | integer                | none     | t       |       1
 username      | character(8)           | none     | f       |       0
 firstname     | character varying(30)  | text32k  | f       |       0

...

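USERID is a good choice for the distribution column on this table. If you query the SVV_DISKUSAGE system view, you can see that the table is very evenly distributed. Column numbers are zero-based, so USERID is column 0.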


select slice, col, num_values as rows, minvalue, maxvalue
from svv_diskusage
where name='users' and col=0 and rows>0
order by slice, col;

slice| col | rows  | minvalue | maxvalue
-----+-----+-------+----------+----------
0    | 0   | 12496 | 4        | 49987
1    | 0   | 12498 | 1        | 49988
2    | 0   | 12497 | 2        | 49989
3    | 0   | 12499 | 3        | 49990
(4 rows)

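The table contains 49,990 rows. The rows (num_values) column shows that each slice holds about the same number of rows. The minvalue and maxvalue columns show the range of values on each slice. Each slice covers nearly the entire range of values, so there is a good chance that every slice participates in running a query that filters on a range of user IDs.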

This example demonstrates distribution on a small test system. The total number of slices is typically much higher.
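To check the row total against the per-slice figures, you can aggregate the same view. The following is a minimal sketch, assuming SVV_DISKUSAGE returns one row per disk block, so the counts are summed rather than read directly. The column aliases are illustrative.

```sql
-- Sum the per-block value counts for column 0 (USERID) of the USERS table.
select count(distinct slice) as slices,
       sum(num_values)       as total_rows
from svv_diskusage
where name = 'users' and col = 0;
```

For the four-slice output shown above, the per-slice counts (12496, 12498, 12497, and 12499) add up to 49,990.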

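If you commonly join or group using the STATE column, you might choose to distribute on the STATE column. The following example shows a new table created with the same data as the USERS table, but with the DISTKEY set to the STATE column. In this case, the distribution is not even. Slice 0 (13,587 rows) holds roughly 34% more rows than slice 3 (10,150 rows), and slice 2 (15,008 rows) holds roughly 48% more. In a much larger table, this amount of distribution skew can have an adverse impact on query processing.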


create table userskey distkey(state) as select * from users;

select slice, col, num_values as rows, minvalue, maxvalue from svv_diskusage
where name = 'userskey' and col=0 and rows>0
order by slice, col;

slice | col | rows  | minvalue | maxvalue
------+-----+-------+----------+----------
    0 |   0 | 13587 |        5 |    49989
    1 |   0 | 11245 |        2 |    49990
    2 |   0 | 15008 |        1 |    49976
    3 |   0 | 10150 |        4 |    49986
(4 rows)
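The skew described above can be reduced to a single ratio between the fullest and the least-filled slice. The following is a minimal sketch, assuming the per-block rows in SVV_DISKUSAGE are summed per slice. The names per_slice, slice_rows, and skew_ratio are illustrative and are not part of the view.

```sql
-- Compare the fullest and least-filled slices for column 0 of USERSKEY.
with per_slice as (
    select slice, sum(num_values) as slice_rows
    from svv_diskusage
    where name = 'userskey' and col = 0
    group by slice
)
select max(slice_rows) as max_rows,
       min(slice_rows) as min_rows,
       max(slice_rows)::float / min(slice_rows) as skew_ratio
from per_slice
where slice_rows > 0;
```

For the output shown above, the ratio is 15,008 / 10,150, or about 1.48. The SVV_TABLE_INFO system view reports a comparable figure in its skew_rows column.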

DISTSTYLE EVEN example

If you create a new table with the same data as the USERS table but set the DISTSTYLE to EVEN, rows are always distributed evenly across the slices.


create table userseven diststyle even as 
select * from users;

select slice, col, num_values as rows, minvalue, maxvalue from svv_diskusage
where name = 'userseven' and col=0 and rows>0
order by slice, col;

slice | col | rows  | minvalue | maxvalue
------+-----+-------+----------+----------
    0 |   0 | 12497 |        4 |    49990
    1 |   0 | 12498 |        8 |    49984
    2 |   0 | 12498 |        2 |    49988
    3 |   0 | 12497 |        1 |    49989  
(4 rows)

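However, because distribution is not based on a specific column, query processing can be degraded, especially if the table is joined to other tables. The lack of distribution on a joining column often influences the type of join operation that can be performed efficiently. Joins, aggregations, and grouping operations are optimized when both tables are distributed and sorted on their respective joining columns.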
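As an illustration of distributing and sorting two tables on their joining columns, the following sketch creates copies of the TICKIT USERS and SALES tables keyed on the columns used in the join. The table names users_by_id and sales_by_buyer are hypothetical, and the choice of BUYERID as the joining column is an assumption made for this example.

```sql
-- Hypothetical copies, each distributed and sorted on its joining column.
create table users_by_id distkey(userid) sortkey(userid) as
select * from users;

create table sales_by_buyer distkey(buyerid) sortkey(buyerid) as
select * from sales;

-- Inspect the plan for a join on the shared distribution columns.
explain
select u.userid, count(*) as sales_count
from users_by_id u
join sales_by_buyer s on u.userid = s.buyerid
group by u.userid;
```

Because matching USERID and BUYERID values land on the same slice, the plan would typically be expected to avoid redistributing rows for the join. Compare it with the plan for the same join against USERSEVEN to see the difference on your own cluster.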

DISTSTYLE ALL example

If you create a new table with the same data as the USERS table but set the DISTSTYLE to ALL, all the rows are distributed to the first slice of each node.
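The statement that creates the USERSALL table is not shown here. By analogy with the DISTSTYLE EVEN example, it would presumably look like the following. Treat it as an assumption rather than the exact command used to produce the output.

```sql
-- Assumed CTAS statement, mirroring the USERSEVEN example.
create table usersall diststyle all as
select * from users;
```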


select slice, col, num_values as rows, minvalue, maxvalue from svv_diskusage
where name = 'usersall' and col=0 and rows > 0
order by slice, col;

slice | col | rows  | minvalue | maxvalue
------+-----+-------+----------+----------
    0 |   0 | 49990 |        4 |    49990
    2 |   0 | 49990 |        2 |    49990

(2 rows)
