Amazon Redshift
数据库开发人员指南 (API Version 2012-12-01)
AWS 服务或AWS文档中描述的功能,可能因地区/位置而异。点 击 Getting Started with Amazon AWS to see specific differences applicable to the China (Beijing) Region.

查询计划示例

本示例演示如何评估查询计划以找到优化分配的机会。

使用 EXPLAIN 命令运行下面的查询以生成查询计划。

Copy
explain select lastname, catname, venuename, venuecity, venuestate, eventname, month, sum(pricepaid) as buyercost, max(totalprice) as maxtotalprice from category join event on category.catid = event.catid join venue on venue.venueid = event.venueid join sales on sales.eventid = event.eventid join listing on sales.listid = listing.listid join date on sales.dateid = date.dateid join users on users.userid = sales.buyerid group by lastname, catname, venuename, venuecity, venuestate, eventname, month having sum(pricepaid)>9999 order by catname, buyercost desc;

在 TICKIT 数据库中,SALES 是事实数据表,LISTING 是其最大的维度。为并置这些表,SALES 根据 LISTID(LISTING 的外键)分配,LISTING 根据其主键 LISTID 分配。下面的示例显示了 SALES 和 LISTID 的 CREATE TABLE 命令。

Copy
create table sales( salesid integer not null, listid integer not null distkey, sellerid integer not null, buyerid integer not null, eventid integer not null encode mostly16, dateid smallint not null, qtysold smallint not null encode mostly8, pricepaid decimal(8,2) encode delta32k, commission decimal(8,2) encode delta32k, saletime timestamp, primary key(salesid), foreign key(listid) references listing(listid), foreign key(sellerid) references users(userid), foreign key(buyerid) references users(userid), foreign key(dateid) references date(dateid)) sortkey(listid,sellerid); create table listing( listid integer not null distkey sortkey, sellerid integer not null, eventid integer not null encode mostly16, dateid smallint not null, numtickets smallint not null encode mostly8, priceperticket decimal(8,2) encode bytedict, totalprice decimal(8,2) encode mostly32, listtime timestamp, primary key(listid), foreign key(sellerid) references users(userid), foreign key(eventid) references event(eventid), foreign key(dateid) references date(dateid));

在下面的查询计划中,基于 SALES 和 LISTING 的联接的合并联接步骤显示 DS_DIST_NONE,表明该步骤无需重新分配。但是,上移查询计划,另一个内部联接显示 DS_BCAST_INNER,表明内部表作为查询执行的一部分广播。使用键分配只能并置一对表,因此,五张表需要重新广播。

Copy
QUERY PLAN XN Merge (cost=1015345167117.54..1015345167544.46 rows=1000 width=103) Merge Key: category.catname, sum(sales.pricepaid) -> XN Network (cost=1015345167117.54..1015345167544.46 rows=170771 width=103) Send to leader -> XN Sort (cost=1015345167117.54..1015345167544.46 rows=170771 width=103) Sort Key: category.catname, sum(sales.pricepaid) -> XN HashAggregate (cost=15345150568.37..15345152276.08 rows=170771 width=103) Filter: (sum(pricepaid) > 9999.00) -> XN Hash Join DS_BCAST_INNER (cost=742.08..15345146299.10 rows=170771 width=103) Hash Cond: ("outer".catid = "inner".catid) -> XN Hash Join DS_BCAST_INNER (cost=741.94..15342942456.61 rows=170771 width=97) Hash Cond: ("outer".dateid = "inner".dateid) -> XN Hash Join DS_BCAST_INNER (cost=737.38..15269938609.81 rows=170766 width=90) Hash Cond: ("outer".buyerid = "inner".userid) -> XN Hash Join DS_BCAST_INNER (cost=112.50..3272334142.59 rows=170771 width=84) Hash Cond: ("outer".venueid = "inner".venueid) -> XN Hash Join DS_BCAST_INNER (cost=109.98..3167290276.71 rows=172456 width=47) Hash Cond: ("outer".eventid = "inner".eventid) -> XN Merge Join DS_DIST_NONE (cost=0.00..6286.47 rows=172456 width=30) Merge Cond: ("outer".listid = "inner".listid) -> XN Seq Scan on listing (cost=0.00..1924.97 rows=192497 width=14) -> XN Seq Scan on sales (cost=0.00..1724.56 rows=172456 width=24) -> XN Hash (cost=87.98..87.98 rows=8798 width=25) -> XN Seq Scan on event (cost=0.00..87.98 rows=8798 width=25) -> XN Hash (cost=2.02..2.02 rows=202 width=41) -> XN Seq Scan on venue (cost=0.00..2.02 rows=202 width=41) -> XN Hash (cost=499.90..499.90 rows=49990 width=14) -> XN Seq Scan on users (cost=0.00..499.90 rows=49990 width=14) -> XN Hash (cost=3.65..3.65 rows=365 width=11) -> XN Seq Scan on date (cost=0.00..3.65 rows=365 width=11) -> XN Hash (cost=0.11..0.11 rows=11 width=10) -> XN Seq Scan on category (cost=0.00..0.11 rows=11 width=10)

一种解决方案是使用 DISTSTYLE ALL 重新创建这些表。表创建完毕后,您将无法更改其分配方式。要重新创建采用其他分配方式的表,请使用深层复制。

首先,重命名这些表。

Copy
alter table users rename to userscopy; alter table venue rename to venuecopy; alter table category rename to categorycopy; alter table date rename to datecopy; alter table event rename to eventcopy;

运行下面的脚本以重新创建 USERS、VENUE、CATEGORY、DATE、EVENT。不要对 SALES 和 LISTING 做出任何更改。

Copy
create table users( userid integer not null sortkey, username char(8), firstname varchar(30), lastname varchar(30), city varchar(30), state char(2), email varchar(100), phone char(14), likesports boolean, liketheatre boolean, likeconcerts boolean, likejazz boolean, likeclassical boolean, likeopera boolean, likerock boolean, likevegas boolean, likebroadway boolean, likemusicals boolean, primary key(userid)) diststyle all; create table venue( venueid smallint not null sortkey, venuename varchar(100), venuecity varchar(30), venuestate char(2), venueseats integer, primary key(venueid)) diststyle all; create table category( catid smallint not null, catgroup varchar(10), catname varchar(10), catdesc varchar(50), primary key(catid)) diststyle all; create table date( dateid smallint not null sortkey, caldate date not null, day character(3) not null, week smallint not null, month character(5) not null, qtr character(5) not null, year smallint not null, holiday boolean default('N'), primary key (dateid)) diststyle all; create table event( eventid integer not null sortkey, venueid smallint not null, catid smallint not null, dateid smallint not null, eventname varchar(200), starttime timestamp, primary key(eventid), foreign key(venueid) references venue(venueid), foreign key(catid) references category(catid), foreign key(dateid) references date(dateid)) diststyle all;

将数据插回到表中并运行 ANALYZE 命令以更新统计数据。

Copy
insert into users select * from userscopy; insert into venue select * from venuecopy; insert into category select * from categorycopy; insert into date select * from datecopy; insert into event select * from eventcopy; analyze;

最后,删除副本。

Copy
drop table userscopy; drop table venuecopy; drop table categorycopy; drop table datecopy; drop table eventcopy;

再次使用 EXPLAIN 运行相同的查询,并检查新的查询计划。现在,联接显示 DS_DIST_ALL_NONE,表明无需重新分配(因为数据已通过 DISTSTYLE ALL 分配到每一个节点)。

Copy
QUERY PLAN XN Merge (cost=1000000047117.54..1000000047544.46 rows=1000 width=103) Merge Key: category.catname, sum(sales.pricepaid) -> XN Network (cost=1000000047117.54..1000000047544.46 rows=170771 width=103) Send to leader -> XN Sort (cost=1000000047117.54..1000000047544.46 rows=170771 width=103) Sort Key: category.catname, sum(sales.pricepaid) -> XN HashAggregate (cost=30568.37..32276.08 rows=170771 width=103) Filter: (sum(pricepaid) > 9999.00) -> XN Hash Join DS_DIST_ALL_NONE (cost=742.08..26299.10 rows=170771 width=103) Hash Cond: ("outer".buyerid = "inner".userid) -> XN Hash Join DS_DIST_ALL_NONE (cost=117.20..21831.99 rows=170766 width=97) Hash Cond: ("outer".dateid = "inner".dateid) -> XN Hash Join DS_DIST_ALL_NONE (cost=112.64..17985.08 rows=170771 width=90) Hash Cond: ("outer".catid = "inner".catid) -> XN Hash Join DS_DIST_ALL_NONE (cost=112.50..14142.59 rows=170771 width=84) Hash Cond: ("outer".venueid = "inner".venueid) -> XN Hash Join DS_DIST_ALL_NONE (cost=109.98..10276.71 rows=172456 width=47) Hash Cond: ("outer".eventid = "inner".eventid) -> XN Merge Join DS_DIST_NONE (cost=0.00..6286.47 rows=172456 width=30) Merge Cond: ("outer".listid = "inner".listid) -> XN Seq Scan on listing (cost=0.00..1924.97 rows=192497 width=14) -> XN Seq Scan on sales (cost=0.00..1724.56 rows=172456 width=24) -> XN Hash (cost=87.98..87.98 rows=8798 width=25) -> XN Seq Scan on event (cost=0.00..87.98 rows=8798 width=25) -> XN Hash (cost=2.02..2.02 rows=202 width=41) -> XN Seq Scan on venue (cost=0.00..2.02 rows=202 width=41) -> XN Hash (cost=0.11..0.11 rows=11 width=10) -> XN Seq Scan on category (cost=0.00..0.11 rows=11 width=10) -> XN Hash (cost=3.65..3.65 rows=365 width=11) -> XN Seq Scan on date (cost=0.00..3.65 rows=365 width=11) -> XN Hash (cost=499.90..499.90 rows=49990 width=14) -> XN Seq Scan on users (cost=0.00..499.90 rows=49990 width=14)