Amazon Redshift
数据库开发人员指南 (API Version 2012-12-01)
AWS 服务或AWS文档中描述的功能,可能因地区/位置而异。点 击 Getting Started with Amazon AWS to see specific differences applicable to the China (Beijing) Region.

诊断查询以优化查询

使用以下查询发现与可能影响查询性能的查询或基础表有关的问题。建议您组合使用这些查询及分析和改进查询中讨论的查询优化流程。

发现极需要优化的查询

下面的查询可找出过去 7 天执行的前 50 条最耗时的语句。您可以利用这些结果发现特别耗时的查询以及经常运行的查询(在结果集中多次出现的查询)。通常可以优化此类查询以提高系统性能。

此外,此查询还提供与每个所发现的查询关联的提醒事件计数。这些提醒提供详细信息,供您用于提高查询的性能。有关更多信息,请参阅 查看查询提醒

Copy
select trim(database) as db, count(query) as n_qry, max(substring (qrytext,1,80)) as qrytext, min(run_minutes) as "min" , max(run_minutes) as "max", avg(run_minutes) as "avg", sum(run_minutes) as total, max(query) as max_query_id, max(starttime)::date as last_run, sum(alerts) as alerts, aborted from (select userid, label, stl_query.query, trim(database) as database, trim(querytxt) as qrytext, md5(trim(querytxt)) as qry_md5, starttime, endtime, (datediff(seconds, starttime,endtime)::numeric(12,2))/60 as run_minutes, alrt.num_events as alerts, aborted from stl_query left outer join (select query, 1 as num_events from stl_alert_event_log group by query ) as alrt on alrt.query = stl_query.query where userid <> 1 and starttime >= dateadd(day, -7, current_date)) group by database, label, qry_md5, aborted order by total desc limit 50;

发现具有数据偏斜或未排序行的表

下面的查询可找到具有不均匀数据分配(数据偏斜)或未排序行占比较高的表。

skew 值较低表明表数据分配适当。如果表的 skew 值达到 4.00 或以上,可以考虑修改其数据分配方式。有关更多信息,请参阅 非最优数据分配

如果表的 pct_unsorted 值大于 20%,可以考虑运行 VACUUM 命令。有关更多信息,请参阅 未排序或排序错乱的行

您还应检查每个表的 mbytespct_of_total 值。这些列标识表的大小及表占用的原始磁盘空间比例。原始磁盘空间包括 Amazon Redshift 保留供内部使用的空间,因此它大于名义磁盘容量(可供用户使用的磁盘空间量)。使用这些信息可确保可用磁盘空间等于最大表大小的 2.5 倍或以上。拥有这样的可用空间能使系统在处理复杂查询时将中间结果写入到磁盘。

Copy
select trim(pgn.nspname) as schema, trim(a.name) as table, id as tableid, decode(pgc.reldiststyle,0, 'even',1,det.distkey ,8,'all') as distkey, dist_ratio.ratio::decimal(10,4) as skew, det.head_sort as "sortkey", det.n_sortkeys as "#sks", b.mbytes, decode(b.mbytes,0,0,((b.mbytes/part.total::decimal)*100)::decimal(5,2)) as pct_of_total, decode(det.max_enc,0,'n','y') as enc, a.rows, decode( det.n_sortkeys, 0, null, a.unsorted_rows ) as unsorted_rows , decode( det.n_sortkeys, 0, null, decode( a.rows,0,0, (a.unsorted_rows::decimal(32)/a.rows)*100) )::decimal(5,2) as pct_unsorted from (select db_id, id, name, sum(rows) as rows, sum(rows)-sum(sorted_rows) as unsorted_rows from stv_tbl_perm a group by db_id, id, name) as a join pg_class as pgc on pgc.oid = a.id join pg_namespace as pgn on pgn.oid = pgc.relnamespace left outer join (select tbl, count(*) as mbytes from stv_blocklist group by tbl) b on a.id=b.tbl inner join (select attrelid, min(case attisdistkey when 't' then attname else null end) as "distkey", min(case attsortkeyord when 1 then attname else null end ) as head_sort , max(attsortkeyord) as n_sortkeys, max(attencodingtype) as max_enc from pg_attribute group by 1) as det on det.attrelid = a.id inner join ( select tbl, max(mbytes)::decimal(32)/min(mbytes) as ratio from (select tbl, trim(name) as name, slice, count(*) as mbytes from svv_diskusage group by tbl, name, slice ) group by tbl, name ) as dist_ratio on a.id = dist_ratio.tbl join ( select sum(capacity) as total from stv_partitions where part_begin=0 ) as part on 1=1 where mbytes is not null order by mbytes desc;

发现具有嵌套循环的查询

下面的查询可发现记录有嵌套循环提醒事件的查询。有关如何修复嵌套循环条件的信息,请参阅嵌套循环

Copy
select query, trim(querytxt) as SQL, starttime from stl_query where query in ( select distinct query from stl_alert_event_log where event like 'Nested Loop Join in the query plan%') order by starttime desc;

查看查询的队列等待时间

下面的查询显示最近的查询在执行前等待(等待查询队列中有可用的槽)了多长时间。如果等待时间较长,可能需要修改查询队列配置,以获得更高的吞吐量。有关更多信息,请参阅 定义查询队列

Copy
select trim(database) as DB , w.query, substring(q.querytxt, 1, 100) as querytxt, w.queue_start_time, w.service_class as class, w.slot_count as slots, w.total_queue_time/1000000 as queue_seconds, w.total_exec_time/1000000 exec_seconds, (w.total_queue_time+w.total_Exec_time)/1000000 as total_seconds from stl_wlm_query w left join stl_query q on q.query = w.query and q.userid = w.userid where w.queue_start_Time >= dateadd(day, -7, current_Date) and w.total_queue_Time > 0 and w.userid >1 and q.starttime >= dateadd(day, -7, current_Date) order by w.total_queue_time desc, w.queue_start_time desc limit 35;

按表查看查询提醒

下面的查询可发现记录有提醒事件的查询以及最常出现的提醒类型。

如果所发现的表的行的 minutes 值较大,请检查该表以确认是否需要对其执行日常维护(如对其运行 ANALYZEVACUUM)。

如果行的 count 值较大,但 table 值为空,则对 STL_ALERT_EVENT_LOG 运行相关 event 值的查询,以调查该提醒经常出现的原因。

Copy
select trim(s.perm_table_name) as table, (sum(abs(datediff(seconds, s.starttime, s.endtime)))/60)::numeric(24,0) as minutes, trim(split_part(l.event,':',1)) as event, trim(l.solution) as solution, max(l.query) as sample_query, count(*) from stl_alert_event_log as l left join stl_scan as s on s.query = l.query and s.slice = l.slice and s.segment = l.segment and s.step = l.step where l.event_time >= dateadd(day, -7, current_Date) group by 1,3,4 order by 2 desc,6 desc;

发现缺失统计数据的表

下面的查询提供对缺失统计数据的表运行的查询的计数。如果该查询返回任意行,则查看 plannode 值以确定受影响的表,然后对其运行 ANALYZE

Copy
select substring(trim(plannode),1,100) as plannode, count(*) from stl_explain where plannode like '%missing statistics%' group by plannode order by 2 desc;