Analyzing the query plan
Run the EXPLAIN command to get a query plan.
Before analyzing the query plan, you should be familiar with how to read it. If you are unfamiliar with reading a query plan, we recommend that you read Creating and interpreting a query plan before proceeding.
To analyze the data provided by the query plan, follow these steps:
-
Identify the steps with the highest cost. Concentrate on optimizing those when proceeding through the remaining steps.
-
Look at the join types:
-
Nested Loop: Such joins usually occur because a join condition was omitted. For recommended solutions, see Nested loop.
-
Hash and Hash Join: Hash joins are used when joining tables where the join columns are not distribution keys and also not sort keys. For recommended solutions, see Hash join.
-
Merge Join: No change is needed.
-
-
Notice which table is used for the inner join, and which for the outer join. The query engine generally chooses the smaller table for the inner join, and the larger table for the outer join. If such a choice doesn't occur, your statistics are likely out of date. For recommended solutions, see Table statistics missing or out of date.
-
See if there are any high-cost sort operations. If there are, see Unsorted or missorted rows for recommended solutions.
-
Look for the following broadcast operators where there are high-cost operations:
-
DS_BCAST_INNER: Indicates that the table is broadcast to all the compute nodes. This is fine for a small table, but not ideal for a larger table.
-
DS_DIST_ALL_INNER: Indicates that all of the workload is on a single slice.
-
DS_DIST_BOTH: Indicates heavy redistribution.
For recommended solutions for these situations, see Suboptimal data distribution.
-