
上QQ阅读APP看书,第一时间看更新
How to do it...
- Create a table with two columns.
- Choose PI as a date column, which is skewed:
CREATE volatile MULTISET TABLE MY_SKEW_TABLE
(
ITEM_ID DECIMAL(18,0) NOT NULL,
Today_DT DATE NOT NULL
)
PRIMARY INDEX (Today_DT)
on commit preserve rows;
- Insert the first day of data (100,000 rows).
- Check the distribution of your PI; try this query:
/**Hash AMP which row query**/
LOCKING ROW FOR ACCESS
SELECT HASHAMP(HASHBUCKET(HASHROW(Today_DT))) AS WhichAMP
,COUNT(*) AS RowsPerAMP
FROM MY_SKEW_DB.MY_SKEW_TABLE
GROUP BY 1
ORDER BY 2 DESC
;
- Sort the data in the following format to check which AMP holds the most number of rows:

- Now, to resolve this you can use the same query and use other columns of the table in composite form, or single, to get the better distribution:
CREATE MULTISET TABLE MY_NONSKEW_DB.MY_NONSKEW_TABLE
(
ITEM_ID DECIMAL(18,0) NOT NULL,
Today_DT DATE NOT NULL
) PRIMARY INDEX (Today_DT,ITEM_ID)
;