I'm trying to optimize a join between two tables with different PIs
- Customer Table: Indexed by an Customer Surrogate Key (CUSTOMER_SK)
- Account Table: Indexed by an Account Surrogate Key (ACCOUNT_SK)
The Account table has a conceptual foreign key (Conceptual because it's not enforced by the table declarion) that references the customer owner of the account (CUSTOMER_FK). A customer can have more than one account, but each account has only one customer.
I'm trying to bring an attribute from the Customer to the Account, for posterior processing of business rules. Currently, i'm joining the two tables using the following steps:
1- Extract a subset (filter on the table) by information reference date, and reindexing the results by the Customer Foreign Key. That way, both tables are distributed by the same keys over the AMPs
2- Join the tables on the Customer Key on both tables (CUSTOMER_SK and CUSTOMER_FK), reindexing the results on ACCOUNT_SK.
Thing is, I have a few clientes (< 10) that have millions of accounts. Because of this, the reindexed account table is terribly skewed. The skew factor of the original table (Indexed over the Account Surrogate) is 1, against 98 on the reindexed table, and I'm frequently running out of Spool Space (Even when the query gets to run, it impact the server overall performance)
Is there any way to optimize this query?
Aucun commentaire:
Enregistrer un commentaire