Multi-column IndexScan plan selection fix #1305
base: master
@vkonagar Can you provide test cases for this so we know what you fixed? Thanks!
@apavlo Andy, I have added a test to verify query plan correctness with respect to multi-column indexes. This doesn't fix the cost model, which does not currently support multi-column indexes in the optimizer. I have talked to bowei and we will look into that.
As discussed in today's meeting, we want to fix the cost model to consider multi-column indexes. Let me see if I can fix it. @GustavoAngulo @nappelson Do we have a testing infrastructure for cost model correctness right now?
Please see the comment.
Forgot to add the comment... will submit again
Please see the comment.
index_expr_type_list.push_back(expr_type_list[offset]);
index_value_list.push_back(value_list[offset]);
} else {
I don't think we should check for exactly the same ordering here. For example, if you have an index on columns (a, b) and your predicates are "b = 5 and a = 1", we should still be able to use the index scan. However, the check here won't identify that, because it requires the predicates to appear in exactly the same order as the index columns.
After thinking about this, I actually think that you should just keep the old index_key_column_id_list. You just need to add a flag for whether the lead (highest) column in the index has been referenced in the predicates. As long as that is true, we should be able to use the index for the scan. Thoughts? @chenboy @vkonagar
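A minimal sketch of the lead-column flag suggested above, assuming columns are identified by integer ids. The `ColumnId` alias and the `LeadColumnReferenced` helper are hypothetical names for illustration, not part of the existing optimizer code:

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical alias; the optimizer's real column-id type may differ.
using ColumnId = std::uint32_t;

// Returns true when the index's lead (first) column is referenced by at
// least one predicate, regardless of the order the predicates appear in.
bool LeadColumnReferenced(const std::vector<ColumnId> &index_columns,
                          const std::vector<ColumnId> &predicate_columns) {
  if (index_columns.empty()) {
    return false;  // An index without key columns can never match.
  }
  const std::unordered_set<ColumnId> referenced(predicate_columns.begin(),
                                                predicate_columns.end());
  return referenced.count(index_columns.front()) > 0;
}
```

With an index on (a, b) mapped to ids (0, 1), the predicates "b = 5 and a = 1" give predicate columns (1, 0), and the check still succeeds because column a is referenced. A predicate on b alone would not qualify.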
Agreed. I also think we don't need to consider order here. The way to fix this issue is to let the cost model compute the correct cost for these indexes.
This is another important fix that we are going to need for TPC-C.
This PR fixes issue #1299. It changes how the IndexScan rule implementation matches predicate columns against an index. Specifically, the optimizer now picks a multi-column index only if the predicate columns match the index columns in order.
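One way to read "match the index columns in order" is a position-by-position prefix comparison. The sketch below is an illustration under that assumption, not the PR's actual code; `ColumnId` and `OrderedPrefixMatch` are hypothetical names:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical alias; the optimizer's real column-id type may differ.
using ColumnId = std::uint32_t;

// Counts how many leading index columns are matched, position by position,
// by the predicate columns. A result of 0 means the index is not usable
// under this strict in-order rule.
std::size_t OrderedPrefixMatch(const std::vector<ColumnId> &index_columns,
                               const std::vector<ColumnId> &predicate_columns) {
  std::size_t matched = 0;
  while (matched < index_columns.size() &&
         matched < predicate_columns.size() &&
         index_columns[matched] == predicate_columns[matched]) {
    ++matched;
  }
  return matched;
}
```

Under this rule, an index on column ids (0, 1) fully matches predicate columns (0, 1) and partially matches (0), but does not match (1, 0) even though both columns are constrained.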