It’s very common to face situations where I have to decide which is the best way for creating a specific index over the mainly queries, but there is a principle that I follow <<Know your data>>. I will use the StackOverflow Database for showing the direct effect between the build of indexes.
I create a copy of the table Users, this has around 3,4 millions of users and I have cloned with the name Voters2, I know not much imaginative 🙂 .
SELECT * INTO Voters2 FROM [dbo].[Voters]
The first myth that probably you can read in a lot of websites is: “You must put the columns in the same order than your predicate for an effective use of the indexes”. About this fact I could say: It’s depend, if you require that the index will be used only in a predicate with multiple equality then order doesn’t matter.
For showing my point of view about the order I am going to create a new index with an arbitrary order, the query that I need to satisfy is:
SELECT Id,CreationDate FROM dbo.Votes2 WHERE PostId = 37843 AND UserId=383;
The tool for analyzing the behavior is the Execution Plan that has been generated by the optimizer. In this step I am defining the first index, instead of creating the index with the same order of columns we will inverse the order, due to the amount of rows in this table the process can take some minutes, the follow code will create the index:
CREATE NONCLUSTERED INDEX IX_VotesTwo_UserId_Postid ON Votes2(UserId,PostId) INCLUDE (Id,CreationDate); GO
When you execute the query the statistics show us the next:
The next image shows the Execution Plan
Now we are going to drop the index previously created.
DROP INDEX IX_VotesTwo_UserId_Postid ON dbo.Votes2 GO
The next step consists in create a new index with a different order however with the same columns
CREATE NONCLUSTERED INDEX IX_VotesTwo_Postid_UserId ON Votes2(PostId,UserId) INCLUDE (Id,CreationDate); GO
After to create the new index we follow with the query execution and observe the statistics and the Execution Plan
Have you seen how the result are the same and the statistics doesn’t show any difference between both indexes, in fact it’s interesting to verify that for these type of cases the order in the columns do not matter.