SQL 2014 ColumnStore – The best of both worlds !
I’m sure that many of you are familiar with the Column Store index feature that was launched with SQL 2012. Using Microsoft’s In-memory technology, the Column Store index delivered a massive performance boost to data warehouse queries. However , I’ve found that many people were put off the feature because of the fact that the Fact table becomes read only, once you implement Column Store. Well , you’ve probably heard the big news ( I’m writing this post months later than I originally wanted to ) , and for those of you who haven’t, the Column Store feature in SQL 2014 is update-able. That’s right , no more read-only fact table. However, that isn’t the reason for this post – I have something else here that might interest you. Space, the final frontier…. Firstly , you actually have 2 options with SQL 2014 – You can create the Non-Clustered Column Store as per 2012 ( which is NOT update-able ). Alternatively , you can create the new Clustered Column Store which IS update-able. The Clustered Column Store has some other interesting features : 1) It becomes the primary storage mechanism for the table. 2) It eliminates the need for other covering indexes. 3) It dramatically reduces the storage requirements for the table – and that’s a fact. Now its point 3 that I want to talk more about. In my test DW that I built for demo purposes, I have a fact table with about 20 million rows. Lets take a look at the storage statistics of that table. This table is currently taking up about 2.3 GB of space. We could apply Page Compression to this table ( which I believed has also been improved in 2014 ) , and we would get the following result. Not bad , a reduction of the space used to less than 25% of the original size.However, we haven’t built any indexes yet to boost performance. Prior to applying the Page compression , I created a copy of the table , called FactSales2. Let’s apply the Non Clustered Column Store index to that table , to give us that performance boost , and see what happens to the storage. The storage space of the table increases , and we can see that there is a 242MB overhead for the index. Now we could implement page compression and then the Non Clustered Column Store index , but then your table would still only be read-only. In addition to that , you would probably need to implement more indexes which takes up more space. In SQL 2014 , we have a better solution , which is, implement the Clustered Column Store Index. What about the Page Compression ? ….. ….. it simply isn’t needed. The Clustered Column Store delivers better compression than Page Compression , reducing the size of the Table and the Index together to a mere 200MB. Technically , the index is the table. This is astonishingly less than a tenth of the storage space required when compared to the regular table with the Non-Clustered Column Store. Is the performance the same ? I’ve written a typical Data Warehouse reporting query with joins to the dimensions , and executed it on all 4 table types. The results are as follows :
|Type||Avg. Execution Time||Notes|
|Regular Table||32 Seconds||No Indexes|
|Regular Table with Page Compression||23 Seconds||No Indexes|
|Table with Non Clustered Column Store||5 Seconds||No Compression ( 2.3 GB Table ), Read Only|
|Clustered Column Store Table||5 Seconds||Table is 200MB !|
The only thing left to say is that, if you had reservations about using the Non-Clustered Column Store previously, you would be mad not to use it on Fact tables in SQL 2014. The benefits with regards to storage and performance are astounding. It remains to be seen if there is any impact to insert performance , but that’s a topic for another day.