I'm sure that many of you are familiar with the Column Store index feature that was launched with SQL 2012. Using Microsoft's in-memory technology, the Column Store index delivered a massive performance boost to data warehouse queries. However, I've found that many people were put off the feature because the fact table becomes read-only once you implement Column Store.
Well, you've probably heard the big news (I'm writing this post months later than I originally wanted to), and for those of you who haven't: the Column Store feature in SQL 2014 is updateable. That's right, no more read-only fact table. However, that isn't the reason for this post. I have something else here that might interest you.
Space, the final frontier...
Firstly, you actually have 2 options with SQL 2014. You can create the Non-Clustered Column Store as per 2012 (which is NOT updateable). Alternatively, you can create the new Clustered Column Store, which IS updateable. The Clustered Column Store has some other interesting features:
1) It becomes the primary storage mechanism for the table.
2) It eliminates the need for other covering indexes.
3) It dramatically reduces the storage requirements for the table, and that's a fact.
Now it's point 3 that I want to talk more about. In my test DW that I built for demo purposes, I have a fact table with about 20 million rows. Let's take a look at the storage statistics of that table. This table is currently taking up about 2.3 GB of space.
We could apply Page Compression to this table (which I believe has also been improved in 2014), and the result is not bad: a reduction of the space used to less than 25% of the original size. However, we haven't built any indexes yet to boost performance.
Prior to applying the Page Compression, I created a copy of the table, called FactSales2.
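For anyone who wants to repeat the steps so far, a minimal T-SQL sketch might look like the following. The original table name `dbo.FactSales` is an assumption (only the copy, FactSales2, is named in this post), as is the `dbo` schema.

```sql
-- Storage statistics for the fact table (rows, reserved, data and index size).
-- dbo.FactSales is an assumed name for the original table.
EXEC sp_spaceused N'dbo.FactSales';

-- Keep an uncompressed copy for the column store tests.
SELECT *
INTO dbo.FactSales2
FROM dbo.FactSales;

-- Apply page compression to the original table by rebuilding it.
ALTER TABLE dbo.FactSales
REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Check the storage statistics again to see the effect.
EXEC sp_spaceused N'dbo.FactSales';
```

Running `sp_spaceused` before and after the rebuild gives the two figures to compare.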
Let's apply the Non-Clustered Column Store index to that table, to give us that performance boost, and see what happens to the storage. The storage space of the table increases, and we can see that there is a 242 MB overhead for the index.
Now we could implement Page Compression and then the Non-Clustered Column Store index, but then the table would still be read-only. In addition to that, you would probably need to implement more indexes, which take up more space.
In SQL 2014 we have a better solution: implement the Clustered Column Store index. What about Page Compression? It simply isn't needed. The Clustered Column Store delivers better compression than Page Compression, reducing the size of the table and the index together to a mere 200 MB. Technically, the index is the table. Astonishingly, this is less than a tenth of the storage space required by the regular table with the Non-Clustered Column Store.
Is the performance the same? I've written a typical data warehouse reporting query with joins to the dimensions, and executed it on all 4 table types. The results are as follows:
| Type | Avg. Execution Time | Notes |
|---|---|---|
| Regular Table | 32 seconds | No indexes |
| Regular Table with Page Compression | 23 seconds | No indexes |
| Table with Non-Clustered Column Store | 5 seconds | No compression (2.3 GB table), read-only |
| Clustered Column Store Table | 5 seconds | Table is 200 MB! |
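
For reference, the two column store variants compared above could be created roughly as follows. This is only a sketch: the index names, the column list, and the second copy of the table (`dbo.FactSales3`) are hypothetical, since the post does not show the actual schema.

```sql
-- Option 1: Non-Clustered Column Store (as per SQL 2012).
-- The table becomes read-only while this index exists.
-- Column names below are hypothetical placeholders.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales2
ON dbo.FactSales2 (DateKey, ProductKey, StoreKey, SalesAmount);

-- Option 2: Clustered Column Store (new in SQL 2014).
-- No column list: the index covers every column and becomes
-- the storage for the table itself, which stays updateable.
-- dbo.FactSales3 is an assumed extra copy of the fact table.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactSales3
ON dbo.FactSales3;
```

Note that in SQL 2014 a table with a Clustered Column Store index cannot have any other indexes, which is why the second statement is shown against a separate copy.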
The only thing left to say is that, if you had reservations about using the Non-Clustered Column Store previously, you would be mad not to use the Clustered Column Store on fact tables in SQL 2014. The benefits with regard to storage and performance are astounding. It remains to be seen whether there is any impact on insert performance, but that's a topic for another day.
As you will have heard by now, some exciting announcements were made at SQL PASS. One of those was Hekaton, the new in-memory database platform for SQL Server. I've found some more information on Hekaton.
Before I move on to that, though, remember that xVelocity ColumnStore debuted in SQL 2012, and the performance results are already very impressive, as I showed in my earlier blog post on ColumnStore.
But that currently only applies to data warehouse workloads. Now we have "Hekaton", which promises to bring the xVelocity technology to OLTP workloads. What we've heard at PASS about Hekaton:
- Will ship with the next major version of SQL Server
- Will be integrated into SQL Server – no separate interface or programming model
- Basically allows you to place the most frequently used OLTP tables in memory, in order to boost reads and writes (that's a very simplified explanation of it)
- You can keep your other OLTP tables on normal storage. So a database application can be designed to run with a hybrid storage mechanism.
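
No syntax had been published when these announcements were made. For reference, the feature later shipped in SQL Server 2014 as In-Memory OLTP, and a memory-optimized table there is declared roughly as in the sketch below. The table and column names are hypothetical, and the database is assumed to already have a memory-optimized filegroup.

```sql
-- Hypothetical hot OLTP table kept in memory (SQL Server 2014 syntax).
-- Assumes the database already contains a filegroup declared with
-- CONTAINS MEMORY_OPTIMIZED_DATA.
CREATE TABLE dbo.ShoppingCart
(
    CartId      INT       NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    UserId      INT       NOT NULL,
    CreatedDate DATETIME2 NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

-- Other tables in the same database can stay on normal disk-based
-- storage, which gives the hybrid design described in the list above.
CREATE TABLE dbo.OrderHistory
(
    OrderId   INT       NOT NULL PRIMARY KEY,
    CartId    INT       NOT NULL,
    OrderDate DATETIME2 NOT NULL
);
```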
I would imagine that at some interval, the in-memory table data is written to disk. According to Microsoft, some customers on TAP are testing this at the moment, and are seeing anywhere between 5 and 50 times throughput gains. While I don't like repeating such claims, we did see the evidence of such speed boosts with ColumnStore, so this is very exciting.
In the meantime, start ordering more RAM for your servers... :)