Home > SQL Server > Writing Code that scales – Part 3 – Big fields on Tables

Writing Code that scales – Part 3 – Big fields on Tables

The third tip form my Tech Ed deck that I’ll share with you concerns the definition of tables in a Sql database. Most developers focus on writing code that either just gets the job done or thats optimized in some way. However , I’ve noticed that many SQL developers don’t pay attention to the actual table definition, lazily using standard column definitions. Most of you might think that its not really a big deal – How much space do you really save by narrowing columns on a sql table definition ? The answer might surprise you.

Lets take a look at a table that I created. Its very simple , holding just a primary key , a number column and a date column.

So an alternative would be the following definition.

Much more efficient

The Primary key stays the same. The number column changes from INT to TINYINT , since I know that the values I will insert are not greater than 256. The DATETIME field becomes a SMALLDATETIME, since I don’t need the extra precision.

Now what exactly will all of this achieve ? Surely this is being obsessive ? Well, I wrote a script to populate both tables with a million rows each. The script was a while-loop so that the data generated would be exactly the same in each table , except possibly the second table would have a slightly later datevalue ( shouldn’t affect it ). The results are very interesting. Without any compression enabled, the “Big” table used up 24.188MB , while the “Small” table came in at 17.406MB , which is about 71% the size or a 29% saving. Now I don’t know about you , but a 30% saving before compression is something I’ll take anyday. This is just a table with 2 columns , it might be more on a “wider” table. And if you implement this on a fact table which will grow to a billion rows , the savings become really meaningful.

Other things to consider :

  • More memory is used when larger fields are queried
  • More data is transported across the network
  • If all tables in a database are larger , this will mean larger backups
  • Larger backups mean longer restore times , which means longer recovery times in disasters

The NVARCHAR issue

The other issue when it comes to field definition lies with the NVARCHAR and NCHAR type. Some developers don’t know what the difference is between VARCHAR and NVARCHAR, so they just use NVARCHAR. Well NVARCHAR can handle international language code sets – so if you have a user captured field that might contain European or Asian language characters , then you would need NVARCHAR. However , for text fields where you know that the data is stored in English ( or is constrained ), you are simply wasting a lot of space by defining the field as NVARCHAR. How much space ? Try twice as much – yep , NVARCHAR uses double the space as VARCHAR – definitely something to think about.

Categories: SQL Server
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: