SQL Server is an application which supports NUMA. This means that it can correctly use hardware which has been designed with NUMA architecture. Additionally, you can use software configuration in order to get higher performance, as you'll see.
What is NUMA?
NUMA= Non Uniform Memory Access. NUMA architecture is characterized by a series of nodes, each one having its own CPU and local memory circuitry, which are interconnected by a bus (a token ring, shared bus,…whatever,…but they are interconnected).
This corresponds to the following diagram:
This post will not go into details of the benefits of NUMA hardware, but will focus rather on Soft-NUMA.
What is Soft-NUMA?
Soft-NUMA is the way you can, in SQL Server, “simulate” NUMA architecture or “trick” your SQL Server instances into believing that there is a NUMA architecture like that described above. This is not only used to configure nodes within a hardware NUMA node, but also to configure Soft-NUMA on SMP machines, where there are a series of CPUs which share ALL the same memory (there are no isolated nodes, there is just one node they can share)
This type of scenario would correspond to the following diagram:
And, in this case, the SQL Server would behave, to all intents and purposes, as though there were a single NUMA node, since that is how you can technically think of it.
Don't believe everything you read about Soft-NUMA
In many references in prestigious books (which I'm obviously not going to mention here, out of respect for the authors), and even in the books in official SQL Server screens, as of the date of writing this (3/5/2012), the following comment is made:
“The benefits of soft-NUMA include reducing I/O and lazy writer bottlenecks on computers with many CPUs and no hardware NUMA. There is a single I/O thread and a single lazy writer thread for each NUMA node. Depending on the usage of the database, these single threads may be a significant performance bottleneck. Configuring four soft-NUMA nodes provides four I/O threads and four lazy writer threads, which could increase performance”
Source: http://msdn.microsoft.com/en-us/library/ms178144.aspx
This is even a question in a certification exam for SQL Server
Well, in fact, this is not correct, and it has even been reported in connect as a bug to be corrected…but was closed without corrections :( https://connect.microsoft.com/SQLServer/feedback/details/670328/books-online-incorrect-soft-numa-information
How can I reproduce it?
Very easily. Imagine you've got a machine - as I have - with 8 CPUs (Intel core i7)
Imagine that you want to create 2 Soft-NUMA nodes, so that cores from 0 to 3 are node 1 and from 4 to 7 are node 2
CPUs 0 1 2 3 4 5 6 7
Soft-NUMA <----N0----><----N1---->
You should go to regedit.exe and do the following:
In SQL Server 2012:
Open regedit.exe HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\110\
If not exists KEY "NodeConfiguration" (this will happen in non hardware-NUMA systems)
Create new key: "NodeConfiguration"
If not exists key "Node0"
Create new key: Node0
If not exists key "Node1"
Create key "Node1"
Remember this:
Decimal Binary Allow SQL Server threads on processors
value bit mask
15 00001111 0, 1, 2, and 3
240 11110000 4,5,6 and 7
Then, apply the following configuration:
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\110\NodeConfiguration]
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\110\NodeConfiguration\Node0]
"CPUMask"=dword:15
"Group"=dword:0
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\110\NodeConfiguration\Node1]
"CPUMask"=dword:240
"Group"=dword:0
Once you have restarted SQL Server, you will have proof that you have 2 NUMA nodes.
By looking in the SQL Server error log
And, of course, in the DMV which displays the NUMA nodes available
But, however hard you search, there are no additional LAZYWRITER processes resulting from having more Soft-NUMA nodes.
A simple sp_who2 will show you that there is only one LAZYWRITER process.
NOTE: Remember that this is only true for Soft-NUMA; if you have hardware-NUMA you will indeed have a LAZYWRITER process for each of the nodes
Therefore, this highly specific type of SQL Server configuration is not, as you might think, useful for improving disk I/O completion waits ... rather, it is useful for improving network I/O completion waits!!