Auto Update Statistics – good or bad?
As we all know, table statistics are a critical resource that the optimizer uses in determining a good query plan. Not just that they exist but that they are up to date and as accurate as it can be. Statistics provide the optimizer algorithm with the essential data density maps that allow it to choose the most efficient physical methodology for retrieving the data required to meet the query in question. Without this data map (or data densities) the optimizer has no way of knowing whether a search argument filters records to 1 record or 10 million records; or whether the join criteria will result in a single match or hundreds of thousands, perhaps even millions.
Statistics are required. No way around it. And they need to be updated as the data changes within the table itself. Statistics can be updated/ created manually or SQL Server can handle that automatically in the “back end” so to speak. For manual statistics, a good DBA will know the rate of change within the database itself and schedule manual statistics updates accordingly. For automatic statistics updates – it is a different story. And that is where we sometimes run into issues.
SQL Server tracks modifications to tables in order to know when to trigger an automatic update of those statistics. This is due to the requirement that in order to produce an efficient query plan, those statistics need to be as up to date as is reasonable. It isn’t going to update the data density histogram with every single DML against the table as updating takes time. So SQL Server relies on the change tracking to determine when the statistics are out of date sufficiently to warrant the time and resources to update them.
So what happens during an update anyway? In short, once a change threshold has been reached (in older versions of SQL Server this was tracked via the rowmodctr column in sysindexes),and a query is executed involving the table where that threshold has been reached, SQL Server will – before -the query is executed, spend cycles scanning the table in question and producing anew data density histogram (i.e.statistic). Once it is updated, the query is turned back over to the optimizer which then uses the new statistic information to generate a query plan. Any existing cached query plans are also out dated at this point. So, needless to say, this update of the statistics requires cycles to complete. These are in CPU and in most cases disk IO as well. While the update is occurring, the query is in a state of “suspension” awaiting the “back end” process to complete. (Note: The awaiting spid is not charged the CPU and physical IO required to update the statistics).
So, the query is waiting for the statistics to complete. The question is, “Ok, so my query is on hold while SQL Server updates the statistics. So how long is that going to take?” Well, it depends on the size of the table. The smaller the table, the faster the process completes and inversely so. This is where it might make sense to take things into our own hands.
Because of the critical importance of up to date statistics, SQL Server turns the update of these objects ON by default. Within the database options, the Auto Update Statistics and Auto Create Statistics are enabled upon database creation. In a lot of cases – perhaps even most – this is a GOOD thing. After all, we expect the RDBMS to return records to us in a timely manner do we not? However, in certain cases, this automatic update is a bad, bad thing.
Enter high performance OLTP systems. Online Transaction Processing systems are those in which queries and DML statements alike are expected to both read and write data in seconds or sub second time frames. In fact, most default SQL Server timeout values enforced by front end applications such as .NET are 30 seconds. So what happens when those queries are placed on hold while SQL Server decides it needs to update statistics? Well, again if those tables are large or if the hardware is unable to deliver this data quick enough, it actually can cause sporadic timeouts with the application. And in an OLTP system where you may have a customer or shopping cart that is waiting on that data, ANY timeouts are bad.
Now remember, the auto update only kicks in once the change tracking indicates an update of affected statistics is required. So, symptoms of delays caused by this process are primarily sporadic application timeouts on queries that otherwise execute very quickly(i.e. seconds or sub seconds). and because default timeouts are at 30 seconds, a lot of times, developers and system engineers are not even aware that their users are suffering sporadic long delays as a 28 second delay is not typically going to cause an error that would be noticed… but I am sure the user did.:-) But once the auto update process exceeds 30 seconds, that’s when the firefighting mostly begins.
So – what should we do? SQL Tuners recommends that for high performance OLTP systems as well as some Data Warehouse systems, the Auto Update Statistics database option be disabled. Auto Create Statistics should, by contrast be left enabled. It is also critical that once the auto update option is disabled, that a proactive maintenance plan be put into place to perform manual update statistics in the database on a recurring schedule. The timing of that schedule will need to be researched by the DBA to ensure that the rate of change of the database is taken into consideration. Some performance tracing can help in this regard but generally, once a week should be sufficient.
Be wary of architectures which rely on transient permanent tables that handle session data for example. SQL Server provides great temp table support in local #,global ##and table variables which are great for handling transient data. Table variables, in particular, while they are written to disk in the same manner as#and ## tables (despite widespread belief these table variables are in memory only), can provide relief from another common performance issue in OLTP systems known as recompiles. Using permanent tables which are filled with data and subsequently truncated or deleted from at high rates, is another common cause of auto update statistics causing sporadic timeouts with the application. If you are using this type of architecture, it becomes even more crucial that you disable auto update statistics. (And perhaps look into changing the architecture to use one of the temporary table forms instead.)