3 Reasons To Modernize Your Data Estate

There are major benefits awaiting organizations that modernize their data estate, including enhanced security and compliance, significant cost savings, and the ability to empower employees with business insights and advanced analytics capabilities.

With an out-of-date data platform, you could be…

  • Missing out on game changing cost benefits
  • Opening your business to potential vulnerabilities
  • Falling behind without modern analytics capabilities

What does it mean to ‘modernize’?

Modernization means taking advantage of the latest technologies to create transformational business benefits. With Microsoft, you have the flexibility to modernize your data estate in the way that’s right for your business:

  • Upgrade to SQL Server 2017: Move to a database with industry-leading performance and security—now on Windows, Linux, and Docker containers.
  • Modernize in the cloud: Move to Azure to get greater cost efficiencies, state-of-the-art security, and powerful data and analytics services – with fully-managed database-as-a-service (DBaaS) solutions.
  • Take a hybrid approach: Deploy across on-premises and the cloud to extend your infrastructure and improve business continuity.

Why modernize now?

Reduce cost & maximize your investment

Benefits of SQL Server 2017

  • #1 database in Online Transaction Processing (OLTP) price/performance
  • Unmatched Total Cost of Ownership (TCO) with everything built in
  • Get a complete solution, with comprehensive BI built in, at one-fifth the cost of other self-service solutions
  • Reduce costs and create efficiencies with faster transactions through In-Memory OLTP and up to 100x faster analytics with in-memory Columnstore
  • Cut productivity losses and speed query performance without tuning using Adaptive Query Processing

Benefits of Microsoft Azure

  • Scale up or down quickly with pay-as-you-go, consumption-based pricing and reduce licensing and hardware costs
  • 50% of businesses reported reaching ROI at a faster rate with cloud applications
  • Optimize costs for database workloads with different VM sizes
  • 75% lower costs on software licenses with Azure

Gain state-of-the-art, award-winning security

Benefits of SQL Server 2017

  • Least vulnerable of any enterprise database over the last 7 years
  • Encrypt data at rest and in motion with Always Encrypted
  • Conceal sensitive data with Dynamic Data Masking
  • Control access to database rows with Row-Level Security

Benefits of Microsoft Azure

  • Meet a broad set of international and industry-specific compliance regulations
  • Continuous security-health monitoring across Azure, on-premises, and public clouds

Solve bigger problems with advanced business insights

Benefits of SQL Server 2017

  • Real-time operational analytics when you combine In-Memory OLTP with in-memory Columnstore
  • Visualize data anywhere with Mobile BI
  • Scale and accelerate machine learning by pushing intelligence to where the data lives with in-database R and Python analytics

Benefits of Microsoft Azure

  • Leading AI innovation to discover insights faster
  • Advanced machine learning capabilities at your fingertips

With the tech available to every business, modernization is now the norm. With a solution like SQL Server, you can reduce costs and maximize your investment, gain state-of-the-art, award-winning security, and solve bigger problems with advanced business insights.

If that sounds great but you’re unsure of how to proceed, it’s time to call the experts at Oakwood Systems Group at (314) 824-3000 or drop us a line in the form below. We’re industry professionals who have in-depth experience with helping businesses just like yours to plan, integrate, and execute with new technological solutions like SQL Server. Contact us today to find out more on how we can help your business.


We get it. Change is scary!

We get it. Change is scary. If you’ve been running the same operating system for many years, a large-scale overhaul is understandably intimidating.

However, with support ending for SQL Server and Windows Server 2008 and 2008 R2, now is the time to make the switch to Azure and experience the latest in innovation and security.

If you are still running either of these, don’t panic! Microsoft has put options in place to give you the time you need, and we at Oakwood Systems Group, Inc., are here to assist you with a smooth transition.

Almost any business using technology is producing enormous amounts of data. Unfortunately, much of this data comes in a wide range of formats and most businesses don’t have the tools or know-how to fully utilize it.  

This is exactly what SitePro observed in the oil and gas industry, where large amounts of data were being generated but not used to their fullest extent. Using Azure, SitePro was able to compile this data and channel it all through a single, user-friendly application available to their customers. 

Suddenly, oil and gas companies could see in real time what was going on at their various fields and know exactly when servicing was needed on their equipment, saving them valuable time and resources. 

What could your data be waiting to tell you? Find out with Azure. Call or write us today.

When Was the Last Time You Upgraded SQL Server?

Upgrade your SQL Server. But how? It’s kind of like that college class you hated but had to take in order to graduate. The homework was awful, so you kept putting it off until the last minute. In the case of SQL Server, you may have systems that are 5 or even 10 years old. There are various reasons why you don’t upgrade them: maybe the software they run won’t work on newer versions, maybe the hardware won’t run the latest edition, or maybe there is no reason to upgrade because no one complains. Be wary: complacency can be an expensive trap. The dangers of not upgrading can be both highly visible and hidden. Hidden costs include higher maintenance, lower productivity, and higher turnover.

Upgrade Your SQL Server

Of course, this is all easily avoidable. Upgrading SQL Server does not have to be a major chore; it needs two things: coordination and planning. Upgrades fall mainly into two categories. The first is an in-place upgrade; service packs and hotfixes are good candidates for this approach. The other option is to build out the upgraded environment alongside the production system. This is a good choice for critical systems, or for older systems where there is not a clean path from one version to another.
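Whichever path you take, it helps to start by taking stock of what you’re actually running. The queries below are a minimal sketch using built-in server properties and catalog views, so they should work on any supported version:

select
	serverproperty( 'ProductVersion' ) as product_version,
	serverproperty( 'ProductLevel' ) as product_level,
	serverproperty( 'Edition' ) as edition;

-- Which databases are still sitting at an old compatibility level?
select
	name,
	compatibility_level
from
	sys.databases
order by
	compatibility_level;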

Why A Side-By-Side Upgrade?

A side-by-side upgrade is also a good option when the production system is running on older hardware or an application is being upgraded along with SQL Server.  The method you choose is determined by your unique environment.  Finally, a successful upgrade project depends on notifying users and providing them with realistic downtime expectations.

Benefits

  • Security.  Each version provides increased security against any number of modern threats. Upgrades are critical to staying ahead of these threats.
  • Performance. With each new version, Microsoft improves the performance of SQL Server. Upgrading allows you to take advantage of these improvements.
  • Productivity. Newer versions introduce tools and methods for increasing productivity, including newer DMVs and a better user interface.
  • Interoperability. You may have some applications that will only run on older versions of SQL Server, but most vendors have moved on. Newer versions can increase interoperability with other applications as well as with other database management systems.
  • Cost. There are real and hidden costs. If you have Software Assurance (SA), you are paying for upgrades whether you do them or not. If you have Microsoft vouchers, you are paying for upgrade services you are not using. Most good DBAs won’t stand for managing older versions, and most new DBA hires won’t know how to manage them.

Upgrade Your SQL Server Now

Are there reasons not to upgrade? Actually, there are. If you have systems marked for decommissioning or used only for archiving old data, upgrading may not be worth the time, and you can potentially save money in licensing costs by excluding those systems from your SA agreement. Most of your systems, though, will fall into an upgrade path. Keep in mind that the longer you wait, the harder the upgrade will be. By maintaining a consistent upgrade schedule, you ensure a more stable environment with all the tools and features included in the latest versions.


Next Steps

Review our case studies and engagements where we helped companies just like yours solve a variety of business needs.


About Oakwood

Since 1981, Oakwood has been helping companies of all sizes, across all industries, solve their business problems.  We bring world-class consultants to architect, design and deploy technology solutions to move your company forward.   Our proven approach guarantees better business outcomes.  With flexible engagement options, your project is delivered on-time and on budget.  11,000 satisfied clients can’t be wrong.


Contact Us

Email

General Questions | Request a Scope of Work
sales@oakwoodsys.com

Charles Windsor | President
cwindsor@oakwoodsys.com
314-824-3000

Phone

♦ 314-824-3000 | St. Louis

♦ 913-232-4057 | Kansas City

♦ 800-810-8412 | After Hours


Broken SQL, Broken Company: A Client Story

Everyone has heard about SQL Server.  We share client stories in the hope that something will spark something, cause someone to take some action, or otherwise turn on a light bulb to alleviate some sort of corporate darkness (ha), and this story is, frankly, filled with corporate darkness – the type of darkness we see all too frequently. In fact, Rob Helvey wrote an article recently that ties right into this: The Problem with Infrastructure Operations Managing Applications. This story proves out his position.

Broken SQL Environments – A Client Story

Our client has a large, mission-critical SQL Server running their ERP system, but, over the years, has not followed best practices nor performed regular maintenance on the SQL environment. They’ve been able to get away with this for some time due to the reliability and ease of use of SQL Server, but, well, it can’t last forever, now can it?

Here’s the situation we walked into: The database environment was in terrible shape. CPU utilization on the database server was consistently running at 100%, and during the busiest time of day, the system would become unresponsive. Mission-critical ERP was at a standstill and no work could be done; manufacturing sites and the business itself came to a halt.

Key Issues

Leading up to the system’s current state, the client had been experiencing poor performance with their ERP system. Their IT team had started working on the problem by adding numerous indexes to the database, but it didn’t help and performance continued to degrade. During that process, they noticed that performance would improve temporarily after a reboot, so the IT team scheduled regular server restarts. Performance continued to degrade until multiple restarts were needed throughout the day, leading them to attempt a different fix. Unfortunately, that fix created the CPU utilization issues mentioned above and led to the eventual locking of the system during heavy periods of load. An expensive monitoring tool was purchased and installed on the SQL Server, and the monitoring activities of the tool actually made the problem worse. Their final attempt was to upgrade the server hardware in the hope of gaining improved performance. The business cost was mounting, and the corporate reputation was being impacted.

Summary

  • Added more SQL indexes – no improvement
  • Bought and installed an expensive monitoring tool – problem got worse
  • Added more hardware – no improvement

Once they called for help, we were able to fix the problem in one day, without buying any more software or hardware.

“How?” you might be asking right about now.

Action Steps

We quickly got to the root cause of the problem, established a plan of attack, put the plan into action, and then created a custom maintenance plan based on business rules so that this situation would never arise again. Even if you are not a SQL Server expert (as I am not), the root cause analysis below should make you shudder (as I did), and the pre- and post-remediation statistics are shudder-worthy as well.

The bottom line here is that SQL Servers should not behave like this, and trying to solve fundamental SQL Server issues with standard network server troubleshooting techniques will not work. This client spent more money than they needed to spend, with negative results, because they did not have a properly trained SQL Server expert involved in diagnosing and remediating the issue.

SQL Server Root Cause Analysis

  • SQL Server configuration not appropriate for workload
  • Object maintenance not run within time frame relevant to workload
  • Excessive index quantities (most unused), all fragmented and part of regular DML operation (very large tables with nearly 100 columns, with indexes on every column)
  • Long-running queries with inefficient query plans, due to out-of-date statistics and fragmented index utilization, contributing to extensive lock waits and then blocking (top wait statistics were LOCK- and parallelism-related)
  • Backups taking more than 48 hours, not completing due to overlap
  • ‘AppDomain’ memory dumps, flushing the procedure cache every 15-30 seconds
  • Page life expectancy consistently near 0
  • DBCC checks not occurring
  • Average Runnable Tasks per Scheduler: ~10
  • CPU utilization: 95-100% consistently
  • Boost SQL priority set ON
  • All default server level and database level configuration
  • Best practice tempdb and data file sizing absent
  • Reporting off of live production environment
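A few of the findings above are easy to check in your own environment. Here’s a minimal sketch using two standard DMVs: top waits since the last restart, and indexes that are written to but never read (what you do about the numbers is a judgment call for your workload):

select top 10
	wait_type,
	wait_time_ms / 1000.0 as wait_time_s,
	waiting_tasks_count
from
	sys.dm_os_wait_stats
where
	wait_type not in ( 'SLEEP_TASK', 'LAZYWRITER_SLEEP', 'BROKER_TASK_STOP',
	                   'XE_TIMER_EVENT', 'REQUEST_FOR_DEADLOCK_SEARCH' )
order by
	wait_time_ms desc;

-- Indexes in the current database that are updated but never read:
-- candidates for the "excessive, unused index" problem described above.
select
	object_name( iu.object_id ) as table_name,
	i.name as index_name,
	iu.user_updates
from
	sys.dm_db_index_usage_stats as iu
		inner join sys.indexes as i on i.object_id = iu.object_id and i.index_id = iu.index_id
where
	iu.database_id = db_id()
	and iu.user_seeks + iu.user_scans + iu.user_lookups = 0
	and iu.user_updates > 0
order by
	iu.user_updates desc;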

Pre-Remediation Implementation Performance Results

  • Page Life Expectancy per NUMA: <50
  • Average Runnable Tasks per Scheduler: 10-20
  • CPU utilization: 95-100% consistently
  • Large Index Fragmentation: >95%

Post-Remediation Implementation Performance Results

  • Page Life Expectancy per NUMA: >11,000
  • Average Runnable Tasks per Scheduler: 0
  • CPU utilization: normal
  • Large Index Fragmentation: <5%
  • SQL Server environment is now stable.

Next Steps

Review our case studies and engagements where we helped companies just like yours solve a variety of business needs.


About Oakwood

Since 1981, Oakwood has been helping companies of all sizes, across all industries, solve their business problems.  We bring world-class consultants to architect, design and deploy technology solutions to move your company forward.   Our proven approach guarantees better business outcomes.  With flexible engagement options, your project is delivered on-time and on budget.  11,000 satisfied clients can’t be wrong.

SQL Server: To Compress or Not to Compress, That Is the Question

Are you ready to have some fun with SQL Server compression? Come on, let’s party! As with most things in life, you look for a solution because you have a need, and SQL Server is no different. Compression is exactly that – a solution to a specific set of needs. Welcome to SQL Server data compression. I hope you enjoy the ride!

Compression

So what does table compression do?  It decreases your storage requirements on disk by compressing your data.  “So it saves disk space?” you may be asking.  It sure does!  But it also means that more data fits on a page both on disk and in memory.  It helps you squeeze more space out of that buffer pool.  One other cool thing about compression is that it is pretty darn flexible in terms of being able to pick and choose the objects to compress.

So here are the steps you’ll take. First, find out whether you would benefit from compression. If you would, decide what kind of compression is going to work best (aka most bang for the buck). Lastly, implement compression and determine what kind of downtime you’ll need and what performance impacts you’ll see.

I often consider using table/index compression as a solution when I hear about these needs:

  • Disk space pressure
  • Memory performance issues
  • Disk I/O performance issues

There are several different ways you can identify these symptoms.  Everyone has their own preferences for how they make their determinations, but I am going to talk about how I like to do my investigation.

Is Compression the Cure?

Disk Space Pressure

This one’s pretty straightforward.  Do you have much, or any, space available on the presented drive that your database files are stored on?  More importantly, if this is a shared storage environment, is it possible to easily add additional capacity to the LUN(s)?  The first question can be answered simply by looking at disk management (diskmgmt.msc).  The second one may involve taking your wonderful SAN administrator out to lunch to ask (and it certainly wouldn’t hurt to pay for said lunch and bat your eyes at him/her as you beg, plead and/or threaten to key their car for additional capacity).

It is generally considered a best practice to right-size your data files to account for growth and to monitor the free space within them so that you can grow them in a manual, controlled manner; as a result, free space on disk should not fluctuate very much. You will more than likely want to look at the free space within your data files to find out how close you are to either an autogrowth event or to outgrowing your available disk space, depending on how much room you left yourself on disk for growth. I put together the following query to let you look at that information (it executes at the instance level and iterates over all databases in the instance):

set nocount on;

if object_id( 'tempdb..#results' ) is not null
	drop table #results;

create table #results
(
	name nvarchar( 256 ) null,
	Unallocated_Size_in_DataFile_MB decimal( 18, 4 ) null,
	Allocated_Size_in_DataFile_MB decimal( 18, 4 ) null,
	Total_DataFile_Size_on_Disk_MB decimal( 18, 4 ) null,
	Percent_DataFile_Allocated decimal( 18, 4 ) null,
	Percent_Mixed_Extents decimal( 18, 4 ) null
);

declare @currDbId as int;
declare @currDbName as nvarchar( 256 );
declare @sqlStatement as nvarchar( max );

declare dbCursor cursor fast_forward for
	select 
		database_id,
		name
	from 
		sys.databases
	where 
		database_id > 4;

open dbCursor;

fetch next from dbCursor into @currDbID, @currDbName;
while( @@FETCH_STATUS = 0 )
	begin
		set @sqlStatement = N'
			use ' + quotename( @currDbName ) + N';
			select 
				DB.name,
				cast( ( ( cast( FSU.unallocated_extent_page_count as decimal( 18, 4 ) ) * 8 ) / 1000 ) as decimal( 18, 4 ) ) as Unallocated_Size_in_DataFile_MB,
				cast( ( ( cast( FSU.allocated_extent_page_count as decimal( 18, 4 ) ) * 8 ) / 1000 ) as decimal( 18, 4 ) ) as Allocated_Size_in_DataFile_MB,
				cast( ( ( cast( FSU.total_page_count as decimal( 18, 4 ) ) * 8 ) / 1000 ) as decimal( 18, 4 ) ) as Allocated_Size_on_Disk_MB,
				cast( ( ( ( ( cast( FSU.allocated_extent_page_count as decimal( 18, 4 ) ) * 8 ) / 1000 ) / ( ( cast( FSU.total_page_count as decimal( 18, 4 ) ) * 8 ) / 1000 ) ) * 100. ) as decimal( 18, 4 ) ) as Percent_Allocated_in_DataFile,
				cast( ( ( cast( FSU.mixed_extent_page_count as decimal( 18, 4 ) ) / ( cast( FSU.total_page_count as decimal( 18, 4 ) ) + ( 8 - ( FSU.total_page_count % 8 ) ) / 8 ) ) * 100. ) as decimal( 18, 4 ) ) as Percentage_Mixed_Extents
			from 
				sys.database_files as FI
					inner join sys.dm_db_file_space_usage as FSU on FSU.file_id = FI.file_id
					inner join sys.databases as DB on FSU.database_id = DB.database_id
			where
				FI.type_desc = ''ROWS'';
			';

		insert into #results
			exec sp_executesql @stmt = @sqlStatement

		fetch next from dbCursor into @currDbID, @currDbName;
	end;

close dbCursor;
deallocate dbCursor;

select * from #results;

Memory Performance

Memory performance is best looked at from a holistic view and not from the silo of a singular memory counter. If you are fortunate enough to have a SQL monitoring solution, you will likely already have historical information that you can reference. In the event that you do not have a monitoring solution already in place, the quickest way to get this information is to use the Performance Monitor (perfmon.msc) and the PAL (Performance Analysis of Logs) tool.

I’m going to describe the minimum set of counters that I like to use. But before I get to that, let me say a few things about the PAL tool and thresholds. Most of the rule-of-thumb thresholds I use come from the PAL Tool’s SQL Server 2012 threshold file. If you haven’t heard of this tool, it lives on the CodePlex website and is a wonderful tool to use for diagnosing system issues (not just SQL Server). You can, in fact, use the PAL Tool to export a canned performance template that you can use to gather most of this information and make life easy (https://pal.codeplex.com/). I will caution you to let your Collector Set run somewhere… anywhere that isn’t production so you can take a look at what the size and performance impacts are going to be (especially if you are running the full set of counters that the PAL Tool exports for you in the template).

The memory counters that I like to look at are:

  • SQLServer: Buffer Manager – Page life expectancy
    • Unhealthy < 300s (5 minutes)
    • This is the duration, in seconds, that a data page stays in the buffer pool.  Higher is better for this statistic.
    • Note: if your SQL Server has multiple NUMA nodes, it is a good idea to also include SQLServer: Buffer Node – Page life expectancy.
  • SQLServer:SQL Statistics – Batch Requests/sec
    • There is no unhealthy threshold for this, it is merely a metric to give you an idea of how busy your SQL Server is at that time
    • This statistic is included for contextual purposes to help identify if there are a lot of simultaneous requests vs. few, heavy queries.  Why do you care?  Because it’s infinitely less effort and less impactful to the system to tune queries and indexes than it is to compress your data.
  • Memory – Pages/sec
    • Unhealthy > 500
    • A high value doesn’t necessarily indicate an issue, but it is contextually valid for getting a view of the big picture.
  • SQLServer: Buffer Manager – Lazy writes/sec
    • Unhealthy > 20
    • This is how many times per second the lazy writer is flushing dirty pages from the buffer pool to disk.  This counter should increase as page life expectancy decreases.
  • Paging File – % Usage
    • Unhealthy > 60-70%
    • This is more indicative of external memory pressures (assuming maximum server memory has been configured for the instance)
  • Memory – Available Mbytes
    • Unhealthy < 200 MB
    • Your SQL server instances should have their maximum server memory settings configured to allow the operating system and any other applications on the server to have adequate memory available to complete their roles.  If you see this decreasing, the OS/applications on the server may require additional memory.

Some of the counters above are not explicitly relevant for identifying SQL Server memory pressure (and are also not in the PAL template), but they are useful in identifying whether memory issues are external or internal, which I like to know. Why do I care whether it’s internal or external memory pressure? Internal memory pressure is more adequately solved with compression; however, you can also address external memory pressure with compression by decreasing SQL Server’s memory footprint (but hey, we’re DBAs, and we don’t usually want to decrease our memory footprint!). The most significant indicators of internal memory pressure are high Lazy writes/sec and low or decreasing Page life expectancy. External memory pressure will be evident if you notice low Memory – Available Mbytes and high Paging File – % Usage.

Once you have an adequate sample size, you can either evaluate it by eyeballing it or by pointing the PAL Tool to the counter log path to do the analysis for you.  You’re looking for any signs of memory pressure (even external memory pressure, although there are other things that you can do to mitigate that as well).  One last comment, this is just a piece of the puzzle.  Use it in conjunction with other clues.
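If you just want a quick, point-in-time peek from inside SQL Server (rather than a PerfMon capture over time), most of these counters are also exposed through sys.dm_os_performance_counters. A minimal sketch; note that the per-second counters in this view are cumulative since startup, so you need two samples to compute an actual rate:

select
	rtrim( object_name )  as object_name,
	rtrim( counter_name ) as counter_name,
	cntr_value
from
	sys.dm_os_performance_counters
where
	( object_name like '%Buffer Manager%' and counter_name in ( 'Page life expectancy', 'Lazy writes/sec' ) )
	or ( object_name like '%SQL Statistics%' and counter_name = 'Batch Requests/sec' );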

Disk I/O Performance

Now you would think that disk I/O performance would be the only thing to look at to identify whether you want to compress your data (After all, making the access methods pull fewer pages is the most important thing, right?! Nope…). To figure out if your storage subsystem needs a break, it’s time to dust off PerfMon again and pull another two counters for investigation. Here are the counters:

  • Physical Disk: Avg. disk sec/Read
    • Less than stellar > .010
    • Unhealthy > .020
    • This counter tells you how long (in seconds) it takes to perform a read operation from disk.  Remember, this increases as the demand on the storage subsystem increases.  Make sure you obtain samples from a long enough period of time to hopefully even obtain a few times when you have depleted the reserves of the SAN’s disk caches (if you are in a SAN environment).  We want to make sure we’re getting some figures that observe all pieces of your storage access profile.
  • Physical Disk: Avg. disk sec/Write
    • Less than stellar > .010
    • Unhealthy > .020
    • This counter tells you how long (in seconds) it takes to perform a write operation to disk.  I could copy and paste the same information as what was written for Avg. disk sec/Read, but I think we get the point.

A system showing unhealthy disk metrics could certainly benefit from compression (less so if you have poorly tuned queries and indexes that do a lot of scanning, but there is still some benefit).
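The same read and write latencies can also be pulled from inside SQL Server via sys.dm_io_virtual_file_stats, which accumulates per-file stall times since the last restart. A minimal sketch (values are in milliseconds):

select
	db_name( vfs.database_id ) as database_name,
	mf.physical_name,
	1.0 * vfs.io_stall_read_ms / nullif( vfs.num_of_reads, 0 )   as avg_read_ms,
	1.0 * vfs.io_stall_write_ms / nullif( vfs.num_of_writes, 0 ) as avg_write_ms
from
	sys.dm_io_virtual_file_stats( null, null ) as vfs
		inner join sys.master_files as mf on mf.database_id = vfs.database_id and mf.file_id = vfs.file_id
order by
	avg_read_ms desc;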

Free Data Compression Guide!

So now you’ve determined whether or not you can benefit from compression. If compression is the answer, you now need to decide what kind of compression is going to work best and ultimately what kind of downtime you’ll need — or performance impacts you’ll see.
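One low-risk way to compare the candidates is SQL Server’s built-in estimator, sp_estimate_data_compression_savings. The sketch below uses a hypothetical dbo.SalesOrderDetail table as a placeholder; substitute your own object, and run it once with 'ROW' and once with 'PAGE' to compare:

exec sp_estimate_data_compression_savings
	@schema_name      = 'dbo',
	@object_name      = 'SalesOrderDetail',   -- placeholder table name
	@index_id         = null,                 -- null = all indexes
	@partition_number = null,                 -- null = all partitions
	@data_compression = 'PAGE';

-- If the estimate looks worthwhile, the change itself is a rebuild:
-- alter table dbo.SalesOrderDetail rebuild with ( data_compression = page );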

Quick Guide to Data Compression

The guide delves into some of the hoops you’ll have to jump through, the different types of data compression, when NOT to compress, balanced scorecards, page compression, row compression, and more!


Next Steps

Review our case studies and engagements where we helped companies just like yours solve a variety of business needs.


About Oakwood

Since 1981, Oakwood has been helping companies of all sizes, across all industries, solve their business problems.  We bring world-class consultants to architect, design and deploy technology solutions to move your company forward.   Our proven approach guarantees better business outcomes.  With flexible engagement options, your project is delivered on-time and on budget.  11,000 satisfied clients can’t be wrong.


Best Practices for SQL Server in Azure

Most of the typical items that DBAs hit when tuning a server for SQL Server hold true in Azure. The three major areas you will usually focus on are OS configuration, storage configuration, and SQL Server instance configuration. These are generalizations and, as your workload demands, there may be variations that are necessary, but those are best covered case by case and are out of scope for this post.

OS Configurations

When configuring an OS for SQL Server, these are the items that you should hit:

  • Provision separate service accounts for all SQL Server services (database engine, SQL server agent, Integration Services, etc.). This is a security best practice and may be considered domain level work if you need the accounts to have the ability to hit domain locations.
  • Add the database engine service account to the following user rights:
    • Perform volume maintenance tasks – this enables instant file initialization, which can yield 4-5x performance improvements on tasks such as data file growths (it has no impact on transaction logs; those *must* be zero-initialized), database creation, and database restores.
    • Lock pages in memory – this prevents the OS from paging out SQL Server’s buffer pool (a quick way to verify both of these rights from inside SQL Server appears after this list).
  • Change power settings from Balanced to High performance. The CPU improvements didn’t appear to be as drastic as in other documented cases (private clouds where the hypervisor and VMs were under the same administrative team), but based on my Geekbench testing, you can still squeak out an ~8% performance improvement with this setting (which was a bit surprising to me, actually). This may be fluky, but it was consistent over ~10 tests at varying times of the day.
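Both user rights can be verified from inside SQL Server on recent builds; the two columns referenced below were added in later servicing updates, so treat this as a hedged check rather than a guarantee:

-- 'Y' means instant file initialization is in effect for the database engine service.
select
	servicename,
	service_account,
	instant_file_initialization_enabled
from
	sys.dm_server_services;

-- LOCK_PAGES here means the Lock pages in memory right is being used.
select
	sql_memory_model_desc
from
	sys.dm_os_sys_info;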

Storage Configuration

Azure storage is extremely simple to configure and use. That simplicity does, however, come at a price: you give up the ability to make extremely fine-grained configuration decisions based on your workload. That said, for most purposes, Azure storage is perfectly fine, as long as you configure it properly. Here’s the list of best-practice configuration guidelines for Azure storage for SQL Server:

  • Use data disks with no caching and no geo-redundancy
  • There are two options for scaling IO in Azure: storage pools, or simple volumes on individual data disks combined with filegroups containing files on multiple volumes
    • The SQLCAT team has benchmarked the performance of the two. The files/filegroups option yields better scaling of IOPS, but it comes at the cost of more management overhead, and partial unavailability of any of the disks can result in the database being unavailable.
  • Set the allocation unit size to 64 KB for the volumes that will hold SQL Server data files
  • Use storage pools, not Windows’ software RAID. IOPS and throughput do not scale with Windows’ software RAID, and it can yield erratic, poorly performing results.
    • Caps to be aware of:
      • A3 – 8 data disks, 4,000 max IOPS, 400 Mbps bandwidth
      • A4 – 16 data disks, 8,000 max IOPS, 800 Mbps bandwidth
      • A6 – 8 data disks, 4,000 max IOPS, 1,000 Mbps bandwidth
      • A7 – 16 data disks, 8,000 max IOPS, 2,000 Mbps bandwidth
      • A8 – 16 data disks, 8,000 max IOPS, 40 Gbps bandwidth
      • A9 – 16 data disks, 8,000 max IOPS, 40 Gbps bandwidth
    • Overall IOPS for a subscription is 20,000
  • There are a couple common ways that you can configure your storage pools:
    • One large storage pool with all of your disks in it and then placing all of your database files on that
    • Segregate out IO needs by category. Kinda’ like:
      • 4x data disks – TempDB
      • 4x data disks – Transaction Logs
      • 8x data disks – Data files
  • Number of columns on virtual disk should be equal to the number of physical disks in the pool (6 disks = 6 columns – this allows IO requests to be evenly distributed across all data disks in the pool)
  • Interleave value on virtual disks should be 256kb for all workloads
  • Linear scaling starts breaking down around 4 disks and writes scale better than reads

You may use the following PowerShell commands to verify the configuration of your virtual disks:

Get-VirtualDisk | ft FriendlyName, ResiliencySettingName, @{Expression={$_.Interleave / 1KB}; Label="Interleave(KB)"}, NumberOfColumns, NumberOfDataCopies, @{Expression={$_.Size / 1GB}; Label="Size(GB)"}, @{Expression={$_.FootprintOnPool / 1GB}; Label="PoolFootprint(GB)"} -AutoSize

On an interesting note: Azure storage is a shared environment, which means there are going to be some behind-the-scenes things happening that may not necessarily be what you want. If your systems are under heavy use, you will likely never notice this, but for new systems and for performance-testing benchmark machines, you are going to want to warm up the disks (for roughly 20 minutes). Last, but not least: ALWAYS test your throughput. I recommend SQLIO when you are trying to generally test the performance of the storage subsystem. It’s out of scope for what I want to cover here, but here is where you can find that application and really good documentation on it.

SQL Server Instance Configuration

There are instance level configuration options that generally benefit SQL Server. These are irrespective of where your SQL Server instance is located, but this is what you’ll want to hit:

  • Install only minimum feature set for instance
  • Enable option Advanced > optimize for ad hoc workloads
  • Change option Advanced > cost threshold for parallelism to 50. SQL Server’s default setting for this is just flat out too low. If you are tuning a pre-existing instance, you’ll notice a lot of CXPACKET waits in your wait_statistics
  • Change option Advanced > max degree of parallelism to value equal to the number of cores/vCPUs (up to 8). 0 defaults to all available logical processors up to 64.
  • Enable option Database Settings > Compress backup. There is really no reason to not compress your backups.
  • Split your TempDB up across multiple data files. Use the same number of data files as you have processors, up to 8. These all need to be the same size, as SQL Server uses proportional fill (a weighted round robin) when leveraging multiple data files. The goal is to reduce contention on the GAM, SGAM, and PFS pages for TempDB.
  • Change option Memory > minimum server memory (in MB) to value ~2048-4096 lower than maximum server memory value
  • Change option Memory > maximum server memory (in MB) as follows:
    • 7 GB total RAM: 5,120 MB max server memory
    • 14 GB total RAM: 11,264 MB
    • 28 GB total RAM: 24,576 MB
    • 56 GB total RAM: 52,224 MB
    • 112 GB total RAM: 108,544 MB
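Pulling several of these together, here’s a minimal sketch of what the instance-level changes look like in T-SQL. The memory and MAXDOP values assume a hypothetical 8-vCPU, 56 GB VM; substitute numbers from the table above for your own size:

exec sp_configure 'show advanced options', 1;
reconfigure;

exec sp_configure 'optimize for ad hoc workloads', 1;     -- cache plan stubs for one-off queries
exec sp_configure 'cost threshold for parallelism', 50;
exec sp_configure 'max degree of parallelism', 8;         -- match core/vCPU count, up to 8
exec sp_configure 'backup compression default', 1;
exec sp_configure 'max server memory (MB)', 52224;        -- from the table above (56 GB VM)
exec sp_configure 'min server memory (MB)', 49152;        -- ~3 GB below max
reconfigure;

-- TempDB: add equally sized data files, one per core up to 8 (path and size are placeholders):
-- alter database tempdb add file ( name = 'tempdev2', filename = 'T:\TempDB\tempdev2.ndf', size = 8192MB, filegrowth = 0 );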

Conclusion

That’s all folks! That should get you going on configuring an Azure VM to be home to your lovely little SQL Server instance(s). Happy clouding!


Next Steps

Review our case studies and engagements where we helped companies just like yours solve a variety of business needs.


About Oakwood

Since 1981, Oakwood has been helping companies of all sizes, across all industries, solve their business problems.  We bring world-class consultants to architect, design and deploy technology solutions to move your company forward.   Our proven approach guarantees better business outcomes.  With flexible engagement options, your project is delivered on-time and on budget.  11,000 satisfied clients can’t be wrong.


SQL Memory Internal Clock Hands

Let's talk about SQL Internal Memory.  OK, not your favorite subject but I have something interesting to share.  I had an issue recently that required me to be digging quite heavily into the memory internals, specifically to leverage clock hands to identify memory pressure. This is a topic that has very little around its specifics but can be dead useful at times. To explain a little about why it’s dead useful, I’ll explain why the scenario that caused me to start digging.

Where Is The Value?

After some planned upgrades to a SQL Server AlwaysOn environment – including some changes to the max/min memory settings and other instance level configurations – I noticed extremely poor performance.

  • I looked at the sqlservr.exe process, whose memory footprint was steady at around 10GB, which was where it should be.
  • Since all other resource utilization statistics appeared to be in normal ranges, I decided to investigate if there were long-running queries. Nope!
  • It was then I thought to myself, “Self! I wonder where the buffer pool is allocated… Maybe there is heavy data access in one database/table causing everything to be slow because it’s being flushed from the pool.” Here’s where it got interesting.

The buffer pool was a shocking 500 MB in size and stature, but the process had 10GB assigned to it. You can check the data page cache's memory allocation as follows:

select
	case
		when database_id = 32767 then 'ResourceDB'
		else db_name( database_id )
	end as Database_Name,
	count( * ) * 8 / 1024 as MBUsed
from
	sys.dm_os_buffer_descriptors
group by
	database_id;

Either there were some internal caches taking up all that space, or there was external pressure in the environment. As far as I know, there is no way to inspect the internal cache control mechanisms for the buffer pool, so I generally rely on investigating the memory pressure for all of the other caches, controlled by the common caching framework. One of the ways I like to get a feel for memory pressure is by looking at these cache stores’ clock hands to see how actively they’re moving (the velocity of the delta), which translates, in a general sense, to the overall memory pressure on the cache and data page cache.

An Interesting Preface and Side Note

From SQLOS’s point of view, consider what the terms pool and cache mean. They are not interchangeable, as I thought they were. In SQLOS-land, a cache is for heterogeneous types of data (different sizes) that usually have a cost and a state associated with them. A cache implements its own control over an entry’s lifetime, visibility (locking/concurrency controls) and page replacement policies (implemented in the form of the LRU – least recently used – algorithm). A pool, on the other hand, is for homogeneous types of data (same sizes) that usually have neither cost nor state. They also have limited control over an entry’s lifetime and visibility. I will henceforth and forevermore refer to the buffer pool as the data page cache! The side note to this conversation is that the common caching framework is *NOT* implemented by the data page cache.

Before SQL Server 2005

The buffer pool and the procedure cache had different costing policies, but the procedure cache relied on the buffer pool’s eviction mechanism. Because the caches were small, there was no real need for a common caching framework to handle memory management. Starting with SQL Server 2005, that changed (more features required caches), which meant either more work making a singular mechanism work for everything, or the more robust – and really the only feasible – option of creating a framework to provide common costing mechanisms and more granular methods for controlling the caches.

The common caching framework that Microsoft settled on is a variation of the clock algorithm – as specified earlier. A clock algorithm is an implementation that provides a way to give data age-based weighting for entries in the cache, for the purposes of optimal page replacement. A good description of the details can be found here:

http://en.wikipedia.org/wiki/Page_replacement_algorithm#Clock.

Clock algorithms are essentially used for determining what to age out of memory when there is memory pressure in the environment.

Microsoft’s implementation for SQL Server leverages two clock hands per cache store: an internal hand and an external hand. The internal clock hand is controlled by SQLOS and is essentially a way to put a cap on individual caches so that you don’t get a scenario where a single cache pushes the entire process into cataclysmic memory pressure. The external clock hands are moved by the resource monitor when the process, and the system as a whole, is experiencing memory pressure. The clock hands are designed not to interfere with the usage of the cache, but as a clock hand moves, it divides the cost of not-in-use entries by 2. If an entry is not in use and its cost is 0, the hand removes the entry’s visibility and then tries to remove it from the cache. The removal can fail if another clock hand is operating on that entry simultaneously, but fear not! Removal will succeed once both clock hands have moved on to other entries. Upon access, an entry’s cost is reset. If a specific cache is growing too large internally, its internal clock hand starts moving and aging out entries – these events are defined by local policies. If the system is under memory duress, the external clock hands start moving and aging out entries – these events are defined by global policies.

In the current implementation, there is a 1:many relationship between clocks and clock hands, a 2:1 correspondence between clocks and caches (an internal and an external clock, with a clock hand per cache), and a 1:1 correspondence between memory sweeps and hands. The implication, then, is that each clock can control multiple memory sweeps.

Most of the significant changes to the SQLOS memory manager in SQL Server 2012 appear to revolve around the data page cache and the addition of a separate cache optimized for columnstore indexes. The data page cache changes aren’t necessarily pertinent to this conversation, but of note: the cache for columnstore indexes is implemented separately from the buffer pool and leverages the common caching framework’s clock-hand implementation for lifetime control of entries. This carries the caveat that columnstore objects can pressure the buffer pool for memory allocation within the process virtual address space (VAS).

For my production issue, I was leveraging the results of the following query to take a look at memory pressure:

select
	type,
	clock_hand,
	sum( rounds_count ) as total_sweeps,
	sum( removed_all_rounds_count ) as total_removed
from
	sys.dm_os_memory_cache_clock_hands
group by
	type,
	clock_hand
order by
	total_removed desc;


This presented me with some interesting findings. The delta on the objectstore_lock_manager cache’s external clock hand was absolutely insane (remember, the velocity of the delta is what we’re most interested in). The delta of the total entries removed was exponentially higher than any other cache’s entry removal over a 30 second period:

type                      clock_hand       total_removed
OBJECTSTORE_LOCK_MANAGER  HAND_EXTERNAL    2264403526
OBJECTSTORE_LOCK_MANAGER  HAND_EXTERNAL    2264558403


What made this even more interesting was that this was the only external clock hand exhibiting this behavior. Even though every cache maintains its own external clock hand, these should, more or less, move synchronously in order to simulate a global clock hand. At this point, I made a mild presumptive leap that the data page cache was likely experiencing the same insane, incorrect external pressure and was perpetually flushing entries from the cache. In this case, an instance restart during a maintenance window became the method of resolution and provided a fix (my working theory is a bug condition pertaining to AlwaysOn deployments and instance-level memory settings).
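If you want to watch those deltas yourself rather than eyeballing two runs of the query above, here’s a minimal sketch that snapshots the clock hands, waits 30 seconds, and reports how many entries each hand removed in the interval:

if object_id( 'tempdb..#clock_baseline' ) is not null
	drop table #clock_baseline;

select
	type,
	clock_hand,
	sum( removed_all_rounds_count ) as total_removed
into #clock_baseline
from
	sys.dm_os_memory_cache_clock_hands
group by
	type,
	clock_hand;

waitfor delay '00:00:30';

select
	c.type,
	c.clock_hand,
	sum( c.removed_all_rounds_count ) - b.total_removed as removed_delta
from
	sys.dm_os_memory_cache_clock_hands as c
		inner join #clock_baseline as b on b.type = c.type and b.clock_hand = c.clock_hand
group by
	c.type,
	c.clock_hand,
	b.total_removed
order by
	removed_delta desc;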

To read more about the nitty-gritty on clock hands and SQLOS caching (albeit from 2005, but there really haven’t been huge changes to it that I’m aware of), I recommend the following:

http://blogs.msdn.com/b/slavao/archive/2005/06/03/424856.aspx
http://blogs.msdn.com/b/slavao/archive/2005/03/18/398651.aspx


Next Steps

Review our case studies and engagements where we helped companies just like yours solve a variety of business needs.


About Oakwood

Since 1981, Oakwood has been helping companies of all sizes, across all industries, solve their business problems.  We bring world-class consultants to architect, design and deploy technology solutions to move your company forward.   Our proven approach guarantees better business outcomes.  With flexible engagement options, your project is delivered on-time and on budget.  11,000 satisfied clients can’t be wrong.

Gartner Calls Microsoft a Leader in Database Management

Check out this brief Microsoft video and see why SQL Server outshines others in the database management space.

How SQL Server Outshines Others

  • Affordability, with everything built in
  • Security – the most secure database six years in a row
  • Performance, at a fraction of the cost

Database Management Conclusion

Don’t just take our word for it. Read Gartner’s Magic Quadrant for Operational Database Management Systems and see how Microsoft ranks against Oracle.

Gartner Report

Read the report.


Next Steps

Review our case studies and engagements where we helped companies just like yours solve a variety of business needs.


About Oakwood

Since 1981, Oakwood has been helping companies of all sizes, across all industries, solve their business problems.  We bring world-class consultants to architect, design and deploy technology solutions to move your company forward.   Our proven approach guarantees better business outcomes.  With flexible engagement options, your project is delivered on-time and on budget.  11,000 satisfied clients can’t be wrong.