SQL Server – Microsecond Storage Series – Buffer Pool Extension

Solid State Flash Storage have brought about a fundamental shift in disk response times, crossing the threshold from millisecond to microsecond latency. This has opened up a host of new possibilities covering everything from application and system design to brand new features within SQL Server. In this first post of a series  I will take a look at a new feature in SQL Server 2014 called Buffer Pool Extension.

StorageSpeed

Figure 1

Processing database transactions requires reading and writing data. SQL Server processes all changes on data structures in an area of memory called the SQL buffer pool.  For data that needs to be read into or written from the buffer pool, the further data is from the CPU core the more time and CPU cycles it will take to retrieve it.  A response time of 20 milliseconds is considered acceptable for spinning disks in most applications.  Now with the advent of flash storage taking off in the enterprise arena we are able to achieve sub-millisecond latency as the norm for flash disk response times.

Serving different business requirements flash storage comes in two basic implementations 1) PCIe flash cards such as EMC XtremSF and 2) Enterprise Storage Arrays such as EMC’s Xtremio all flash arrays. The PCIe flash card is closer to the CPU core but does not offer the enterprise data management capabilities of a SAN array so is best suited for situations where data persistence is not a business goal such as tempdb or as we are going to explore Buffer Pool Extensions.

Overview of Buffer Pool Extensions

SQL Server 2014 introduced a feature that can leverage a PCIe flash card to extend the buffer pool that I introduced earlier. This technology termed Buffer Pool Extension or BPE creates a multi-level caching hierarchy which Microsoft calls L1 and L2 cache, optimising data retrieval by using internal algorithms to place the most relevant clean pages on faster storage. You can find more details about how it works here. Once the BPE feature has been setup it is primarily self-managing, however it does change the way SQL Server processes data pages with corresponding changes to performance tools such as Extended Events, DMV’s & Performance Monitor. For a DBA it is important to understand the impact on performance baselines and troubleshooting. In this first post I will cover an overview of how these changes affect both the Buffer Manager and the Lazywriter process. In the next post of the series I will go into more details on performance management.

Please note that for editorial conciseness I have really simplified the diagrams and charts below. The intention is to show BPE in action. SQL Server has many optimisations and additional components to optimise data requests. If you want additional information this presentation is a great place to start.

Reading Pages

  1. 1. Using a simple example in figure 2 below once the relational engine has determined what access method is needed to service a request (in our example a single 8k page read) it is the responsibility of the buffer manager to read and retrieve the relevant data page.
  2. 2. Depending on the location of the page at the time of the request this can either be a logical (memory 8k) or
  3. 3. A physical (disk 64K since the buffer manager will usually read an entire extent) request.
  4. 4. When BPE is enabled and the data page is resident in the BPE file on the PCIe flash card the buffer manger differentiates the request as an extension page read.
figure2

Figure 2

When SQL Server requires data pages from either a PCIe flash card or a SAN it posts an asynchronous request to the operating system. In the operating system this requests is observed as a Physical Disk reads.

figure3

Figure 3

In the table below I have associated the various read events with their relevant performance monitor counters

Action Relevant SQL Server Buffer Manager counters Relevant OS Logical (or Physical) Disk counters
  1. 2. If the page exists in the Buffer Pool this is called alogical read.
SQL Server Buffer Manager : Page lookups/secSQL Server Buffer Manager : Readahead pages/secSQL Server Buffer Manager: Buffer Cache Hit Ratio
  1. 3. If the page has to be retrieved from disk this is called aphysical read.
SQL Server Buffer Manager: page reads /sec Logical Disk: Disk Reads /Sec.
  1. 4. When enabled if the page has to be retrieved from BPE this is called an extension page read
SQL Server Buffer Manager: Extension page read/secSQL Server Buffer Manager: Buffer Cache Hit Ratio Logical Disk: Disk Reads /Sec.

Figure 4

Page Writes

SQL Server has two main reasons to write pages from the buffer pool the first is to relieve memory pressure and the second is to maintain the recovery interval of the database.

Clean and Dirty pages

A dirty page is a page that has been changed since it was read in from disk or just created in memory. It will eventually get persisted to disk.

A clean page is a page that has not been modified since it was read from disk therefore it does not need to be written back out to disk

Memory pressure occurs when SQL Server does not have enough unused space in the buffer pool to servicepages needed for a request. For example if you need to read in pages from disk to service a read request. When this situation occurs SQL Server will get rid of pages. With a standard buffer pool configuration dirty pages get written out to disk while clean pages get evicted out from memory.  When BPE is enabled the clean pages may now get spilled out to faster storage. In figure 7 I have plotted page writes for a memory pressure event with additional steps for BPE highlighted in red.

figure4

Figure 5

In the table below I have associated the various buffer pool write events and their associated performance monitor counters

Action Buffer Manger counters Relevant OS Logical (or Physical) Disk counters
Write a Dirty Page to disk
  • Buffer Manager: page writes / sec
  • Lazy writes /Sec
Logical Disk: Disk Writes /Sec
BPE write
  • Buffer Manager: Extension page writes /Sec
Logical Disk: Disk  Writes /Sec.

Figure 6

I took an xperf  trace during some SQL Server activity on an instance where BPE is enabled. In the xperf table below we can observe the Lazywriter process writing to the SQL data file and the buffer pool extension file.

figure5

Pros and Cons of BPE

Optimising the placement of database pages in memory has been refined by Microsoft and others.  For example SQL Server is really smart at preventing operations such as CHECKDB from flushing the buffer pool. With the advent of super-fast flash based storage Microsoft SQL Server has extended the buffer pool onto SSD disk. From a performance management perspective DBA’s need to take into account that the extended buffer pool is now accounted for in performance counters such as SQL Server Buffer Manager: Buffer Cache Hit Ratio which counts both main memory and BPE when calculating the cache hit ratio.

So is BPE the right option for your environment?

  • One of the intended use cases for BPE is to improve performance for OLTP workloads by offloading small random I/O’s from platter disks to flash based PCIe cards. There are other ways to improve performance for these types of scenarios, if possible adding additional memory to the server may produce better performance results. Indeed according to Microsoft on systems with more than 64 GB of RAM BPE may not yield expected performance gains.
  • For Data Warehouse type workloads large block sequential IO can be optimally serviced from traditional spinning disks.

Similar to considering any new technology you should review your current environment and choose the best approach that balances all your requirements.  Importantly whichever solution you decide to use always test it out prior implementing in production.

Conclusion

In this post I introduced flash based storage and how it is utilised by SQL server for the Buffer Pool Extension feature. I explained how data access is faster the closer it is to the CPU core and how SQL Server uses this to optimise data page access & placement. BPE is designed to further enhance this by taking advantage of super-fast flash storage close to the CPU. I introduced new performance monitor counters and showed how BPE operations get exposed through both the buffer manager and disk OS counters. Finally I discussed the advantages and disadvantages of BPE and highlighted alternative approaches to improve I/O performance.  In the next post of the series I will show you how to correlate the various performance monitor counters to help determine if BPE is right for your workload.

If you are currently using or considering BPE please drop me a line I would really like to hear your experiences with it.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s