DAS

Which is Faster: SAN or Directly-Attached Storage?

imran's picture

Should I place my database files on SAN or directly-attached storage? This is a frequently asked question. It comes up repeatedly in public newsgroups, email discussion lists, and private meetings with customers who are concerned with database performance.

What SAN?
A Storage Area Network (SAN) may mean two different things. To the storage professionals, it could narrowly mean the switched fabric between the host servers and the sophisticated disk arrays that actually store data. To most other folks, however, the distinction between the switched fabric and the disk arrays means very little. Rather, they view the fabric and the disk arrays -- practically everything behind the drive and beyond the host server -- as comprising the SAN. This article assumes this most common view.

To simplify discussions in the rest of this article, a drive from a directly-attached storage will be referred to as a DAS drive, and a drive presented from a SAN as a SAN drive.

What is meant by "faster"?
To address the question of which is faster, you must be clear about what you mean by "faster". In general, you can use three key metrics to quantify the performance of a disk I/O path:

  • I/Os per second (IOps)
  • Megabytes per second (MBps)
  • Latency

IOps measures how many I/O requests can be satisfied by the disk I/O path in a second. For a given disk I/O path, this metric is generally in reverse proportion to the size of the I/O requests. That is, the larger the I/O requests, the lower the IOps. This is intuitive. After all, it takes more time to process a 256KB I/O request than it does an 8KB request.

MBps measures how much data can be pumped through the disk I/O path. If you view the I/O path as a pipeline, MBps measures how big the pipeline is and how much pressure it can sustain. So, the bigger the pipeline is and the more pressure it can handle, the more megabytes of data it can push through. For a given I/O path, MBps is in direct proportion to the size of the I/O requests. That is, the larger the I/O requests, the higher the MBps. Larger I/Os give you better throughput because they incur less disk seek time penalty than smaller I/Os.

Note that for I/O requests of a fixed block size, MBps is simply IOps times the block size in megabytes.

I/O latency--also known as I/O response time--measures how fast an I/O request can be processed by the disk I/O subsystem. For a given I/O path, it is in reserve proportion to the size of the I/O request. As mentioned previously, a larger I/O request takes longer to complete.

Generally speaking, you should pay more attention to IOps and I/O latency when it comes to small-sized I/O requests, and be more concerned with MBps when dealing with large-sized I/O requests. But whether you report IOps or MBps, you should always keep I/O latency in the picture. Note that as you push a disk I/O subsystem to sustain higher IOps or MBps, the average I/O latency will continue to go up. After a certain point, the latency will increase exponentially as you put even more I/O requests on the system. If you obtain a good IOps or MBps number at the expense of a very high I/O latency, that IOps or MBps number may not be very useful in practice.

What does all this have to do with the question of which is faster, SAN or directly-attached storage? Well, these three metrics define precisely what is meant by "faster". So when you say one is faster than the other, you have to be explicit about what measure you are referring to.

Let us examine the question along these three performance measures, and let's start with the I/O latency measure first.

Disk I/O latency
SANs almost always come out a loser when it competes with DAS on I/O latency under light load. If you don't believe me, you can measure it yourself, and you don't need any fancy tool to obtain convincing data points.

To measure the latency of an I/O path under the best possible condition, you should issue single-threaded I/O requests, i.e. synchronous I/Os. If you are using an I/O benchmark tool such as IOMeter or sqlio.exe, for instance, you can configure a single worker thread to issue small sequential writes with a queue depth of one.

Alternatively, you can use a simple SQL Server script to measure the latency of an I/O path. The following T-SQL script is an example that can help you ascertain the best possible disk I/O latency (or response time) when committing a SQL Server database transaction. The script assumes that D is a directly-attached internal drive and E is a drive presented to the same server from a SAN.

-- create a small database on the local internal drive D
CREATE DATABASE io_test_D ON PRIMARY
(NAME=N'io_test_d_data', FILENAME=N'd:\io_test_d.mdf', SIZE=255)
LOG ON
(NAME=N'io_test_d_log', FILENAME=N'd:\io_test_d.ldf', SIZE=2000);
go
ALTER DATABASE io_test_D SET RECOVERY SIMPLE;

-- create a small database on the SAN drive E
CREATE DATABASE io_test_E ON PRIMARY
(NAME=N'io_test_e_data', FILENAME=N'e:\io_test_e.mdf', SIZE=255)
LOG ON
(NAME=N'io_test_e_log', FILENAME=N'e:\io_test_e.ldf', SIZE=2000);
go
ALTER DATABASE io_test_E SET RECOVERY SIMPLE;
go

-- Create two identical tables, one in each database
use io_test_D
go
CREATE TABLE test (i CHAR(200) NOT NULL);
INSERT test VALUES('ABC');
go
use io_test_E
go
CREATE TABLE test (i CHAR(200) NOT NULL);
INSERT test VALUES('ABC');
go

-- Run the following script on database io_test_D and io_test_E.
-- Run the script multiple times in each database, and alternate
-- the test runs to get consistent measures
SET NOCOUNT ON;
DECLARE @i int,
@start datetime;
SET @i = 1;
SET @start = getdate();

WHILE @i < 10
BEGIN
UPDATE test
SET i = cast(@i as CHAR(200));
SET @i = @i + 1;
END
SELECT 'Duration' = datediff(ms, @start, getdate()),
'Latency' = datediff(ms, @start, getdate())/10000.0;

Syndicate content