
The ‘True’ Cost of Storage

By Erik Frieberg, VP of Marketing and Alliances at 10gen.


Date: 20 Feb 2012

NoSQL databases are changing how we think about storage management, and enabling the industry to reach new levels of cost efficiency. When analysing the cost of our storage infrastructure, we need to consider the media, network, and operating costs. But the choice of storage infrastructure is guided in no small part by the demands on the application tier above it.

Application workload can be characterised by the Service Level Agreement (SLA) expected of the storage. Changing the SLA that applications expect from a storage infrastructure can result in dramatic changes to the cost of storage.

NoSQL databases place very different requirements on our storage infrastructure than legacy architectures do. This allows the IT manager to operate a much more scalable and cost-effective storage infrastructure.

In this article, we will examine the different storage requirements posed by NoSQL and the dramatic implications this has on storage management costs.

Drivers of storage cost

There are three major components of storage cost:

- The cost of disk drives
- The cost of the network connecting servers to storage
- The cost of operating the storage
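As a rough illustration, the three components above can be added up in a simple model. This is only a sketch: the class name `StorageCost`, its field names, and the figures passed in are assumptions made for the example, not measurements or vendor prices.

```python
from dataclasses import dataclass


@dataclass
class StorageCost:
    """The three cost components listed above, in one currency over one period."""

    drives: float      # cost of disk drives
    network: float     # cost of the network connecting servers to storage
    operations: float  # cost of operating the storage

    def total(self) -> float:
        """Total storage cost as the sum of the three components."""
        return self.drives + self.network + self.operations

    def share(self) -> dict:
        """Fraction of the total contributed by each component."""
        total = self.total()
        if total == 0:
            raise ValueError("total cost is zero, so shares are undefined")
        return {
            "drives": self.drives / total,
            "network": self.network / total,
            "operations": self.operations / total,
        }


# Placeholder figures, chosen only to exercise the code.
example = StorageCost(drives=50.0, network=30.0, operations=20.0)
print(example.total())
print(example.share())
```

Keeping the components separate makes it easy to see which one a change in architecture actually affects.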

Whether buying direct attached storage, network attached storage, or storage area networks, we’re packing the same disk drives into those chassis. Therefore, we must choose between spinning disks, flash, and RAM, which have similar cost differentials regardless of chassis.

The biggest difference between storage architectures lies in the network between servers and storage. At the low end, we have direct attached disks that use only the built-in bus on the servers themselves. This is cheap and fast, but allows only a limited number of drives to be attached to a single computer. By contrast, Fibre Channel and InfiniBand allow many more drives to be attached to a single server, and support very high transfer rates.

The operational costs of storage reflect the amount of human intervention required to provision, monitor and manage disk infrastructure. As we add more components including drives, networks and interconnects, we obviously increase management overhead. Additionally, features like de-duplication, replication, and backup-restore add management complexity to our storage.

Traditional relational database management systems require expensive storage infrastructure

Traditionally, we have used relational database management systems (RDBMS) to store structured data. These systems are typically built using a vertical scaling strategy. The requirements the RDBMS places on storage are as follows:

- Total volume size: Entire database stored in a single volume. As the database grows, the volume must grow as well.
- Storage network: Since data must reside in a single volume, a dedicated storage network must exist between servers and disks to support growth.
- Operational complexity: Database stores a single copy of data. Requires storage to duplicate data using RAID, snapshots, or other technology.

Relational databases typically use a single centralised storage volume to store all data. As the database grows in size, you must scale this volume appropriately. Once the number of disks exceeds the capacity of a system chassis, you’ll need to invest in a SAN or NAS to aggregate additional drives.

As the size of your database increases, the size of this network will need to increase as well to accommodate the additional disk drives and input/output operations per second (IOPS) demanded by the database.
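The sizing logic described here can be sketched in a few lines. The rule used, taking the larger of the drive count needed for capacity and the drive count needed for IOPS, is a common rule of thumb rather than anything prescribed by this article, and the function names, parameters, and input figures are hypothetical.

```python
import math


def drives_needed(data_gb, required_iops, drive_capacity_gb, drive_iops):
    """Drives required to satisfy both the capacity and the IOPS demand."""
    if drive_capacity_gb <= 0 or drive_iops <= 0:
        raise ValueError("per-drive capacity and IOPS must be positive")
    for_capacity = math.ceil(data_gb / drive_capacity_gb)
    for_iops = math.ceil(required_iops / drive_iops)
    # Whichever demand needs more drives determines the final count.
    return max(for_capacity, for_iops)


def needs_external_storage(drive_count, chassis_slots):
    """True once the drives no longer fit in a single server chassis."""
    return drive_count > chassis_slots


# Placeholder inputs, chosen only to exercise the code.
count = drives_needed(
    data_gb=4000, required_iops=3000, drive_capacity_gb=500, drive_iops=150
)
print(count, needs_external_storage(count, chassis_slots=12))
```

In this made-up case the IOPS demand, not the data volume, decides the drive count, and that count is what pushes a single-volume database beyond one chassis.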

The RDBMS also expects the underlying storage to be fairly reliable. While the RDBMS may support application-level replication, failing over the database is often a complex manual procedure. As a result, the RDBMS relies on the storage to be reliable enough that a database failover is an infrequent activity. This means that we must deploy technologies like RAID to ensure high availability of data blocks.

NoSQL databases dramatically simplify storage management

NoSQL changes the rules for database storage management by placing a whole new set of requirements on the underlying storage:

- Total volume size: Database is shared nothing, with many small volumes used by distributed nodes. No need to grow volumes.
- Storage network: Drives attached to each computer directly. Utilizes the existing bus, so there’s no need for a dedicated storage network. All replication happens at the database tier using IP.
- Operational complexity: Database has replication and failover built in, so they need not be provided by the storage layer.

NoSQL databases are changing how we think about storage. They have been designed with a shared-nothing architecture in mind. Rather than having a single enormous volume, we have many smaller volumes managed by individual servers in a distributed database system.

While our total data size remains the same, we can get away with a dramatically cheaper network between servers and storage. Instead of a single massive volume shared over a SAN, we can use the local direct attached storage in a number of commodity servers. Each server we add to our distributed database adds capacity and IOPS without any need for Fibre Channel or InfiniBand. This leads to a dramatic cost reduction on the network and interconnects for database storage. In fact, for most NoSQL databases, no SAN or special network equipment is deployed at all, bringing the cost of a dedicated storage network to zero.

NoSQL databases are designed to handle failures at the database level and not assume reliability at the storage level. Modern NoSQL systems store data on multiple nodes throughout a distributed system. Data is replicated (mirrored) between multiple servers to ensure high availability, and it is partitioned (striped) across multiple servers to enable high performance.
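To make the combination of partitioning and replication concrete, here is a minimal sketch of how a key might be mapped to a partition and that partition to several servers. It does not reproduce the placement algorithm of any particular NoSQL product; the function names, the hash-modulo scheme, and the server names are assumptions made for illustration.

```python
import hashlib


def partition_for(key, partition_count):
    """Map a key to a partition (the 'striping' step) with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def replica_servers(partition, servers, replication_factor):
    """Pick distinct servers to hold copies of a partition (the 'mirroring' step)."""
    if not 1 <= replication_factor <= len(servers):
        raise ValueError("replication factor must be between 1 and the server count")
    start = partition % len(servers)
    # Walk the server list from the starting position, wrapping around at the end.
    return [servers[(start + i) % len(servers)] for i in range(replication_factor)]


# Hypothetical cluster of four commodity servers with local disks.
servers = ["node-a", "node-b", "node-c", "node-d"]
partition = partition_for("customer:1001", partition_count=16)
print(partition, replica_servers(partition, servers, replication_factor=3))
```

Because each partition lives on several servers, losing one server's disks leaves other copies available, which is why the storage layer underneath does not have to provide that redundancy itself.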

The availability and performance management functions typically implemented inside monolithic storage area networks have been moved up to the software layer. This dramatically simplifies the management and reliability requirements of the underlying storage. In many cases there is no need to deploy RAID at all, as the database itself stripes and mirrors data, even across multiple data centres.

When we compare NoSQL storage to a traditional RDBMS, we find that we need about the same number of disks to store data (based on the volume of data we’re storing and the overall IOPS needs of the system), but the network and management of that storage are dramatically simplified.

The hidden opportunity cost: Cloud Computing

As businesses look to move their operations to the cloud, your storage infrastructure is likely to be one of your major roadblocks to migration. If your application runs on an RDBMS and requires a centralized storage architecture supported by SAN or NAS, you are unlikely to find a cloud vendor capable of matching the SLAs you provide today.

To easily adopt the cloud, you want an application layer that can tolerate the varying performance profiles delivered by cloud vendors today, and work well on commodity hardware. You’re most likely to need a system that can run on virtualized servers with locally attached disks. NoSQL is often your only choice for scaling in the cloud.

Your choice of database has massive implications for the cost of your storage. Traditional RDBMS implementations drive high costs in the form of networking gear and storage management. NoSQL databases can be deployed on commodity hardware in simpler configurations, while managing common functions like high availability and performance at the application tier rather than the storage tier.

As you map your storage infrastructure plan, you should consider which portions of your database data might be better suited to the cost efficiencies and flexibility of a NoSQL database than the RDBMS you’re using today.

