There’s no two ways about it – Cloud computing is here to stay and Cloud storage is offered by all of the major cloud providers, which is great, right?
Genuinely, it is – for the most part.
But when we’re looking at high end usage, how do you confidently transfer multiple terabytes of data into and out of the Cloud, be it for backup purposes or for actual processing and manipulation?
IBM, Aspera and FASP
At the end of last year, IBM acquired Aspera, which claims that its FASP solutions have overcome a major obstacle to transferring high volumes of data over long distances.
You see, generally speaking there are two established ways of doing this – via TCP or UDP.
TCP (Transmission Control Protocol) delivers data reliably and performs well under ideal conditions. However, when packet loss and higher latency occur (both common on long-distance WAN links), the combined mechanism for network congestion avoidance and in-order delivery of packets results in low utilisation of available bandwidth and, ultimately, slow file transfer speeds. TCP takes packet loss to mean ‘network congestion’, so it throttles transmission back drastically and retransmits the dropped packets.
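To put rough numbers on this, a widely cited approximation (the Mathis model) bounds steady-state TCP throughput at about (MSS / RTT) x (1.22 / sqrt(p)), where p is the packet loss rate. The sketch below applies it to a few illustrative paths. The segment size, round-trip times and loss rate are assumptions chosen for illustration, not measurements.

```python
import math

def tcp_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Approximate upper bound on steady-state TCP throughput (Mathis model)."""
    return (mss_bytes * 8 / rtt_s) * (1.22 / math.sqrt(loss_rate))

# Illustrative (assumed) paths: same 0.1% loss rate, increasing round-trip time.
scenarios = {
    "LAN (1 ms RTT)": 0.001,
    "cross-country (50 ms RTT)": 0.050,
    "intercontinental (200 ms RTT)": 0.200,
}

for name, rtt in scenarios.items():
    bps = tcp_throughput_bps(mss_bytes=1460, rtt_s=rtt, loss_rate=0.001)
    print(f"{name}: ~{bps / 1e6:.1f} Mbit/s")
```

Under these assumptions, a single connection on the 200 ms path is capped at a little over 2 Mbit/s, however much bandwidth the link actually has.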
The alternative is a UDP-based solution. These try to speed up file transfer, as UDP (User Datagram Protocol) dispenses with the reliability and congestion control mechanisms that slow down TCP file transfer. But aside from driving networks (sometimes too) hard, these solutions tend to retransmit up to 10x the required data. This means the actual transfer speed is still not great, and bandwidth is wasted on unnecessary retransmission.
In-order delivery is important to many applications, but not to file transfer. Aspera’s FASP solutions use UDP in the transport layer but separate the reliability and rate control mechanisms, gently backing off transmission as queuing in the network increases while maintaining a high level of bandwidth utilisation. What’s more, FASP retransmits only the packets that were actually dropped, rather than stopping and starting to ensure in-order delivery.
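The idea of retransmitting only what was lost can be illustrated with a toy model. This is not Aspera’s protocol: the `transfer` function, the 5% loss rate and the block sizes are all hypothetical, and rate control is not modelled at all. It simply shows that when blocks are placed by index, arrival order stops mattering and the overhead stays close to the loss rate.

```python
import random

def transfer(blocks, loss_rate=0.05, seed=1):
    """Toy model: send numbered blocks over a lossy channel and
    retransmit only the blocks that were actually lost."""
    rng = random.Random(seed)
    received = {}                        # block index -> payload, any arrival order
    pending = list(range(len(blocks)))
    sent = 0
    while pending:
        lost = []
        for i in pending:
            sent += 1
            if rng.random() < loss_rate:
                lost.append(i)           # dropped in transit; resend next round
            else:
                received[i] = blocks[i]  # stored by index, so order is irrelevant
        pending = lost
    data = b"".join(received[i] for i in range(len(blocks)))
    return data, sent

blocks = [bytes([n % 256]) * 1024 for n in range(1000)]
data, sent = transfer(blocks)
print(f"blocks needed: {len(blocks)}, blocks sent: {sent}")
```

With a 5% loss rate, the number of blocks sent should land only a few percent above the number needed, in contrast to the up-to-10x retransmission mentioned above.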
As such, this promises better than 90% utilisation of bandwidth, compared to TCP-based transfer figures, which can drop below 20% on long-distance links.
And the cost benefit of this is clear: would you rather use ALL the bandwidth you have bought, or use a fraction of it and buy more to achieve the desired throughput?
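As a back-of-the-envelope illustration of what utilisation means for transfer time, the sketch below compares 20% and 90% utilisation. The 10 TB payload and 1 Gbit/s link are assumed figures for illustration, and `transfer_hours` is a hypothetical helper.

```python
def transfer_hours(data_tb: float, link_mbps: float, utilisation: float) -> float:
    """Hours needed to move data_tb terabytes over a link at a given utilisation."""
    bits = data_tb * 1e12 * 8
    effective_bps = link_mbps * 1e6 * utilisation
    return bits / effective_bps / 3600

for utilisation in (0.20, 0.90):
    hours = transfer_hours(data_tb=10, link_mbps=1000, utilisation=utilisation)
    print(f"{utilisation:.0%} utilisation: {hours:.1f} hours")

# 20% utilisation: 111.1 hours
# 90% utilisation: 24.7 hours
```

On these assumptions, the same link moves the same data in roughly a day instead of most of a week.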
FASP in TSM
In his recent inspirational speech at Silverstring’s summer barbecue, IBM’s Director of Product Management (Storage Software) Ian Smith indicated that he’d like to see this technology incorporated into TSM in the future. Efficient bandwidth usage and higher file transport speeds are obvious and necessary enablers to Cloud backup solutions.
Of course, increasing transfer efficiency in the network is one thing, but how about just transferring less?
Introducing Steelstore
Riverbed WAN optimisation products have been in widespread use for some time now. A new product that recently came to our attention is Steelstore. As well as having all of the deduplicating / compressing / encrypting goodness that you get from point-to-point Riverbed Steelhead / Granite etc., it’s also a gateway to Cloud storage.
Deployed either as a hardware appliance or on a VM, the Steelstore appliance appears to a backup application as a large (up to a PB) disk accessed by CIFS / NFS. It could even be thought of as an offsite disk storage pool in the Cloud, with data directed to the device cached on local disk for quick recall of recent backup data.
The real magic happens as all data is deduplicated inline, encrypted (AES 256-bit at rest and SSLv3 in transit) and replicated to the Cloud storage provider of your choice.
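For readers unfamiliar with inline deduplication, here is a minimal sketch of the general technique. It is not how Steelstore is implemented: the fixed 4 KB chunk size, the in-memory `store` dictionary and the function names are all assumptions for illustration, and encryption is left out entirely.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, for illustration only

def dedup_store(stream: bytes, store: dict) -> list:
    """Split a stream into chunks, keep each unique chunk once, and
    return the list of chunk hashes needed to rebuild the stream."""
    recipe = []
    for offset in range(0, len(stream), CHUNK_SIZE):
        chunk = stream[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # only new chunks consume storage
        recipe.append(digest)
    return recipe

def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original stream from its chunk hashes."""
    return b"".join(store[d] for d in recipe)

store = {}
backup_1 = b"A" * 8192 + b"B" * 4096
backup_2 = b"A" * 8192 + b"C" * 4096  # mostly unchanged since backup_1
recipe_1 = dedup_store(backup_1, store)
recipe_2 = dedup_store(backup_2, store)

logical = len(backup_1) + len(backup_2)
stored = sum(len(c) for c in store.values())
print(f"logical {logical} bytes, stored {stored} bytes, ratio {logical / stored:.1f}:1")
```

Only the chunks that have not been seen before would need to be sent on to the Cloud, which is where the saving in WAN traffic and storage comes from.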
Steelstore appliances have built-in support for the APIs from Amazon S3, AT&T Synaptic Storage as a Service, Microsoft Azure and Rackspace Cloud Files, with baseline support for other instances of OpenStack (Swift) object storage and EMC Atmos. More cloud providers will be added over time based on demand.
Where possible, data for restore is recalled from the local disk cache; otherwise it is fetched from the Cloud. If the site and appliance are ‘lost’, the data can be accessed by building / deploying another device and re-entering the encryption keys and Cloud access credentials (remember to keep these safe – offsite!).
Cloud storage and data transfer tend to be charged commodities, and the value of Steelstore appears to be in reducing consumption of these. Promotional material claims “Steelstore gateways reduce your WAN data transmission and cloud storage needs by 10-30 times on average.”
In a TSM context, this claim needs to be taken with a pinch of salt, as TSM already uses an ‘incremental forever’ philosophy for unstructured data, and now for VM backups, so serious data reduction has already happened in the backup application before the data hits storage.
And the truth is, less sophisticated backup applications that throw lots of full backups into storage would probably see better deduplication ratios at the gateway.
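A simple model shows why the backup strategy matters so much to the ratio seen at the gateway. The 1 TB primary data size, 2% daily change rate and 30-day window are assumptions, `dedup_ratio` is a hypothetical helper, and the model ignores compression and any duplication within a single backup.

```python
def dedup_ratio(primary_gb, daily_change, days, full_every):
    """Ratio of data sent to the gateway vs unique data it must keep.
    full_every=None models incremental forever (one initial full only)."""
    increment = primary_gb * daily_change
    sent = 0.0
    for day in range(days):
        is_full = day == 0 or (full_every is not None and day % full_every == 0)
        sent += primary_gb if is_full else increment
    # Unique data: the first full plus each later day's changed data.
    unique = primary_gb + (days - 1) * increment
    return sent / unique

for label, full_every in (("incremental forever", None),
                          ("weekly full + daily incremental", 7),
                          ("daily full", 1)):
    ratio = dedup_ratio(primary_gb=1000, daily_change=0.02, days=30, full_every=full_every)
    print(f"{label}: {ratio:.1f}:1")

# incremental forever: 1.0:1
# weekly full + daily incremental: 3.5:1
# daily full: 19.0:1
```

In this simplified model, an incremental-forever application leaves the gateway with nothing extra to deduplicate, while repeated full backups are what push the ratio towards the advertised range.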
We’re always interested to hear your thoughts. Have you used Aspera’s FASP solutions? Steelstore? Get in touch and let us know.