Analyzing AWS: Diving Deeper on New Amazon Tech

by Joel Rosenberger and Randall Barnes

As the world’s largest cloud IaaS provider, AWS turns out new features and developments at impressive speed. It’s hard for anyone to keep up, but we’ve compiled our view of the most interesting and potentially business-changing AWS trends of late. Some, like Amazon Redshift, are in full swing already in terms of customer demand. Others, like Lambda, need some time to marinate.

Amazon Redshift
What they’re saying:
Amazon Redshift is a petabyte-scale data warehouse solution that starts at $0.25 per hour and can scale to a petabyte or more for $1,000 per terabyte per year, less than one-tenth the cost of most other cloud data warehousing solutions, per AWS. Big companies like Airbnb and Nokia are using Amazon Redshift, making it a viable alternative to traditional enterprise-scale databases such as Oracle and IBM. This spring, news emerged that Amazon had acquired a startup named Amiato in May 2014; its technology can extract unstructured data from NoSQL databases and migrate it to Amazon Redshift.

Our take: Our sales reps see midsize to large companies inquire about this technology weekly, and every month we’re launching a new customer project around it. This product is going mainstream quickly: established database vendors had better be thinking on their feet. What’s tremendous about Amazon Redshift is that, to back up Amazon Web Services’ claim, it costs roughly 10% of traditional solutions. Further, because it’s on demand, you pay for exactly what you need when you need it. Your IT department doesn’t have to support it, and it keeps improving in the background. Amazon Redshift’s cost economics will finally enable smaller companies to take on data warehousing, a sophisticated IT project that has previously been out of reach from both a procurement and an IT management perspective. Still, as with any new cloud technology, we expect minor changes to how things run compared with on premises. A primary consideration with Amazon Redshift is that it requires some tuning of the ETL processes. Any experienced DBA can learn it, but the platform ingests data differently than on-premises platforms do. Fortunately, data management tools for working with Amazon Redshift are becoming more available, from Informatica, SnapLogic, and others.
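To make the ETL difference concrete: Redshift is designed to bulk-load data in parallel from S3 via its COPY command, rather than accepting the row-by-row INSERTs a traditional on-premises warehouse might. A minimal sketch of assembling such a statement follows; the table, bucket, and IAM role are hypothetical placeholders.

```python
def redshift_copy_sql(table, s3_uri, credentials, delimiter="|"):
    """Build a Redshift COPY statement that bulk-loads from S3.

    Parallel COPY from S3, not row-by-row INSERT, is the main ETL
    adjustment for DBAs coming from on-premises platforms.
    """
    return (
        f"COPY {table} "
        f"FROM '{s3_uri}' "
        f"CREDENTIALS '{credentials}' "
        f"DELIMITER '{delimiter}' GZIP;"
    )

# Hypothetical bucket and IAM role, for illustration only.
sql = redshift_copy_sql(
    "sales",
    "s3://my-etl-bucket/sales/",
    "aws_iam_role=arn:aws:iam::123456789012:role/RedshiftCopyRole",
)
```

The resulting string would be executed against the cluster with any standard PostgreSQL-compatible client.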

AWS Lambda
What they’re saying:
AWS Lambda is a new code execution environment which AWS says makes it easy to build mobile, tablet, and IoT backends that scale automatically without provisioning or managing infrastructure. Lambda runs your code on high-availability compute infrastructure and performs all the administration of the compute resources, including server and operating system maintenance, capacity provisioning and automatic scaling, code and security patch deployment, code monitoring and logging.

Our take: Lambda goes beyond virtualization by giving companies control over when code runs. IT defines the code, and the app executes it; the cloud service, in this case AWS, guarantees that it all happens in a highly responsive manner. Lambda provides an alternative to a message bus architecture, with the ability to deploy an auto-scaling message bus that can be used for everything from event processing to enterprise application integration. IT no longer has to manage the process or deploy a VM to add capacity, and it’s always fast. You can scale up and down without launching a virtual machine, similar to containerization services like Docker. The most agile companies will adopt this faster than others, but for now the overwhelming response from our customers is lukewarm. That’s likely a matter of priorities, and Lambda is still very new. The fact that Lambda supports only a limited set of runtimes is also a consideration: developers may need to rewrite existing code for it.
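The programming model is simple: Lambda invokes a handler function once per event, and the developer never touches a server. A minimal sketch, assuming a Python runtime and an S3-style put notification as the event (the payload shape and key names here are illustrative):

```python
def handler(event, context=None):
    """Lambda-style entry point: invoked once per event, no server
    to provision or manage. This sketch extracts object keys from
    an S3-style notification payload."""
    keys = [
        rec["s3"]["object"]["key"]
        for rec in event.get("Records", [])
    ]
    # A real function would process each object here.
    return {"processed": len(keys), "keys": keys}

# Hand-built sample event, mimicking an S3 put notification.
sample_event = {
    "Records": [
        {"s3": {"object": {"key": "uploads/report.csv"}}},
        {"s3": {"object": {"key": "uploads/log.txt"}}},
    ]
}
result = handler(sample_event)
```

Scaling is implicit: if a thousand such events arrive at once, AWS runs the handler a thousand times without IT adding capacity.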

AWS Config
What they’re saying:
In April, AWS announced the expansion of its new AWS Config service to five additional regions. The service helps enterprises better track resources and configurations in rapidly changing environments: it takes snapshots of the state of your AWS resources and tracks the changes that take place between them.

Our take: AWS introduced CloudTrail to give enterprise companies the ability to audit API calls to its IaaS. While CloudTrail is extremely useful, it tells you nothing about the relationships between the AWS resources running in your accounts. More specifically, CloudTrail can tell you that an EC2 instance was terminated via the API, but not that five EBS volumes were attached to that instance, which is why AWS Config was created. In this example, AWS Config would record the change in resource relationships, enabling you to quickly determine that five EBS volumes were orphaned when the EC2 instance was terminated.
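The kind of relationship query this enables can be sketched with a hand-built resource map; the field names below are illustrative, not the literal AWS Config API schema.

```python
# Illustrative resource map, in the spirit of AWS Config's
# relationship data (field names are hypothetical).
resource_map = {
    "i-0abc123": {
        "type": "AWS::EC2::Instance",
        "state": "terminated",
        "attached_volumes": ["vol-01", "vol-02", "vol-03",
                             "vol-04", "vol-05"],
    },
}

def orphaned_volumes(resources):
    """Return EBS volumes whose parent instance was terminated --
    the relationship CloudTrail alone cannot surface."""
    orphans = []
    for res in resources.values():
        if (res["type"] == "AWS::EC2::Instance"
                and res["state"] == "terminated"):
            orphans.extend(res["attached_volumes"])
    return orphans

orphans = orphaned_volumes(resource_map)
```

With Config tracking the relationships, finding the five orphaned volumes becomes a lookup rather than a forensic exercise.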

AWS Config is not limited to describing just the mapping of EC2 to EBS but encompasses nearly all resource relationships in AWS. For instance, AWS Config describes the dependencies of security groups. Again, CloudTrail would tell you that an API call was made where a security group was modified or deleted but not which EC2 instances were attached to that security group. AWS Config provides a resource map that describes which EC2 instances, network interfaces and VPCs are associated with the security group and then notifies you when a change occurs. These resource maps are enormously valuable when researching possible vulnerabilities to applications and data or even when investigating a breach or unauthorized access event.
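The security-group case works the same way: given a relationship map, the blast radius of a modified group is a direct lookup. A sketch, again with hypothetical field names rather than the literal Config response format:

```python
# Hypothetical relationship map for one security group.
sg_map = {
    "sg-1234": {
        "instances": ["i-aaa", "i-bbb"],
        "network_interfaces": ["eni-01"],
        "vpc": "vpc-9999",
    }
}

def blast_radius(sg_id, resource_map):
    """List every resource associated with a security group --
    the scope to review after a suspicious change or deletion."""
    deps = resource_map.get(sg_id, {})
    return deps.get("instances", []) + deps.get("network_interfaces", [])

affected = blast_radius("sg-1234", sg_map)
```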

In summary, AWS Config provides a quick reference for understanding the scope of changes made in your account, and it reduces the number of AWS API calls you must make yourself, thus reducing your cost. In addition, 2nd Watch is developing visualization software that lets customers see changes as they occur.

EC2 Dense-storage (D2) Instances
What they’re saying:
In April, AWS announced that customers could launch Amazon Elastic MapReduce (EMR) clusters on the next generation of Amazon EC2 Dense-storage (D2) instances. “D2 instances allow you to take advantage of the low cost, high disk throughput and high sequential I/O access rates offered by these instances,” AWS states. This could be viable for running workloads on the Hadoop Distributed File System (HDFS) and, more generally, for storing and processing large amounts of data in the cloud, observers say.

Our take: We are seeing a great deal of interest from customers for deploying EC2 dense-storage instances for high-volume workloads such as file servers and traditional RDBMS servers like Oracle. This gives customers the ability to more easily scale their workloads without adding complexity to the storage management. Dense-storage EMR workloads are still nascent in our customer base, although we expect in time that companies will look to them for consolidating workloads or replacing high-cost proprietary enterprise data warehouse transform applications. AWS is forward-looking in preparing the foundation for these emerging workloads.
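As a rough sketch of what targeting D2 for an HDFS-heavy cluster looks like, the request below uses boto3's EMR `run_job_flow` parameter shape; the cluster name, instance counts, release label, and roles are assumptions for illustration, and the actual API call is left commented out since it requires credentials.

```python
# Hypothetical EMR cluster request targeting D2 dense-storage
# instances; names, counts, and release label are illustrative.
cluster_request = {
    "Name": "hdfs-dense-storage",
    "ReleaseLabel": "emr-4.0.0",
    "Instances": {
        "MasterInstanceType": "d2.xlarge",
        "SlaveInstanceType": "d2.2xlarge",  # local HDDs suit HDFS throughput
        "InstanceCount": 5,
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    "JobFlowRole": "EMR_EC2_DefaultRole",
    "ServiceRole": "EMR_DefaultRole",
}

# With credentials configured, the cluster would be launched via:
# import boto3
# boto3.client("emr", region_name="us-east-1").run_job_flow(**cluster_request)
```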

Joel Rosenberger is EVP, Software and Randall Barnes is Senior Cloud Architect, both at 2nd Watch.



WWPI – Covering the best in IT since 1980