Date: 2017-02-07

BlueData, provider of a Big-Data-as-a-Service (BDaaS) software platform, announced Tuesday its winter release for the BlueData EPIC software platform. The release delivers several enhancements for data science operations, bringing DevOps agility and collaboration to data science teams as well as support for new machine learning use cases.

The BlueData EPIC software platform delivers agility and flexibility, providing an easy-to-use self-service interface that allows data science teams to quickly spin up Docker-based environments for their preferred tools — running on shared infrastructure either on-premises or in the public cloud, with secure access to common data (e.g. in an HDFS data lake or Amazon S3).

The platform now gives data science teams the option to use JupyterHub, RStudio Server, and/or Zeppelin notebooks. Each of these notebooks is pre-configured and pre-tested as a Docker image in the BlueData EPIC App Store and can be installed through automated one-click deployment. BlueData EPIC ensures governance, security, and authentication while providing the ability for users to share their data, models, and code in a multi-tenant environment on common infrastructure.

Data science environments in BlueData EPIC are pre-configured for R and Python support, with or without Spark. This enables data science teams to use their preferred languages, packages, and tools — without the operational challenges of testing and validating configurations or version dependencies. For example, data scientists can start with R standalone (e.g. RStudio, Shiny Server) and then later opt to use R with Spark for other use cases; the same applies to Python (i.e. with JupyterHub and PySpark). Other tools can also be added to the BlueData EPIC App Store, using the App Workbench to create new Docker-based images.

This release also allows users to submit R, Python, Spark, Hadoop, or SQL jobs — for persistent or transient clusters — from either the BlueData EPIC Web-based UI or REST API. This helps data science teams to quickly respond to dynamic business requirements by running a variety of jobs ranging from analytical SQL to Spark machine learning scripts against their data in a matter of a few clicks or with simple code.

The new release offers the ability to patch and update some or all of the nodes in a running environment with a single click. Bootstrap action scripts within BlueData EPIC can be executed both during and after the creation of an environment. This helps remove the operational overhead of setting up, configuring, and managing the end-to-end lifecycle of data science environments.

BlueData continues to add new Big Data frameworks and tools to its platform — including pre-integrated H2O as well as Spark MLlib and other machine learning tools. With the pre-configured Docker image for H2O in the BlueData EPIC App Store, customers can now quickly deploy the H2O set of machine learning libraries — including H2O with R, H2O with Spark (i.e. “Sparkling Water”), as well as the H2O Flow user interface.

Data science never really works right with a “one size fits all” cookie-cutter solution. With the BlueData EPIC platform, data science teams can analyze data using Scala, Python, R, or SQL; build models using R, Python, Spark MLlib, or H2O; and run and visualize their analysis using RStudio, JupyterHub, or Zeppelin notebooks, all on the same Spark cluster, using shared data. Data scientists and other users have the flexibility to take advantage of a wide variety of tools and algorithms to address an increasingly complex set of use cases.

This new release builds upon BlueData’s continued innovation over the past year.

Most recently, BlueData announced BlueData EPIC on AWS to provide ultimate flexibility and choice for Big Data deployments on the Amazon cloud, including the ability to tap into both Amazon S3 and on-premises storage. BlueData provides the only Big-Data-as-a-Service solution that can be deployed on-premises, in the public cloud, or in a hybrid architecture.

“It’s time for enterprises to extend the benefits of DevOps to their data science and engineering teams, whether for real-time analytics and machine learning or other use cases,” said Kumar Sreekanti, co-founder and CEO at BlueData. “BlueData customers can bring this agility and speed to their data science operations, with the ability to create fully integrated data science environments in just a few mouse clicks — both on-premises and in the public cloud.”