MapR integrates with unified SQL layer to unlock insights using business intelligence tools



Converged data platform vendor MapR Technologies announced the availability of Apache Drill 1.6 as the unified SQL layer for the MapR Converged Data Platform through integration with MapR-DB. Customers and partners benefit from the flexibility of reporting and analytics on JSON data stored in MapR-DB tables, realizing faster time-to-value with insights gleaned from operational data.

Drill 1.6 introduces the MapR-DB document database format plugin, which enables users to query JSON tables in MapR-DB directly, so no ETL or transformation is required at any layer. The combination provides end-to-end flexibility for JSON: data can be stored, updated, and queried in its natural form and fidelity, at Hadoop scale and in global environments, using familiar ANSI SQL. The result is operational analytics capability and the ability to adapt to data model changes in the underlying applications.
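As a rough illustration of what querying a JSON table directly with SQL might look like, the Python sketch below builds a request for Drill's REST query endpoint. The table path `/tables/business`, the field names (`name`, `address.city`, `stars`) and the `localhost` address are hypothetical and are not taken from the announcement; the sketch also assumes the MapR-DB table is reachable through the `dfs` storage plugin.

```python
import json
import urllib.request

# Assumption: a Drillbit is reachable at this address; 8047 is Drill's default web port.
DRILL_URL = "http://localhost:8047/query.json"


def build_drill_request(table_path: str, min_stars: int) -> dict:
    """Build the JSON body for Drill's REST query endpoint.

    The table path and the column names are hypothetical examples.
    """
    sql = (
        "SELECT t.name, t.address.city AS city, t.stars "
        f"FROM dfs.`{table_path}` t "      # the JSON table is addressed by its path
        f"WHERE t.stars >= {int(min_stars)} "
        "ORDER BY t.stars DESC LIMIT 10"
    )
    return {"queryType": "SQL", "query": sql}


def run_query(payload: dict, url: str = DRILL_URL) -> dict:
    """POST the query to Drill and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the request body; call run_query(payload) against a live cluster.
    payload = build_drill_request("/tables/business", 4)
    print(json.dumps(payload, indent=2))
```

The nested field `t.address.city` is read in place, with no schema declared up front and no flattening step, which is the point of the "no ETL" claim.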

Drill is designed with JSON and schemaless/semi-structured data at its core, and it can already query and manipulate raw JSON files in MapR-FS. This capability now extends to MapR-DB for operational and fast-changing data. “We will be extending the power of Drill very soon to MapR Streams, a global publish-subscribe messaging framework, so SQL can be used to query real-time streaming events. Essentially, Drill becomes the unified, high-performance, and flexible SQL access layer across files, tables and streams in the MapR Converged Data Platform,” MapR added in a related blog post.

The new MapR-DB document database plugin allows analysts to perform SQL queries directly on JSON data stored in MapR-DB tables. The plugin offers a variety of pushdown capabilities to provide an optimal interactive experience. Drill 1.6 also improves query performance on data in Hadoop and NoSQL systems through numerous query planning improvements, such as partition pruning, metadata caching and other optimizations.

The release offers performance gains of 10 to 60 times in query planning compared with previous releases of Drill. Its improved memory management delivers more stability and scale, enabling customers to run both larger and more numerous SQL workloads on a MapR cluster. It also improves integration with visualization tools such as Tableau through faster metadata queries, and introduces client impersonation for end-to-end security from the visualization tool to data in Hadoop. Version 1.6 also provides enhanced SQL window functions.

“Operational analytics on document databases such as MapR-DB is a rapidly growing use case,” said Neeraja Rentachintala, senior director, Product Management, MapR Technologies. “For the first time, there is a stack that allows BI developers and business analysts to store and query data in native formats without cumbersome ETL or transformation, providing end-to-end flexibility and scale.”

Drill serves a variety of use cases. For example, media companies can instantly query and analyze incoming content delivery network (CDN) files without requiring data transformations, allowing them to analyze several terabytes of CDN logs and reduce customer attrition. High-tech chip manufacturers can develop offerings that allow them to better analyze dropped calls and provide that information to their handheld device partners, thereby improving quality of service. Communications providers can instantly query and analyze logs from cell towers, enabling mobile operators to proactively monitor and improve subscriber experience.


WWPI – Covering the best in IT since 1980