Databricks Adds Deep Learning Support to Cloud-Based Apache Spark Platform

Company Delivers Comprehensive Deep Learning Toolkit for Big Data with GPUs Alongside CPUs

Databricks®, the company founded by the creators of the Apache® Spark™ project, today announced the addition of deep learning support to its cloud-based Apache Spark platform. This enhancement adds GPU support and integrates popular deep learning libraries into the Databricks big data platform, extending its capabilities to enable the rapid development of deep learning models. Data scientists looking to combine deep learning with big data, whether for recognizing handwriting, translating speech between languages, or distinguishing between malignant and benign tumors, can now use Databricks for every stage of their workflow, from data wrangling to model tuning. Databricks is the first to integrate these diverse workloads in a fast, secure, and easy-to-use Apache Spark platform in the cloud.

Apache Spark and Deep Learning

The 2016 Spark Survey found that machine learning usage in production saw a 38 percent increase since 2015, making it one of Spark’s key growth areas. Many leaders in machine learning, such as Yahoo, are choosing Spark for deep learning to achieve groundbreaking results with big data.

In March 2016, Databricks created and open sourced TensorFrames, a software library that enables the popular deep learning framework TensorFlow to run on Spark. The enhancements announced today simplify deep learning on Spark by adding out-of-the-box support for using TensorFrames with GPUs, specialized hardware that can perform a large number of deep learning-specific computations in parallel. With Databricks, data teams can conduct deep learning on highly optimized hardware with a few clicks or API calls.

“We are proud to enable organizations to achieve better results in their mission-critical applications and are always looking ahead at the latest technologies, such as deep learning, to provide the Spark community with the most flexible, approachable big data toolset,” said Ali Ghodsi, CEO and Cofounder at Databricks.

End-to-End Deep Learning with Databricks

Databricks allows organizations to perform data wrangling, interactive exploration, stream data processing, and other advanced analytics techniques alongside deep learning in a comprehensive platform. By seamlessly combining these techniques on Databricks, organizations can avoid unwanted system complexities and simplify the development of deep learning applications such as:

  • More timely and accurate cancer detection for healthcare providers: To read and interpret pathology images with higher accuracy than humans;
  • Faster drug discovery for pharma: To predict therapeutic uses of drugs at earlier stages to speed up the development and sales pipelines;
  • More capable artificial intelligence, such as language translation: To translate spoken speech with computers at an accuracy that rivals human performance.

“Today’s dynamic data teams are applying a broad range of analytic tools to more data, but requiring insights and faster ROI,” said Tony Baer, Principal Analyst at Ovum. “With the Databricks’ platform, they can easily utilize the latest innovations, whether it’s Spark Streaming or deep learning, enabling them to build and deploy sophisticated business applications, in a simpler and faster way.”

Read the blog to learn more: http://dbricks.co/db-deep-learning
