Data is at the center of the modern analytics revolution. Large amounts of data must be delivered to the parallel processors, like multi-core CPUs and GPUs, at incredibly high speeds in order to train machine learning and analytic algorithms faster and more accurately. Today, most machine learning production is undertaken by hyperscalers and large, web-scale companies. Recently, however, ML has moved to the forefront in numerous industries. In the automotive industry, where the global race to market the first autonomous vehicles has heated up, ML has become the de facto approach. In finance, organizations are implementing AI and ML to automate trades, understand credit exposure and manage risk. In healthcare research, analysis of MRI images can pick out genetic markers in brain tumors that are invisible to the naked eye to assist medical professionals in clinical diagnosis. FlashBlade, which is optimized for any and all unstructured workloads, is best positioned to accelerate modern emerging workloads for the world’s most innovative organizations.
“FlashBlade is massively parallel – from the top down. Parallelism mimics the human brain, and enables multiple queries or jobs to run simultaneously,” said Par Botes, VP of Products, Pure Storage. “Think of a storage array like a grocery store checkout – if one lane is open for 10 customers, the checkout process becomes a bottleneck. If 10 lanes are open, each customer breezes through simultaneously. Similarly, a storage system that can take 10 varied workloads and run them all at the same time will run 10x faster. While most systems are tuned for very specialized and specific types of workloads, FlashBlade was built from the ground up to be optimized for everything, which provides customers with a significant competitive advantage.”
Zenuity, a joint collaboration between Volvo® Cars and global airbag manufacturer Autoliv®, selected FlashBlade and NVIDIA® DGX-1 as the foundation for its machine learning project to put the safest autonomous vehicles on the road by 2021. Each vehicle is equipped with sensors like LIDARs and cameras to safely navigate in its surroundings. Millions of frames collected from the cars are used to train deep neural networks that are then used to power the software that runs Zenuity’s fleet of self-driving vehicles.
“FlashBlade provides the scalability and performance needed for a machine learning project of this magnitude,” said Samuel Scheidegger, Machine Learning Researcher at Zenuity. “With its ability to scale linearly, FlashBlade will allow Zenuity to expand its machine learning platform with computational power for future needs.”
FlashBlade was also selected to power one of the most powerful AI supercomputers in the world at a web-scale cloud company. The engine ingests massive amounts of unstructured user data and then relies on FlashBlade to provide a high-performance platform for textual analysis, facial recognition, targeted advertisements, predictive analysis that improves user safety, and application design for AI apps that improve overall efficiency.
UC Berkeley’s AMPLab created and pioneered the real-time analytics engine Apache Spark™, the fastest, most cutting-edge analysis tool in the world. The UC Berkeley genomics department then implemented Apache Spark on top of FlashBlade to serve as an accelerator to make major leaps in genomic sequencing.
“Medicine today is very much trial-and-error. FlashBlade provides a data platform which allows us to analyze everyone’s genetic makeup and deliver medicine that is more precise and tailored to each individual,” said Professor Anthony D. Joseph, a core faculty member of UC Berkeley’s Center for Computational Biology (CCB). “Our goal is to provide patients with personalized care and treatment based on their specific genetic makeup, in near real-time. This will optimize medications, improve post-op care and rehabilitation, and ultimately lower patient cost. With FlashBlade, we can utilize Spark to sequence the genome of a patient and cross-reference that information against all known bacteria and genomic defects at previously unachievable speeds.”
Man AHL, a London-based pioneer in the field of systematic quantitative investing, also leverages Apache Spark on top of FlashBlade to create and execute computer models that make investment decisions. Roughly 50 quantitative researchers and more than 60 technologists collaborate to formulate, develop and drive new investment models and strategies that can be executed by computer. The firm adopted FlashBlade to deliver the massive storage throughput and scalability required to meet its most demanding simulation applications.
“Our researchers have found that FlashBlade can greatly improve the usability and performance of Spark for performing multiple simulations,” said Gary Collier, Co-CTO, Man AHL. “We have seen as much as 10-20x improvement in throughput for Spark workloads, which really has the potential to be a game-changer for us when it comes to creating a time-to-market advantage.”
Pure Storage has also leveraged AI and FlashBlade as key components of its internal technology platform. Pure1® META delivers global predictive intelligence by collecting and analyzing more than one trillion array telemetry points per day. With META, Pure Storage has implemented AI to help customers predict and identify potential system or workload issues before they occur, which improves operations and end-user experience.
“AI and other emerging workloads are changing the game for business operations and customer engagement,” said Eric Burgener, Research Director, Storage, IDC. “For years, slow, complex legacy storage systems unable to cope with modern data volume and velocity have been a roadblock for next-generation insights and breakthroughs. Purpose-built systems like FlashBlade eliminate that roadblock, removing the storage infrastructure as a barrier to customers fully leveraging data analytics to move their business forward.”