Indra den Bakker

Battle of the Deep Learning Frameworks — Part I: 2017, even more frameworks and interfaces

The deep learning landscape is constantly changing. Theano was the first widely adopted deep learning framework, created and maintained by MILA, headed by Yoshua Bengio, one of the pioneers of deep learning. However, things have changed. In September of this year, MILA announced that there will be no further development work on Theano in 2018, after the release of the latest version. The news didn’t come as a surprise. In recent years, different open source Python deep learning frameworks have been introduced, often developed or backed by one of the big tech companies, and some have gained a lot of traction.

At the moment, TensorFlow by Google seems to be the most used deep learning framework out there, based on GitHub stars and forks and Stack Overflow activity. Some expected that with the introduction of TensorFlow, Google would dominate the market for years. However, it looks like other frameworks have managed to attract a growing, and passionate, user base as well. Most noteworthy might be the introduction and growth of PyTorch. PyTorch was introduced by Facebook, among others, in January 2017. It’s a Python port of the popular Torch framework (implemented in C with a wrapper in Lua), exposing Torch’s GPU-accelerated backend through Python.

Besides GPU acceleration and efficient memory usage, the main driver behind the popularity of PyTorch is its use of dynamic computational graphs. Dynamic computational graphs were already being used by other, lesser-known deep learning frameworks like Chainer. The advantage of these dynamic graphs is that they are defined by the run (“define by run”) instead of the traditional “define and run”. This is especially useful and efficient in cases where the input can vary, for example with unstructured data like text.
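To make the “define by run” idea concrete, here is a minimal sketch in PyTorch (using the requires_grad API of more recent releases; early 2017 versions wrapped tensors in Variable, and the step function below is purely illustrative). The graph is simply whatever Python code executes, so an input sequence of any length produces a matching graph:

```python
import torch

# Illustrative recurrent-style step: the graph is built as this code runs.
def step(x, h, w):
    return torch.tanh(x + h @ w)

w = torch.randn(8, 8, requires_grad=True)
h = torch.zeros(1, 8)
sequence = [torch.randn(1, 8) for _ in range(5)]  # length can differ per example

for x in sequence:      # plain Python control flow defines the graph
    h = step(x, h, w)

loss = h.sum()
loss.backward()         # backprop through whatever graph was just built
print(w.grad.shape)     # torch.Size([8, 8])
```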

Other tech giants haven’t been sitting still either. Microsoft developed an internal deep learning framework called CNTK and officially launched the 2.0 version in 2017, after renaming it the Microsoft Cognitive Toolkit. In 2017, Facebook also launched Caffe2, meant to be the successor of the well-known Caffe framework. The original Caffe framework, developed by the Berkeley Vision and Learning Center, was and still is extremely popular for its community, its applications in computer vision, and its Model Zoo, a selection of pre-trained models. However, it seems that Caffe2 hasn’t quite followed in Caffe’s footsteps just yet.

Another popular deep learning framework is MXNet, supported by Microsoft and Amazon. MXNet has been around for a while, but when MXNet is mentioned as a deep learning framework, I often hear people respond with “isn’t that a deep learning framework for R?”. Yes it is, but it’s much more than that. It actually supports many languages, from C++ to Python, JavaScript, Go, and, indeed, R. Where MXNet stands out is its scalability and performance (stay tuned for Part II, where we will compare the most popular frameworks on speed, among other metrics).

These are just a small selection of a wide range of frameworks. Other open source deep learning frameworks include Deeplearning4j and Dlib (C++ based). Also in 2017, Google’s DeepMind released Sonnet, a high-level object-oriented library built on top of TensorFlow. Other frameworks worth mentioning are H2O.ai and Spark.

Next to all these frameworks, we also have interfaces that wrap one or multiple frameworks. The most well-known and widely used interface for deep learning is without a doubt Keras. Keras is a high-level deep learning API, written in Python and created by François Chollet, a deep learning researcher at Google. In 2017, Google announced that Keras had been chosen to serve as the high-level API of TensorFlow, meaning that Keras will be included in the next TensorFlow release. Next to TensorFlow, Keras can also use Theano or CNTK as its backend.

Keras is powerful because it’s really straightforward to create a deep learning model by stacking multiple layers. When using Keras, the user doesn’t have to implement the math behind the layers. This makes it ideal for quick prototyping, and Keras is also a popular tool in Kaggle competitions.
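As an illustration, here is a minimal sketch of stacking layers with the Keras Sequential API (the layer sizes and the MNIST-like 784-dimensional input are illustrative):

```python
from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(784,)))  # hidden layer
model.add(Dropout(0.2))                                       # regularization
model.add(Dense(10, activation='softmax'))                    # 10-class output

# Keras handles the math and delegates the computation to the
# configured backend: TensorFlow, Theano, or CNTK.
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```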

So on one side we currently have the high-level Keras API, which lets you easily build simple and advanced deep learning models, and on the other the low-level TensorFlow framework, which gives you more flexibility in building models. Both are backed by Google. As expected, the competition didn’t sit still, and in October 2017 Microsoft and Amazon’s AWS jointly announced the Gluon API. Gluon is a high-level Python deep learning interface that wraps MXNet, and soon it will also support Microsoft’s CNTK. Gluon is a direct competitor to Keras, and although AWS claims to strongly support all deep learning frameworks, it is, of course, betting on Gluon for the democratization of AI.
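For comparison with the Keras snippet above, here is a minimal sketch of a similar model in Gluon on top of MXNet (layer sizes are again illustrative):

```python
import mxnet as mx
from mxnet import gluon

net = gluon.nn.Sequential()
with net.name_scope():                  # prefixes the parameter names
    net.add(gluon.nn.Dense(128, activation='relu'))
    net.add(gluon.nn.Dropout(0.2))
    net.add(gluon.nn.Dense(10))         # logits for 10 classes

net.initialize(mx.init.Xavier())        # input shapes are inferred on first use
output = net(mx.nd.random.uniform(shape=(32, 784)))
print(output.shape)                     # (32, 10)
```

Note that, unlike the Keras example, no input shape is declared up front; Gluon infers the parameter shapes on the first forward pass.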

Surprisingly enough, the biggest competitor of TensorFlow today seems to be PyTorch. After the growing interest in PyTorch from the community (in the latest Kaggle competitions, for example, users often choose PyTorch as part of their solutions, and it has been used in the latest research papers as well), TensorFlow introduced Eager Execution in October 2017: a “define-by-run” interface to TensorFlow. With this launch, Google hopes to win back the users that fell in love with PyTorch and its dynamic graphs.
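In an eager sketch like the one below, operations run immediately and return concrete values, much like in PyTorch (note that the enabling call initially lived under tf.contrib.eager before graduating to the top-level API in later releases):

```python
import tensorflow as tf

tf.enable_eager_execution()   # first preview: tf.contrib.eager.enable_eager_execution()

# No graph construction plus session.run(): operations execute
# immediately and return concrete values.
x = tf.constant([[2.0, 3.0]])
w = tf.constant([[1.0], [4.0]])
print(tf.matmul(x, w))        # tf.Tensor([[14.]], shape=(1, 1), dtype=float32)
```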

For the developers of the popular deep learning course fast.ai, this change came too late. In September, fast.ai announced its switch from Keras & TensorFlow to PyTorch. Jeremy Howard, founding researcher at fast.ai and former President and Chief Scientist at Kaggle, thinks that PyTorch will be able to stay ahead of the curve. Only time will tell.

With all these deep learning frameworks around, it can be challenging for newcomers to choose one. Quite frankly, even for seasoned researchers and developers it’s hard to keep up with the latest developments. A positive note is the release of the Open Neural Network Exchange (ONNX). Announced in September 2017, with V1 released in December, ONNX is an open format for representing deep learning models. It allows users to move models between frameworks more easily. For example, you can build a PyTorch model and run it for inference using MXNet.
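Here is a minimal sketch of the PyTorch side of that workflow, assuming a PyTorch release with ONNX support (the model and the file name are illustrative):

```python
import torch
import torch.nn as nn

# Illustrative model; any traceable PyTorch model exports the same way.
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
dummy_input = torch.randn(1, 784)   # the export traces the model on a sample input

torch.onnx.export(model, dummy_input, "model.onnx")
# model.onnx can now be loaded by another ONNX-capable framework,
# such as MXNet or Caffe2, to run inference.
```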

ONNX was launched by Microsoft, AWS, and Facebook, among others. It doesn’t come as a surprise that Google isn’t part of this list. ONNX supported Caffe2, the Microsoft Cognitive Toolkit, MXNet, and PyTorch from the start, but, as with other open source projects, the community has already added a converter for TensorFlow as well.

2017 saw a lot of exciting developments, reflecting the fast-moving field of deep learning and AI in general. It’s hard to predict what will happen in the new year. We might see some consolidation, although the big tech companies will definitely want to use and promote their own tech stacks. It’s good to see that the different frameworks, backed by different tech giants, push each other to innovate faster. In Part II we will compare the different frameworks in more detail on the basis of metrics like speed, memory usage, portability, and scalability.