**By Matthew Mayo, KDnuggets.**

I originally intended to play around with MXNet **long** ago, around the time that Gluon was released publicly. Things got busy. I got sidetracked.

I finally started using MXNet recently. In the interests of getting to know my way around, I thought covering some basics, such as how tensors and derivatives are handled, might be a good place to start (as I did here and here with PyTorch).

This won't repeat what is in those previous PyTorch articles step by step, so look at those if you want any further context. What's below should be relatively straightforward, however.

MXNet is an open source neural network framework, a "flexible and efficient library for deep learning." Gluon is the imperative high-level API for MXNet, which provides additional flexibility and ease of use. You can think of the relationship between MXNet and Gluon as being similar to TensorFlow and Keras. We won't cover Gluon any further herein, but will explore it in future posts.

MXNet's tensor implementation comes in the form of the `ndarray` package. Here you will find what's needed to build multidimensional (*n*-dimensional) arrays and perform some of the operations on them required for implementing neural networks, along with the `autograd` package. It is this package we will make use of below.

**`ndarray` (Very) Basics**

First, let's import what we need from the library, in such a way as to simplify making our API calls:

    import mxnet as mx
    from mxnet import autograd as ag
    from mxnet import nd

Now, let's create a basic `ndarray` (on the CPU):

    # Create CPU array
    a = nd.ones((3, 2))
    print(a)

    [[1. 1.]
     [1. 1.]
     [1. 1.]]
    <NDArray 3x2 @cpu(0)>

Note that printing an `ndarray` also prints out the type of the object (again, `NDArray`), as well as its size and the device to which it is attached (in this case, CPU).

What if we wanted to create an ndarray object with a GPU context (note that a context is the device type and ID which should be used to perform operations on the object)? First, let's determine whether or not there is a GPU available to MXNet:

    # Test if GPU is recognized
    def gpu_device(gpu_number=0):
        try:
            _ = mx.nd.array([1, 2, 3], ctx=mx.gpu(gpu_number))
        except mx.MXNetError:
            return None
        return mx.gpu(gpu_number)

    gpu_device()

    gpu(0)

This response denotes that there is a GPU device, and its ID is 0.

Let's create an `ndarray` on this device:

    # Create GPU array
    b = nd.zeros((2, 2), ctx=mx.gpu())
    print(b)

    [[0. 0.]
     [0. 0.]]
    <NDArray 2x2 @gpu(0)>

The output here confirms that an `ndarray` of zeros of size 2 x 2 was created with a context of GPU.

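To get a returned transposed `ndarray` (as opposed to simply a transpose view of the original):

    # Transpose
    # (c is assumed to be the 3x2 array whose values appear in the NumPy output further down)
    c = nd.array([[1, 4], [2, 5], [3, 6]])
    T = c.T
    print(T)

    [[1. 2. 3.]
     [4. 5. 6.]]
    <NDArray 2x3 @cpu(0)>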

Reshape an `ndarray` as a view, without alteration of the original data:

    # Reshape
    r = T.reshape(3, 2)
    print(r)

    [[1. 2.]
     [3. 4.]
     [5. 6.]]
    <NDArray 3x2 @cpu(0)>
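One detail worth spelling out: reshaping the transposed array back to 3 x 2 does not undo the transpose, because the elements are read row by row and regrouped. The sketch below mimics that row-major behavior in plain Python (no MXNet needed), using the transposed values printed above. The helper name `reshape_row_major` is made up for illustration.

```python
def reshape_row_major(rows, new_shape):
    """Reshape a 2-D nested list by reading it row by row (row-major order)."""
    n_rows, n_cols = new_shape
    flat = [value for row in rows for value in row]
    if len(flat) != n_rows * n_cols:
        raise ValueError("total number of elements must stay the same")
    # Regroup the flat sequence into rows of length n_cols
    return [flat[i * n_cols:(i + 1) * n_cols] for i in range(n_rows)]


T_values = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # the transposed values shown above
r_values = reshape_row_major(T_values, (3, 2))
print(r_values)  # [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
```

The result has the same shape as the array we transposed, but the rows are `[1, 2]`, `[3, 4]`, `[5, 6]` rather than `[1, 4]`, `[2, 5]`, `[3, 6]`.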

Some ndarray info:

    # ndarray info
    print('ndarray shape:', r.shape)
    print('Number of dimensions:', r.ndim)
    print('ndarray type:', r.dtype)

    ndarray shape: (3, 2)
    Number of dimensions: 2
    ndarray type: <class 'numpy.float32'>

See here for more on `ndarray` basics.

**MXNet ndarray To and From NumPy ndarray**

It's easy to go from NumPy `ndarrays` to MXNet `ndarrays` and vice versa.

    import numpy as np

    # To numpy ndarray
    n = c.asnumpy()
    print(n)
    print(type(n))

    [[1. 4.]
     [2. 5.]
     [3. 6.]]
    <class 'numpy.ndarray'>

    # From numpy ndarray
    a = np.array([[1, 10], [2, 20], [3, 30]])
    b = nd.array(a)
    print(b)
    print(type(b))

    [[ 1. 10.]
     [ 2. 20.]
     [ 3. 30.]]
    <class 'mxnet.ndarray.ndarray.NDArray'>

**Matrix-matrix multiplication**

Here's how to compute a matrix-matrix dot product:

    # Compute dot product
    t1 = nd.random.normal(-1, 1, shape=(3, 2))
    t2 = nd.random.normal(-1, 1, shape=(2, 3))
    t3 = nd.dot(t1, t2)
    print(t3)

    [[1.8671514 2.0258508 1.1915313]
     [9.009048  8.481084  6.7323728]
     [5.0241795 4.346245  4.0459785]]
    <NDArray 3x3 @cpu(0)>

See here for more on linear algebra operations with `ndarray`.
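For 2-D inputs, the dot product above is ordinary matrix multiplication: a (3, 2) array times a (2, 3) array gives a (3, 3) array. As a rough illustration of what is being computed, here is a plain-Python sketch with fixed values instead of random ones, so the result can be checked by hand. The `matmul` helper is hypothetical and is not part of MXNet.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists (rows of a times columns of b)."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("columns of a must equal rows of b")
    n_cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(n_cols)]
        for row in a
    ]


m1 = [[1, 4], [2, 5], [3, 6]]    # shape (3, 2)
m2 = [[1, 2, 3], [4, 5, 6]]      # shape (2, 3)
product = matmul(m1, m2)         # shape (3, 3)
print(product)  # [[17, 22, 27], [22, 29, 36], [27, 36, 45]]
```

Each entry is the sum of products of one row of the first matrix with one column of the second, which is why the inner dimensions (2 and 2 here) have to match.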

**Using autograd to Find and Solve a Derivative**

On to solving a derivative with the MXNet `autograd` package for automatic differentiation.

First we will need a function for which to find the derivative. Arbitrarily, let's use this:

$$y = 5x^4 + 3x^3 + 7x^2 + 9x - 5$$

To see us work out the first order derivative of this function by hand, as well as find the value of our derivative function for a given value of *x*, see this post.
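As a quick sketch of that hand calculation, applying the power rule term by term gives the derivative, which can then be evaluated at $x = 2$ (the value used below):

$$
\frac{dy}{dx} = \frac{d}{dx}\left(5x^4 + 3x^3 + 7x^2 + 9x - 5\right) = 20x^3 + 9x^2 + 14x + 9
$$

$$
\left.\frac{dy}{dx}\right|_{x=2} = 20(8) + 9(4) + 14(2) + 9 = 160 + 36 + 28 + 9 = 233
$$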

For reasons which should be obvious, we have to represent our function in Python as such:

    y = 5*x**4 + 3*x**3 + 7*x**2 + 9*x - 5

Now let's find the value of our derivative function for a given value of *x*. Let's arbitrarily use 2:

    x = nd.array([2])
    x.attach_grad()
    with ag.record():
        y = 5*x**4 + 3*x**3 + 7*x**2 + 9*x - 5
    y.backward()
    x.grad

Line by line, the above code:

- defines the value (2) we want to compute the derivative with regard to as an MXNet ndarray object
- uses **attach_grad()** to allocate space for the gradient to be computed
- the code block denoted with **ag.record()** contains the computation to be performed with regard to computing and tracking the gradient
- defines the function we want to compute the derivative of
- uses autograd's **backward()** to compute the sum of gradients, using the chain rule
- outputs the value stored in the *x* `ndarray`'s **grad** attribute, which is shown below

    [233.]
    <NDArray 1 @cpu(0)>

This value, 233, matches what we calculated by hand in this post.
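If you want to double-check a gradient like this without working it out on paper, one option is a numerical comparison. The following is a minimal plain-Python sketch (it does not use MXNet) that evaluates the hand-derived derivative and a central finite-difference approximation at the same point. The function names and the step size `h` are arbitrary choices for illustration.

```python
def f(x):
    """The polynomial used in this post."""
    return 5*x**4 + 3*x**3 + 7*x**2 + 9*x - 5


def df_analytic(x):
    """Derivative obtained with the power rule: 20x^3 + 9x^2 + 14x + 9."""
    return 20*x**3 + 9*x**2 + 14*x + 9


def df_numeric(x, h=1e-5):
    """Central finite-difference approximation of the derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)


analytic = df_analytic(2.0)
numeric = df_numeric(2.0)
print(analytic)                        # 233.0
print(abs(numeric - analytic) < 1e-4)  # True
```

The finite-difference value is only an approximation, so it is compared with a tolerance rather than for exact equality.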

See here for more on automatic differentiation with `autograd`.

This has been a very basic overview of simple `ndarray` operations and derivatives in MXNet. As these are two of the staples of building neural networks, this should provide some familiarity with the library's approaches to these basic building blocks, and allow for diving into some more complex code. Next time we will create some simple neural networks with MXNet and Gluon, exploring the libraries more in depth.

For more (right now!) on MXNet, Gluon, and deep learning in general, the freely-available book **Deep Learning - The Straight Dope**, written by those intimately involved in the development and evangelizing of these libraries, is definitely worth looking at.

**Related**:

- PyTorch Tensor Basics
- Simple Derivatives with PyTorch
- WTF is a Tensor?!?