Deep Learning Performance Cheat Sheet

We outline a variety of simple and complex tricks that can help you boost your deep learning models accuracy, from basic optimization, to open source labeling software.

By Chris Dossman, Machine Learning Person, Future asteroid miner.

The question that I get the most from new and experienced machine learning engineers is “how can I get higher accuracy?”

Makes a lot of sense since the most valuable part of machine learning for business is often its predictive capabilities. Improving the accuracy of prediction is an easy way to squeeze more value from existing systems.

The guide will be broken up into four different sections with some strategies in each.

  • Data Optimization
  • Algorithm tuning
  • Hyper-Parameter Optimization
  • Ensembles, Ensembles, Ensembles

Not all of these ideas will boost performance, and you will see limited returns the more of them you apply to the same problem. Still stuck after trying a few of these? That indicates you should rethink the core solution to your business problem. This article is just a cheat sheet, so I’m linking you to more detailed sources of information in each section.

Data Optimization

Balance your data set

One of the easiest ways to increase performance for underperforming deep learning models is to balance your dataset if your problem is classification. Often real-world Subsample Majority Class:You can balance the class distributions by subsampling the majority class.

  • Oversample Minority Class:Sampling with replacement can be used to increase your minority class proportion.
  • Here is a nice write up for detailing this issue in more detail

    More Data

    Many of us are familiar with this graph. It shows the relationship between the amount of

    Batch size and number of epochs

    A standard procedure is using large batch sizes with a large number of epochs for modern deep learning implementations, but common strategies yield common results. Experiment with the size of your batches and the number of training epochs.

    Early Stopping

    This is an excellent method for reducing the generalization error of your deep learning system. Continual training might improve accuracy on your data set, but at a certain point, it starts to reduce the model’s accuracy on data not yet seen by the model. To improve real-world performance try early stopping.

    Example of early stopping

    Network Architecture

    If you want to try something a little more interesting, you can give Efficient Neural Architecture Search (ENAS) a try. This algorithm will create a custom network design that will maximize accuracy on your dataset and is way more efficient than standard Neural architecture search that cloud ML uses.


    A robust method to stop overfitting is to use regularization. There are a couple of different ways to use regularization that you can train on your deep learning project. If you haven’t tried these methods yet I would start to include them into every project you do.

    • DropoutThey randomly turn off a percentage of neurons during training. Dropout helps prevent groups of neurons from all overfitting to themselves.
    • Weight penalty L1 and L2: Weights that explode in size can be a real problem in deep learning and reduce accuracy. One of the ways to combat this is to add decay to all weights. These try to keep all of the weights int he networks as small as possible unless there are large gradients to counteract it. On top of often increasing performance, it has the benefit of making the model easier to interpret.

    Ensembles, Ensembles, Ensembles

    Having trouble picking the best model to use? Often you can combine the outputs from the different models and get better accuracy. There are two steps for every one of these algorithms.

    • Producing a distribution of simple ML models on subsets of the original data
    • Combining the distribution into one “Aggregated” model

    Combined Models/Views (Bagging)

    In this method, you train a few different models, which are different in some way, on the same data and you average out the outputs to create the final output. Bagging has the effect of reducing variance in the model. You can intuitively think of it as having multiple people with different backgrounds thinking about the same problem but with different starting positions. Just as on a team this can be a potent tool for getting the right answer.


    Its similar to bagging the difference here is that you don’t have an empirical formula for your combined output. You create a meta-level learner that based on the input data chooses how to weight the answers from your different models to produce the final output.

    Still having issues?

    Reframe Your Problem

    Take a break from looking at your screen and get a coffee. This solution is all about rethinking your problem from the beginning. I find it helps to sit down and start brainstorming of different ways that you could solve the problem. Maybe start by asking your self some simple questions:

    • Can my classification problem become a regression problem or the reverse?
    • Can you break down your problem any smaller?
    • Are there any observations that you have collected about your data that could change the scope of the problem?
    • Can your binary output become a softmax output or vice versa?
    • Are you looking at this problem in the most efficient way?

    Rethinking your problem can be the hardest of the methods to increase performance but it often the one that yields the best results. It helps to chat with someone that has experience in deep learning and can give you a fresh take on your problem.

    Bio: Chris Dossman is a Chief data Scientist at Wonder Technology, who's specialties include data science, software/hardware engineering, deep learning, and product management.

    Original. Reposted with permission.


    • Introduction to PyTorch for Deep Learning
    • Top 13 Python Deep Learning Libraries
    • Machine Learning Cheat Sheets