**By Derrick Mwiti, Data Analyst**

Sales forecasting is one of the most common tasks in sales-driven organizations. It enables organizations to plan for the future with a degree of confidence. In this tutorial we’ll use Prophet, a package developed by Facebook, to show how one can achieve this. The package is available in both Python and R. We assume that the reader has a basic understanding of handling time series data in Python.

### Plan of Attack

- Introduction and Installation
- Model Fitting
- Making Future Predictions
- Obtaining the Forecasts
- Plotting the Forecasts
- Plotting the Forecast Components
- Cross Validation
- Obtaining the Performance Metrics
- Visualizing Performance Metrics
- Conclusion

### Introduction and Installation

There are many open source forecasting tools, but no single tool fits every forecasting problem. Prophet works best with hourly, daily, or weekly data covering at least several months, and preferably a year or more of history. According to Facebook Research:

At its core, the Prophet procedure is an additive regression model with four main components:

1. A piecewise linear or logistic growth curve trend. Prophet automatically detects changes in trends by selecting changepoints from the data.

2. A yearly seasonal component modeled using Fourier series.

3. A weekly seasonal component using dummy variables.

4. A user-provided list of important holidays.
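To make the additive structure concrete, here is a minimal sketch in plain Python of a piecewise linear trend plus a yearly Fourier-series term and an optional holiday effect. It is not Prophet’s implementation: the function names, coefficients, and changepoint are invented for illustration, and the weekly component is left out for brevity.

```python
import math


def fourier_seasonality(t_days, coeffs, period=365.25):
    """Evaluate a truncated Fourier series at time t (in days).

    coeffs is a list of (a_n, b_n) pairs. The values used in this sketch
    are made up; a real model would estimate them from data.
    """
    total = 0.0
    for n, (a_n, b_n) in enumerate(coeffs, start=1):
        angle = 2.0 * math.pi * n * t_days / period
        total += a_n * math.cos(angle) + b_n * math.sin(angle)
    return total


def piecewise_linear_trend(t_days, offset, base_rate, changepoints):
    """Linear trend whose slope changes by `delta` after each changepoint."""
    value = offset + base_rate * t_days
    for changepoint_day, delta in changepoints:
        if t_days > changepoint_day:
            value += delta * (t_days - changepoint_day)
    return value


def additive_forecast(t_days, holiday_effects=None):
    """Sum the components: trend + yearly seasonality + holiday effect."""
    trend = piecewise_linear_trend(t_days, 100.0, 0.5, [(200, -0.3)])
    yearly = fourier_seasonality(t_days, [(5.0, 2.0), (1.0, -0.5)])
    holiday = (holiday_effects or {}).get(t_days, 0.0)
    return trend + yearly + holiday


# Hypothetical holiday on day 359 that adds 25 units to the forecast.
print(additive_forecast(359, holiday_effects={359: 25.0}))
```

Because the components are simply added together, each one can be inspected on its own, which is what the component plots later in this tutorial rely on.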

Prophet can be installed using pip, for example with `pip install prophet` (releases before 1.0 were published as `fbprophet`). Prophet depends on a Python interface to Stan (`pystan` at the time of writing), which is installed automatically as we install Prophet.

The dataset we’ve used for this tutorial is available here. Once you download the dataset, make sure to delete the unnecessary rows towards the end of the file because they might interfere with the analysis. For this univariate analysis, Prophet expects the dataset to have two columns named *ds* and *y*. *ds* is the date column while *y* is the column that we are forecasting.

Let’s get the ball rolling by importing Pandas for data manipulation and Prophet for forecasting. Next we load in our dataset and check its head.
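As a rough sketch of this preparation step, the snippet below parses a small inline CSV, skips trailing rows that are not data, and names the fields *ds* and *y*. The CSV content, the column names `Period` and `Value`, and the date format are all invented stand-ins, since the real file’s layout may differ. The standard library is used for parsing so the idea is visible without any dependencies; the pandas equivalent is shown in the closing comment.

```python
import csv
import io
from datetime import datetime

# Hypothetical stand-in for the downloaded file. The last two rows mimic
# the trailing non-data rows that should be dropped.
RAW_CSV = """Period,Value
01/01/2017,1200
01/02/2017,1350
01/03/2017,1280
,
Source: example only,
"""


def load_ds_y(text, date_col="Period", value_col="Value", date_format="%d/%m/%Y"):
    """Return rows shaped as {'ds': datetime, 'y': float}, skipping unparseable rows."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        try:
            ds = datetime.strptime(record[date_col], date_format)
            y = float(record[value_col])
        except (TypeError, ValueError):
            continue  # blank or note rows at the end of the file
        rows.append({"ds": ds, "y": y})
    return rows


rows = load_ds_y(RAW_CSV)
print(rows[:5])

# With pandas installed, the same rows become the frame Prophet expects:
#     import pandas as pd
#     df = pd.DataFrame(rows)   # columns: ds, y
#     df.head()
```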

### Model Fitting

Since we’ve worked with Scikit-learn before, working with Prophet will be a walk in the park for us. This is because the Prophet and Scikit-learn APIs are very similar. We start by creating an instance of the Prophet class and then fit it to our dataset.
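A minimal sketch of this step might look as follows. The helper `fit_prophet` and its `model_factory` parameter are hypothetical conveniences introduced here, not part of Prophet; the only Prophet calls used are the `Prophet()` constructor and its `fit` method. The column check is our own guard and runs before any model is created.

```python
def fit_prophet(df, model_factory=None):
    """Create a model, fit it to df (which needs columns ds and y), and return it."""
    missing = {"ds", "y"} - set(df.columns)
    if missing:
        raise ValueError(f"dataframe is missing required columns: {sorted(missing)}")
    if model_factory is None:
        # Imported lazily so the column check works without Prophet installed.
        from prophet import Prophet  # `fbprophet` in releases before 1.0
        model_factory = Prophet
    model = model_factory()
    model.fit(df)  # same create-then-fit pattern as a Scikit-learn estimator
    return model
```

With a prepared dataframe `df`, calling `m = fit_prophet(df)` returns the fitted model used in the following steps.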

### Making Future Predictions

The next step is to prepare our model to make future predictions. This is achieved using the *Prophet.make_future_dataframe* method and passing the number of days we’d like to predict in the future. We use the *periods* argument to specify this. The resulting dataframe also includes the historical dates, so predictions for those dates can later be compared with the actual values in the *y* column.
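The sketch below imitates, in plain Python and for daily data only, the dates such a future dataframe contains: the historical dates followed by `periods` new days. The helper `future_dates` is a hypothetical illustration rather than Prophet code; the actual call is shown in the closing comment.

```python
from datetime import date, timedelta


def future_dates(history, periods, include_history=True):
    """Return the history dates plus `periods` daily dates after the last one."""
    last = max(history)
    new_dates = [last + timedelta(days=i) for i in range(1, periods + 1)]
    return sorted(history) + new_dates if include_history else new_dates


history = [date(2017, 12, 29), date(2017, 12, 30), date(2017, 12, 31)]
print(future_dates(history, periods=2))

# The Prophet call this imitates, given a fitted model `m`:
#     future = m.make_future_dataframe(periods=365)  # daily frequency by default
#     future.tail()
```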

### Obtaining the Forecasts

We use the *predict* method to make future predictions. This generates a dataframe with a *yhat* column that contains the predictions.

If we check the *head* of our forecast dataframe we’ll notice that it has very many columns. However, we are mainly interested in *ds*, *yhat*, *yhat_lower* and *yhat_upper*. *yhat* is our predicted forecast, *yhat_lower* is the lower bound for our predictions and *yhat_upper* is the upper bound for our predictions.

Let’s proceed to check the tail and head of the forecast’s dataframe.
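One way to sketch this step is shown below. The helper names `summarize_forecast` and `inside_interval` are hypothetical; the Prophet call used is the model’s `predict` method, and the column selection relies on ordinary pandas indexing of the dataframe it returns.

```python
FORECAST_COLUMNS = ["ds", "yhat", "yhat_lower", "yhat_upper"]


def summarize_forecast(model, future):
    """Run predict and keep the date, point forecast, and interval bounds."""
    forecast = model.predict(future)
    return forecast[FORECAST_COLUMNS]


def inside_interval(actual, yhat_lower, yhat_upper):
    """True when an observed value falls between the lower and upper bounds."""
    return yhat_lower <= actual <= yhat_upper
```

With a fitted model `m` and the `future` dataframe from the previous step, `summarize_forecast(m, future).tail()` shows the last few forecast rows and `.head()` shows the first few.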

### Plotting the Forecasts

Prophet has an inbuilt feature that enables us to plot the forecasts we just generated. This is achieved by calling *model.plot()* and passing in our forecasts as the argument. The blue line in the graph represents the predicted values while the black dots represent the data in our dataset.

### Plotting the Forecast Components

The *plot_components* method plots the trend along with the yearly and weekly seasonality of the time series data.
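Both this plot and the forecast plot from the previous section can be produced from the same fitted model and forecast dataframe. A small sketch, where the wrapper function name is hypothetical and the two method calls are the ones named in the text:

```python
def plot_forecast_and_components(model, forecast):
    """Return the forecast figure and the components figure for a fitted model."""
    # Forecast line with its uncertainty band, plus the observed points.
    forecast_fig = model.plot(forecast)
    # One panel per component: the trend and each fitted seasonality.
    components_fig = model.plot_components(forecast)
    return forecast_fig, components_fig
```

Each call returns a Matplotlib figure, so outside a notebook the results can be written to disk with, for example, `forecast_fig.savefig("forecast.png")`.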

### Cross Validation

Next, let’s measure the forecast error using the historical data. We’ll do this by comparing the predicted values with the actual values. To do so, we select cutoff points in the history of the data and fit the model using only the data up to each cutoff point. Afterwards we compare the actual values to the predicted values over the forecast horizon that follows each cutoff. The *cross_validation* function allows us to do this in Prophet. It takes the following parameters:

- *horizon*: the forecast horizon
- *initial*: the size of the initial training period
- *period*: the spacing between cutoff dates

The output of the *cross_validation* function is a dataframe containing *y*, the true values, and *yhat*, the predicted values. We’ll use this dataframe to compute the prediction errors.
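To show how the three parameters interact, here is a simplified sketch of how cutoff dates could be laid out: the last cutoff leaves exactly one horizon of data after it, and earlier cutoffs step back by `period` as long as at least `initial` days of training data remain. The helper `make_cutoffs` and the example dates are assumptions for illustration and do not reproduce every detail of Prophet’s own cutoff logic; the library call is shown in the closing comment.

```python
from datetime import date, timedelta


def make_cutoffs(first, last, initial_days, period_days, horizon_days):
    """Return cutoff dates, oldest first, for simulated historical forecasts."""
    earliest = first + timedelta(days=initial_days)  # minimum training span
    cutoff = last - timedelta(days=horizon_days)     # leave one full horizon
    cutoffs = []
    while cutoff >= earliest:
        cutoffs.append(cutoff)
        cutoff -= timedelta(days=period_days)
    return sorted(cutoffs)


cutoffs = make_cutoffs(date(2015, 1, 1), date(2018, 1, 1),
                       initial_days=365, period_days=180, horizon_days=365)
print(cutoffs)

# The corresponding Prophet call, given a fitted model `m`:
#     from prophet.diagnostics import cross_validation
#     df_cv = cross_validation(m, initial="365 days", period="180 days",
#                              horizon="365 days")
#     df_cv.head()
```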

### Obtaining the Performance Metrics

We use the *performance_metrics* utility to compute the Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE) and the coverage of the `yhat_lower` and `yhat_upper` estimates.
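For reference, the sketch below computes these metrics by hand over a handful of made-up rows. It averages over all rows at once, whereas Prophet’s utility reports the metrics per forecast horizon, so treat it as an illustration of the definitions rather than a replacement. The helper name and the sample values are assumptions.

```python
import math


def forecast_errors(rows):
    """Compute MSE, RMSE, MAE, MAPE, and interval coverage over all rows."""
    n = len(rows)
    errors = [r["y"] - r["yhat"] for r in rows]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when a true value is zero; the sample data avoids that.
    mape = sum(abs(e / r["y"]) for e, r in zip(errors, rows)) / n
    covered = sum(1 for r in rows if r["yhat_lower"] <= r["y"] <= r["yhat_upper"])
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae,
            "mape": mape, "coverage": covered / n}


# Made-up cross-validation rows for illustration.
sample = [
    {"y": 100.0, "yhat": 110.0, "yhat_lower": 90.0, "yhat_upper": 120.0},
    {"y": 200.0, "yhat": 180.0, "yhat_lower": 150.0, "yhat_upper": 190.0},
    {"y": 50.0, "yhat": 50.0, "yhat_lower": 40.0, "yhat_upper": 60.0},
    {"y": 400.0, "yhat": 390.0, "yhat_lower": 380.0, "yhat_upper": 410.0},
]
metrics = forecast_errors(sample)
print(metrics)

# The corresponding Prophet call, given the cross-validation output `df_cv`:
#     from prophet.diagnostics import performance_metrics
#     df_p = performance_metrics(df_cv)
#     df_p.head()
```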

### Visualizing Performance Metrics

The performance metrics can be visualized using the *plot_cross_validation_metric* utility. Let’s visualize the RMSE below.
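The plot shows the error as a function of how far ahead the forecast reaches. As a simplified sketch of the numbers behind it, the snippet below groups made-up cross-validation rows by horizon (the gap between *ds* and the cutoff) and computes the RMSE for each group. Prophet’s own plot aggregates over a rolling window instead, so this helper and its sample data are illustrative assumptions only; the library call is shown in the closing comment.

```python
import math
from collections import defaultdict
from datetime import date


def rmse_by_horizon(rows):
    """Group squared errors by horizon in days and return the RMSE per horizon."""
    squared = defaultdict(list)
    for r in rows:
        horizon_days = (r["ds"] - r["cutoff"]).days
        squared[horizon_days].append((r["y"] - r["yhat"]) ** 2)
    return {h: math.sqrt(sum(v) / len(v)) for h, v in sorted(squared.items())}


# Made-up rows from two cutoffs, each forecasting one and two days ahead.
cv_rows = [
    {"cutoff": date(2017, 1, 1), "ds": date(2017, 1, 2), "y": 10.0, "yhat": 13.0},
    {"cutoff": date(2017, 1, 1), "ds": date(2017, 1, 3), "y": 10.0, "yhat": 16.0},
    {"cutoff": date(2017, 2, 1), "ds": date(2017, 2, 2), "y": 20.0, "yhat": 16.0},
    {"cutoff": date(2017, 2, 1), "ds": date(2017, 2, 3), "y": 20.0, "yhat": 12.0},
]
print(rmse_by_horizon(cv_rows))

# The corresponding Prophet call, given the cross-validation output `df_cv`:
#     from prophet.plot import plot_cross_validation_metric
#     fig = plot_cross_validation_metric(df_cv, metric="rmse")
```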

### Conclusion

As we have seen, Prophet is a powerful and effective tool for time series forecasting. However, as we mentioned earlier, there are other forecasting tools, and the right choice depends on the nature of the dataset. One can always compare these tools and use the one that gives the predictions with the lowest error. Some of these methods include ARIMA, the Holt-Winters method, Holt’s linear trend, simple exponential smoothing and moving averages, among others. You can learn more about Prophet from the official docs or by reading Prophet’s paper.

**Bio: Derrick Mwiti** is a data analyst, a writer, and a mentor. He is driven by delivering great results in every task, and is a mentor at Lapid Leaders Africa.
