By Kyle Weller
Environment Setup Prerequisites
- Install ML Services on SQL Server
In order for R or Python to execute within SQL, you first need the Machine Learning Services feature installed and configured. See this how-to guide.
- Install RevoscalePy via Microsoft's Python Client
In order to send Python execution to SQL from Jupyter Notebooks, you need to use Microsoft's RevoscalePy package. To get RevoscalePy, download and install Microsoft's ML Services Python Client. Documentation Page or Direct Download Link (for Windows).
After downloading, open powershell as an administrator and navigate to the download folder. Start the installation with this command (feel free to customize the install folder):
Be patient while the installation can take a little while. Once installed navigate to the new path you installed in. Let's make an empty folder and open Jupyter Notebooks:
Create a new notebook with the Python 3 interpreter:
To test if everything is setup, import revoscalepy in the first cell and execute. If there are no error messages you are ready to move forward.
Database Setup (Required for this tutorial only)
For the rest of the tutorial you can clone this Jupyter Notebook from Github if you don't want to copy paste all of the code. This database setup is a one time step to ensure you have the same data as this tutorial. You don't need to perform any of these setup steps to use your own data.
- Create a database
Modify the connection string for your server and use pyodbc to create a new database.
- Import Iris sample from SkLearn
Iris is a popular dataset for beginner data science tutorials. It is included by default in sklearn package.
- Use RecoscalePy APIs to create a table and load the Iris data
(You can also do this with pyodbc, sqlalchemy or other packages)
Define a Function to Send to SQL Server
Write any python code you want to execute in SQL. In this example we are creating a scatter matrix on the iris dataset and only returning the bytestream of the .png back to Jupyter Notebooks to render on our client.
Send execution to SQL
Now that we are finally set up, check out how easy sending remote execution really is! First, import revoscalepy. Create a sql_compute_context, and then send the execution of any function seamlessly to SQL Server with RxExec. No raw data had to be transferred from SQL to the Jupyter Notebook. All computation happened within the database and only the image file was returned to be displayed.
While this example is trivial with the Iris dataset, imagine the additional scale, performance, and security capabilities that you now unlocked. You can use any of the latest open source R/Python packages to build Deep Learning and AI applications on large amounts of data in SQL Server. We also offer leading edge, high-performance algorithms in Microsoft's RevoScaleR and RevoScalePy APIs. Using these with the latest innovations in the open source world allows you to bring unparalleled selection, performance, and scale to your applications.
Check out SQL Machine Learning Services Documentation to learn how you can easily deploy your R/Python code with SQL stored procedures making them accessible in your ETL processes or to any application. Train and store machine learning models in your database bringing intelligence to where your data lives.
Basic R and Python Execution in SQL Server: https://aka.ms/BasicMLServicesExecution
Set up Machine Learning Services in SQL Server: https://aka.ms/SetupMLServices
End-to-end tutorial solutions on Github: https://microsoft.github.io/sql-ml-tutorials/
Other YouTube Tutorials:
How to Install SQL Server Machine Learning Services: https://aka.ms/InstallMLServices How to Enable SQL Server Machine Learning Services: https://aka.ms/EnableMLServices Basics of R and Python Execution in SQL: https://aka.ms/ExecuteMLServices
Bio: Kyle Weller is a Program Manager on Microsoft’s Azure Machine Learning team. After working as a Software Developer at a few startups, he joined Microsoft and spent time in Office, and Bing building scalable data instrumentation, platform, and API solutions. He was on the Cortana data & Analytics team driving measurement strategy, researching user behavior patterns, standardizing KPIs, and identifying growth opportunities for the product. He now works on Azure machine learning products focusing on bringing intelligence to where data lives. This includes SQL Server’s new Machine Learning Services that allow R + Python execution in SQL.
- Jupyter Notebook for Beginners: A Tutorial
- SQL Cheat Sheet
- Genetic Algorithm Implementation in Python