By Conor Dewey, Virginia Tech
Side projects serve as a way to apply data science in a less goal-driven environment than you probably experience at work or school. They offer an opportunity to play with data however you want, while learning practical skills at the same time.
Aside from being a lot of fun and a great way to learn new skills, side projects also help your chances when applying for jobs. Recruiters and managers love to see projects that show you’re interested in data in a way that goes beyond classes and employment.
Have you ever wanted to start a new project but you can’t decide what to do? First, you spend a couple hours brainstorming ideas. Then days. Before you know it, weeks have gone by without shipping anything new.
This is extremely common for self-driven projects in all fields; data science is no different. It’s easy to have grand ambitions but much more difficult to execute on them. I’ve found the hardest part of a data science project is getting started and deciding which path to go down.
In this post, my intention is to provide some useful tips and resources to springboard you into your next data science project.
Before we jump into the resources below, there are a couple of quick things worth considering when thinking about data science projects: your goals and your interests.
Data science is an extremely diverse field; this means that it’s virtually impossible to squeeze every concept and tool into one single project. You need to pick and choose which skills you want to focus on developing further. A few relevant examples could include:
- Machine learning and modeling
- Exploratory data analysis
- Metrics and experimentation
- Data visualization and communication
- Data mining and cleaning
Note that while it’s hard to incorporate every concept, you may be able to tie a few of them together. For example, you could scrape data, run an exploratory analysis on it, and then visualize the results in an interesting way.
Basically, if you want to become a more effective machine learning engineer, chances are that you won’t accomplish that by doing a data viz project. Your project should reflect your goals. That way, even if it doesn’t blow up or uncover any groundbreaking insights, you still walk away with a win and a bunch of applied knowledge to show for it.
As we touched on before, side projects should be enjoyable. Whether we realize it or not, we all ask ourselves hundreds of questions a day. Try tuning into these questions a little more than usual for the rest of today. You’ll be surprised by what happens. You may find that you’re a bit more creative, and more interested in certain things, than you thought.
Now apply this to your next data science project. Are you curious about how to classify your morning runs? Want to know how and when Trump tweets the things he does? Interested in the greatest one-hit wonders in sports history?
The possibilities are truly endless. Let your interests, curiosities, and goals drive your next project. Once you’ve checked those boxes, let’s get inspired.
It’s easy to think that we’re on our own, but it turns out that this is rarely the case. There are always others out there with similar interests and goals if you look hard enough. This effect can be incredibly powerful for ideation.
“Nothing is original. Steal from anywhere that resonates with inspiration or fuels your imagination.” — Jim Jarmusch
Find projects that you like or admire, and then put your own twist on them. Use them as jumping-off points to generate new, original work that stands alone. Some of my favorite resources for inspiration are as follows:
Data is Beautiful
I could spend hours just browsing this subreddit of data visualizations. You’ll be intrigued by all of the unique ideas and questions that people think up. There’s also a monthly challenge where a dataset is chosen and users are tasked with visualizing it in the most effective way possible. Sort by best of all time for instant gratification.
Kaggle
I would be remiss if I didn’t mention the poster child of online data science. There are a couple of ways to use Kaggle effectively for inspiration. First, you can look at the trending datasets and think of interesting ways to leverage the information. If you’re more interested in machine learning and worked examples, the kernels feature has gotten better and better over time.
The Pudding
Visual essays really are an emerging form of journalism, and The Pudding embodies this movement like no other. The team uses original datasets, primary research, and interactivity to explore tons of interesting topics.
FiveThirtyEight
A classic, but still good to this day. I mean come on, Nate Silver is the man. The data-driven blog touches on everything from politics to sports to culture. Not to mention, they recently revamped their data export page, which is much improved.
Towards Data Science
Lastly, I’ve got to give a shoutout to the TDS Team for bringing together this community of smart people with a passion for achieving things and helping others grow in data science. Browsing recent stories will bring you more than a few interesting project ideas on any given day.
Side projects have not only helped me out immensely throughout my development, but they’re also generally a lot of fun. Recently, there’s been more and more awesome content coming out on data science portfolios. If interested, I highly recommend checking out the following links:
- Advice on Building Data Science Portfolios
- How to Build a Data Science Portfolio
- How to Build a Compelling Data Science Portfolio & Resume
The hardest part of anything is getting started. I hope that the tips and resources above help you on your path to completing and shipping your next data science project. I’ll be on the lookout.
Bio: Conor Dewey (Medium) is a data scientist and writer currently studying at Virginia Tech.
Original. Reposted with permission.