Daniel D. Gutierrez – Managing Editor, insideBIGDATA
insideBIGDATA: Tell us about your background. Did it naturally lead you to data science? If not, what gave you the push to enter the field?
David Steinmetz: I studied materials science, which is a blend of math, physics, and chemistry. I noticed that programming was integral to solving many of the materials science problems, which also tended to be steeped in sophisticated math. I used a genetic algorithm during my undergrad and a particle swarm algorithm during my PhD. After I graduated, I joined a management consultancy and learned about business and data analysis at companies. That background coupled with a genuine interest in computers naturally led to data science.
insideBIGDATA: What were things that made you consider a bootcamp? Were there other things you were considering at the time?
David Steinmetz: I took online data science courses to get my feet wet. Upon realizing that I really enjoyed the work, I looked at how to learn as much material as possible as quickly as possible. A friend mentioned the bootcamps, and I quickly decided it was the right move for me. I had been considering jobs in materials science, but data science drew me in.
insideBIGDATA: What skills were most useful in helping you land your position at Capital One?
David Steinmetz: The skills that were most useful were practical experience with a number of machine learning algorithms, project work shown on GitHub, and a working knowledge of data structures and algorithms. The question an interviewer is really asking is “Can this person do the job?” The more project work you have on your public profile, the less of a risk it will be to hire you, because the hiring manager can already see your abilities.
insideBIGDATA: What are the tools you find most relevant in your position? What skills did you think would be most important?
David Steinmetz: Python, AWS, GitHub, Scala, and Spark are the tools that are most relevant to my current position and project. I use Pandas and Spark Datasets often, and GitHub always. I thought R would be used more, but it’s not, because it’s harder to use R in production. I also thought I would rely more on the standard machine learning libraries, but we don’t hesitate to implement an algorithm that doesn’t exist in Scikit-Learn or MLlib if it suits our purposes.
insideBIGDATA: Can you describe your day to day job as a Data Scientist?
David Steinmetz: Often I spend time reading original research papers and books in an attempt to find state-of-the-art approaches to the problem I am trying to solve. The rest of the time is spent coding, visualizing data, bouncing ideas off colleagues, and creating new products to meet our clients’ needs. I use cloud services and open source software extensively, allowing me to iterate quickly and try new approaches.
insideBIGDATA: What do you find most enjoyable about your job?
David Steinmetz: It’s varied, mentally challenging, and at the cutting edge of implemented machine learning. The people I work with are amazingly fascinating, and it’s motivating and an honor to be able to work with them.
insideBIGDATA: What are skills your team looks for in a Data Scientist?
David Steinmetz: We look for someone who is curious, passionate, and well-rounded in the sense that they have experience both with data engineering and distributed systems as well as data science and machine learning. Since we work so much in the cloud, knowledge of cloud services is a plus. A lot of work is done in Scala and Java, so knowledge of one of those two also helps.
insideBIGDATA: What advice do you have for people looking to enter the field?
David Steinmetz: There is so much to learn in the field, so pick one thing and learn it well before moving on to another. Learning many things superficially will backfire once you get into the interview or onto the job. There, deep understanding and the capacity for further learning are necessary. A bootcamp is a great way to get both the deep understanding and the breadth of material necessary to get you started in the field. Whatever you do, get advice on what to learn; otherwise, what you are learning might not be best suited to your situation.