What Kind Of Data Scientist Do You Want To Be?

“Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician.”

Josh Wills, Director of Data Engineering at Slack

During an introductory What Is Data Science? lecture at the start of my course, one of the questions that was raised was ‘What are the skills that a data scientist needs?’ (It was also a question posed by Mark Hunter and Sandy Scott of Sainsbury’s Bank at the Data Scientist 2.0 event two weeks ago.)

What skills do data scientists need?

The potential list is huge, comprising skills across maths/statistics, programming, databases, domain expertise, soft skills, communication and visualisation.

My first reaction was the thought, ‘How am I supposed to get good at all that stuff in the next twelve months?’

My second thought was, ‘How can anyone be good at all that, no matter how long they’ve got to learn it?’

Of course, no one can.

It’s all about the skills in the team rather than the individual. (A brief aside: what percentage of data scientists actually work in a team of other data scientists, and how many work alone?)

This got me thinking:

  • how do my existing skills fit with the skills required of a data scientist?
  • what skills do I need to develop more?
  • what kind of data scientist do I think I could be?
  • what would be a good fit for my skills, values and interests?
  • what kind of data scientist do I want to be?

What kind of data scientist might I be?

Harlan Harris, one of the organisers of Data Community DC in Washington, D.C., surveyed a group of data scientists about how they viewed their own skills, and described the results in ‘Analyzing the Analyzers’.

The chart below shows how the data scientists self-identify. This chart was shown to us during our lecture, and of course my first thought was ‘Which one am I?’

Skills and Self-ID Top Factors

The one I feel is the best fit for me is that of Data Creative. Of course, I’ve got the next year to figure out what’s the best fit for me, but at the moment this is my best guess.

Why Data Creative?

Well, first off, I think of myself as a creative person. I’m interested in solving problems, specifically problems that impact on people’s lives (rather than esoteric, abstract problems - I don’t give a hoot about those). I’m the kind of person that asks, ‘How could we make this better?’ I want to use data science to help people improve their lives. I want to use data as a force for good. That takes creativity.

Secondly, the role of Data Creative (as described here at least) is a balanced one, requiring a range of skills, but where no single skill dominates.

Thirdly, I like the amount of programming that’s shown within this category. I enjoy coding. It’s a great tool. However, I’m never going to be the most techie guy in the room, nor would I want to be. And I certainly wouldn’t want most of my work-life to be spent coding. That’s just not for me. That said, one skill-set I have that I believe could be useful is that of software engineering, and managing software engineering projects. I’ve managed the delivery of software products before and could foresee myself playing a useful role in delivering data products.

There are, however, a couple of aspects of the Data Creative I’d change to fit my personal skills. The ‘Business’ skills section I would make bigger. I can relate well to business people and can speak in their language. From talking to some people in data science recruitment, this appears to be a skill that’s not always evident in data science candidates.

Additionally, two vital skills for data scientists aren’t shown in the chart at all, namely communication and visualisation. I believe I’m a strong communicator. Visualisation (as far as data science goes) is the art of making the complex understandable. I’m sure I’ve got lots to learn here, but communicating with stakeholders/clients/colleagues (including communicating results) is fundamental to the art and science of creating actionable insights from data, and a skill I look forward to developing further.

What kind of data scientist are you? What kind do you want to be?