The job of a Data Scientist

First, because Data Science overlaps so many disciplines, it has no single definition, and everyone now has their own idea of what Data Science is and what it is not.

So what is Data Science, really?

Data science is a vast, multi-disciplinary field that applies computer science, software engineering, mathematics, statistics, programming, economics, and business knowledge to extract insights from data.

Now you know why you should run from anyone claiming to teach you Data Science in 6 hours. You may learn the concepts in 6 hours, but that doesn't mean you understand them, the methodologies used, or the approaches applied, and it doesn't mean you have developed the thinking capacity, curiosity, diversity, etc. needed to be a Data Scientist.

And if I have learned one thing, it is that Data Science is not what many people define it to be.

  • Now back to today's subject!

Because Data Science means different things to different people and organizations, the function of a Data Scientist in a company is defined by what that company understands Data Science to be. One thing remains constant, though: the job you will be doing stays within the scope of Data Science.

Remember our definition: Data Science is a vast, multi-disciplinary field. If you acquire skills across these fields, you're ready for a Data Science job.

Skills required from a Data Scientist

  • Machine Learning
  • Data Analysis
  • Data Engineering
  • Domain Knowledge
  • Business Intelligence
  • Mathematics
  • Statistics

  • Machine Learning

Machine learning (ML) is the study of computer algorithms that improve automatically through experience. It is seen as a subset of artificial intelligence.

As a Data Scientist, you must have more than average knowledge of machine learning algorithms. It is not enough to know how to import an algorithm from Sklearn, fit, and predict.

You're expected to know the theory behind these algorithms. This will not only help you use them efficiently but also influence which Machine Learning algorithm you choose for the problem you are trying to solve.

And being able to fine-tune the algorithm to meet the requirements of the problem statement you have at hand is what the company expects from you.
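To give a feel for what sits underneath a fit-and-predict call, here is a minimal sketch of simple linear regression trained by gradient descent in plain Python. It is not how Sklearn implements anything; the function names `fit` and `predict`, the learning rate, the number of epochs, and the toy data are all assumptions made for illustration.

```python
# Minimal sketch: simple linear regression (y = w * x + b) trained by
# gradient descent. Names, hyperparameters, and data are hypothetical.

def fit(xs, ys, learning_rate=0.01, epochs=5000):
    """Estimate w and b by minimizing mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) w.r.t. w and b.
        grad_w = (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Step against the gradient; learning_rate is the knob you tune.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict(model, x):
    """Apply the fitted line to a new input."""
    w, b = model
    return w * x + b


# Toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
model = fit(xs, ys)
```

Knowing that the learning rate and epoch count control how (and whether) the loop converges is the kind of understanding that makes fine-tuning a deliberate choice rather than guesswork.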

  • Data Analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

A high level of skill in Data Analysis is required of you! What do you know so far?
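As a small illustration of the inspect, cleanse, transform, and summarize steps in that definition, here is a sketch using only Python's standard library. The records and field names are made up for the example, and real work would usually involve far larger datasets and dedicated tools.

```python
from statistics import mean, median

# Hypothetical raw records: ages arrive as text, some missing or invalid.
raw = [
    {"name": "Ada", "age": "36"},
    {"name": "Bob", "age": ""},
    {"name": "Cy", "age": "41"},
    {"name": "Di", "age": "n/a"},
    {"name": "Ed", "age": "28"},
]


def is_valid(record):
    """A record is usable only if its age is a string of digits."""
    return record["age"].isdigit()


# Inspect: how many records are unusable?
invalid_count = sum(1 for r in raw if not is_valid(r))

# Cleanse and transform: drop unusable rows, convert age to an integer.
clean = [{"name": r["name"], "age": int(r["age"])} for r in raw if is_valid(r)]

# Summarize: simple descriptive statistics to support a decision.
ages = [r["age"] for r in clean]
summary = {"n": len(clean), "mean_age": mean(ages), "median_age": median(ages)}
```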

  • Data Engineering

Information engineering, also known as Information technology engineering, information engineering methodology, or data engineering, is a software engineering approach to designing and developing information systems.

How strong is your Data Engineering skill? How skilled are you in using SQL? If you are employed as a Data Scientist in a company, much of the data you will be working with will most likely be stored in an RDBMS.
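If you want a quick check of your SQL, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table and its rows are hypothetical, and a company database would of course be a full RDBMS rather than an in-memory one.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("ada", 120.0), ("ada", 80.0), ("bob", 50.0)],
)

# Total spend per customer, highest first.
rows = conn.execute(
    "SELECT customer, SUM(amount) AS total "
    "FROM orders GROUP BY customer ORDER BY total DESC"
).fetchall()
conn.close()
```

If GROUP BY, aggregates, and ORDER BY feel comfortable, joins and window functions are a natural next step.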

  • Domain Knowledge

Domain knowledge is knowledge of a specific, specialized discipline or field, in contrast to general knowledge, or domain-independent knowledge.

How well do you understand the business operations of your organization? Domain Knowledge is required for you to work efficiently with the company. You will need to understand the workflow from top to bottom, from product/service to customer.

  • Business Intelligence

Business intelligence comprises the strategies and technologies used by enterprises for the data analysis of business information. BI technologies provide historical, current, and predictive views of business operations.

Every company/business has a powering algorithm that generates all its revenue. This could be the way its service/product is designed, its approach to customers, its service/product process, etc. It is your job as a Data Scientist to have a high level of Business Intelligence so that you understand these processes and how to refine them to generate more revenue. Of course, this is why you were employed: to add value that increases revenue or cuts costs.

  • Mathematics

Mathematics includes the study of such topics as quantity, structure, space, and change. It has no generally accepted definition. Mathematicians seek and use patterns to formulate new conjectures; they resolve the truth or falsity of such by mathematical proof.

Yes, of course, you need to know maths to be able to understand the techniques and maths behind all the machine learning algorithms you have been using.

  • Statistics

Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied.

This is highly needed. Your statistical skills will help you not just in analyzing data, but even more on the business side.

What employers expect from you

  • ROI (Return on Investment)
  • Translate business problem --> Data Science problem
  • Identify problems
  • Create pipelines
  • Train
  • Scholar/Researcher

  • ROI (Return on Investment)

Return on investment is a ratio between net profit and the cost of investment. A high ROI means the investment's gains compare favorably to its cost. As a performance measure, ROI is used to evaluate the efficiency of an investment or to compare the efficiencies of several different investments.

Always remember that hiring you is an investment. You were hired for a reason: either to solve identified problems or to identify problems to solve.
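The ROI ratio described above is easy to compute. Here is a small sketch, where the function name `roi` and the figures are hypothetical and "gain" is assumed to mean the total amount the investment returned.

```python
def roi(gain, cost):
    """Return on investment as a ratio: net profit divided by cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    net_profit = gain - cost
    return net_profit / cost


# Hypothetical figures: a project that cost 50,000 and returned 65,000.
project_roi = roi(gain=65_000, cost=50_000)
print(f"ROI: {project_roi:.0%}")
```

Thinking of your own salary and tooling as the cost in that ratio is a useful way to judge which problems are worth your time.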

  • Translate business problem --> Data Science problem

To translate a business problem into an AI and data science solution, you need to understand the problem, the data analysis goals and metrics, and the mapping to one or more business patterns.

It is your job as a Data Scientist to translate business problems into Data Science problems.

  • Identify problems

Some companies will just hire you to find problems in the company, or find issues with their product or services, and provide a solution to them.

  • Create pipelines

Data pipelines, by consolidating data from all your disparate sources into one common destination, enable quick data analysis for business insights. They also ensure consistent data quality, which is absolutely crucial for reliable business insights.

Don't be surprised if, after being employed as a Data Scientist, you find out that the company does not have any data to work with. No source of data and nothing at all! In that case, building the pipelines falls to you.
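To make the idea of consolidating disparate sources into one destination concrete, here is a minimal extract-transform-load sketch using only the standard library. The two sources, their field names, and the `customers` table are all hypothetical, and the data is inlined as strings so the example is self-contained. A production pipeline would add scheduling, logging, and error handling.

```python
import csv
import io
import json
import sqlite3

# Two hypothetical sources in different formats. In practice these would be
# files, APIs, or other databases.
CSV_SOURCE = "email,country\nADA@example.com ,NG\nbob@example.com,KE\n"
JSON_SOURCE = (
    '[{"Email": "cy@example.com", "Country": "gh"},'
    ' {"Email": "ada@example.com", "Country": "NG"}]'
)


def extract():
    """Yield raw (email, country) pairs from every source."""
    for row in csv.DictReader(io.StringIO(CSV_SOURCE)):
        yield row["email"], row["country"]
    for item in json.loads(JSON_SOURCE):
        yield item["Email"], item["Country"]


def transform(records):
    """Normalize formatting so the same customer looks the same everywhere."""
    for email, country in records:
        yield email.strip().lower(), country.strip().upper()


def load(records, conn):
    """Write into one destination table; the primary key drops duplicates."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(email TEXT PRIMARY KEY, country TEXT)"
    )
    conn.executemany("INSERT OR IGNORE INTO customers VALUES (?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)
customers = conn.execute(
    "SELECT email, country FROM customers ORDER BY email"
).fetchall()
```

The transform step is where the consistent data quality comes from: the same customer appearing in both sources ends up as a single clean row.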

  • Train

Sometimes you will be asked to give lectures on what Data Science is and how it can benefit the company; the request might come from stakeholders. Depending on how passionate you are about your job, you might also train some of the staff, especially those whose jobs require them to interact with data, teaching them how to format, clean, label, and store data properly.

  • Scholar/Researcher

This is the heart of it all!!! Boom!

Data Scientists are scholars. If you don't like reading and researching, you should probably quit now, because in the business environment you will meet a lot of unfamiliar concepts, theories, and hypotheses that require a huge amount of reading and research to understand. Imagine company X employs you and, after onboarding, during one of the standup meetings, your boss asks you how the company is doing competition-wise.

Evaluating your skill level

  • Competitions
  • Projects
  • Apply for jobs

[Image: freesnippingtool.com_capture_20200922104406.png]

If you were asked which of those definitions best suits a data scientist, what would your answer be? Let me leave that to you, lol.

So how do you know you're qualified enough to apply for that job, enroll in that hackathon, or sign up for those competitions? The honest answer is, you will never know until you try!

So how do you evaluate your skill and experience as a data scientist?

  • Competitions

Sign up for competitions, not to win or compete but to learn, gain experience, and network with other data scientists out there.

You can find a lot of competitions and hackathons on platforms such as Kaggle and Zindi, where you compete for a prize or a spot on the leaderboard.

  • Projects

The time is now. Start building personal projects around your interests. It could be analytics, machine learning, or research work; just do it and put it out there.

  • Apply for jobs

Start applying for jobs; don't wait until you know it all. Most of the time, once employed, you will find that you're actually using only 1/2% of your entire skill or experience. So start applying to learn what the hiring process, the tests, and the interviews are like.

Data Science has lost its sexiness. Data Science is hard, stressful, and can be overwhelming! Sometimes you will feel like you aren't getting there, but don't give up. And always remember: at the point where things get difficult, you're close to making the curve, and once you do, everything becomes easier.

See you on the other side!

Resources

Cover Photo by Ketut Subiyanto from Pexels

Research

  1. en.wikipedia.org
  2. ibm.com
  3. xplenty.com