What is Data Science?

Data science continues to evolve as one of the most promising and in-demand career paths for skilled professionals. Today, successful data professionals understand that they must advance past the traditional skills of analyzing large amounts of data, data mining, and programming skills. In order to uncover useful intelligence for their organizations, data scientists must master the full spectrum of the data science life cycle and possess a level of flexibility and understanding to maximize returns at each phase of the process.

The Data Science Life Cycle

The image represents the five stages of the data science life cycle: Capture, (data acquisition, data entry, signal reception, data extraction); Maintain (data warehousing, data cleansing, data staging, data processing, data architecture); Process (data mining, clustering/classification, data modeling, data summarization); Analyze (exploratory/confirmatory, predictive analysis, regression, text mining, qualitative analysis); Communicate (data reporting, data visualization, business intelligence, decision making).

The term “data scientist” was coined as recently as 2008 when companies realized the need for data professionals who are skilled in organizing and analyzing massive amounts of data. 1 In a 2009 McKinsey&Company article, Hal Varian, Google's chief economist and UC Berkeley professor of information sciences, business, and economics, predicted the importance of adapting to technology’s influence and reconfiguration of different industries. 2

“The ability to take data — to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it — that’s going to be a hugely important skill in the next decades.”
- Hal Varian, chief economist at Google and UC Berkeley professor of information sciences, business, and economics 3

Effective data scientists are able to identify relevant questions, collect data from a multitude of different data sources, organize the information, translate results into solutions, and communicate their findings in a way that positively affects business decisions. These skills are required in almost all industries, causing skilled data scientists to be increasingly valuable to companies.

What Does a Data Scientist Do?

In the past decade, data scientists have become necessary assets and are present in almost all organizations. These professionals are well-rounded, data-driven individuals with high-level technical skills who are capable of building complex quantitative algorithms to organize and synthesize large amounts of information used to answer questions and drive strategy in their organization. This is coupled with the experience in communication and leadership needed to deliver tangible results to various stakeholders across an organization or business.

Data scientists need to be curious and result-oriented, with exceptional industry-specific knowledge and communication skills that allow them to explain highly technical results to their non-technical counterparts. They possess a strong quantitative background in statistics and linear algebra as well as programming knowledge with focuses in data warehousing, mining, and modeling to build and analyze algorithms.

They must also be able to utilize key technical tools and skills, including:

R

Python

Apache Hadoop

MapReduce

Apache Spark

NoSQL databases

Cloud computing

D3

Apache Pig

Tableau

iPython notebooks

GitHub

Why Become a Data Scientist?

Glassdoor ranked data scientist as the #1 Best Job in America in 2018 for the third year in a row. 4 As increasing amounts of data become more accessible, large tech companies are no longer the only ones in need of data scientists. The growing demand for data science professionals across industries, big and small, is being challenged by a shortage of qualified candidates available to fill the open positions.

The need for data scientists shows no sign of slowing down in the coming years. LinkedIn listed data scientist as one of the most promising jobs in 2017 and 2018, along with multiple data-science-related skills as the most in-demand by companies. 5

The statistics listed below represent the significant and growing demand for data scientists.

28%

Demand Increase by 2020

4,524

Number of Job Openings

$120,931

Average Base Salary

#1

Best Job in America 2016, 2017, 2018

Sources: Glassdoor and Forbes

Where Do You Fit in Data Science?

Data is everywhere and expansive. A variety of terms related to mining, cleaning, analyzing, and interpreting data are often used interchangeably, but they can actually involve different skill sets and complexity of data.

Data Scientist

Data scientists examine which questions need answering and where to find the related data. They have business acumen and analytical skills as well as the ability to mine, clean, and present data. Businesses use data scientists to source, manage, and analyze large amounts of unstructured data. Results are then synthesized and communicated to key stakeholders to drive strategic decision-making in the organization.

Skills needed: Programming skills (SAS, R, Python), statistical and mathematical skills, storytelling and data visualization, Hadoop, SQL, machine learning

Data Analyst

Data analysts bridge the gap between data scientists and business analysts. They are provided with the questions that need answering from an organization and then organize and analyze data to find results that align with high-level business strategy. Data analysts are responsible for translating technical analysis to qualitative action items and effectively communicating their findings to diverse stakeholders.

Skills needed: Programming skills (SAS, R, Python), statistical and mathematical skills, data wrangling, data visualization

Data Engineer

Data engineers manage exponential amounts of rapidly changing data. They focus on the development, deployment, management, and optimization of data pipelines and infrastructure to transform and transfer data to data scientists for querying.

Skills needed: Programming languages (Java, Scala), NoSQL databases (MongoDB, Cassandra DB), frameworks (Apache Hadoop)

Data Science Career Outlook and Salary Opportunities

Data science professionals are rewarded for their highly technical skill set with competitive salaries and great job opportunities at big and small companies in most industries. With over 4,500 open positions listed on Glassdoor, data science professionals with the appropriate experience and education have the opportunity to make their mark in some of the most forward-thinking companies in the world.6

Below are the average base salaries for the following positions: 7

Data analyst: $65,470


Data scientist: $120,931


Senior data scientist: $141,257


Data engineer: $137,776

Gaining specialized skills within the data science field can distinguish data scientists even further. For example, machine learning experts utilize high-level programming skills to create algorithms that continuously gather data and automatically adjust their function to be more effective.

Request More Information

Learn how a Master of Information and Data Science from UC Berkeley can prepare you with the skills needed for a successful career in data science.

1   hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century.   Accessed April 2018. Return to footnote reference
2   www.mckinsey.com/industries/high-tech/our-insights/hal-varian-on-how-the-web-challenges-managers.   Accessed July 2018. Return to footnote reference
3   www.mckinsey.com/industries/high-tech/our-insights/hal-varian-on-how-the-web-challenges-managers.   Accessed July 2018.Return to footnote reference
4   www.glassdoor.com/List/Best-Jobs-in-America-LST_KQ0,20.htm.   Accessed April 2018.Return to footnote reference
5  blog.linkedin.com/2018/january/11/linkedin-data-reveals-the-most-promising-jobs-and-in-demand-skills-2018.   Accessed April 2018.Return to footnote reference
6  www.glassdoor.com/List/Best-Jobs-in-America-LST_KQ0,20.htm.   Accessed April 2018.Return to footnote reference
7  www.glassdoor.com/Salaries/index.htm.   Accessed April 2018.Return to footnote reference