The Data Analysis Process

Hello, there.

For the better part of this year, I have been spending hours trying to get the hang of data analysis. There have been countless scholarships on offer, and I managed to secure two of them. So far I have completed one and earned a certificate that is currently decorating my resume.

As an intermediate-ish data analyst, I finally feel like I have the confidence to share something valuable with aspiring data analysts like me. So today we will start with the basics: the data analysis process.

Data Analysis

I have organized the data analysis process into five steps: Question, Wrangle, Explore, Draw Conclusions and Communicate. Source

Below is a review of the key points, but feel free to add more or expound on them in the comments section. We will practice each step in upcoming posts in a project-based format, and you will get the whole process down in no time.

Step 1: Ask Questions

Either you’re given data and ask questions based on it, or you ask questions first and gather data to answer them later. In both cases, great questions help you focus on the relevant parts of your data and direct your analysis towards meaningful insights.

Data acquisition can happen in a number of ways:

Step 2: Wrangle Data

You get the data you need in a form you can work with in three steps: gather, assess, clean. You gather the data you need to answer your questions, assess your data to identify any problems in your data’s quality or structure, and clean your data by modifying, replacing or removing data to ensure that your dataset is of the highest quality and as well-structured as possible.
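To make gather, assess and clean a little more concrete, here is a minimal sketch using only Python's standard library. The CSV text, the column names and the helper functions `gather`, `assess` and `clean` are all made up for illustration; in a real project you would most likely load a file and use a library such as pandas instead.

```python
import csv
import io

# Hypothetical raw data, invented for illustration only.
RAW_CSV = """name,age,city
Alice,34,Nairobi
Bob,,Mombasa
alice,34,Nairobi
Carol,29, Kisumu
Alice,34,Nairobi
"""


def gather(text):
    """Gather: read CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def assess(rows):
    """Assess: count empty values and exact duplicate rows."""
    missing = sum(1 for row in rows for value in row.values() if value.strip() == "")
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(row.items())
        if key in seen:
            duplicates += 1
        seen.add(key)
    return {"rows": len(rows), "missing": missing, "duplicates": duplicates}


def clean(rows):
    """Clean: trim whitespace, fix name casing, drop incomplete and duplicate rows."""
    cleaned = []
    seen = set()
    for row in rows:
        fixed = {column: value.strip() for column, value in row.items()}
        fixed["name"] = fixed["name"].title()
        if any(value == "" for value in fixed.values()):
            continue  # drop rows with missing values
        key = tuple(fixed.items())
        if key in seen:
            continue  # drop rows that duplicate an earlier cleaned row
        seen.add(key)
        fixed["age"] = int(fixed["age"])  # store age as a number, not text
        cleaned.append(fixed)
    return cleaned


rows = gather(RAW_CSV)
report = assess(rows)
tidy = clean(rows)
print(report)
print(tidy)
```

Dropping incomplete rows is only one possible cleaning choice. Depending on your question, replacing the missing value may be the better option.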

Step 3: Perform EDA (Exploratory Data Analysis)

You explore and then augment your data to maximize the potential of your analyses, visualizations and models. Exploring involves finding patterns in your data, visualizing relationships in your data and building intuition about what you’re working with. After exploring, you can do things like remove outliers and create better features from your data, also known as feature engineering.
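As a rough illustration of removing outliers and engineering a feature, the sketch below flags outliers with the common 1.5 × IQR rule and then derives a body mass index column. The records, the field names and the helper `iqr_bounds` are hypothetical, and the IQR rule is just one of several ways to define an outlier.

```python
import statistics

# Hypothetical records, invented for illustration only.
people = [
    {"height_m": 1.60, "weight_kg": 55.0},
    {"height_m": 1.75, "weight_kg": 70.0},
    {"height_m": 1.80, "weight_kg": 82.0},
    {"height_m": 1.68, "weight_kg": 64.0},
    {"height_m": 1.72, "weight_kg": 68.0},
    {"height_m": 1.70, "weight_kg": 400.0},  # likely a data-entry error
]


def iqr_bounds(values, k=1.5):
    """Return (low, high) limits using the k * IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Remove outliers: keep only rows whose weight falls inside the limits.
low, high = iqr_bounds([p["weight_kg"] for p in people])
kept = [p for p in people if low <= p["weight_kg"] <= high]

# Feature engineering: derive a new column from two existing ones.
featured = [
    {**p, "bmi": round(p["weight_kg"] / p["height_m"] ** 2, 1)}
    for p in kept
]

for row in featured:
    print(row)
```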

Step 4: Draw conclusions (or even make predictions)

This step is typically approached with machine learning or inferential statistics, both of which are more advanced. When you are just starting out, you will mostly focus on drawing conclusions with descriptive statistics.

Descriptive statistics focuses on describing and summarizing the characteristics of a dataset (a population or sample).
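For example, a handful of summary numbers already describes a small dataset quite well. This is a minimal sketch using Python's built-in `statistics` module; the exam scores are invented for illustration.

```python
import statistics

# Hypothetical exam scores, invented for illustration only.
scores = [72, 85, 90, 66, 85, 78, 94]

summary = {
    "count": len(scores),
    "mean": statistics.mean(scores),      # average value
    "median": statistics.median(scores),  # middle value when sorted
    "mode": statistics.mode(scores),      # most frequent value
    "stdev": statistics.stdev(scores),    # sample standard deviation (spread)
    "min": min(scores),
    "max": max(scores),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```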

Inferential statistics focuses on making predictions or generalizations about a larger dataset, based on a sample of those data. Source.
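One simple way to generalize from a sample is to estimate the population mean together with a confidence interval. The sketch below assumes a made-up sample of delivery times and uses a normal approximation for the 95% interval. That approximation is rough for a sample this small (a t-distribution would give a wider interval), so treat it as an illustration of the idea rather than a recommended method.

```python
import math
import statistics

# Hypothetical sample of delivery times in minutes, invented for illustration only.
sample = [31, 28, 35, 30, 33, 29, 32, 34, 27, 31]

n = len(sample)
mean = statistics.mean(sample)

# Standard error of the mean: sample standard deviation divided by sqrt(n).
sem = statistics.stdev(sample) / math.sqrt(n)

# z-value that leaves 2.5% in each tail of a standard normal distribution.
z = statistics.NormalDist().inv_cdf(0.975)

# Approximate 95% confidence interval for the population mean.
low, high = mean - z * sem, mean + z * sem

print(f"sample mean: {mean}")
print(f"approximate 95% interval: {low:.1f} to {high:.1f}")
```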

Step 5: Communicate your results

You often need to justify and convey the meaning of the insights you’ve found. Or, if your end goal is to build a system, you usually need to share what you’ve built, explain how you reached your design decisions and report how well it performs. There are many ways to communicate your results: reports, slide decks, blog posts, emails, presentations or even conversations. Whichever you choose, data visualization is one of the most valuable tools for getting your findings across.

I hope this short guide helps you on your data analysis journey. Follow me on my journey as I do projects and share what I learn along the way.


Data Analyst | Content Creator
