Categorical data have values that are described by words rather than numbers.

When you collect data for research, it is important to know what form the data takes so that you can interpret and analyze it effectively. In a research study, there are two main types of data: categorical data and numerical data.

It is important to identify both of them based on their differences and similarities. In this article, we are going to focus on categorical data and numerical data, what they are and how they differ from each other. Let’s get started.

What is categorical data?

Categorical data is a data type that is stored and identified by the names or labels given to its values. Values are matched on their shared characteristics and then grouped accordingly.

Data collected in categorical form is also known as qualitative data. Each data point is grouped and labelled according to its matching qualities and falls under only one category, which makes the categories mutually exclusive.

Example: sexual orientation is categorical data, as a person can be heterosexual, homosexual, bisexual, etc., and people are grouped together by the characteristic they share.

There are two subtypes of categorical data namely: Nominal data and Ordinal data.

  • Nominal data – this is also called naming data. This is a type that names or labels the data and its characteristics are similar to a noun. Example: person’s name, gender, school name.

Questions to gather nominal data look like:

    • What is your name?
    • What is your pet’s name?
    • What is your gender?
  • Ordinal data – this includes data that is ranked, ordered or placed on a rating scale. You can count and order ordinal data, but you cannot measure the distance between its values.

Example: seminar attendees are asked to rate their seminar experience on a scale of 1-5. Each number is paired with a label that describes their satisfaction, such as “very good”, “good”, “average”, “bad”, and “very bad”.
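The difference between the two subtypes shows up in what you can compute. The sketch below is only an illustration: the pet answers, the ratings and the 5-point `SCALE` are made-up sample values, not data from any survey. Nominal labels can be counted, while ordinal labels can also be sorted, which makes a median meaningful.

```python
from collections import Counter
from statistics import median_low

# Nominal data: labels with no inherent order, so we can only count them.
pets = ["dog", "cat", "dog", "fish", "cat", "dog"]  # hypothetical answers
pet_counts = Counter(pets)
most_common_pet = pet_counts.most_common(1)[0][0]  # the mode

# Ordinal data: labels with a meaningful order (hypothetical 5-point scale).
SCALE = ["very bad", "bad", "average", "good", "very good"]
RANK = {label: position for position, label in enumerate(SCALE, start=1)}

ratings = ["good", "very good", "average", "good", "bad"]  # hypothetical
ranks = [RANK[r] for r in ratings]

# The order supports a median, but the gaps between ranks are not
# guaranteed to be equal, so a mean would be harder to justify.
median_rating = SCALE[median_low(ranks) - 1]

print(most_common_pet)  # dog
print(median_rating)    # good
```

`median_low` is used so that the result is always one of the observed ranks and can be mapped back to a label.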

What is numerical data?

Numerical data is data in the form of numbers rather than words or descriptions. Often referred to as quantitative data, it differs from other data that merely contains digits (such as phone numbers or postal codes) in that arithmetic and statistical calculations on it are meaningful.

It involves no natural-language description and is used to measure quantities such as a person’s height, age, IQ, etc.

It also has two subtypes known as Discrete data and Continuous data. 

  • Discrete data – this represents countable items. Its values can be written out as a list, which can be finite or infinite.

Discrete data takes countable values like 1, 2, 3, 4, 5, and so on. In the infinite case, the list never ends.

Example: counting the sugar cubes in a jar gives a finite count. The set of values a count can take (0, 1, 2, 3, and so on) is countably infinite, because the list can always be continued.

  • Continuous data – this represents measurements, which can take any value within an interval (range) on the number line. Hence, it is measured rather than counted.

Example: a student’s exam score is a continuous percentage. Scores can then be grouped into ranges: 80%-100% is a distinction, 60% to just under 80% is first class, and below 60% is second class.
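The grading example can be sketched in a few lines. This is a minimal illustration, assuming that a score of exactly 80% counts as a distinction and exactly 60% as first class; the function name `grade_band` and the sample scores are hypothetical. Note that the input is continuous, while the output is an ordered category.

```python
def grade_band(score_percent: float) -> str:
    """Map a continuous percentage score to an ordered grade band."""
    if not 0 <= score_percent <= 100:
        raise ValueError("score must be between 0 and 100")
    if score_percent >= 80:   # assumption: 80 itself is a distinction
        return "distinction"
    if score_percent >= 60:   # assumption: 60 itself is first class
        return "first class"
    return "second class"


scores = [91.5, 80.0, 79.9, 60.0, 42.25]  # hypothetical exam scores
bands = [grade_band(s) for s in scores]
print(bands)
# ['distinction', 'distinction', 'first class', 'first class', 'second class']
```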

Continuous data is further divided into two categories: Interval and Ratio.

  • Interval data – interval data is measured along a scale on which each point is at an equal distance from the next, but which has no true zero. Its values can meaningfully be added and subtracted, but not multiplied or divided. Example: body temperature can be measured in degrees Celsius or degrees Fahrenheit, and on neither scale does 0 mean the absence of temperature.
  • Ratio data – ratio data has all the properties of interval data plus a true zero point, which is the only difference between the two. Example: temperature measured in Kelvin, where 0 K is absolute zero.
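A short calculation shows why the zero point matters. The sketch below uses two arbitrary temperatures, 20 °C and 10 °C, chosen only for illustration. Differences agree on both scales, but a ratio is only meaningful on the Kelvin scale.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert degrees Celsius to Kelvin."""
    return celsius + 273.15


warm_c, cool_c = 20.0, 10.0  # arbitrary illustrative temperatures

# Differences are meaningful on an interval scale and match across scales.
difference_c = warm_c - cool_c
difference_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios are not: 20 °C is not "twice as hot" as 10 °C.
naive_ratio = warm_c / cool_c
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(difference_c, round(difference_k, 2))
print(naive_ratio, round(kelvin_ratio, 3))
```

The Celsius ratio comes out as 2.0, while the Kelvin ratio is only about 1.035, which is the physically meaningful comparison.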

15 differences between Categorical data and Numerical data

Definition
  • Categorical data: a data type that can be stored and identified based on the names or labels given to its values.
  • Numerical data: data that is in the form of numbers, and not in any language or descriptive form.

Alias
  • Categorical data: also known as qualitative data, as it qualifies data before classifying it.
  • Numerical data: also known as quantitative data, as it represents quantitative values on which arithmetic operations can be performed.

Examples
  • Categorical data: What is your gender?
    • Male
    • Female
    • Other
  • Numerical data: What is your test score out of 20?
    • Below 5
    • 5-10
    • 10-15
    • 15-20
    • 20

Types
  • Categorical data: Nominal data and Ordinal data.
  • Numerical data: Discrete data and Continuous data.

Characteristics
  • Categorical data:
    • No order scale
    • Natural language description
    • Can take numerical values but with qualitative properties
    • Can be visualized using bar charts and pie charts
  • Numerical data:
    • Has an ordered scale
    • No use of natural language description
    • Takes numeric values with numeric qualities
    • Can be visualized using bar charts and pie charts

User-friendly design
  • Categorical data: can include long surveys and has a chance of pushing respondents away.
  • Numerical data: survey interaction is easy and short, hence fewer survey abandonment issues.

Data collection method
  • Categorical data: nominal data through open-ended questions; ordinal data through multiple-choice questions.
  • Numerical data: mostly collected through multiple-choice questions and sometimes through open-ended questions.

Data collection tools
  • Categorical data: questionnaires, surveys, and interviews.
  • Numerical data: questionnaires, surveys, interviews, focus groups and observations.

Analysis and interpretation
  • Categorical data: median and mode. E.g.: univariate statistics, bivariate statistics, regression analysis.
  • Numerical data: descriptive and inferential statistics. E.g.: measures of central tendency, turf analysis, text analysis, conjoint analysis, trend analysis.

Uses
  • Categorical data: used when a study requires respondents’ personal information, opinions and experiences. Commonly used in business research.
  • Numerical data: used for statistical calculations, since arithmetic operations can be performed on it.

Compatibility
  • Categorical data: it is not compatible with most statistical analysis methods, hence researchers often avoid using it.
  • Numerical data: it is compatible with most statistical calculation methods.

Visualization
  • Categorical data: can be visualized using only bar graphs and pie charts.
  • Numerical data: can be visualized using bar graphs and pie charts as well as scatter plots.

Structure
  • Categorical data: known as unstructured or semi-structured data. Indexing methods, like those used by Google, Bing, etc., can be used to structure it.
  • Numerical data: it is structured data and can be quickly organized and made sense of.
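The difference in analysis can be made concrete with a small helper that picks summary statistics according to the kind of values it receives. This is a rough sketch, not a general-purpose tool: `summarize` is a hypothetical function, the sample answers are invented, and it assumes that any column of plain numbers is numerical, which would misclassify categories stored as numeric codes.

```python
from collections import Counter
from numbers import Real
from statistics import mean, median


def summarize(values):
    """Return summary statistics suited to the type of the values."""
    # Assumption: a column made up only of numbers is numerical data.
    # Categories stored as codes (e.g. 1 = Male) would be misclassified.
    if all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
        return {
            "kind": "numerical",
            "mean": mean(values),
            "median": median(values),
            "min": min(values),
            "max": max(values),
        }
    counts = Counter(values)
    return {
        "kind": "categorical",
        "mode": counts.most_common(1)[0][0],
        "frequencies": dict(counts),
    }


gender = ["Female", "Male", "Female", "Other", "Female"]  # hypothetical
test_scores = [12, 18, 7, 15, 18]                         # hypothetical

print(summarize(gender))
print(summarize(test_scores))
```

For the categorical column only counts and the mode are reported, whereas the numerical column supports a mean, a median and a range.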

Categorical Vs Numerical Data: Pros and Cons

Advantages

Categorical Data: 

  1. Provides an intuitive representation of the data.
  2. Helps gain a deeper knowledge of a topic or a population.
  3. Respondent-dependent data.

Numerical Data: 

  1. Supports statistical calculations.
  2. Gathers data quickly from a larger group.
  3. Makes analysis easier.

Disadvantages

Categorical Data: 

  1. Produces large volumes of data to process and analyze.
  2. The researcher may have to handle irrelevant data.
  3. Many statistical analyses that rely on arithmetic cannot be performed directly.
  4. The data is likely to have low sensitivity, such as yes/no or good/bad.

Numerical Data: 

  1. Standardized responses limit how deeply a topic can be investigated.
  2. Significant factors can be left out, which alters the results.
  3. The researcher has more control over the data than the respondents.
  4. Can lead to doubts about the validity of the result. 

Similarities between Categorical data and Numerical data

Ordinal data 

It can be considered as a crossover between categorical and numerical data. Even though it is generally identified as a subtype of categorical data, it can be called numerical data too.

Uses 

Numerical and categorical approaches, when used for research and statistical analysis, can yield similar results.

Researchers sometimes decide to use both together in a survey to approach the data in different ways.

Example:

A seminar organizer wants to know what attendees thought of the seminar. They can ask them questions in two ways:

Q1) Rate our seminar on a scale of 1-5

    • 1
    • 2
    • 3
    • 4
    • 5

Q2) Can you explain your reason for the score?

Collection tools 

Both categorical data and numerical data are most commonly collected through methods like surveys, questionnaires, and interviews. 

Surveys are the most common data collection method used by researchers. They can be designed to gather both categorical data and numerical data.

You can ask participants yes/no questions to gather categorical data, or use rating-scale questions such as Likert scales, whose responses are often coded as numbers. You can also use open-ended questions to gather necessary information from the target audience.

Categorical Data Vs Numerical Data: Which one should you use?

Data is an integral element of any research, be it market research, academic research, or social research. Your data has to be accurate and precise to generate meaningful insights.

To take advantage of both categorical and numerical data, the best way is to use both types in your research. For example, follow up an NPS® question with a qualitative question to gather in-depth information from your audience.
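As a rough illustration of pairing the two types, the sketch below computes a Net Promoter Score from 0-10 ratings using the standard definition (percentage of promoters, who answer 9-10, minus percentage of detractors, who answer 0-6) and keeps the follow-up comments alongside. The function name and the sample responses are hypothetical.

```python
def net_promoter_score(ratings):
    """NPS = % promoters (9-10) minus % detractors (0-6), on a 0-10 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical responses: (numerical rating, qualitative follow-up answer).
responses = [
    (10, "Fast support"),
    (9, "Easy to set up"),
    (9, "Clear dashboards"),
    (7, "Does the job"),
    (4, "Too expensive"),
]

score = net_promoter_score([rating for rating, _ in responses])
detractor_reasons = [reason for rating, reason in responses if rating <= 6]

print(score)              # 40.0
print(detractor_reasons)  # ['Too expensive']
```

The number summarizes sentiment across the whole group, while the follow-up text explains what is behind the low scores.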

If you want to know how you can gather customer intelligence from categorical data and numerical data, you can contact us.

Net Promoter®, NPS®, NPS Prism®, and the NPS-related emoticons are registered trademarks of Bain & Company, Inc., Satmetrix Systems, Inc., and Fred Reichheld. Net Promoter Score℠ and Net Promoter System℠ are service marks of Bain & Company, Inc., Satmetrix Systems, Inc., and Fred Reichheld.

What type of data is described by words rather than numbers?

Categorical data is qualitative. That is, it describes an event using a string of words rather than numbers.

What is meant by random sample?

A simple random sample is a randomly selected subset of a population. In this sampling method, each member of the population has an exactly equal chance of being selected, minimising the risk of selection bias.
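A simple random sample can be drawn with Python's standard library. This is a minimal sketch: the population of 100 respondent IDs and the sample size of 10 are invented for illustration, and the fixed seed is there only to make the draw repeatable.

```python
import random

# Hypothetical population: 100 respondent IDs.
population = [f"respondent_{i:03d}" for i in range(1, 101)]

# A seeded generator makes the draw repeatable; omit the seed in practice.
rng = random.Random(42)

# sample() draws without replacement, so every member has the same
# chance of being selected and no member appears twice.
sample = rng.sample(population, k=10)

print(len(sample))
print(sample)
```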

What sampling method is often quicker and cheaper?

Non-probability sampling is a method of selecting units from a population using a subjective (i.e. non-random) method. Since non-probability sampling does not require a complete survey frame, it is a fast, easy and inexpensive way of obtaining data.

Is random sampling non

Probability sampling involves random selection, allowing you to make strong statistical inferences about the whole group. Non-probability sampling involves non-random selection based on convenience or other criteria, allowing you to easily collect data.